Full-time

LanceDB is hiring an Open Source Engineer

About the Role

LanceDB is hiring an Open Source Engineer to advance high-performance multimodal databases. You will leverage Java/Scala and Rust to expand the reach of Lance and LanceDB within the broader data infrastructure ecosystem.

What You'll Do

  • Drive OSS community efforts to integrate the Lance format into Spark, Hive Metastore, Presto, Trino, Ray, and other data infrastructure systems.
  • Promote the Lance format at big data conferences and meetups.
  • Design and maintain efficient distributed Lance dataset operations.
  • Design efficient indices to power predicate pushdown in Spark, Ray, or Trino.
  • Work on table format, data encodings, and various aspects of the Lance format in Rust.
  • Operate in-house data processing infrastructure.

What We're Looking For

  • At least five years of experience building high-performance databases, big data systems, or web-scale data services.
  • Experience with the internals of open source big data or AI training systems, such as Hadoop, Spark, Flink, Ray, Iceberg, Delta Lake, Hudi, ClickHouse, Trino, Presto, PyTorch, or JAX.
  • Hands-on experience with high-performance computing in Java or Scala.
  • You thrive in a small, high-caliber team with autonomy, drive, and the ability to iterate fast.

Nice to Have

  • You are an open-source veteran, committer, or PMC member of large open source systems in the Apache community.
  • You fearlessly challenge the status quo and dismiss mediocre engineering as unacceptable.
  • You have a proven record of driving large features in Apache projects.
  • You are familiar with Java, Rust, C++, Apache Arrow, Apache DataFusion, Apache Parquet, Apache Iceberg, and Delta Lake.

Technical Stack

  • Languages: Java, Scala, Rust, C++
  • Big Data Systems: Hadoop, Spark, Flink, Ray, Iceberg, Delta Lake, Hudi, ClickHouse, Trino, Presto
  • AI Frameworks: PyTorch, JAX
  • Apache Ecosystem: Apache Arrow, Apache DataFusion, Apache Parquet

Team & Environment

You'll work with a small, high-caliber team where autonomy, drive, and fast iteration are the standard.

Required Skills
Java, Scala, Rust, Hadoop, Spark, Flink, Ray, Iceberg, Delta Lake, Hudi, Open Source, Data Engineering, Distributed Systems, Data Lake, Data Processing
About company
LanceDB

LanceDB is a developer-friendly, open-source database for multimodal AI, providing a foundation for AI applications from vector search to AI dataset exploration.

Job Details
Category: Data
Posted 8 months ago