As a Staff Engineer on the Distributed Storage & Transactions team, you'll shape the foundational layers of a distributed database engine. Your work will directly influence how data is stored, replicated, and consistently accessed across distributed environments, ensuring reliability and performance at scale.
What You'll Do
You will lead the design and implementation of critical storage and replication components, focusing on correctness, efficiency, and long-term maintainability. You'll develop robust systems for handling high-volume transactions, distributed consensus, and fault-tolerant data replication.
You'll debug complex issues across distributed subsystems, with a focus on stability, latency, and throughput under real-world workloads. You'll also optimize performance across the storage engine, transaction processing, and replication layers to support growing cluster sizes and data volumes.
You'll design and implement key operational capabilities such as rolling upgrades, online schema changes, point-in-time recovery, and cluster scaling, and contribute to the open-source development of the database, improving its architectural resilience and operational safety.
You will also guide and mentor other engineers, sharing expertise in distributed systems design, low-level performance tuning, and systems debugging.
Requirements
- Minimum of 8 years of professional software development experience with strong proficiency in C/C++
- Degree in Computer Science or related field, or equivalent industry experience
- Proven expertise in distributed systems, including consensus, replication, fault tolerance, and consistency models
- Experience building or maintaining storage engines, databases, or infrastructure-level systems
- Strong analytical and problem-solving skills in complex, distributed environments
- Ability to collaborate effectively within a geographically distributed engineering team
Preferred Qualifications
- Hands-on experience with transaction engines, distributed storage, or consensus algorithms
- Familiarity with LSM-tree architectures, write-ahead logging, snapshot isolation, and compaction techniques
- Knowledge of PostgreSQL internals or similar relational database systems
- Past contributions to open-source database or systems projects
Technical Environment
Primary development is in C++, with components that interact with PostgreSQL. The system uses LSM-tree storage, write-ahead logging (WAL), snapshots, compaction strategies, and consensus protocols to deliver scalable, durable, and consistent distributed data management.
Benefits
- Comprehensive health insurance options
- Retirement planning support
- Unlimited paid time off