At Zocdoc, our mission is to give power to the patient. We are looking for a Staff Software Engineer, Data Infrastructure to lead the design and development of the software that underpins our data platform. You'll focus on building APIs, libraries, and services that make data producers and consumers effective, while optimizing for reliability, performance, and spend on AWS.
What You'll Do
- Design and ship platform services for ingestion, transformation, orchestration, and metadata (e.g., service-backed interfaces for Dagster/Airflow, lineage, quality, and data contracts).
- Build execution & scheduling capabilities for Spark/SQL jobs (queuing, prioritization, retries, resource isolation on EMR/EKS/Databricks), focusing on throughput and developer experience.
- Implement lakehouse features (Delta/Iceberg): schema evolution, partitioning, compaction, vacuum, snapshotting, ACID guarantees, and table-format governance.
- Optimize Snowflake and other warehouses: cost controls, query profiling/pruning, workload isolation, RBAC; expose safe self-service patterns.
- Deliver SDKs, CLIs, and templates that standardize how teams build reliable data products; enable CI/CD for data and contract testing.
- Work across AWS (S3, EMR/EKS, Glue/Athena, Lambda, Kinesis/MSK) with IaC (Terraform) and strong observability (Datadog/CloudWatch).
What We're Looking For
- 8+ years building backend/platform software with Python/Scala/Java and strong SQL; proven track record designing distributed systems.
- Deep experience with Spark (Databricks or EMR/EKS) and AWS data services; solid grasp of scheduler/executor behavior and performance tradeoffs.
- Hands-on data warehouse optimization (Snowflake ideal; others welcome).
- Experience building platform APIs/SDKs that other engineers adopt; excellent collaboration and technical leadership.
Nice to Have
- Experience with petabyte-scale data platforms, distributed big data compute, or lakehouse engines (Delta/Iceberg).
- Familiarity with metadata/governance tech (Unity Catalog, Collibra, Lake Formation).
Technical Stack
- Languages: Python, Scala, Java, SQL
- Core Compute: Spark, Databricks, EMR, EKS
- AWS Services: S3, Glue, Athena, Lambda, Kinesis, MSK
- Infrastructure & Observability: Terraform, Datadog, CloudWatch
- Warehouse & Formats: Snowflake, Delta, Iceberg
- Orchestration: Dagster, Airflow
Benefits & Compensation
- Compensation range: $180,000 to $275,000 USD
- Flexible, hybrid work environment at our convenient Soho location
- Unlimited Vacation
- 100% paid employee health benefit options (including medical, dental, and vision)
- Commuter Benefits
- 401(k) with employer-funded match
- Corporate wellness program with Wellhub
- Sabbatical leave (for employees with 5+ years of service)
- Competitive paid parental leave and fertility/family planning reimbursement
- Cell phone reimbursement
- Catered lunch every day, along with beverages and snacks
- Employee Resource Groups and ZocClubs to promote shared community and belonging
Work Mode
This role is hybrid and located in NYC.
We’re an equal opportunity employer committed to providing employees with a work environment free of discrimination and harassment. Applicants are considered for employment regardless of race, color, ethnicity, ancestry, religion, national origin, gender, sex, gender identity, gender expression, sexual orientation, age, citizenship, marital or parental status, disability, veteran status, or any other class protected by applicable laws.





