Requirements
- 4+ years of professional software engineering experience in production environments.
- 4+ years of experience building or maintaining large-scale production data infrastructure, data platforms, distributed systems, or data lake systems.
- Strong experience with Apache Spark or similar distributed data processing systems.
- Experience operating production infrastructure in AWS, including services such as S3, RDS, DynamoDB, SQS, Kinesis, or Lambda.
- Experience designing, building, and operating reliable systems with strong ownership of scalability, observability, security, and operational excellence.
- Proficiency in at least one production programming language such as Go, Python, Scala, or Java.
- Ability to collaborate effectively with cross-functional partners, including software engineers, data scientists, analysts, security teams, and product stakeholders.
Nice to Have
- Experience with Databricks, Delta Lake, or similar lakehouse technologies such as Iceberg or Hudi.
- Experience building data replication or ingestion systems from OLTP data stores into a data lake or lakehouse.
- Experience with Infrastructure-as-Code tools such as Terraform or CloudFormation.
- Familiarity with data catalogs, metadata systems, and data discovery tools such as Unity Catalog, Hive Metastore, DataHub, or Amundsen.
- Experience with workflow orchestration systems such as Airflow, Dagster, or Prefect.
- Experience with streaming data, event-driven architectures, or systems that handle late-arriving or mutable data.
- Familiarity with containerization or container orchestration technologies such as Docker, Kubernetes, ECS, or Fargate.
- Experience building internal platforms, libraries, or developer tooling used by other engineering teams.
- Experience contributing to data infrastructure roadmaps, evaluating new technologies, and driving improvements that create leverage for internal and external customers.
Additional Information
- Relocation assistance will not be provided for this role.