Start.io is hiring a Lead Data Engineer to own our core data infrastructure. In this role, you will manage our data pipelines and lead the critical migration from a legacy Hadoop environment to a modern, Spark-based platform on AWS.
What You'll Do
- Manage the end-to-end data pipelines that pull data from operational systems and various endpoints.
- Transform and load data into the company data lake.
- Lead the complete migration from a Hadoop environment to a new Spark-based environment on AWS.
What We're Looking For
- 4+ years of hands-on experience as a Data Engineer.
- A BA/BSc in a related field such as CS, engineering, information systems, or equivalent.
- High proficiency in SQL and Python.
- Experience working with cloud-hosted data warehouses like Hive or Snowflake.
- Hands-on experience developing end-to-end ETL/ELT processes using Spark and SQL.
- Experience with the Hadoop framework and parallel computing technologies such as EMR, Hive, Presto/Trino/Athena, and Glue.
- Proven experience with data warehousing, modeling paradigms, and architectures, including DWH methodologies and best practices.
- Strong analytical skills and attention to detail.
- A collaborative team player.
- Ability to handle several tasks simultaneously.
- Comfortable working independently and leading projects from end to end.
Nice to Have
- Familiarity with Kafka, Confluent, Fluentd, and Airflow.
- Experience with data visualization tools such as Tableau.
- Experience within the media or TV industry.
Technical Stack
- Hadoop, Spark, AWS, SQL, Python, Hive, Snowflake, EMR
- Presto, Trino, Athena, Glue, Kafka, Confluent, Fluentd, Airflow, Tableau
Start.io is an equal opportunity employer.

