GitLab is hiring a Data Engineer I to champion data accessibility and empower colleagues to use data effectively. In this role, you will build and maintain reliable, observable data pipelines in a collaborative environment.
What You'll Do
- Design and maintain data pipelines with high observability and reliability.
- Build and maintain large, complex data sets that meet functional and non‑functional requirements.
- Identify and implement process improvements such as automating manual tasks, optimizing data delivery, and redesigning infrastructure for scalability.
- Build extraction, transformation, and loading (ETL) infrastructure in SQL, Python, PySpark, and dbt.
- Collaborate with executive, product, clinical, data, and design teams to resolve data-related technical issues and support their data infrastructure needs.
- Ensure that pipelines are scalable, efficient and secure.
What We're Looking For
- 2–3 years of professional data‑engineering experience
- Graduate degree in computer science, statistics, informatics, information systems or a related quantitative field
- Advanced proficiency in Python and SQL
- Experience using dbt to build data pipelines
- Expertise with relational, NoSQL, and cloud database technologies
- Hands‑on experience with Databricks (Delta Lake, Spark SQL and workflows)
- Strong knowledge of data‑warehousing concepts and ETL/ELT design
- Experience performing root‑cause analysis on internal and external datasets to answer business questions and identify improvement opportunities
- Proven history of manipulating and extracting value from large, disconnected datasets
- Basic understanding of performance tuning and optimization for data systems
- Experience working with cross‑functional teams in a dynamic environment
- Self‑motivated professional who takes ownership of assigned tasks and seeks guidance when necessary
Nice to Have
- Healthcare‑domain experience
- Experience building a Data-as-a-Service platform
- Experience building APIs
- Experience with cloud-based data warehouses such as Snowflake
- Experience with object-oriented or functional scripting languages: Go, Python, Java, C++, Scala, etc.
- Experience with big data tools: Spark, Kafka, etc.
- Experience with data pipeline and workflow management tools like Airflow
- Experience with AWS cloud services: EC2, EMR, RDS, Redshift
- Experience with stream-processing systems: Storm, Spark Streaming, etc.
Technical Stack
- SQL
- Python
- PySpark
- dbt
- Databricks
- Delta Lake
- Spark SQL
Benefits & Compensation
- Salary: $95,000–$105,000 USD, plus equity in the form of stock options
- Competitive medical, dental, and vision coverage
- Competitive 401(k) Plan with a generous company match
- Flexible Time Off/Paid Time Off, 12 paid holidays
- Protection Plans including Life Insurance, Disability Insurance, and Supplemental Insurance
- Mental Health and Wellness benefits
- Eligibility for a corporate bonus program or sales incentive plan
Work Mode
This position follows a local-country work mode for candidates based in the US.
GitLab is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.


