Orca Bio seeks a Senior Data Engineer for the Advanced Data Lake (ADL) team. In this infrastructure-heavy, hybrid cloud role, you will build and operate enterprise data Lakehouse platforms using Google Cloud Platform (GCP) to support large-scale analytics and digital transformation.
What You'll Do
- Architect and maintain automated data pipelines for ingesting, transforming, and integrating complex datasets.
- Use Datastream for real-time data movement and Dataflow for processing at scale.
- Leverage Composer/Airflow for scheduling, monitoring, and automation of pipeline operations.
- Handle infrastructure provisioning and workflow management with Terraform and Dataform to ensure reproducibility and best practices.
- Manage all code and pipeline assets through git repositories, with CI/CD automation and releases via Azure DevOps (ADO).
- Govern changes through ServiceNow processes for traceability, auditability, and operational compliance.
- Work with cross-functional teams to translate business needs into pipeline specifications.
- Build and optimize data models for advanced analytics.
- Maintain data quality and security throughout all processes.
- Automate workflow monitoring and proactively resolve data issues.
- Develop and deploy data pipelines, integrations, and transformations to support analytics and machine learning applications.
- Partner with product owners and Analytics and Machine Learning delivery teams to identify and retrieve data, conduct exploratory analysis, and transform data.
What We're Looking For
- Bachelor's degree in engineering, mathematics, computer science, information technology, health science, or another analytical/quantitative field and a minimum of five years of professional or research experience in data visualization, data engineering, or analytical modeling techniques.
- OR an Associate's degree in a relevant field and a minimum of seven years of professional or research experience.
- Proficiency in Python and SQL.
- Significant experience with Google Cloud Platform (especially Dataflow and Datastream), Terraform, Dataform, and orchestration with Composer/Airflow.
- Experience managing code in git repositories, working with Azure DevOps workflows, and following ServiceNow change management processes.
- Advanced SQL skills.
- Strong experience with scripting languages such as Python, JavaScript, PHP, C++, or Java, and with API integration.
- Experience with hybrid (batch and streaming) data processing frameworks such as Apache Spark, Hive, Pig, and Kafka.
- Experience with big data, statistics, and machine learning.
- Ability to navigate Linux and Windows operating systems.
- Ability to manage a varied workload of projects with multiple priorities.
- Interpersonal skills, time-management skills, and demonstrated experience working on cross-functional teams.
- Strong analytical skills, the ability to identify and recommend solutions, and a commitment to customer service.
- Excellent verbal and written communication skills, attention to detail, and a high capacity for learning and problem resolution.
- GCP Professional Data Engineer certification is required.
Nice to Have
- Additional Google Cloud Platform (GCP) certifications beyond the required Professional Data Engineer certification.
- Knowledge of workflow scheduling (Apache Airflow/Google Composer), infrastructure as code and containerization (Kubernetes, Docker), and CI/CD (Jenkins, GitHub Actions).
- Experience in DataOps/DevOps and agile methodologies.
- Experience with data virtualization platforms such as Denodo.
- Working knowledge of Tableau, Power BI, SAS, ThoughtSpot, Dash, D3, React, Snowflake, SSIS, and Google BigQuery.
- Hybrid or multi-cloud experience.
- Familiarity with enterprise data governance, metadata, and lineage tools.
- Experience working in large, regulated environments.
Technical Stack
- Python, SQL, Google Cloud Platform (GCP), Dataflow, Datastream, Composer/Airflow, Terraform, Dataform, git, Azure DevOps (ADO), ServiceNow, Apache Spark, Hive, Pig, Kafka, Kubernetes, Docker, Jenkins, GitHub Actions, Denodo, Tableau, Power BI, SAS, ThoughtSpot, Dash, D3, React, Snowflake, SSIS, Google BigQuery
Team & Environment
You will be part of the Advanced Data Lake (ADL) team, working with cross-functional teams and partnering with product owners and Analytics and Machine Learning delivery teams.
Benefits & Compensation
- Medical: Multiple plan options.
- Dental: Delta Dental or a reimbursement account for flexible coverage.
- Vision: Affordable plan with national network.
- Pre-Tax Savings: HSA and FSAs for eligible expenses.
- Retirement: Competitive retirement package to secure your future.
- Continuing education and advancement opportunities.
Work Mode
This role is remote and open to candidates located in the United States.
Orca Bio participates in E-Verify. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, gender identity, sexual orientation, national origin, protected veteran status or disability status.