About the Role
The role involves building and maintaining scalable data pipelines that power machine learning systems, with a focus on reliability, performance, and integration within cloud environments.
Responsibilities
- Develop and optimize batch and streaming data pipelines for machine learning applications, balancing speed, cost, and scalability
- Design and implement data workflows on AWS, deploying infrastructure with IaC tools such as Terraform
- Ensure data reliability, quality, and accessibility across systems, in compliance with data governance policies
- Collaborate with data scientists to operationalize ML models
- Monitor pipeline performance, enhance alerting, and troubleshoot production issues to maintain system uptime
- Integrate and ingest data from multiple sources and formats into existing pipelines
- Implement automated testing, versioning, and reproducibility for data workflows
- Use Python for data processing and pipeline orchestration
- Maintain data architecture documentation and best practices
- Participate in code reviews and technical design discussions, collaborating across engineering and research teams
Nice to Have
- Experience with MLOps frameworks
- Knowledge of feature store implementations
- Background in real-time data processing
- Familiarity with data mesh architectures
- Experience with large-scale data platforms
- Contributions to open-source data projects
- Advanced degree in computer science or a related field
- Published work in data engineering or ML systems
Compensation
Competitive salary and equity package
Work Arrangement
Remote with flexible hours
Team
Collaborative team of data scientists, engineers, and ML researchers
Tech Stack
Python, AWS (S3, Lambda, Glue, Redshift), Airflow, Docker, Kubernetes, Terraform, PostgreSQL, Apache Parquet, Pandas, NumPy
Impact
Your work will directly enable faster iteration on machine learning models by ensuring clean, reliable, and timely data delivery across research and production systems.