Lead the full lifecycle of high-impact initiatives, from concept through deployment, including defining scope, estimating timelines, designing system architecture, and evaluating emerging technologies.
Develop and sustain robust data and machine learning pipelines that adapt to changing business and modeling demands, ensuring reliability and efficiency.
Enhance training pipeline performance by optimizing for speed, memory usage, and cost, including leveraging spot instances, efficient data loading, and reuse of preprocessed data.
Empower data scientists by delivering reusable, tested components such as transformers, data loaders, and training tools, while guiding contributions to shared codebases.
Expand data pipeline capabilities by incorporating new data sources, extending feature time windows, and scaling training datasets within existing infrastructure limits.
Support deep learning workflows by managing GPU-based training, implementing custom training loops in PyTorch, and assisting with model architecture design.
Ensure consistency and reproducibility across experimentation and production using version-controlled configurations, experiment logging, and alignment between offline and online systems.
Work with infrastructure teams to improve system scalability through resource management, monitoring, and CI/CD modernization, while participating in on-call rotations to address pipeline alerts.

Hybrid — Paris, Helsinki

Comprehensive benefits package tailored to the employee's country of residence

Voodoo is hiring a Senior ML Engineer - Offline Team