Responsibilities
- Design, implement, and maintain CI/CD pipelines for machine learning workflows using tools like GitHub Actions, Azure DevOps, or Jenkins.
- Build and optimize data processing pipelines in Apache Spark (PySpark and Scala) for large-scale, distributed listener datasets.
- Deploy and manage Databricks environments, ensuring efficient cluster usage, job scheduling, and cost optimization.
- Collaborate with data scientists to productionize ML models, integrating them into scalable APIs or batch processing systems that feed real-time, machine-readable audience signals.
- Implement automated testing, monitoring, and alerting for ML pipelines to ensure the reliability and reproducibility that certified buyers require.
- Champion best practices in version control, model registry management, and environment reproducibility.
- Help evolve our listener data infrastructure toward agent-compatible supply: live, structured, queryable data feeds that autonomous buying systems can discover and act on without human mediation.
Work Arrangement
Remote (Country)
Team
Structure: Cross-functional collaboration with data scientists and other teams
Additional Information
- English is required to collaborate with internal and international teams and to access information and resources.
- Fully remote position (must be based in Ontario or Quebec)
- 4 weeks of vacation + 5 paid personal days annually
- Group insurance programs as of your first day, including access to telemedicine and an EAP
- Collective RRSP with matching contribution
- Internet reimbursement and more