ABB is looking for a Machine Learning Engineer to own the stability, scalability, and performance of production-grade ML platforms. You will design and enhance backend services, orchestration layers, and system integrations that power critical ML workflows, ensuring resilient system architectures in distributed settings. You will also serve as a senior technical anchor for cross-functional teams and external vendors, supporting our mission to use information technology to provide valuable, reliable, and competitive services.
What You'll Do
- Own the design and improvement of backend services that support ML workflows, ensuring sound architecture across service boundaries, data flow, scalability, fault tolerance, performance, and maintainability.
- Diagnose and fix complex production incidents involving Spark jobs, Airflow pipelines, Azure ML runs, and AKS services through in-depth root-cause analysis.
- Design and maintain Dockerized services and Kubernetes (AKS) deployments, contribute to CI/CD pipelines, and establish best practices for configuration management and scaling.
- Serve as the primary technical contact, working with data scientists, ML engineers, business analysts, platform teams, and external vendors to convert business needs into scalable backend architectures.
- Coach less experienced engineers on system design and complex debugging techniques.
What We're Looking For
- 3–5 years of experience as a Backend Engineer, Platform Engineer, Senior ML Engineer, or in a similar role.
- Demonstrated backend development skills in Python (or Java/Go), with hands-on experience building APIs and services.
- Strong understanding of system design, covering distributed systems, microservices, scalability, fault tolerance, and reliability.
- Hands-on experience with Azure ML (AML) and Airflow in platform or operational roles.
- Working knowledge of Azure ML pipelines, endpoints, and deployment patterns.
- Solid practical experience with Docker and Kubernetes (AKS).
- Ability to identify and resolve complex production issues across application, data, and infrastructure layers.
Technical Stack
- Languages: Python, Java, Go
- Infrastructure & Orchestration: Docker, Kubernetes (AKS), Airflow
- ML Platform: Azure ML, Spark
Team & Environment
You will report to a Senior Machine Learning Engineer and collaborate extensively with data scientists, ML engineers, business analysts, platform teams, and external vendors.
Work Mode
This position follows a hybrid work model.