Design and implement frameworks, automation, and internal tools that promote efficiency and continuous innovation across engineering teams.
Use Kubernetes, Docker, and Python to improve developer velocity in building and deploying ML inference applications.
Build and maintain distributed systems through all stages of the software lifecycle, including design, coding, testing, documentation, and troubleshooting.
Create developer-facing products and services that simplify access to and interaction with the machine learning platform.
Work across public cloud platforms such as AWS and GCP, applying best practices in infrastructure scaling and capacity planning.
Deploy and manage containerized applications in production using Kubernetes, a service mesh, ArgoCD, and related orchestration tools.
Collaborate with technical leads and machine learning engineers to define requirements and implement robust technical solutions.
Take full ownership of features from concept through deployment, including infrastructure defined through code.
Investigate, prototype, and integrate new machine learning tools with a focus on reliability, scalability, and long-term maintainability.
Proactively identify and resolve system issues, automate operational workflows, and enable self-service capabilities for engineering teams.
Participate in on-call rotations to support system reliability and incident response.

Hybrid

Workday is hiring a Software Development Engineer - ML Ops
