About the Role
You will own and optimize CI/CD pipelines, ensure system reliability, and support scalable infrastructure in a machine learning operations (MLOps) environment.
Responsibilities
- Design and maintain robust CI/CD systems for automated software delivery
- Monitor system performance and implement proactive reliability improvements
- Collaborate with engineering teams to streamline deployment processes
- Manage cloud infrastructure with a focus on scalability and security
- Troubleshoot production issues across distributed systems
- Implement observability tools for logs, metrics, and tracing
- Enforce infrastructure-as-code practices using modern configuration tools
- Support incident response and contribute to post-mortem analyses
- Optimize resource usage and cloud spending
- Ensure compliance with security and operational standards
- Automate routine operational tasks to reduce manual intervention
- Integrate security checks into development and deployment pipelines
- Maintain documentation for systems and operational procedures
- Evaluate and introduce new DevOps tools and technologies
- Work across time zones with global engineering teams
Nice to Have
- Experience supporting machine learning infrastructure
- Familiarity with data pipeline technologies
- Contributions to open-source DevOps projects
- Certifications in cloud or DevOps platforms
- Background in SaaS product operations
Compensation
Competitive salary and benefits package
Work Arrangement
Remote, Asia-Pacific and Japan region
Team
Collaborative engineering team focused on platform reliability and automation
Why This Role Matters
- This position directly impacts the stability and speed of product delivery for a growing AI platform.
- Your work will enable faster innovation by reducing deployment friction and increasing system resilience.
What You’ll Bring
- A mindset focused on automation, reliability, and continuous improvement.
- Proven ability to lead technical initiatives in a distributed environment.