Responsibilities
- Define and refine the strategic direction and long-term plan for ML and LLM operations platforms.
- Guide how teams deploy and manage machine learning models and large language model applications in production.
- Identify limitations in current platform capabilities, including scalability and performance bottlenecks.
- Lead platform advancements to support development workflows designed for AI-native systems and autonomous coding agents.
- Collect input from data scientists, ML engineers, backend developers, and AI application teams to inform product decisions.
- Convert complex technical requirements into actionable product specifications and prioritization strategies.
- Proactively find ways to enhance system reliability, reduce operational friction, and speed up deployment cycles.
- Maintain a balance between enabling experimentation and enforcing governance, safety, and cost control.
- Manage features across their full lifecycle, from concept and definition to implementation and post-launch evaluation.
- Collaborate with engineering teams to align on scalability, performance, and technical debt management.
- Ensure platform features support monitoring, evaluation, retraining, version control, and system observability for ML models and LLM applications.
- Promote platform adoption through training, documentation, and internal advocacy.
- Work with engineering to adapt platforms for better integration with AI coding assistants and autonomous agents.
- Design APIs, tools, and abstractions that make AI-generated code a meaningful contribution to the platform.
- Help establish safe development workflows and guardrails for AI-assisted software development.
Work Arrangement
On-site — Bangkok, Thailand