BJAK is looking for an ML Ops Engineer to bring AI features into production. In this remote role, you will fine-tune state-of-the-art models, design evaluation frameworks, and ensure models are safe, trustworthy, and impactful at scale.
What You'll Do
- Run and manage open-source models efficiently, optimizing for cost and reliability.
- Ensure high performance and stability across GPU, CPU, and memory resources.
- Monitor and troubleshoot model inference to maintain low latency and high throughput.
- Collaborate with engineers to implement scalable and reliable model serving solutions.
What We're Looking For
- Experience with model serving platforms such as vLLM or HuggingFace TGI.
- Proficiency in GPU orchestration using tools such as Kubernetes, Ray, Modal, RunPod, and LambdaLabs.
- Ability to monitor latency and costs, and to scale systems efficiently as traffic demand changes.
- Experience setting up inference endpoints for backend engineers.
Technical Stack
- vLLM
- HuggingFace TGI
- Kubernetes
- Ray
- Modal
- RunPod
- LambdaLabs
Team & Environment
Flat structure & real ownership.
Benefits & Compensation
- Housing rental subsidies.
- Quality company cafeteria and overtime meals.
- Health, dental & vision insurance.
- Global travel insurance (for you & your dependents).
- Unlimited, flexible time off.
Work Mode
This is a remote position open to candidates in Malaysia, Thailand, Taiwan, and Japan.
BJAK is an equal opportunity employer.



