BJAK is looking for an MLOps Engineer to join our team in a fully remote capacity. You will be responsible for fine-tuning state-of-the-art models, designing evaluation frameworks, and bringing AI features into production, ensuring they are safe, trustworthy, and impactful at scale.
What You'll Do
- Run and manage open-source models efficiently, optimizing for cost and reliability
- Ensure high performance and stability across GPU, CPU, and memory resources
- Monitor and troubleshoot model inference to maintain low latency and high throughput
- Collaborate with engineers to implement scalable and reliable model serving solutions
What We're Looking For
- Experience with model serving platforms such as vLLM or HuggingFace TGI
- Proficiency in GPU orchestration using tools such as Kubernetes, Ray, Modal, RunPod, or LambdaLabs
- Ability to monitor latency and costs, and to scale systems efficiently as traffic demands change
- Experience setting up inference endpoints for backend engineers
Technical Stack
- vLLM, HuggingFace TGI, Kubernetes, Ray, Modal, RunPod, LambdaLabs
Team & Environment
Flat structure with real ownership. You will be fully involved in setting direction and in consensus-based decision-making.
Benefits & Compensation
- Housing rental subsidies
- Quality company cafeteria
- Overtime meals
- Health, dental & vision insurance
- Global travel insurance (for you & your dependents)
- Unlimited, flexible time off
Work Mode
This is a fully remote position, open globally.
BJAK is an equal opportunity employer.




