Remote (Global), Full-time

Bjak is hiring a Founding Lead Machine Learning Engineer

About the Role

BJAK is hiring a Founding Lead Machine Learning Engineer to shape the core technical direction of a new global consumer AI product. This is a founding role where you will design systems from first principles, own deployment end-to-end, and push into novel AI-powered experiences beyond typical chatbot use cases.

What You'll Do

  • Build end-to-end training pipelines: data acquisition, training, evaluation, and inference.
  • Design new model architectures or adapt open-source frontier models.
  • Fine-tune models using state-of-the-art methods like LoRA/QLoRA, SFT, DPO, and distillation.
  • Architect scalable inference systems using vLLM, TensorRT-LLM, or DeepSpeed.
  • Build data systems for high-quality synthetic and real-world training data.
  • Develop alignment, safety, and guardrail strategies.
  • Design evaluation frameworks across performance, robustness, safety, and bias.
  • Own deployment: GPU optimization, latency reduction, and scaling policies.
  • Shape early product direction, experiment with new use cases, and build AI-powered experiences from zero.
  • Explore frontier techniques like retrieval-augmented training, mixture-of-experts, distillation, multi-agent orchestration, and multimodal models.

What We're Looking For

  • A strong background in deep learning and transformer architectures.
  • Hands-on experience training or fine-tuning large models (LLMs or vision models).
  • Proficiency with PyTorch, JAX, or TensorFlow.
  • Experience with distributed training tools and techniques such as DeepSpeed, FSDP, Megatron, ZeRO, or Ray.
  • Strong software engineering skills, including writing robust, production-grade systems.
  • Experience with GPU optimization: memory efficiency, quantization, and mixed precision.
  • Comfort owning ambiguous, zero-to-one technical problems end-to-end.

Nice to Have

  • Experience with LLM inference frameworks like vLLM, TensorRT-LLM, or FasterTransformer.
  • Contributions to open-source ML libraries.
  • Background in scientific computing, compilers, or GPU kernels.
  • Experience with RLHF and preference-optimization pipelines (PPO, DPO, ORPO).
  • Experience training or deploying multimodal or diffusion models.
  • Experience with large-scale data processing (Apache Arrow, Spark, Ray).
  • Prior work in a research lab like Google Brain, DeepMind, FAIR, Anthropic, or OpenAI.

Technical Stack

  • Frameworks: PyTorch, JAX, TensorFlow
  • Training: DeepSpeed, FSDP, Megatron, ZeRO, Ray
  • Inference: vLLM, TensorRT-LLM, FasterTransformer
  • Data: Apache Arrow, Spark

Team & Environment

You’ll join a small, senior, high-performance team. This is a founding technical role collaborating directly with founders.

Benefits & Compensation

  • Remote-first flexibility
  • Competitive compensation and performance-based bonuses
  • Insurance coverage
  • Flexible time off
  • Global travel insurance

Work Mode

This is a remote-first position.

BJAK is an equal opportunity employer.

Required Skills
PyTorch, JAX, TensorFlow, DeepSpeed, FSDP, Megatron, ZeRO, Ray, vLLM, TensorRT-LLM, Machine Learning, LLM, Distributed Training, Model Optimization, Cloud Infrastructure

About the Company

Bjak is focused on providing access to affordable and sustainable financial services for people in ASEAN. Headquartered in Malaysia, Bjak is the largest insurance portal in Southeast Asia. Its main portal, Bjak.com, helps millions find the insurance policy with the best value and highest coverage.
