BJAK is hiring a Founding Lead Machine Learning Engineer to shape the core technical direction of a new global consumer AI product. This is a founding role where you will design systems from first principles, own deployment end-to-end, and push into novel AI-powered experiences beyond typical chatbot use cases.
What You'll Do
- Build end-to-end training pipelines: data acquisition, training, evaluation, and inference.
- Design new model architectures or adapt open-source frontier models.
- Fine-tune models using state-of-the-art methods like LoRA/QLoRA, SFT, DPO, and distillation.
- Architect scalable inference systems using vLLM, TensorRT-LLM, or DeepSpeed.
- Build data systems for high-quality synthetic and real-world training data.
- Develop alignment, safety, and guardrail strategies.
- Design evaluation frameworks across performance, robustness, safety, and bias.
- Own deployment: GPU optimization, latency reduction, and scaling policies.
- Shape early product direction, experiment with new use cases, and build AI-powered experiences from zero.
- Explore frontier techniques like retrieval-augmented training, mixture-of-experts, distillation, multi-agent orchestration, and multimodal models.
What We're Looking For
- A strong background in deep learning and transformer architectures.
- Hands-on experience training or fine-tuning large models (LLMs or vision models).
- Proficiency with PyTorch, JAX, or TensorFlow.
- Experience with distributed training frameworks and techniques such as DeepSpeed (including ZeRO), FSDP, Megatron, or Ray.
- Strong software engineering skills, including writing robust, production-grade systems.
- Experience with GPU optimization: memory efficiency, quantization, and mixed precision.
- Comfort owning ambiguous, zero-to-one technical problems end-to-end.
Nice to Have
- Experience with LLM inference frameworks like vLLM, TensorRT-LLM, or FasterTransformer.
- Contributions to open-source ML libraries.
- Background in scientific computing, compilers, or GPU kernels.
- Experience with RLHF and preference-optimization pipelines (PPO, DPO, ORPO).
- Experience training or deploying multimodal or diffusion models.
- Experience in large-scale data processing (Apache Arrow, Spark, Ray).
- Prior work in a research lab like Google Brain, DeepMind, FAIR, Anthropic, or OpenAI.
Technical Stack
- Frameworks: PyTorch, JAX, TensorFlow
- Training: DeepSpeed (ZeRO), FSDP, Megatron, Ray
- Inference: vLLM, TensorRT-LLM, FasterTransformer
- Data: Apache Arrow, Spark
Team & Environment
You'll join a small, senior, high-performance team. This is a founding technical role in which you will collaborate directly with the founders.
Benefits & Compensation
- Remote-first flexibility
- Competitive compensation and performance-based bonuses
- Insurance coverage
- Flexible time off
- Global travel insurance
Work Mode
This is a remote-first position.
BJAK is an equal opportunity employer.