Fastino.ai is hiring an AI Engineer to advance the state of the art in large language models. You will innovate at the edge of efficiency, bridging the gap between research and production by designing and deploying high-performance agentic systems.
What You'll Do
- Design and deploy high-performance agentic systems that leverage Fastino’s optimized model architectures.
- Collaborate with engineering teams to turn novel architectural breakthroughs into scalable, low-latency solutions.
- Drive rapid, iterative prototyping of AI functionalities, refining model performance based on real-world telemetry.
- Own the stability and throughput of inference pipelines, proactively solving scalability bottlenecks.
- Architect large-scale data and fine-tuning strategies to continuously improve model precision and reliability.
What We're Looking For
- 2+ years of hands-on experience in AI/ML engineering roles.
- Demonstrated proficiency with LLMs and a track record of applying AI/ML techniques to solve complex problems.
- Comfortable working across the stack, from prompt engineering and vector DB tuning to Kubernetes deployment and API design.
Nice to Have
- Experience building microservices that handle high-concurrency agentic workloads.
- Familiarity with GLiNER or other information extraction architectures.
Technical Stack
- LLMs
- Kubernetes
- Vector DB
Work Mode
This is a global position.




