Inworld AI is seeking an AI Engineer to build the engine powering the next generation of AI-driven software, with a focus on speech modeling (STT & TTS). You will research, build, optimize, and deploy production ML systems integrated by thousands of developers.
What You'll Do
- Research, build, optimize, and deploy production ML systems for speech modeling (STT & TTS)
- Solve challenges related to data collection, efficient training infrastructure, RL alignment environments, and ultra-low-latency inference optimization for audio
What We're Looking For
- A PhD in a relevant technical field, or a BA/BS degree with equivalent research and/or engineering experience
- 5+ years of combined experience in software development (e.g., with Python or C++) and applied ML engineering
- Demonstrated experience applying or researching machine learning in one or more of the following: speech or video processing, natural language processing (NLP), or action planning
- Strong foundation in data structures, algorithms, and neural network architectures
- Proficiency with ML frameworks such as PyTorch
- Professional working proficiency in English
Nice to Have
- A passion for learning and staying up-to-date with the latest advancements in ML/Voice AI research and its applications
- Ability to work collaboratively in a fast-paced environment with shifting priorities
- Familiarity with pre-training, fine-tuning, RLHF, and evaluation of large language and speech models
- Experience working with embedded systems and/or running ML on edge devices
- Strong background in mathematics and/or physics
Technical Stack
- Python
- C++
- PyTorch
Work Mode
This role operates in a local-country work mode and is based in Switzerland.
Inworld AI is an equal opportunity employer.




