Responsibilities
- Creating and advancing AI systems technology for efficient inference
- Developing, refining, and optimizing computational kernels for high-performance AI tasks
- Building scalable and modular abstractions for large language model serving platforms
- Constructing high-efficiency just-in-time compilers and runtime environments tailored to specific domains
- Working in close collaboration with engineering teams across deep learning frameworks, libraries, kernel development, and GPU architecture
- Supporting and contributing to open-source projects such as FlashInfer, vLLM, and SGLang
Benefits
- Eligible for equity compensation
- Comprehensive benefits package; details are available on the official website
Other
- Applications will be accepted until at least March 15, 2026
- AI tools are used in the recruitment process
- This job posting corresponds to an existing open position
