任职要求
- Experience with model serving platforms such as vLLM or HuggingFace TGI
- Proficiency in GPU orchestration using tools like Kubernetes, Ray, Modal, RunPod, LambdaLabs
- Ability to monitor latency, costs, and scale systems efficiently with traffic demands
- Experience setting up inference endpoints for backend engineers
福利待遇
- Flat structure & real ownership
- Full involvement in direction and consensus decision making
- Flexibility in work arrangement
- High-impact role with visibility across product, data, and engineering
- Top-of-market compensation and performance-based bonuses
- Global exposure to product development
- Lots of perks - housing rental subsidies, a quality company cafeteria, and overtime meals
- Health, dental & vision insurance


