Responsibilities
- Design and refine new speculative decoding algorithms by integrating model architecture advances with targeted data to improve accuracy and efficiency.
- Bridge the gap between raw data inputs and deployable models, ensuring direct, measurable impact on customer outcomes.
- Operate in a dynamic environment focused on breakthroughs in generative artificial intelligence.
- Partner with specialists tackling demanding, real-world problems in high-performance computing and inference.
- Engage with customers to identify requirements, and coordinate with the inference and applied machine learning research teams to deploy solutions.
Team
Collaborate with a team of experts, the core inference team, and the applied ML research teams.