Responsibilities
- Develop strategic architecture for AI infrastructure systems
- Lead collaboration across Data Science and Security teams
- Design GPU cluster deployments across multiple regions
- Assess new technologies in the AI infrastructure space
- Define governance frameworks and operational best practices
- Implement caching mechanisms for prompts and context to improve inference efficiency
- Build systems enabling precise control over cache keys and data retrieval methods
- Improve the latency and cost efficiency of large-scale LLM inference operations
- Support infrastructure for Retrieval-Augmented Generation (RAG) workflows
- Design and deploy end-to-end encryption for stored AI-generated content
- Integrate customer-controlled encryption keys in cloud platforms
- Ensure secure separation of tenant data and adherence to compliance standards
- Develop vector search systems suitable for enterprise environments
- Improve Approximate Nearest Neighbor (ANN) algorithms for performance at scale
- Create ranking models supporting personalization, recommendations, and revenue generation
- Support scalable infrastructure for embedding-based search
- Design and manage distributed storage systems at petabyte scale
- Implement materialized views with reliable cross-datacenter synchronization
- Build high-throughput update systems that maintain fast point-query responses
- Optimize execution of large table scans and distributed data processing workflows