Palo Alto, California, United States | On-site | Employment | USD 172,000 - 250,000 Yearly

Inflection AI is hiring a Machine Learning Engineer

About the Role

At Inflection AI, our mission is to harness AI to improve human well-being and productivity. We are hiring a Senior Machine Learning Engineer to be a key technical leader on our AI Engineering team. You will design and scale the systems that bring our models from research into reliable, production-grade deployments, directly impacting how intelligence is delivered to millions of users.

What You'll Do

  • Design and implement scalable, low-latency model-serving infrastructure for large language models and multimodal systems.
  • Build and maintain robust APIs and services to support real-time conversational workloads.
  • Optimize inference systems for throughput, latency, cost-efficiency, and reliability.
  • Architect and improve end-to-end ML pipelines spanning training, evaluation, deployment, monitoring, and rollback.
  • Develop model lifecycle management systems with strong observability and performance tracking.
  • Partner with infrastructure teams to scale compute resources efficiently across distributed environments.
  • Improve CI/CD workflows and automation for model releases and infrastructure updates.
  • Collaborate with ML researchers to productionize new model architectures and capabilities.
  • Design abstractions that enable rapid experimentation while preserving safety, quality, and reliability.
  • Implement evaluation frameworks and guardrails to ensure models meet performance and safety standards before deployment.
  • Define data requirements and feedback loops to enable continuous model improvement.
  • Partner with product and safety teams to integrate telemetry, evaluation signals, and user feedback into training pipelines.
  • Ensure high-quality data ingestion and metadata tracking for ML readiness.
  • Lead architectural decisions that balance performance, scalability, safety, and maintainability.
  • Contribute to code reviews and engineering best practices across the team.
  • Mentor engineers and raise the bar for production ML excellence.
  • Help shape long-term technical strategy for deploying AI systems at global scale.

What We're Looking For

  • 1-4 years of experience in machine learning engineering, backend systems, or distributed infrastructure.
  • Proven experience deploying and operating ML models in production environments.
  • Strong programming skills in Python and/or C++ (or equivalent systems language).
  • Experience with large-scale model serving (LLMs, transformers, or similar architectures).
  • Deep understanding of distributed systems, API design, and cloud infrastructure.
  • Experience with MLOps tools and workflows (CI/CD, model monitoring, experiment tracking).

Nice to Have

  • Experience scaling high-throughput, low-latency inference systems.
  • Familiarity with GPU acceleration, model optimization (quantization, batching, caching), and performance tuning.
  • Experience working with conversational AI systems or real-time user-facing AI products.
  • Knowledge of ML evaluation methodologies, safety systems, and guardrail design.
  • Background collaborating closely with research teams in fast-paced AI environments.

Technical Stack

  • Python
  • C++
  • LLMs
  • Transformers

Team & Environment

You will be a key technical leader on the AI Engineering team.

Benefits & Compensation

  • Diverse medical, dental and vision options
  • 401k matching program
  • Unlimited paid time off
  • Parental leave and flexibility for all parents and caregivers
  • Support for country-specific visa needs for international employees living in the Bay Area
  • Compensation: $172,000.00 to $250,000.00 + meaningful equity component

Inflection AI values and supports our team’s mental and physical health. We are focused on building a positive, safe, inclusive and inspiring place to work.

Required Skills
Python, C++, LLMs, transformers, machine learning engineering, backend systems, distributed infrastructure, model serving, distributed systems, API design, cloud infrastructure
About the Company
Inflection AI

Inflection AI pioneers human-centered AI models that unite emotional intelligence (EQ) and raw intelligence (IQ), transforming interactions from transactional to relational. The company's work comes to life through Pi, a personal AI companion, and a Platform of LLMs and APIs that enable builders to bring Pi-class emotional intelligence into their applications.

Job Details
Department: Engineering
Category: Data
Posted 14 days ago