San Francisco, CA | On-site | Employment | USD 165,000 – 350,000 / year

Fluidstack is hiring a Software Engineer, Inference Platform

Responsibilities

  • Manage full lifecycle deployment of inference workloads, including setup, optimization, SLA adherence, and incident resolution.
  • Deliver quantifiable gains in token generation speed, latency, and cost efficiency across various model types and usage patterns.
  • Develop and maintain key infrastructure for KV cache management and request scheduling to improve system throughput.
  • Design and validate split prefill/decode processing pipelines along with scalable Kubernetes-based orchestration.
  • Identify and eliminate performance constraints across compute, memory, and inter-process communication layers; implement comprehensive monitoring.
  • Collaborate with clients to align deployment strategies and platform enhancements with their model designs and performance needs.
  • Influence platform evolution by contributing to architectural decisions focused on simplifying deployments, boosting hardware efficiency, and enabling new model support.
  • Join a rotating on-call schedule, covering up to one week per month, to ensure system stability and meet service level objectives.

Compensation

$165,000 – $350,000 base salary annually, with potential equity through stock options.

Work Arrangement

On-site in San Francisco, CA.

Other

  • Base salary range is $165,000 – $350,000 per year, based on experience, skills, qualifications, and location.
  • Total compensation may include equity in the form of stock options.
  • Fluidstack is an Equal Employment Opportunity employer.
  • Applicants with arrest and conviction records will be considered in accordance with applicable laws.
  • A confirmation email will be sent upon successful application submission.
  • If you do not receive a confirmation, email careers@fluidstack.io with your resume/CV, the role you applied for, and the date you submitted your application.

About company
Fluidstack
We’re building the infrastructure for abundant intelligence. We partner with top AI labs, governments, and enterprises to unlock compute at the speed of light.
Job Details
Department: Software Engineering
Category: Infrastructure
Posted: 2 months ago