Join Cisco Systems, Inc. as a Senior AI/ML DevOps Engineer on the CX AI Incubation Team. You will productionize LLM/SLM capabilities to power Intelligent Customer Experiences, moving AI prototypes into robust, scalable systems deployed across cloud and on-prem environments.
What You'll Do
- Productionize LLM/SLM-powered features by building robust model-serving and deployment pipelines (cloud + on-prem) with clear SLAs, monitoring, and rollback strategies.
- Optimize inference performance across CPU, small GPUs, and large multi-GPU servers using quantization, batching, KV-cache strategies, and runtime tuning for cost and latency.
- Package and integrate on-prem inference stacks (VM/containers) with customer environments, including secure configuration, versioning, and upgrade-safe deployments.
- Design scalable serving architectures for generative AI (multi-tenant, secure, cost-aware), including capacity planning and performance benchmarking.
- Build automated CI/CD for models and prompts: evaluation gates, regression testing, artifact management, and reproducible releases.
- Implement model and service observability: latency/throughput metrics, quality drift signals, safety checks, and incident triage workflows.
- Support training and fine-tuning workflows for LLMs/SLMs, including data curation, experiment tracking, and packaging models for production.
- Partner with product and engineering to integrate AI services into applications, ensuring reliability, security, and responsible AI behavior.
What We're Looking For
- Bachelor’s degree with 7+ years of related experience, or Master’s degree with 4+ years of related experience.
- Experience with Python, Java, or C++, and with building production services for ML/AI workloads.
- Experience with PyTorch/TensorFlow and tooling across the ML lifecycle (data pipelines, training, evaluation, deployment).
- Experience deploying and operating NLP/Generative AI systems in production, including performance tuning and reliability practices.
- Experience working in cross-functional teams, delivering in fast-paced environments, and communicating technical concepts clearly.
Nice to Have
- Proven experience productionizing LLMs/SLMs with GPU-backed inference and runtime optimization (quantization, batching, parallelism).
- Hands-on experience with on-prem deployment patterns (air-gapped, customer-managed), including packaging, integration, and upgrade strategy.
- Experience with AI infrastructure and MLOps/AI DevOps tooling (K8s, CI/CD, model registry, experiment tracking, observability).
- Familiarity with inference engines (vLLM, Triton, TensorRT-LLM, llama.cpp) and GPU profiling.
- Exposure to edge deployments and resource-constrained inference environments.
- Strong written and verbal communication skills, with the ability to contribute to design discussions and documentation.
Technical Stack
- Languages: Python, Java, C++
- ML Frameworks: PyTorch, TensorFlow
- Infrastructure/Tools: Kubernetes
- Inference Engines: vLLM, Triton, TensorRT-LLM, llama.cpp
Team & Environment
You will join the CX AI Incubation Team, collaborating cross-functionally to bring AI capabilities to Cisco's customer experience solutions.
Benefits & Compensation
- Compensation: $199,700.00 to $254,600.00 (starting range for U.S. and Canada), plus equity in the form of Cisco restricted stock unit grants.
- Medical, dental and vision insurance.
- 401(k) plan with Cisco matching contribution.
- Paid parental leave.
- Short- and long-term disability coverage.
- Basic life insurance.
- 10 paid holidays per year plus 1 floating holiday for non-exempt employees.
- 1 paid day off for employee’s birthday.
- Paid year-end holiday shutdown.
- 4 paid days off for personal wellness.
- Non-exempt employees receive 16 days of paid vacation time per year.
- Exempt employees participate in a flexible vacation time-off program.
- 80 hours of sick time off provided on hire date and each January 1st.
- Additional paid time away for critical or emergency family issues.
- Optional 10 paid days per year to volunteer.
- Annual bonuses for non-sales roles.
Work Mode
This is a local-country role, open to candidates based in the U.S. and Canada.
Cisco is an equal opportunity employer.




