Sumo Logic is hiring a Senior Machine Learning Engineer to build the intelligence behind the next generation of agentic AI systems that reason over massive, heterogeneous log data. In this role, you will combine machine learning, prompt engineering, context engineering, and rigorous evaluation to create autonomous AI agents for observability and security use cases.
What You'll Do
- Design, implement, and optimize agentic AI components, including tools, memory management, and prompts.
- Develop and maintain golden datasets by defining sourcing strategies, working with data vendors, and ensuring quality and representativeness at scale.
- Prototype and evaluate novel prompting strategies and reasoning chains to improve model reliability and interpretability.
- Collaborate cross-functionally with product, data, and infrastructure teams to deliver end-to-end AI-powered insights.
- Operate autonomously in a fast-paced, ambiguous environment—defining scope, setting milestones, and driving outcomes.
- Ensure reliability, performance, and observability of deployed agents through rigorous testing and continuous improvement.
- Maintain a strong bias for action—delivering incremental, well-tested improvements that directly enhance customer experience.
What We're Looking For
- B.Tech, M.Tech, or Ph.D. in Computer Science, Data Science, or a related field.
- 6+ years of hands-on industry experience with demonstrable ownership and delivery.
- Strong understanding of machine learning fundamentals, data pipelines, and model evaluation.
- Proficiency in Python and ML/data libraries such as NumPy, pandas, and scikit-learn.
- Working knowledge of LLM core concepts, prompt design, and agentic design patterns.
- Experience with distributed systems and large-scale data processing.
- Strong communication skills and a passion for shaping emerging AI paradigms.
Nice to Have
- Experience leading a team of engineers.
- Prior experience building and deploying AI agents or LLM applications in production.
- Familiarity with modern agentic AI frameworks such as LangGraph, LangChain, or CrewAI.
- Experience with ML infrastructure and tooling including PyTorch, MLflow, Airflow, Docker, and AWS.
- Exposure to LLM Ops, including infrastructure optimization, observability, and latency and cost monitoring.
Technical Stack
- Languages & Core Libraries: Python, NumPy, pandas, scikit-learn, PyTorch
- ML Ops & Infrastructure: MLflow, Airflow, Docker, AWS
- Agentic AI Frameworks: LangGraph, LangChain, CrewAI
Team & Environment
You will join a small, high-impact team driving the future of AI at Sumo Logic.
Sumo Logic is an equal opportunity employer.






