Nebius is hiring a Senior ML Solutions Architect - AI Studio

About the Role

Nebius is looking for a Senior ML Solutions Architect to support customers leveraging our AI Studio's serverless inference platform for open-source LLMs across multiple modalities. In this role, you will collaborate with clients to design and implement customized LLM-based solutions, architect scalable AI applications, and work with our backend team to improve the platform. We are leading a new era in cloud computing to serve the global AI economy.

What You'll Do

  • Design and implement LLM-based solutions using Nebius AI Studio’s inference services to drive business value and support customer goals.
  • Build production-ready applications leveraging our serverless LLM APIs, including multimodal models (text, vision, audio) and domain-specific models.
  • Provide technical expertise in prompt engineering, RAG architectures, model selection, and inference optimization.
  • Collaborate with product and engineering teams to surface customer feedback and shape the platform roadmap.
  • Guide customers in scaling from POC to production with a focus on performance, reliability, and cost efficiency.

What We're Looking For

  • 5+ years of experience in ML/AI systems, with at least 2 years focused on LLMs and generative AI.
  • Deep knowledge of the LLM ecosystem, including model architectures and fine-tuning approaches.
  • Hands-on experience with prompt engineering and LLM pipeline development, including evaluation.
  • Hands-on experience with agentic frameworks and tooling such as LangChain, LangSmith, smolagents, or equivalent.
  • Hands-on experience with vector databases and RAG implementation patterns.
  • Hands-on experience deploying LLM-powered applications using APIs from OpenAI, Anthropic, or open-source models.
  • Strong Python programming skills.
  • Excellent communication skills, with the ability to clearly explain technical concepts to diverse audiences.

Nice to Have

  • Experience with inference frameworks and libraries (e.g., vLLM, SGLang, TensorRT-LLM, Transformers).
  • Familiarity with inference optimization techniques such as quantization, batching, caching, and routing.
  • Experience working with multimodal AI models (e.g., vision-language, speech).
  • Proficiency with DevOps tools (Docker, Kubernetes).
  • Contributions to open-source ML/AI projects.

Technical Stack

  • Python, vLLM, SGLang, TensorRT-LLM, Transformers, OpenAI/Anthropic SDKs, LangChain, LangSmith, smolagents, FastAPI, Flask, Kubernetes (K8s), Docker, Git
  • AWS (SageMaker, Bedrock), GCP (Vertex AI), Azure (Azure ML)

Team & Environment

You will join a company of over 800 employees, which includes more than 400 highly skilled engineers. We offer a dynamic and collaborative work environment that values initiative and innovation.

Benefits & Compensation

  • Compensation: $215k - $275k OTE (On-Target Earnings) + equity based on your experience, skills, and location.
  • 100% company-paid medical, dental, and vision coverage for employees and families.
  • 401(k) Plan: Up to 4% company match with immediate vesting.
  • Parental Leave: 20 weeks paid for primary caregivers, 12 weeks for secondary caregivers.
  • Remote Work Reimbursement: Up to $85/month for mobile and internet.
  • Company-paid short-term, long-term, and life insurance coverage.

Work Mode

This role offers a hybrid work mode and is located in the United States.

Nebius is an equal opportunity employer.

Required Skills

Python, vLLM, SGLang, TensorRT-LLM, Transformers, OpenAI SDK, Anthropic SDK, LangChain, FastAPI, Machine Learning, LLM, Solution Architecture, Cloud Platforms, API Development, Distributed Systems
About company
Nebius

Nebius is leading a new era in cloud computing to serve the global AI economy. It creates tools and resources for customers to solve real-world challenges without massive infrastructure costs.

Job Details
Category: Data
Posted 8 months ago