Red Hat is seeking a Principal Software Engineer to work on advanced AI/ML applications and agent systems, leveraging modern inference platforms to build production-ready prototypes. This role involves deep technical contributions to open source communities and providing leadership across engineering teams in a globally distributed environment.
What You'll Do
- Build high-quality, high-performing AI/ML applications and agent systems using modern inference platforms for multi-modal and distributed model serving
- Apply and optimize inference techniques including KV cache management, model quantization, and distributed serving to production workloads
- Contribute to upstream inference runtime communities such as vLLM, TGI, PyTorch, OpenVINO, and related projects
- Build multi-modal AI applications integrating vision, language, and other modalities
- Provide technical leadership and coordination across multiple stakeholders and engineering teams
- Apply a growth mindset by staying current with rapid advancements in AI/ML inference technologies
- Benchmark and analyze inference performance at scale, driving data-driven optimization decisions
- Share innovations through blog posts, conference presentations, and other technical venues
What We're Looking For
- Bachelor's degree in Computer Science or Engineering, or equivalent experience
- 5+ years of experience in AI/ML engineering with focus on production inference systems
- Deep expertise in PyTorch and modern deep learning frameworks
- Hands-on experience with inference runtime optimization (model serving, batching, KV cache management)
- Advanced programming skills in Python and C++
- Proven ability to contribute to and lead open source projects
- Strong self-motivation and organizational skills
- Ability to work concurrently on multiple projects, both independently and as part of a team
- Excellent English written and verbal communication skills
- Collaborative attitude and willingness to share ideas openly
Nice to Have
- Experience with vLLM, TGI (Text Generation Inference), or similar inference runtimes
- Contributions to PyTorch, OpenVINO, or other inference frameworks
- Experience with distributed model serving and GPU optimization
- Familiarity with Kubernetes and cloud-native AI/ML deployments
- Knowledge of model quantization techniques (GPTQ, AWQ, FP8, etc.)
- Experience with CUDA, Triton, or other GPU programming frameworks
- Experience with diffusion models and diffusion transformers
- Experience building AI agents and agentic systems
Technical Stack
- vLLM
- TGI
- PyTorch
- OpenVINO
- Python
- C++
- Kubernetes
- CUDA
- Triton
- Model quantization (GPTQ, AWQ, FP8)
- Distributed model serving
- GPU optimization
- Cloud-native AI/ML deployments
- Diffusion models
- Diffusion transformers
- Multi-modal AI
Benefits & Compensation
- Flexible work environments (in-office, office-flex, fully remote depending on role)
- Opportunity to work across 40+ countries
- Inclusive and open culture based on open source principles
- Encouragement to bring your best ideas regardless of title or tenure
- Support for individuals with disabilities including reasonable accommodations
- Equal opportunity and affirmative action employment policy
Work Mode
Work environments vary by role: in-office, office-flex, or fully remote. This position is based in Ireland.
Red Hat is proud to be an equal opportunity workplace and an affirmative action employer. We review applications for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, ancestry, citizenship, age, veteran status, genetic information, physical or mental disability, medical condition, marital status, or any other basis prohibited by law.
