London, United Kingdom | Remote (Global) | Employment

Kraken is hiring a Senior AI Compute Infrastructure Engineer

About the Role

The role involves building and optimizing the infrastructure that powers advanced AI training and inference pipelines, working closely with research and engineering teams to deliver scalable solutions.

Responsibilities

  • Design scalable systems for AI model training and deployment
  • Optimize compute resource utilization across distributed environments
  • Collaborate with machine learning teams to understand infrastructure needs
  • Improve system reliability and fault tolerance
  • Develop automation tools for provisioning and monitoring infrastructure
  • Troubleshoot performance bottlenecks in compute clusters
  • Implement efficient data pipelines for AI workflows
  • Ensure security and compliance across infrastructure layers
  • Evaluate new hardware and software technologies for AI workloads
  • Maintain documentation for systems and processes
  • Support deployment of large language models in production
  • Drive initiatives to reduce operational costs
  • Monitor cluster health and respond to incidents
  • Integrate feedback from research teams into infrastructure design
  • Scale systems to accommodate growing model sizes
  • Work with containerization and orchestration platforms
  • Ensure low-latency communication between compute nodes
  • Optimize storage solutions for high-throughput access
  • Manage GPU resource allocation and scheduling
  • Contribute to capacity planning and forecasting

Nice to Have

  • Experience with large-scale transformer model training
  • Background in computer science or a related technical field
  • Prior work with AI research teams
  • Familiarity with Slurm or similar workload managers
  • Knowledge of RDMA or high-speed interconnects
  • Experience with hardware provisioning at scale
  • Contributions to open-source infrastructure projects
  • Understanding of energy-efficient computing
  • Exposure to formal incident response procedures
  • Experience mentoring junior engineers

Compensation

Competitive salary with equity and benefits package

Work Arrangement

Hybrid work model with flexibility for remote or on-site

Team

Collaborative engineering team focused on AI infrastructure scalability

About the AI Infrastructure Team

  • The team operates at the intersection of machine learning and systems engineering, building platforms that enable rapid experimentation and deployment of AI models.
  • Focus areas include cluster management, resource scheduling, and performance optimization for GPU-intensive workloads.

Impact

  • Engineers directly influence the speed and efficiency of AI development by enabling faster training cycles and reliable inference systems.
  • Work contributes to reducing time-to-market for new AI capabilities.

Available for qualified candidates

About the Company
Kraken
Kraken is a cryptocurrency exchange building premium crypto products for experienced traders, institutions, and newcomers. The company is committed to industry-leading security, crypto education, and world-class client support through products like Kraken Pro, Desktop, Wallet, and Kraken Futures.
Job Details
Department: Engineering, AI & Machine Learning
Category: Infrastructure
Posted: a month ago