Remote (Global)

FAR.AI is hiring an Infrastructure Engineer

About the Role

The role involves building and managing core infrastructure systems that power AI research and deployment. Engineers will work closely with research and product teams to deliver robust, automated, and efficient platforms.

Responsibilities

  • Design and deploy scalable cloud infrastructure
  • Automate provisioning and configuration management
  • Ensure high availability and fault tolerance
  • Monitor system performance and optimize resource usage
  • Implement security best practices across environments
  • Support deployment pipelines for machine learning models
  • Troubleshoot production issues across distributed systems
  • Maintain documentation for infrastructure components
  • Collaborate with engineering teams on system design
  • Evaluate and integrate new infrastructure technologies
  • Manage container orchestration platforms
  • Enforce compliance with data protection standards
  • Develop backup and disaster recovery procedures
  • Optimize costs for cloud resource consumption
  • Integrate observability tools for system insights
  • Support on-call rotations for critical systems
  • Ensure infrastructure aligns with research workflows
  • Implement access controls and identity management
  • Work with distributed storage solutions
  • Contribute to incident response protocols
  • Improve deployment velocity through tooling
  • Maintain network architecture for low-latency communication
  • Support GPU-accelerated computing environments
  • Drive migration from legacy to modern infrastructure
  • Participate in code and design reviews

Nice to Have

  • Experience supporting AI or machine learning workloads
  • Background in high-performance computing
  • Familiarity with large-scale data pipelines
  • Knowledge of GPU cluster management
  • Experience with low-latency networking
  • Contributions to open-source infrastructure projects
  • Advanced degree in computer science or a related field
  • Certifications in cloud or systems engineering

Compensation

Competitive salary with equity and benefits

Work Arrangement

Remote with flexible hours

Team

Small, fast-moving team focused on AI infrastructure

Why This Role Matters

The infrastructure built by this role directly enables faster experimentation and deployment of AI models, accelerating research progress and product development.

Technology Stack

The team uses Kubernetes, Terraform, Prometheus, Python, and cloud-native services to manage scalable, secure, and observable systems.

Available for qualified candidates

Required Skills
Kubernetes, Cloud Infrastructure, Networking, DevOps
About the Company
FAR.AI
A non-profit AI research institute focused on ensuring advanced AI is safe and beneficial, conducting research on AI safety, risks, and potential solutions.
Job Details
Category: Infrastructure
Posted: 7 months ago