Lisbon; Belgrade · Remote (Global) · Full-time

Aghanim is hiring a Senior/Principal DevOps Engineer

Responsibilities

  • Manage and advance production systems hosted on Google Cloud Platform and Cloudflare
  • Guarantee system uptime, responsiveness, and dependability according to defined service level objectives
  • Develop and maintain scalable architectures prepared for sudden traffic surges and heavy workloads
  • Lead planning for system capacity, auto-scaling strategies, elimination of performance bottlenecks, and fault tolerance enhancements
  • Implement and manage infrastructure as code using Terraform, supplemented by Terragrunt when needed
  • Oversee Kubernetes operations on Google Kubernetes Engine, including version updates, scaling, and security hardening
  • Maintain internal tooling, deployment pipelines, and operational best practices
  • Enhance system observability via Datadog with comprehensive monitoring, log management, and application performance tracking
  • Direct incident response efforts, conduct root cause investigations, and implement lasting improvements to system reliability
  • Minimize alert fatigue and refine detection mechanisms for critical system failures
  • Administer core security tools and support cross-team vulnerability remediation
  • Strengthen deployment stability and streamline developer workflows using automated CI/CD pipelines and platform controls
  • Own infrastructure cost tracking, drive cost reductions, and support FinOps initiatives

Benefits

  • Collaborate with seasoned engineers worldwide who have delivered platforms used by millions
  • Join a rapidly scaling organization where concepts become live features within days
  • Exercise independent judgment with full ownership over technical outcomes
  • Integrate AI-driven solutions and modern automation tools into daily workflows
  • Receive equity to share in the company's long-term growth and success

Compensation

Equity offered as part of compensation

Work Arrangement

Remote (Worldwide) — Los Angeles, New York, Seoul, Beijing, London, Lisbon, Belgrade, and other global locations

Team

High autonomy and ownership

Other

  • On-call responsibilities implied through incident management duties

About the company

Aghanim helps game developers achieve financial and creative independence by providing the solutions they need to launch, run, and grow their businesses.

Job Details

  • Department: Engineering
  • Category: Infrastructure
  • Posted: 15 days ago