Responsibilities
- Keep production systems healthy, stable, and highly available across multiple regions.
- Build and maintain infrastructure automation using Infrastructure as Code and Ansible-based configuration management; improve delivery, deployment, and maintenance processes for our managed services and tooling.
- Participate in incident response and the on-call rotation; troubleshoot issues end-to-end and drive fixes to resolution.
- Partner with global teams to plan and execute platform changes.
- Contribute to automation initiatives (self-service workflows, safer changes, reduced manual tasks).
Requirements
- 2+ years as a System Administrator (or 1+ year in a DevOps/SRE role).
- Strong Linux administration and production troubleshooting.
- Scripting skills in Bash, Go, or Python.
- Automation/IaC: Ansible (hands-on experience building reusable playbooks/roles).
- Virtualization: VMware and/or KVM experience.
- Networking fundamentals: L3/L4, DNS, DHCP, NAT, routing, load balancers, VPN/IPsec, firewalls; hands-on troubleshooting with tools like tcpdump, traceroute, iperf.
- Web/edge basics: reverse proxies/web servers (e.g., NGINX).
Nice to Have
- Observability & logs: Elastic/ELK, Prometheus, Grafana, Alertmanager.
- Cloud basics: Azure / AWS / GCP.
- Databases/queues: PostgreSQL/MySQL, Redis, RabbitMQ/Kafka.
- Security basics: hardening, secrets management, access management, RBAC/least privilege.
- CI/CD tools: Jenkins, GitLab CI, GitHub Actions.
- Interest or experience in AI for operations / automation.
Work Arrangement
Remote (Worldwide)
Team
Structure: fully remote, globally distributed team
Additional Information
- Please submit your resume and application in English.
- All employment offers are contingent upon successful completion of applicable criminal, education, and identity background checks.


