About the Role
In this role, you will maintain and improve production systems, automate operations, and work with engineering teams to keep the platform stable and efficient.
Responsibilities
- Manage and maintain production environments to ensure high availability and performance
- Diagnose and resolve infrastructure and application issues promptly
- Implement and maintain monitoring and alerting systems
- Support deployment pipelines and continuous integration workflows
- Automate routine operational tasks to improve efficiency
- Ensure systems comply with security and regulatory standards
- Collaborate with development teams to troubleshoot production problems
- Maintain documentation for systems and operational procedures
- Respond to incidents and participate in on-call rotations
- Optimize cloud resource usage and associated costs
- Enforce configuration management across environments
- Support identity and access management systems
- Plan and execute disaster recovery procedures
- Improve system reliability through proactive maintenance
- Assist in onboarding new services to production
- Evaluate and integrate new operational tools
- Contribute to post-incident reviews and action follow-ups
- Ensure logging systems are functional and centralized
- Participate in capacity planning activities
- Maintain network and firewall configurations
- Support containerized workloads and orchestration platforms
- Implement and manage secrets management solutions
- Collaborate on system architecture improvements
- Drive adoption of observability best practices
- Assist in technical audits and compliance checks
Compensation
Competitive salary based on experience and qualifications
Work Arrangement
Hybrid work model with flexibility for remote and office presence
Team
Collaborative engineering environment focused on operational excellence and innovation
Why Join Us
- You’ll work in a forward-thinking environment that values technical expertise and continuous improvement.
- The team prioritizes sustainable practices and leverages modern technologies to drive impact.
- You’ll have opportunities to grow professionally and to influence technical direction.
- You’ll contribute to systems that support a growing customer base with real-world impact.
- A culture of transparency, accountability, and teamwork defines how we operate.
Technology Stack
- Primary infrastructure hosted on AWS
- Container orchestration using Kubernetes
- Infrastructure managed via Terraform
- Monitoring stack includes Prometheus, Grafana, and Alertmanager
- Centralized logging through ELK or a similar stack
- CI/CD pipelines powered by GitLab CI or equivalent
- Configuration managed with Ansible
- Applications deployed in a microservices architecture
- Security enforced through automated compliance checks
- Secrets managed using HashiCorp Vault or equivalent
Visa Sponsorship
Visa sponsorship is available for qualified international candidates