What You'll Do
Design and manage automated CI/CD pipelines using GitHub Actions to support fast, reliable software delivery. Apply Git best practices across branching, pull requests, and repository workflows to ensure code quality and collaboration efficiency. Build and maintain backend services and internal tools primarily with TypeScript and Python, supporting engineering teams across the organization.
Containerize applications using Docker, applying container optimization techniques for performance and security. Deploy and manage workloads in Kubernetes, handling scaling, networking, and troubleshooting in production environments. Configure and maintain cloud infrastructure on Microsoft Azure, ensuring reliability and operational excellence.
Implement comprehensive observability by integrating logs, metrics, and traces across systems. Use Datadog and Splunk to centralize log data, create monitoring dashboards, and set up proactive alerting. Install and manage Datadog agents and integrations to ensure full visibility into system health.
Respond to incidents with timely triage and root cause analysis. Continuously refine deployment processes, system performance, and operational workflows. Collaborate across technical and non-technical teams to deliver stable, observable systems. Maintain clear, up-to-date documentation as systems evolve.
Requirements
- Minimum of 3 years of hands-on experience in software engineering or systems development
- Proven expertise in CI/CD pipelines, particularly with GitHub Actions
- Strong command of Git workflows, including branching strategies and pull request management
- Development experience using TypeScript or Python
- Practical knowledge of Docker and container optimization techniques
- Working experience with Kubernetes, including deployment, service configuration, and issue resolution
- Familiarity with Microsoft Azure services and infrastructure management
- Experience setting up observability with Datadog and Splunk
- Understanding of log aggregation, monitoring strategies, alerting systems, and dashboard creation
- Proficiency with command-line tools and debugging in production environments
- Ability to document system changes and keep technical records current
- Demonstrated ability to work independently and solve problems proactively
- Strong written and verbal communication skills, especially during high-pressure situations
- Experience collaborating asynchronously through code reviews, issue tracking, and team messaging platforms
Benefits
This role is fully remote, offering flexibility in where you work. You’ll join a culture that values inclusion, diverse perspectives, and personal growth. The environment supports professional development and empowers individuals to contribute meaningfully. As part of a globally distributed team, you’ll help shape systems that are reliable, scalable, and aligned with real-world needs.