Responsibilities
- Develop and manage containerized systems using Docker and Kubernetes.
- Deploy and oversee monitoring, logging, and observability tools to maintain system performance and reliability.
- Maintain and enhance CI/CD pipelines for automated cloud application deployments.
- Strengthen infrastructure and application security, reliability, and operational standards.
- Apply distributed tracing and monitoring techniques to improve visibility into system behavior.
- Work with engineering teams to improve deployment efficiency, resolve issues, and enhance performance.
- Support initiatives focused on platform evolution, automation, and infrastructure resilience.
Requirements
- Professional background in DevOps, site reliability engineering, or platform development.
- Proven experience with Docker and Kubernetes in production environments.
- Track record of implementing monitoring, logging, and observability solutions.
- Familiarity with distributed tracing and performance analysis in complex systems.
- Understanding of security practices for cloud and containerized platforms.
- Experience configuring and supporting automated deployment pipelines.
- Strong ability to diagnose and resolve technical and operational challenges.
Nice to Have
- Exposure to public cloud platforms such as Azure or Google Cloud Platform.


