Colombia Remote (Country)

Concentrix is hiring a Lead Site Reliability Engineer

This role is central to ensuring the reliability, scalability, and security of our cloud-native platforms. The Lead Site Reliability Engineer will drive the integration of DevSecOps practices across the software development lifecycle, ensuring that systems are resilient, observable, and compliant. You will lead automation initiatives, mentor engineering teams, and work closely with stakeholders to align SRE strategy with business goals. This position requires a strong foundation in infrastructure as code, container orchestration, and incident management, along with the ability to lead technical transformation in a fast-paced, distributed environment.

Responsibilities

  • Collaborate with technology leaders to develop and implement SRE strategies that support digital transformation initiatives.
  • Design and optimize automated CI/CD pipelines using platforms such as GitHub Actions, Jenkins, or ArgoCD.
  • Automate the build, testing, and deployment of microservices across Kubernetes and multi-cloud environments.
  • Integrate security into all stages of the development pipeline through DevSecOps practices.
  • Implement automated vulnerability scanning, secrets management, and policy enforcement using tools like Snyk, HashiCorp Vault, and OPA.
  • Build resilient, fault-tolerant systems using cloud-native patterns including auto-scaling, self-healing, and blue/green deployments.
  • Use Kubernetes, service meshes, and distributed tracing to ensure system performance and uptime.
  • Deploy and manage observability solutions with Prometheus, Grafana, ELK, and OpenTelemetry for monitoring and alerting.
  • Define service level objectives and indicators, configure intelligent alerts, and lead incident response and postmortem analyses.
  • Manage infrastructure as code using Terraform and cloud-specific tools to provision and maintain cloud resources.
  • Ensure version-controlled infrastructure and enforce strict change management protocols.
  • Support compliance with regulatory standards such as SOC 2, HIPAA, and ISO 27001 through governance and audit automation.
  • Enable continuous compliance by automating audit trails and policy checks across environments.
  • Work closely with development, QA, and platform teams to integrate reliability and security into the software lifecycle.
  • Promote cloud-native best practices and scalable architectural patterns across engineering teams.
  • Mentor junior engineers, lead technical reviews, and foster a culture of automation, ownership, and continuous improvement.
  • Continuously optimize systems by identifying performance issues, reducing manual effort through automation, and evolving infrastructure to support innovation.

Requirements

  • Demonstrated experience with major cloud platforms such as AWS, Azure, or GCP, including services like EC2, S3, IAM, Lambda, AKS, or GKE.
  • Minimum of 3 to 5 years of relevant professional experience in site reliability or systems engineering.
  • Proficient in containerization and orchestration technologies including Docker, Kubernetes, Helm, LGTM, and Harbor.
  • Hands-on experience with CI/CD tools such as GitHub Actions, GitLab CI, or Azure DevOps.
  • Skilled in infrastructure-as-code tools like Terraform, Pulumi, or CloudFormation.
  • Familiarity with monitoring and observability stacks including Prometheus, Grafana, ELK, and OpenTelemetry.
  • Knowledge of DevSecOps tools and methodologies, including Vault, Snyk, OPA, and CIS benchmarks.
  • Strong programming skills in languages such as Python, Go, or Bash.
  • Proficient with Git and GitOps workflows in agile development environments.

Tech Stack

GitHub Actions, Jenkins, ArgoCD, Snyk, HashiCorp Vault, OPA, Kubernetes, service meshes, Prometheus, Grafana, ELK, OpenTelemetry, Terraform, Pulumi, CloudFormation, Docker, Helm, LGTM, Harbor, Git, GitOps, AWS, Azure, GCP, EC2

Work Arrangement

Remote work within Colombia

Additional Information

  • This position requires occasional on-call availability to support critical system incidents.
  • Candidates must be authorized to work in Colombia without sponsorship.
  • We offer flexible working hours with a strong emphasis on work-life balance.
  • The role involves collaboration with global teams across multiple time zones.
  • Professional development and certification reimbursement are available.

Required Skills

GitHub Actions, Jenkins, ArgoCD, Snyk, HashiCorp Vault, OPA, Kubernetes, service meshes, Prometheus, Grafana, ELK, OpenTelemetry, Terraform, Pulumi, CloudFormation

About company
Concentrix
Concentrix is an international company with a presence in more than 70 countries and a leader in improving customer experience and optimizing business processes.
Job Details
Department: Information Technology
Category: infrastructure
Posted: 5 months ago