India Remote (Global) Employment

Concentrix is hiring a Lead Site Reliability Engineer

About the Role

Concentrix is hiring a Lead Site Reliability Engineer to shape and scale our DevSecOps ecosystem. In this hands-on leadership role, you will own the reliability of production systems, lead the design of automated pipelines, and champion SRE principles across the software delivery lifecycle.

What You'll Do

  • Define, implement, and own Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Error Budgets across critical services.
  • Use error budget policies to drive data-informed conversations on release velocity vs. reliability trade-offs.
  • Conduct capacity planning and proactive risk assessments to prevent incidents.
  • Lead incident response as incident commander, coordinating teams and driving resolution.
  • Facilitate blameless postmortems and ensure action items are tracked and resolved.
  • Develop and improve runbooks, escalation paths, and on-call practices to reduce MTTD and MTTR.
  • Design and maintain observability strategies using modern tooling (such as Prometheus, Grafana, OpenTelemetry, and ELK).
  • Define intelligent, actionable alerting to minimize alert fatigue.
  • Drive adoption of distributed tracing and structured logging across services.
  • Identify and measure toil, and lead initiatives to eliminate it through automation.
  • Build internal tooling and self-service capabilities to improve developer productivity.
  • Collaborate on cloud-native patterns for fault tolerance, auto-scaling, and disaster recovery.
  • Provide SRE input into CI/CD pipelines and deployment strategies (canary, blue/green).
  • Manage infrastructure using IaC practices with a focus on reliability.
  • Mentor and grow junior SREs, fostering a culture of ownership and continuous improvement.
  • Act as an SRE advocate across engineering, embedding reliability into the development lifecycle.
  • Partner with stakeholders to align SRE strategy with organizational goals.
  • Conduct regular 1:1s with direct reports and participate in team rituals.
  • Embed AI tools and practices into how we build and run our platform.
  • Support engagement and solutioning for AI-powered offerings.
  • Collaborate with cross-functional partners to ensure AI is delivered safely and effectively.

What We're Looking For

  • 7+ years of experience in SRE, platform engineering, or a related discipline.
  • Proven experience defining and managing SLOs, SLIs, and error budgets in a production environment.
  • Strong incident management experience, including leading postmortems and driving reliability improvements.
  • Hands-on experience with observability tooling (Prometheus, Grafana, OpenTelemetry, or similar).
  • Solid understanding of cloud platforms (AWS, Azure, or GCP) and containerized environments (Kubernetes).
  • Proficiency in at least one scripting or programming language (Python, Go, or Bash).

Nice to Have

  • Experience with chaos engineering tools (e.g., Chaos Monkey, Gremlin, LitmusChaos).
  • Familiarity with IaC tooling such as Terraform or Pulumi.
  • Knowledge of DevSecOps practices and security tooling.
  • Experience with GitOps workflows and CI/CD pipelines.
  • Bilingual proficiency (English & Spanish).

Technical Stack

  • Observability: Prometheus, Grafana, OpenTelemetry, ELK
  • Cloud Platforms: AWS, Azure, GCP
  • Infrastructure: Kubernetes
  • Languages: Python, Go, Bash
  • Infrastructure as Code: Terraform, Pulumi

Work Mode

This is a global, work-at-home position based in India.

Required Skills
Prometheus, Grafana, OpenTelemetry, ELK, AWS, Azure, GCP, Kubernetes, Python, Go, SLO/SLI, Incident Management, Platform Engineering
About the Company
Concentrix

A global technology and services leader that powers the brands of the future, helping well-known brands improve their businesses with technology and integrated solutions in over 70 countries.

Job Details
Department: Information Technology
Category: Infrastructure
Posted: 14 days ago