This role is central to advancing the reliability, scalability, and automation of our cloud-native infrastructure. As a Staff Site Reliability Engineer, you will lead high-impact initiatives across CI/CD pipeline optimization, Infrastructure as Code implementation, and system resilience. You will collaborate closely with engineering teams to enforce best practices, reduce toil through automation, and ensure consistent, secure, and repeatable deployments. In addition to technical leadership, you will mentor junior engineers, contribute to cross-team architecture discussions, and serve as an escalation point during critical incidents. This position plays a key role in shaping the future of our DevOps practices and driving operational excellence across a globally distributed engineering organization.
Responsibilities
- Lead the design and improvement of CI/CD pipelines to enhance deployment efficiency and reliability.
- Develop and manage Infrastructure as Code scripts to automate provisioning and infrastructure lifecycle management.
- Identify opportunities to automate manual processes and increase operational efficiency.
- Enforce CI/CD and IaC best practices to ensure consistency, repeatability, and compliance across systems.
- Maintain system resilience by preventing unapproved or undocumented changes to pipelines.
- Demonstrate consistent reliability and diligence as a role model for engineering teams.
- Deliver high-impact technical solutions recognized across teams and departments.
- Produce clear post-mortem reports for both internal and external stakeholders following incidents.
- Provide mentorship and actionable feedback to engineers across multiple teams.
- Review code and pull requests with a focus on strengthening CI/CD and automation standards.
- Act as a technical consultant for engineers in other squads on reliability and automation topics.
- Tackle complex, ambiguous problems effectively, especially under time pressure.
- Participate in on-call rotations to support system stability and incident response.
- Stay current with emerging trends and tools in CI/CD, automation, and cloud infrastructure.
- Lead Proof of Concept initiatives to evaluate and integrate new technologies into production workflows.
Requirements
- Intermediate-to-Advanced English proficiency (B2 level).
- Must be located in Brazil.
- Skilled in CI/CD platforms such as Argo and Codefresh.
- Expertise in Infrastructure as Code tools including Terraform.
- Solid understanding of Docker and Kubernetes for containerization and orchestration.
- Proficient with monitoring and observability solutions like Grafana, Grafana Loki, Honeycomb, OpenTelemetry, and Prometheus.
- Hands-on experience in cloud environments.
- Proven experience automating deployments and integrating CI/CD pipelines.
- Strong problem-solving abilities, particularly in high-pressure situations.
- Experience mentoring engineers and providing constructive feedback.
- Effective communication and collaboration skills across distributed teams.
Nice to Have
- Familiarity with service mesh technologies such as Istio.
- Programming experience in Go (Golang), Java, or Groovy.
Tech Stack
Argo, Codefresh, Terraform, Docker, Kubernetes, Grafana, Grafana Loki, Honeycomb, OpenTelemetry, Prometheus, Istio, Golang, Java, Groovy
Work Arrangement
Remote position based in Brazil. You may be required to attend a Visa office, with advance notice.
Team
Part of a DevOps squad focused on platform resilience and automation, within a large engineering organization of more than 500 employees in over 10 countries.
Additional Information
- This is a remote role.
- Remote employees may be required to report to a Visa office with scheduled notice.
- Applicants must be based in Brazil.
- Intermediate-to-Advanced English (B2 level) is required.


