Remote SRE Jobs: Vanguard’s Cloud Transformation

Vanguard’s shift to cloud-native architecture and site reliability engineering is reshaping tech roles in financial services. Discover how this transformation is fueling demand for remote SRE jobs and what it means for DevOps career growth in 2026.

Apr 19, 2026

As Vanguard advances its cloud-native journey, remote SRE roles are becoming central to its infrastructure reliability.

Remote SRE Jobs: A Growing Trend in Financial Services

As enterprises modernize infrastructure, remote SRE jobs are becoming increasingly common, especially in large financial institutions like Vanguard. With $8 trillion in assets under management, Vanguard has quietly evolved into a tech-first organization. Of its 17,000 employees, 7,000 work in technology, a clear signal that engineering is central to its mission. This shift has created new pathways for remote cloud engineering jobs at Vanguard, particularly in site reliability engineering.

Just seven years ago, Vanguard hosted all services in a private data center using monolithic applications. Deployments were quarterly, observability was minimal, and developers had to file tickets to get alerting changed. Dev and Ops were siloed. Today, the company runs a majority of its workloads in the public cloud and has embraced DevOps, observability, and SRE practices that mirror leading tech firms, all with a distributed workforce in mind.

From Monoliths to Microservices: Vanguard’s Cloud Journey

Vanguard’s transformation began with a move to an internal private cloud platform-as-a-service. This allowed engineering teams to carve out microservices from legacy monoliths gradually. While this improved deployment frequency and enabled automated testing, it introduced complexity.

"But it left us with an unnecessary abstraction layer…overcomplicating the environment and causing more problems than it solved." — Christina Yakomin, Senior SRE Coach at Vanguard

The lesson was clear: incremental modernization requires careful evaluation of abstractions. Vanguard eventually migrated its platform to the public cloud using a combination of Amazon ECS, AWS Lambda, and Amazon EKS. This shift eliminated the need for in-house PaaS maintenance and unlocked cost and scalability benefits — particularly for infrequently accessed microservices.

The move also enabled new patterns of ownership. Product teams, once isolated from operations, now take accountability for testing code and configurations. This cultural shift laid the foundation for remote SRE jobs where engineers are embedded in product squads but operate across time zones.

Chaos Engineering and Resilience at Scale

With distributed systems comes unpredictability. To combat this, Vanguard adopted failure modes and effects analysis (FMEA), a practice rooted in military engineering. Teams now proactively analyze how systems might fail and what the downstream impacts could be.
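One common way to rank the output of an FMEA session is the risk priority number (RPN): the product of a severity, an occurrence, and a detection score. The sketch below assumes 1-10 scales and uses hypothetical failure modes; the source does not say which scoring scheme, if any, Vanguard applies.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class FailureMode:
    """One row of an FMEA worksheet (all example values are hypothetical)."""
    component: str
    failure: str
    severity: int    # 1 = negligible impact, 10 = catastrophic
    occurrence: int  # 1 = rare, 10 = frequent
    detection: int   # 1 = caught immediately, 10 = likely to go unnoticed

    def __post_init__(self) -> None:
        # Reject scores outside the assumed 1-10 scale.
        for name in ("severity", "occurrence", "detection"):
            score = getattr(self, name)
            if not 1 <= score <= 10:
                raise ValueError(f"{name} must be between 1 and 10, got {score}")

    @property
    def rpn(self) -> int:
        """Risk priority number: severity x occurrence x detection."""
        return self.severity * self.occurrence * self.detection


def rank(modes: Iterable[FailureMode]) -> List[FailureMode]:
    """Return failure modes ordered from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


# Hypothetical worksheet entries, for illustration only.
modes = [
    FailureMode("auth service", "token cache unavailable", 8, 3, 4),
    FailureMode("order queue", "consumer lag grows unbounded", 6, 5, 7),
    FailureMode("static assets", "CDN origin timeout", 3, 4, 2),
]

for mode in rank(modes):
    print(f"{mode.rpn:4d}  {mode.component}: {mode.failure}")
```

A ranking like this gives a team a starting order for which failures to rehearse first; the scores themselves are judgment calls and are usually revisited after each exercise.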

This thinking evolved into structured chaos engineering practices:

  • Chaos game days: Teams simulate crashes to validate self-healing and scaling.
  • Chaos fire drills: Unplanned failure injection to test observability tools like Honeycomb.
  • CI/CD break testing: Dummy builds run during off-hours to mimic peak load and uncover pipeline instability.

These rituals are now part of on-call training. Recordings from chaos fire drills are used to onboard new engineers — a practice that supports remote onboarding and asynchronous learning.

SRE at Vanguard: Embedded, Coached, and Evolving

Vanguard’s SRE model blends embedded and centralized support. Some product teams include both Application Engineers and Site Reliability Engineers. Others rely on a centralized SRE coaching team — of which Christina Yakomin is a part — to guide best practices.

The role of SRE at Vanguard has expanded beyond uptime. Engineers now focus on availability, resilience, and non-functional requirements, all while aligning with security controls. Shared on-call responsibilities ensure accountability.

The company has moved from binary “up/down” monitoring to a data-driven approach using SLIs, SLOs, and error budgets. This allows teams to balance feature velocity with reliability. A high-priority feature might run at 99% availability, while critical systems target 99.9%.
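To make the arithmetic behind those two targets concrete, here is a minimal sketch of how an availability SLO translates into an error budget. The 30-day window and the function names are assumptions for illustration, not details from Vanguard.

```python
from datetime import timedelta

# Assumed rolling window; teams often use 28 or 30 days.
DEFAULT_WINDOW = timedelta(days=30)


def error_budget(slo_percent: float, window: timedelta = DEFAULT_WINDOW) -> timedelta:
    """Downtime an availability SLO permits over the window."""
    if not 0 < slo_percent <= 100:
        raise ValueError("slo_percent must be in (0, 100]")
    return window * (1 - slo_percent / 100)


def budget_remaining(
    slo_percent: float,
    downtime: timedelta,
    window: timedelta = DEFAULT_WINDOW,
) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    budget = error_budget(slo_percent, window)
    if budget == timedelta(0):
        # A 100% target leaves no budget to spend.
        return 0.0
    return 1 - downtime / budget


print(error_budget(99.0))   # roughly 7.2 hours over 30 days
print(error_budget(99.9))   # roughly 43 minutes over 30 days
print(budget_remaining(99.9, timedelta(minutes=10)))
```

The tenfold gap between the two budgets is the trade-off in practice: a 99% service can absorb several hours of failed releases a month, while a 99.9% service has well under an hour before feature work should yield to reliability work.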

DevOps engineers share responsibility for managing alerts and meeting SLOs. But this autonomy came with challenges.

"As we saw this grow in scale, we saw some really positive outcomes and some unexpected consequences" — Christina Yakomin, Senior SRE Coach at Vanguard

Teams initially over-instrumented, creating dashboard clutter and alert fatigue. Yakomin noted:

"just because you can do everything in your log aggregation tool, doesn’t mean you should" — Christina Yakomin, Senior SRE Coach at Vanguard

This insight led to a strategic pivot: standardizing on OpenTelemetry.

Observability, OpenTelemetry, and the Future of Remote SRE Jobs

To avoid vendor lock-in and unify observability, Vanguard standardized on OpenTelemetry (OTel). This decision allows teams to extract common fields using centrally maintained libraries, regardless of backend tooling.
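The idea of a centrally maintained library that stamps the same fields on every event, whatever backend receives it, can be sketched in plain Python. This example deliberately does not use the OpenTelemetry SDK; the Telemetry class, its methods, and the "team" field are hypothetical, and the dotted keys are only loosely modeled on OpenTelemetry's attribute naming.

```python
from typing import Callable, Dict, List, Optional

Event = Dict[str, object]
Exporter = Callable[[Event], None]


class Telemetry:
    """Hypothetical shared helper: adds common fields, then fans out to backends."""

    def __init__(self, service_name: str, environment: str, team: str) -> None:
        # Fields every team gets for free by using the shared library.
        self._common: Event = {
            "service.name": service_name,
            "deployment.environment": environment,
            "team": team,  # illustrative custom field
        }
        self._exporters: List[Exporter] = []

    def add_exporter(self, exporter: Exporter) -> None:
        """Register a backend; swapping vendors means swapping exporters only."""
        self._exporters.append(exporter)

    def emit(self, name: str, attributes: Optional[Event] = None) -> Event:
        """Build one event and send it to every registered exporter."""
        # Common fields are applied last so callers cannot overwrite them.
        event: Event = {"name": name, **(attributes or {}), **self._common}
        for exporter in self._exporters:
            exporter(event)
        return event


# Two stand-in "backends" that just collect events in memory.
metrics_backend: List[Event] = []
tracing_backend: List[Event] = []

telemetry = Telemetry("quotes-api", "prod", "team-a")
telemetry.add_exporter(metrics_backend.append)
telemetry.add_exporter(tracing_backend.append)
telemetry.emit("request", {"http.status_code": 200})
```

Application code calls only emit, so changing or adding a backend touches the exporter registration and nothing else, which is the portability argument in miniature.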

Monitoring is now split across tools: Amazon CloudWatch for metrics and Honeycomb for deep observability. But the real win is portability. As Yakomin explained:

"This allows us to make the investment for the future but avoid vendor lock-in — we’re not the only ones standardizing around this framework. It looks like a lot of the industry is as well" — Christina Yakomin, Senior SRE Coach at Vanguard

The adoption of OTel supports remote SRE jobs in the USA by letting engineers work across systems without deep vendor-specific knowledge. Because those skills are not tied to a single backend, they also stay useful as tooling changes.

Vanguard’s SRE coaching team now focuses on validating tools, creating self-study curricula, and advising SRE leads. Training remains a challenge — especially when pulling engineers from feature work.

"they may see the impact of my work as challenging" — Christina Yakomin, Senior SRE Coach at Vanguard

Yet the payoff is clear. SRE is one of Vanguard’s fastest-growing roles. The company is up-skilling internal talent while also hiring externally — a dual strategy that opens doors for aspiring engineers.

Sources

The New Stack.
