Bethesda, United States of America | Remote (Global)

Lockheed Martin Canada is hiring a Site Reliability Engineer (SRE)

The Site Reliability Engineer (SRE) plays a critical role in ensuring the stability, scalability, and performance of the Apriso platform hosted on AWS. This position bridges the gap between development and operations by applying engineering principles to operations problems. The SRE is responsible for automating infrastructure, monitoring system health, responding to incidents, and driving continuous improvement in reliability and efficiency. The role requires deep technical expertise in cloud platforms, scripting, and infrastructure-as-code, as well as strong collaboration skills to work across teams. The SRE ensures compliance with SOX regulations while fostering a culture of innovation, resilience, and operational excellence within a highly regulated environment.

Responsibilities

  • Lead infrastructure-as-code implementation and maintenance for the Apriso platform on AWS.
  • Oversee cloud platform operations and ensure reliability for production systems.
  • Design and maintain highly available, fault-tolerant systems.
  • Adhere to SOX-compliant change management processes.
  • Partner with cross-functional teams to resolve complex technical and business challenges.
  • Present technical solutions clearly to non-technical stakeholders and leadership.

Requirements

  • Extensive experience with software development principles, including system architecture, SDLC, and CI/CD pipelines.
  • Strong understanding of operations technologies such as compute infrastructure, O&M, and IaaS platforms.
  • Proven experience working with SQL and relational database systems such as SQL Server, Oracle, or Amazon RDS.
  • Significant hands-on experience with cloud service providers, particularly AWS or Azure.
  • Proficiency with automation tools and scripting, including GitLab, Ansible, and infrastructure-as-code frameworks.
  • Working knowledge of both Windows and Linux operating environments.
  • Familiarity with Agile methodologies and their application in development and operations.

Nice to Have

  • Hands-on experience with Infrastructure as Code tools such as Terraform and Ansible, including YAML configurations.
  • Scripting proficiency in bash and PowerShell for system automation.
  • Experience building and managing CI/CD pipelines using GitLab or Jenkins.
  • Direct work with relational databases, including writing DDL and DML SQL statements.
  • In-depth knowledge of AWS services including EC2, RDS, S3, VPC, and Systems Manager.
  • Experience designing cloud-native application architectures.
  • Background in developing High Availability and Disaster Recovery solutions in cloud environments.
  • Demonstrated ability to solve complex, cross-functional business problems.
  • Experience communicating technical concepts to non-technical audiences, including customers and executives.

Tech Stack

AWS, Terraform, Ansible, YAML, GitLab, Jenkins, SQL, RDS, EC2, S3, VPC, Systems Manager, PowerShell, bash, Windows, Linux, CI/CD, IaC, Relational Databases

Benefits

  • Medical insurance coverage
  • Dental insurance benefits
  • Vision care insurance
  • Life insurance
  • Short-Term Disability coverage
  • Long-Term Disability protection
  • 401(k) matching program
  • Flexible Spending Accounts (FSA)
  • Employee Assistance Program (EAP)
  • Education Assistance for professional development
  • Parental Leave benefits
  • Paid time off
  • Paid holidays

Compensation

The annual base salary range for this position is $113,900 - $200,905, depending on location, with rates adjusted for states including California, Massachusetts, New York, Colorado, Hawaii, Illinois, Maryland, Minnesota, New Jersey, Vermont, Washington, and Washington DC. For other states, compensation reflects the candidate's final work location. This role is eligible for an incentive plan.

Work Arrangement

Full-time remote telework with flexibility to work from any location. The employee may travel to company offices in Bethesda, MD; Littleton, CO; or Moorestown, NJ for periodic meetings.

Team

Part of DevSecOps Product Teams within Enterprise Operations.

Additional Information

  • This role supports 24/7 system availability and may require on-call rotation participation.
  • Regular collaboration with security teams to ensure compliance with enterprise policies.
  • Opportunities for professional growth through certifications and training programs.
  • Active participation in incident response and post-mortem analysis is expected.
  • The SRE contributes to capacity planning and cost optimization initiatives.
  • Team uses agile ceremonies including sprint planning, stand-ups, and retrospectives.
  • Close partnership with development teams to improve application observability.
  • Participation in change advisory boards (CAB) for production deployments.
  • Use of monitoring and alerting tools such as CloudWatch, Prometheus, or Datadog.
  • Emphasis on documentation and knowledge sharing across teams.
  • Regular engagement with audit teams to demonstrate SOX compliance controls.

Required Skills

AWS, Terraform, Ansible, YAML, GitLab, Jenkins, SQL, RDS, EC2, S3, VPC, Systems Manager, PowerShell, bash, Windows

About the Company

Lockheed Martin Canada, headquartered in Ottawa, is the Canadian division of Lockheed Martin Corporation, a global leader in the defense technology industry driving innovation and scientific advances. Their vision of developing solutions for missions across all domains and 21st Century Security® accelerates the delivery of transformative technologies to ensure those they serve are at the forefront. They operate in Ottawa, Montreal, Halifax, Calgary, and Victoria on a wide range of programs including leading-edge naval technology products, aircraft maintenance, and remote systems software. They also provide in-service support for state-of-the-art military aircraft and aircraft engine repair capabilities.

Job Details

Department: Information Technology
Category: Infrastructure
Posted: 2 months ago