As a Senior Site Reliability Engineer, you will play a pivotal role in ensuring the reliability, scalability, and security of mission-critical government systems. You will bridge the gap between development and operations by designing resilient infrastructure, automating deployment pipelines, and enforcing compliance with federal standards. This role demands a strong foundation in systems engineering, deep expertise in virtualized and containerized environments, and a proactive mindset toward incident prevention and resolution. You will work closely with development teams to reduce friction, improve system observability, and deliver high-impact solutions in fast-paced, high-stakes environments.
Responsibilities
- Design and manage critical application deployments in virtualized or containerized environments such as VMware and Kubernetes, ensuring scalability, uptime, and adherence to federal standards.
- Build and maintain automated CI/CD pipelines, monitoring systems, and configuration management processes to enable reliable software delivery and operational visibility across all environments.
- Provision and support developer environments and toolchains to enable fast, secure, and efficient development aligned with mission objectives.
- Detect and resolve obstacles in the software development lifecycle by implementing developer-centric solutions that improve productivity and workflow efficiency.
- Foster strong customer relationships through technical leadership and deliver innovative, mission-driven solutions using deep systems expertise.
Requirements
- Active Top Secret security clearance with eligibility for Sensitive Compartmented Information (SCI).
- Possession of a DoD 8140-compliant certification such as Security+ or higher.
- Minimum of 7 years of experience in software development, systems engineering, or IT operations with a focus on system reliability, performance, and availability.
- Proven ability to integrate software engineering and systems administration to support scalable and highly available production systems.
- Hands-on experience creating and managing monitoring, alerting, and observability frameworks to meet service level objectives.
- Background in incident response, root cause analysis, and driving post-incident improvements.
- Proficiency with Ansible and Desired State Configuration for infrastructure automation.
- Experience using GitLab CI/CD and Bash scripting to streamline deployment workflows.
- Familiarity with container-native and object storage technologies including MinIO, S3-compatible services, and Portworx.
- Knowledge of enterprise load balancing platforms such as F5.
- Ability to quickly contribute in high-pressure, mission-critical environments with minimal onboarding time.
Nice to Have
- Bachelor’s degree in Computer Science or a related technical field; relevant professional experience may be considered in lieu of a degree.
Tech Stack
VMware, Kubernetes, Ansible, Desired State Configuration, GitLab CI/CD, Bash scripting, MinIO, S3-compatible services, Portworx, F5
Benefits
- Comprehensive health, dental, and vision insurance coverage
- 401(k) retirement plan with company matching contributions
- Paid time off and paid holidays
- Parental leave and dependent care support
- Flexible work arrangements including hybrid work options
Compensation
Salary range: $170,000 to $220,000. Includes performance-based bonuses, company-paid training and certifications, referral incentives, and additional rewards tied to individual and organizational performance.
Work Arrangement
Hybrid, with flexible work arrangements
Team
Part of a collaborative, high-performing team delivering technical solutions for federal government clients.
- Deep commitment to employee well-being and development
- Customer-focused mission delivery
- Two decades of experience assembling top-tier technical teams
- Comprehensive benefits, professional growth opportunities, and support for work-life balance
Additional Information
- This role is designated as essential personnel and may require on-call availability during critical incidents.
- Candidates must be willing to undergo periodic reinvestigations for security clearance maintenance.
- Position requires collaboration across geographically distributed teams, including occasional travel to government facilities.
- All systems and processes must comply with federal security standards and mandates such as NIST SP 800-53 and FISMA.
- Regular participation in disaster recovery drills and system audits is expected.


