Boston, Massachusetts, United States | Hybrid Employment | USD 135,000 - 155,000 Yearly

LineVision, Inc. is hiring a Site Reliability Engineer

About the Role

LineVision is looking for a Site Reliability Engineer to establish our dedicated SRE practice. You will ensure our grid intelligence platform delivers exceptional reliability for utility customers by owning the development of critical systems observability, deployment processes, and incident response protocols.

What You'll Do

  • Establish and maintain Service Level Objectives and observability frameworks for critical services supporting utility grid operations.
  • Implement CI/CD guardrails including canary deployments, automated rollbacks, and pre-production validation to improve deployment reliability.
  • Develop comprehensive incident response procedures with documented runbooks, escalation paths, and blameless post-incident review processes.
  • Partner with platform, engineering, and customer support teams to instrument systems and build reliability capabilities.
  • Design and implement monitoring dashboards tracking SLA compliance, reliability metrics, and error budgets.
  • Complete a comprehensive assessment of current infrastructure, identifying critical services requiring immediate observability improvements.
  • Establish baseline SLOs for top-priority services and implement initial monitoring dashboards.
  • Document current deployment processes and incident response procedures, identifying gaps and quick-win improvements.
  • Deploy a production-ready observability framework covering all critical customer-facing services, with alerts configured for key reliability signals.
  • Implement CI/CD improvements including automated testing gates, canary deployments, and rollback capabilities for core platform services.
  • Lead 3+ blameless post-incident reviews, establishing templates and processes that become standard practice.
  • Achieve measurable improvements in deployment success rates and mean time to recovery through implemented SRE practices.
  • Build strong cross-functional partnerships resulting in proactive reliability improvements identified through error budget monitoring.
  • Establish LineVision's SRE practice as a recognized capability, with documentation, runbooks, and processes that can scale with company growth.

What We're Looking For

  • Strong experience with core AWS services including EC2, RDS, Lambda, and networking/VPC configuration for production environments.
  • Hands-on proficiency with observability tools like Datadog, Prometheus, Grafana, or CloudWatch for instrumenting distributed systems.
  • Experience with Infrastructure as Code tools like Terraform, CloudFormation, or Pulumi for managing and versioning infrastructure.
  • Python and TypeScript experience for automation, tooling, and system instrumentation.
  • Demonstrated experience establishing Service Level Objectives and tracking error budgets.
  • Critical Thinking: Lead problem-solving efforts around complex reliability challenges, consistently applying critical thinking to identify root causes and prevent future incidents.
  • Taking Ownership: Lead reliability projects with minimal supervision, taking full ownership of SRE practice development and system observability outcomes.
  • Stakeholder Management: Manage relationships across engineering, platform, and support teams, providing clear updates on reliability metrics and leveraging influence to align on SRE priorities.
  • Delivering Innovative Solutions: Lead implementation of modern SRE practices, inspiring teams to think creatively about reliability challenges in a utility infrastructure context.

Nice to Have

  • Background in energy, utility, or critical infrastructure sectors where reliability directly impacts public services.
  • AWS certifications demonstrating deep platform expertise.
  • Experience with security compliance frameworks relevant to utility operations.
  • Track record of building SRE practices from the ground up in fast-growing technical organizations.

Technical Stack

  • AWS, EC2, RDS, Lambda, VPC
  • Datadog, Prometheus, Grafana, CloudWatch
  • Terraform, CloudFormation, Pulumi
  • Python, TypeScript

Team & Environment

You will partner with platform, engineering, and customer support teams. You will work in a communicative, collaborative environment with high autonomy and trust.

Benefits & Compensation

  • Impactful work accelerating our mission of providing utilities with grid intelligence.
  • Ownership with high autonomy and trust.
  • Flexibility with trust-based PTO and a flexible work schedule.
  • Real-world innovation working with patented technology.

Work Mode

This role operates on a hybrid work model based out of our Boston, MA headquarters.

Required Skills
AWS, EC2, RDS, Lambda, VPC, Datadog, Prometheus, Grafana, CloudWatch, Terraform, Python, TypeScript, SLOs, Infrastructure as Code
About company
LineVision, Inc.

LineVision is a grid-enhancing technology company enabling electric utilities to deliver affordable, reliable power and accelerate the electrification of the global economy. Our grid intelligence platform delivers the most accurate, network-wide dynamic line ratings and enables safer, more reliable grid operations with a combination of optical sensors and advanced environmental modeling.

Job Details
Department: Information Technology
Category: Infrastructure
Posted: 14 days ago