Remote (Global) USD 160,000 – 220,000 / year

Judi Health is hiring a Senior Scalability Engineer - Observability

Responsibilities

  • Design, implement, and maintain the observability stack (Loki, Grafana, Tempo, Mimir/Prometheus) as the primary observability platform, balancing cost, performance, and developer experience.
  • Design and develop internal platform products with React/TypeScript frontends and Python/Rust backends for log search, metrics visualization, and trace analysis.
  • Architect and build high-performance log indexing solutions using Rust for efficient log processing and search.
  • Design and implement SQL analytics for logs using AWS Athena or similar engines for ad-hoc analysis and historical queries.
  • Build web interfaces for querying logs, metrics, and traces with features like saved queries, query templates, and pattern detection.
  • Architect solutions leveraging both AWS-managed services and open-source tooling to optimize for cost, performance, and operational flexibility.
  • Design seamless integration between AWS CloudWatch and the custom observability platform for unified visibility.
  • Develop smart dashboards, monitors, and alerting systems to reduce noise and detect anomalies.
  • Work with product teams to integrate observability into their services and establish logging and metrics standards.
  • Provide the observability foundation for identifying performance bottlenecks and measuring platform stability.
  • Define and document observability standards including logging patterns, metric naming conventions, and dashboard design principles.
  • Lead workshops, create documentation, and build self-service tooling to promote observability best practices.
  • Mentor engineers on observability practices and lead architecture reviews for instrumentation approaches.
  • Work in an Agile/Scrum environment to deliver value to stakeholders and clients.
  • Adhere to the company's Code of Conduct, including reporting noncompliance.

Compensation

Competitive

Work Arrangement

On-site

Team

Collaborative and innovative engineering teams

Qualifications

  • Proven experience in designing and implementing observability platforms.
  • Proficiency in Rust, Python, and TypeScript.
  • Experience with AWS services and open-source tooling.
  • Strong architectural and problem-solving skills.
  • Ability to work in an Agile/Scrum environment.
  • Excellent communication and mentoring skills.

Preferred Qualifications

  • Experience with log indexing systems and SQL analytics.
  • Familiarity with React and Grafana.
  • Knowledge of performance optimization techniques.
  • Experience with cloud-native and open-source technologies.
  • Ability to lead workshops and create documentation.

Required Skills
React.js, TypeScript, Python, Flask, SQLAlchemy, Rust, Grafana, Prometheus, Distributed Systems, Monitoring
About company
Judi Health
Judi Health is an enterprise health technology company providing a comprehensive suite of solutions for employers and health plans, including Capital Rx (a PBM), Judi Health™ (health benefit management), and Judi® (the Enterprise Health Platform).
Job Details
Category: Infrastructure
Posted: 3 months ago