Berlin, UK, Germany Hybrid Employment

Cephalgo is hiring a Senior Site Reliability Engineer

About the Role

The Senior Site Reliability Engineer will join NinjaOne's Platform Engineering organization to help scale its cloud-based IT solutions for millions of end-users. This role focuses on automation, observability, and ensuring high availability and security of services through proactive infrastructure and application management.

What You'll Do

  • Diagnose and resolve complex application and infrastructure issues
  • Participate in a 24x7 on-call rotation, Scrum ceremonies, and deployment planning
  • Perform Root Cause Analysis (RCA) and provide recommendations for application teams
  • Improve availability and reduce customer impact using industry-leading observability tools
  • Ensure best-practice and security-minded architecture by influencing design decisions
  • Create and maintain technical documentation and SOPs
  • Develop software, scripts, or tooling to improve efficiency and reduce delivery time of applications and infrastructure
  • Other duties as needed

What We're Looking For

  • 10+ years’ experience in Site Reliability Engineering roles
  • Expert-level Linux administration, scripting, and troubleshooting
  • Demonstrable knowledge of Observability tools (Prometheus/Grafana, New Relic, Splunk, DataDog)
  • Comprehensive experience with AWS (Amazon Web Services) and its core capabilities (VPC, EC2, ECS, Route53, Fargate, ALB/NLB distributions, etc.)
  • Extensive experience with cloud automation and infrastructure-as-code (IaC) toolsets, primarily CloudFormation
  • Good understanding of containers, Fargate, Kubernetes, and overall distributed microservice architectures
  • Passionate about automation, security, and self-service environments/portals
  • Hands-on experience with CI/CD and SDLC (Software Development Life Cycle) processes
  • Effective communication skills, both verbal and written

Nice to Have

  • Experience with Terraform, Helm, Ansible
  • CDK a plus

Technical Stack

  • Java
  • Kotlin
  • C++
  • Postgres
  • AWS
  • VPC
  • EC2
  • ECS
  • Route53
  • Fargate
  • ALB
  • NLB
  • Prometheus
  • Grafana
  • New Relic
  • Splunk
  • DataDog
  • CloudFormation
  • Terraform
  • Helm
  • Ansible
  • CDK
  • Kubernetes
  • containers

Team & Environment

  • SRE team in the Platform Engineering organization

Benefits & Compensation

  • Grow personally and professionally with one of the fastest-growing companies
  • Access to our Corporate Benefits Platform (with discounts for brands such as Expedia, FitX, Zalando and many more)
  • Develop your skills through our renowned training platform
  • Receive competitive compensation
  • Collaborate with a curious, kind, international and intercultural workforce

Work Mode

  • Hybrid work model
  • Available locations: UK, Germany, Berlin
  • Fully remote with option to be hybrid in Berlin office

All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, genetic information, marital status, veteran status, or any other status protected by applicable law.

Required Skills
Java, Kotlin, C++, Postgres, AWS, VPC, EC2, ECS, Route53, Fargate, Linux administration, scripting, troubleshooting, Observability, Infrastructure-as-Code
About company
NinjaOne unifies IT to simplify work for more than 35,000 customers in 140+ countries. The NinjaOne Unified IT Operations Platform delivers endpoint management, autonomous patching, backup, and remote access in a single console to improve efficiency, increase resilience, and reduce spend. By automating IT and managing all endpoints, organizations give employees a great technology experience at work.
Job Details
Category: infrastructure
Posted: 3 months ago