Responsibilities
- Investigate system incidents, drive Root Cause Analyses (RCAs), and implement long-term fixes.
- Proactively reduce the number of incidents caused by system changes.
- Define and enforce Service Level Agreements (SLAs), Service Level Objectives (SLOs), and success metrics for new initiatives.
- Build and maintain comprehensive dashboards to achieve observability excellence.
- Identify and help resolve performance bottlenecks.
- Optimize infrastructure and code to keep services fast.
- Conduct capacity planning to forecast future hardware or cloud resource requirements.
- Ensure the Platform components remain highly available and functional for users.
- Oversee deployments to ensure new code does not disrupt the existing system.
Benefits
- Not explicitly mentioned in the posting.
Work Arrangement
Remote (Country): Malta
Additional Information
- Applicants must submit their CV in English.
- Personal data will be processed as set out in the company's privacy policy.