Responsibilities
- Define and monitor service-level indicators (SLIs), service-level objectives (SLOs), and error budgets, and escalate when an objective is breached or an error budget is exhausted.
- Identify repetitive operational work and implement automation to reduce manual effort.
- Automate routine tasks such as data handling, infrastructure updates, and report generation.
- Streamline system operations including service restarts and certificate renewals through automated solutions.
- Optimize CI/CD pipelines and deployment workflows for efficiency and reliability.
- Ensure cloud applications and underlying infrastructure operate consistently and without disruption.
- Test environments after configuration or code changes to validate system stability.
Manage Additional Tooling and Infrastructure
- Support and improve tools and systems used throughout the product development and operations lifecycle.
Execute Service and Change Requests
- Process operational tasks including new version deployments, database updates, and implementation of planned changes.
Incident Management and Continuous Improvement
- Lead responses to incidents and conduct post-mortem reviews to identify root causes and prevent recurrence.
- Participate in problem and change management processes in line with ITIL guidelines.
Collaborate Across Teams
- Partner with development, operations, and support teams to ensure smooth service delivery.
- Provide occasional after-hours on-call support and join scheduled maintenance activities as required.


