About the Role
In this role, you will maintain and enhance cloud infrastructure and data systems, ensuring high availability, security, and performance. You will also collaborate with engineering teams to support scalable solutions.
Responsibilities
- Monitor and maintain cloud-based data platforms for optimal performance
- Implement and manage automated deployment pipelines
- Ensure data integrity, availability, and recovery across systems
- Troubleshoot infrastructure and data flow issues promptly
- Support secure configuration of cloud environments
- Collaborate with engineering teams to scale data solutions
- Maintain documentation for operational procedures and system architecture
- Respond to system alerts and perform root cause analysis
- Optimize resource utilization to control cloud spending
- Enforce compliance with data protection and security standards
- Deploy monitoring tools for proactive system health tracking
- Participate in on-call rotation for critical incidents
- Upgrade and patch systems to maintain stability
- Assist in migrating data workloads to cloud platforms
- Evaluate new tools and technologies for operational improvements
- Configure access controls and identity management systems
- Support disaster recovery planning and testing
- Ensure logging and auditing systems are functional
- Work with stakeholders to understand data infrastructure needs
- Maintain consistent configuration across environments
- Improve system reliability through iterative changes
- Contribute to incident post-mortem reviews
- Assist in capacity planning for future growth
- Promote best practices in cloud resource management
- Ensure alignment with organizational SLAs and uptime goals
Compensation
Competitive salary and benefits package
Work Arrangement
Hybrid remote
Team
Collaborative technical team focused on infrastructure and data systems
Technology Stack
- We use AWS and Google Cloud Platform for infrastructure
- Terraform is used for infrastructure provisioning
- We rely on Kubernetes for container orchestration
- Monitoring is handled through Prometheus and Grafana
- Logging is centralized using the ELK stack
- CI/CD pipelines are built with Jenkins and GitHub Actions
- Configuration management uses Ansible
- We store data in BigQuery and in S3 or equivalent object storage
- Access management is handled via IAM and Okta
- We follow GitOps practices for deployment consistency
Growth Opportunities
- Team members lead internal tech initiatives
- Regular access to training and certification funding
- Opportunities to present at industry events
- Internal mentorship programs available
- Pathways to senior and architecture roles
- Chance to work on open-source contributions
- Rotational projects across teams
- Leadership training for technical leads
- Annual skills assessment and development planning
- Support for publishing technical content
Available for qualified candidates