Bridge software engineering and infrastructure operations by maintaining and enhancing cloud-native systems. Focus on scalability, reliability, and performance through automation, observability, CI/CD, and platform security. Collaborate with engineering and product teams to support system stability and development efficiency.
Responsibilities
- Enhance and support AWS-based infrastructure through proactive maintenance and improvements
- Continuously optimize system performance, reliability, and security across services
- Build and manage observability systems including metrics, logging, distributed tracing, and dashboards
- Collaborate with engineering teams to improve application instrumentation for monitoring and diagnostics
- Develop and maintain tooling to increase developer efficiency and streamline workflows
- Participate in on-call rotations to ensure system availability and rapid incident response
Requirements
- Minimum of 2 years of experience in a Site Reliability Engineering role focused on AWS environments
- At least 2 years of professional software development experience
- Demonstrated experience with observability platforms, particularly Datadog
- Strong proficiency in Terraform for infrastructure as code
- Proficiency in one or more programming or scripting languages such as Ruby, Python, Go, or shell scripting
- Hands-on experience with CI/CD pipelines using tools like GitHub Actions, Jenkins, or CircleCI
Nice to Have
- Excellent communication skills for cross-team and external collaboration
- Understanding of networking protocols including TCP/IP
- Experience working with AWS Direct Connect
Tech Stack
AWS, Terraform, Datadog, CI/CD, GitHub Actions, Jenkins, CircleCI, Ruby, Rails, Python, Go, Shell Script, observability, distributed tracing, logging, metrics, alerting, cloud infrastructure
Benefits
- Hybrid work model with support for remote work and optional office access
- 10 days of regular vacation time
- Additional 5 days of summer vacation
- 5 days of winter vacation
- Paid holiday on employee's birthday
- Self-learning budget to support ongoing skill development
- Access to O’Reilly Learning Platform
- Language training for Japanese and English
Work Arrangement
Hybrid: remote work is embraced, and office space is available for those who prefer in-person collaboration
Additional Information
- On-call rotation participation is a required responsibility
- Emphasis on improving developer productivity through tooling and automation
- Maintaining system security and compliance is a core part of the role


