Responsibilities
- To be successful, the candidate will be expected to rapidly deliver solutions using the following cloud services and tooling:
- Infrastructure as Code (IaC) Mastery: Quickly implement and adapt infrastructure using Terraform, Pulumi, or other major IaC tools.
- Container Proficiency: Experience with Docker is critical. You must deeply understand how to design, build, and optimize secure, multi-stage Dockerfiles.
- CI/CD Implementation: Design, build, and manage robust Continuous Integration and Continuous Delivery (CI/CD) pipelines to automate code testing, building, and deployment across environments.
- Core Cloud Services (AWS or GCP): Proven ability to provision and manage foundational services. Deep expertise in one major cloud provider is required, with the understanding that knowledge is transferable to the other.
- Container Compute: Expertise in at least one major container platform: EKS, GKE, ECS, Fargate, or Cloud Run. (Expertise in Kubernetes (k8s) is still highly valued, particularly with EKS or GKE.)
- Networking: Understand when to use load balancers, VPNs for secure connectivity, and private VPCs for network isolation. Apply knowledge of subnetting, routing, VPC peering, and NAT gateways to build secure applications.
- Storage: Experience with S3 (AWS) OR Cloud Storage (GCP).
- Databases: Experience with RDS (AWS) OR Cloud SQL (GCP).
- Serverless: Deploy event-driven components using serverless functions (AWS Lambda, GCP Cloud Functions) or their equivalents.
- CDNs, Message Queues.
- Security: Understand the value of security, including protecting PII, encryption, secrets management, network firewalls, and web application firewalls (AWS WAF, GCP Cloud Armor), and follow security best practices.
- Automation & Scripting: Write high-quality automation and tooling using Go, Python, Node.js or Bash for client-specific operational challenges.
- Monitoring and Operations: Ensure robust monitoring (leveraging out-of-the-box cloud provider solutions) and high system uptime.
Requirements
- Minimum of 5 years of dedicated experience in DevOps, Infrastructure, or SRE roles.
- Expertise in Docker, Kubernetes (k8s), and Terraform/Pulumi.
- Deep, proven expertise in either AWS or GCP infrastructure, with the ability to quickly grasp and transition to other cloud providers.
- Strong ability to write clean, maintainable code for automation in Go, Python, or Node.js.
- Demonstrable experience implementing and maintaining modern cloud security controls and meeting key compliance standards (SOC 2, PIPEDA, HIPAA, and/or GDPR).
- Proven ability to quickly onboard, diagnose problems, and propose/implement solutions with minimal oversight.
- Experience working in a consultant or freelancer capacity, with a demonstrated aptitude for understanding and communicating effectively with both technical and non-technical stakeholders.
Nice to Have
- Experience deploying and managing AI/ML workloads, including provisioning vector databases and GPU/TPU compute resources.
- Prior hands-on experience in both AWS and GCP environments.
- Cloud certifications from AWS or GCP.
- Advanced monitoring (e.g., Prometheus, Datadog) or logging experience.
- Experience with GitOps tooling such as Argo CD or Flux.
- Experience with a service mesh or ingress gateway.
Benefits
- Work From Anywhere: Choose your workspace — all you need is strong wifi and a passion for building!
- Work / Life Balance: We believe in our team’s ability to have it all: a great career and time to unplug and live... you know... life.
- Employee Care: We provide full benefits (healthcare, dental, vision) for our employees, plus a 401(k) for our US employees.
- Unlimited PTO: Everyone needs a break. Take at least 15 days off a year, and more if you need—just be cool about it and keep the team in mind.
- Regular Team Retreat: Join us for a week of team bonding at amazing destinations. Recent trips include the Dominican Republic, Cancun, and Hawaii — plus ones welcome.
Work Arrangement
Remote (Worldwide) — Toronto, Ontario
Team
Team size: 180+. Structure: senior engineers and designers
Additional Information
- We use AI to help us review and shortlist applications based on job-related criteria. A human hiring manager always makes the call on who moves forward. As a company that builds with AI every day, we're all for candidates using it too — just be upfront about how it helped.