Remote (Global) Full-time

Kraken is hiring a Site Reliability Engineer - Data Platform

About the Role

Kraken is hiring a Senior Site Reliability Engineer to join our Data Infrastructure team. In this role, you will be pivotal in upholding the reliability, scalability, and efficiency of our robust Data Platform, collaborating with cross-functional teams to build and oversee foundational data infrastructure. We are a mission-focused company accelerating the global adoption of crypto for financial freedom and inclusion.

What You'll Do

  • Design data governance mechanisms to ensure the lakehouse is easy to interact with, secure, and compliant.
  • Implement infrastructure for data ingestion, storage, cataloging with metadata, and lineage capture.
  • Provide a state-of-the-art suite of BI tools for multiple teams within the company.
  • Guarantee the availability, high performance, scalability, and cost efficiency of the data platform.
  • Implement self-service data infrastructure solutions for 10+ business units and over 100 engineers and data analysts.
  • Utilize Infrastructure as Code (IaC) to design, provision, and manage on-premises and cloud (AWS) infrastructure using Terraform.
  • Develop and maintain automation scripts using bash/shell scripting to automate operational tasks and deployments.
  • Enhance and manage CI/CD pipelines for consistent software deployments.
  • Implement robust data monitoring and alerting solutions to proactively detect anomalies.
  • Manage role-based access control (RBAC) and permissions for user groups and machine workflows.
  • Manage real-time streaming data architecture using Kafka and Debezium Change Data Capture (CDC).
  • Utilize Kubernetes to manage containerized applications within the data infrastructure.
  • Implement effective incident response procedures and participate in on-call rotations.
  • Collaborate with data analysts, engineers, and cross-functional teams to understand requirements and implement solutions.
  • Document architecture, processes, and best practices.
  • Support AI/ML teams with their infrastructure requests.

What We're Looking For

  • Proven experience (5+ years) as a Site Reliability Engineer, Infrastructure Engineer, Data Infrastructure Engineer, or similar role with a focus on data infrastructure and security.
  • Experience maintaining real-time data processing technologies such as Kafka and Flink clusters and Debezium instances.
  • Working experience managing hybrid multi-tenant cloud systems, particularly on AWS.
  • Experience with Infrastructure as Code tools such as Terraform, Terragrunt, and Atlantis.
  • Experience with containerization and orchestration tools, particularly Kubernetes, Nomad, and Docker.
  • Solid understanding of bash/shell scripting and proficiency in at least one programming language (preferably Python or JVM languages).
  • Experience maintaining data-related technologies: Apache Airflow, Apache Spark, databases, and BI tooling.
  • Experience solving data access management issues in a large-scale data lake.
  • Familiarity with CI/CD deployment pipelines and related tools.
  • Strong problem-solving skills and ability to troubleshoot complex systems.

Nice to Have

  • Experience with data-related technologies (databases, data lakes, Airflow, Spark).

Technical Stack

  • Cloud & IaC: AWS, Terraform, Terragrunt, Atlantis
  • Orchestration & Containers: Kubernetes, Nomad, Docker
  • Scripting & Languages: bash/shell, Python, JVM languages
  • Data & Workflow: Apache Airflow, Apache Spark, Kafka, Flink, Debezium

Team & Environment

You will join the Data Infrastructure team, collaborating with diverse cross-functional teams including data analysts, engineers, and AI/ML specialists.

Work Mode

This is a global, remote role. Krakenites are located in 70+ countries and speak over 50 languages.

Kraken is an equal opportunity employer. We don’t tolerate discrimination or harassment of any kind based on race, ethnicity, age, gender identity, citizenship, religion, sexual orientation, disability, pregnancy, veteran status or any other protected characteristic as outlined by federal, state or local laws.

Required Skills
AWS, Terraform, Terragrunt, Kubernetes, Python, bash/shell, JVM languages, Nomad, Docker, Atlantis, Site Reliability Engineering, Data Platform, Infrastructure as Code, Monitoring, Incident Management
About company
Kraken

Kraken is a cryptocurrency exchange building premium crypto products for experienced traders, institutions, and newcomers. The company is committed to industry-leading security, crypto education, and world-class client support through products like Kraken Pro, Desktop, Wallet, and Kraken Futures.

Job Details
Category infrastructure
Posted 7 months ago