EXL is hiring a Manager / Cloud Data Platform Engineer to play a central role in deploying, monitoring, and optimizing cloud platforms used by our data scientists. You will leverage your expertise in cloud infrastructure, DevOps, and automation to build reliable and scalable systems.
What You'll Do
- Manage, scale, and optimize cloud environments for data science workloads, built primarily on AWS, Databricks, and dbt.
- Provision, maintain, and optimize compute clusters for ML workloads, including Kubernetes, ECS/EKS, Databricks, and SageMaker.
- Implement high-availability solutions for mission-critical analytics platforms.
- Develop CI/CD pipelines for model deployment, infrastructure-as-code, and automated testing.
- Build monitoring, alerting, and logging systems using tools like Datadog, CloudWatch, Prometheus, Grafana, and ELK.
- Automate provisioning and deployments with Terraform, CloudFormation, and GitHub Actions.
- Enable data ingestion, transformation, and model execution workflows through platform automation.
- Develop self-service capabilities for data scientists to provision reproducible environments.
- Collaborate with Data Engineering on integrations between data pipelines and cloud systems.
- Support application networking capabilities like API gateways, load balancers, TLS, and WAFs.
- Implement data science security and compliance controls aligned with enterprise standards.
- Conduct risk assessments and remediation efforts to strengthen security and resiliency.
- Support secure handling of sensitive financial data.
- Partner with data scientists, ML engineers, and data engineers to understand and support their workflows.
- Serve as a technical advisor on cloud architecture, performance optimization, and production readiness.
- Champion Agile, DevOps, and Platform Engineering practices.
- Lead data engineering strategy, drive innovation, and represent the team in executive meetings.
What We're Looking For
- A bachelor’s degree or higher in a STEM field.
- 5+ years of experience in cloud operations, DevOps, platform engineering, SRE, or sysadmin roles.
- Strong proficiency with at least one major cloud provider (AWS preferred).
- Hands-on experience with IaC tools like Terraform or CloudFormation.
- Strong scripting skills in Python, Bash, or PowerShell.
- Strong understanding of modern authentication, authorization, and secrets management.
- Experience with CI/CD systems like GitHub Actions, Jenkins, GitLab CI, or ArgoCD.
- Familiarity with containers and container orchestration (Docker, Docker Compose, Kubernetes/EKS, ECS).
- Experience supporting data-intensive or ML workloads.
Nice to Have
- Graduate degree in Computer Science, Data Science, or a related field.
- Experience in financial services or other highly regulated industries.
- Knowledge of ML/AI platform tools like Databricks, SageMaker, MLflow, or Airflow.
- Hands-on experience with AI Engineering and LLMOps tools (LLM observability, eval pipelines, agentic workflows).
- Understanding of networking, VPC architectures, and cloud security best practices.
- Familiarity with distributed compute frameworks like Spark, Ray, or Dask.
Technical Stack
- Cloud & Platforms: AWS, Databricks, dbt, SageMaker
- Containers & Orchestration: Docker Compose, Kubernetes, ECS/EKS
- Infrastructure as Code: Terraform, CloudFormation
- CI/CD: GitHub Actions, Jenkins, GitLab CI, ArgoCD
- Monitoring & Logging: Datadog, CloudWatch, Prometheus, Grafana, ELK
- Languages & Scripting: Python, Bash, PowerShell
- ML Tools: MLflow, Airflow, Spark, Ray, Dask
Team & Environment
You will partner closely with data scientists, machine learning engineers, and data engineers on data-driven initiatives, and collaborate with the broader Platform Engineering team.
Work Mode
This is a fully remote position for candidates located within the United States.