Requirements
- Extensive experience with end-to-end distributed data systems, especially ML-centric ones
- Previous experience as a Data Scientist in a large-scale product team or business
- Excellent Python and Software Engineering knowledge
- Ability to work with Java if needed
- Demonstrable experience collaborating with engineers on services
- Strong drive to solve problems for Data Scientists
- Ability to work independently in a cross-functional and cross-team environment
- Good communication skills
- Ability to get the point across to non-technical individuals and back it up with data (and statistical analysis)
- Ability to engage and manage project stakeholders
- Strong problem-solving skills
- Ability to help refine problem statements and propose solutions, taking the effort-impact-scalability tradeoff into account
Nice to Have
- Apache Spark, Airflow, Iceberg, Kafka, dbt
- scikit-learn, XGBoost, MLflow, Ray, PyTorch, graph-tool (or similar)
- AWS (S3, EMR, SageMaker, Lake Formation), Terraform, Docker, GitHub CI/CD
- Knowledge Graphs (+ RAG), graph ML, probabilistic programming, A/B testing


