Responsibilities
- End-to-End ML Lifecycle: Design, develop, and deploy production-grade ML models using Python and Spark. You will own the full cycle from feature engineering to model monitoring.
- Data Architecture & Pipelines: Build and maintain robust data pipelines within our Databricks environment.
- Exploratory Data Analysis (EDA) & Discovery: Dive deep into large datasets to uncover hidden patterns, anomalies, and opportunities. You don’t just process data; you interpret what it says about our users.
- Statistical Rigor & Hypothesis Testing: Design and execute rigorous A/B tests and multivariate experiments. You will be responsible for calculating sample sizes, p-values, and confidence intervals to determine whether the effects of product changes are statistically significant.
- Metric Definition: Work with stakeholders to define what "success" looks like. You will translate vague business questions (e.g., "Why is churn increasing?") into measurable data science problems.
- Predictive Modeling & Insights: Beyond production pipelines, you will create ad-hoc models to forecast business trends and provide actionable insights that influence the product roadmap.
- Data Storytelling: Communicate complex findings through high-quality visualizations and dashboards (using tools like Tableau, Power BI, or Databricks SQL). You can tell a "story" with data to convince leadership of a strategic direction.
- Product Impact: Collaborate with Product Managers to translate business goals into technical ML objectives. You will be responsible for defining key performance indicators (KPIs) and improving them through algorithmic changes.
- Collaborative Engineering: Work as a peer within the engineering team, applying software best practices (unit testing, code reviews, design docs) to the ML stack.