Responsibilities
- Design and establish Unity Catalog components including Catalogs, Schemas, and Volumes to enable a secure, well-documented, and governed data ecosystem serving as the organization's trusted data foundation.
- Lead the migration of complex business logic from legacy systems into a consolidated Databricks Lakehouse environment, refactoring monolithic SQL into modular, efficient, and maintainable code.
- Develop an internal data transformation framework using Databricks-native or open-source tooling such as Delta Live Tables or custom Python and SQL-based Spark pipelines, ensuring scalability without dependency on third-party managed SaaS solutions.
- Act as the go-to expert for query optimization by evaluating Spark execution plans and Spark UI metrics to identify performance issues, minimize data skew, and enhance join efficiency on large datasets.
- Optimize Databricks cost and performance through data layout choices (Z-Ordering, Liquid Clustering, partitioning) and Serverless SQL Warehouse configuration, maximizing performance relative to cost.
- Develop and manage automated CI/CD pipelines using GitHub Actions or similar tools to streamline testing, validation, and deployment of data models.
- Design the semantic layer in Omni to enable self-service analytics, with data models that support sub-second dashboard response times.
- Occasionally act as a BI Developer, building executive dashboards that surface clear, actionable insights from complex data sources.
- Collaborate with stakeholders across Finance, Sales, Product, Marketing, and Trust & Safety to translate business questions into scalable and maintainable data architectures.
- Translate technical performance and cost metrics into strategic recommendations for senior leadership, connecting engineering decisions to business outcomes.