Responsibilities
- Design and implement the Unity Catalog structure — Catalogs, Schemas, and Volumes — to create a governed, secure, and well-documented data environment that serves as a Single Source of Truth across the organization.
- Lead the migration of complex business logic from legacy systems into a unified Databricks Lakehouse, refactoring tightly coupled SQL into modular, maintainable, and performant code.
- Architect our internal transformation framework using open-source tooling (Delta Live Tables or custom Python/SQL Spark pipelines), building scalable pipelines without reliance on managed SaaS platforms.
- Serve as the resident query performance expert — analyze Spark execution plans and Spark UI to diagnose bottlenecks, reduce data skew, and optimize join strategies on large-scale datasets.
- Govern our Databricks compute footprint through strategic application of Z-Ordering, Liquid Clustering, partition design, and Serverless SQL Warehouse configurations to maximize performance per dollar.
- Build and maintain CI/CD pipelines (GitHub Actions or equivalent) to automate testing, validation, and deployment of data models.
- Architect the semantic layer in Omni — designing data models built for self-service reporting with sub-second dashboard latency.
- Occasionally take on the BI Developer role, building executive-level dashboards that surface clear, actionable narratives from complex datasets.
- Partner with cross-functional stakeholders across Finance, Sales, Product, Marketing, and Trust & Safety to translate business questions into scalable data solutions.
- Translate performance and cost metrics into clear recommendations for senior leadership, balancing engineering rigor with business impact.
Additional Information
- 6-8 month contract engagement