Responsibilities
- Design and maintain Lakehouse environments using Delta Lake and Fabric Data Warehouses.
- Develop Data Pipelines and Dataflows Gen2 for batch and near-real-time data ingestion.
- Create and refine notebook-based transformations using PySpark and SQL, along with stored procedures for data warehouse processing.
- Implement medallion architecture with bronze, silver, and gold layers to enable scalable data curation.
- Publish certified semantic models and Power BI datasets that align with business domain needs.
- Optimize storage and compute performance in OneLake through efficient file formats, partitioning, and z-ordering.
- Tune Spark and SQL workloads using caching, concurrency control, and workload isolation techniques.
- Establish reliable retry mechanisms, alerting, and monitoring using the Fabric Monitoring Hub and the Metrics app.
- Perform comprehensive performance testing and scalability evaluations across end-to-end pipelines.
- Enforce data governance using sensitivity labels, row- and column-level security, and workspace-level roles.
- Manage item-level permissions for Lakehouse tables, data warehouse schemas, and datasets, and configure Managed Identities for data sources.
- Apply data quality standards, track lineage, and maintain documentation such as descriptions, tags, and ownership metadata, integrating with Purview when applicable.
- Ensure adherence to organizational policies for PII handling, auditing, and data retention.
- Utilize Git integration and Deployment Pipelines in Fabric to support CI/CD across development, testing, and production environments.
- Parameterize pipelines and environments, and externalize configuration settings and secrets using Key Vault.
- Develop automated testing frameworks for validating data transformations and schema integrity.
- Manage release cycles, change control processes, and rollback procedures for data deployments.
- Collaborate with analytics engineers and BI teams to define star schemas, semantic models, and DAX measures.
- Coordinate with data source owners to establish SLAs, manage schema changes, and formalize data contracts.
- Translate business requirements into technical architectures and document key design decisions.
- Deliver knowledge transfer sessions, promote best practices, and provide ongoing support to data consumers.
Compensation
Competitive salary and benefits package offered.
Work Arrangement
Hybrid work model with flexibility for remote and on-site work.
Team
Collaborative data engineering team focused on modern cloud analytics and scalable data platforms.
Sponsorship is available for qualified candidates who require it.