Responsibilities
- Design and maintain ETL workflows using Python (PySpark) in Azure Synapse Analytics Notebooks or Pipelines to enable efficient data movement and transformation.
- Apply data warehousing expertise, including star schema design with fact and dimension tables, to build optimized storage models in MPP SQL pools.
- Ingest data from diverse sources such as REST APIs, SQL database tables, and CSV files.
- Apply in-depth knowledge of Azure Synapse Analytics to build scalable and high-performance data notebooks and pipelines.
- Support adoption of data fabric components such as data lakes, lakehouses, Delta Lake, and data cataloging to improve data management.
- Partner with data architects to develop data models and schemas that meet business needs.
- Establish data validation rules and quality controls to ensure accuracy and consistency.
- Detect and resolve performance issues in ETL processes to meet service-level agreements (SLAs).
- Monitor ETL job execution, troubleshoot failures, and apply fixes to maintain pipeline reliability.
- Document ETL workflows, data transformations, and system architecture thoroughly.
- Collaborate with cross-functional teams to define data needs and support data-driven projects.
- Enforce data security practices and adhere to governance and privacy regulations.
Work Arrangement
Remote (Worldwide)
Team
Structure: cross-functional teams
Other
- Flexible working hours
- Work remotely
- Continuing education, training, conferences
- Company-sponsored coursework, exams, and certifications