Responsibilities
- Build and support ETL workflows in Azure Synapse using Python, particularly PySpark, via Notebooks or Pipelines for reliable data processing.
- Design and implement data warehouse structures using star schema principles, including fact and dimension tables, within a Massively Parallel Processing (MPP) SQL pool.
- Pull data from multiple sources such as REST APIs, SQL tables, and CSV files.
- Leverage in-depth knowledge of Azure Synapse Analytics to create high-performance, scalable data notebooks and pipelines.
- Support adoption of Data Fabric components such as data lakes, lakehouses, Delta Lake, and data cataloging to improve data organization and access.
- Partner with data architects to develop models and schemas that reflect business needs.
- Establish data validation rules and quality checks to ensure consistency and correctness.
- Detect and fix performance issues in ETL processes to meet service level agreements.
- Monitor data pipeline executions, troubleshoot failures, and apply fixes to maintain system reliability.
- Keep detailed records of ETL workflows, data transformations, and system architecture.
- Collaborate with cross-functional teams to define data needs and support analytics initiatives.
- Enforce data security practices and adhere to governance and privacy regulations.
Work Arrangement
Remote (Latin America)
Other
- Remote work available for candidates in Latin America.
- Position involves direct interaction with clients as part of a consulting team.
- Chance to engage in multiple projects across different industries and domains.
- Work with modern data technologies alongside experienced professionals.
- Accelerated career development through exposure to varied client challenges and solutions.