Responsibilities
- Design, develop, and maintain data pipelines and ETL/ELT processes using Microsoft Fabric and PySpark (a short illustrative sketch follows this list).
- Build and manage Lakehouse and Data Warehouse solutions within the Microsoft Fabric ecosystem.
- Develop scalable data processing workflows using Python and PySpark.
- Write optimized SQL queries for data transformation, analysis, and performance tuning.
- Integrate data from various sources such as APIs, databases, cloud storage, and streaming platforms.
- Implement data modeling techniques to support analytics and reporting requirements.
- Ensure data quality, governance, and security across the data platform.
- Monitor and optimize data pipeline performance and reliability.
- Collaborate with data analysts, data scientists, and business stakeholders to understand data requirements.
- Document architecture, workflows, and technical processes.
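To ground the pipeline responsibilities above, here is a minimal PySpark sketch of the kind of ETL step this role involves. All table and column names (lakehouse_raw.orders, order_id, order_ts, order_total) are hypothetical, not from any specific project.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Extract: read a raw Lakehouse table (name is hypothetical).
raw = spark.read.table("lakehouse_raw.orders")

# Transform: deduplicate, drop invalid rows, derive a date column.
clean = (
    raw.dropDuplicates(["order_id"])
       .filter(F.col("order_total") > 0)
       .withColumn("order_date", F.to_date("order_ts"))
)

# Load: write a curated Delta table, partitioned for downstream queries.
(clean.write.format("delta")
      .mode("overwrite")
      .partitionBy("order_date")
      .saveAsTable("lakehouse_curated.orders"))
```

In a Fabric notebook a SparkSession is normally pre-provisioned as spark; the builder call is included only to keep the sketch self-contained.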
Requirements
- Strong hands-on experience with Microsoft Fabric (Lakehouse, Data Factory, Data Warehouse, Notebooks).
- Expertise in PySpark and Spark-based data processing.
- Strong programming skills in Python.
- Advanced knowledge of SQL and database optimization (see the tuning sketch after this list).
- Experience with ETL/ELT pipeline development.
- Understanding of data modeling concepts and data warehousing principles.
- Experience working with large-scale structured and unstructured data.
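As an illustration of the query-tuning expertise listed above, the sketch below shows one common Spark optimization: broadcasting a small dimension table so the large fact table is not shuffled across the cluster. The table and column names are hypothetical.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("join_tuning").getOrCreate()

# Hypothetical tables: a large fact table and a small dimension table.
orders = spark.read.table("lakehouse_curated.orders")
customers = spark.read.table("lakehouse_curated.customers")

# Broadcasting the small dimension table avoids shuffling the large fact
# table, a frequent performance win for star-schema joins.
joined = orders.join(F.broadcast(customers), on="customer_id", how="left")

# explain() prints the physical plan so the broadcast can be verified.
joined.explain()
```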
Nice to Have
- Experience with Azure Data Factory / Azure Synapse / Databricks.
- Knowledge of Power BI integration with Microsoft Fabric.
- Familiarity with CI/CD pipelines and DevOps practices.
- Experience with data governance and security frameworks.
- Exposure to real-time or streaming data processing (a brief streaming sketch follows).
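For the streaming item above, a minimal Structured Streaming sketch, assuming hypothetical Lakehouse paths, schema, and table names, might look like this:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("events_stream").getOrCreate()

# Read a stream of JSON events from storage; path and schema are hypothetical.
events = (spark.readStream
               .format("json")
               .schema("event_id STRING, event_ts TIMESTAMP, payload STRING")
               .load("Files/landing/events/"))

# Append incrementally to a Delta table; the checkpoint location lets the
# stream resume exactly where it left off after a restart.
query = (events.writeStream
               .format("delta")
               .option("checkpointLocation", "Files/checkpoints/events/")
               .outputMode("append")
               .toTable("lakehouse_curated.events"))

# query.awaitTermination() would block until the stream stops.
```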
Additional Information
- Shift timings: 2:00 PM to 11:00 PM IST