About the Role
In this role, you will develop and manage data pipelines in Azure Databricks, supporting analytics and machine learning initiatives through sound data architecture and engineering practices.
Responsibilities
- Design and implement data pipelines using Azure Databricks
- Develop and maintain ETL workflows for large-scale data processing
- Collaborate with data scientists and analysts to support data needs
- Optimize data storage and query performance in cloud environments
- Ensure data quality and integrity across systems
- Support the deployment of data solutions in production
- Participate in architecture reviews and technical design sessions
- Monitor and troubleshoot data pipeline issues
- Implement data security and compliance standards
- Document data models, pipelines, and system configurations
- Work with Azure services including Data Lake, Blob Storage, and Synapse
- Integrate data from multiple source systems
- Apply software engineering best practices to data projects
- Contribute to CI/CD processes for data solutions
- Use version control for code and configuration management
- Support data governance and metadata management
- Stay current with cloud data platform developments
- Assist in performance tuning of Spark-based workloads
- Participate in agile development cycles
- Provide technical guidance on data engineering best practices
Nice to Have
- Master’s degree in a technical field
- Certification in Azure data or analytics solutions
- Experience with machine learning pipelines
- Knowledge of Delta Lake architecture
- Familiarity with Kubernetes or containerization
- Experience with real-time data streaming
- Background in financial or enterprise sectors
- Exposure to data governance frameworks
- Contributions to open-source data projects
- Public speaking or technical writing experience
Compensation
Competitive salary based on experience
Work Arrangement
Hybrid working model with flexibility for remote and office-based work
Team
Collaborative team of data professionals focused on cloud data platforms and engineering excellence
Technology Stack
- Primary tools: Azure Databricks, Azure Data Lake, Azure Blob Storage, Azure Synapse Analytics, and Azure DevOps
- Languages used: Python, Scala, SQL
- Infrastructure as code via ARM templates or Terraform
- Monitoring with Azure Monitor and Log Analytics
Professional Development
- Opportunities for technical training and certifications
- Access to cloud platform learning resources
- Mentorship from senior engineering staff
- Support for attending industry conferences
Visa sponsorship available for qualified candidates