Responsibilities
- Design, build, and maintain ETL/ELT pipelines using Databricks (PySpark, Delta Lake).
- Optimize pipelines for performance, cost efficiency, and scalability within GCP.
- Develop batch and streaming data processes using Spark Streaming and related technologies.
- Implement data solutions leveraging GCP services such as BigQuery, Cloud Storage, Dataflow, Cloud Composer, and Vertex AI integrations.
- Apply best practices for cloud security, IAM configuration, monitoring, and cost management.
- Build and maintain data models, including dimensional modeling and data vault structures.
- Implement data quality frameworks, validation rules, and automated testing.
- Manage data versioning, governance, and lineage using tools such as Unity Catalog or GCP Data Catalog.
- Partner with cross-functional teams to gather requirements and translate them into technical designs.
- Provide technical guidance and influence engineering best practices across the team.
- Contribute to documentation, architectural diagrams, and knowledge sharing.
Requirements
- 3+ years of experience as a Data Engineer or in a similar role.
- Strong hands-on experience with Databricks, including PySpark/Spark, Delta Lake, and Databricks workflows/jobs.
- Proficiency with GCP services: BigQuery, Cloud Storage, and Dataflow or Dataproc.
- Strong coding skills in Python and SQL.
- Solid understanding of distributed systems, data warehousing, and data architecture principles.
- Experience with CI/CD tools (GitHub, GitLab, Azure DevOps, or similar).
Nice to Have
- Databricks or GCP certifications (e.g., Data Engineer, Architect).
- Experience with Terraform or other Infrastructure-as-Code tools.
- Knowledge of ML workflows or MLOps frameworks.
- Familiarity with data governance and data quality tools (Unity Catalog, Great Expectations, dbt, etc.).
Benefits
- Paid time off based on employee grade (A-F), as defined by policy: vacation (12-25 days, depending on grade), company-paid holidays, personal days, and sick leave
- Medical, dental, and vision coverage (or provincial healthcare coordination in Canada)
- Retirement savings plans (e.g., 401(k) in the U.S., RRSP in Canada)
- Life and disability insurance
- Employee assistance programs
- Other benefits as provided by local policy and eligibility
Work Arrangement
Remote (City/Region)
Additional Information
- Compensation range for this role in the posted location is $110,011 - $169,000.
- Actual compensation offered may fall outside of the posted range and will be determined based on multiple factors legally permitted in the applicable jurisdiction.
- In addition to base salary, this role may be eligible for additional compensation such as variable incentives, bonuses, or commissions, depending on the position and applicable laws.
- Capgemini reserves the right to amend or withdraw compensation programs at any time, within the limits of applicable legislation.
- Capgemini may capture your image (video or screenshot) during the interview process, and that image may be used for verification, including during the hiring and onboarding process.
