Capgemini is hiring a Data Engineer to design, build, and optimize data pipelines and analytics solutions using Databricks and Google Cloud Platform (GCP). You will work closely with data analysts, data scientists, and business stakeholders to deliver scalable, reliable, and high-quality data products.
What You'll Do
- Design, build, and maintain ETL/ELT pipelines using Databricks (PySpark, Delta Lake).
- Optimize pipelines for performance, cost efficiency, and scalability within GCP.
- Develop batch and streaming data processes using Spark Structured Streaming and related technologies.
- Implement data solutions leveraging GCP services such as BigQuery, Cloud Storage, Dataflow, Cloud Composer, and Vertex AI integrations.
- Apply best practices for cloud security, IAM configuration, monitoring, and cost management.
- Build and maintain data models, including dimensional modeling and data vault structures.
- Implement data quality frameworks, validation rules, and automated testing.
- Manage data versioning, governance, and lineage using tools such as Unity Catalog or GCP Data Catalog.
- Partner with cross-functional teams to gather requirements and translate them into technical designs.
- Provide technical guidance and influence engineering best practices across the team.
- Contribute to documentation, architectural diagrams, and knowledge sharing.
What We're Looking For
- 3+ years of experience as a Data Engineer or similar role.
- Strong hands-on experience with Databricks, including PySpark/Spark, Delta Lake, and Databricks Workflows/Jobs.
- Proficiency with GCP services: BigQuery, Cloud Storage, and Dataflow or Dataproc.
- Strong coding skills in Python and SQL.
- Solid understanding of distributed systems, data warehousing, and data architecture principles.
- Experience with CI/CD tools (GitHub Actions, GitLab CI, Azure DevOps, or similar).
Nice to Have
- Databricks or GCP certifications (e.g., Data Engineer, Architect).
- Experience with Terraform or other Infrastructure-as-Code tools.
- Knowledge of ML workflows or MLOps frameworks.
- Familiarity with data governance tools (Unity Catalog, Great Expectations, dbt, etc.).
Technical Stack
- Databricks, PySpark, Spark, Delta Lake
- Google Cloud Platform (GCP), BigQuery, Cloud Storage, Dataflow, Dataproc, Cloud Composer, Vertex AI
- Python, SQL
- CI/CD tools, Terraform, MLOps
Team & Environment
You will partner closely with data analysts, data scientists, and business stakeholders.
Benefits & Compensation
- Compensation range: $110,011 - $169,000
- Paid time off: vacation (12-25 days based on grade), company-paid holidays, personal days, and sick leave
- Medical, dental, and vision coverage
- Retirement savings plans (e.g., 401(k) in the U.S.)
- Life and disability insurance
- Employee assistance programs
Work Mode
This role is based in Nashville, TN.
Capgemini is an Equal Opportunity Employer encouraging inclusion in the workplace. All qualified applicants will receive consideration for employment without regard to race, national origin, gender identity/expression, age, religion, disability, sexual orientation, genetics, veteran status, marital status or any other characteristic protected by law.