Bengaluru, Karnataka, India Employment

Western Digital is hiring a Data Quality Engineer

About the Role

Western Digital is seeking a forward-thinking Data Quality Engineer to advance the data trust, governance, and certification framework for our enterprise Data Lakehouse platform. This role is critical in ensuring that data across Bronze (raw), Silver (curated), and Gold (business-ready) layers is certified, discoverable, and AI/BI-ready. You will ensure that all 9 dimensions of data quality are continuously met, so both humans and AI systems can trust and use the data effectively.

What You'll Do

  • Build and maintain automated validation frameworks across Bronze → Silver → Gold data pipelines.
  • Develop tests for schema drift, anomalies, reconciliation, timeliness, and referential integrity.
  • Integrate validation into Databricks (Delta Lake, Delta Live Tables, Unity Catalog) and Apache Iceberg-based pipelines.
  • Define data certification workflows ensuring only trusted data is promoted for BI/AI consumption.
  • Leverage Atlan and AWS Glue Catalog for metadata management, lineage, glossary, and access control.
  • Use Apache Iceberg's schema evolution and time travel features to keep datasets reproducible and auditable.
  • Build a governed semantic layer on Gold data to support BI and AI-driven consumption.
  • Enable Power BI dashboards and self-service reporting with certified KPIs and metrics.
  • Partner with data stewards to align semantic models with business glossaries in Atlan.
  • Prepare and certify datasets that fuel conversational analytics experiences.
  • Collaborate with AI/ML teams to integrate LLM-based query interfaces (e.g., natural language to SQL) with Dremio, Databricks SQL, and Power BI.
  • Ensure LLM responses are grounded in high-quality, certified datasets, reducing hallucinations and maintaining trust.
  • Provide certified, feature-ready datasets for ML training and inference in SageMaker Studio.
  • Collaborate with ML engineers to ensure input data meets all 9 quality dimensions.
  • Establish monitoring for data drift and model reliability.
  • Continuously enforce all 9 dimensions of data quality: Accuracy, Completeness, Consistency, Timeliness, Validity, Uniqueness, Integrity, Conformity, Reliability.

What We're Looking For

  • Advanced experience in data engineering, data quality, or data governance roles.
  • Strong skills in Python, PySpark, and SQL.
  • Hands-on experience with Databricks (Delta Lake, Unity Catalog, Delta Live Tables) and Apache Iceberg.
  • Expertise in the AWS data stack (S3, Glue ETL, Glue Catalog, Athena, EMR, Redshift, SageMaker Studio).
  • Experience with Power BI semantic modeling, DAX, and dataset certification.
  • Familiarity with Dremio or query engines like Trino/Presto.
  • Knowledge of Atlan or equivalent catalog/governance tools.
  • Experience with data quality testing frameworks such as Great Expectations, Deequ, or Soda.

Nice to Have

  • Exposure to Conversational Analytics platforms or LLM-powered BI (e.g., natural language query over Lakehouse/Power BI).
  • Experience integrating LLM pipelines (LangChain, OpenAI, AWS Bedrock, etc.) with enterprise data.
  • Familiarity with data observability tools like Monte Carlo, Bigeye, Datadog, or Grafana.
  • Knowledge of data compliance frameworks such as GDPR, CCPA, or HIPAA.
  • Cloud certifications: AWS Data Analytics Specialty or Databricks Certified Data Engineer.

Technical Stack

  • Platforms & Engines: Databricks, Apache Iceberg, AWS (Glue, Glue Catalog, SageMaker Studio), Dremio, Atlan, Power BI
  • Languages & Frameworks: Python, PySpark, SQL
  • Databricks Components: Delta Lake, Unity Catalog, Delta Live Tables
  • AWS Services: S3, Athena, EMR, Redshift
  • Data Quality Tools: Great Expectations, Deequ, Soda
  • LLM Tooling: LangChain, OpenAI, AWS Bedrock

We are a company of problem solvers who thrive on the power and potential of diversity. We are committed to an inclusive environment where every individual can thrive through a sense of belonging, respect, and contribution.

Western Digital is committed to providing equal opportunities to all applicants and employees and will not discriminate based on race, color, ancestry, religion, sex, gender, age, national origin, sexual orientation, medical condition, marital status, physical or mental disability, genetic information, protected medical and family care leave, Civil Air Patrol status, military and veteran status, or other legally protected characteristics.

Required Skills
Databricks, Apache Iceberg, AWS Glue, AWS Glue Catalog, SageMaker Studio, Dremio, Atlan, Power BI, Python, PySpark, SQL, Delta Lake, Unity Catalog, Delta Live Tables, Data Quality
About company
Western Digital

Western Digital powers global innovation and pushes the boundaries of technology. It is a company of problem solvers offering an expansive portfolio of technologies, HDDs, and platforms for business, creative professionals, and consumers under its Western Digital®, WD®, and WD_BLACK™ brands. It is a key partner to large organizations, enabling systems from city infrastructure to data centers and AI-era data storage needs.

Job Details
Department: Data and Analytics
Category: Data
Posted: 14 days ago