NVIDIA is looking for a Senior Data Engineer to build and evolve the data backbone for our R&D telemetry and performance analytics ecosystem. In this role, you will process large volumes of raw data from live systems at the cluster level and develop flexible, reliable, and scalable data pipelines that adapt to rapid change and deliver clean, trusted data to engineers and researchers.
What You'll Do
- Build flexible data ingestion and transformation frameworks that can easily handle evolving schemas and changing data contracts
- Develop and maintain ETL/ELT workflows for refining, enriching, and classifying raw data into analytics-ready form
- Collaborate with R&D, hardware, and DevOps teams, ML engineers, data scientists, and performance analysts to ensure accurate data collection from embedded systems, firmware, and performance tools
- Automate schema detection, versioning, and validation to ensure smooth evolution of data structures over time
- Maintain data quality and reliability standards, including tagging, metadata management, and lineage tracking
- Enable self-service analytics by providing curated datasets, APIs, and Databricks notebooks
What We're Looking For
- B.Sc. or M.Sc. in Computer Science, Computer Engineering, or a related field
- 5+ years of experience in data engineering, ideally in telemetry, streaming, or performance analytics domains
- Proven experience with Databricks and Apache Spark (PySpark or Scala)
- Understanding of stream processing and its applications (e.g., Apache Kafka for ingestion, schema registry, event processing)
- Proficiency in Python and SQL for data transformation and automation
- Demonstrated knowledge of schema evolution, data versioning, and data validation frameworks (e.g., Delta Lake, Great Expectations, Iceberg, or similar)
- Experience working with cloud platforms (AWS, GCP, or Azure), with AWS preferred
- Familiarity with data orchestration tools (Airflow, Prefect, or Dagster)
- Experience handling time-series, telemetry, or real-time data from distributed systems
Nice to Have
- Exposure to hardware, firmware, or embedded telemetry environments
- Knowledge of real-time analytics frameworks (Spark Structured Streaming, Flink, Kafka Streams)
- Understanding of system performance metrics (latency, throughput, resource utilization)
- Experience with data cataloging or governance tools (DataHub, Collibra, Alation)
- Familiarity with CI/CD for data pipelines and infrastructure-as-code practices
Technical Stack
- Databricks, Apache Spark, PySpark, Scala, Apache Kafka, Python, SQL
- Delta Lake, Great Expectations, Iceberg
- AWS, GCP, Azure
- Airflow, Prefect, Dagster, Spark Structured Streaming, Flink, Kafka Streams
- DataHub, Collibra, Alation
Team & Environment
You will be part of a fast-paced R&D organization.
Benefits & Compensation
- Competitive salaries
- Generous benefits package
Work Mode
This position follows a hybrid work model.
NVIDIA is an equal opportunity employer.



