Pune, India

Mastercard is hiring a Lead Data Engineer

Requirements

  • Extensive Data Engineering Experience: 8–12+ years in data engineering or backend engineering, including senior/lead roles. Experience designing end-to-end data systems, solving scale/performance challenges, integrating diverse sources, and operating pipelines in production.
  • Big Data & Cloud Expertise: Strong skills in Python and/or Java/Scala. Deep experience with Spark, Hadoop, Hive/Impala, and Airflow. Hands-on work with AWS, Azure, or GCP using cloud-native processing and storage services (e.g., S3, Glue, EMR, Data Factory). Ability to design scalable, cost-efficient workloads for experimental and variable R&D environments.
  • AI/ML Data Lifecycle Knowledge: Understanding of data needs for machine learning, including dataset preparation, feature/label management, and supporting real-time or batch training pipelines. Experience with feature stores or streaming data is useful.
  • Leadership & Mentorship: Ability to translate ambiguous goals into clear plans, guide engineers, and lead technical execution.
  • Problem-Solving Mindset: Approach issues systematically, using analysis and data to select scalable, maintainable solutions.
  • Education & Background: Bachelor’s degree in Computer Science, Engineering, or related field. 8–12+ years of proven experience architecting and operating production-grade data systems, especially those supporting analytics or ML workloads.
  • Pipeline Development: Expert in ETL/ELT design and implementation, working with diverse data sources, transformations, and targets. Strong experience scheduling and orchestrating pipelines using Airflow or similar tools.
  • Programming & Databases: Advanced Python and/or Scala/Java skills and strong software engineering fundamentals (version control, CI, code reviews). Excellent SQL abilities, including performance tuning on large datasets.
  • Big Data Technologies: Hands-on Spark experience (RDDs/DataFrames, optimization). Familiar with Hadoop components (HDFS, YARN), Hive/Impala, and streaming systems like Kafka or Kinesis.
  • Cloud Infrastructure: Experience deploying data systems on AWS/Azure/GCP. Familiar with cloud data lakes, warehouses (Redshift, BigQuery, Snowflake), and cloud-based processing engines (EMR, Dataproc, Glue, Synapse). Comfortable with Linux and shell scripting.
  • Data Governance & Security: Knowledge of data privacy regulations, PII handling, access controls, encryption/masking, and data quality validation. Experience with metadata management or data cataloging tools is a plus.
  • Collaboration & Agile Delivery: Strong communication skills and experience working with cross-functional teams. Ability to document designs clearly and deliver iteratively using agile practices.

Nice to Have

  • Advanced Cloud & Data Platform Expertise: Experience with AWS data engineering services, Databricks, and Lakehouse/Delta Lake architectures (including bronze/silver/gold layers).
  • Modern Data Stack: Familiarity with dbt, Great Expectations, containerization (Docker/Kubernetes), and monitoring tools like Grafana or cloud-native monitoring.
  • DevOps & CI/CD for Data: Experience implementing CI/CD pipelines for data workflows and using IaC tools like Terraform or CloudFormation. Knowledge of data versioning (e.g., Delta Lake time-travel) and supporting continuous delivery for ML systems.
  • Continuous Learning: Motivation to explore emerging technologies, especially in AI and generative AI data workflows.
Required Skills
Apache Spark, Microsoft Azure
About company
Mastercard
Mastercard is a global technology company in the payments industry that connects and powers an inclusive, digital economy. Using secure data and networks, partnerships and passion, our innovations and solutions help individuals, financial institutions, governments, and businesses realize their greatest potential.
Job Details
Department Data and Analytics
Category data
Posted 3 months ago