Remote (Global) Full-time

People Data Labs is hiring a Senior Data Engineer

About the Role

People Data Labs is looking for a Senior Data Engineer to build the infrastructure for ingesting, transforming, and loading exponentially increasing data volumes. You will be a key part of the Data Engineering Team, described as the 'secret sauce,' solving unique, complex problems.

What You'll Do

  • Build infrastructure for ingestion, transformation, and loading of exponentially increasing data volumes from various sources using Spark, SQL, AWS, and Databricks.
  • Build an organic entity resolution framework capable of correctly merging hundreds of billions of individual entities into clean, consumable datasets.
  • Develop CI/CD pipelines and anomaly detection systems to continuously improve the quality of data being pushed into production.
  • Dream up solutions to largely novel data engineering and data science problems.

What We're Looking For

  • 5-7+ years of industry experience with clear examples of strategic technical problem-solving and implementation.
  • Strong software development fundamentals.
  • Experience with Python.
  • Expertise with Apache Spark (Java, Scala, and/or Python-based).
  • Experience with SQL.
  • Experience building scalable data processing systems (e.g., cleaning, transformation) from the ground up.
  • Experience using developer-oriented data pipeline and workflow orchestration tools (e.g., Airflow (preferred), dbt, Dagster, or similar).
  • Knowledge of modern data design and storage patterns (e.g., incremental updating, partitioning and segmentation, rebuilds and backfills).
  • Experience working in Databricks (including Delta Live Tables, data lakehouse patterns, etc.).
  • Experience with cloud computing services (AWS (preferred), GCP, Azure or similar).
  • Experience with data warehousing (e.g., Databricks, Snowflake, Redshift, BigQuery, or similar).
  • Understanding of modern data storage formats and tools (e.g., Parquet, ORC, Avro, Delta Lake).
  • Balance high ownership and autonomy with a strong ability to collaborate.
  • Work effectively remotely (proactive about managing blockers, asking questions, participating in team activities).
  • Demonstrate strong written communication skills on Slack/Chat and in documents.
  • Exhibit experience in writing data design docs (pipeline design, dataflow, schema design).
  • Scope and break down projects, and communicate progress and blockers effectively with your manager, team, and stakeholders.

Nice to Have

  • Degree in a quantitative discipline such as computer science, mathematics, statistics, or engineering.
  • Experience working with entity data (entity resolution / record linkage).
  • Experience working with data acquisition / data integration.
  • Expertise with Python and the Python data stack (e.g., NumPy, pandas).
  • Experience with streaming platforms (e.g., Kafka).
  • Experience evaluating data quality and maintaining consistently high data standards across new feature releases (e.g., consistency, accuracy, validity, completeness).

Technical Stack

  • Python, Apache Spark, SQL, AWS, Databricks
  • Airflow, dbt, Dagster
  • Delta Lake, Parquet, ORC, Avro
  • Kafka, NumPy, pandas

Team & Environment

You will join the Data Engineering Team, described as the 'secret sauce' at People Data Labs.

Benefits & Compensation

  • Compensation range: $190K - $210K
  • Stock
  • Competitive Salaries
  • Unlimited paid time off
  • Medical, dental, & vision insurance
  • Health, fitness, and office stipends

Work Mode

This role is global, offering the permanent ability to work wherever and however you want.

People Data Labs does not discriminate on the basis of race, sex, color, religion, age, national origin, marital status, disability, veteran status, genetic information, sexual orientation, gender identity or any other reason prohibited by law in provision of employment opportunities and benefits.

Required Skills
Python, Apache Spark, SQL, AWS, Databricks, Airflow, dbt, Dagster, Delta Lake, Parquet, Data Engineering, Data Infrastructure, ETL, Data Pipelines, Data Modeling
About the Company
People Data Labs

People Data Labs (PDL) is the provider of people and company data. We do the heavy lifting of data collection and standardization so our customers can focus on building and scaling innovative, compliant data solutions. Our sole focus is on building the best data available by integrating thousands of compliantly sourced datasets into a single, developer-friendly source of truth.

Job Details
Category: Data
Posted 4 months ago