What You'll Do
Design and manage robust data pipeline architectures that support large-scale data processing and analytics. Integrate and automate data flows from diverse sources, ensuring reliability, performance, and scalability across systems.
Develop and optimize ETL processes using SQL and AWS big data tools to extract, transform, and load data efficiently. Build analytics infrastructure that delivers actionable insights into customer behavior, product performance, and operational metrics.
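To give a concrete, hedged sense of this kind of ETL work, the sketch below extracts rows from a Postgres source, stages them in S3, and loads them into Redshift with a COPY command. The hosts, credentials, table names, bucket, and IAM role are all hypothetical placeholders rather than details of this role.

```python
# Hypothetical ETL step: Postgres -> S3 -> Redshift. All names and credentials are placeholders.
import csv
import io

import boto3
import psycopg2


def extract_orders():
    """Pull yesterday's orders from a hypothetical Postgres source table."""
    with psycopg2.connect(host="source-db.example.com", dbname="shop",
                          user="etl_user", password="***") as conn:
        with conn.cursor() as cur:
            cur.execute("""
                SELECT order_id, customer_id, total_cents, created_at
                FROM orders
                WHERE created_at >= current_date - interval '1 day'
            """)
            return cur.fetchall()


def stage_to_s3(rows, bucket="analytics-staging", key="orders/daily.csv"):
    """Write rows as CSV and stage the file in S3 for a Redshift COPY."""
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    boto3.client("s3").put_object(Bucket=bucket, Key=key,
                                  Body=buf.getvalue().encode("utf-8"))
    return f"s3://{bucket}/{key}"


def load_to_redshift(s3_path):
    """COPY the staged file into a Redshift fact table using an IAM role."""
    with psycopg2.connect(host="warehouse.example.redshift.amazonaws.com",
                          port=5439, dbname="analytics",
                          user="loader", password="***") as conn:
        with conn.cursor() as cur:
            cur.execute(f"""
                COPY fact_orders FROM '{s3_path}'
                IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-copy'
                FORMAT AS CSV
            """)


if __name__ == "__main__":
    load_to_redshift(stage_to_s3(extract_orders()))
```

In practice a job like this would be scheduled and monitored by an orchestration tool rather than run by hand; staging through S3 before a COPY is a common pattern because Redshift loads bulk files far faster than row-by-row inserts.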
Collaborate with data analysts, scientists, and cross-functional teams to understand data needs and deliver tailored solutions. Support technical initiatives by troubleshooting data issues and refining data access protocols.
Requirements
- Minimum of five years in a data engineering or similar technical role
- Proficiency with SQL and NoSQL databases such as Postgres, Oracle, and Cassandra
- Hands-on experience with AWS services including S3, EC2, EMR, RDS, and Redshift
- Familiarity with stream-processing platforms like Spark Streaming, Storm, or Amazon Kinesis
- Strong coding skills in Python, Java, or Node.js for data automation and pipeline development
- Experience with workflow management tools and data orchestration systems (see the sketch after this list)
- Ability to diagnose and resolve data flow issues across complex systems
- Background in modeling both structured and unstructured datasets
- Understanding of message queuing, distributed data storage, and scalable data architectures
- Proven ability to manage multiple data projects in fast-paced environments
- Experience handling data securely across multiple data centers and cloud regions
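To make the orchestration requirement above concrete, here is a minimal sketch of a daily pipeline expressed as an Apache Airflow DAG. Airflow is only one example of such a tool, and the DAG name, schedule, and task bodies are hypothetical; the sketch assumes Airflow 2.x.

```python
# Hypothetical daily pipeline as an Airflow 2.x DAG; task bodies are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    print("pull data from source systems")


def transform():
    print("clean and aggregate staged data")


def load():
    print("load results into the warehouse")


with DAG(
    dag_id="daily_orders_pipeline",   # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    # Run the steps in sequence; Airflow handles scheduling, retries, and alerting.
    t_extract >> t_transform >> t_load
```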
Technical Environment
You'll work primarily with SQL, Python, Java, and Node.js, leveraging AWS cloud infrastructure and big data technologies to build and maintain high-performance data systems. The role emphasizes automation, data integrity, and cross-team collaboration to support scalable analytics solutions.
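As a small, hedged example of the stream-processing side of this stack, the snippet below polls records from an Amazon Kinesis shard with boto3. The stream name and region are placeholders, and a production consumer would more likely use the Kinesis Client Library or a framework such as Spark Streaming to handle multiple shards and checkpointing.

```python
# Hypothetical Kinesis consumer: polls one shard and prints each record's payload.
import time

import boto3

kinesis = boto3.client("kinesis", region_name="us-east-1")  # region is a placeholder
STREAM = "clickstream-events"                               # hypothetical stream name

# Start reading from the oldest available record in the first shard.
shard_id = kinesis.describe_stream(StreamName=STREAM)["StreamDescription"]["Shards"][0]["ShardId"]
iterator = kinesis.get_shard_iterator(
    StreamName=STREAM, ShardId=shard_id, ShardIteratorType="TRIM_HORIZON"
)["ShardIterator"]

while iterator:
    resp = kinesis.get_records(ShardIterator=iterator, Limit=100)
    for record in resp["Records"]:
        print(record["Data"])          # raw bytes; decode/parse as needed
    iterator = resp.get("NextShardIterator")
    time.sleep(1)                      # stay under per-shard read limits
```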
