Responsibilities
- Design and manage ETL pipelines using AWS Glue, Python, and Apache Spark.
- Build and optimize data lakes on AWS with S3, Lake Formation, and the Glue Data Catalog.
- Apply efficient data partitioning, schema management, and performance optimization in distributed systems.
- Work closely with data scientists, analysts, and business teams to provide accurate and timely data.
- Establish and uphold metadata frameworks, data lineage tracking, and governance policies in AWS environments.
- Monitor, debug, and improve ETL workflows for scalability, reliability, and cost efficiency.
- Consolidate structured and unstructured data from diverse sources into unified storage platforms.
- Maintain adherence to data security protocols, privacy standards, and regulatory mandates.
- Support the design and strategic direction of enterprise-wide data infrastructure and analytics systems.
Benefits
- Remote work
Work Arrangement
Remote
Other
No Agencies Please!
