Responsibilities
- Develop and support data extraction services in Python or Scala, using tools such as Debezium Server, custom APIs, and rclone.
- Implement change data capture, delta processing, and event-driven data integration patterns.
- Configure and manage push-based HTTP data delivery with Kerberos authentication for secure DLT transfers.
- Set up, run, and troubleshoot data extraction processes from SAP systems using Theobald Extract Universal.
- Connect to external platforms such as Salesforce and SharePoint via APIs or file-based interfaces.
- Build and manage an accessible web-based data catalog with dataset descriptions, metadata, and user-friendly navigation.
- Enable dataset discovery, preview capabilities, and lineage tracking using Unity Catalog as the metadata backend.
- Create structured workflows for data access requests, including submission, approvals, audit logging, and automated provisioning.
- Conduct technical design reviews in collaboration with cybersecurity teams.
- Maintain comprehensive documentation and compliance standards for all data interfaces and entry points.
- Oversee auditability and traceability across data pipelines.
- Work closely with IT and business teams to convert business needs into robust, scalable technical solutions.
- Act as the primary technical contact for resolving complex source system integration challenges.
- Define and deploy a layered data quality framework covering unit, integration, and cross-pipeline validation rules.
- Store and manage data quality rules in a centralized, version-controlled repository integrated with orchestration and CI/CD systems.
- Implement automated data quality monitoring with severity classification and logic to flag, filter, or isolate problematic data.
- Partner with system owners and business stakeholders to establish practical and enforceable data quality thresholds.
Work Arrangement
Remote (Worldwide)
Team
Global data engineering team