Lead the creation and deployment of internal SDKs and self-service tools that empower distributed engineering teams to independently manage data ingestion and transformation.
Transition focus from building individual data pipelines to developing scalable platform solutions, establishing reusable methods for handling both batch and real-time event data.
Own the cost efficiency of the Databricks environment by optimizing Spark execution plans, tuning shuffle partitions, and implementing auto-scaling to control DBU usage.
Maintain platform performance as data volumes increase, balancing latency, throughput, and cloud infrastructure costs.
Enforce Schema-on-Write validation and implement Data Contracts to guarantee that data from numerous internal services meets high quality standards before it enters the Bronze layer.
Collaborate with Data Architects and Data Stewards to uphold data privacy (including PII handling), security protocols, and end-to-end metadata traceability across the global data ecosystem.
Promote adoption of AI-powered development tools such as GitHub Copilot and Cursor to speed up development cycles and enhance code quality.
Guide engineers in best practices for distributed computing through in-depth code reviews emphasizing scalability and long-term maintainability.

Remote (Worldwide)

G-P is hiring a Senior Software Engineer (Data Platform)