About the Role
The role involves building and optimizing data pipelines, working with large datasets, and ensuring data integrity and performance within a security-first environment.
Responsibilities
- Design and implement robust data pipelines for high-volume transaction data
- Develop and maintain scalable data storage solutions
- Ensure data consistency, accuracy, and accessibility across systems
- Collaborate with software engineers to integrate data services
- Monitor data workflows for performance and reliability
- Troubleshoot and resolve data processing issues
- Optimize queries and data transformation processes
- Support data governance and compliance standards
- Work with cloud-based data platforms and services
- Implement automated testing for data pipelines
- Participate in architectural design discussions
- Document data models and system workflows
- Enforce data security and access controls
- Contribute to disaster recovery and backup strategies
- Evaluate new data technologies and tools
- Ensure system scalability under peak transaction loads
- Maintain metadata management practices
- Support auditing requirements for financial data
- Improve data observability and monitoring capabilities
- Collaborate with product teams to understand data needs
- Refactor legacy data systems for improved performance
- Integrate third-party data sources securely
- Apply best practices in version control for data code
- Participate in code reviews and technical planning
- Respond to incidents related to data infrastructure
Compensation
Competitive salary and benefits package
Work Arrangement
Remote with flexible scheduling
Team
Collaborative engineering team focused on data systems
Why This Role Matters
This position plays a critical role in ensuring the integrity and reliability of data systems that protect financial transactions from fraud. The engineer will directly contribute to infrastructure that safeguards sensitive data and enables secure payments.
Technology Stack
The team uses AWS for cloud infrastructure, Python for pipeline development, Airflow for orchestration, Kafka for streaming, PostgreSQL and Redshift for storage, and Terraform for infrastructure as code.
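To give a flavor of the pipeline work described above, here is an illustrative sketch of a validation and normalization step of the kind an Airflow task might run on incoming transaction records before loading them into Redshift. This is a hypothetical example, not code from the team's repository; the function name, the required fields, and the record shape are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical field set for a raw transaction record (an assumption for
# this sketch, not the team's actual schema).
REQUIRED_FIELDS = {"id", "amount", "currency", "timestamp"}

@dataclass
class ValidationResult:
    valid: list = field(default_factory=list)
    rejected: list = field(default_factory=list)

def validate_transactions(records):
    """Split raw transaction dicts into valid and rejected batches.

    Rejects records that are missing required fields or that repeat an
    id already seen in this batch; normalizes the rest.
    """
    result = ValidationResult()
    seen_ids = set()
    for rec in records:
        if not REQUIRED_FIELDS <= rec.keys() or rec["id"] in seen_ids:
            result.rejected.append(rec)
            continue
        seen_ids.add(rec["id"])
        # Normalize a copy: amounts as floats, epoch timestamps as
        # UTC ISO-8601 strings, leaving the input records untouched.
        rec = dict(rec)
        rec["amount"] = float(rec["amount"])
        rec["timestamp"] = datetime.fromtimestamp(
            rec["timestamp"], tz=timezone.utc
        ).isoformat()
        result.valid.append(rec)
    return result
```

In a stack like the one above, a step of this shape would typically sit between a Kafka consumer and the warehouse load, with the rejected batch routed to a dead-letter location for later auditing.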