About the Role
The role involves building and maintaining scalable backend systems for a data-intensive platform, with a focus on data pipelines, storage, and processing infrastructure.
Responsibilities
- Design and implement core components of the data platform architecture
- Develop and optimize data pipelines for ingestion and transformation
- Ensure data consistency, reliability, and fault tolerance across systems
- Collaborate with data scientists and analysts to understand requirements
- Improve system performance and scalability through iterative development
- Maintain high standards for code quality and system observability
- Troubleshoot and resolve production issues in distributed environments
- Evaluate and integrate new technologies to enhance platform capabilities
- Support data governance and compliance initiatives
- Write clean, maintainable, and well-documented code
- Participate in system design reviews and technical planning
- Monitor platform health and proactively address potential risks
- Contribute to API design and backend service development
- Work with streaming and batch processing frameworks
- Ensure secure handling and access controls for sensitive data
- Optimize data storage solutions for cost and performance
- Collaborate on disaster recovery and backup strategies
- Mentor junior engineers and promote technical best practices
- Integrate with machine learning workflows and model deployment systems
- Drive automation in testing, deployment, and operations
Nice to Have
- Master’s degree in computer science or a related field
- Experience with real-time data processing systems
- Contributions to open-source data infrastructure projects
- Knowledge of data warehouse architectures
- Experience with Kubernetes and CI/CD pipelines
- Background in AI-driven applications
- Familiarity with data lineage and metadata management
- Exposure to regulatory frameworks like GDPR or HIPAA
Compensation
Competitive salary and equity package
Work Arrangement
Hybrid, with remote work and office options
Team
Collaborative engineering team focused on data infrastructure
Tech Stack
- Primary languages: Python and Go
- Cloud platforms: AWS and GCP
- Data processing: Apache Kafka and Apache Spark
- Container orchestration: Kubernetes
- Databases: PostgreSQL, BigQuery, and Redis
Growth and Impact
- Opportunity to shape the evolution of the data platform
- Direct influence on product scalability and performance
- Work on systems that power AI and analytics use cases