Fitch Group is hiring a Lead Data Engineer to design and architect end-to-end data pipelines and solutions on modern cloud platforms. The role is central to leading data platform modernization initiatives, mentoring engineers, and integrating AI/ML capabilities into data workflows.
What You'll Do
- Lead the design and architecture of end-to-end data pipelines and solutions on modern cloud-based platforms, including Snowflake, Databricks, and AWS.
- Build and optimize robust, scalable data orchestration workflows using Apache Airflow and implement best practices across multiple agile squads.
- Design and implement data solutions using PostgreSQL for relational data and MongoDB for NoSQL requirements.
- Architect and deploy containerized data applications using Docker, Kubernetes, and AWS EKS, incorporating GitHub Actions for automated deployments.
- Design and implement CI/CD pipelines using GitHub Actions, establish branching strategies, and ensure automated testing, code quality checks, and security scanning.
- Collaborate with cross-functional teams—including Data Scientists, Analytics teams, and business stakeholders—to translate requirements into scalable technical solutions.
- Mentor and guide data engineers by promoting technical excellence, establishing coding standards, and conducting architecture reviews.
- Drive data platform modernization initiatives and ensure data quality, reliability, and governance across all data systems.
- Design and implement AI-enhanced data pipelines that leverage LLMs and Agentic AI frameworks to automate data quality checks, anomaly detection, and intelligent data transformation workflows.
- Architect data infrastructure to support AI/ML workloads, including feature stores, vector databases, and real-time inference pipelines integrated with cloud-native services.
- Leverage established standards and best practices to integrate AI agents into data engineering workflows, including the Model Context Protocol (MCP) for seamless AI-to-data-platform communication.
What We're Looking For
- 8+ years of data engineering experience, including 3+ years in a lead role architecting large-scale data platforms.
- Expert-level proficiency in Python and Java for building cloud-native data processing solutions.
- Deep hands-on experience with Apache Airflow, Snowflake (data warehousing, modeling, optimization), and Databricks.
- Strong AWS expertise, including S3, Lambda, Glue, EMR, Kinesis, EKS, and RDS.
- Production database experience with PostgreSQL (design, optimization, replication) and MongoDB (document modeling, sharding, replica sets).
- Solid experience with containerization and orchestration using Docker, Kubernetes, and AWS EKS, including cluster management and autoscaling.
- Proven CI/CD and GitOps experience using GitHub, GitHub Actions, and ArgoCD for automated deployments and multi-environment management.
- Proficient with agile tools such as JIRA for sprint management and Confluence for technical documentation and knowledge sharing.
- Excellent analytical, problem-solving, and communication skills, with the ability to explain complex concepts to non-technical stakeholders and drive initiatives in complex environments.
Nice to Have
- Working knowledge of AI/ML frameworks (LangChain, LlamaIndex, AutoGen, etc.) and an understanding of how Agentic AI can enhance data engineering workflows through automated data validation, intelligent orchestration, and self-healing pipelines.
- Practical understanding of AI integration patterns in data platforms, including prompt engineering, RAG architectures, and vector database implementations.
- Familiarity with the Model Context Protocol (MCP) or similar frameworks for enabling AI agents to interact securely and efficiently with data sources, APIs, and tools.
- Experience with AI-powered development tools such as GitHub Copilot and Amazon Q.
Technical Stack
- Languages: Python, Java
- Orchestration & Warehousing: Apache Airflow, Snowflake, Databricks
- Cloud Services: AWS (S3, Lambda, Glue, EMR, Kinesis, EKS, RDS)
- Databases: PostgreSQL, MongoDB
- Containerization: Docker, Kubernetes
- CI/CD & Tools: GitHub Actions, ArgoCD, JIRA, Confluence
- AI/ML: LangChain, LlamaIndex, AutoGen, GitHub Copilot, Amazon Q
Team & Environment
This role is part of the Technology & Data Team, which includes the Chief Data Office, Chief Software Office, Chief Technology Office, Emerging Technology, Shared Technology Services, Technology Risk, and the Executive Program Management Office (EPMO).
Work Mode
This position is based in Chicago.
Fitch Group fosters a culture of credibility, independence, and transparency. We are a dynamic department where innovation meets impact, driven by investment in technologies like AI and cloud solutions. We are home to a diverse range of roles and backgrounds united by a shared passion for leveraging modern technology to drive projects that matter. Fitch Group has been recognized by Built In as a 'Best Place to Work in Technology' for 3 years in a row and offers an environment where you can grow, innovate, and make a difference.