Lead the technical vision for a clinical data platform that powers insights in Medicaid care delivery. As a Principal Data Engineer, you'll design and operate high-reliability data pipelines that ingest, normalize, and route clinical data from EHRs, health plans, and health information exchanges. Your work will ensure near real-time availability of structured, standards-compliant data to support risk prediction, care gap detection, and patient outreach.
Key Responsibilities
- Design and deploy production data workflows that process FHIR R4, HL7v2, and C-CDA content from diverse sources, ensuring sub-hour end-to-end latency and robustness across heterogeneous systems
- Own technical integration from evaluation through production with health plans, HIEs, provider networks, and third-party vendors
- Build cloud-native infrastructure on AWS using Step Functions, dbt, and containerized services to support scalable ETL/ELT with full data lineage and quality monitoring
- Develop reusable tooling and frameworks that accelerate onboarding of new data partners and reduce the engineering effort per integration by 50% or more
- Collaborate with data science teams to operationalize models for clinical risk scoring and care prioritization
- Translate clinical and product requirements into durable, maintainable backend systems that scale with growing data volume and complexity
- Enforce strict compliance with HIPAA, HITECH, and privacy regulations through encryption, access controls, audit logging, and PHI de-identification
- Mentor engineers across the organization and advance best practices in healthcare data engineering
What You Bring
- 8+ years of software engineering experience with a focus on healthcare data integration, EHR systems, or clinical data platforms
- Proficiency in Python and experience with a compiled or concurrent language such as Elixir
- Strong grasp of distributed systems, including REST APIs, microservices, event-driven architectures, message queues (e.g., RabbitMQ), and caching strategies
- Experience modeling data in both transactional (PostgreSQL) and analytical (Redshift) databases
- Hands-on experience with Docker for containerization and Kubernetes for orchestration
- Proven ability to design resilient integration systems for inconsistent or delayed clinical data from EHRs, payers, or HIEs
- Deep familiarity with healthcare terminologies including ICD-10, SNOMED CT, LOINC, and RxNorm
- Clear communication skills for both technical and non-technical stakeholders
- Experience leading technical direction without formal authority across multiple engineering teams
Nice to Have
- Advanced degree in Computer Science, Software Engineering, or a related field
- Background in value-based care, population health, or Medicaid/Medicare data environments
- Exposure to ML/AI frameworks like TensorFlow or PyTorch and MLOps practices in healthcare contexts
Technology Environment
Our stack centers on Python and Elixir for backend services, FHIR R4 APIs and HL7v2 for clinical data exchange, and AWS for cloud infrastructure. We use Step Functions for orchestration, dbt for transformation workflows, PostgreSQL and Redshift for data storage, and rely on Docker and Kubernetes for deployment. Our architecture emphasizes microservices, event-driven patterns, and RESTful APIs, with strong attention to security, scalability, and data integrity.
Work Model
This is a hybrid role, blending remote collaboration with in-person engagement to support close teamwork across technical and clinical stakeholders.
Our Culture
We are driven by a mission to improve access to care and eliminate systemic barriers in Medicaid. Our team brings together clinicians, engineers, and product builders in a collaborative environment where curiosity and innovation are central. We value thoughtful problem-solving and aim to transform how care is delivered through technology that works for everyone.


