College Board is looking for an AI/ML Data Engineer to design, build, and operate the data and ML plumbing that powers personalized student experiences at scale. You'll create batch and streaming pipelines, ML‑ready datasets, and the services that move models into production safely and compliantly.
What You'll Do
- Design, build, and own batch and streaming ETL pipelines (e.g., Kinesis/Kafka → Spark/Glue → Step Functions/Airflow) for training, evaluation, and inference.
- Stand up and maintain offline/online feature stores and embedding pipelines with reproducible backfills.
- Implement data contracts & validation, schema evolution, and metadata/lineage capture.
- Optimize lakehouse/warehouse layouts and partitioning for scalable ML and analytics.
- Productionize training and evaluation datasets with versioning and experiment tracking.
- Build RAG foundations: document ingestion, chunking, embeddings, retrieval indexing, and quality evaluation.
- Collaborate with Data Science to ship models to serving, automate feature backfills, and capture inference data for continuous improvement.
- Define SLOs and instrument observability across data and model services.
- Embed security & privacy by design, aligning with College Board standards and FERPA.
- Build CI/CD for data and models with automated testing, quality gates, and safe rollouts.
- Maintain docs‑as‑code for pipelines, contracts, and runbooks; create internal guides and tech talks.
- Mentor peers through design reviews, pair/mob sessions, and post‑incident learning.
What We're Looking For
- 4+ years in data engineering (or 3+ years with substantial ML productionization), with strong Python and distributed compute skills.
- Proven experience shipping ML data systems, including training/eval datasets, feature or embedding pipelines, and artifact/version management.
- MLOps/LLMOps experience with orchestration, containerization, and deployment; CI/CD for data & models.
- Expert SQL and data modeling for lakehouse/warehouse, with performance tuning for large datasets.
- Experience with data quality & contracts, lineage/metadata, and drift/skew monitoring.
- Cloud experience, preferably with AWS services such as S3, Glue, Lambda, Athena, Bedrock, and SageMaker.
- Experience with BI tools like Tableau, QuickSight, or Looker for real-time analytics.
- A security and privacy mindset; ability to design compliant pipelines handling sensitive student data.
- Ability to judiciously evaluate the feasibility, fairness, and effectiveness of AI solutions.
- Excellent communication, collaboration, and documentation habits.
- Authorization to work in the United States for any employer.
- Curiosity and enthusiasm for emerging technologies, with a willingness to experiment.
- Clear and concise written and verbal communication skills.
- A learner's mindset and a commitment to growth.
- A drive for impact and excellence, solving complex problems and making data-informed decisions.
- A collaborative and empathetic approach, fostering trust and a culture of shared success.
Nice to Have
- RAG & vector search experience and prompt/eval frameworks.
- Real‑time feature engineering and low‑latency stores for online inference.
- Testing strategies for ML systems.
- Experience in higher‑ed or assessments data domains.
Technical Stack
- Languages & Compute: Python, Spark, Glue, Dask, SQL
- AWS: S3, Glue, Lambda, Athena, Bedrock, OpenSearch, API Gateway, DynamoDB, SageMaker, Step Functions, Redshift, Kinesis
- Orchestration & Tools: Kafka, Airflow, Great Expectations, Deequ, OpenLineage, DataHub, Amundsen, DVC, LakeFS, MLflow
- Infrastructure: Docker, EKS, ECS
- BI & Search: Tableau, QuickSight, Looker, OpenSearch k-NN, pgvector, FAISS
Team & Environment
You'll join a small, highly collaborative team of engineers and architects blending expertise in data engineering, analytics, and product strategy.
Benefits & Compensation
- Salary range: $137,000–$148,000
Work Mode
This is a remote position.
We value curiosity, reliability, and clear communication, and are driven by a passion for expanding educational and career opportunities.

