Remote: Argentina; Brazil; Chile; Colombia; Costa Rica; Mexico; Remote (Global)

Newsela is hiring an AI Operations Specialist (Contract)

We are seeking a skilled AI Operations Specialist to join our growing ML/AI team. In this contract position, you'll play a central role in transitioning machine learning models from research to reliable, large-scale production systems. Your work will directly support the expansion of AI-driven learning experiences, ensuring performance, scalability, and observability across services.

Key Responsibilities

Develop and manage CI/CD pipelines tailored for machine learning workflows, enabling seamless model training, packaging, and deployment across microservices.
Operate and optimize containerized applications on AWS ECS, balancing efficiency, responsiveness, and uptime.
Automate infrastructure setup and configuration using Terraform to ensure consistent, reproducible environments.
Support and scale backend services that integrate with external large language model providers.
Design and maintain data pipelines that extract, transform, and load data from BigQuery, S3, and DynamoDB into training and inference systems.
Implement monitoring and tracing solutions using Datadog, OpenTelemetry, and Langfuse to track model behavior and define service-level objectives.
Partner with machine learning engineers to deploy models using BentoML and FastAPI in containerized production environments.

Required Expertise

2–3 years of experience in ML operations, with a foundation of 3–4 years in DevOps, CloudOps, or site reliability engineering.
Strong programming skills in Python, with hands-on experience in Docker and container orchestration.
Proven track record building CI/CD systems for machine learning in enterprise settings.
Familiarity with Infrastructure as Code, particularly Terraform.
Experience working with AWS services including ECS, ECR, S3, DynamoDB, and CloudWatch, as well as GCP tools like BigQuery and Vertex AI.
Direct experience integrating and monitoring LLMs using OpenAI API, Google GenAI, and tracing tools like Langfuse.
Background in constructing and maintaining data pipelines for model training and feature engineering.
Understanding of the full ML lifecycle, including training, evaluation, experiment tracking (e.g., MLflow, Weights & Biases), and model version control.
Ability to detect and respond to model drift over time.
Exposure to NLP frameworks such as Hugging Face Transformers, spaCy, or sentence-transformers.
Knowledge of vector databases like LanceDB or FAISS and embedding-based retrieval systems.
Experience deploying deep learning models built with TensorFlow or PyTorch in production environments.
Familiarity with classical ML libraries including scikit-learn, XGBoost, and LightGBM, along with explainability tools such as SHAP.
Working knowledge of model serving platforms like BentoML and async Python web frameworks such as FastAPI.

Required Skills

Python, Docker, AWS, Terraform, ECS, ECR, S3, DynamoDB, CloudWatch, BigQuery, Datadog, OpenTelemetry, Langfuse, CI/CD, containerization, ML Ops, DevOps, CloudOps, SRE, Infrastructure as Code

About the company

Newsela is a leading education technology company dedicated to meaningful classroom learning for every student. The company delivers integrated, AI-powered solutions designed to unlock student engagement, empower teachers, and drive meaningful learning outcomes. Its suite of products supports knowledge and skill development, writing practice, daily instruction, assessment, and data-informed decision-making across K–12 classrooms. Grounded in learning science research, Newsela’s solutions integrate content, assessment, and analytics to help educators track progress, understand student outcomes, and deliver high-impact instruction that supports every learner.

Job Details

Category: infrastructure

Posted 2 months ago