Responsibilities
- Automate data classification to remediate current classification failures, and embed classification into data pipelines in collaboration with the Governance Lead.
- Build production-grade AI agents on the agentic platform that go beyond the sample agents delivered by the external partner and support real operational workflows.
- Define data requirements for the agentic platform, specifying necessary data inputs, formats, and quality standards in coordination with the Principal AI Engineer.
- Build FastAPI-based services that wrap LLM APIs and support version-controlled prompt templates.
- Design structured prompts for classification and briefing tasks that produce validated JSON outputs including tags, confidence scores, and source references.
- Maintain prompt templates in configuration files to allow updates without requiring code changes.
- Log every LLM call with its input hash, model version, output, latency, and token count for monitoring and auditing.
- Implement fallback strategies to ensure system resilience when LLM APIs are inaccessible or degraded.
- Regularly evaluate model outputs using precision and recall against human-labeled samples, report the results, and refine prompts accordingly.
Compensation
Competitive salary and benefits package.
Work Arrangement
Hybrid work model with flexibility for remote and on-site collaboration.
Team
Collaborative engineering team focused on delivering scalable and robust AI solutions in production environments.
Sponsorship
Available for qualified candidates who require sponsorship.