Responsibilities
- Build and ship LLM-powered features end-to-end: from prototype to production-ready systems (RAG, agents, tool calling, workflow automation)
- Design retrieval and search pipelines using OpenSearch / Elasticsearch, including indexing strategies and query patterns that serve real user needs
- Develop backend services and APIs in Python, using Pydantic for robust data validation and clear contracts
- Orchestrate async and scheduled workloads (batch jobs, pipelines, background workers) with Celery / Prefect
- Own data modeling and persistence for AI workflows using SQLAlchemy
- Improve observability and reliability with OpenTelemetry: tracing, metrics, and logs that make systems debuggable and safe to operate
- Collaborate async-first with product and engineering: align on trade-offs, ship continuously, and improve based on feedback and usage
- Proactively identify edge cases and failure modes (hallucinations, retrieval misses, long-tail inputs, timeouts) and fix them with pragmatic engineering
Requirements
- Strong software engineering fundamentals with excellent Python (clean architecture, testable code, API design)
- Practical experience building LLM applications in real contexts (RAG, agents, tool calling, workflow automation)
- Comfort integrating AI into business processes: you care about reliability, UX constraints, and operational realities, not just model outputs
- Ability to handle multiple tasks and quickly re-prioritize without losing clarity or quality
- Clear and consistent communication in a fully remote team (async-first)
Nice to Have
- Experience with LLM evaluation, guardrails, and quality measurement (test suites, regression checks, prompt versioning strategies)
- Experience with Beautiful Soup (BS4) and/or Playwright for scraping, data extraction, or automated validation flows
- Familiarity with practical security/privacy considerations in AI systems (PII handling, data retention, access control)