Responsibilities
- Develop objective, verifiable evaluation criteria (rubrics) for system performance
- Review system logs and execution paths to improve reliability and code quality
- Refactor code and optimize system behavior to produce the intended outcomes
- Test systems for vulnerabilities, including data exposure and edge-case failures
- Provide detailed, high-quality feedback on system performance and outputs
Requirements
- 2+ years of experience in backend engineering, AI automation, or systems integration
- Strong proficiency in at least two programming languages (e.g., Python, JavaScript, Go, Java)
- Experience working with SQL databases
- Proven ability to build and maintain production-grade systems
- Experience working in live (non-mocked) environments with multi-step interactions
- Strong analytical skills and attention to detail
Nice to Have
- Experience with multi-stage system workflows and coordination tasks
- Familiarity with integrating APIs, databases, or external platforms
- Understanding of system vulnerabilities (e.g., privacy leaks, prompt injection, access escalation)
- Experience working with AI systems or agent-based workflows
- Comfort working with frameworks for persistent state tracking, or similar tooling