Responsibilities
- Use Automatic Prompt Generation (APG) tools to create baseline prompts for complex parent-child template clusters.
- Run and supervise the Automated Prompt Optimization (APO) tool, review its outputs, and flag when it reaches a deadlock or plateau.
- Manually draft, test, and refine prompts to navigate complex template architectures, overcome anti-patterns, and handle edge cases where tooling is lacking or broken.
- Monitor shadowbot runs to ensure that enough disagreements between human and LLM ratings are generated, registered, and tracked.
- Run prompt versions against established gold data to continuously measure autorater quality against the human crowd baseline, calculating accuracy metrics such as F1 scores, precision, and recall.
- Draft technical launch readiness justifications (Launch Certification Documentation) for final.
Work Arrangement
Remote (Country)
Team
Structure: Lead managing a team of prompt engineers
Additional Information
- Verify identity and eligibility to work in the United States.
- Complete a required employment eligibility verification form.
- Complete a live video verification with selected IDs and provide photos of those IDs within the first 3 days of employment.