Responsibilities
- Define, own, and run the AI evaluation strategy for AI products in life sciences, diagnostics, and biotechnology.
- Design and implement robust evaluation frameworks for agentic workflows, LLMs/NLP, computer vision, and multimodal models.
- Develop and execute evaluation plans to measure performance, reliability, and safety across multimodal datasets.
- Collaborate with the Sr. Director for the Initiative, Sr. AI Engineers, and product teams to align evaluation criteria with product KPIs and regulatory needs.
- Analyze evaluation results, identify weaknesses, and recommend improvements to AI models and workflows.
- Build automated pipelines for continuous evaluation and monitoring of AI systems in production.
Requirements
- Bachelor’s degree in Computer Science, Engineering, Data Science, or a related field; MS/PhD preferred.
- Proven experience designing and implementing evaluation methodologies for AI systems, including LLMs and computer vision.
- Strong knowledge of metrics for AI performance, robustness, and fairness, especially in regulated domains.
- Expertise in at least three of the following: benchmarking frameworks, statistical validation, synthetic data generation, adversarial testing, explainability techniques.
- Proficiency in Python and ML libraries (e.g., PyTorch, TensorFlow) and familiarity with evaluation tools (e.g., OpenAI Evals, Dynabench, Promptfoo).
- Ability to communicate complex evaluation results to technical and non-technical stakeholders and influence model improvements.
Nice to Have
- Experience with regulatory processes, especially for medical devices and AI/ML-based software as a medical device (SaMD).
- Familiarity with quality management systems and standards relevant to the life sciences and diagnostics industries.
- Knowledge of instrument control mechanisms and how they integrate with AI systems for enhanced automation.
Additional Information
- The position is remote within Germany.
- The role is part of the AI Product and Imaging Innovation team.
- The job posting includes a reference code: #LI-AC1.

