Define, own, and run the evaluation strategy for AI products in life sciences, diagnostics, and biotechnology.
Design and implement robust evaluation frameworks for agentic workflows, LLMs/NLP, computer vision, and multimodal models.
Develop and execute evaluation plans to measure performance, reliability, and safety across multimodal datasets.
Collaborate with the Sr. Director for the Initiative, Sr. AI Engineers, and product teams to align evaluation criteria with product KPIs and regulatory needs.
Analyze evaluation results, identify weaknesses, and recommend improvements to AI models and workflows.
Build automated pipelines for continuous evaluation and monitoring of AI systems in production.

Bachelor’s degree in Computer Science, Engineering, Data Science, or a related field; MS/PhD preferred.
Proven experience designing and implementing evaluation methodologies for AI systems, including LLMs and computer vision.
Strong knowledge of metrics for AI performance, robustness, and fairness, especially in regulated domains.
Expertise in at least three of the following: benchmarking frameworks, statistical validation, synthetic data generation, adversarial testing, explainability techniques.
Proficiency in Python and ML libraries (e.g., PyTorch, TensorFlow) and familiarity with evaluation tools (e.g., OpenAI Evals, Dynabench, Promptfoo).
Ability to communicate complex evaluation results to technical and non-technical stakeholders and influence model improvements.

Experience with regulatory processes, especially for medical devices and AI/ML-based software as a medical device (SaMD).
Familiarity with quality management systems and standards relevant to the life sciences and diagnostics industries.
Knowledge of instrument control mechanisms and how they integrate with AI systems for enhanced automation.

Danaher is hiring an AI Evaluation Engineer
