Design and manage automated workflows for continuous integration, continuous delivery, and continuous training (CI/CD/CT) of machine learning models.
Package and deploy ML models as scalable microservices or batch processes that meet high uptime requirements.
Implement centralized monitoring, logging, and alerting to track model behavior, system performance, and data or concept drift.
Configure and optimize cloud-hosted ML environments, including GPU-accelerated clusters, using Infrastructure as Code methods.
Collaborate with product teams to enhance infrastructure efficiency through APIs, SDKs, and automation tools.
Ensure full traceability of data, code, and model versions to support compliance, security, and reproducibility.
Partner with data engineers to strengthen data pipelines using streaming technologies for real-time inference.
Guide teams on best practices in ML engineering, scalable operations, and AI-native development patterns.

Approach unclear challenges methodically, defining problem scope before selecting tools or solutions.
Communicate technical decisions clearly, collaborate across disciplines, and respect diverse viewpoints.
Take end-to-end ownership of systems from design through deployment and ongoing operations.
Integrate AI development tools as active collaborators: delegate specific coding tasks, critically validate their outputs, and manage multiple concurrent workflows.

ZenGRC is hiring a Senior Machine Learning Operations Engineer II (AI Native)