Remote (Bulgaria) · Full-time

Point Wild is hiring a Principal Platform Engineer

Responsibilities

  • Design and manage scalable cloud infrastructure on GCP with Kubernetes to support demanding machine learning workloads.
  • Create automated workflows for training, evaluating, and releasing machine learning models using tools such as Jenkins, GitHub Actions, or Airflow.
  • Set up monitoring systems to detect model degradation, accuracy loss, latency issues, and data drift in live environments.
  • Collaborate across data, machine learning, backend, and frontend teams to ensure seamless integration and operations.
  • Establish monitoring solutions that track both system performance and ML-specific indicators like feature drift and prediction consistency.
  • Deploy observability platforms that allow individual engineering teams to oversee their own services and pipelines.
  • Take part in on-call duties and contribute to maintaining compliance with security standards such as SOC.

Nice to Have

  • Prior experience implementing systems for continuous monitoring of model accuracy and detecting data or concept drift.
  • Background in using Ansible for cluster provisioning and disaster recovery procedures.
  • Holding recognized certifications such as CKA, CKS, or GCP Professional Cloud Architect/Security Engineer.
  • Exposure to modern observability tools including Loki, Grafana, or large-scale ClickHouse operations.

What you bring to the table

  • 8 to 10+ years in DevOps/Platform Engineering, with at least 2 years operating production ML workloads.
  • Deep hands-on experience with GCP (VPC-SC, IAM, Organization Policies) and GKE (cluster topology, Helm, Kustomize, ArgoCD).
  • High proficiency with Istio (VirtualServices, mTLS, sidecar injection) and Kong API Gateway.
  • Expert-level Terraform skills using Atlantis/GitOps in large, multi-hundred-file environments.
  • Experience managing enterprise identity and secrets with tools like Auth0, Dex, ESO, or SOPS.
  • Production experience with Airflow and ML-serving stacks (e.g., Triton, vLLM, MLflow).
  • Comfortable managing Cloud SQL (PostgreSQL), BigQuery, and in-cluster stores like Elasticsearch or ClickHouse.
  • Upper-intermediate or higher English proficiency in speaking and writing.

It would be great if you also had

  • Experience monitoring model accuracy and detecting data/concept drift.
  • Familiarity with Ansible for cluster bootstrapping and recovery.
  • Kubernetes (CKA/CKS) or GCP Professional Cloud Architect/Security Engineer certifications.
  • Exposure to Loki, Grafana, or managing ClickHouse at scale.

As part of Point Wild, you will

  • Solve real customer problems through targeted cybersecurity solutions.
  • See your impact daily in a fast-moving, contributor-focused organization.
  • Accelerate your career by learning new technologies and working with talented peers.
  • Have the chance to shape the direction and growth of the organization.

About the company

Point Wild helps customers monitor, manage, and protect against the risks associated with their identities and personal information in a digital world. Backed by WndrCo, Warburg Pincus, and General Catalyst, Point Wild is dedicated to creating the world’s most comprehensive portfolio of industry-leading cybersecurity solutions. Our vision is to become THE go-to resource for every cyber protection need individuals may face, today and in the future.

Job Details

  • Category: infrastructure
  • Posted: 24 days ago