Responsibilities
- Manage and maintain global infrastructure spanning Windows Server 2022, SQL Server, MongoDB, Redis, and Cloudflare across four regions.
- Ensure availability, performance, and scalability of the CloudToolz platform, which is revenue-critical and must meet defined SLAs.
- Support the CTO in planning and executing deployments of new releases of CloudToolz, an ASP.NET Core-based platform that integrates with Dynamics, iMIS, and Salesforce and offers native CRM and CMS capabilities.
- Lead the migration of CloudToolz workloads toward a Kubernetes-based container orchestration model, defining the target architecture, migration sequencing, and rollout approach in collaboration with the CTO.
- Manage and evolve Kubernetes cluster operations including deployments, scaling, resource management, network policies, and RBAC.
- Build and maintain an observability stack using Grafana OSS, including dashboards, alerting, and integration with metrics, logs, and traces from application and infrastructure layers.
- Manage DNS, CDN, WAF, and DDoS protection through Cloudflare.
- Monitor system performance, identify and resolve bottlenecks, and lead incident response and recovery.
- Maintain up-to-date documentation, runbooks, configuration standards, and disaster recovery procedures.
- Lead IT security operations including identity and access management, endpoint protection, and vulnerability management.
- Administer and enforce policies across Microsoft 365, Bitwarden, Bitdefender GravityZone, and KnowBe4.
- Develop and enforce data protection and compliance policies aligned to GDPR and ISO 27001 frameworks.
- Conduct and coordinate security awareness training, phishing simulations, and incident response exercises.
- Monitor for threats and manage response, escalation, and remediation processes.
- Administer and optimise Microsoft 365 across the organisation, including Exchange Online, SharePoint, Teams, and Entra ID.
- Administer Salesforce within the role's scope, in coordination with the product team.
- Own configuration and policy management for endpoint security tools and password management.
- Collaborate with the CTO on infrastructure scaling strategy, architecture decisions, and technology selection, with a particular focus on the containerisation roadmap.
- Drive adoption of infrastructure-as-code practices to support repeatable, auditable deployments across environments.
- Drive automation of routine operations using PowerShell, other scripting languages, and monitoring tooling.
- Provide input into sprint planning and release coordination with the Product Owner for environment-level requirements.
- Participate in the on-call rotation to ensure platform availability across time zones.
Requirements
- 5 or more years of experience in IT infrastructure, cloud operations, systems administration, or DevOps roles.
- Strong hands-on experience with Windows Server 2022 and SQL Server in production environments.
- Proven experience with MongoDB and Redis in operational contexts.
- Demonstrated Cloudflare experience covering DNS management, WAF rules, CDN configuration, and security features.
- Solid proficiency in Microsoft 365 administration, including Entra ID (formerly Azure AD), Exchange Online, and SharePoint.
- Working knowledge of enterprise security tools; direct experience with Bitwarden, KnowBe4, Bitdefender GravityZone, or equivalent products is preferred.
- Strong understanding of networking fundamentals including firewalls, VPNs, DNS, and access control architecture.
- Sound knowledge of GDPR and practical experience implementing or maintaining compliance controls.
- Ability to work independently, manage competing priorities, and communicate clearly with both technical and non-technical stakeholders.
- Experience working with geographically distributed teams across multiple time zones.
Nice to Have
- Hands-on Kubernetes experience, including cluster administration, workload deployment, networking (ingress, network policies), RBAC, and persistent storage.
- Experience building or operating an observability stack with Grafana OSS, including Prometheus metrics, Loki log aggregation, and Grafana dashboards and alerting.
- Familiarity with container runtimes and tooling including Docker, containerd, Helm, and kubectl.
- Relevant certifications such as MCSE, CISSP, CompTIA Security+, CKA, or equivalent.
- Familiarity with ISO 27001 and experience contributing to certification or audit processes.
- PowerShell scripting proficiency for automation, configuration management, and reporting.
- Experience with infrastructure-as-code tools such as Terraform, Pulumi, or Ansible.
- Exposure to CI/CD pipelines and GitOps workflows in a production environment.
- Salesforce administration experience.
Benefits
- Full-time, permanent position
- Remote-first; flexible hours with availability for global coordination
- Base salary range: €60,000 – €85,000 per year
- Typical target: €70,000 – €80,000 per year
- Performance bonus: Discretionary, based on individual and company performance
- Pension: Employer contribution in line with local requirements
- Probation: 6 months
Additional Information
- The CloudToolz platform operates continuously across multiple regions; the role includes on-call and out-of-hours escalation responsibilities.
- Compensation is commensurate with demonstrable experience and the scope of the role.
- The salary range reflects a small-business context while remaining competitive in the Berlin IT market for this level of responsibility.
- Note: Please submit your CV together with a cover letter that explains why your qualifications and experience make you a strong fit for this position.