Responsibilities
- work with several teams of passionate and talented engineers who are building the internal Control Plane used by our SREs and Infrastructure Operations teams to manage our internal DCaaS and IaaS platforms.
- be responsible for tools that support the management of a growing, globally distributed fleet of servers, storage, and network gear spread across more than a thousand colos worldwide.
- play an active part in shaping the future of the infrastructure that propels Cloudflare’s scale and growth.
- have the opportunity to write code that brings your designs to fruition and to mentor high-potential engineers on their distributed systems journey.
- work alongside engineers who have presented at DevOpsDays, Config Management Camp 2024 & 2025, Monitorama, OSMC, KubeCon, and PromCon.
- deliver on the key Health Mediated Deployment projects that are tracked by Cloudflare’s senior leadership, up to the founders.
Requirements
- Minimum 10 years of experience working with distributed systems.
- Experience designing, building, and managing high-volume software applications.
- Expertise in at least one modern, strongly typed programming language.
- Experience debugging, measuring, and optimizing large-scale distributed systems and identifying their failure modes.
- Excellent collaboration skills
- Proven ability to convey ideas effectively through verbal and written communication
- Ability to translate business needs into requirements, design documents and technical solutions
- Knowledge of API design standards, patterns and best practices
- Proven ability to use data to drive business outcomes
- Proven experience in developing architects and lead engineers
- Solid understanding of computer science fundamentals including data structures, algorithms, and object-oriented or functional design.
Nice to Have
- Experience optimizing and scaling infrastructure provisioning, repair, and decommissioning processes and automation.
- Experience scaling and simplifying Configuration Management systems that manage hundreds of thousands of nodes.
Benefits
- Medical/Rx Insurance
- Dental Insurance
- Vision Insurance
- Flexible Spending Accounts
- Commuter Spending Accounts
- Fertility & Family Forming Benefits
- On-demand mental health support and Employee Assistance Program
- Global Travel Medical Insurance
- Short and Long Term Disability Insurance
- Life & Accident Insurance
- 401(k) Retirement Savings Plan
- Employee Stock Participation Plan
- Flexible paid time off covering vacation and sick leave
- Leave programs, including parental, pregnancy health, medical, and bereavement leave
Team Structure
The Infrastructure Tooling Team, part of the Resiliency Engineering organization, defines, builds, and supports the tools that the rest of the Infrastructure Engineering team uses to manage our infrastructure at scale.
Additional Information
- Compensation may be adjusted depending on work location.
- This role is eligible to participate in Cloudflare’s equity plan.