QuickNode is seeking a Technical Operations Engineer, Solana, to ensure the reliability, scalability, and performance of our Solana-based services. You will be instrumental in managing, optimizing, and enhancing validator nodes, RPC deployments, and related infrastructure, directly impacting our operational excellence and customer trust.
What You'll Do
- Lead end-to-end deployment and optimization projects for Solana infrastructure, including validator nodes, RPC endpoints, and indexing services.
- Own SEV 0/1 incident response, coordinating mitigation, running postmortems, and ensuring root-cause resolution.
- Define and manage service-level objectives (SLOs) and service-level agreements (SLAs). Build cost models and capacity planning tools.
- Develop dashboards and alerting solutions using tools like Grafana and DataDog for proactive issue detection.
- Implement and maintain automation via Ansible, Terraform, and Kubernetes to reduce toil and ensure consistent environments.
- Provide mentorship to engineers on deployment, observability, and Solana-specific operations. Review infrastructure code.
- Act as a technical representative in Solana forums and community calls, collaborating with the Solana Foundation.
- Partner with internal infrastructure, platform, and support teams to solve customer-impacting issues.
- Participate in a 24/7 on-call rotation for critical systems.
What We're Looking For
- 5+ years of experience in Technical Operations, Site Reliability Engineering (SRE), or related roles.
- Proven Linux/Unix system administration and advanced troubleshooting capabilities.
- Hands-on experience operating and optimizing Solana validator nodes, RPC endpoints, and associated infrastructure at scale.
- Familiarity with the Solana protocol and its core components at a high level. Proficiency in analyzing validator logs and debugging RPC issues.
- Solid hands-on experience with configuration management and infrastructure automation tools (Helm, Terraform, Ansible, Consul).
- Containerization expertise (Docker, Kubernetes), including managing and scaling services in cloud environments.
- Competency in scripting/programming languages (Rust, Go, JavaScript).
- Advanced proficiency in monitoring and analytics platforms (Grafana, DataDog).
- Demonstrated ability to identify performance patterns, forecast issues, and implement preventive solutions.
- Strong track record of defining, measuring, and maintaining SLAs/SLOs, and experience with incident response tooling (PagerDuty).
- Exceptional interpersonal and communication skills, with a proven ability to collaborate across teams.
- Self-motivated, solution-oriented self-starter driven by curiosity and initiative, who proactively identifies opportunities and consistently implements operational improvements.
- Thrives in dynamic environments and is committed to maintaining industry leadership in Web3.
Nice to Have
- RHCE-level Linux certification or similar.
- Contributions to open-source Solana projects.
Technical Stack
- Linux/Unix
- Solana
- Grafana, DataDog
- Ansible, Terraform, Kubernetes, Helm, Consul, Docker
- Rust, Go, JavaScript
- PagerDuty
Team & Environment
You will join a team of over 120 people globally. QuickNode's mission is to be the indispensable utility empowering companies to build Web3 businesses. We are committed to transparency, accountability, and ethical behavior, prioritize attracting and retaining the best talent globally, and maintain a high-performing and flexible way of working.
Work Mode
This is a global position.
QuickNode is an equal opportunity employer, and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity or expression, pregnancy, age, national origin, disability status, genetic information, protected veteran status, or any other characteristic protected by law.





