This role focuses on maintaining and modernizing complex IT environments for large-scale clients in industries such as finance and retail. As a System Administrator Specialist, you will oversee system performance, ensure security compliance, and drive improvements across on-premises and cloud platforms. Your work directly supports operational stability and long-term infrastructure evolution.
Key Responsibilities
- Monitor system health and application performance using observability tools including Zabbix, Grafana, and APM solutions.
- Respond to and resolve incidents using structured methodologies aligned with ITIL standards, including root cause analysis and post-incident reviews.
- Provide technical support across Wintel, Unix, Microsoft 365, Exchange, SharePoint, and Active Directory environments.
- Participate in urgent response sessions to quickly address critical outages and service disruptions.
- Work with IT service management platforms such as ServiceNow and GLPI, and use internal observability systems like Kyndryl Bridge.
- Support automation initiatives and adopt Site Reliability Engineering practices to improve system resilience and efficiency.
- Manage system data, maintain compliance standards, and deliver consistent operational solutions.
- Supervise task queues, assign priorities, and coordinate with technical teams to ensure timely resolution.
- Identify opportunities for infrastructure modernization and advise on potential service enhancements.
- Build strong client relationships through clear communication and collaborative problem-solving.
Required Qualifications
- Proven experience managing Microsoft Windows Server 2016/2019 and administering Linux systems.
- Hands-on work with VMware ESX virtualization environments.
- Experience managing public and private cloud infrastructures, particularly migrations from on-prem to cloud.
- Fluency in Portuguese is required, along with a working knowledge of English for technical documentation.
- Willingness to support 24x7 operations as needed.
- Flexibility to work in a hybrid model, including time at client locations and remotely.
Preferred Skills
- Experience with infrastructure automation and Infrastructure-as-Code (IaC) practices.
- Background in vulnerability management and system hardening.
- Familiarity with Microsoft Active Directory and identity federation services.
- Proficiency in monitoring and observability tools such as Datadog, Elastic Stack, OpenTelemetry, and Grafana.
- Solid understanding of system and database administration principles.
- Knowledge of ITIL processes, particularly Incident, Problem, and Major Incident Management.
Work Environment
This position follows a hybrid model, combining remote work with on-site presence at client facilities. You must be available to support critical systems across time zones and adapt to evolving operational demands.
Learning and Growth
You’ll have access to training and certification programs from providers including Microsoft, Amazon, Google, and Skillsoft. Career progression paths support advancement from technical roles to architectural leadership. Continuous learning is embedded in daily work, with resources to grow expertise in cloud, automation, and reliability engineering.
Culture and Values
The environment emphasizes inclusion, empathy, and sustainable progress. Teams are encouraged to bring their full selves to work, supported by employee-led networks that foster connection and belonging. The organization values diverse perspectives and promotes equitable opportunities across all levels. Community engagement is encouraged through volunteering and giving programs that support local and global causes.
