Responsibilities
- Install, configure, patch, and upgrade AIX, RHEL, and Debian systems
- Support and maintain high-availability clusters using HP Serviceguard, Pacemaker/Corosync, and Bull ARF
- Diagnose and resolve complex system-level issues involving CPU, memory, I/O, network, kernel, and applications
- Enforce security policies and system hardening through SSH, sudo, PAM, firewalls, SELinux, auditing, and access controls
- Execute backup strategies and disaster recovery procedures, including mksysb images, NIM, and restore operations
- Develop and deploy automation scripts using Bash, KSH, Python, and Ansible
- Participate in on-call support and conduct disaster recovery and failover testing
- Work with DBAs, application teams, storage, network, and security groups to ensure system reliability
- Create and maintain detailed technical documentation and operational runbooks
Requirements
- Thorough understanding of Unix operating system internals
- Strong analytical, problem-solving, and communication abilities
- Proven experience with enterprise Unix and Linux platforms, specifically IBM AIX, Red Hat Enterprise Linux, and Debian
- Background in production support, including troubleshooting, patching, automation, security hardening, and cluster management
- In-depth knowledge of Unix infrastructure components such as storage, filesystems, networking, authentication, backup, restore, and performance optimization
- Hands-on experience with clustering solutions including HP Serviceguard, Bull ARF, Pacemaker, and Corosync
- Demonstrated skill in scripting and automation using Bash, KSH, Python, and Ansible