About the Role

This role covers proactive monitoring, incident response, and system maintenance to keep operations running continuously and resolve technical issues quickly.

Responsibilities

Monitor network and system performance using centralized tools
Respond to alerts and initiate incident resolution procedures
Perform root cause analysis for recurring system issues
Maintain documentation for configurations and procedures
Deploy and configure server infrastructure as needed
Support escalation workflows during major incidents
Collaborate with engineering teams to improve system resilience
Implement and verify backup and recovery processes
Apply security patches and system updates on schedule
Troubleshoot connectivity and service delivery problems
Ensure compliance with internal operational standards
Participate in on-call rotations for after-hours support
Optimize monitoring dashboards and alert thresholds
Track and report on system uptime and incident metrics
Assist in capacity planning for infrastructure growth

Nice to Have

CompTIA or CCNA certification, or equivalent
Experience with containerization or orchestration tools
Familiarity with configuration management systems
Background in high-availability environments
Exposure to DevOps practices and CI/CD pipelines

Compensation

Competitive salary and benefits package

Work Arrangement

Hybrid work model with on-site and remote options

Team

Part of a 24/7 Network Operations Center (NOC) team responsible for system reliability

On-Call Expectations

Team members rotate through on-call duties to handle incidents outside business hours
Response time targets are defined by severity level
Tools and runbooks are provided to support rapid diagnosis

Growth Opportunities

Engineers are encouraged to pursue advanced training
Internal mobility is supported across technical teams

Available for qualified candidates

Atlas Technica is hiring a System Engineer (NOC Team)