Responsibilities
- Provision, configure, and maintain cloud infrastructure across AWS, Azure, GCP, and OCI.
- Monitor, troubleshoot, and resolve incidents, performance issues, and service outages in production and staging environments.
- Implement and maintain monitoring, alerting, and logging solutions to ensure high availability and reliability.
- Lead root cause analysis and post-mortem documentation for major incidents.
- Execute patch management, upgrades, and regular maintenance activities.
- Develop and maintain backup, disaster recovery, and failover strategies and operations.
- Participate in on-call rotation and after-hours support as required.
- Develop and maintain Infrastructure as Code (IaC) templates using tools such as Terraform, CloudFormation, ARM, or OCI Resource Manager.
- Use scripting (e.g., Python, Bash, PowerShell) to automate repetitive tasks and operational processes.
- Champion the use of configuration management tools and assist in DevOps pipeline integrations.
- Recommend and implement cost optimization, resource utilization, and rightsizing strategies.
- Ensure adherence to security best practices, including least-privilege access, encryption, and network segmentation.
- Implement and manage identity and access management (IAM) policies and roles.
- Monitor, identify, and remediate security vulnerabilities reported by scanning tools or external advisories.
- Support compliance efforts related to customer and regulatory requirements (TxRAMP, ISO, SOC 2, etc.).
- Work closely with application, security, and network teams for solution delivery and support.
- Mentor junior engineers and provide technical guidance as needed.
- Create and update technical documentation, runbooks, and SOPs.
- Participate in client calls to provide technical input when required.
Requirements
- 5+ years of hands-on experience in cloud engineering, operations, or support.
- 3+ years of multi-cloud experience (hands-on experience with at least two of AWS, Azure, GCP, or OCI is required; familiarity with all four is preferred; AWS is mandatory).
Nice to Have
- Bachelor’s degree in Computer Science, Engineering, Information Systems, or a related discipline; or equivalent professional experience.
- At least two of the following certifications (or equivalent experience):
  - AWS Certified Solutions Architect / SysOps Administrator
  - Microsoft Certified: Azure Administrator Associate or Solutions Architect Expert
  - Google Professional Cloud Architect / Engineer
  - Oracle Cloud Infrastructure Architect Associate/Professional
- DevOps or automation certifications (e.g., Kubernetes, Terraform, Ansible).
- ITIL Foundation or other support framework knowledge.
- Direct experience in managed services/NOC/SOC/MSP environments is a plus.
- In-depth expertise in provisioning, configuring, securing, supporting, and optimizing cloud-native and hybrid workloads in AWS, Azure, GCP, and/or OCI.
- Administration of compute, storage, networking, database, and PaaS services across supported platforms.
- Deep hands-on knowledge of architecture, deployment, monitoring, and troubleshooting on major public cloud platforms (AWS, Azure, GCP, OCI).
- Experience with CI/CD pipelines, containerization (Docker, Kubernetes), and automation tools (Terraform, CloudFormation, ARM templates, etc.).
- Familiarity with cloud-native security best practices (IAM, network security, data encryption, etc.).
- Proficiency in scripting languages (Python, Bash, PowerShell, etc.).
- Expertise in ServiceNow ITSM.
- Expertise in cloud cost optimization; familiarity with Apptio Cloudability.
- Strong expertise in multi-cloud disaster recovery.
- Familiarity with AppGate SDP, Qualys TotalCloud, Qualys Patch Management, Qualys CSAM, CrowdStrike, Palo Alto NGFW, etc.
- Ability to analyze logs and monitor performance using native tools (CloudWatch, Azure Monitor, Google Cloud Monitoring (formerly Stackdriver), OCI Monitoring, etc.).
- Strong understanding of backup strategy, disaster recovery, and high-availability architecture.
- Strong expertise in multi-cloud security compliance, data encryption, network security, user access control, private endpoint setup, etc.
- Strong expertise in GitHub and repository management.
- Ability to set up alert rules and thresholds in Azure Monitor, AWS CloudWatch, GCP Monitoring, and OCI Monitoring, and to integrate the resulting alerts with ServiceNow incident ticketing.
- Ability to connect multi-cloud VMs and instances to Microsoft Sentinel SIEM.
- Ability to support customer self-provisioning of cloud instances with the required security guardrails via Azure Blueprints, AWS Control Tower, etc.
Benefits
- medical, dental, and vision insurance
- flexible spending or health savings account
- life and AD&D insurance
- short- and long-term disability coverage
- paid time off
- employee assistance
- participation in a 401(k) program with company match
- additional voluntary or legally-required benefits
Work Arrangement
Hybrid
Additional Information
- Must be a US citizen or green card holder to apply.
- An FBI CJIS background check is administered every 12 months.
- This position supports a 24/7 production operations support model.

