We are seeking a Site Reliability Engineer (SRE) to support and manage our Infrastructure-as-a-Service (IaaS) platform built on the VMware Cloud Foundation (VCF) stack. You will be responsible for maintaining, automating, and optimizing the VCF environment to ensure high availability, scalability, and operational efficiency. This role is ideal for someone passionate about infrastructure automation, modern cloud technologies, and delivering reliable, enterprise-grade services.
Key Responsibilities
- Manage and maintain VMware Cloud Foundation (VCF) components, including vCenter, ESXi, NSX, and vRealize Suite (vRA, vRO, vROps).
- Automate deployment, configuration, and operational tasks using Ansible, PowerCLI, and other scripting tools.
- Monitor infrastructure health, performance, and capacity to ensure optimal uptime and scalability.
- Implement and manage infrastructure-as-code (IaC) practices for consistent environment management.
- Troubleshoot complex infrastructure issues across compute, storage, and network layers.
- Collaborate with cross-functional teams to design and implement enhancements to the IaaS platform.
- Develop and maintain operational documentation, runbooks, and standard operating procedures (SOPs).
- Participate in an on-call rotation to support critical infrastructure incidents.
Qualifications and Experience
- 5+ years of experience as an SRE, Systems Engineer, or Cloud Infrastructure Engineer.
- Hands-on expertise with VMware vCenter, ESXi, NSX, and vRealize Suite (vRA/vRO/vROps).
Pay: RM6,000.00 - RM10,000.00 per month
Work Location: In person