Job Summary
You will maintain and optimize the reliability and performance of cloud infrastructure, automate deployments, support incident response, and collaborate with engineering teams to improve system scalability and operational efficiency.
Responsibilities
- Maintain system reliability, availability, and performance for cloud infrastructure and services to ensure continuous operations
- Monitor production environments and manage observability tools to track metrics, logs, and alerts for proactive issue detection
- Support incident response by troubleshooting issues, conducting root cause analysis, and leading post-incident reviews to prevent recurrence
- Manage and optimize cloud infrastructure on AWS, Azure, or GCP to improve resource utilization and cost efficiency
- Implement Infrastructure as Code (IaC) and automate deployments through CI/CD pipelines to accelerate delivery and reduce errors
- Enhance system scalability, resilience, and operational efficiency by identifying and applying improvements
- Support security best practices by ensuring compliance and coordinating vulnerability remediation efforts
- Collaborate with engineering and platform teams to drive service reliability improvements and operational excellence
- Perform system performance tuning, capacity planning, and infrastructure optimization to meet evolving business demands
- Execute additional platform engineering and operational tasks as assigned to support overall system health
Required competencies and qualifications
- Bachelor’s degree in Computer Science, IT, Engineering, or a related field
- 3–5 years of experience in SRE, DevOps, Cloud Infrastructure, or related roles
- Hands-on experience with cloud platforms such as AWS, Azure, or GCP
- Familiarity with monitoring and observability tools
- Experience with CI/CD pipelines, Infrastructure as Code (IaC), and automation tools
- Knowledge of Linux systems, networking, Docker, and Kubernetes
- Basic scripting or programming skills (e.g., Python, Bash, or Go)
- Strong troubleshooting, problem-solving, and incident management skills
- Effective communication and teamwork skills in fast-paced environments
Preferred competencies and qualifications
- Relevant cloud certifications (e.g., AWS Certified Solutions Architect, Azure Administrator, Google Cloud Professional)
Pay: $7,000.00–$10,000.00 per month
Work Location: In person