Job Overview
We are seeking a Reliability Engineer to support enterprise IT operations, monitoring, automation, and reliability engineering initiatives. The ideal candidate will work closely with IT and infrastructure teams to improve operational stability, automate manual processes, and enhance observability across systems and applications.
Key Responsibilities
Observability & Monitoring
- Monitor system, application, and infrastructure health to ensure operational reliability and availability
- Install, configure, and manage monitoring and observability tools
- Implement and enhance monitoring solutions to support proactive incident detection and operational improvements
- Analyze logs, metrics, and operational data to generate dashboards, reports, and actionable insights
- Support continuous improvement of operational monitoring and alerting capabilities
Automation & Operational Support
- Automate day-to-day operational tasks using Ansible, Jenkins, Shell scripting, PowerShell, or Python
- Develop scripts and automation workflows to improve efficiency and reduce manual operational effort
- Support deployment automation and operational process optimization initiatives
Reliability Engineering & Production Support
- Provide operational and project support to IT and infrastructure teams
- Handle service requests and support activities related to reliability engineering functions
- Participate in production incident troubleshooting and resolution activities
- Support project cutovers, deployments, and maintenance activities, including after-hours support when required
- Collaborate effectively with internal stakeholders, vendors, and external partners
Required Skills & Qualifications
- Minimum 3 years of experience in IT operations, automation, monitoring, or reliability engineering
- Strong scripting and automation experience with Ansible, Shell, Python, or PowerShell
- Hands-on experience with monitoring and observability solutions
- Familiarity with Windows, Linux, and Unix operating systems, AWS cloud platforms, middleware, and database environments
- Good understanding of production support, incident management, and operational processes
- Strong troubleshooting, analytical, and problem-solving skills
- Excellent communication and stakeholder management abilities
Preferred Skills
- SRE certification or equivalent
- Experience supporting enterprise production environments and 24x7 operations
- Exposure to CI/CD, DevOps, and infrastructure automation practices
- Familiarity with operational dashboards, logging frameworks, and monitoring integrations
Soft Skills
- Strong ownership and accountability mindset
- Ability to work independently and collaboratively in cross-functional teams
- Calm and effective under pressure during production incidents
- Proactive approach toward operational improvement and automation initiatives
Work Conditions
- Some operational support activities may need to be carried out after office hours
- Participation in production incident management and project cutover activities as needed
EA License Number: 23C2060
Registration ID: R22109715
Disclaimer: The company is committed to ensuring the privacy and security of your information. By submitting this form, you consent to the collection, processing, and retention of the information you provide. The data collected (which may include your contact details, educational background, work experience, and skills) will be used solely for the purpose of evaluating your qualifications for the position you are applying for. Your data will be stored securely and retained for the duration necessary to fulfill our hiring process. If you are not selected for the position, your data will be kept on file for a limited period in case future opportunities arise. You have the right to access, correct, or delete your data at any time by contacting us at Quess Singapore (quesscorp.sg).
This is in partnership with the Employment and Employability Institute Pte Ltd (“e2i”).
e2i is the empowering network for workers and employers seeking employment and employability solutions. e2i serves as a bridge between workers and employers, connecting with workers to offer job security through job-matching, career guidance, and skills upgrading services, and partnering with employers to address their manpower needs through recruitment, training, and job redesign solutions. e2i is a tripartite initiative of the National Trades Union Congress set up to support nation-wide manpower and skills upgrading initiatives. By applying for this role, you consent to Quesscorp Singapore’s PDPA and e2i’s PDPA.