300+ Site Reliability Engineering Jobs in Malaysia | Job Vacancies | June 2026 | Ricebowl

Showing 364 job results for "site reliability engineering"

SGD5,800 Per Month

Singapore

  • Conduct routine preventive maintenance and manage operational activities efficiently.
  • Prepare comprehensive project documentation and reports.
  • Troubleshoot and resolve technical issues promptly. ...
Posted
14 days ago
SGD4,000 Per Month

Singapore

  • Collaborate with development and operations teams to embed SRE principles, define and maintain Service Level Objectives (SLOs), and ensure scalable, resilient system architecture and disaster recovery readiness
  • Provide secondary support and technical expertise for messaging and middleware platforms, including Kafka and MQ, by assisting with administration, configuration, performance tuning, and incident resolution
  • Conduct architectural reviews to identify infrastructure gaps, remediate network vulnerabilities, and advise application teams on operational excellence, security, and compliance best practices ...
Posted
14 days ago

AMAZON ASIA-PACIFIC RESOURCES PRIVATE LIMITED

SGD16,000 Per Month

Singapore

  • Bachelor's degree in Electrical or Mechanical Engineering, Engineering Technology, or Reliability Engineering, or 10+ years of experience managing, analyzing, and communicating results to senior leadership
  • Experience in supply chain, commodity, and supplier management in a high volume, global sourcing and operations manufacturing environment with a global supply base of contract manufacturers
  • Knowledge of critical data center mechanical and electrical equipment ...
Posted
14 days ago
SGD7,000 Per Month

Singapore

  • Monitor and scale telephony quotas, concurrent tasks, and backend infrastructure to support peak operational demand.
  • Manage and support core AWS infrastructure services including EC2, ECS/EKS, S3, Lambda, DynamoDB, and VPC networking.
  • Implement security best practices including IAM least-privilege access, encryption (KMS), and compliance standards such as SOC2, HIPAA, or PCI-DSS where applicable. ...
Posted
14 days ago
SGD5,000 Per Month

Singapore

  • Monitor and scale telephony quotas, concurrent tasks, and backend infrastructure to support peak operational demand.
  • Manage and support core AWS infrastructure services including EC2, ECS/EKS, S3, Lambda, DynamoDB, and VPC networking.
  • Implement security best practices including IAM least-privilege access, encryption (KMS), and compliance standards such as SOC2, HIPAA, or PCI-DSS where applicable. ...
Posted
14 days ago
Undisclosed

Singapore

  • Manage virtualisation platforms (e.g., VMware, Hyper-V), including capacity monitoring, performance optimisation, and lifecycle management.
  • Implement and maintain robust monitoring and observability solutions for all platform components using modern tooling (e.g., Prometheus, Grafana, ELK stack).
  • Execute platform patching strategies, leveraging automation to maintain security and stability while minimising service disruption. ...
Posted
15 days ago

Elitez India

Undisclosed

Singapore

  • 3 to 5 years of experience.
  • Must have Java and Spring Boot experience.
  • Good hands-on skills in SQL and Linux. ...
Posted
15 days ago
Undisclosed

Ang Mo Kio

  • Perform and manage routine preventive maintenance and operational activities promptly and effectively
  • Provide project documentation and reports
  • Diagnose and rectify technical issues ...
Posted
15 days ago
Undisclosed

Singapore

  • Infrastructure, CI/CD & Portability
  • Cluster Operations: Deploy, configure, and maintain platform services on enterprise Kubernetes (EKS) using Helm charts.
  • Portable Deployment: Implement infrastructure configurations with a strict focus on portability, ensuring applications can be cleanly migrated between distinct Kubernetes clusters. ...
Posted
15 days ago
Undisclosed

Singapore

  • Infrastructure Architecture & Highly Portable Systems
  • Cloud Infrastructure: Design, harden, and operate enterprise Kubernetes (EKS) clusters on GCC+.
  • Portable Architecture: Own the infrastructure design with a strict emphasis on portability. Configuration management, state, and workloads must be architected to allow easy porting across different Kubernetes environments without platform lock-in. ...
Posted
15 days ago
Undisclosed

KL City

  • Choosing Capgemini means choosing a company where you will be empowered to shape your career in the way you’d like, where you’ll be supported and inspired by a collaborative community of colleagues around the world, and where you’ll be able to reimagine what’s possible. Join us and help the world’s leading organizations unlock the value of technology and build a more sustainable, more inclusive world.
Posted
15 days ago
Undisclosed

Singapore

  • Lead technical operations and release readiness, including change management, incident/problem management, capacity planning, configuration management and disaster recovery.
  • Partner with product, infrastructure, and security teams to address deployment challenges and technical and operational constraints.
  • Strengthen monitoring and observability capabilities through metrics, logs, alerts and dashboards to support proactive system operations. ...
Posted
17 days ago
Undisclosed

Singapore

  • Programming: High proficiency in Python and Java for developing network management platforms and automation scripts.
  • Observability Tools: Hands-on experience with Grafana, Elasticsearch,
  • CI/CD: Experience building automated pipelines (Jenkins, Bitbucket, Jira) for validating network changes before production deployment.
Posted
18 days ago
Undisclosed
  • Provide technical support for installed systems during and outside normal office hours, as required.
  • Act as the primary liaison between customers and the back-office Level 2 support team based in Malaysia.
  • Assist with system commissioning and deployment during the project implementation phase by providing onsite support. ...
Posted
18 days ago
Undisclosed

Singapore

  • Automation & Tooling: Develop and implement automation tools and scripts using Python to reduce manual operational tasks and improve efficiency. This includes automating health checks, operational tasks, and contributing to CI/CD pipelines.
  • Monitoring & Observability: Implement and enhance monitoring, alerting, and logging systems to ensure comprehensive visibility into application health and performance. Define and measure Service Level Objectives (SLOs) and Service Level Indicators (SLIs).
  • Collaboration & Communication: Act as a primary point of contact for users, effectively communicating status updates and resolution plans during live issues. Collaborate closely with development, infrastructure, and other technology teams, as well as external vendors, to drive issue resolution and system enhancements. ...
Posted
4 days ago
Undisclosed

Singapore

  • Plan and execute production activities including upgrades, migrations, failover testing, and resilience enhancements.
  • Support onboarding of new applications and data services, ensuring alignment with architecture and governance standards.
  • Collaborate with Market Data Admin and Commercial teams on entitlement, usage reviews, and audit requests. ...
Posted
4 days ago
Undisclosed

KL City

  • Develop and maintain automation tools for deployment, monitoring, and incident response.
  • Create scripts and workflows to reduce manual intervention and improve efficiency.
  • Respond to system outages and incidents, performing root cause analysis and implementing fixes. ...
Posted
6 days ago
Undisclosed

Singapore

  • Backup and Recovery: Design comprehensive backup and recovery strategies ensuring business continuity and data integrity.
  • Resource Management: Provision and manage infrastructure resources such as servers, virtual machines, containers, and related services to support development and operations teams.
  • Product Reliability: Troubleshoot performance, storage, networking, database, and application issues. Support incident response, root cause analysis, and preventive measure implementation. ...
Posted
19 days ago
SGD4,000 Per Month

Singapore

  • Author, maintain, and improve operational runbooks, knowledge base articles, and diagnostic scripts to build a library of repeatable solutions and empower first-line support capabilities.
  • Collaborate directly with the core development teams to reproduce and diagnose root-cause defects, contributing detailed findings and logs to the reliability backlog for sprint prioritisation.
  • Proactively analyse system performance and observability data (metrics, logs, traces) to identify degradation trends, capacity bottlenecks, and potential failures before they impact customers. ...
Posted
19 days ago
Undisclosed

KL City

  • Data Platform Integration: Support integration and operation of the Dataiku platform, enabling robust data pipelines and analytics workflows.
  • Monitoring and Observability: Implement and maintain monitoring solutions with Splunk and Grafana to ensure application and infrastructure health, proactively identifying issues and performance bottlenecks.
  • CI/CD and Automation: Build, optimise, and maintain CI/CD pipelines for rapid and reliable deployment of code and infrastructure changes. ...
Posted
9 days ago
Undisclosed
WFH

KL City

  • Analytical Problem Solving: Investigate deep-tier technical bottlenecks and follow through to resolution.
  • Knowledge Management: Maintain and update technical documentation, ensuring new issues and solutions are cataloged for the team.
  • Process Optimization: Drive team productivity by identifying opportunities for automation and tool integration. ...
Posted
9 days ago
Undisclosed

Singapore

  • Collaborate closely with the product manager to accurately implement UI mockups and interactive features.
  • Integrate frontend interfaces with backend APIs efficiently and ensure seamless data flow.
  • Contribute to improving the team’s development workflow, code quality, and system performance. ...
Posted
9 days ago
Undisclosed

Singapore

  • Reliability & Performance:
  • Design, implement, and maintain robust situational awareness, monitoring, and alerting to ensure high availability and performance of banking services.
  • Drive the adoption of best practices in system design, capacity planning, and performance optimization. ...
Posted
20 days ago
Undisclosed

Singapore

  • Design, implement, and maintain robust situational awareness, monitoring, and alerting to ensure high availability and performance of banking services.
  • Drive the adoption of best practices in system design, capacity planning, and performance optimization.
  • Identify and mitigate potential risks to system reliability, proactively addressing issues before they impact customers. ...
Posted
20 days ago

Mfex Malaysia Sdn. Bhd.

Undisclosed
  • Supporting FundsPlace operational teams in achieving their objectives by leveraging a structured citizen developer framework, including solution development, lifecycle management, and risk compliance
  • Reviewing, challenging, and proposing business and technical solutions with end-users, including operational process flow reviews
  • Developing tools, queries, and automation solutions using MS Office, VBA, SQL, and technologies such as Power Platform, Power BI, and UiPath (robotic process automation) ...
Posted
10 days ago
Undisclosed

Singapore

  • Provides continuous training and development opportunities to help employees achieve their career goals, whatever their background or experience.
  • Is committed to advancing our tools, technology, and ways of working to better serve our clients and their evolving business needs.
  • Believes in responsible growth and is dedicated to supporting our communities by connecting them to the lending, investing, and giving they need to remain vibrant and vital. ...
Posted
10 days ago

INNOVATIQ SOLUTIONS PTE. LTD.

SGD9,000 Per Month

Singapore

  • Establish and enforce development standards, best practices, and governance.
  • Evaluate and recommend emerging technologies and technical solutions.
  • Manage end-to-end project delivery, including planning, execution, monitoring, and closure. ...
Posted
11 days ago