300+ Site Reliability Engineering Jobs in Malaysia | Job Vacancies | June 2026 | Ricebowl

Showing 357 job results for "site reliability engineering"

Undisclosed

Singapore

  • Familiarity with network telemetry tools such as SolarWinds and NetScout.
  • Proficiency in packet-level debugging, including capturing traffic with tools like tcpdump and analyzing packets using Wireshark.
  • Broad understanding of end-to-end infrastructure supporting payment platforms, spanning platform services, networking, databases, and storage. ...
Posted
6 days ago
SGD7,500 - SGD8,500 Per Month

Singapore

  • Experience with Monitoring Tools: Dynatrace, Splunk
  • Working knowledge of Java (1.8+)
  • Strong expertise in SQL and database troubleshooting (query optimization, performance tuning, and data analysis for incident resolution) ...
Posted
10 days ago
SGD8,000 - SGD8,800 Per Month

Singapore

  • Java & Application Troubleshooting
  • Programming Languages
  • Strong experience in incident management and distributed systems support, excellent troubleshooting and problem-solving skills, and automation of operational tasks ...
Posted
10 days ago
SGD6,500 - SGD6,500 Per Month

Singapore

  • 7+ years strong experience in Production Support / SRE / BizOps (L2 Operations - hands-on troubleshooting, monitoring, and incident handling)
  • Hands-on expertise in Linux (commands, system operations)
  • Strong scripting skills in Shell / Python / Jython ...
Posted
16 days ago
Undisclosed

Singapore

  • Strong knowledge of HTTP, DNS, and TLS protocols, with the ability to troubleshoot at the application and transport layers.
  • Familiarity with Content Delivery Networks (CDNs) and DDoS protection services.
  • Solid Linux fundamentals, including networking, system configuration, and troubleshooting. ...
Posted
14 days ago
Undisclosed

KL City

  • Foster collaboration within technology teams by establishing efficient communication to achieve shared goals.
  • Assist in identifying risks in technology infrastructure by conducting proactive risk assessments and developing contingency plans to ensure regulatory and security compliance for technology infrastructure deliverables.
  • Execute the strategic vision for technology infrastructure improvement by delivering the roadmap of initiatives that align with company goals to ensure continuous improvement ...
Posted
12 days ago
Undisclosed

KL City

  • Preferred Skills
  • Experience with containers and container orchestration platforms such as Docker and Kubernetes.
  • Proficiency in or exposure to machine learning frameworks such as TensorFlow, PyTorch, MXNet, or PaddlePaddle. ...
Posted
a month ago
Undisclosed

KL City

  • Lead every Sev1/2 Incident, run the bridge, write RCA within 48H, enforce blameless post-mortems the same week, and ship permanent automated fixes so the same outage never happens twice.
  • Review team members' code and scripts for adherence to code quality standards to ensure high-quality software delivery.
  • Evolve product observability, including metrics (Prometheus/Tempo), logs (Loki/CloudWatch), and traces (Tempo/OpenTelemetry), and proactively provide updates on design and implementation. ...
Posted
19 days ago
Undisclosed

Singapore

  • Familiarity with network telemetry tools such as SolarWinds and NetScout.
  • Proficiency in packet-level debugging, including capturing traffic with tools like tcpdump and analyzing packets using Wireshark.
  • Broad understanding of end-to-end infrastructure supporting payment platforms, spanning platform services, networking, databases, and storage. ...
Posted
a month ago
SGD6,000 - SGD6,000 Per Month

Singapore

  • Collaborate closely with cross-functional engineering and infrastructure teams to ensure operational readiness and platform stability
  • Design and implement robust monitoring frameworks, intelligent alerting systems, and incident response processes to achieve operational excellence
  • Define and maintain Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to measure and improve system reliability ...
Posted
23 days ago
SGD15,800 - SGD15,800 Per Month

Singapore

  • About the Role:
  • Mastercard’s Program-aligned Site Reliability Engineering (SRE) teams are dedicated to delivering a seamless experience for our customers. We achieve this by maintaining every aspect of our Program infrastructure and technology ecosystem to the highest standards, ensuring compliance with rigorous security requirements.
  • Within Mastercard, SRE focuses on the reliability and performance of core infrastructure, networks, and foundational services that power our applications. Our mission is to ensure these components operate with excellence, enabling applications to deliver an outstanding customer experience. ...
Posted
a month ago
Undisclosed

Singapore

  • Familiarity with network telemetry tools such as SolarWinds and NetScout.
  • Proficiency in packet-level debugging, including capturing traffic with tools like tcpdump and analyzing packets using Wireshark.
  • Broad understanding of end-to-end infrastructure supporting payment platforms, spanning platform services, networking, databases, and storage. ...
Posted
a month ago
Undisclosed

KL City

  • Soft Skills: Strategic thinking, exceptional communication, and the ability to collaborate effectively with cross-functional teams in a fast-paced environment.
  • Coding: Proficient in at least one high-level programming language (e.g., Python, Go, C++, or Java) and shell scripting. Strong understanding of data structures and algorithms.
  • Systems: Strong understanding of Linux operating systems and open-source technologies and a solid understanding of network architecture. ...
Posted
22 days ago
Undisclosed

Singapore

  • Deliver a playbook for onboarding new tasks / activities covering both Application and Infrastructure support models
  • Identify opportunities to automate Production support activities (App & Infra) and reduce manual interventions
  • Drive application and infrastructure improvements including performance, capacity, resilience, and operational stability; eliminate toil through automation ...
Posted
12 days ago
Undisclosed

Singapore

  • Deliver a playbook for onboarding new tasks / activities covering both Application and Infrastructure support models
  • Identify opportunities to automate Production support activities (App & Infra) and reduce manual interventions
  • Drive application and infrastructure improvements including performance, capacity, resilience, and operational stability; eliminate toil through automation ...
Posted
12 days ago
Undisclosed

Singapore

  • Deliver a playbook for onboarding new tasks / activities covering both Application and Infrastructure support models
  • Identify opportunities to automate Production support activities (App & Infra) and reduce manual interventions
  • Drive application and infrastructure improvements including performance, capacity, resilience, and operational stability; eliminate toil through automation ...
Posted
12 days ago
Undisclosed

Singapore

  • Entitled to Yearly Bonus & Performance Bonus
  • Bachelor's Degree or above; a degree in computer science or a related field is preferred.
  • At least 2 years of experience in cloud services products and application operations. ...
Posted
23 days ago
SGD4,000 - SGD4,000 Per Month

Singapore

  • Experience with automation operations and container technologies (Docker, Kubernetes).
  • Familiarity with CI/CD processes and tools (e.g., Jenkins, GitLab CI).
  • Includes daily monitoring, alert response, emergency handling, on-call duties, regular system health checks, and performance optimization. ...
Posted
25 days ago
TWD40,000 - TWD40,000 Per Month

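Taiwan

  • Many opportunities to work with overseas partners, building international communication and business English skills.
  • Centered on Site Reliability Engineering, operate and maintain systems using RedHat, Windows, JBOSS, SpringBoot, MQ, AMQ, NGINX, and related technologies.
  • Responsible for project management and optimization, including consolidating user requirements and service ticket data and developing related dashboards. ...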
Posted
22 days ago
Undisclosed

Singapore

  • Build and maintain production tooling that supports deployment, orchestration, monitoring, and system diagnostics
  • Define and maintain observability, SLI/SLOs, and performance metrics in partnership with product owners
  • Leverage metrics and capacity planning to ensure scalability and uptime ...
Posted
2 days ago

Aspire Systems India Private Limited

SGD12,500 - SGD25,000 Per Month

Singapore

  • Develop and implement automation tools for system management
  • Collaborate with development teams to ensure seamless deployment and operation of applications
  • Maintain documentation of system architecture and processes ...
Posted
2 days ago

Tata Consultancy Services Limited

SGD140 - SGD144,001 Per Month

Singapore

  • As part of Cloud Engineering Team, the SRE Engineer engages in and improves the full lifecycle of cloud platform solutions from design, deployment, operation and refinement with accuracy and in compliance with organization policies and security requirements.
  • The SRE Engineer treats operations as a software problem and therefore will code to automate repetitive tasks and optimize cloud operations.
  • Support services before go-live through activities like system design consulting, developing software platforms and launch reviews. Maintain post-live cloud operations by measuring and monitoring availability, latency and overall system health, taking prompt remedial action where needed. ...
Posted
3 days ago
Undisclosed

KL City

  • Collaboration: Partner with Python development squads to ensure new features are designed with reliability in mind; conduct code reviews for reliability-critical paths; participate in Agile ceremonies.
  • Incident Management: Conduct root cause analysis for incidents and implement corrective actions to prevent recurrence; participate in on-call rotations for critical systems; maintain runbooks in version-controlled Python projects.
  • Continuous Improvement: Drive initiatives to improve system performance, reliability, and scalability through Python best practices, including profiling, benchmarking, and dependency management. ...
Posted
4 days ago

Tata Consultancy Services Limited

SGD250 - SGD5,000 Per Month

Singapore

  • As part of Cloud Engineering Team, the SRE Engineer engages in and improves the full lifecycle of cloud platform solutions from design, deployment, operation and refinement with accuracy and in compliance with organization policies and security requirements.
  • The SRE Engineer treats operations as a software problem and therefore will code to automate repetitive tasks and optimize cloud operations.
  • Support services before go-live through activities like system design consulting, developing software platforms and launch reviews. Maintain post-live cloud operations by measuring and monitoring availability, latency and overall system health, taking prompt remedial action where needed. ...
Posted
5 days ago
Undisclosed

Singapore

  • Troubleshoot priority incidents, facilitate blameless post-mortems and ensure permanent closure of incidents
  • Perform analytics on previous incidents and usage patterns to better predict issues and take proactive actions
  • Build and maintain CI/CD pipelines for the bank. ...
Posted
5 days ago
SGD4,500 - SGD4,800 Per Month

Singapore

  • Guarantee the solution functionality and stability according to customer requirements.
  • Ensure compliance with Thales deployment rules and apply best practices.
  • Assist the development and validation team during the project delivery. ...
Posted
6 days ago
SGD4,500 - SGD4,800 Per Month

Singapore

  • Guarantee the solution functionality and stability according to customer requirements.
  • Ensure compliance with Thales deployment rules and apply best practices.
  • Assist the development and validation team during the project delivery. ...
Posted
6 days ago
Undisclosed

Singapore

  • Help improve the whole lifecycle of infrastructure services from inception and design throughout development to deployment, user support, and refinement.
  • Deploy and configure solutions in the cloud.
  • Automate cloud operations, develop infrastructure automation scripts, and participate in the continuous improvement of cloud solutions. ...
Posted
12 days ago
Undisclosed

KL City

  • Effectively utilize our world-class AIOps and autonomous service governance platform to devise new ways to streamline processes and improve alert accuracy, time-series trend analysis, anomaly detection, and risk identification.
  • Support platform/service expansions, migrations to new architectures, upgrades and drill activities across different technology domains.
  • Incorporate mature chaos engineering for risk identification, IPDRR for security, and comprehensive automation frameworks to reduce ops effort to the lowest possible level, freeing time and space for the team to focus on engineering. ...
Posted
10 days ago
Undisclosed

Singapore

  • Improve deployment safety through CI/CD workflows, release controls, rollback paths, and environment consistency
  • Drive incident response and production readiness practices including runbooks, on-call hygiene, postmortems, capacity planning, and resilience testing
  • Reduce operational toil by automating repetitive work and improving internal developer tooling ...
Posted
10 days ago