300+ Site Reliability Engineering Jobs in Malaysia | Job Vacancies | June 2026 | Ricebowl

Showing 359 job results for "site reliability engineering"

Don't miss any new Site Reliability Engineering job opportunities

Undisclosed

KL City

  • Manage cloud infrastructure provisioning and configuration using IaC tooling (Terraform, Helm), supporting both AWS/Azure cloud deployments and on-premises customer environments.
  • Implement and maintain CI/CD pipelines for GFS solutions (Jenkins, etc.)
  • Work with Engineering teams to ensure security and compliance readiness for Managed services — including PCI DSS, ISO 27001, SOC 1/2/3, PDPA/GDPR — in close coordination with InfoSec teams. ...
Posted
9 days ago
Undisclosed

KL City

  • Collaboration: Partner with Python development squads to ensure new features are designed with reliability in mind; conduct code reviews for reliability-critical paths; participate in Agile ceremonies.
  • Incident Management: Conduct root cause analysis for incidents and implement corrective actions to prevent recurrence; participate in on-call rotations for critical systems; maintain runbooks in version-controlled Python projects.
  • Continuous Improvement: Drive initiatives to improve system performance, reliability, and scalability through Python best practices, including profiling, benchmarking, and dependency management. ...
Posted
9 days ago
Undisclosed

Singapore

  • Experience with Monitoring Tools: Dynatrace, Splunk
  • Working knowledge of Java (1.8+)
  • Strong expertise in SQL and database troubleshooting (query optimization, performance tuning, and data analysis for incident resolution) ...
Posted
18 days ago
Undisclosed

Singapore

  • Support cloud security operations, including cloud security alert management and compliance auditing.
  • 3+ years of DevOps or SRE experience; experience with AIOps or observability platform development is a plus.
  • Proficient in Python; familiar with at least one of Go or Java. Full-stack capability (React/Vue frontend + backend API) is a plus. ...
Posted
10 days ago
Undisclosed

KL City

  • Collaboration: Partner with Python development squads to ensure new features are designed with reliability in mind; conduct code reviews for reliability-critical paths; participate in Agile ceremonies.
  • Incident Management: Conduct root cause analysis for incidents and implement corrective actions to prevent recurrence; participate in on-call rotations for critical systems; maintain runbooks in version-controlled Python projects.
  • Continuous Improvement: Drive initiatives to improve system performance, reliability, and scalability through Python best practices, including profiling, benchmarking, and dependency management. ...
Posted
10 days ago
Undisclosed

Singapore

  • Systems monitoring
  • Automation and Infrastructure-as-Code
  • Plan and complete systems administration tasks on Linux and Windows systems such as application tuning, configuration management, security hardening and resource management (processors, memory, storage, networking) ...
Posted
18 days ago
Undisclosed

KL City

  • Experience with CI/CD development and deployment tools such as Maven, Jenkins, Nexus, Git, and Docker.
  • Proficiency in Linux OS
  • Proficiency in scripting and automation (e.g. Python, PowerShell, YAML) with the ability to develop tools and infrastructure as code (Preferably Ansible, Terraform, Kubernetes, OpenShift). ...
Posted
11 days ago
Undisclosed

KL City

  • Collaboration at its Best: Work closely with product teams, stakeholders, and global support. Immerse in and contribute to a rich tapestry of insights and expertise.
  • Mentorship and Growth: Guide budding engineers and share best practices, fostering a collective ascent.
  • Tech Evaluation: Regularly scrutinize platforms and apps, suggesting improvements rooted in data and hands-on experience ...
Posted
11 days ago
Undisclosed

Singapore

  • Engineering-driven culture with strong investment in cloud infrastructure, stability, and platform scalability
  • Responsibilities:
  • Ensure system reliability, scalability, and production stability across core business services ...
Posted
11 days ago
Undisclosed

Singapore

  • Improve team processes to meet business needs efficiently
  • Review services, assess implementations, and recommend improvements
  • Develop AI-based solutions to boost reliability, efficiency and productivity ...
Posted
12 days ago
SGD4,500 - SGD4,800 per month

Singapore

  • Guarantee the solution functionality and stability according to customer requirements.
  • Ensure compliance with Thales deployment rules and apply best practices.
  • Assist the development and validation team during the project delivery. ...
Posted
19 days ago
SGD4,500 - SGD4,800 per month

Singapore

  • Guarantee the solution functionality and stability according to customer requirements.
  • Ensure compliance with Thales deployment rules and apply best practices.
  • Assist the development and validation team during the project delivery. ...
Posted
19 days ago
SGD13,000 per month

Singapore

  • Own team-level SLOs, runbooks, and DevOps performance metrics.
  • Collaborate with central DevOps and security teams to ensure compliance and resilience.
  • 6+ years in an SRE, DevOps, or infrastructure-focused engineering role. ...
Posted
14 days ago

Tata Consultancy Services Limited

SGD250 - SGD5,000 per month

Singapore

  • As part of Cloud Engineering Team, the SRE Engineer engages in and improves the full lifecycle of cloud platform solutions from design, deployment, operation and refinement with accuracy and in compliance with organization policies and security requirements.
  • The SRE Engineer treats operations as a software problem and therefore will code to automate repetitive tasks and optimize cloud operations.
  • Support services before go-live through activities like system design consulting, developing software platforms and launch reviews. Maintain post-live cloud operations by measuring and monitoring availability, latency and overall system health, and by taking prompt remedial action. ...
Posted
20 days ago
Undisclosed
WFH

Malaysia

  • Design and operate CloudBlue’s observability stack across metrics, logs, and traces using tools such as Datadog, Grafana, and Elastic Stack
  • Develop actionable alerting strategies and dashboards that provide clear insight into platform and business health
  • Design and maintain high-availability architectures, implementing redundancy, failover, and disaster recovery strategies across regions and availability zones ...
Posted
21 days ago
Undisclosed

KL City

  • Analyze production issues, identify root causes, and implement long-term reliability improvements through automation, monitoring, and architectural enhancements.
  • Work collaboratively with other team members and provide guidance to more junior team members.
  • Organize an efficient handover through high-quality documentation and training. ...
Posted
15 days ago

Tata Consultancy Services Limited

SGD140 - SGD144,001 per month

Singapore

  • As part of Cloud Engineering Team, the SRE Engineer engages in and improves the full lifecycle of cloud platform solutions from design, deployment, operation and refinement with accuracy and in compliance with organization policies and security requirements.
  • The SRE Engineer treats operations as a software problem and therefore will code to automate repetitive tasks and optimize cloud operations.
  • Support services before go-live through activities like system design consulting, developing software platforms and launch reviews. Maintain post-live cloud operations by measuring and monitoring availability, latency and overall system health, and by taking prompt remedial action. ...
Posted
21 days ago
Undisclosed

Singapore

  • Monitoring & Observability Tools
  • Experience with Dynatrace, Splunk, alerting, dashboards
  • Java & Application Troubleshooting ...
Posted
19 hours ago
Undisclosed

KL City

  • Build and enhance our observability platform, enabling real-time monitoring of our golden signals (uptime, latency, saturation, error rate)
  • Develop automation solutions for incident response, disaster recovery, and business continuity
  • Drive our DevSecOps platform to enable safe, rapid deployments through CI/CD, GitOps, and self-service capabilities ...
Posted
15 days ago
Undisclosed

KL City

  • Strong experience in site reliability engineering, infrastructure engineering or a similar role.
  • Strong knowledge of networking and protocols, network security, and cloud networking
  • Proven track record of cloud cost optimisation ...
Posted
2 days ago
SGD10,000 per month

Singapore

  • Participate in architecture reviews and provide reliability-focused recommendations for high-concurrency, low-latency distributed systems.
  • Develop and maintain CI/CD pipelines to improve engineering productivity and deployment quality.
  • Lead capacity planning, performance tuning, disaster recovery planning, and resilience engineering initiatives. ...
Posted
a day ago
Undisclosed

KL City

  • Tool Utilization: Use CI/CD pipelines, monitoring systems, and analytics to streamline workflows.
  • Bachelor’s/Master’s in Computer Science or related field.
  • 5+ years in cloud operations, SRE, or platform engineering. ...
Posted
2 days ago
Undisclosed

Singapore

Posted
7 days ago
Undisclosed

Singapore

  • Implement and enforce operational best practices: observability, logging, metrics, alerting, capacity planning, failover strategies, and backups.
  • Collaborate with Engineering, Product, Compliance, and Operations teams to ensure infrastructure meets reliability, compliance, and security standards.
  • Support service scaling, database operations, cloud infrastructure (GCP preferred), networking, and microservices orchestration. ...
Posted
16 days ago
Undisclosed

KL City

  • Implement and enforce operational best practices: observability, logging, metrics, alerting, capacity planning, failover strategies, and backups.
  • Collaborate with Engineering, Product, Compliance, and Operations teams to ensure infrastructure meets reliability, compliance, and security standards.
  • Support service scaling, database operations, cloud infrastructure (GCP preferred), networking, and microservices orchestration. ...
Posted
16 days ago
MYR8,000 - MYR9,000 per month

KL City

  • Soft Skills: Strategic thinking, exceptional communication, and the ability to collaborate effectively with cross-functional teams in a fast-paced environment.
  • Coding: Proficient in at least one high-level programming language (e.g., Python, Go, C++, or Java) and shell scripting. Strong understanding of data structures and algorithms.
  • Systems: Strong understanding of Linux operating systems and open-source technologies and a solid understanding of network architecture. ...
Posted
16 days ago
MYR8,000 - MYR9,000 per month

KL City

  • Soft Skills: Strategic thinking, exceptional communication, and the ability to collaborate effectively with cross-functional teams in a fast-paced environment.
  • Coding: Proficient in at least one high-level programming language (e.g., Python, Go, C++, or Java) and shell scripting. Strong understanding of data structures and algorithms.
  • Systems: Strong understanding of Linux operating systems and open-source technologies and a solid understanding of network architecture. ...
Posted
16 days ago
Undisclosed

Singapore

  • Implement and enforce operational best practices: observability, logging, metrics, alerting, capacity planning, failover strategies, and backups.
  • Collaborate with Engineering, Product, Compliance, and Operations teams to ensure infrastructure meets reliability, compliance, and security standards.
  • Support service scaling, database operations, cloud infrastructure (GCP preferred), networking, and microservices orchestration. ...
Posted
17 days ago
Undisclosed

KL City

  • Implement and enforce operational best practices: observability, logging, metrics, alerting, capacity planning, failover strategies, and backups.
  • Collaborate with Engineering, Product, Compliance, and Operations teams to ensure infrastructure meets reliability, compliance, and security standards.
  • Support service scaling, database operations, cloud infrastructure (GCP preferred), networking, and microservices orchestration. ...
Posted
17 days ago