700+ Reliability Jobs - June 2026 - High Salaries

Showing 725 job results for "reliability"

Undisclosed
  • Manage inventory for spare parts and sample tracking list
  • Perform lab housekeeping following the 7S standard
  • Shipping and collection of delivery items from store ...
Posted
10 hours ago
SGD13,000 - SGD13,000 per month

Singapore

  • Own team-level SLOs, runbooks, and DevOps performance metrics.
  • Collaborate with central DevOps and security teams to ensure compliance and resilience.
  • 6+ years in an SRE, DevOps, or infrastructure-focused engineering role. ...
Posted
5 days ago
SGD10,000 - SGD10,000 per month

Singapore

  • Integrate automation workflows with enterprise DevOps toolchains (e.g., GitHub, Ansible, Terraform, ITSM tools).
  • Develop automated validation and rollback mechanisms for network changes.
  • Perform hands-on development, testing, and deployment of automation workflows in production environments. ...
Posted
8 hours ago
Undisclosed

Singapore

  • Manage Predictive Maintenance (PDM) programs such as vibration analysis, infrared thermography, motor circuit analysis, ultrasonics, and related technologies.
  • Interpret predictive maintenance data outputs and prioritize corrective actions accordingly.
  • Review, monitor, and ensure compliance with engineering design requirements, standards, specifications, and local regulatory codes. ...
Posted
16 days ago
Undisclosed

Singapore

  • Work closely with development teams or other internal teams to ensure that solutions are designed with customer user experience, scale/performance, security and operability in mind.
  • Support and ensure that software releases are aligned with the organization’s internal software release and deployment process.
  • Facilitate and support the troubleshooting or root cause analysis of platform issues or incidents with other internal teams. ...
Posted
6 days ago
Undisclosed

Malacca City

  • Able to work office hours (8am-5.15pm) or 12-hour rotating shifts (7am-7pm, 7pm-7am)
Posted
7 days ago
Undisclosed

KL City

  • Incident Management & Root Cause Analysis
  • Participate as a Subject Matter Advisor during production incidents and outages.
  • Provide insights backed by system monitoring, code review, and database analysis. ...
Posted
7 days ago
Undisclosed

KL City

  • Build and enhance our observability platform, enabling real-time monitoring of our golden signals (uptime, latency, saturation, error rate)
  • Develop automation solutions for incident response, disaster recovery, and business continuity
  • Drive our DevSecOps platform to enable safe, rapid deployments through CI/CD, GitOps, and self-service capabilities ...
Posted
7 days ago
Undisclosed
  • Perform installation, setup, and commissioning of reliability equipment to support new product REL testing. Carry out system verification activities prior to releasing reliability equipment for operation.
  • Be involved in the operation and safety of the laboratory and ensure all activities are carried out according to safety manual procedures that comply with standards.
  • Be involved in continuous improvement of reliability processes to support new products, reliability techniques, and applications. Work with external vendors on equipment and process application topics. ...
Posted
7 days ago
Undisclosed

KL City

  • Analyze production issues, identify root causes, and implement long-term reliability improvements through automation, monitoring, and architectural enhancements.
  • Work collaboratively with other team members and provide guidance to more junior team members.
  • Organize an efficient handover through high quality documentation and training. ...
Posted
6 days ago
Undisclosed

Singapore

  • Experience with distributed systems or infrastructure environments is a plus
  • Scripting/automation exposure (Python, Bash, etc.) preferred
  • Open to candidates from finance and non-finance industries ...
Posted
7 days ago
Undisclosed

Singapore

  • Implement and enforce operational best practices: observability, logging, metrics, alerting, capacity planning, failover strategies, and backups.
  • Collaborate with Engineering, Product, Compliance, and Operations teams to ensure infrastructure meets reliability, compliance, and security standards.
  • Support service scaling, database operations, cloud infrastructure (GCP preferred), networking, and microservices orchestration. ...
Posted
8 days ago
Undisclosed

KL City

  • Implement and enforce operational best practices: observability, logging, metrics, alerting, capacity planning, failover strategies, and backups.
  • Collaborate with Engineering, Product, Compliance, and Operations teams to ensure infrastructure meets reliability, compliance, and security standards.
  • Support service scaling, database operations, cloud infrastructure (GCP preferred), networking, and microservices orchestration. ...
Posted
8 days ago
MYR8,000 - MYR9,000 per month

KL City

  • Soft Skills: Strategic thinking, exceptional communication, and the ability to collaborate effectively with cross-functional teams in a fast-paced environment.
  • Coding: Proficient in at least one high-level programming language (e.g., Python, Go, C++, or Java) and shell scripting. Strong understanding of data structures and algorithms.
  • Systems: Strong understanding of Linux operating systems and open-source technologies and a solid understanding of network architecture. ...
Posted
8 days ago
MYR8,000 - MYR9,000 per month

KL City

  • Soft Skills: Strategic thinking, exceptional communication, and the ability to collaborate effectively with cross-functional teams in a fast-paced environment.
  • Coding: Proficient in at least one high-level programming language (e.g., Python, Go, C++, or Java) and shell scripting. Strong understanding of data structures and algorithms.
  • Systems: Strong understanding of Linux operating systems and open-source technologies and a solid understanding of network architecture. ...
Posted
8 days ago
Undisclosed

Malacca City

  • Allocate workload for the test center operation team, review backlogs, and define a catch-up plan
  • Plan the timely allocation of resources (manpower, machine capacity, materials) of the test operation team to meet team targets (cycle time, staff productivity, and homemade failure)
  • Ensure the discipline of the test operation team, including 5S, adherence to procedure, attendance, and overtime/break time. Take the necessary actions when required ...
Posted
8 days ago
Undisclosed

Singapore

  • Implement and enforce operational best practices: observability, logging, metrics, alerting, capacity planning, failover strategies, and backups.
  • Collaborate with Engineering, Product, Compliance, and Operations teams to ensure infrastructure meets reliability, compliance, and security standards.
  • Support service scaling, database operations, cloud infrastructure (GCP preferred), networking, and microservices orchestration. ...
Posted
8 days ago
Undisclosed

KL City

  • Implement and enforce operational best practices: observability, logging, metrics, alerting, capacity planning, failover strategies, and backups.
  • Collaborate with Engineering, Product, Compliance, and Operations teams to ensure infrastructure meets reliability, compliance, and security standards.
  • Support service scaling, database operations, cloud infrastructure (GCP preferred), networking, and microservices orchestration. ...
Posted
8 days ago

AMAZON ASIA-PACIFIC RESOURCES PRIVATE LIMITED

SGD16,000 - SGD16,000 per month

Singapore

  • Bachelor's degree in Electrical or Mechanical Engineering, Engineering Technology, or Reliability Engineering, or 10+ years of experience managing, analyzing, and communicating results to senior leadership
  • Experience in supply chain, commodity, and supplier management in a high volume, global sourcing and operations manufacturing environment with a global supply base of contract manufacturers
  • Knowledge of critical data center mechanical and electrical equipment ...
Posted
8 days ago
Undisclosed

Singapore

  • Design and integrate the systems underneath our services: messaging (e.g. Kafka), orchestration (e.g. Kubernetes), and performance-sensitive infrastructure.
  • Partner with product engineers on release readiness, rollout strategy, and production hardening before things ship.
  • Continuously reduce toil: measure it, attack it with code, and raise the floor on what "easy to maintain" looks like. ...
Posted
8 days ago
Undisclosed

Singapore

  • Design and integrate the systems underneath our services: messaging (e.g. Kafka), orchestration (e.g. Kubernetes), and performance-sensitive infrastructure.
  • Partner with product engineers on release readiness, rollout strategy, and production hardening before things ship.
  • Continuously reduce toil: measure it, attack it with code, and raise the floor on what "easy to maintain" looks like. ...
Posted
8 days ago
Undisclosed

Singapore

  • Knowledge of basic networking, Linux, cloud providers, and container operations (such as Ubuntu, AWS, K8S)
  • Excellent analytical and problem solving skills with a passion for challenging issues.
  • Good teamwork spirit and strong communication skills. ...
Posted
9 days ago
MYR5,000 - MYR6,000 per month

KL City

  • Maternity leave
  • Opportunities for promotion
  • Parental leave ...
Posted
18 days ago
Undisclosed

KL City

  • Collaboration: Work with development squads to ensure new features are designed with reliability in mind; participate in Agile ceremonies
  • Incident Management: Conduct root cause analysis for incidents and implement corrective actions to prevent recurrence; participate in on-call rotations for critical systems
  • Continuous Improvement: Drive initiatives to improve system performance, reliability, and scalability through best practices. ...
Posted
18 days ago
SGD7,000 - SGD9,000 per month

Singapore

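  • Handle sudden major failures and ordinary failures, and restore service. Analyze the root causes of incidents, then improve and optimize.
  • Develop and maintain automated operations tools to improve operational efficiency and optimize operations processes.
  • Provide 7 x 24 on-call technical support and 5 x 8 working-hours service. ...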
Posted
18 days ago

Truewatch Technology Inc. Pte Ltd

SGD7,000 - SGD10,000 per month

Singapore

  • Manage and optimize cloud infrastructure on AWS, Azure, or GCP to improve resource utilization and cost efficiency
  • Implement Infrastructure as Code (IaC) and automate deployments through CI/CD pipelines to accelerate delivery and reduce errors
  • Enhance system scalability, resilience, and operational efficiency by identifying and applying improvements ...
Posted
18 days ago
Undisclosed
  • Collaborate with cross-functional teams to integrate tribological insights into product development and optimization
  • Present findings and recommendations to diverse audiences, including management and global teams
  • Contribute to the development of new testing methods and equipment to enhance tribological analysis capabilities ...
Posted
18 days ago
Undisclosed

Singapore

  • CI/CD golden path - Codify Cloud Build pipelines and automated canary rollouts for Cloud Functions / Cloud Run.
  • Infrastructure as Code - Manage GCP resources; embed security, IAM least-privilege, and cost controls by default.
  • Performance & cost tuning - Profile hot paths (BigQuery, Firestore, Pub/Sub), and implement caching or concurrency improvements to keep user latency ...
Posted
18 days ago
Undisclosed

Singapore

  • Unplanned downtime reduction metrics
  • Private Health Insurance
  • Training & Development ...
Posted
18 days ago
Undisclosed

Singapore

  • Monitor system performance, troubleshoot issues, and ensure optimal operation
  • Partner with development teams to improve system reliability and performance at the code and architecture level
  • Develop and implement automation tools to streamline operations and reduce toil ...
Posted
18 days ago