700+ Reliability Jobs - June 2026 - High Salaries

Showing results for 721 jobs matching "reliability"


Undisclosed

KL City

  • General Reliability Physics competence support to all stakeholders, like Innovation, PE/TE and Manufacturing
  • Close cooperation required with life test engineer, operation team, and BL design/application engineer for project management of reliability service to achieve BL/requestor target planning.
  • Execution according to the reliability qualification strategy, engineering risk mitigation, and reliability monitoring plan, with agreed milestone planning to deliver results on time. ...
Posted
8 days ago

Tata Consultancy Services Limited

SGD250 - SGD5,000 per month

Singapore

  • As part of Cloud Engineering Team, the SRE Engineer engages in and improves the full lifecycle of cloud platform solutions from design, deployment, operation and refinement with accuracy and in compliance with organization policies and security requirements.
  • The SRE Engineer treats operations as a software problem and therefore will code to automate repetitive tasks and optimize cloud operations.
  • Support services before go-live through activities like system design consulting, developing software platforms and launch reviews. Maintain post-live cloud operations by measuring and monitoring availability, latency and overall system health with prompt remedial actions. ...
Posted
9 days ago

VisionPower Semiconductor Manufacturing Company (VSMC)

Undisclosed

Singapore

  • Manage the execution of reliability qualification processes, ensuring continuous monitoring and timely interventions for corrective actions.
  • Apply advanced statistical tools and quality assurance methods to proactively detect and address potential reliability issues.
  • Assess reliability risks for customer products, providing strategic guidance to mitigate potential issues. ...
Posted
9 days ago
Undisclosed
WFH

Malaysia

  • Design and operate CloudBlue’s observability stack across metrics, logs, and traces using tools such as Datadog, Grafana, and Elastic Stack
  • Develop actionable alerting strategies and dashboards that provide clear insight into platform and business health
  • Design and maintain high-availability architectures, implementing redundancy, failover, and disaster recovery strategies across regions and availability zones ...
Posted
9 days ago
Undisclosed

Singapore

  • Experience with distributed systems or infrastructure environments is a plus
  • Scripting/automation exposure (Python, Bash, etc.) preferred
  • Open to candidates from finance and non-finance industries ...
Posted
9 days ago

Tata Consultancy Services Limited

SGD140 - SGD144,001 per month

Singapore

  • As part of Cloud Engineering Team, the SRE Engineer engages in and improves the full lifecycle of cloud platform solutions from design, deployment, operation and refinement with accuracy and in compliance with organization policies and security requirements.
  • The SRE Engineer treats operations as a software problem and therefore will code to automate repetitive tasks and optimize cloud operations.
  • Support services before go-live through activities like system design consulting, developing software platforms and launch reviews. Maintain post-live cloud operations by measuring and monitoring availability, latency and overall system health with prompt remedial actions. ...
Posted
10 days ago

VisionPower Semiconductor Manufacturing Company (VSMC)

Undisclosed

Singapore

  • Assist the manufacturing team in managing abnormal events by conducting reliability risk assessments and suggesting improvements to mitigate risks.
  • Bachelor’s degree or higher in Electronics, Electrical Engineering, Physics, or a related technical field.
  • Minimum 2 years of relevant experience in semiconductor process reliability engineering. ...
Posted
11 days ago
SGD7,000 - SGD8,500 per month

Singapore

  • Implement monitoring, alerting, and incident response systems
  • Ensure security compliance, vulnerability assessments, and penetration testing
  • Define and monitor SLOs/SLIs for system reliability ...
Posted
11 days ago
Undisclosed

Singapore

  • Assist and support engineers to identify opportunities to optimize processes, reduce costs, and enhance efficiency in relation to equipment and maintenance practices.
  • Assist and support engineers to ensure all maintenance activities are performed safely, adhering to company safety policies, risk assessments, and regulatory requirements.
  • Bachelor's degree in Engineering
Posted
11 days ago
SGD7,000 - SGD7,000 per month

Singapore

  • Ensure security compliance, vulnerability assessments, and penetration testing
  • Define and monitor SLOs/SLIs for system reliability
  • Improve resilience through DR planning and performance optimisation ...
Posted
12 days ago
Undisclosed

Singapore

  • Manage Predictive Maintenance (PDM) programs such as vibration analysis, infrared thermography, motor circuit analysis, ultrasonics, and related technologies.
  • Interpret predictive maintenance data outputs and prioritize corrective actions accordingly.
  • Review, monitor, and ensure compliance with engineering design requirements, standards, specifications, and local regulatory codes. ...
Posted
13 days ago
Undisclosed

Singapore

  • Site Reliability Engineer
  • We are looking for a motivated junior Site Reliability Engineer (SRE) to join our infrastructure and reliability team. In this role, you will help ensure the reliability, availability, and performance of our systems while learning best practices in automation, monitoring, and incident management. You will work closely with senior SREs and software engineers to support production systems and improve operational efficiency.
  • This role is ideal for candidates with a strong interest in cloud infrastructure, automation, and reliability engineering who are looking to grow into a full-fledged SRE. ...
Posted
13 days ago
Undisclosed

KL City

  • Execute the delay, PIREP, and component alerting and closure system.
  • Maintain appropriate records for the alerting system.
  • Optimize maintenance programs. ...
Posted
13 days ago
Undisclosed

Singapore

  • Perform proactive capacity planning to handle peak traffic, including telephony quotas and concurrent workloads.
  • Manage and optimize core AWS services including EC2, ECS/EKS, S3, Lambda, DynamoDB, and VPC networking.
  • Implement security best practices, including IAM least-privilege access, encryption (KMS), and compliance with standards such as SOC2, HIPAA, or PCI-DSS. ...
Posted
13 days ago
Undisclosed

KL City

  • Execute the delay, PIREP, and component alerting and closure system.
  • Maintain appropriate records for the alerting system.
  • Optimize maintenance programs. ...
Posted
13 days ago
Undisclosed

Singapore

  • Manage Predictive Maintenance (PDM) programs such as vibration analysis, infrared thermography, motor circuit analysis, ultrasonics, and related technologies.
  • Interpret predictive maintenance data outputs and prioritize corrective actions accordingly.
  • Review, monitor, and ensure compliance with engineering design requirements, standards, specifications, and local regulatory codes. ...
Posted
13 days ago
MYR5,000 - MYR6,000 per month

KL City

  • Maternity leave
  • Opportunities for promotion
  • Parental leave ...
Posted
15 days ago
Undisclosed

KL City

  • Collaboration: Work with development squads to ensure new features are designed with reliability in mind; participate in Agile ceremonies
  • Incident Management: Conduct root cause analysis for incidents and implement corrective actions to prevent recurrence; participate in on-call rotations for critical systems
  • Continuous Improvement: Drive initiatives to improve system performance, reliability, and scalability through best practices. ...
Posted
15 days ago
SGD7,000 - SGD9,000 per month

Singapore

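  • Handle sudden major incidents and routine faults and restore service. Analyze the root causes of incidents, then improve and optimize.
  • Develop and maintain automated operations tools to improve operational efficiency and streamline operations processes.
  • Provide 24/7 on-call technical support and 5x8 working-hours service. ...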
Posted
15 days ago

Truewatch Technology Inc. Pte Ltd

SGD7,000 - SGD10,000 per month

Singapore

  • Manage and optimize cloud infrastructure on AWS, Azure, or GCP to improve resource utilization and cost efficiency
  • Implement Infrastructure as Code (IaC) and automate deployments through CI/CD pipelines to accelerate delivery and reduce errors
  • Enhance system scalability, resilience, and operational efficiency by identifying and applying improvements ...
Posted
15 days ago
Undisclosed
  • Collaborate with cross-functional teams to integrate tribological insights into product development and optimization
  • Present findings and recommendations to diverse audiences, including management and global teams
  • Contribute to the development of new testing methods and equipment to enhance tribological analysis capabilities ...
Posted
15 days ago
Undisclosed

Singapore

  • Unplanned downtime reduction metrics
  • Private Health Insurance
  • Training & Development ...
Posted
15 days ago
Undisclosed

Singapore

  • CI/CD golden path - Codify Cloud Build pipelines and automated canary rollouts for Cloud Functions / Cloud Run.
  • Infrastructure as Code - Manage GCP resources; embed security, IAM least-privilege, and cost controls by default.
  • Performance & cost tuning - Profile hot paths (BigQuery, Firestore, Pub/Sub), and implement caching or concurrency improvements to keep user latency ...
Posted
15 days ago
Undisclosed

Singapore

  • Monitor system performance, troubleshoot issues, and ensure optimal operation
  • Partner with development teams to improve system reliability and performance at the code and architecture level
  • Develop and implement automation tools to streamline operations and reduce toil ...
Posted
15 days ago
Undisclosed
  • Review the Preventive Maintenance plan so that it stays dynamic and suits the real condition of the plant.
  • Define standards and procedures.
  • Coordinate the Maintenance Program across internal and external teams. ...
Posted
15 days ago
MYR19,000 - MYR19,000 per month

KL City

  • Conduct thorough post-mortem analyses following incidents, driving continuous improvement through root cause identification and solution implementation.
  • Collaborate with development and operations teams to establish best practices in system reliability and incident management.
  • Troubleshoot and resolve issues related to database performance, network connectivity, and deployment failures, including diagnosing problems at the underlying platform level (e.g., Kubernetes, virtual machines). ...
Posted
15 days ago
Undisclosed

Singapore

Posted
15 days ago
Undisclosed

KL City

  • Implement monitoring, alerting, SLIs, SLOs, and SLA tracking.
  • Participate in 24/7 on-call rotations and incident response activities.
  • Conduct root cause analysis and support post-mortem reviews. ...
Posted
15 days ago
Undisclosed

Singapore

  • Build and maintain production tooling that supports deployment, orchestration, monitoring, and system diagnostics
  • Define and maintain observability, SLI/SLOs, and performance metrics in partnership with product owners
  • Leverage metrics and capacity planning to ensure scalability and uptime ...
Posted
16 days ago
Undisclosed
WFH

Singapore

  • Apply SRE principles to Customer Success - enabling customers and team members to monitor and proactively assist our most important customers.
  • Detect issues commonly occurring in the platform, either underlying or immediate, and work with teams to ensure their priority is recognised.
  • Proactively find improvements in the platform and methods of implementation that can unblock them ...
Posted
16 days ago