Reliability Engineer Jobs in Malaysia. Job Vacancies April 2024

Working closely with trading teams, risk management, business management and compliance to understand their needs. Acting as the main coordinator to make sure all requirements are met and an appropriate solution is set up in production accordingly.
Solving problems by providing level one and level two support for our production systems.
Contributing to the design and implementation of the support system to enhance reliability and self-correction. ...

An understanding of infrastructure elements and deployment architecture is required.
A strong programming background and ability to deliver successful outcomes consistently on infrastructure automation solutions are essential in this role.
The candidate should have a good understanding of interfacing applications with vRA / vRO APIs. ...

Develop and implement automation tools and processes to streamline operational tasks and enhance system reliability.
Collaborate closely with developers and infrastructure teams to optimize system architecture and improve deployment processes.
Lead initiatives to continuously improve system reliability, scalability, and performance. ...

Java Experience (2+ years) or equivalent level of coding knowledge
Python/Shell Scripting (2+ years) or data analysis experience with Python
Bachelor’s degree in computer science or equivalent experience ...

Competitive salary range (MYR6,000-MYR15,000 per month, depending on experience).
Project Incentive (upon Company Declaration based on project performance)
Transport, Meal, Birthday Allowances (upon Confirmation) ...

Analyze predictive maintenance data and initiate follow-on corrective work
Develop, review data, and report out on key performance indicators
Work with Operations and Maintenance personnel to troubleshoot electrical equipment on an as-needed basis ...

Planning and coordination of shutdown maintenance activities for rotating equipment based on local and global PM requirements, vendor recommendations, and industry best practices.
Review risk analysis and maintenance procedures on a regular basis.
Keep maintenance records for equipment modifications, repair and rerating up to date on a regular basis. ...

Link dev and ops by applying a software engineering mindset and instilling an Agile approach.
Maintain and improve the resiliency of core applications and infrastructure platforms through a continuous improvement backlog.
Possess a modern approach aligned to things such as Infrastructure as Code, Configuration as Code, and DevOps. ...

Managing the widely-deployed Order Management Systems and Market Data Delivery Systems involving every major electronic exchange and asset class.
Working closely with trading teams, risk management, business management and compliance to understand their needs. Acting as the main coordinator to make sure all requirements are met and an appropriate solution is set up in production accordingly.
Solving problems by providing level one and level two support for our production systems. ...

A strong programming background and ability to deliver successful outcomes consistently on infrastructure automation solutions are essential in this role.
The candidate should have a good understanding of interfacing applications with vRA / vRO APIs. ...

For our teams, we create an environment with opportunities for our people to succeed, backed by the culture and support to ensure they are enabled to truly own their careers. We are motivated individuals who tackle unique technical challenges at scale and solve them as a team. Together, we deliver innovative and ethical solutions that help businesses achieve their ambitions faster.
Site Reliability Engineer
We provide our merchants a single platform, capable of meeting the rapidly evolving needs of today's fast-growing global businesses. To meet the high expectations of our merchants, Adyen has adopted and embedded principles from the Site Reliability Engineering discipline, offering an environment whereby data-driven decisions, intellectual curiosity, problem solving and openness are key drivers for success. ...

A Day in the Life of a Lead / Senior Site Reliability Engineer:
In this role, you will play a key part in maintaining our cloud platform, which includes an assortment of Kubernetes, Microservices, MongoDB, RabbitMQ, MySQL, Windows Server VM Infrastructure, Orchestration Engines, CI/CD and Monitoring platforms. Your day will consist of:
Executing projects that roll out new platform maintenance features, automate tasks, or make other big-picture changes ...

Position Overview
As a Site Reliability Engineer (SRE) based in Singapore, you will play a critical role supporting our Blockdaemon team by ensuring the reliability, scalability, and performance of our systems and services. You will collaborate closely with cross-functional teams to design, implement, and maintain robust and resilient infrastructure solutions. The ideal candidate is passionate about automation, possesses strong analytical skills, and thrives in a fast-paced, dynamic environment.
Your Impact ...

Define metrics to evaluate system performance and runtime, improving observability. Plan system capacities to accommodate business growth and promotions.
Analyze production incidents to establish best practices for a highly available payment architecture.
At least 3 years of relevant work experience with large-scale systems. ...

Review architecture and software components with software engineers and architects, ensuring consistent best practices across all teams.
Own and ensure Service Level Objectives (SLOs) and Service Level Agreements (SLAs) are met, monitoring operational metrics and leading improvement plans.
Manage and audit security controls to meet enterprise requirements, collaborating with legal and compliance for risk management. ...

Enterprise products, etc.) reliability testing and design analysis.
Familiar with MTBF and derating analysis
Familiar with environmental and reliability testing such as vibration testing, ...

Develop and implement automation tools and processes to streamline operational tasks and enhance system reliability.
Collaborate closely with developers and infrastructure teams to optimize system architecture and improve deployment processes.
Lead initiatives to continuously improve system reliability, scalability, and performance. ...

Planning and coordination of shutdown maintenance activities for rotating equipment based on local and global PM requirements, vendor recommendations, and industry best practices.
Review risk analysis and maintenance procedures on a regular basis.
Keep maintenance records for equipment modifications, repair and rerating up to date on a regular basis. ...

Why Join Us
Creation is the core of TikTok's purpose. Our platform is built to help imaginations thrive. This is doubly true of the teams that make TikTok possible.
Together, we inspire creativity and bring joy - a mission we all believe in and aim towards achieving every day. ...

Creation is the core of TikTok's purpose. Our platform is built to help imaginations thrive. This is doubly true of the teams that make TikTok possible.
Together, we inspire creativity and bring joy - a mission we all believe in and aim towards achieving every day.
To us, every challenge, no matter how difficult, is an opportunity; to learn, to innovate, and to grow as one team. Status quo? Never. Courage? Always. ...

Monitor application performance, identify bottlenecks, and implement performance enhancements.
Conduct root cause analysis for recurring issues and implement preventive measures.
Collaborate with internal stakeholders to gather requirements and provide recommendations for application improvements. ...

Production Engineering is responsible for the world’s most reliable, observable, performant, and safe network ecosystem. Our customers rely on our products and systems to safely modify, troubleshoot, and release products without external impact.
Our external customers rely on us to provide seamless and predictable incident, traffic, and policy management, resulting in the fastest and safest network services in the world.
We are accountable for the overall performance of internal and external facing services, guiding our product teams to optimal configurations and maximum efficiency. From the moment that a packet enters the Cloudflare ecosystem, we know exactly what its expected purpose and behavior is and we are capable of determining and exposing anomalous behavior. ...

Define metrics to evaluate system performance and runtime, improving observability. Plan system capacities to accommodate business growth and promotions.
Analyze production incidents to establish best practices for a highly available payment architecture.
At least 3 years of relevant work experience with large-scale systems. ...

ExxonMobil Business Support Centre Malaysia Sdn. Bhd.

LUDISIA (CAYMAN) LTD. SINGAPORE BRANCH

NITYO 3P SOLUTIONS PTE. LTD.

Nicoll Curtin Group

Quest Global

NodeFlair

SNSoft Sdn Bhd

Oilandgasjobsearch.com

SNSOFT SDN BHD

GMP Group

Ambition Group Malaysia

LUDISIA (CAYMAN) LTD. SINGAPORE BRANCH

NITYO 3P SOLUTIONS PTE. LTD.

ADYEN SINGAPORE PTE. LTD.

Appspace

Softbank Investment Advisers

Morgan McKinley

NodeFlair

NodeFlair

Celestica

Nicoll Curtin

GMP Technologies

TIKTOK PTE. LTD.

NodeFlair

NodeFlair

TikTok

NodeFlair

Jobline Resources Pte Ltd

CLOUDFLARE, PTE. LTD.

Morgan McKinley
