DEXIAN SINGAPORE PTE. LTD. Hiring! Full Time HPC System Engineer in Islandwide (Singapore), Earn up to SGD 9,000 - Ricebowl

HPC System Engineer

DEXIAN SINGAPORE PTE. LTD.

SGD 9,000 per month

Islandwide (Singapore)

Working Location

  • Islandwide (Singapore)

Job Description

Key Responsibilities

  • Design and develop compute cluster architectures optimized for performance, reliability, scalability, and serviceability within KLA systems.
  • Define and validate server hardware configurations, including CPUs, GPUs, memory subsystems, storage, networking, and specialized accelerators.
  • Analyze and optimize system-level performance across hardware and software layers, including CPU/GPU utilization, memory bandwidth, PCIe topology, NUMA architecture, and I/O performance.
  • Collaborate with hardware, software, firmware, and systems engineering teams to ensure seamless integration of compute clusters into broader system architectures.
  • Support server bring-up, hardware integration, diagnostics, benchmarking, stress testing, and root-cause analysis activities.
  • Manage and troubleshoot enterprise server platforms, including BIOS/firmware configuration, BMC/IPMI management, thermal and power optimization, and hardware health monitoring.
  • Participate in architecture reviews, integration planning, technical discussions, and cross-functional problem-solving sessions.
  • Create and maintain technical documentation for hardware design decisions, validation procedures, deployment standards, and troubleshooting workflows.

Required Skills & Qualifications

  • Strong experience in computer hardware and system architecture design, particularly in compute clusters, HPC environments, or enterprise server platforms.
  • Deep understanding of modern CPU and GPU architectures, including multicore processing, NUMA, PCIe, memory hierarchy, and hardware-software interactions.
  • Experience with GPU-accelerated systems and accelerator integration (e.g., NVIDIA GPU platforms, CUDA environments, or similar technologies).
  • Hands-on experience with Linux system administration and OS customization (preferably SUSE Linux Enterprise Server).
  • Familiarity with enterprise server management technologies such as BIOS/UEFI, BMC, IPMI, iDRAC, or similar remote management tools.
  • Understanding of distributed systems, high-performance networking, and cluster infrastructure technologies such as InfiniBand, RDMA, or high-speed Ethernet.
  • Experience with system performance tuning, hardware validation, benchmarking, and low-level troubleshooting.
  • Strong analytical, documentation, and communication skills.

Preferred Qualifications

  • Experience in high-performance computing (HPC), AI/ML infrastructure, or large-scale distributed compute environments.
  • Familiarity with server hardware bring-up, failure analysis, thermal/power optimization, and reliability engineering.
  • Exposure to hardware diagnostic and monitoring tools for server and cluster environments.
  • Understanding of storage architectures, parallel file systems, and distributed storage solutions.
  • Experience working in cross-functional engineering teams across hardware, firmware, and software domains.
  • Test-driven and detail-oriented engineering mindset with strong problem-solving skills.
  • Self-motivated individual with a proactive approach to continuous improvement and technical innovation.

Important Information

Never provide your bank or credit card details when applying for jobs. Do not transfer any money or complete unrelated online surveys. If you see something suspicious, report this job ad.
