- Islandwide (Singapore) Singapore

Work Location
Job Description
Responsibilities
· Hardware Integration & Standardized Delivery: Responsible for the rack installation and physical connection of GPU servers, liquid-cooled cabinets, and network equipment. Strictly adhere to hardware installation standards to ensure operational safety and compliance.
· Hardware Diagnostics & Maintenance: Troubleshoot and maintain core server components (motherboard, CPU, GPU, memory, HDD, PSU). Monitor hardware health status to promptly identify and mitigate potential risks.
· Liquid Cooling System Support: Collaborate with Facility Management engineers on liquid cooling system inspections and fault handling. Respond rapidly to emergencies such as leaks or blockages.
· System-Level Troubleshooting: Conduct in-depth fault diagnosis on Linux/Unix systems, produce professional hardware failure analysis reports, and provide improvement recommendations.
· Skills: Proficient in Linux/Unix systems and Shell/Python scripting, with the ability to independently troubleshoot system-level issues.
· Experience: 3+ years of experience in data center hardware installation, testing, and maintenance. Experience with high-performance GPU servers (e.g., NVIDIA A100/H100/B300) or HPC cluster maintenance is preferred.
· Certifications: NVIDIA Certified Engineer (NCE) hardware certification is preferred.
· Liquid Cooling: Familiarity with liquid cooling principles; hands-on experience in disassembly, leak detection, and fluid replenishment for liquid-cooled servers is a plus.
· Project Background: Prior participation in the construction and delivery of large-scale AI Computing Centers (AICC) or Supercomputing Centers is preferred.