HPC AI TECHNOLOGY PTE. LTD. is hiring a full-time Machine Learning Systems Engineer (MLSys) in Singapore, earning up to SGD 9,000 per month - Ricebowl

Machine Learning Systems Engineer (MLSys)

HPC AI TECHNOLOGY PTE. LTD.

SGD6,000 - SGD9,000 Per Month

Singapore

Working Location

  • Singapore

Job Description

Responsibilities

  • System Development & Maintenance
    Contribute to the development, optimization, and maintenance of core components of the machine learning platform, including feature stores, experiment tracking systems, model registries, workflow orchestration, and serving frameworks
  • Training Efficiency Optimization
    Assist in optimizing the performance of distributed training frameworks (e.g., PyTorch DDP, DeepSpeed, FSDP) on large-scale clusters, addressing challenges such as resource scheduling and communication bottlenecks
  • Inference Performance Optimization
    Participate in model deployment and serving, including performance profiling and acceleration through model compilation (e.g., TVM, TensorRT), operator optimization, computation graph optimization, and batching strategies
  • Infrastructure Support
    Leverage technologies such as containerization (Docker), orchestration (Kubernetes), and monitoring (Prometheus/Grafana) to improve observability, reliability, and resource utilization of ML systems
  • Tooling & Developer Productivity
    Build and maintain internal tools to improve engineering efficiency, such as automated evaluation systems, stress testing tools, and debugging utilities

Qualifications

Education

  • Bachelor’s degree or above in Computer Science, Software Engineering, Electronic Engineering, or related fields

Fundamental Knowledge

  • Solid foundation in computer science fundamentals: operating systems, computer networks, data structures, and algorithms
  • Strong programming skills, with proficiency in Python; experience with Go or C++ is a strong plus
  • Basic understanding of software engineering principles, including design patterns and clean coding practices

Technical Skills

  • Familiarity with Linux development environments, including common commands and shell scripting
  • Experience with at least one mainstream deep learning framework (preferably PyTorch), with curiosity about its underlying mechanisms
  • Basic hands-on experience with containerization (Docker), CI/CD pipelines, and version control (Git)

Soft Skills

  • Strong passion for engineering and building high-performance, highly available systems
  • Excellent problem-solving and debugging skills, with a mindset for optimization
  • Good communication and teamwork skills, with the ability to collaborate effectively with cross-functional teams
  • Strong curiosity and willingness to deeply understand machine learning algorithms and their integration with systems engineering

Preferred Qualifications (Nice to Have)

  • Familiarity with Kubernetes and cloud-native technologies
  • Experience with model serving frameworks such as Triton, TensorFlow Serving, or TorchServe
  • Understanding of compiler fundamentals (e.g., LLVM), high-performance computing (HPC), or hardware acceleration (GPU/ASIC)
  • Contributions to open-source projects or relevant system/infrastructure projects on GitHub
  • Experience with large-scale data processing (e.g., Spark, Flink) or storage systems
