Full-time AI Inference - Compression Engineer job, salary, Beijing Foreign Enterprise Management Consultants Co.,Ltd. now hiring - Ricebowl

AI Inference - Compression Engineer

Beijing Foreign Enterprise Management Consultants Co.,Ltd.

Undisclosed

Singapore

Work Location

  • Singapore

Job Description

Job Responsibilities

On behalf of Huawei, a world-renowned information and communication technology company, we are seeking passionate and talented individuals to join our team as an AI Inference & Compression Engineer.


Key Responsibilities

  • LLM Inference Acceleration. Research and develop advanced compression algorithms to accelerate LLM serving. Focus on KV cache optimization, model quantization, and resolving memory bandwidth bottlenecks during autoregressive decoding.
  • Classical Codec Development. Design and implement advanced video compression algorithms, focusing on improving Rate–Distortion (RD) performance, optimizing entropy coding, and enhancing quantization design for real-world applications.
  • AI-Based Media Coding. Develop and optimize AI-based video coding components, including AI-based loop filters, optical flow, and intelligent rate control.
  • Model Deployment & Fusion. Bridge the gap between AI research and production. Optimize deep learning models for efficient inference and ensure seamless integration of compression algorithms into deployment frameworks (e.g., vLLM).
  • Performance & Quality Evaluation. Conduct rigorous visual quality assessments for video systems, using objective metrics such as PSNR and VMAF alongside subjective evaluation, as well as perplexity, zero-shot benchmark, latency, and throughput analysis for LLM systems.


Required Qualifications

  • Master’s or PhD in Computer Science, Electronic Engineering, Mathematics, or related fields (PhD preferred).
  • Solid understanding of video coding fundamentals, including prediction, transform coding, quantization, and entropy coding, with hands-on experience in standards such as H.265/HEVC, AV1, or H.266/VVC.
  • Strong understanding of Transformer architectures and attention mechanisms, as well as key performance bottlenecks in generative AI inference, particularly memory bandwidth constraints (“memory wall”).
  • Strong proficiency in Python and C/C++. Hands-on experience building, training, and modifying models using frameworks such as PyTorch or TensorFlow.


Preferred Qualifications

  • ISP Knowledge. Familiarity with the Image Signal Processing (ISP) pipeline, including stages such as demosaicing, denoising, and tone mapping.
  • Image Processing. Experience in computer vision-based image enhancement (e.g., de-blurring, artifact removal, or HDR).
  • Hardware Optimization. Knowledge of SIMD, CUDA, or other hardware acceleration techniques for video and tensor processing.
