Full-time AI Inference - Compression Engineer job, salary, now hiring at Beijing Foreign Enterprise Management Consultants Co.,Ltd. - Ricebowl

AI Inference - Compression Engineer

Beijing Foreign Enterprise Management Consultants Co.,Ltd.

Undisclosed

Full-time

Singapore

Job Location

Singapore

Job Description

Job Responsibilities

On behalf of Huawei, a world-renowned information and communication technology company, we are seeking passionate and talented individuals to join our team as an AI Inference & Compression Engineer.

Key Responsibilities

LLM Inference Acceleration. Research and develop advanced compression algorithms to accelerate LLM serving. Focus on KV cache optimization, model quantization, and resolving memory bandwidth bottlenecks during autoregressive decoding.
Classical Codec Development. Design and implement advanced video compression algorithms, focusing on improving Rate–Distortion (RD) performance, optimizing entropy coding, and enhancing quantization design for real-world applications.
AI-Based Media Coding. Develop and optimize AI-based video coding components, including AI-based loop filters, optical flow, and intelligent rate control.
Model Deployment & Fusion. Bridge the gap between AI research and production. Optimize deep learning models for efficient inference and ensure seamless integration of compression algorithms into deployment frameworks (e.g., vLLM).
Performance & Quality Evaluation. Conduct rigorous visual quality assessments for video systems, both objective (e.g., PSNR and VMAF) and subjective, as well as perplexity, zero-shot benchmark, latency, and throughput analysis for LLM systems.

Required Qualifications

Master’s or PhD in Computer Science, Electronic Engineering, Mathematics, or related fields (PhD preferred).
Solid understanding of video coding fundamentals, including prediction, transform coding, quantization, and entropy coding, with hands-on experience in standards such as H.265/HEVC, AV1, or H.266/VVC.
Strong understanding of Transformer architectures and attention mechanisms, as well as key performance bottlenecks in generative AI inference, particularly memory bandwidth constraints (“memory wall”).
Strong proficiency in Python and C/C++. Hands-on experience building, training, and modifying models with frameworks such as PyTorch or TensorFlow.

Preferred Qualifications

ISP Knowledge. Familiarity with the Image Signal Processing (ISP) pipeline, including stages such as demosaicing, denoising, and tone mapping.
Image Processing. Experience in computer vision-based image enhancement (e.g., de-blurring, artifact removal, or HDR).
Hardware Optimization. Knowledge of SIMD, CUDA, or other hardware acceleration techniques for video and tensor processing.
