Distinguished Engineer - AI Computing System

    • Vancouver, British Columbia
    • Markham, Ontario
    +1 more
  • 4u4mf

Job description

Huawei Canada has an immediate permanent opening for a Distinguished Engineer - AI Computing System.

About the Job:

  • As a leading industry expert in training cluster software frameworks and technologies, track the evolution of large model AI training frameworks and their key features. Plan AI frameworks and software features for scenarios such as large model pre-training, post-training, and integrated training and inference, building key capabilities for the company's training cluster software framework.

  • In the company's large model training optimization work, lead the team in building key technologies such as low-precision training, parallel strategy tuning, and training resource optimization, and drive the commercial deployment of technologies related to large model perception optimization.

  • For the company's training servers, super nodes, and other products, lead the team in building large model AI training frameworks, operator libraries, acceleration libraries, and other software frameworks and acceleration features, fully leveraging systems engineering and software-hardware collaboration capabilities to improve AI cluster computing efficiency.

  • Identify high-quality academic resources in large model training, collaborate with domain experts and scholars on projects, plan related standards and patents, support the company's continuous innovation in the training cluster field, and build long-term competitiveness in AI training clusters.

  • Cultivate a team of technical experts and core technical staff in AI training cluster frameworks and software optimization.

The base salary for this position ranges from $172,000 to $230,000 depending on education, experience and demonstrated expertise.

Job requirements

About the ideal candidate:

  • Degree in artificial intelligence, computer science, software, automation, physics, mathematics, electronics, microelectronics, information technology, or a related field, with more than 5 years of R&D experience in large model training and optimization.

  • Proficient in the architectures of common large models such as DeepSeek and Llama, with deep technical expertise in large model training and inference optimization in areas such as LLM, MoE, and multimodal learning.

  • Familiar with the hardware architecture and programming systems of AI accelerators such as GPUs and NPUs, with experience in optimizing AI systems through software, hardware, and chip collaboration.

  • Familiar with cluster computing and cloud computing, with experience in software architecture design for cluster scheduling.

  • Enjoys research, with strong learning ability, good communication skills, and the ability to work well in a team.
