Job description
Huawei Canada has an immediate permanent opening for a Principal Scientist.
About the team:
The Computing Data Application Acceleration Lab aims to build a world-leading data analytics platform, organized into three specialized teams that apply innovative programming technologies. The team focuses on full-stack innovation, including software-hardware co-design and data-efficiency optimization at both the storage and runtime layers. It also develops next-generation GPU architectures for gaming, cloud rendering, VR/AR, and Metaverse applications.
One of the lab's goals is to enhance algorithm performance and training efficiency across industries, fostering long-term competitiveness.
About the job:
- Build an accurate and universal AI performance model based on mainstream AI acceleration technologies to support theoretical analysis.
- Track emerging hardware designs in the industry, conduct in-depth surveys and analysis, and identify the direction of key cutting-edge technologies.
- Cooperate with our AI research team to identify key performance bottlenecks in future AI workloads, and define key algorithm-hardware co-design features of our next-generation chips, targeting low cost, high throughput, strong scalability, and stability.
- Model the performance of representative AI workloads running state-of-the-art training and inference algorithms on different hardware specifications, providing quantitative analysis of compute, memory, IO, and interconnect.
- Lead the team to achieve breakthroughs in acceleration algorithms that deliver the best tradeoff between model quality and compute efficiency.
- Track emerging algorithm-hardware co-design technologies in the industry, conduct in-depth surveys and analysis, and develop a deep understanding of the main directions and trends in cutting-edge algorithm-hardware co-design.
Job requirements
About the ideal candidate:
- Master's or Doctoral degree in Computer Science or Electronic Engineering.
- 5+ years of experience in low-level computing algorithm development; experience in AI accelerator, large-scale parallel computing, or high-performance computing system design is an asset.
- Deep understanding of the basic principles and workload characteristics of large language models and multimodal models, of the popular AI software stack (operators, compilers, acceleration libraries, frameworks), and of mainstream large-model training and inference algorithms such as hybrid parallelism, low-precision data formats, sparsity, and P/D splitting.
- Familiarity with the microarchitecture of AI chips is a plus.