Senior Researcher – Voice AI

  • Markham, Ontario

Job description

Huawei Canada has an immediate, permanent opening for a Senior Researcher.

About the team:

The Human-Machine Interaction Lab brings together global talent to redefine the relationship between humans and technology. Focused on innovation and user-centered design, the lab strives to advance human-computer interaction research. Our team includes researchers, engineers, and designers collaborating across disciplines to develop novel interactive systems, sensing technologies, wearable and IoT devices, and multimodal interfaces, drawing on expertise in human factors and computer vision. Through high-impact products and cutting-edge research, we aim to enhance how users experience and interact with technology.

About the job:

  • Conduct advanced research and rapid prototyping in speech and audio AI, including speech enhancement, separation, recognition, speaker modeling, and audio-language/vision models.

  • Design, implement, and evaluate state-of-the-art deep learning architectures for speech and audio understanding.

  • Contribute to Huawei’s next-generation intelligent products, including smartphones, earbuds, wearables, and smart glasses, by developing innovative audio AI capabilities.

  • Collaborate closely with research scientists, software engineers, and product teams to translate research outcomes into deployable systems.

  • Stay current with emerging technologies in audio, multimodal, and large foundation models, and contribute to publications, patents, or product features.

  • Present research progress and findings to internal and external audiences.

Job requirements

About the ideal candidate:

  • PhD degree in Electrical Engineering, Computer Science, Speech and Audio Processing, Machine Learning, or a related field.

  • Strong background in speech/audio signal processing, including time–frequency analysis, speech enhancement, and feature extraction.

  • Hands-on experience developing and training deep learning models for speech, audio, or multimodal applications using PyTorch, TensorFlow, or JAX.

  • Experience with speech foundation models, self-supervised audio pretraining, or multimodal learning (audio-language, audio-vision).

  • Proficiency in Python and solid experience in implementing, debugging, and optimizing research code for experiments and deployment.

  • Strong ability to prototype quickly, conduct comprehensive evaluations, and iterate based on experimental results.

  • Experience deploying AI models into real-time or embedded systems for mobile or wearable devices.

  • Familiarity with datasets, benchmarks, and evaluation metrics commonly used in speech processing and audio-language tasks.

  • Proven research record, demonstrated by first-authored papers in top-tier venues (e.g., ICASSP, INTERSPEECH, NeurIPS, ICLR, ICML, ACL), patents, or released systems.

or