Staff Researcher – Speech AI

  • Markham, Ontario
  • o3vfs

Job description

Huawei Canada has an immediate permanent opening for a Staff Researcher.

About the team:
The Human-Machine Interaction Lab brings together talent from around the world to redefine the relationship between humans and technology. With a focus on innovation and user-centered design, the lab advances human-computer interaction research. Our team includes researchers, engineers, and designers who collaborate across disciplines on novel interactive systems, sensing technologies, wearable and IoT systems, human factors, computer vision, and multimodal interfaces. Through high-impact products and cutting-edge research, we aim to improve how users experience and interact with technology.

About the job:

  • Design and implement speech foundation models and speech language models (SLMs) for a variety of applications

  • Develop algorithms for speech and acoustic signal processing, including ASR, speech enhancement, beamforming, and acoustic event detection

  • Conduct original research and prototyping using deep learning, transformers, RNNs, and other modern machine learning techniques

  • Collaborate closely with cross-functional teams to bring speech-related solutions from concept to integration into real-world systems

  • Evaluate and benchmark algorithms using both quantitative metrics (e.g., WER, PESQ, STOI) and qualitative assessments

  • Develop robust infrastructure and pipelines to support rapid experimentation and deployment of research ideas

  • Stay at the forefront of developments in speech technology, foundation models, and conversational AI, and incorporate state-of-the-art methods into research and product development

Job requirements

About the ideal candidate:

  • Ph.D. in Computer Science, Electrical Engineering, or a related field with a strong focus on speech processing and machine learning

  • Demonstrated experience with LLMs, ASR systems, and modern speech models (e.g., Whisper, wav2vec, HuBERT, Conformer)

  • Proficiency in Python and experience with at least one additional programming language (e.g., C++, Java, JavaScript)

  • Expertise in deep learning frameworks such as PyTorch, TensorFlow, or Keras

  • Familiarity with common audio/speech processing libraries (e.g., TorchAudio, Librosa, PyAudio)

  • Strong grasp of digital and statistical signal processing, including spectral and spatial filtering

  • Experience working with large-scale, noisy datasets in real-world environments

  • A solid publication record in relevant venues (e.g., NeurIPS, ICML, ICLR, Interspeech, ICASSP, AES)
