Researcher - AI Data System

    • Markham, Ontario
  • 7u9kb

Job description

Our team has an immediate 12-month contract opening for a researcher.

About the team:

The Cloud Native Data Engine team, part of the Distributed Scheduling and Data Engine Lab, is led by technical experts with extensive industry and academic experience. The team merges software development with cutting-edge industrial research in cloud databases. Our research currently focuses on cloud-native database architecture (TaurusDB) and high-performance query and transaction processing (SQL Engine) in next-generation cloud infrastructure. The team publishes its research at leading conferences such as SIGMOD, VLDB, and ICDE, and is recognized as a key technology contributor in the industry.

About the job:

  • This unique role combines software development with cutting-edge industrial research in databases, encompassing cloud-native database architecture (TaurusDB) and high-performance query and transaction processing (GaussDB SQL Engine) within next-generation cloud infrastructure.

  • Design, implement, and maintain database architectures for machine learning workloads, ensuring efficient data management and optimized performance.

  • Research and stay updated on emerging trends in database technology and machine learning to propose innovative solutions that improve system efficiency and capability.

  • Investigate and summarize state-of-the-art database technologies by reviewing the latest conference papers, attending workshops, and engaging with industry trends.

  • Assist in the implementation of AI-driven analytics and advanced features like vector search, similarity matching, and recommendation systems.

  • Actively pursue opportunities to invent and file patents, and to publish papers at leading academic and industrial conferences.

Job requirements

About the ideal candidate:

  • 1-3 years of programming experience in C/C++, with strong systems-level programming and debugging skills.

  • Deep understanding of cloud computing technologies, including cloud storage, distributed systems, parallel computing, and consistency protocols.

  • Experience working with machine learning frameworks (e.g., TensorFlow, PyTorch, scikit-learn) and understanding how they can be applied within database contexts.

  • Familiarity with MySQL, PostgreSQL, or other open-source databases — including knowledge of their internal mechanisms such as transaction management, storage engines, MVCC, SQL optimization, query execution, and vector execution — is considered an asset.

  • Familiarity with AI agents and practical experience in deployment, or experience integrating ML models into production databases or data pipelines, is considered an asset.

  • Experience with database extensions or ML-related plugins (e.g., pgvector for PostgreSQL), preferably on modern AI accelerators such as GPUs, NPUs, or TPUs.

  • Proven ability to conduct research and quickly learn new technologies and products.

  • A master’s or Ph.D. in Computer Science, Computer Engineering, Mathematics, or a related field is an asset.
