Professor Benyou Wang Selected for Tencent Rhino-bird Programme

April 30, 2024 News

Assistant Professor Benyou Wang from the School of Data Science at The Chinese University of Hong Kong, Shenzhen, has been selected for the 2024 Tencent AI Lab Rhino-bird Special Research Programme. His project, "Evaluation of Fine-Grained Open-Ended Generation and Intermediate Path Verification in Model Generation Planning," was chosen from 177 applications submitted by scholars from 75 universities. Nearly 90% of the applicants were from top-tier institutions, and the acceptance rate was approximately 11%, making the selection highly competitive. The selected scholars specialise in natural language processing and large model research, and include experts from prestigious universities such as Peking University, Tsinghua University, Fudan University, Zhejiang University, and The Hong Kong University of Science and Technology.

Professor Wang’s research focuses on natural language processing, applied machine learning, and information retrieval. His team has developed several influential open-source large language models, including Phoenix, a multilingual model designed to support intelligent and efficient learning and research within the university community; HuaTuo GPT, a medical domain model that has passed the pharmacist qualification exam and is used in hospitals; and AceGPT, an Arabic language model.

Programme Introduction

Tencent AI Lab, a corporate-level AI laboratory of Tencent, emphasises both research and application. The lab is dedicated to fundamental technology development and breakthroughs, continuously advancing new AI technologies, promoting industrial innovation, and exploring new paradigms for AI-empowered scientific discovery.

The Tencent AI Lab Rhino-bird Special Research Programme is committed to original and leading-edge technological breakthroughs. The 2024 programme focuses on five major research themes: machine learning, computer vision and graphics, natural language processing, speech technology, and robotics, with plans to fund 18-21 projects.

Evaluation of Fine-Grained Open-Ended Generation and Intermediate Path Verification in Model Generation Planning

Evaluating open-ended generation is a significant challenge in large model research. Because human evaluation is costly and subjective, the model community has increasingly turned to advanced large models for assessment. The effectiveness of this method is debated, however: critics argue that evaluation models may lack discriminative power in specific domains. To address this, Professor Wang and his team have developed detailed evaluation criteria for specific tasks or prompts, establishing standards for each sample to guide advanced models in scoring generated content. This approach [1] has been successfully applied in recent multimodal large model research and received a meta-score of four out of five in the recent ACL Rolling Review.
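The idea of per-sample criteria can be illustrated with a minimal sketch. This is not the MLLM-Bench implementation: the `Sample` class, the prompt wording, the 1 to 5 scale, and the `judge` callable (a stand-in for any advanced model that maps a prompt string to a reply string) are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Sample:
    """One evaluation item with criteria written specifically for it (hypothetical structure)."""
    prompt: str
    criteria: List[str]


def build_judge_prompt(sample: Sample, answer: str) -> str:
    """Assemble a judge prompt that lists this sample's own criteria."""
    lines = [
        "Rate the answer from 1 to 5 against the criteria below.",
        f"Question: {sample.prompt}",
        "Criteria:",
    ]
    lines += [f"- {c}" for c in sample.criteria]
    lines += [f"Answer: {answer}", "Reply in the form 'Score: <n>'."]
    return "\n".join(lines)


def parse_score(reply: str) -> Optional[int]:
    """Extract a 1-5 score from the judge's reply, or None if it is missing."""
    match = re.search(r"Score:\s*([1-5])\b", reply)
    return int(match.group(1)) if match else None


def score_answer(sample: Sample, answer: str,
                 judge: Callable[[str], str]) -> Optional[int]:
    """Ask the judge model to score one answer under the sample's criteria."""
    return parse_score(judge(build_judge_prompt(sample, answer)))
```

In this sketch the criteria travel with each sample rather than being shared across a whole benchmark, so the judge is told what a good answer to that particular prompt should contain.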

Moreover, these evaluation standards can also score the generation process itself. Generation is formalised as a sequential decision process in which each step samples a token until a complete sequence is formed, which allows additional validators to be introduced for real-time scoring. For instance, in mathematical reasoning tasks, a value network can re-rank the generation space and prune low-value paths during decoding. This strategy has proven effective in mathematical reasoning: Professor Wang's OVM model increased the accuracy of a 7B model on the popular mathematical reasoning dataset GSM8K from 0.7 to 0.8. Recent work based on OVM has further improved the 7B model's mathematical reasoning accuracy to approximately 0.95.

OVM has significantly enhanced the mathematical reasoning capabilities of mainstream large models, and the paper has been accepted to Findings of NAACL 2024 [2].
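Decoding guided by a value model can be sketched as a beam search in which a validator re-ranks partial paths and discards the low-value ones at every step. The code below is a generic illustration under stated assumptions, not the OVM implementation: `propose` (candidate next steps for a partial path), `value` (the validator's score for a partial path), and `is_complete` are hypothetical callables supplied by the caller.

```python
from typing import Callable, List, Sequence, Tuple

Path = Tuple[str, ...]  # a partial generation, one string per step


def value_guided_search(
    propose: Callable[[Path], Sequence[str]],
    value: Callable[[Path], float],
    is_complete: Callable[[Path], bool],
    beam_size: int,
    max_steps: int,
) -> Path:
    """Beam search that keeps only the partial paths the value function ranks highest."""
    beam: List[Path] = [()]
    for _ in range(max_steps):
        candidates: List[Path] = []
        for path in beam:
            if is_complete(path):
                candidates.append(path)  # finished paths are carried forward unchanged
            else:
                candidates.extend(path + (step,) for step in propose(path))
        if not candidates:
            break
        # Re-rank all candidates by value and prune everything outside the beam.
        candidates.sort(key=value, reverse=True)
        beam = candidates[:beam_size]
        if all(is_complete(p) for p in beam):
            break
    return max(beam, key=value)
```

The pruning step is where the validator acts during decoding: a path that scores poorly early on is dropped before any more generation budget is spent on it.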

Professor Wang's Rhino-bird project aims to extend fine-grained evaluation criteria like those in [1] to more general scenarios, and to improve the quality of language model generation by using additional validators, as in [2].

[1] Wentao Ge, Shunian Chen, Guiming Chen, Junying Chen, Zhihong Chen, Shuo Yan, Chenghao Zhu, Ziyue Lin, Wenya Xie, Xidong Wang, Anningzhe Gao, Zhiyi Zhang, Jianquan Li, Xiang Wan, Benyou Wang. MLLM-Bench: Evaluating Multimodal LLMs with Per-sample Criteria.
https://arxiv.org/abs/2311.13951

[2] Fei Yu, Anningzhe Gao, Benyou Wang. OVM, Outcome-supervised Value Models for Planning in Mathematical Reasoning. Findings of NAACL 2024.
https://arxiv.org/abs/2311.09724

Professor Introduction

WANG, Benyou
Assistant Professor
Ph.D. Information Engineering (Information Science and Technology), University of Padua

Research Field
Natural Language Processing, Information Retrieval, Applied Machine Learning, Quantum Machine Learning

Biography
Professor Benyou Wang received his Ph.D. degree from the University of Padua, Italy, funded by the Marie Curie Fellowship. He obtained a master's degree from Tianjin University. During his research career, he has visited the University of Copenhagen (Denmark), the University of Montreal (Canada), the University of Amsterdam (the Netherlands), Huawei Noah's Ark Laboratory (Shenzhen, China), the Institute of Theoretical Physics (Chinese Academy of Sciences, Beijing, China), and the Language Institute (Chinese Academy of Social Sciences, Beijing, China). He is committed to building explainable, robust, and efficient natural language processing approaches that are both technically sound and linguistically motivated. So far, he and his collaborators have won the Best Paper Nomination Award at SIGIR 2017 (the top conference in information retrieval) and Best Explainable NLP Paper at NAACL 2019 (a top conference on natural language processing). He has published many articles in top international conferences, including NeurIPS, ICLR, SIGIR, WWW, and NAACL, and in top international journals such as ACM Transactions on Information Systems (TOIS), IEEE Transactions on Cybernetics (TOC), and Theoretical Computer Science (TCS). According to Google Scholar, he had nearly 1000 citations and an h-index of 16 by the time he received his Ph.D. degree.

 
