[Data Science Distinguished Lecture Series] A New AI Paradigm in the Pre-training Big Model Era (Jiaxing ZHANG, Chair Scientist, IDEA Research)
Topic: A New AI Paradigm in the Pre-training Big Model Era
Speaker: Jiaxing ZHANG, Chair Scientist, IDEA Research
Host: Zhizheng WU, Associate Professor, School of Data Science, CUHK-Shenzhen
Date: 13 October (Thursday), 2022
Time: 10:00 to 11:00 AM, Beijing Time
Format: Hybrid
Venue: 103 Meeting Room, Daoyuan Building
Zoom Link: https://cuhk-edu-cn.zoom.us/j/5304767369?pwd=aFErUGFSSDlLNWJld0VNNmpTL0k0UT09
Zoom Meeting ID: 5304767369
Passcode: 852648
Language: English
Abstract:
With the emergence of pre-trained big models, which are building the foundation of cognitive intelligence, the field of Natural Language Processing (NLP) is undergoing a seismic change: a new AI paradigm is coming. Big models are pre-trained on large-scale data to acquire prior knowledge and capabilities, and are then adapted to a multitude of downstream tasks. However, one component we usually overlook is the bridge that connects the underlying big models to downstream tasks such as natural language understanding (NLU), which we term the ‘phenotype’. We therefore propose UnifiedMC as a new phenotype that enables models with fewer than 1 billion parameters to outperform super-large models with 540 billion parameters in zero-shot scenarios. On the other hand, powered by the language generation capability of big models, closed-loop training is becoming practical: by trusting each other, multi-party models can converge to self-consistency through exchanging data and feedback. Interestingly, the models at equilibrium show significant performance improvements on various NLU and natural language generation (NLG) tasks. Furthermore, open-source pre-trained models are taking a prominent place in this revolutionary era of the AI ecosystem, owing to their generalization ability and the high cost of training. Finally, the dream of using AI to produce AI is coming true. Big models are changing the world and shaping the future.
Biography:
Jiaxing Zhang is the Chair Scientist leading the Cognitive Computing Research Center at IDEA Research. He obtained his PhD in Computational Quantum Physics from Peking University in 2006. He has worked as a researcher at Microsoft Research Asia (MSRA), a Senior Staff Engineer at Ant Financial, and the Chief Scientist at 360 Financial. His interests span Deep Learning, Natural Language Processing, Distributed Systems, and Physics. His research is documented in over 20 papers at top conferences and journals across many areas (NIPS, OSDI, CVPR, SIGMOD, NSDI, AAAI, WWW…) and in over 70 patent applications. At present, he is focusing on new technology for cognitive intelligence based on large-scale pre-trained models. He has released a series of foundations for Chinese natural language processing, including the “Fengshenbang” open-source big models and the GTS model production platform.


