Good News | 27 Papers by CUHK-Shenzhen School of Data Science Faculty and Students Accepted to NeurIPS 2024
Faculty and students of the School of Data Science (SDS) at The Chinese University of Hong Kong, Shenzhen have had 27 papers accepted by NeurIPS 2024 (the Conference on Neural Information Processing Systems), a top international conference. The papers come from 2 master's students, 13 doctoral students, 1 postdoctoral fellow, and 16 professors of the school, demonstrating the outstanding research potential and strength of its faculty and students.
The number of SDS papers accepted at NeurIPS has risen year by year: 18 in 2022, 26 in 2023, and 27 in 2024, reflecting the school's growing academic influence and research standing.
NeurIPS is a Class-A international conference in artificial intelligence as recommended by the China Computer Federation (CCF), and together with the International Conference on Machine Learning (ICML) and the International Conference on Learning Representations (ICLR) it is known as one of the "three top machine learning conferences". NeurIPS 2024 received 15,671 valid submissions, with an acceptance rate of only 25.8%.
The SDS authors at NeurIPS 2024 include:
▪️ 2 master's students: 蔡镇阳, 杨洋
▪️ 13 doctoral students: 敖君逸, 董婧, 李子牛, 欧阳屹东, 任文頔, 魏少魁, 杨林鑫, 应潮龙, 尤润泽, 于恒旭, 张雨舜, 赵鑫鉴, 朱明丽
▪️ 1 postdoctoral fellow: 金瑞楠
▪️ 16 professors: 代忠祥, 樊继聪, Cosme Louart, 李海洲, 李爽, 李彤欣, 李肖, 罗智泉, 濮实, 孙若愚, 王趵翔, 王本友, 吴保元, 武执政, 于天舒, 查宏远
About NeurIPS
The Conference on Neural Information Processing Systems (NeurIPS) is a premier international conference in machine learning and computational neuroscience. It is a CCF-recommended Class-A conference in artificial intelligence and, together with ICML and ICLR, is known as one of the "three top machine learning conferences". Its topics span deep learning, computer vision, large-scale machine learning, learning theory, optimization, sparsity, and many other subfields. NeurIPS 2024 received 15,671 valid submissions, with an acceptance rate of only 25.8%. The 38th edition of the conference will be held on December 9-15, 2024 at the Vancouver Convention Centre in Canada.
Source: NeurIPS official website, Baidu Baike
Student Author Profiles
*Listed alphabetically by name
Master's Students
蔡镇阳
Master's student, Class of 2023
MSc in Artificial Intelligence and Robotics
Supervisor: 王本友
杨洋
Master's student, Class of 2022
MSc in Data Science
Supervisor: 李爽
Doctoral Students
敖君逸
Ph.D. student, Class of 2022
Ph.D. in Computer Science
Supervisor: 李海洲
董婧
Ph.D. student, Class of 2021
Ph.D. in Data Science
Supervisor: 王趵翔
李子牛
Ph.D. student, Class of 2020
Ph.D. in Data Science
Supervisor: 罗智泉
欧阳屹东
Ph.D. student, Class of 2021
Ph.D. in Data Science
Supervisor: 查宏远
任文頔
Ph.D. student, Class of 2023
Ph.D. in Computer Science
Supervisor: 李爽
魏少魁
Ph.D. student, Class of 2020
Ph.D. in Data Science
Supervisors: 查宏远, 吴保元
杨林鑫
Ph.D. student, Class of 2022
Ph.D. in Data Science
Supervisors: 罗效东, 孙若愚, 张寅
应潮龙
Ph.D. student, Class of 2022
Ph.D. in Computer Science
Supervisor: 于天舒
尤润泽
Ph.D. student, Class of 2022
Ph.D. in Data Science
Supervisor: 濮实
于恒旭
Ph.D. student, Class of 2020
Ph.D. in Data Science
Supervisor: 李肖
张雨舜
Ph.D. student, Class of 2019
Ph.D. in Data Science
Supervisor: 罗智泉
赵鑫鉴
Ph.D. student, Class of 2023
Ph.D. in Computer Science
Supervisor: 于天舒
朱明丽
Ph.D. student, Class of 2020
Ph.D. in Data Science
Supervisor: 吴保元
Postdoctoral Fellow
金瑞楠
Postdoctoral fellow, School of Data Science
Faculty Author Profiles
*Listed alphabetically by name
代忠祥
Assistant Professor
Ph.D., National University of Singapore
Dean's Graduate Research Excellence Award and Research Achievement Award (twice), School of Computing, National University of Singapore; Senior Program Committee member, IJCAI 2021; former postdoctoral researcher at MIT
Research interests: machine learning, multi-armed bandits, Bayesian optimization, data-efficient large language models, prompt optimization for large language models
Bio: Dr. 代忠祥 is an Assistant Professor in the School of Data Science at The Chinese University of Hong Kong, Shenzhen. He was a postdoctoral researcher at MIT in 2024 and at the National University of Singapore (NUS) from 2021 to 2023. He received his B.Eng. in Electrical Engineering (First Class Honours) from NUS in 2015 and his Ph.D. in Computer Science from NUS in 2021. During his Ph.D. studies in the NUS School of Computing, he received the Dean's Graduate Research Excellence Award and multiple Research Achievement Awards.
His research spans both the theory and applications of machine learning. On the theory side, he focuses on multi-armed bandits and Bayesian optimization. On the applied side, he uses these tools to (1) solve black-box optimization problems in practice (e.g., AutoML and AI4Science) and (2) enable data-centric AI, such as data-efficient prompt optimization for large language models and data-efficient reinforcement learning from human feedback (RLHF). He has published more than 25 papers in top AI conferences and journals, including over 20 at ICML, NeurIPS, and ICLR (the three top machine learning conferences). He regularly serves as a program committee member and reviewer for top AI venues such as ICML, NeurIPS, ICLR, AAAI, and TPAMI, and served as a Senior Program Committee member for IJCAI 2021.
樊继聪
Assistant Professor
Ph.D., City University of Hong Kong
Zhang Zhongjun Academician Best Paper Award at the Chinese Process Control Conference; Outstanding Academic Performance Award, City University of Hong Kong; First Prize of the Natural Science Award of the Chinese Association of Automation; several papers selected as CVPR/AAAI orals and ICLR spotlights; Associate Editor of the SCI journal Neural Processing Letters; principal investigator of NSFC Young Scientists and General Program grants
Research interests: artificial intelligence, machine learning
Bio: 樊继聪 is an Assistant Professor in the School of Data Science at The Chinese University of Hong Kong, Shenzhen. He received his Ph.D. from the Department of Electronic Engineering, City University of Hong Kong, in 2018, and his M.S. in Control Science and Engineering (2013) and B.S. in Automation (2010) from Beijing University of Chemical Technology. Before joining CUHK-Shenzhen, he was a postdoctoral associate at Cornell University, and he previously held research positions at the University of Wisconsin-Madison and the University of Hong Kong.
His research focuses on artificial intelligence and machine learning, with extensive work on matrix/tensor methods, clustering algorithms, anomaly/outlier/fault detection, deep learning, and recommender systems. His results have appeared in well-known journals and conferences such as IEEE TSP/TNNLS/TII, KDD, NeurIPS, CVPR, ICML, ICLR, AAAI, and IJCAI. He is an IEEE Senior Member and an Associate Editor of Pattern Recognition (CAS Q1, CCF-B) and Neural Processing Letters. He leads one NSFC Young Scientists grant, one NSFC General Program grant, and one Guangdong General Program grant; he received the First Prize of the 2023 Natural Science Award of the Chinese Association of Automation and was named to the Stanford/Elsevier "World's Top 2% Scientists" lists in 2023 and 2024.
He is currently recruiting Ph.D. students, postdocs, research assistants, and visiting students; interested candidates are welcome to get in touch by email.
LOUART, Cosme
Assistant Professor
Ph.D., Université Grenoble Alpes
Former data engineer at the Beijing R&D center of EDF (Électricité de France); author of multiple papers in mathematics and machine learning venues (ICML, AISTATS, IEEE, Annals of Applied Probability)
Research interests: applied mathematics for high-dimensional data processing, machine learning
Bio: Cosme Louart is an Assistant Professor in the School of Data Science at The Chinese University of Hong Kong, Shenzhen. A graduate of the École Normale Supérieure (ENS), he received a master's degree in machine learning from ENS Paris-Saclay and a Ph.D. from Université Grenoble Alpes (affiliated with the GIPSA-lab and CEA List). He has published in mathematics and machine learning venues (ICML, AISTATS, IEEE, Annals of Applied Probability). His main contribution is applying the theory of concentration of measure to study random matrices and high-dimensional data processing techniques (ELMs, GANs, robust estimation of scatter matrices, softmax classifiers, robust linear regression, etc.).
He worked for two years as a data engineer at the Beijing R&D center of EDF, building AI solutions for energy problems.
李海洲
Presidential Chair Professor
Executive Dean
Ph.D., South China University of Technology
Fellow of the Academy of Engineering, Singapore; Vice President of the IEEE Signal Processing Society (2024-2026); IEEE Fellow; ISCA Fellow; Fellow of the Asia-Pacific Artificial Intelligence Association; recipient of Singapore's President's Technology Award; former President of the International Speech Communication Association; former Editor-in-Chief of IEEE/ACM Transactions on Audio, Speech and Language Processing; former tenured professor at the National University of Singapore
Research interests: speech information processing, natural language processing, brain-inspired computing, human-machine interaction
Bio: Professor 李海洲 (Fellow of the Academy of Engineering, Singapore; IEEE Fellow; ISCA Fellow) is the Executive Dean of the School of Data Science and Presidential Chair Professor at The Chinese University of Hong Kong, Shenzhen. He is also an adjunct professor at the National University of Singapore and a Bremen Excellence Chair Professor at the University of Bremen, Germany. Previously, he was a professor at Nanyang Technological University and the National University of Singapore from 2006 to 2016, a visiting professor at the University of Eastern Finland in 2009, a visiting professor at the University of New South Wales, Australia, from 2011 to 2016, and Principal Scientist and Research Director at the Institute for Infocomm Research, Singapore, from 2003 to 2016.
Professor 李 served as Editor-in-Chief of IEEE/ACM Transactions on Audio, Speech and Language Processing (2015-2018), and as Associate Editor of Computer Speech and Language (2012-2022) and of the Springer International Journal of Social Robotics (2008-2022). He has served on several committees: the IEEE Speech and Language Processing Technical Committee (2013-2015), the IEEE Signal Processing Society Publications Board (2015-2018), and the IEEE Signal Processing Society Awards Board (2021-2023). He has also led several societies: President of the International Speech Communication Association (ISCA, 2015-2017), President of the Asia Pacific Signal and Information Processing Association (APSIPA, 2015-2016), President of the Asian Federation of Natural Language Processing (AFNLP, 2017-2018), and Vice President of the IEEE Signal Processing Society (2024-2026). In addition, he served as General Chair of major conferences including ACL 2012, INTERSPEECH 2014, and IEEE ICASSP 2022.
Internationally renowned, Professor 李 has not only made outstanding contributions to research in speech recognition and natural language processing, but also led the development of well-known speech products, such as the Chinese dictation kit released by Apple for the Macintosh in 1996 and the Speech-Pen-Keyboard text-input solution released by Lernout & Hauspie for Asian languages in 1999. He was the architect of a series of major technology projects, including the TELEFIQS multilingual speech-enabled automated call center for Singapore Changi International Airport in 2001, the voiceprint recognition engine for the Lenovo A586 smartphone in 2012, and the music-recognition engine for Baidu Music in 2013.
李爽
Assistant Professor
Ph.D., Georgia Institute of Technology
Finalist, INFORMS QSR Best Student Paper Competition; Finalist, INFORMS Social Media Analytics Best Student Paper Competition; runner-up for the Jarvis Fellowship of the H. Milton Stewart School of Industrial and Systems Engineering; Outstanding Undergraduate Thesis Award, Department of Automation, University of Science and Technology of China; former postdoctoral fellow at Harvard University
Research interests: machine learning methods for sequential data analysis and decision-making, with applications in healthcare, smart cities, and social media
Bio: 李爽 received her bachelor's degree from the University of Science and Technology of China in 2011, and her master's (2014) and Ph.D. (2019) degrees from the Georgia Institute of Technology.
Before joining CUHK-Shenzhen, she was a postdoctoral fellow at Harvard University, working on multi-agent reinforcement learning in mobile health, and contributed course materials for Harvard's Spring 2021 course on sequential decision-making. From 2014 to 2019 she was a teaching assistant at Georgia Tech for courses in machine learning and computational data analysis. In 2018 she completed a three-month research internship at Google on user behavior modeling for recommender systems; that same year she was a finalist in both the INFORMS QSR and the INFORMS Social Media Analytics Best Student Paper Competitions. In 2016 she was runner-up for the Jarvis Fellowship of the H. Milton Stewart School of Industrial and Systems Engineering. From 2011 to 2012 she held the Hluchyj Fellowship of the College of Engineering, University of Massachusetts Amherst, and in 2011 she received the Outstanding Undergraduate Thesis Award of the USTC Department of Automation. Her research interests include machine learning for sequential data analysis and decision-making, new sequence models, reliable and efficient learning methods, valid inference procedures, healthcare, smart cities, and social media.
李彤欣
Assistant Professor
Ph.D., California Institute of Technology
Honorable mention, 2022 SIGEnergy Doctoral Dissertation Award; 2021 Impact Grant, Resnick Sustainability Institute; twice an applied scientist intern with Amazon's cloud security group; has contributed to projects with the National Renewable Energy Laboratory, Pasadena Water and Power, and Caltech's energy facilities
Research interests: trustworthy machine learning, online learning, smart grids, power systems and control
Bio: Dr. 李彤欣 is an Assistant Professor in the School of Data Science at The Chinese University of Hong Kong, Shenzhen. He received his Ph.D. in Computing and Mathematical Sciences from Caltech in 2022. Before that, he earned a double degree in Mathematics and Information Engineering and an M.Phil. from The Chinese University of Hong Kong.
His current research sits at the intersection of machine learning, control, and optimization, with applications to power systems and sustainable, low-carbon clean energy. He is particularly interested in developing trustworthy AI to improve the sustainability, robustness, scalability, privacy, and adaptability of smart grids. He has been invited to speak at international conferences and has published in top international journals and conferences. He also has substantial experience combining research with industry: he interned twice as an applied scientist with Amazon's cloud security group and has participated in several power-system optimization projects with the National Renewable Energy Laboratory, Pasadena Water and Power, and Caltech's energy facilities. A smart-grid project he participated in received a 2021 Impact Grant from the Resnick Sustainability Institute.
李肖
Assistant Professor
Ph.D., The Chinese University of Hong Kong
Reviewer for leading venues including the Journal of Machine Learning Research, IEEE Transactions on Signal Processing, NeurIPS, and ICLR; author of papers at NeurIPS 2019 (spotlight, top 3%), ICLR 2020 (oral, top 1.85%), and in the SIAM Journal on Optimization, the SIAM Journal on Imaging Sciences, and IEEE Transactions on Image Processing; former teaching assistant at The Chinese University of Hong Kong
Research interests: mathematical optimization (nonsmooth, nonconvex, and stochastic optimization), machine learning, signal processing
Bio: Professor 李肖 is an Assistant Professor at The Chinese University of Hong Kong, Shenzhen. He received his B.Eng. in Communication Engineering from Zhejiang University of Technology in 2016 and his Ph.D. in Electronic Engineering from The Chinese University of Hong Kong in 2020. From October 2018 to April 2019 he was a visiting scholar at the University of Southern California.
His research covers data science, computational imaging, machine learning, and mathematical optimization. His work was selected as a spotlight paper (top 3%) at NeurIPS 2019 and as an oral paper (top 1.85%) at ICLR 2020. He has also published in international journals such as the SIAM Journal on Optimization, the SIAM Journal on Imaging Sciences, and IEEE Transactions on Image Processing.
罗智泉
Presidential Chair Professor
Vice President (Academic), The Chinese University of Hong Kong, Shenzhen
Ph.D., Massachusetts Institute of Technology
World's Top 2% Scientist; among the world's top 1,000 scientists in computer science and electronics; Academician of the Chinese Academy of Engineering; Fellow of the Royal Society of Canada; IEEE Fellow; SIAM Fellow; recipient of the inaugural Wang Xuan Applied Mathematics Award; Director of the Shenzhen Research Institute of Big Data
Research interests: optimization methods for big data analytics, algorithm design and complexity analysis in signal processing, data communications
Bio: Professor 罗智泉 is a foreign member of the Chinese Academy of Engineering, a Fellow of the Royal Society of Canada, Vice President of The Chinese University of Hong Kong, Shenzhen, Director of the Shenzhen Research Institute of Big Data, Director of the CUHK-Shenzhen-SRIBD-Huawei Future Network System Optimization Innovation Laboratory, and Director of the Guangdong Provincial Key Laboratory of Big Data Computing Theory and Methods. He received his B.S. in Mathematics from Peking University in 1984 and his Ph.D. in Operations Research from the Department of Electrical Engineering and Computer Science at MIT in 1989. He is a SIAM Fellow and IEEE Fellow, and served as Editor-in-Chief of IEEE Transactions on Signal Processing (2012-2014).
His academic contributions include transceiver optimization design for wireless communications, optimal robust beamforming design, and dynamic spectrum management. The related papers received best paper awards from the IEEE Signal Processing Society (2004, 2009, 2011, and 2015), the International Conference on Communications (2011), the European Association for Signal Processing, and the International Consortium of Chinese Mathematicians (2020). For his outstanding contributions to optimization theory, he received the INFORMS Farkas Prize in 2010, the Tseng Memorial Lectureship from the Mathematical Optimization Society in 2018, and the inaugural Wang Xuan Applied Mathematics Award from the China Society for Industrial and Applied Mathematics (CSIAM) in 2022.
In 2020, tackling the challenge of maximizing network performance, he pioneered data-driven statistical simulation of real-world networks, built large-scale parameter optimization models for 4G/5G heterogeneous networks, broke the algorithmic bottleneck of solving ultra-large-scale mixed-integer optimization models, and established from scratch a mathematical modeling and algorithmic framework for network performance. This work was certified as a CSIAM applied-mathematics achievement in June 2021, and in September 2021 it was shortlisted for the Computing Innovation and Digital Empowerment exhibition of the 2021 World Computing Conference. In 2021 he was listed by Guide2Research, a leading portal for computer science research, among the world's top 1,000 scientists in computer science and electronics.
濮实
Assistant Professor
Ph.D., University of Virginia
Conference Editorial Board member, IEEE Control Systems Society; former postdoctoral researcher at Boston University, the University of Florida, and Arizona State University
Research interests: distributed machine learning, large-scale optimization, multi-agent networks
Bio: Dr. 濮实 is an Assistant Professor in the School of Data Science at The Chinese University of Hong Kong, Shenzhen. He was previously a postdoctoral researcher at the University of Florida, Arizona State University, and Boston University. He received his B.Eng. from Peking University in 2012 and his Ph.D. in Systems Engineering from the University of Virginia in 2016. His research focuses on distributed machine learning and large-scale optimization algorithms. In 2017 he was named a Louis T. Rader Outstanding Graduate at the University of Virginia. As first or corresponding author he has published more than ten papers in leading journals in operations research, optimization, and control, including Mathematical Programming, IEEE Transactions on Automatic Control, SIAM Journal on Control and Optimization, and Operations Research, with one representative paper selected as an ESI Highly Cited Paper. He is the principal investigator of an NSFC Young Scientists grant and a Shenzhen outstanding young scientific and technological talent program (excellent youth basic research), among others, and was selected for Guangdong Province's young talent program in 2021. Since 2022 he has served on the Conference Editorial Board of the IEEE Control Systems Society.
孙若愚
Associate Professor
Ph.D., University of Minnesota
Area Chair for AI conferences including NeurIPS, ICML, ICLR, and AISTATS; runner-up in the INFORMS George Nicholson Student Paper Competition; honorable mention in the INFORMS Optimization Society Student Paper Competition; former full-time visiting scientist at Facebook AI Research; former Assistant Professor at the University of Illinois Urbana-Champaign
Research interests: deep learning theory and algorithms, generative models, large-scale optimization algorithms, learning to optimize, graph neural networks, AI applications in communication networks, communication network capacity theory, communication network optimization algorithms
Bio: 孙若愚 is an Associate Professor and doctoral supervisor in the School of Data Science at The Chinese University of Hong Kong, Shenzhen. He was previously an Assistant Professor and doctoral supervisor at the University of Illinois Urbana-Champaign (UIUC) from 2017 to 2022, a full-time visiting scientist at Facebook AI Research (led by Yann LeCun) in 2016, and a postdoctoral researcher at Stanford University from 2015 to 2016. He received his Ph.D. from the Department of Electrical and Computer Engineering at the University of Minnesota in 2015 and his bachelor's degree in pure mathematics from the School of Mathematical Sciences at Peking University in 2009. His main research areas are artificial intelligence and machine learning, mathematical optimization theory and algorithms, and wireless communications and signal processing, with specific interests in neural network theory and algorithms, generative models, big-data optimization algorithms, learning to optimize, and communication network capacity theory and optimization algorithms. He was runner-up in the INFORMS George Nicholson Student Paper Competition and received an honorable mention in the INFORMS Optimization Society Student Paper Competition. He has published dozens of papers at machine learning conferences (NeurIPS, ICML, ICLR, AISTATS), in top information theory and communications journals (IEEE Transactions on Information Theory, IEEE Signal Processing Magazine, IEEE Journal on Selected Areas in Communications), and in top optimization and operations research journals (Mathematical Programming, SIAM Journal on Optimization, Mathematics of Operations Research). He currently serves as an Area Chair for NeurIPS, ICML, ICLR, and AISTATS.
王趵翔
Assistant Professor
Ph.D., The Chinese University of Hong Kong
Reviewer for leading journals including Management Science and the INFORMS Journal on Computing; former postdoctoral researcher at Columbia University
Research interests: reinforcement learning, online learning, learning theory
Bio: 王趵翔 is an Assistant Professor in the School of Data Science at The Chinese University of Hong Kong, Shenzhen. He received his B.Eng. in Information Security from Shanghai Jiao Tong University in 2014 and his Ph.D. from the Department of Computer Science and Engineering at The Chinese University of Hong Kong in 2020. During his doctoral studies he made extended visits to the University of Alberta and the Royal Bank of Canada.
His research interests include reinforcement learning, online learning, and learning theory, with publications at venues such as ITCS, NeurIPS, ICML, and ICLR. His work on the Gambler's problem resolved an open question from the standard reinforcement learning textbook and demonstrated chaotic behavior in reinforcement learning.
王本友
Assistant Professor
Ph.D., University of Padua, Italy
NLPCC 2022 Best Paper Award; NAACL 2019 Best Explainable NLP Paper; SIGIR 2017 Best Paper Honorable Mention; Marie Curie Fellowship; long-serving reviewer for ICLR/NeurIPS/ICML
Research interests: natural language processing, information retrieval, applied machine learning
Bio: Professor 王本友 received his Ph.D. from the University of Padua, Italy, in 2022. A former EU Marie Curie Fellow, he has held visiting positions at the University of Copenhagen, the University of Montreal, the University of Amsterdam, Huawei Noah's Ark Lab, the Institute of Theoretical Physics of the Chinese Academy of Sciences, and the Institute of Linguistics of the Chinese Academy of Social Sciences. His main research directions are natural language processing, applied machine learning, and information retrieval. He received the Best Paper Honorable Mention at SIGIR 2017 (a CCF Class-A conference); the awarded paper (IRGAN) is one of the most cited papers in SIGIR history and among the earliest and most successful applications of GANs to information retrieval. He also received the Best Explainable NLP Paper award at NAACL 2019, sharing the stage with the NLP milestone work BERT. He has published more than 20 papers at top international conferences (ICLR/NeurIPS/ACL/EMNLP/NAACL/SIGIR/WWW/CIKM/AAAI/IJCAI) and in top journals (TOIS/TOC/TCS). His monograph 《推荐系统与深度学习》 (Recommender Systems and Deep Learning) was published by Tsinghua University Press. He is a long-serving reviewer for ICLR, NeurIPS, and ICML.
吴保元
Associate Professor
Assistant Dean (Research)
Ph.D., Institute of Automation, Chinese Academy of Sciences
World's Top 2% Scientist; Director of the Big Data Security Computing Lab at the Shenzhen Research Institute of Big Data; Area Chair for NeurIPS 2022; editorial board member of Neurocomputing; principal investigator of an NSFC General Program grant; former Expert Researcher at Tencent AI Lab
Research interests: AI security and privacy, computer vision, machine learning and optimization
Bio: Professor 吴保元 is an Associate Professor in the School of Data Science at The Chinese University of Hong Kong, Shenzhen. He graduated from the School of Automation, University of Science and Technology Beijing, in 2009, was a visiting student at Rensselaer Polytechnic Institute in the United States from 2011 to 2013 working on machine learning and computer vision, and received his Ph.D. in Pattern Recognition and Intelligent Systems from the National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, in June 2014. He was then a postdoctoral researcher at King Abdullah University of Science and Technology in Saudi Arabia from 2014 to 2016, and a Senior Researcher at Tencent AI Lab from 2016 to 2018, where he was promoted to Expert Researcher in January 2019.
His research centers on machine learning, computer vision, and optimization, including adversarial examples, model compression, visual reasoning, image annotation, weakly supervised and unsupervised learning, structured prediction, probabilistic graphical models, video processing, and integer programming.
武执政
Associate Professor
Ph.D., Nanyang Technological University
World's Top 2% Scientist; member of the IEEE Speech and Language Processing Technical Committee; editorial board member of IEEE/ACM Transactions on Audio, Speech and Language Processing; Best Paper Award at the 2012 APSIPA Annual Summit and Conference; previously with Facebook, JD.com, Apple, the University of Edinburgh, and Microsoft Research Asia
Research interests: speech interaction, speech generation, audio anti-spoofing
Bio: Dr. 武执政 is an Associate Professor at The Chinese University of Hong Kong, Shenzhen, and has repeatedly been named to Stanford University's "World's Top 2% Scientists" list. He received his Ph.D. from Nanyang Technological University and has held research and technical leadership positions at Meta (formerly Facebook), Apple, the University of Edinburgh, Microsoft Research Asia, and other institutions. He initiated the open-source systems Merlin and Amphion, the first international spoofing-detection challenge for speaker verification, and the first international Voice Conversion Challenge, and organized the Blizzard Challenge 2019 on speech synthesis. He has received multiple best paper awards. He serves on the editorial boards of leading speech journals including IEEE/ACM TASLP and IEEE SPL, and is a General Chair of the IEEE Spoken Language Technology Workshop 2024.
于天舒
Assistant Professor
Ph.D., Arizona State University
Recipient of an engineering graduate fellowship at Arizona State University; former teaching assistant at Arizona State University and algorithm engineer at Philips Healthcare
Research interests: machine learning, combinatorial optimization, graph learning and optimization, recurrent neural networks, determinantal point processes
Bio: Dr. 于天舒 is an Assistant Professor at The Chinese University of Hong Kong, Shenzhen. He received his bachelor's degree from Shenyang University of Technology in 2012, worked as an algorithm engineer at Philips Healthcare from 2012 to 2014, and received a master's degree in geomatics engineering from the University of Calgary, Canada, in 2016. He received his Ph.D. in Computer Science from Arizona State University in 2021.
His research interests span machine learning and combinatorial optimization, in particular using machine learning to solve classical combinatorial problems, graph learning and optimization, and structural extensions within deep learning frameworks; recurrent neural networks and determinantal point processes are also among his research topics. He has served as a reviewer for top conferences (e.g., ICLR 2021, NeurIPS 2020, CVPR 2019-2021, ICCV 2019, ECCV 2020) and journals (e.g., IEEE Transactions on Image Processing, IEEE Transactions on Neural Networks and Learning Systems, IEEE Transactions on Circuits and Systems for Video Technology, Pattern Recognition, Pattern Recognition Letters).
查宏远
Presidential Chair Professor
Associate Dean (Research)
Head of Computer Science
Ph.D., Stanford University
World's Top 2% Scientist; among the world's top 1,000 scientists in computer science and electronics; recipient of a NIPS Outstanding Paper Award and the Leslie Fox Prize in numerical analysis; supervising professor of a SIGIR Best Student Paper; center director at the Shenzhen Institute of Artificial Intelligence and Robotics for Society; former professor at the Georgia Institute of Technology
Research interests: machine learning and its applications
Bio: Professor 查宏远 is Presidential Chair Professor and Associate Dean (Research) of the School of Data Science at The Chinese University of Hong Kong, Shenzhen.
He graduated from the Department of Mathematics at Fudan University in 1984 and received his Ph.D. in Scientific Computing from Stanford University in 1993. He was on the faculty of the College of Computing at the Georgia Institute of Technology from 2006 to 2020 and of the Department of Computer Science and Engineering at Pennsylvania State University from 1992 to 2006, and worked at Inktomi from 1999 to 2001. His current research focuses on machine learning and its applications.
He has published more than 400 papers in mainstream journals and top conferences in computer science and related fields; according to Google Scholar (as of April 2024), his h-index is 90 with more than 36,703 citations. His honors include second prize of the Leslie Fox Prize, awarded by the Institute of Mathematics and its Applications (IMA), in 1991; the Best Student Paper Award (as supervising professor) at the 34th ACM SIGIR conference (SIGIR 2011); and an Outstanding Paper Award at the 26th NeurIPS conference (2013).
Paper Overviews
01 Online Control with Adversarial Disturbance for Continuous-time Linear Systems
Authors: Jingwei Li, Jing Dong, Can Chang, Baoxiang Wang, Jingzhao Zhang
https://arxiv.org/html/2306.01952v3
02 Few-Shot Diffusion Models Escape the Curse of Dimensionality
Authors: Ruofeng Yang, Bo Jiang, Cheng Chen, Ruinan Jin, Baoxiang Wang, Shuai Li
Abstract: Denoising diffusion probabilistic models (DDPM) are powerful hierarchical latent variable models with remarkable sample generation quality and training stability. These properties can be attributed to parameter sharing in the generative hierarchy, as well as a parameter-free diffusion-based inference procedure. In this paper, we present Few-Shot Diffusion Models (FSDM), a framework for few-shot generation leveraging conditional DDPMs. FSDMs are trained to adapt the generative process conditioned on a small set of images from a given class by aggregating image patch information using a set-based Vision Transformer (ViT). At test time, the model is able to generate samples from previously unseen classes conditioned on as few as 5 samples from that class. We empirically show that FSDM can perform few-shot generation and transfer to new datasets. We benchmark variants of our method on complex vision datasets for few-shot learning and compare to unconditional and conditional DDPM baselines. Additionally, we show how conditioning the model on patch-based input set information improves training convergence.
https://nips.cc/virtual/2024/poster/95694
03 Alignment at Pre-training! Towards Native Alignment for Arabic LLMs
Authors: Juhao Liang (SSE student), Zhenyang Cai, Jianqing Zhu, Huang Huang, Kewei Zong, Bang An, Mosen Alharthi, Juncai He, Lian Zhang, Haizhou Li, Benyou Wang (corresponding author), Jinchao Xu
Abstract: The alignment of large language models (LLMs) is critical for developing effective and safe language models. Traditional approaches focus on aligning models during the instruction tuning or reinforcement learning stages, referred to in this paper as "post alignment". We argue that alignment during the pre-training phase, which we term "native alignment", warrants investigation. Native alignment aims to prevent unaligned content from the beginning, rather than relying on post-hoc processing. This approach leverages extensively aligned pre-training data to enhance the effectiveness and usability of pre-trained models. Our study specifically explores the application of native alignment in the context of Arabic LLMs. We conduct comprehensive experiments and ablation studies to evaluate the impact of native alignment on model performance and alignment stability. Additionally, we release open-source Arabic LLMs that demonstrate state-of-the-art performance on various benchmarks, providing significant benefits to the Arabic LLM community.
https://neurips.cc/virtual/2024/poster/93121
04 Breaking the False Sense of Security in Backdoor Defense through Re-Activation Attack
Authors: Mingli Zhu, Siyuan Liang, Baoyuan Wu (corresponding author)
Abstract: Deep neural networks face persistent challenges in defending against backdoor attacks, leading to an ongoing battle between attacks and defenses. While existing backdoor defense strategies have shown promising performance on reducing attack success rates, can we confidently claim that the backdoor threat has truly been eliminated from the model? To address this, we re-investigate the characteristics of the backdoored models after defense (denoted as defense models). Surprisingly, we find that the original backdoors still exist in defense models derived from existing post-training defense strategies, where backdoor existence is measured by a novel metric called the backdoor existence coefficient. This implies that the backdoors merely lie dormant rather than being eliminated. To further verify this finding, we empirically show that these dormant backdoors can be easily re-activated during inference by manipulating the original trigger with a well-designed tiny perturbation using a universal adversarial attack. More practically, we extend our backdoor re-activation to the black-box scenario, where the defense model can only be queried by the adversary during inference, and develop two effective methods, i.e., query-based and transfer-based backdoor re-activation attacks. The effectiveness of the proposed methods is verified on both image classification and multimodal contrastive learning (i.e., CLIP) tasks. In conclusion, this work uncovers a critical vulnerability that has never been explored in existing defense strategies, emphasizing the urgency of designing more robust and advanced backdoor defense mechanisms in the future.
https://www.arxiv.org/abs/2405.16134
05 Mitigating Backdoor Attack by Injecting Proactive Defensive Backdoor
Authors: Shaokui Wei, Hongyuan Zha, Baoyuan Wu (corresponding author)
Abstract: Data-poisoning backdoor attacks are serious security threats to machine learning models, where an adversary can manipulate the training dataset to inject backdoors into models. In this paper, we focus on in-training backdoor defense, aiming to train a clean model even when the dataset may be potentially poisoned. Unlike most existing methods that primarily detect and remove/unlearn suspicious samples to mitigate malicious backdoor attacks, we propose a novel defense approach called PDB (Proactive Defensive Backdoor). Specifically, PDB leverages the home-field advantage of defenders by proactively injecting a defensive backdoor into the model during training. Taking advantage of controlling the training process, the defensive backdoor is designed to suppress the malicious backdoor effectively while remaining secret to attackers. In addition, we introduce a reversible mapping to determine the defensive target label. During inference, PDB embeds a defensive trigger in the inputs and reverses the model's prediction, suppressing the malicious backdoor and ensuring the model's utility on the original task. Experimental results across various datasets and models demonstrate that our approach achieves state-of-the-art defense performance against a wide range of backdoor attacks.
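The inference-time mechanics described in the abstract (embed a defensive trigger, then reverse the prediction through a reversible label mapping) can be sketched as follows. This is a minimal illustration only: the number of classes K, the additive trigger, the cyclic-shift mapping, and the toy classifier are all assumptions, not the authors' implementation.

```python
K = 10  # number of classes (illustrative assumption)

def defensive_target(y):
    # Reversible mapping that determines the defensive target label.
    return (y + 1) % K

def recover_label(y_defensive):
    # Inverse of the mapping, used to reverse the model's prediction.
    return (y_defensive - 1) % K

def embed_trigger(x, trigger):
    # Additively embed the defensive trigger into the input features.
    return [xi + ti for xi, ti in zip(x, trigger)]

def pdb_inference(classify, x, trigger):
    # Embed the defensive trigger, then undo the defensive label shift.
    return recover_label(classify(embed_trigger(x, trigger)))

# Toy stand-in "model": a model trained under this scheme outputs the
# defensive target whenever the defensive trigger is present.
def toy_classifier(x):
    true_label = int(sum(x)) % K   # stand-in for the original task
    return defensive_target(true_label)

trigger = [0.0, 0.0, 0.0]
x = [1.0, 2.0, 4.0]               # "true" label 7 under the toy task
print(pdb_inference(toy_classifier, x, trigger))  # -> 7
```

Because the mapping is a bijection on labels, reversing it at inference restores the original-task prediction while the defensive backdoor occupies the "trigger pathway" that a malicious backdoor would otherwise exploit.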
https://arxiv.org/abs/2405.16112
06 FreqBlender: Enhancing DeepFake Detection by Blending Frequency Knowledge
Authors: Hanzhe Li, Jiaran Zhou, Yuezun Li, Baoyuan Wu, Bin Li, Junyu Dong
Abstract: Generating synthetic fake faces, known as pseudo-fake faces, is an effective way to improve the generalization of DeepFake detection. Existing methods typically generate these faces by blending real or fake faces in color space. While these methods have shown promise, they overlook the simulation of frequency distribution in pseudo-fake faces, limiting the learning of generic forgery traces in-depth. To address this, this paper introduces FreqBlender, a new method that can generate pseudo-fake faces by blending frequency knowledge. Specifically, we investigate the major frequency components and propose a Frequency Parsing Network to adaptively partition frequency components related to forgery traces. Then we blend this frequency knowledge from fake faces into real faces to generate pseudo-fake faces. Since there is no ground truth for frequency components, we describe a dedicated training strategy by leveraging the inner correlations among different frequency knowledge to instruct the learning process. Experimental results demonstrate the effectiveness of our method in enhancing DeepFake detection, making it a potential plug-and-play strategy for other methods.
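The core blending idea (take selected frequency components from a fake sample and mix them into a real one) can be illustrated on a 1-D signal. In the paper, the Frequency Parsing Network learns which components carry forgery traces; the fixed low-frequency band below is purely an assumption for illustration.

```python
import numpy as np

def blend_frequencies(real, fake, band):
    # Move both signals to the frequency domain.
    F_real, F_fake = np.fft.fft(real), np.fft.fft(fake)
    mask = np.zeros_like(F_real)
    mask[band] = 1.0                 # components taken from the fake signal
    blended = F_fake * mask + F_real * (1 - mask)
    # Back to the signal domain: a "pseudo-fake" carrying the fake
    # signal's frequency content only inside the chosen band.
    return np.fft.ifft(blended).real

rng = np.random.default_rng(0)
real = rng.normal(size=64)
fake = rng.normal(size=64)
pseudo = blend_frequencies(real, fake, band=slice(0, 4))
print(pseudo.shape)  # -> (64,)
```

With an empty band the function returns the real signal unchanged, which makes the "blend only what the parser selects" behavior easy to sanity-check.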
https://arxiv.org/abs/2404.13872
07 Disentangling Linear Quadratic Control with Untrusted ML Predictions
Authors: Tongxin Li, Hao Liu, Yisong Yue
Abstract: Uncertain perturbations in dynamical systems often arise from diverse resources, represented by latent components. The predictions for these components, typically generated by "black-box" machine learning tools, are prone to inaccuracies. To tackle this challenge, we introduce DISC, a novel policy that learns a confidence parameter online to harness the potential of accurate predictions while also mitigating the impact of potential erroneous forecasts. When predictions are precise, DISC leverages this information to achieve near-optimal performance. Conversely, in the case of significant prediction errors, it still has a worst-case guarantee on its competitive ratio. We provide competitive ratio bounds for DISC in both a linear mixing framework of latent variables and a broader class of mixing functions. Our results highlight a first-of-its-kind "best-of-both-worlds" integration of machine-learned predictions, thus leading to a near-optimal consistency and robustness tradeoff, which provably improves what can be obtained without learning the confidence parameter. We validate the applicability of DISC across a spectrum of practical scenarios.
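The confidence-parameter idea from the abstract can be sketched as acting on a convex combination of an ML prediction and a robust fallback, shifting the weight toward whichever source has been more accurate. The simple gradient-style update below is an illustrative assumption, not the DISC policy itself.

```python
def combine(pred, robust, lam):
    # lam = 1 trusts the prediction fully; lam = 0 falls back entirely.
    return lam * pred + (1 - lam) * robust

def update_confidence(lam, pred_err, robust_err, lr=0.1):
    # Move lam toward the source with the smaller observed error,
    # clipped to [0, 1]. (Illustrative rule, not the paper's update.)
    lam += lr * (robust_err - pred_err)
    return min(1.0, max(0.0, lam))

lam = 0.5  # start agnostic between prediction and fallback
for true_state, pred, robust in [(1.0, 1.01, 0.0), (2.0, 1.98, 0.0)]:
    action = combine(pred, robust, lam)
    lam = update_confidence(lam, abs(pred - true_state), abs(robust - true_state))
print(round(lam, 3))  # -> 0.797
```

Here the predictions are consistently accurate, so the confidence weight drifts upward; persistently bad predictions would drive it back toward the robust fallback, which is the consistency-robustness tradeoff the paper formalizes.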
https://neurips.cc/virtual/2024/poster/94827
08 Safe Exploitative Play in Stochastic Bayesian Games with Untrusted Type Beliefs
Authors: Tongxin Li, Tinashe Handina, Shaolei Ren, Adam Wierman
https://nips.cc/virtual/2024/poster/95226
09 Towards Robust Multimodal Sentiment Analysis with Incomplete Data
Authors: Haoyu Zhang (SDS RA), Wenbin Wang, Tianshu Yu
Abstract: The field of Multimodal Sentiment Analysis (MSA) has recently witnessed an emerging direction seeking to tackle the issue of data incompleteness. Recognizing that the language modality typically contains dense sentiment information, we consider it as the dominant modality and present an innovative Language-dominated Noise-resistant Learning Network (LNLN) to achieve robust MSA. The proposed LNLN features a dominant modality correction (DMC) module and a dominant modality based multimodal learning (DMML) module, which enhance the model's robustness across various noise scenarios by ensuring the quality of dominant modality representations. Aside from the methodical design, we perform comprehensive experiments under random data missing scenarios, utilizing diverse and meaningful settings on several popular datasets (e.g., MOSI, MOSEI, and SIMS), providing additional uniformity, transparency, and fairness compared to existing evaluations in the literature. Empirically, LNLN consistently outperforms existing baselines, demonstrating superior performance across these challenging and extensive evaluation metrics.
https://arxiv.org/abs/2409.20012
10 Boosting Graph Pooling with Persistent Homology
Authors: Chaolong Ying, Xinjian Zhao, Tianshu Yu
Abstract: Recently, there has been an emerging trend to integrate persistent homology (PH) into graph neural networks (GNNs) to enrich expressive power. However, naively plugging PH features into GNN layers always results in marginal improvement with low interpretability. In this paper, we investigate a novel mechanism for injecting global topological invariance into pooling layers using PH, motivated by the observation that filtration operation in PH naturally aligns graph pooling in a cut-off manner. In this fashion, message passing in the coarsened graph acts along persistent pooled topology, leading to improved performance. Experimentally, we apply our mechanism to a collection of graph pooling methods and observe consistent and substantial performance gain over several popular datasets, demonstrating its wide applicability and flexibility.
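The "cut-off" analogy between filtration and pooling can be made concrete with a toy example: order nodes by a filtration value and pool away the late-arriving ones, restricting edges to the survivors. Using node degree as the filtration function and a fixed number of kept nodes are assumptions for illustration; the paper's mechanism operates on learned persistent-homology features.

```python
def filtration_pool(edges, n_nodes, n_keep):
    # Filtration value: node degree (illustrative choice).
    degree = [0] * n_nodes
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Nodes entering the filtration first (highest degree here) survive pooling.
    order = sorted(range(n_nodes), key=lambda u: -degree[u])
    kept = set(order[:n_keep])
    # The coarsened graph keeps only edges between surviving nodes.
    pooled_edges = [(u, v) for u, v in edges if u in kept and v in kept]
    return sorted(kept), pooled_edges

# 4-cycle plus a pendant node: the pendant (degree 1) is pooled away.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (3, 4)]
print(filtration_pool(edges, 5, n_keep=3))  # -> ([0, 1, 3], [(0, 1), (3, 0)])
```

Message passing on the pooled graph then acts only along the retained ("persistent") structure, which is the intuition behind the paper's topology-aware pooling.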
https://arxiv.org/abs/2402.16346
11 Prompt Optimization with EASE? Efficient Ordering-aware Automated Selection of Exemplars
Authors: Zhaoxuan Wu*, Xiaoqiang Lin*, Zhongxiang Dai (corresponding author), Wenyang Hu, Yao Shu, See-Kiong Ng, Patrick Jaillet, Kian Hsiang Low
Abstract: Large language models (LLMs) have shown impressive capabilities in real-world applications. The capability of in-context learning (ICL) allows us to adapt an LLM to downstream tasks by including input-label exemplars in the prompt without model fine-tuning. However, the quality of these exemplars in the prompt greatly impacts performance, highlighting the need for an effective automated exemplar selection method. Recent studies have explored retrieval-based approaches to select exemplars tailored to individual test queries, which can be undesirable due to extra test-time computation and an increased risk of data exposure. Moreover, existing methods fail to adequately account for the impact of exemplar ordering on the performance. On the other hand, the impact of the instruction, another essential component in the prompt given to the LLM, is often overlooked in existing exemplar selection methods. To address these challenges, we propose a novel method named EASE, which leverages the hidden embedding from a pre-trained language model to represent ordered sets of exemplars and uses a neural bandit algorithm to optimize the sets of exemplars while accounting for exemplar ordering. Our EASE can efficiently find an ordered set of exemplars that performs well for all test queries from a given task, thereby eliminating test-time computation. Importantly, EASE can be readily extended to jointly optimize both the exemplars and the instruction. Through extensive empirical evaluations (including novel tasks), we demonstrate the superiority of EASE over existing methods, and reveal practical insights about the impact of exemplar selection on ICL, which may be of independent interest.
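Why ordering matters can be shown with a toy ordered-set search: score every ordered pair drawn from a small exemplar pool with a black-box validation function and keep the best. The scoring function below is a stand-in assumption; EASE itself embeds ordered sets with a pre-trained LM and uses a neural bandit instead of exhaustive enumeration.

```python
from itertools import permutations

def val_score(ordered_exemplars):
    # Toy black-box score: pretend the LLM prefers exemplars in order of
    # ascending difficulty (proxied by string length), so each "hard
    # before easy" adjacent pair is penalized.
    difficulties = [len(e) for e in ordered_exemplars]
    return -sum(1 for a, b in zip(difficulties, difficulties[1:]) if a > b)

pool = ["2+2=4", "12*3=36", "sqrt(144)=12"]
# Exhaustively search ordered 2-subsets; the same pool in a different
# order can receive a different score, which is the point.
best = max(permutations(pool, 2), key=val_score)
print(best)  # -> ('2+2=4', '12*3=36')
```

In this toy setup the easy-to-hard ordering wins, while the reversed pair of the same two exemplars scores worse, illustrating why an ordering-agnostic selector can leave performance on the table.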
https://arxiv.org/abs/2405.16122
12 Localized Zeroth-Order Prompt Optimization
Authors: Wenyang Hu*, Yao Shu*, Zongmin Yu, Zhaoxuan Wu, Xiaoqiang Lin, Zhongxiang Dai, See-Kiong Ng, Kian Hsiang Low
Abstract: The efficacy of large language models (LLMs) in understanding and generating natural language has aroused a wide interest in developing prompt-based methods to harness the power of black-box LLMs. Existing methodologies usually prioritize a global optimization for finding the global optimum, which however will perform poorly in certain tasks. This thus motivates us to re-think the necessity of finding a global optimum in prompt optimization. To answer this, we conduct a thorough empirical study on prompt optimization and draw two major insights. Contrasting with the rarity of the global optimum, local optima are usually prevalent and well-performing, which can be more worthwhile for efficient prompt optimization (Insight I). The choice of the input domain, covering both the generation and the representation of prompts, affects the identification of well-performing local optima (Insight II). Inspired by these insights, we propose a novel algorithm, namely localized zeroth-order prompt optimization (ZOPO), which incorporates a Neural Tangent Kernel-derived Gaussian process into standard zeroth-order optimization for an efficient search of well-performing local optima in prompt optimization. Remarkably, ZOPO outperforms existing baselines in terms of both the optimization performance and the query efficiency, which we demonstrate through extensive experiments.
https://arxiv.org/abs/2403.02993
13 B-ary Tree Push-Pull Method is Provably Efficient for Distributed Learning on Heterogeneous Data
作者:Runze You, Shi Pu
论文摘要:This paper considers the distributed learning problem where a group of agents cooperatively minimizes the summation of their local cost functions based on peer-to-peer communication. Particularly, we propose a highly efficient algorithm, termed "B-ary Tree Push-Pull" (BTPP), that employs two B-ary spanning trees for distributing the information related to the parameters and stochastic gradients across the network. The simple method is efficient in communication since each agent interacts with at most (B+1) neighbors per iteration. More importantly, BTPP achieves linear speedup for smooth nonconvex objective functions with only Õ(n) transient iterations, significantly outperforming the state-of-the-art results to the best of our knowledge.
https://arxiv.org/abs/2404.05454
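The (B+1)-neighbor claim follows from the tree topology. A sketch, assuming nodes 0..n-1 are arranged as a complete B-ary tree rooted at 0 (the paper's actual construction uses two such spanning trees, one for parameters and one for stochastic gradients):

```python
def bary_neighbors(i, n, B):
    """Parent and children of node i in a complete B-ary tree of n nodes."""
    nbrs = []
    if i > 0:
        nbrs.append((i - 1) // B)                       # parent
    first_child = B * i + 1
    nbrs.extend(c for c in range(first_child, first_child + B) if c < n)
    return nbrs

# Each agent communicates with at most B + 1 neighbors per iteration.
degrees = [len(bary_neighbors(i, 15, B=2)) for i in range(15)]
```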
14 Analysing Multi-Task Regression via Random Matrix Theory with Application to Time Series Forecasting
作者:Romain Ilbert, Malik Tiomoko, Cosme Louart, Ambroise Odonnat, Vasilii Feofanov, Themis Palpanas, Ievgen Redko
论文摘要:In this paper, we introduce a novel theoretical framework for multi-task regression, applying random matrix theory to provide precise performance estimations, under high-dimensional, non-Gaussian data distributions. We formulate a multi-task optimization problem as a regularization technique to enable single-task models to leverage multi-task learning information. We derive a closed-form solution for multi-task optimization in the context of linear models. Our analysis provides valuable insights by linking the multi-task learning performance to various model statistics such as raw data covariances, signal-generating hyperplanes, noise levels, as well as the size and number of datasets. We finally propose a consistent estimation of training and testing errors, thereby offering a robust foundation for hyperparameter optimization in multi-task regression scenarios. Experimental validations on both synthetic and real-world datasets in regression and multivariate time series forecasting demonstrate improvements on univariate models, incorporating our method into the training loss and thus leveraging multivariate information.
https://arxiv.org/abs/2406.10327
15 BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models
作者:Qijun Luo, Hengxu Yu, Xiao Li
论文摘要:This work presents BAdam, an optimization method that leverages the block coordinate descent framework with Adam as the inner solver. BAdam offers a memory efficient approach to the full parameter finetuning of large language models. We conduct theoretical convergence analysis for BAdam in the deterministic case. Experimentally, we apply BAdam to instruction-tune the Llama 2-7B and Llama 3-8B models using a single RTX3090-24GB GPU. The results confirm BAdam's efficiency in terms of memory and running time. Additionally, the convergence verification indicates that BAdam exhibits superior convergence behavior compared to LoRA. Furthermore, the downstream performance evaluation using the MT-bench shows that BAdam modestly surpasses LoRA and more substantially outperforms LOMO. Finally, we compare BAdam with Adam on a medium-sized task, i.e., finetuning RoBERTa-large on the SuperGLUE benchmark. The results demonstrate that BAdam is capable of narrowing the performance gap with Adam more effectively than LoRA.
https://arxiv.org/abs/2404.02827
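The block coordinate structure can be sketched in a few lines: cycle through parameter blocks and run a handful of Adam steps on the active block only, so optimizer state exists for just one block at a time. Everything below (block sizes, step counts, learning rate, the toy objective) is illustrative, not the paper's configuration.

```python
def adam_step(x, g, state, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """One bias-corrected Adam update on a parameter block."""
    m, v, t = state
    t += 1
    m = [b1 * mi + (1 - b1) * gi for mi, gi in zip(m, g)]
    v = [b2 * vi + (1 - b2) * gi * gi for vi, gi in zip(v, g)]
    x = [xi - lr * (mi / (1 - b1 ** t)) / ((vi / (1 - b2 ** t)) ** 0.5 + eps)
         for xi, mi, vi in zip(x, m, v)]
    return x, (m, v, t)

def badam(grad_fn, x, blocks, outer=20, inner=5):
    """Block coordinate descent with Adam as the inner solver."""
    for _ in range(outer):
        for blk in blocks:                      # activate one block at a time
            state = ([0.0] * len(blk), [0.0] * len(blk), 0)
            for _ in range(inner):
                g = grad_fn(x)
                xb, state = adam_step([x[i] for i in blk],
                                      [g[i] for i in blk], state)
                for i, xi in zip(blk, xb):
                    x[i] = xi
    return x

# Minimize f(x) = sum(x_i^2), split into two parameter blocks.
x = badam(lambda v: [2 * vi for vi in v], [3.0, -2.0, 1.0, 4.0],
          blocks=[[0, 1], [2, 3]])
```

The memory savings at scale come from the same structure: Adam's moment estimates are materialized only for the currently active block.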
16 GMAI-MMBench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI
作者:Pengcheng Chen, Jin Ye, Guoan Wang, Yanjun Li, Zhongying Deng, Wei Li, Tianbin Li, Haodong Duan, Ziyan Huang, Yanzhou Su, Benyou Wang, Shaoting Zhang, Bin Fu, Jianfei Cai, Bohan Zhuang, Eric J Seibel, Junjun He, Yu Qiao
论文摘要:Large Vision-Language Models (LVLMs) are capable of handling diverse data types such as imaging, text, and physiological signals, and can be applied in various fields. In the medical field, LVLMs have a high potential to offer substantial assistance for diagnosis and treatment. Before that, it is crucial to develop benchmarks to evaluate LVLMs' effectiveness in various medical applications. Current benchmarks are often built upon specific academic literature, mainly focusing on a single domain, and lacking varying perceptual granularities. Thus, they face specific challenges, including limited clinical relevance, incomplete evaluations, and insufficient guidance for interactive LVLMs. To address these limitations, we developed the GMAI-MMBench, the most comprehensive general medical AI benchmark with well-categorized data structure and multi-perceptual granularity to date. It is constructed from 284 datasets across 38 medical image modalities, 18 clinical-related tasks, 18 departments, and 4 perceptual granularities in a Visual Question Answering (VQA) format. Additionally, we implemented a lexical tree structure that allows users to customize evaluation tasks, accommodating various assessment needs and substantially supporting medical AI research and applications. We evaluated 50 LVLMs, and the results show that even the advanced GPT-4o only achieves an accuracy of 53.96%, indicating significant room for improvement. Moreover, we identified five key insufficiencies in current cutting-edge LVLMs that need to be addressed to advance the development of better medical applications. We believe that GMAI-MMBench will stimulate the community to build the next generation of LVLMs toward GMAI.
https://arxiv.org/abs/2408.03361
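The lexical-tree customization can be pictured as selecting a subtree of a nested task taxonomy. The tree below is invented for illustration; the real benchmark organizes 284 datasets across modalities, tasks, departments, and perceptual granularities.

```python
# Hypothetical miniature of a lexical task tree; leaves hold dataset ids.
tree = {
    "perception": {
        "modality": {"X-ray": ["vqa-xray-1"], "MRI": ["vqa-mri-1", "vqa-mri-2"]},
    },
    "diagnosis": {"department": {"cardiology": ["vqa-cardio-1"]}},
}

def collect(node):
    """Flatten all dataset ids under a node of the tree."""
    if isinstance(node, list):
        return list(node)
    return [d for child in node.values() for d in collect(child)]

mri_sets = collect(tree["perception"]["modality"]["MRI"])   # one subtree
all_sets = collect(tree)                                    # whole benchmark
```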
17 FinBen: An Holistic Financial Benchmark for Large Language Models
作者:Qianqian Xie, Weiguang Han, Zhengyu Chen, Ruoyu Xiang, Xiao Zhang, Yueru He, Mengxi Xiao, Dong Li, Yongfu Dai, Duanyu Feng, Yijing Xu, Haoqiang Kang, Ziyan Kuang, Chenhan Yuan, Kailai Yang, Zheheng Luo, Tianlin Zhang, Zhiwei Liu, Guojun Xiong, Zhiyang Deng, Yuechen Jiang, Zhiyuan Yao, Haohang Li, Yangyang Yu, Gang Hu, Jiajia Huang, Xiao-Yang Liu, Alejandro Lopez-Lira, Benyou Wang, Yanzhao Lai, Hao Wang, Min Peng, Sophia Ananiadou, Jimin Huang
论文摘要:LLMs have transformed NLP and shown promise in various fields, yet their potential in finance is underexplored due to a lack of comprehensive evaluation benchmarks, the rapid development of LLMs, and the complexity of financial tasks. In this paper, we introduce FinBen, the first extensive open-source evaluation benchmark, including 36 datasets spanning 24 financial tasks, covering seven critical aspects: information extraction (IE), textual analysis, question answering (QA), text generation, risk management, forecasting, and decision-making. FinBen offers several key innovations: a broader range of tasks and datasets, the first evaluation of stock trading, novel agent and Retrieval-Augmented Generation (RAG) evaluation, and three novel open-source evaluation datasets for text summarization, question answering, and stock trading. Our evaluation of 15 representative LLMs, including GPT-4, ChatGPT, and the latest Gemini, reveals several key findings: While LLMs excel in IE and textual analysis, they struggle with advanced reasoning and complex tasks like text generation and forecasting. GPT-4 excels in IE and stock trading, while Gemini is better at text generation and forecasting. Instruction-tuned LLMs improve textual analysis but offer limited benefits for complex tasks such as QA. FinBen has been used to host the first financial LLMs shared task at the FinNLP-AgentScen workshop during IJCAI-2024, attracting 12 teams. Their novel solutions outperformed GPT-4, showcasing FinBen's potential to drive innovation in financial LLMs.
https://arxiv.org/abs/2402.12659
18 Graph Classification via Reference Distribution Learning: Theory and Practice
作者:Zixiao Wang (graduated from the FE program in 2024; currently an RA of Prof. Jicong Fan), Jicong Fan
论文摘要:Graph classification is a challenging problem owing to the difficulty in quantifying the similarity between graphs or representing graphs as vectors, though there have been a few methods using graph kernels or graph neural networks (GNNs). Graph kernels often suffer from computational costs and manual feature engineering, while GNNs commonly utilize global pooling operations, risking the loss of structural or semantic information. This work introduces Graph Reference Distribution Learning (GRDL), an efficient and accurate graph classification method. GRDL treats each graph's latent node embeddings given by GNN layers as a discrete distribution, enabling direct classification without global pooling, based on maximum mean discrepancy to adaptively learned reference distributions. To fully understand this new model (the existing theories do not apply) and guide its configuration (e.g., network architecture, references' sizes, number, and regularization) for practical use, we derive generalization error bounds for GRDL and verify them numerically. More importantly, our theoretical and numerical results both show that GRDL has a stronger generalization ability than GNNs with global pooling operations. Experiments on moderate-scale and large-scale graph datasets show the superiority of GRDL over the state-of-the-art, emphasizing its remarkable efficiency, being at least 10 times faster than leading competitors in both training and inference stages.
https://arxiv.org/abs/2408.11370
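The pooling-free comparison at the heart of GRDL is a maximum mean discrepancy between a graph's node-embedding distribution and a learned reference distribution. A minimal (biased) RBF-kernel MMD, with hand-picked points standing in for learned embeddings:

```python
import math

def mmd2(X, Y, gamma=1.0):
    """Biased squared MMD between point sets X and Y under an RBF kernel."""
    k = lambda a, b: math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    kxx = sum(k(a, b) for a in X for b in X) / len(X) ** 2
    kyy = sum(k(a, b) for a in Y for b in Y) / len(Y) ** 2
    kxy = sum(k(a, b) for a in X for b in Y) / (len(X) * len(Y))
    return kxx + kyy - 2 * kxy

same = mmd2([(0.0,), (1.0,)], [(0.0,), (1.0,)])   # identical sets -> 0
far = mmd2([(0.0,), (1.0,)], [(5.0,), (6.0,)])    # distant sets -> large
```

In GRDL the reference sets play the role of `Y` and are learned jointly with the GNN, so each class's reference can be compared against a graph's node embeddings without any global pooling.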
19 Unsupervised Anomaly Detection in The Presence of Missing Values
作者:Feng Xiao (Prof. Fan's PhD student in the CIE program), Jicong Fan
论文摘要:Anomaly detection methods typically require fully observed data for model training and inference and cannot handle incomplete data, while the missing data problem is pervasive in science and engineering, leading to challenges in many important applications such as abnormal user detection in recommendation systems and novel or anomalous cell detection in bioinformatics, where the missing rates can be higher than 30% or even 80%. In this work, first, we construct and evaluate a straightforward strategy, "impute-then-detect", via combining state-of-the-art imputation methods with unsupervised anomaly detection methods, where the training data are composed of normal samples only. We observe that such two-stage methods frequently yield imputation bias from normal data, namely, the imputation methods are inclined to make incomplete samples "normal", where the fundamental reason is that the imputation models are learned only on normal data and cannot generalize well to abnormal data in the inference stage. To address this challenge, we propose an end-to-end method that integrates data imputation with anomaly detection into a unified optimization problem. The proposed model learns to generate well-designed pseudo-abnormal samples to mitigate the imputation bias and ensure the discrimination ability of both the imputation and detection processes. Furthermore, we provide theoretical guarantees for the effectiveness of the proposed method, proving that the proposed method can correctly detect anomalies with high probability. Experimental results on datasets with manually constructed missing values and inherent missing values demonstrate that our proposed method effectively mitigates the imputation bias and surpasses the baseline methods significantly.
https://neurips.cc/virtual/2024/poster/96230
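The imputation bias described above is easy to reproduce in miniature: an imputer fitted only on normal data fills missing entries with normal-looking values, pulling anomalies toward the normal region. A toy mean-imputer on invented data:

```python
def fit_mean_imputer(rows):
    """Column means over observed (non-None) entries of normal training data."""
    cols = range(len(rows[0]))
    return [sum(r[c] for r in rows if r[c] is not None) /
            sum(1 for r in rows if r[c] is not None) for c in cols]

def impute(row, means):
    """Fill missing (None) entries with the training-set column means."""
    return [m if v is None else v for v, m in zip(row, means)]

normal = [[0.0, 0.1], [0.2, -0.1], [-0.2, 0.0]]
means = fit_mean_imputer(normal)
# An anomaly observed only at (5.0, ?) gets its missing coordinate imputed
# toward the normal cluster, making it easier for a detector to miss.
anomaly = impute([5.0, None], means)
```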
20 SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words
作者:Junyi Ao, Yuancheng Wang, Xiaohai Tian, Dekun Chen, Jun Zhang, Lu Lu, Yuxuan Wang, Haizhou Li, Zhizheng Wu
https://arxiv.org/pdf/2406.13340
21 Language Without Borders: A dataset and benchmark for code-switching lip reading
作者:Xueyi Zhang (SDS intern), Chengwei Zhang, Mingrui Lao, Peng Zhao (SDS intern), Jun Tang, Yanming Guo, Siqi Cai, Xianghu Yue, Haizhou Li
论文摘要:Lip reading aims at transforming videos of continuous lip movement into textual content, and has achieved significant progress over the past decade. It serves as critical yet practical assistance for speech-impaired individuals, and offers greater practicality than speech recognition in noisy environments. With increasing interpersonal communication on social media owing to globalization, the existing monolingual datasets for lip reading may not be sufficient to meet the exponential proliferation of bilingual and even multilingual users. However, to the best of our knowledge, code-switching has only been explored in speech recognition, while attempts in lip reading have been seriously neglected. To bridge this gap, we have collected a bilingual code-switching lip reading benchmark composed of Chinese and English, dubbed CSLR. As a pioneering work, we recruited 62 speakers with proficient foundations in both spoken Chinese and English to express sentences containing both involved languages. Through rigorous criteria in data selection, the CSLR benchmark has accumulated 85,560 video samples with a resolution of 1080x1920, totaling over 71.3 hours of high-quality code-switching lip movement data. To systematically evaluate the technical challenges in CSLR, we implement commonly-used lip reading backbones, as well as competitive solutions in code-switching speech, for benchmark testing. Experiments show CSLR to be a challenging and under-explored lip reading task. We hope our proposed benchmark will extend the applicability of code-switching lip reading, and further contribute to the communities of cross-lingual communication and collaboration. Our dataset and benchmark are accessible at https://github.com/cslr-lipreading/CSLR.
https://nips.cc/virtual/2024/poster/97448
22 HyperLogic: Enhancing Diversity and Accuracy in Rule Learning with HyperNets
作者:Yang Yang, Wendi Ren, Shuang Li
论文摘要:Exploring the integration of if-then logic rules within neural network architectures is an intriguing area. This integration seamlessly transforms the rule learning task into neural network training using backpropagation and stochastic gradient descent. From a well-trained sparse and shallow neural network, one can interpret each layer and neuron through the language of logic rules, and a global explanatory rule set can be directly extracted. However, ensuring interpretability may impose constraints on the flexibility, depth, and width of neural networks. In this paper, we propose HyperLogic: a novel framework leveraging hypernetworks to generate the weights of the main network. HyperLogic can unveil multiple diverse rule sets, each capable of capturing heterogeneous patterns in data. This provides a simple yet effective method to increase model flexibility while preserving interpretability. We theoretically analyze the benefits of HyperLogic by examining the approximation error and generalization capabilities under two types of regularization terms: sparsity and diversity regularizations. Experiments on real data demonstrate that our method can learn more diverse, accurate, and concise rules.
https://neurips.cc/virtual/2024/poster/94153
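The hypernetwork idea can be sketched with linear toy pieces: a hypernetwork maps a latent code to the weights of a small rule-like main network, so different codes yield different rules. All dimensions and values below are illustrative, not the paper's architecture.

```python
def hypernet(H, z):
    """Main-network weights as a (here linear) function of latent code z."""
    return [sum(hij * zj for hij, zj in zip(row, z)) for row in H]

def rule(w, x, thresh=0.5):
    """One-neuron 'rule': fire iff the weighted vote clears the threshold."""
    return int(sum(wi * xi for wi, xi in zip(w, x)) >= thresh)

H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # hypernetwork parameters
rule_a = hypernet(H, [1.0, 0.0])           # code 1 -> weights (1, 0, 1)
rule_b = hypernet(H, [0.0, 1.0])           # code 2 -> weights (0, 1, 1)
fires_a = rule(rule_a, [1, 0, 0])          # rule_a fires on feature 0 alone
fires_b = rule(rule_b, [1, 0, 0])          # rule_b does not
```

Sampling several codes thus yields several distinct rule sets from one trained hypernetwork, which is the diversity mechanism the abstract describes.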
23 Why Transformers Need Adam: A Hessian Perspective
作者:Yushun Zhang, Congliang Chen, Tian Ding, Ziniu Li, Ruoyu Sun, Zhi-Quan Luo
论文摘要:SGD performs worse than Adam by a significant margin on Transformers, but the reason remains unclear. In this work, we provide an explanation through the lens of the Hessian: (i) Transformers are "heterogeneous": the Hessian spectrum varies dramatically across parameter blocks, a phenomenon we call "block heterogeneity"; (ii) Heterogeneity hampers SGD: SGD performs worse than Adam on problems with block heterogeneity. To validate (i) and (ii), we check various Transformers, CNNs, MLPs, and quadratic problems, and find that SGD can perform on par with Adam on problems without block heterogeneity, but performs worse than Adam when the heterogeneity exists. Our initial theoretical analysis indicates that SGD performs worse because it applies a single learning rate to all blocks, which cannot handle the heterogeneity among blocks. This limitation could be ameliorated if we use coordinate-wise learning rates, as designed in Adam.
https://nips.cc/virtual/2024/poster/94790
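The single-learning-rate limitation has a two-parameter caricature: on f(x, y) = (a·x² + b·y²)/2 with a ≫ b, stability forces SGD's one rate below 2/a, which stalls the flat block, while per-coordinate rates (the Adam-style remedy) serve both curvatures. The numbers below are illustrative.

```python
def run(lrs, steps=100, a=100.0, b=1.0):
    """Gradient descent on f(x, y) = (a*x**2 + b*y**2)/2 with per-coordinate rates."""
    x, y = 1.0, 1.0
    for _ in range(steps):
        x -= lrs[0] * a * x    # stiff block, curvature a
        y -= lrs[1] * b * y    # flat block, curvature b
    return x, y

sgd = run([0.01, 0.01])   # one shared rate, capped by the stiff block (2/a = 0.02)
per = run([0.01, 0.9])    # per-block rates handle both curvatures
# sgd leaves y far from 0; per-coordinate rates drive both coordinates to ~0.
```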
24 On the Power of Small-size Graph Neural Networks for Linear Programming
作者:Qian Li, Tian Ding, Linxin Yang, Minghui Ouyang, Qingjiang Shi, Ruoyu Sun
论文摘要:Graph neural networks (GNNs) have recently emerged as powerful tools for addressing complex optimization problems. It has been theoretically demonstrated that GNNs can universally approximate the solution mapping functions of linear programming (LP) problems. However, these theoretical results typically require GNNs to have large parameter sizes. Conversely, empirical experiments have shown that relatively small GNNs can solve LPs effectively, revealing a significant discrepancy between theoretical predictions and practical observations. In this work, we aim to bridge this gap by providing a theoretical foundation for the effectiveness of small-size GNNs. We prove that polylogarithmic-depth, constant-width GNNs are sufficient to solve packing and covering LPs, two widely used classes of LPs. Our proof leverages the capability of GNNs to simulate a variant of the gradient descent algorithm on a carefully selected potential function. Additionally, we introduce a new GNN architecture, termed GD-Net. Experimental results demonstrate that GD-Net significantly outperforms conventional GNN structures while using fewer parameters.
https://nips.cc/virtual/2024/poster/95370
25 SymILO: A Symmetry-Aware Learning Framework for Integer Linear Optimization
作者:Qian Chen, Tianjian Zhang, Linxin Yang, Qingyu Han, Akang Wang, Ruoyu Sun, Xiaodong Luo, Tsung-Hui Chang
论文摘要:Integer linear programs (ILPs) are commonly employed to model diverse practical problems such as scheduling and planning. Recently, machine learning techniques have been utilized to solve ILPs. A straightforward idea is to train a model via supervised learning, with an ILP as the input and an optimal solution as the label. An ILP is symmetric if its variables can be permuted without changing the problem structure, resulting in numerous equivalent and optimal solutions. Randomly selecting an optimal solution as the label can introduce variability in the training data, which may hinder the model from learning stable patterns. In this work, we incorporate the intrinsic symmetry of ILPs and propose a novel training framework called SymILO. Specifically, we modify the learning task by introducing solution permutation along with neural network weights as learnable parameters and then design an alternating algorithm to jointly optimize the loss function. We conduct extensive experiments on ILPs involving different symmetries, and the computational results demonstrate that our symmetry-aware approach significantly outperforms three existing symmetry-agnostic methods -- achieving up to 50.3%, 66.5%, and 45.4% improvements, respectively.
https://neurips.cc/virtual/2024/poster/95146
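The permutation-as-learnable-label step can be sketched directly: before computing the loss, relabel within the problem's symmetry group to best match the current prediction (the "fix weights, optimize the permutation" half of the alternating algorithm). The tiny instance below is invented for illustration.

```python
def best_aligned_label(pred, label, sym_group):
    """Pick the symmetry-equivalent relabeling closest to the prediction."""
    def loss(lab):
        return sum((p - l) ** 2 for p, l in zip(pred, lab))
    return min((tuple(label[i] for i in perm) for perm in sym_group), key=loss)

# Variables 0 and 1 are interchangeable, so label (1, 0, 1) is equivalent
# to (0, 1, 1); align to whichever version the model currently predicts best.
aligned = best_aligned_label(pred=(0.1, 0.9, 1.0), label=(1, 0, 1),
                             sym_group=[(0, 1, 2), (1, 0, 2)])
```

Aligning labels this way removes the arbitrary choice among equivalent optima, which is the variability the abstract identifies as harmful to supervised training.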
26 OptCM: The Optimization Consistency Models for Solving Combinatorial Problems in Few Shots
作者:Yang Li, Jinpei Guo, Runzhong Wang, Hongyuan Zha, Junchi Yan
论文摘要:Diffusion models have recently advanced Combinatorial Optimization (CO) as a powerful backbone for neural solvers. However, their iterative sampling process requiring denoising across multiple noise levels incurs substantial overhead. We propose to learn direct mappings from different noise levels to the optimal solution for a given instance, facilitating high-quality generation with minimal shots. This is achieved through an optimization consistency training protocol, which, for a given instance, minimizes the difference among samples originating from varying generative trajectories and time steps relative to the optimal solution. The proposed Optimization Consistency Models (OptCM) enable fast single-step solution generation while retaining the option of multi-step sampling to trade for sampling quality, which offers a more effective and efficient alternative backbone for neural solvers. In addition, to mitigate the gap between training over historical instances and solving for the new instance, we additionally introduce a novel consistency-based gradient search scheme at test stage, for more effective exploration in the training-phase learned solution space. It is achieved by updating the latent solution probabilities under objective gradient guidance during the alternation of noise injection and denoising steps. Extensive experiments on two popular tasks, Traveling Salesman Problem (TSP) and Maximal Independent Set (MIS), demonstrate the superiority of OptCM regarding both solution quality and efficiency, even outperforming LKH given limited time budgets. Notably, OptCM with merely one-step generation and one-step gradient search can mostly outperform the SOTA diffusion-based counterparts that require hundreds of steps, while achieving tens of times speedup.
https://neurips.cc/virtual/2024/poster/93096
27 Transfer Learning for Diffusion Models
作者: Yidong Ouyang, Liyan Xie, Hongyuan Zha, Guang Cheng
论文摘要:Diffusion models, a specific type of generative model, have achieved unprecedented performance in recent years and consistently produce high-quality synthetic samples. A critical prerequisite for their notable success lies in the presence of a substantial number of training samples, which can be impractical in real-world applications due to high collection costs or associated risks. Consequently, various finetuning and regularization approaches have been proposed to transfer knowledge from existing pre-trained models to specific target domains with limited data. This paper introduces the Transfer Guided Diffusion Process (TGDP), a novel approach distinct from conventional finetuning and regularization methods. We prove that the optimal diffusion model for the target domain integrates pre-trained diffusion models on the source domain with additional guidance from a domain classifier. We further extend TGDP to a conditional version for modeling the joint distribution of data and its corresponding labels, together with two additional regularization terms to enhance the model performance. We validate the effectiveness of TGDP on Gaussian mixture simulations and on real electrocardiogram (ECG) datasets.
https://neurips.cc/virtual/2024/poster/96508