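Theory Center Frontier Lecture Series | Live: The Capability Limits of Foundation Models from a Category-Theoretic Perspective

2023-08-21 | Author: Microsoft Research Asia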

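The eleventh lecture in the Microsoft Research Asia Theory Center Frontier Lecture Series will take place on Thursday, August 24, from 10:30 to 11:30 a.m.

For this lecture, we have invited Yang Yuan, assistant professor at the Institute for Interdisciplinary Information Sciences, Tsinghua University, to give a talk titled "On the Power of Foundation Models". Tune in to the "微软科技" live room on Bilibili.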

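The Theory Center Frontier Lecture Series is a standing series of live lectures hosted by Microsoft Research Asia. It invites researchers at the forefront of theoretical research worldwide to present their findings, covering theoretical advances in big data, artificial intelligence, and related fields. Through this series, we hope to explore the latest findings in theoretical research together with you and to build an active theory research community.

Faculty and students interested in theoretical research are welcome to attend the lectures and join the community (see below for how to join), so that together we can advance theoretical research, strengthen interdisciplinary collaboration, help break through bottlenecks in AI development, and achieve substantive progress in computing technology.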

Live stream information

Live stream: the "微软科技" live room on Bilibili

https://live.bilibili.com/730

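If you would like to interact with the speaker, you are welcome to join via Teams.

Meeting link: https://wxdlj.cn/63hva

Meeting ID: 249 516 746 294

Meeting passcode: RgqRLW

Live stream time: Thursday, August 24, 10:30-11:30 a.m.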


Lecture information

Yang Yuan is now an assistant professor at IIIS, Tsinghua. He finished his undergraduate study at Peking University in 2012. Afterwards, he received his PhD at Cornell University in 2018, advised by Professor Robert Kleinberg. Before joining Tsinghua, he spent one year at MIT Institute for Foundations of Data Science (MIFODS) as a postdoc researcher. He works on AI+Healthcare, AI Theory and Applied Category Theory.

Talk title:

On the Power of Foundation Models

Abstract:
With infinitely many high-quality data points, infinite computational power, an infinitely large foundation model with a perfect training algorithm and guaranteed zero generalization error on the pretext task, can the model be used for everything? This question cannot be answered by the existing theory of representation, optimization or generalization, because the issues they mainly investigate are assumed to be nonexistent here. In this paper, we show that category theory provides powerful machinery to answer this question. We have proved three results. The first one limits the power of prompt-based learning, saying that the model can solve a downstream task with prompts if and only if the task is representable. The second one says fine tuning does not have this limit, as a foundation model with the minimum required power (up to symmetry) can theoretically solve downstream tasks for the category defined by pretext task, with fine tuning and enough resources. Our final result can be seen as a new type of generalization theorem, showing that the foundation model can generate unseen objects from the target category (e.g., images) using the structural information from the source category (e.g., texts). Along the way, we provide a categorical framework for supervised and self-supervised learning, which might be of independent interest.

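Previous lecture recap

In the previous lecture, Francesco Orabona, associate professor in the Department of Electrical and Computer Engineering at Boston University, gave a talk titled "Understanding Adam and AdamW through proximal updates, scale-freeness, and relaxed smoothness". He discussed the distinctive properties of Adam and AdamW, the algorithms most commonly used to train deep neural networks.

Lecture replay: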

https://www.bilibili.com/video/BV1d94y1r7Zd/

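Join the theory research community

Scan the QR code to join the theory research community and exchange ideas with other researchers who follow theoretical research. The group will also share the latest news about the Microsoft Research Asia Theory Center Frontier Lecture Series.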

[WeChat group QR code]

You can also subscribe to lecture updates by sending an email with the subject "Subscribe the Lecture Series" to MSRA.TheoryCenter@outlook.com.

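About the Microsoft Research Asia Theory Center

The Microsoft Research Asia Theory Center was formally established in December 2021. By building a hub for international academic exchange and collaboration, it aims to foster the deep integration of theoretical research with big data and artificial intelligence technologies. In doing so, it seeks to advance theoretical research, strengthen interdisciplinary collaboration, help break through bottlenecks in AI development, and achieve substantive progress in computing technology. The center currently brings together members from different teams and research backgrounds within Microsoft Research Asia, and focuses on foundational problems in areas including deep learning, reinforcement learning, dynamical systems learning, and data-driven optimization.

For more information about the Theory Center, please visit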

MSR Asia Theory Center