Theory Center Frontier Lecture Series | Livestream: What Should a Good Neural Network Look Like?

2022-10-17 | Author: Microsoft Research Asia

The fifth installment of the Microsoft Research Asia Theory Center Frontier Lecture Series will take place on October 13 from 10:00 to 11:00 a.m. In this session, Weijie Su, Associate Professor in the Department of Statistics and Data Science at the Wharton School, University of Pennsylvania, will give a talk on neural networks. Tune in to the "Microsoft China Video Center" livestream room on Bilibili!

The Theory Center Frontier Lecture Series is a regular livestreamed lecture series hosted by Microsoft Research Asia. It invites researchers working at the frontier of theoretical research around the world to present their findings, covering theoretical advances in big data, artificial intelligence, and related fields. Through this series, we hope to explore the latest frontiers of theoretical research together with you and to build an active theory research community.

Faculty and students interested in theoretical research are welcome to attend the lectures and join the community (see below for how to join), so that together we can advance theoretical research, strengthen interdisciplinary collaboration, help break through the bottlenecks in AI development, and drive substantive progress in computing technology!

Livestream venue: the "Microsoft China Video Center" room on Bilibili
https://live.bilibili.com/730

Lecture schedule: one livestream every two weeks, Thursdays 10:00-11:00 a.m. (any changes will be announced separately)

This lecture: October 13, 10:00-11:00 a.m.

Weijie Su is an Associate Professor in the Wharton Statistics and Data Science Department and, by courtesy, in the Department of Computer and Information Science, at the University of Pennsylvania. He is a co-director of Penn Research in Machine Learning. Prior to joining Penn, he received his Ph.D. from Stanford University in 2016 and his bachelor’s degree from Peking University in 2011. His research interests span privacy-preserving data analysis, deep learning theory, optimization, high-dimensional statistics, and mechanism design. He is a recipient of the Stanford Theodore Anderson Dissertation Award in 2016, an NSF CAREER Award in 2019, an Alfred Sloan Research Fellowship in 2020, the SIAM Early Career Prize in Data Science in 2022, and the IMS Peter Gavin Hall Prize in 2022.

Talk title:
What Should a Good Deep Neural Network Look Like? Insights from a Layer-Peeled Model and the Law of Equi-Separation
Abstract:
In this talk, we will investigate the emergence of geometric patterns in well-trained deep learning models by making use of a layer-peeled model and the law of equi-separation. The former is a nonconvex optimization program that models the last-layer features and weights. We use the model to shed light on the neural collapse phenomenon of Papyan, Han, and Donoho, and to predict a hitherto-unknown phenomenon that we term minority collapse in imbalanced training. This is based on joint work with Cong Fang, Hangfeng He, and Qi Long.
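For readers who have not seen the model, here is a rough sketch of what such a program can look like; the notation is mine and only approximates the formulation in the paper by Fang, He, Long, and Su. Treating the classifier vectors w_k and the last-layer features h_{k,i} of the K classes as free variables, one minimizes the training loss subject to norm budgets:

\min_{\mathbf{W},\,\mathbf{H}} \ \frac{1}{N}\sum_{k=1}^{K}\sum_{i=1}^{n_k} \mathcal{L}\big(\mathbf{W}\mathbf{h}_{k,i},\, y_k\big) \quad \text{s.t.} \quad \frac{1}{K}\sum_{k=1}^{K}\lVert\mathbf{w}_k\rVert^2 \le E_W, \qquad \frac{1}{K}\sum_{k=1}^{K}\frac{1}{n_k}\sum_{i=1}^{n_k}\lVert\mathbf{h}_{k,i}\rVert^2 \le E_H,

where \mathcal{L} is the cross-entropy loss, n_k is the size of class k, N is the total sample size, and E_W, E_H are constants. "Peeling off" all earlier layers in this way makes the geometry of the optimal last-layer features and weights amenable to analysis, which is how phenomena such as neural collapse and minority collapse can be studied.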

The law of equi-separation is a pervasive empirical phenomenon that describes how data are separated according to their class membership from the bottom to the top layer of a well-trained neural network. We will show, through extensive computational experiments, that neural networks improve data separation across layers at a simple exponential rate: each layer improves the separation by a roughly equal ratio, showing that all layers are created equal. We will conclude the talk by discussing the implications of this law for the interpretation, robustness, and generalization of deep learning, as well as for the inadequacy of some existing approaches to demystifying deep learning. This is based on joint work with Hangfeng He.
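As a rough illustration of how one might probe this law empirically, the sketch below quantifies the class separation of a layer's features with a within-/between-class scatter statistic and checks whether it decays geometrically from layer to layer. The specific metric, the function name, and the variables `feats` and `labels` are illustrative assumptions, not necessarily the setup used in the talk.

```python
import numpy as np

def separation_fuzziness(features: np.ndarray, labels: np.ndarray) -> float:
    """A scatter-based measure of class separation: smaller means better separated.

    This statistic (trace of the within-class scatter times the pseudo-inverse
    of the between-class scatter) is an illustrative choice, not necessarily
    the exact metric used in the talk.
    """
    n, d = features.shape
    mu = features.mean(axis=0)
    ss_w = np.zeros((d, d))  # within-class scatter
    ss_b = np.zeros((d, d))  # between-class scatter
    for c in np.unique(labels):
        fc = features[labels == c]
        mu_c = fc.mean(axis=0)
        ss_w += (fc - mu_c).T @ (fc - mu_c) / n
        ss_b += (len(fc) / n) * np.outer(mu_c - mu, mu_c - mu)
    return float(np.trace(ss_w @ np.linalg.pinv(ss_b)))

# Hypothetical usage: feats[l] holds the features after layer l for all samples.
# If the law of equi-separation holds, D_l ≈ D_0 * rho**l for some rho in (0, 1),
# so log(D_l) should decrease roughly linearly in the layer index l.
# ds = [separation_fuzziness(feats[l], labels) for l in range(num_layers)]
```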

In the previous lecture, Professor Jingzhao Zhang from Tsinghua University presented his recent work on neural network optimization. In particular, he pointed out that a considerable gap remains between current theoretical analyses of optimization and practical observations, especially in how smoothness is modeled, and he shared experimental observations of unstable convergence in neural network training. Researchers from Microsoft and members of the external audience raised their own views and questions about current neural network optimization, which Professor Zhang addressed.

Replay:
https://www.bilibili.com/video/BV19N4y1N7UE/?spm_id_from=333.788&vd_source=2ea2e8f7446a0ea90077df7f40d1790f

You are welcome to scan the QR code to join the theory research community and exchange ideas with researchers who care about theoretical research. The latest information about the Microsoft Research Asia Theory Center Frontier Lecture Series will also be shared in the group.

You can also subscribe to lecture updates by sending an email with the subject line "Subscribe the Lecture Series" to MSRA.TheoryCenter@outlook.com.
