Theory Center Frontier Lecture Series | Live: Reward-Free Reinforcement Learning via Sample-Efficient Representation Learning

2022-12-19 | By: Microsoft Research Asia

The seventh installment of the Microsoft Research Asia Theory Center Frontier Lecture Series will take place on Thursday, December 22, from 10:00 to 11:00 a.m. For this installment we have invited Yingbin Liang, Professor in the Department of Electrical and Computer Engineering at the Ohio State University, to give a talk titled "Reward-Free Reinforcement Learning via Sample-Efficient Representation Learning." Please tune in to the "Microsoft China Video Center" live room on Bilibili!

The Frontier Lecture Series is an ongoing series of live lectures hosted by the Microsoft Research Asia Theory Center. It invites researchers working at the frontier of theory worldwide to present their findings, covering theoretical advances in big data, artificial intelligence, and related fields. Through this series, we hope to explore the latest frontiers of theoretical research together and build an active theory research community.
Teachers and students interested in theoretical research are welcome to attend the lectures and join the community (see below for how to join), to jointly advance theoretical research, strengthen interdisciplinary collaboration, and help break through the bottlenecks holding back AI toward substantive progress in computing.

Live stream: "Microsoft China Video Center" room on Bilibili
https://live.bilibili.com/730

Time: December 22, 10:00-11:00

Scan the QR code to go directly to the live room


Dr. Yingbin Liang is currently a Professor in the Department of Electrical and Computer Engineering at the Ohio State University (OSU) and a core faculty member of the Ohio State Translational Data Analytics Institute (TDAI). She also serves as the Deputy Director of the AI-EDGE Institute at OSU. Dr. Liang received her Ph.D. in Electrical Engineering from the University of Illinois at Urbana-Champaign in 2005, and served on the faculty of the University of Hawaii and Syracuse University before joining OSU. Her research interests include machine learning, optimization, information theory, and statistical signal processing. Dr. Liang received the National Science Foundation CAREER Award and the State of Hawaii Governor Innovation Award in 2009, as well as the EURASIP Best Paper Award in 2014.

Title:
Reward-free Reinforcement Learning via Sample-Efficient Representation Learning

Abstract:
As reward-free reinforcement learning (RL) becomes a powerful framework for a variety of multi-objective applications, representation learning arises as an effective technique for dealing with the curse of dimensionality in reward-free RL. However, the existing representation learning algorithms for reward-free RL, although polynomially efficient, still suffer from high sample complexity. In this talk, I will first present a novel representation learning algorithm that we propose for reward-free RL. We show that this algorithm provably finds a near-optimal policy and attains near-accurate system identification via reward-free exploration, with significantly improved sample complexity over the best previously known result. I will then present our characterization of the benefit of representation learning in reward-free multitask (a.k.a. meta) RL, as well as the benefit of transferring the learned representation from upstream to downstream tasks. I will conclude the talk with remarks on future directions.
The work to be presented is joint work with Yuan Cheng (USTC), Ruiquan Huang (PSU), Dr. Songtao Feng (OSU), Prof. Jing Yang (PSU), and Prof. Hong Zhang (USTC).
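The reward-free paradigm in the abstract separates exploration from planning: the agent first explores without observing any reward, and only afterwards, once a reward function is revealed, plans on the model it learned. The sketch below illustrates this two-phase structure in a toy tabular MDP, using uniform random exploration and finite-horizon value iteration. It is only a minimal illustration of the paradigm, not the representation learning algorithm presented in the talk; all sizes, the exploration policy, and the planning routine are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
S, A, H = 5, 2, 10  # states, actions, horizon (toy sizes)

# Ground-truth transition kernel: P[s, a] is a distribution over next states.
P = rng.dirichlet(np.ones(S), size=(S, A))

# Phase 1: reward-free exploration (here simply uniform random actions)
# to build an empirical model P_hat from visitation counts.
counts = np.zeros((S, A, S))
for _ in range(5000):                     # episodes of exploration
    s = 0
    for _ in range(H):
        a = rng.integers(A)
        s_next = rng.choice(S, p=P[s, a])
        counts[s, a, s_next] += 1
        s = s_next
P_hat = (counts + 1e-6) / (counts + 1e-6).sum(axis=2, keepdims=True)

# Phase 2: given ANY reward function revealed after exploration,
# plan with finite-horizon value iteration on the learned model.
def plan(reward):                          # reward: (S, A) array in [0, 1)
    V = np.zeros(S)
    for _ in range(H):
        Q = reward + P_hat @ V             # (S, A) action values
        V = Q.max(axis=1)
    return V[0]                            # value of the start state

reward = rng.random((S, A))
print(plan(reward))
```

Because phase 1 never touches rewards, the same learned model `P_hat` can serve any number of downstream reward functions, which is what makes the framework attractive for multi-objective and multitask settings.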

In the previous lecture, Prof. Shuai Li from Shanghai Jiao Tong University shared the team's latest work on multi-armed bandit algorithms in matching markets. In particular, for the problem of finding the player-optimal matching, Prof. Li presented the team's newest algorithm and proved that it achieves regret that is polynomial in the players' preference gaps; it is the first algorithm to attain regret of this order without requiring additional assumptions. After the talk, the audience asked whether the algorithm's performance could be improved further, and Prof. Li answered their questions.
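For readers unfamiliar with the setting, bandit learning in matching markets combines two ingredients: players learn their own preferences over arms from noisy rewards, while conflicts (two players proposing to the same arm) are resolved by the arms' fixed preferences, typically via Gale-Shapley deferred acceptance. The sketch below shows a simple centralized UCB variant in this spirit; it is a hypothetical illustration of the problem setting, not Prof. Li's algorithm, and all sizes and parameters are made-up assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
N, K, T = 3, 3, 2000  # players, arms, rounds (toy sizes)

# True mean reward of arm a for player p: the players' unknown preferences.
mu = rng.random((N, K))
# Arms' fixed preferences over players: arm_rank[a][p] = rank (lower = better).
arm_rank = np.array([rng.permutation(N) for _ in range(K)])

def deferred_acceptance(pref_order):
    """Player-proposing Gale-Shapley; pref_order[p] lists arms best-first."""
    nxt = [0] * N            # next arm each player will propose to
    holder = [-1] * K        # player currently held by each arm
    free = list(range(N))
    while free:
        p = free.pop()
        a = pref_order[p][nxt[p]]
        nxt[p] += 1
        q = holder[a]
        if q == -1:
            holder[a] = p                      # arm was unclaimed
        elif arm_rank[a][p] < arm_rank[a][q]:
            holder[a] = p                      # arm prefers p; q is rejected
            free.append(q)
        else:
            free.append(p)                     # arm rejects p
    match = [-1] * N
    for a, p in enumerate(holder):
        if p != -1:
            match[p] = a
    return match

# Bandit loop: each round, match players to arms via deferred acceptance on
# optimistic (UCB) preference orders, then update estimates from feedback.
sums = np.zeros((N, K))
counts = np.zeros((N, K))
for t in range(1, T + 1):
    n = np.maximum(counts, 1)
    ucb = sums / n + np.sqrt(2 * np.log(t) / n)
    order = [list(np.argsort(-ucb[p])) for p in range(N)]
    match = deferred_acceptance(order)
    for p, a in enumerate(match):
        if a != -1:
            counts[p, a] += 1
            sums[p, a] += mu[p, a] + 0.1 * rng.standard_normal()

mu_hat = sums / np.maximum(counts, 1)
```

As the UCB estimates concentrate, the proposed preference orders stabilize toward the players' true orders, and deferred acceptance then repeatedly returns the player-optimal stable matching; the hard part, which the presented work addresses theoretically, is bounding how much regret accumulates before that happens.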


Scan the QR code to join the theory research community and exchange ideas with researchers interested in theoretical work; the latest information on the Microsoft Research Asia Theory Center Frontier Lecture Series will also be shared in the group.

[WeChat group QR code]

You can also subscribe to lecture updates by sending an email to
MSRA.TheoryCenter@outlook.com with the subject "Subscribe the Lecture Series".


About the Microsoft Research Asia Theory Center

In December 2021, the Microsoft Research Asia Theory Center was formally established. By building a hub for international academic exchange and collaboration, the center aims to promote the deep integration of theoretical research with big data and artificial intelligence technologies, advancing theoretical research while strengthening interdisciplinary collaboration, helping break through the bottlenecks holding back AI, and driving substantive progress in computing. The center currently brings together members from different teams and research backgrounds within Microsoft Research Asia, focusing on fundamental problems in deep learning, reinforcement learning, dynamical-system learning, and data-driven optimization.
