Peking University · PAIR-Lab

Yaodong Yang 杨耀东 · Boya Young Scholar

Assistant Professor, PKU · Chief Scientist, PKU-PsiBot Lab

Dr. Yaodong Yang is an Assistant Professor (Boya Young Scholar) at the Institute for Artificial Intelligence, Peking University, and Chief Scientist of the PKU–PsiBot Joint Laboratory. His research focuses on experience learning and alignment of AI and embodied agents, aiming to advance the trustworthy deployment and real-world alignment of large models, spanning the areas of reinforcement learning, AI alignment, and embodied intelligence.

He has published over 200 papers in leading journals and conferences, including Nature Machine Intelligence, Cell Matter, Artificial Intelligence Journal, and IEEE TPAMI, with more than 15,000 Google Scholar citations. Since 2022, he has been ranked as the top scholar in AI & ML at Peking University according to CSRankings.

Dr. Yang has received numerous honors, including the ACL 2025 Best Paper Award, UKRI 2026 Best Paper Award in AI, ICCV 2023 Best Paper Finalist, CoRL 2020 Best System Paper Award, and the AAMAS 2021 Blue Sky Idea Award.

He was named to the MIT Technology Review "AI 100 Young Innovators" list and the 2025 Forbes China Technology & Innovation Innovative Leader list, and received the WAIC 2022 "Yunfan Star Award" and the ACM SIGAI China Rising Star Award. His work has been featured by CCTV, People's Daily, Xinhua News, the National Natural Science Foundation of China (NSFC), and MIT Technology Review.

He serves as an Area Chair for major conferences including ICML, ICLR, NeurIPS, AAAI, IJCAI, AAMAS, and IROS, and as an Associate Editor for Scientific Reports, Transactions on Machine Learning Research, and Neural Networks.

Previously, Dr. Yang was an Assistant Professor at King's College London, a Principal Researcher at Huawei Research U.K., and a Senior Manager at AIG. He received his B.Sc. from the University of Science and Technology of China, M.Sc. from Imperial College London, and Ph.D. from University College London, where he was the university's sole nominee for the ACM SIGAI Doctoral Dissertation Award.

CSRankings · #1 PKU AI+ML | Best Paper Award · Five times | Elsevier · World Top 2% Scientist
200+ Publications: Nature MI · Matter · JMLR · TPAMI
15k+ Citations: Google Scholar · h-index 60
#1 PKU AI/ML Rank since 2022: CSRankings · AIRankings
5+ Best-Paper-Level Awards: ACL · UKRI · CoRL · ICCV · AAMAS

News · 近期动态

Headlines · recent updates

2026 · 04

7 papers accepted at ACL 2026.

2026 · 02

9 papers accepted at AAAI 2026 / ICLR 2026 / AAMAS 2026 / ICRA 2026.

2025 · 09

11 papers accepted at NeurIPS 2025 (2 Spotlights).

2025 · 05

6 papers accepted at ACL 2025; 2 papers accepted at ICML 2025.

2025 · 01

5 papers accepted at ICLR 2025.

2024 · 12

5 papers accepted at AAAI 2025; 2 at AAMAS 2025.

2024 · 10

Invited talk "Can LLMs be aligned?" at CNCC 2024.

2024 · 09

5 papers accepted at NeurIPS 2024.

2024 · 08

2 papers accepted at CoRL 2024.

2024 · 05

Delivered the VALSE 2024 annual progress report on alignment; 3 papers accepted at ICML 2024.

2024 · 03

Co-signed the Beijing AI Safety Declaration with leading scientists.

2024 · 02

Featured on CCTV's "Focus Report" (焦点访谈), a national TV report on AI Safety.

2024 · 01

5 papers accepted at ICLR 2024; 1 at TPAMI.

2023 · 12

3 papers accepted at AAAI 2024.

2023 · 11

Released the AI Alignment Survey.

2023 · 10

Paper on the ICCV 2023 Best Paper Initial List (top 17 / 8260).

2023 · 09

6 papers accepted at NeurIPS 2023; 2 at JMLR and TMLR.

2023 · 06

TorchOpt officially joined the PyTorch Ecosystem.

2023 · 05

4 papers accepted at ICML 2023.

2023 · 02

2 papers accepted at ICRA 2023; 1 at ICLR 2023.

2023 · 01

1 paper accepted at JAAMAS and 1 at AAMAS 2023.

2022 · 12

NeurIPS 2022 MyoChallenge — 1st place (1 / 340 teams).

2022 · 11

National Science Review paper on Nash equilibrium complexity; 3 papers accepted at AAAI 2023.

2022 · 09

7 papers accepted at NeurIPS 2022.

2022 · 05

1 paper accepted at IJCAI 2022.

2022 · 04

TorchOpt and Bi-DexHands open-sourced.

2022 · 01

2 papers accepted at ICLR 2022.

2020 · 10

SMARTS platform released; CoRL 2020 Best System Paper Award.

2020 · 06

1 paper accepted at ICML 2020.

2020 · 05

1 paper accepted at IJCAI 2020.

2020 · 02

1 paper accepted at AAMAS 2020.

Research · 研究方向

Five directions · methods, benchmarks, and open-source systems

01 / Alignment & Safety

LLM Alignment & Safety 大模型对齐与安全

RLHF, preference learning, safe alignment, red-teaming and interpretability. Principled methods and open benchmarks — BeaverTails, PKU-SafeRLHF, Stream Aligner, Libra-Leaderboard — to make LLMs robustly helpful and harmless.

02 / Embodied AI

Embodied AI & Robot Learning 具身智能与机器人学习

Dexterous manipulation, vision-language-action models, and sim-to-real. From Bi-DexHands and ClutterDexGrasp to DexGraspVLA and Safe VLA — pursuing human-level generalist robotic agents.

03 / MARL

Multi-Agent RL 多智能体强化学习

Cooperative and competitive MARL, policy gradient theory, Nash equilibria. HARL, MAT, MARLlib — algorithms that scale to hundreds of agents.

04 / Agentic AI

Agentic AI & Social Simulation 智能体与社会仿真

LLM-based agents for macroeconomic modelling, social value orientation, negotiation and consensus. World models unifying physical and social dynamics.

05 / AI for Science

AI for Science AI 赋能科学

RL and LLMs applied to medicine, physics, materials (carbon-nanotube synthesis), and operations — featured in Cell iScience, Matter, and National Science Review.

Press · 媒体报道

National coverage · CCTV · Xinhua · NSFC · MIT Tech Review

📺 CCTV · China Central Television · Three national TV features

Awards · 获奖

Best papers · talent programs · academic honors · competitions

I. Best-Paper Awards 最佳论文奖 5 awards
2026

UKRI Best Research Paper in AI

Efficient and Scalable Reinforcement Learning for Large-Scale Network Control · Nature Machine Intelligence

2025

ACL 2025 Best Paper Award

Language Models Resist Alignment: Evidence From Data Compression

2023

ICCV 2023 Best Paper Finalist

UniDexGrasp: Universal Robotic Dexterous Grasping via Learning Diverse Proposal Generation and Goal-Conditioned Policy

2021

AAMAS 2021 Blue-Sky Idea Award

Diverse Auto-Curriculum is Critical for Successful Real-World Multiagent Learning Systems

2020

CoRL 2020 Best System Paper Award

SMARTS: An Open-Source Scalable Multi-Agent RL Training School for Autonomous Driving

II. Talent Programs 人才项目 3 programs
2024

National Young Talent

NSFC Excellent Young Scientist

2022

High-Level Overseas Talent

Ministry of Human Resources — 30 nationwide

2023

CAST Youth Talent Support Program

CAAI — 6 selected nationally

III. Academic Honors 学术荣誉 5 honors
2025

Elsevier / Stanford World Top 2% Scientists

Global Top 2% career-impact ranking

2025

MIT Tech Review — AI 100 Young Innovators

麻省理工科技评论「AI 100 青年先锋」

2026

Forbes China — Innovation Leader

福布斯中国科创革新力人物

2022

ACM SIGAI China Rising Star Award

Three awardees nationwide

2022

WAIC Yunfan Award — Rising Star

Ten awardees nationwide

IV. Competitions & Industry 竞赛与产业 4 awards
2026

Wu Wenjun AI S&T Award · 2nd Prize (2025)

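Wu Wenjun AI Science and Technology Award · Science and Technology Progress Award, Second Prize, for "Key Technologies and Applications of Knowledge-Enhanced Trustworthy Multimodal Interaction"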

2025

CMSA Meteorological Tech Invention Award · 1st Prize

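China Meteorological Service Association · Meteorological Technology Invention Award, First Prize, for research on navigation path planning for extreme-wind meteorological emergency rescue that integrates BeiDou and artificial intelligence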

2022

NeurIPS 2022 MyoChallenge · Winner

Physiological dexterity manipulation · 1 / 340 teams

2025

Digital China Innovation Contest · AI Track 1st Prize

数字中国创新大赛人工智能赛道全国一等奖

Mentorship · 学生培养

Highest PKU student honors · Apple & Tencent fellowships · NSFC grants

2024 Highest Student Honor · PKU

PKU May-4th Medal
北京大学五四奖章

Yiran Geng 耿逸然
PKU's highest honor for students · sole recipient among all undergraduate STEM majors.
2024 University-Wide · PKU

PKU Annual Figures
北京大学年度人物

Jiaming Ji 吉嘉铭 · Boyuan Chen 陈博远
Two PAIR-Lab students named PKU Annual Figures — one of the most prestigious annual recognitions at Peking University.
2025 Industry Fellowship · Apple

Apple Scholars in AI/ML

Jiaming Ji 吉嘉铭
Apple PhD Fellowship (2025) — one of only 12 scholars selected globally.
2025 Industry Fellowship · Tencent

Tencent Hunyuan Scholar
腾讯混元学者

Jiaming Ji 吉嘉铭
Tencent's flagship PhD fellowship for top AI students in China.
2024 NSFC · PhD Student Grant

NSFC Young Student Basic Research (PhD)

Jiaming Ji 吉嘉铭
Sole PhD awardee in PKU's AI direction for the NSFC Basic Research Project for Young Students (国自然青年学生基础研究项目).
2024 NSFC · Undergraduate Grant

NSFC Young Student Basic Research (UG)

Tianyi Qiu 邱天异
One of only two undergraduates in PKU's AI direction to receive this grant.
My Teaching · 本人获奖
2026

PKU Teaching Achievement Award · 2nd Prize (2025)

For the course "Foundations and Alignment of Large Language Models" (《大语言模型基础与对齐》).

2025

Digital China Innovation Contest · AI Track 1st Prize

2025 Digital China Innovation Contest, AI Track, national first prize.

2025

ICBC Teaching Award · PKU

Industrial and Commercial Bank of China (ICBC) Teaching Award · Peking University, 2025

2022–

Class Advisor · Yuanpei AGI Experimental Class

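Class advisor for the 2022 cohort of the Yuanpei College "Artificial General Intelligence Experimental Class" · Teaching committee member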

2023 – 2025

Outstanding Undergraduate Research Supervisor · PKU

Awarded three years in a row (2023, 2024, 2025) by Peking University.

Publications · 论文发表

Representative works

2025 2 papers
ALN
Language Models Resist Alignment: Evidence From Data Compression *
Jiaming Ji, Kaile Wang, Tianyi Alex Qiu, Boyuan Chen, Jiayi Zhou, Changye Li, Hantao Lou, Josef Dai, Yunhuai Liu, Yaodong Yang#
ACL 2025 ★ Best Paper
Alignment Theory · Alignment · LLM
ALN
Safe VLA: Towards Safety Alignment of Vision-Language-Action Model via Safe Reinforcement Learning *
Borong Zhang, Yuhao Zhang, Jiaming Ji, Yingshan Lei, Josef Dai, Yuanpei Chen, Yaodong Yang#
NeurIPS 2025 Spotlight
Safe VLA · VLA · Safe RL · Safety · Alignment
2024 7 papers
EMB
ASP: Learn a Universal Neural Solver *
Chenguang Wang, Zhouliang Yu, Stephen McAleer, Tianshu Yu, Yaodong Yang#
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
Combinatorial Optimization
ALN
Aligner: Efficient Alignment by Learning to Correct *
Jiaming Ji, Boyuan Chen, Hantao Lou, Donghai Hong, Borong Zhang, Xuehai Pan, Juntao Dai, Yaodong Yang#
NeurIPS 2024 Oral
Aligner · Alignment
EMB
Bi-DexHands: Towards Human-Level Bimanual Dexterous Manipulation *
Yuanpei Chen, Yiran Geng, Fangwei Zhong, Jiaming Ji, Jiechuang Jiang, Zongqing Lu, Hao Dong, Yaodong Yang#
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
Bimanual · Dexterous Manipulation · Robotics
AI4
Efficient and scalable reinforcement learning for large-scale network control *
Chengdong Ma, Aming Li, Yali Du, Hao Dong, Yaodong Yang#
Nature Machine Intelligence ★ Best Paper
Network Control · Reinforcement Learning
MRL
Heterogeneous-Agent Reinforcement Learning *
Yifan Zhong, Jakub Grudzien Kuba, Xidong Feng, Siyi Hu, Jiaming Ji, Yaodong Yang#
Journal of Machine Learning Research (JMLR)
HARL · Reinforcement Learning
ALN
OmniSafe: An infrastructure for accelerating safe reinforcement learning research *
Jiaming Ji, Jiayi Zhou, Borong Zhang, Juntao Dai, Xuehai Pan, Ruiyang Sun, Weidong Huang, Yiran Geng, Mickel Liu, Yaodong Yang#
Journal of Machine Learning Research (JMLR)
OmniSafe · Safe RL · Reinforcement Learning
AI4
Transforming the synthesis of carbon nanotubes with machine learning models and automation *
Yue Li, Shurui Wang, Zhou Lv, Zhaoji Wang, Yunbiao Zhao, Ying Xie, Yang Xu, Liu Qian, Yaodong Yang#, Ziqiang Zhao#, Jin Zhang#
Matter (Cell Press)
Carbon Nanotubes · Materials Synthesis
2023 4 papers
MRL
MARLlib: A Multi-agent Reinforcement Learning Library *
Siyi Hu, Yifan Zhong, Minquan Gao, Weixun Wang, Hao Dong, Xiaodan Liang, Zhihui Li, Xiaojun Chang, Yaodong Yang#
Journal of Machine Learning Research (JMLR)
MARLlib · Multi-Agent RL · Reinforcement Learning
MRL
On the complexity of computing Markov perfect equilibrium in general-sum stochastic games *
Xiaotie Deng, Ningyuan Li, David Mguni, Jun Wang, Yaodong Yang#
National Science Review
Nash Equilibrium · Stochastic Games
ALN
Safe multi-agent reinforcement learning for multi-robot control *
Shangding Gu, Jakub Grudzien Kuba, Yuanpei Chen, Yali Du, Long Yang, Alois C. Knoll, Yaodong Yang#
Artificial Intelligence Journal (AIJ)
Multi-Agent RL · Robotics · Reinforcement Learning
MRL
TorchOpt: An Efficient Library for Differentiable Optimization *
Jie Ren, Xidong Feng, Bo Liu, Xuehai Pan, Yao Fu, Luo Mai, Yaodong Yang#
Journal of Machine Learning Research (JMLR)
Differentiable Optimization
2021 1 paper
MRL
Diverse Auto-Curriculum is Critical for Successful Real-World Multiagent Learning Systems *
Yaodong Yang, Jun Luo, Ying Wen, Oliver Slumbers, Daniel Graves, Haitham Bou Ammar, Jun Wang, Matthew E. Taylor
AAMAS 2021 ★ Best Paper
Auto-Curriculum · Multi-Agent RL
2020 1 paper
EMB
SMARTS: An Open-Source Scalable Multi-Agent RL Training School for Autonomous Driving
Ming Zhou, Jun Luo, Julian Villella, Yaodong Yang, David Rusu, Jiayu Miao, Weinan Zhang, Montgomery Alban, Iman Fadakar, Zheng Chen, Aurora Chongxi Huang, Ying Wen, Kimia Hassanzadeh, Daniel Graves, Dong Chen, Zhengbang Zhu, Nhat Nguyen, Mohamed Elsayed, Kun Shao, Sanjeevan Ahilan, Baokuan Zhang, Jiannan Wu, Zhengang Fu, Kasra Rezaee, Peyman Yadmellat, Mohsen Rohani, Nicolas Perez Nieves, Yihan Ni, Seyedershad Banijamali, Alexander Cowen Rivers, Zheng Tian, Daniel Palenicek, Haitham bou Ammar, Hongbo Zhang, Wulong Liu, Jianye Hao, Jun Wang
CoRL 2020 ★ Best Paper
SMARTS · Autonomous Driving · Multi-Agent RL

Service · 学术服务

Area Chair · Associate Editor · Program Chair

Area Chair
  • NeurIPS CCF-A
  • ICML CCF-A
  • ICLR CCF-A
  • AAAI CCF-A
  • IJCAI CCF-A
  • AAMAS — Senior AC CCF-B
  • IROS CCF-C
Associate Editor
  • Neural Networks (Elsevier) CCF-B
  • Transactions on Machine Learning Research TMLR
  • Scientific Reports Nature
Program / Publicity Chair
  • World Artificial Intelligence Conference Academic (WAICA) 2026 · Shanghai Publicity Chair
  • Distributed AI Conference (DAI) 2024 · Singapore Program Chair

Experience · 履历

USTC · Imperial · UCL · AIG · KCL · PKU

2022 – Now
Assistant Professor (Boya Young Scholar)
Peking University · Institute for AI 北京大学人工智能研究院
Chief Scientist, PKU–PsiBot Joint Laboratory · PI, PAIR-Lab
2021 – 2022
Assistant Professor
King's College London · Department of Informatics 伦敦国王大学
2019 – 2021
Principal Researcher
Huawei U.K. · London Research Centre 华为英国研究院
2020 Best Technology Breakthrough Award (sole awardee)
2015 – 2019
Senior Science Manager
American International Group (AIG) · Science Dept. 美国国际集团
2016 – 2021
Ph.D. · Computer Science
University College London (UCL) 伦敦大学学院
Thesis: Many-Agent Reinforcement Learning · Advisors: Jun Wang & John Shawe-Taylor
2013 – 2014
M.Sc. · Quantitative Biology
Imperial College London 伦敦帝国理工学院
2009 – 2013
B.Eng. · Electronic Engineering & Information Science
University of Science & Technology of China (USTC) 中国科学技术大学
§ Join the Lab · 招生招聘

Come work on the hardest problems in safe and trustworthy AGI.

PhD · 2027 Prospective PhD in:
Peking University · 0 · FULL, no spots this cycle
Zhongguancun Academy · Multiple · Open, accepting applications
Three research directions · 三大方向

Embodied Intelligence · Dexterous Manipulation · Robot Foundation Models

Sim-to-real policy learning for high-DoF dexterous manipulation; embodied foundation models that act in the physical world. Joint work with PsiBot.

World Models · Physics Foundation Models · Sim-to-Real Alignment

Build world models that capture both physical and social dynamics; align simulators with the real world for downstream policy training. Joint work with Neo Matrix.

LLM Post-Training & Alignment

RLHF / DPO / Safe-RLHF · reward modeling · interpretability · multi-modal & multilingual safety. Connecting alignment theory to practice at scale.

PAIR-Lab also welcomes master's students, visiting scholars, undergraduate research interns, and postdocs. If you are fascinated by reinforcement learning, LLM alignment, multi-agent systems, or embodied intelligence — and want to build safe and trustworthy AGI that ships — please read the starter materials above and reach out.