About Me
Dr. Yaodong Yang is an Assistant Professor (Boya Young Scholar) at the Institute for Artificial Intelligence, Peking University, Director of the AI Safety Centre at BAAI, and Chief Scientist of the PKU–PsiBot Joint Laboratory. He is a recipient of the NSFC Excellent Young Scientists Fund (Overseas), a High-Level Overseas Talent recognized by China's Ministry of Human Resources and Social Security, and a member of the CAST Young Elite Scientists Sponsorship Program. His research focuses on interactive learning and alignment of AI agents, aiming to advance the trustworthy deployment and real-world alignment of large models, spanning reinforcement learning, AI alignment, and embodied intelligence. He has published over 200 papers in leading journals and conferences, including Nature Machine Intelligence, Cell Matter, Artificial Intelligence Journal, and IEEE TPAMI, with more than 12,000 Google Scholar citations. Since 2022, he has been ranked the top scholar in Artificial Intelligence and Machine Learning at Peking University according to CSRankings, and he is listed among the Scopus top 2% of scientists worldwide. Dr. Yang has received numerous honors, including the ACL 2025 Best Paper Award, a place on the ICCV 2023 Best Paper initial list, the CoRL 2020 Best System Paper Award, and the AAMAS 2021 Blue Sky Idea Award. He was named to the MIT Technology Review "AI 100 Young Innovators," received the WAIC 2022 "Yunfan Star Award," and the ACM SIGAI China Rising Star Award. His work has been featured by CCTV (Focus Report), Xinhua News, the National Natural Science Foundation of China (NSFC), and MIT Technology Review. He serves as an Area Chair for major conferences including ICML, ICLR, NeurIPS, AAAI, IJCAI, AAMAS, and IROS, and as an Associate Editor for Scientific Reports, Transactions on Machine Learning Research, and Neural Networks. He leads more than fifty research projects funded by the NSFC, the Ministry of Science and Technology, the Beijing Municipal Science and Technology Commission, and joint university–industry laboratories. Previously, Dr. Yang was an Assistant Professor at King's College London, a Principal Researcher at Huawei Research U.K., and a Senior Manager at AIG. He received his B.Sc. from the University of Science and Technology of China, M.Sc. from Imperial College London, and Ph.D. from University College London, where he was the university's sole nominee for the ACM SIGAI Doctoral Dissertation Award.
Research directions at the Alignment and Interaction Lab at Peking University (PAIR-Lab) include:
-
PAIR-Lab PhD openings for 2026: <0
-
Year-round openings for paid reinforcement learning interns / visiting scholars
AI alignment (RLHF, game theory, control theory)
Dexterous bimanual manipulation with reinforcement learning (RL, robotics, embodied intelligence)
Multi-agent game interaction (RL, multi-agent systems, game theory)
-
A general solution framework for cooperative games (most popular speaker, TechBeat'23)
Open-source reinforcement learning projects (Show me the code, not the story~)
Recent News
Top Highlights:
-
07/2025
Our paper wins the ACL'25 Best Paper Award.
Language Models Resist Alignment: Evidence From Data Compression
-
04/2025
I deliver a 3-hour tutorial at ICML 2025 (virtual).
"Alignment Methods on Large Language Models"
-
12/2024
Check out our Matter (Cell Press) paper on using LLMs to automate the synthesis of carbon nanotubes.
Transforming the synthesis of carbon nanotubes with machine learning models and automation
-
09/2024
Check out our Nature Machine Intelligence paper on large-scale multi-agent networked RL and its applications to pandemic control, smart grids, and traffic control.
Efficient and scalable reinforcement learning for large-scale network control
Latest News:
09/2025
Eleven papers get accepted at NeurIPS 2025
-
Empirical Study on Robustness and Resilience in Cooperative Multi-Agent Reinforcement Learning
-
Learning Principles from Multi-modal Human Preference
-
Spotlight (3%) Safe VLA: Towards Safety Alignment of Vision-Language-Action Model via Safe Reinforcement Learning
-
Risk-aware Direct Preference Optimization under Nested Risk Measure
-
Social World Model-Augmented Mechanism Design Policy Learning
-
Spotlight (3%) DexFlyWheel: A Scalable and Self-improving Data Generation Framework for Dexterous Manipulation
-
STAR: Efficient Preference-based Reinforcement Learning via Dual Regularization
-
Safe RLHF-V: Safe Reinforcement Learning from Multi-modal Human Feedback
-
Spotlight DB (3%) InterMT: Multi-Turn Interleaved Preference Alignment with Human Feedback
-
PHYBench: Holistic Evaluation of Physical Perception and Reasoning in Large Language Models
-
World Models Should Prioritize the Unification of Physical and Social Dynamics (Position Paper)
05/2025
Six papers get accepted at ACL 2025
-
Boosting Policy and Process Reward Models with Monte Carlo Tree Search in Open-Domain QA
-
SafeLawBench: Towards Safe Alignment of Large Language Models
-
Best Paper Award: Language Models Resist Alignment: Evidence From Data Compression
-
Reward Generalization in RLHF: A Topological Perspective
-
Benchmarking Multi-National Value Alignment for Large Language Models
-
BeaverTails v2: Towards Multi-Level Safety Alignment for LLMs with Human Preference
05/2025
Two papers get accepted at ICML 2025
-
Falcon: Fast Visuomotor Policy via Partial Denoising
-
SAE-V: Interpreting Multimodal Models for Enhanced Alignment
01/2025
Five papers get accepted at ICLR 2025
-
In-Context Editing: Learning Knowledge from Self-Induced Distributions
-
Amulet: ReAlignment During Test Time for Personalized Preference Adaptation of LLMs
-
Mitigating Reward Over-Optimization in RLHF via Behavior-Supported Regularization
-
Emerging Safety Attack and Defense in Federated Instruction Tuning of Large Language Models
-
Magnetic Mirror Descent Self-play Preference Optimization
12/2024
Five papers get accepted at AAAI 2025
-
Sequence to Sequence Reward Modeling: Improving RLHF by Language Feedback (Oral)
-
Stream Aligner: Efficient Sentence-Level Alignment via Distribution Induction
-
Differentiable Information Enhanced Model-Based Reinforcement Learning (Oral)
-
Towards efficient collaboration via graph modeling in reinforcement learning
-
RAT: Adversarial Attacks on Deep Reinforcement Agents for Targeted Behaviors (Oral)
Two papers get accepted at AAMAS 2025
-
Mean Field Correlated Imitation Learning
-
EconTwo: A Two-Level Multi-Agent Framework for Dynamic Macroeconomic Modeling with Shock Resilience (Short paper)
10/2024
Check out my recent talk "Can LLMs Be Aligned?" at CNCC 2024.
09/2024
Five papers get accepted at NeurIPS 2024
-
Achieving Efficient Alignment through Learned Correction (Oral, top 0.5%)
-
ProgressGym: Alignment with a Millennium of Moral Progress (Spotlight)
-
Panacea: Pareto Alignment via Preference Adaptation for LLMs
-
Scalable Constrained Policy Optimization for Safe Multi-agent Reinforcement Learning
-
SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset
09/2024
Check out our Nature Machine Intelligence paper on large-scale multi-agent RL.
Efficient and scalable reinforcement learning for large-scale network control
08/2024
Two papers accepted at CoRL 2024
-
Neural Attention Field: Emerging Point Relevance in 3D Scenes for One-Shot Dexterous Grasping
-
Object-Centric Dexterous Manipulation from Human Motion Data
05/2024
VALSE 2024 Annual Progress Report: From Preference Alignment to Value Alignment and Superalignment
Video (in Chinese)
05/2024
Three papers get accepted at ICML 2024
-
INSIGHT: End-to-End Neuro-Symbolic Visual Reinforcement Learning with Language Explanations
-
Safe Reinforcement Learning using Finite-Horizon Gradient-based Estimation
-
Planning with Theory of Mind for Few-Shot Adaptation in Mixed-motive Environments
03/2024
Media coverage (in Chinese)
01/2024
Five papers get accepted at ICLR 2024, plus one paper in TPAMI.
-
Spotlight (5%) CivRealm: A Learning and Reasoning Odyssey for Decision-Making Agents
-
Spotlight (5%) Maximum Entropy Heterogeneous-Agent Reinforcement Learning
-
Spotlight (5%) Safe RLHF: Safe Reinforcement Learning from Human Feedback
-
SafeDreamer: Safe Reinforcement Learning with World Models
-
Byzantine Robust Cooperative Multi-Agent Reinforcement Learning as a Bayesian Game
12/2023
Three papers get accepted at AAAI 2024.
-
STAS: Spatial-Temporal Return Decomposition for Multi-agent Reinforcement Learning
-
Oral (7%) ProAgent: Building Proactive Cooperative AI with Large Language Models
-
A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning
12/2023
Two papers get accepted at top journals!
11/2023
We release our AI Alignment Survey and an Alignment Resource Website.
10/2023
Our paper made the Best Paper initial list (17/8260) at ICCV 2023!
-
UniDexGrasp++: Improving Dexterous Grasping Policy Learning via Geometry-aware Curriculum and Iterative Generalist-Specialist Learning
09/2023
Six papers get accepted at NeurIPS 2023.
-
Multi-Agent First Order Constrained Optimization in Policy Space
-
Hierarchical Multi-Agent Skill Discovery
-
Policy Space Diversity for Non-Transitive Games
-
Team-PSRO for Learning Approximate TMECor in Large Team Games via Cooperative Reinforcement Learning
-
BeaverTails: A Human-Preference Dataset for LLM Harmlessness Alignment
-
Safety Gymnasium: A Unified Safe Reinforcement Learning Benchmark
09/2023
Two papers get accepted at JMLR and TMLR.
-
JMLR MARLlib: A Scalable and Efficient Multi-agent Reinforcement Learning Library
-
TMLR JiangJun: Mastering Xiangqi by Tackling Non-Transitivity in Two-Player Zero-Sum Games
06/2023
TorchOpt is now officially part of the PyTorch Ecosystem!
05/2023
Four papers get accepted at ICML 2023.
-
A Game-Theoretic Framework for Managing Risk in Multi-Agent Systems
-
GEAR: A GPU-Centric Experience Replay System for Large Reinforcement Learning Models
03/2023
Invited talk given:
Slides: Aligning safe decision-making in an open-ended world.
03/2023
One paper gets accepted at Artificial Intelligence Journal
Safe Multi-Agent Reinforcement Learning for Multi-Robot Control
We propose the first safe cooperative MARL method.
02/2023
Two ICRA papers and one ICLR paper got accepted.
ICRA'23: End-to-End Affordance Learning for Robotic Manipulation
We leverage visual affordances by using contact information generated during RL training to predict contact maps of interest.
ICRA'23: GenDexGrasp: Generalizable Dexterous Grasping
A versatile dexterous grasping method that can generalize to unseen hands.
ICLR'23: Quality-Similar Diversity via Population Based Reinforcement Learning
A new policy diversity measure is proposed that suits game AI settings.
01/2023
One paper gets accepted at Autonomous Agents and Multi-Agent Systems (Springer)
Online Markov Decision Processes with Non-oblivious Strategic Adversary
We study the online MDP setting where the adversary is non-oblivious: it can adapt its policy to the learning agent's behavior.
01/2023
One paper gets accepted at AAMAS 2023
Is Nash Equilibrium Approximator Learnable?
We prove that the Nash equilibrium approximator is agnostic-PAC learnable.
12/2022
We won 1st place at the NeurIPS 2022 MyoChallenge!
This competition is about learning contact-rich manipulation with a musculoskeletal hand (e.g., die rotation).
11/2022
Our paper gets accepted at National Science Review [IF: 23]
On the Complexity of Computing Markov Perfect Equilibrium in General-Sum Stochastic Games
We prove that computing Markov perfect equilibrium in general-sum stochastic games is PPAD-complete.
11/2022
Three multi-agent RL papers get accepted at AAAI 2023.
Multi-agent RL:
-
Subspace-Aware Exploration for Sparse-Reward Multi-Agent Tasks
-
Cooperative Multi-agent Q-learning with Bidirectional Action-Dependency
10/2022
A talk was given at AIRS in the AIR.
Game Theoretical Multi-Agent Reinforcement Learning.
09/2022
A talk was given at Techbeat.com 2022.
A General Solution Framework to Cooperative MARL.
09/2022
Seven papers got accepted at NeurIPS 2022.
Preference-based RL:
Meta-Reward-Net: Implicitly Differentiable Reward Learning for Preference-based Reinforcement Learning
Meta-RL:
A Theoretical Understanding of Gradient Bias in Meta-Reinforcement Learning
Safe RL:
Constrained Update Projection Approach to Safe Policy Optimization
Cooperative Games:
Multi-Agent Reinforcement Learning is a Sequence Modeling Problem
Zero-sum Games:
A Unified Diversity Measure for Multiagent Reinforcement Learning
New RL environments:
Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning
MATE: Benchmarking Multi-Agent Reinforcement Learning in Distributed Target Coverage Control
08/2022
Tutorial at the Conference on Games 2022
Solving two-player zero-sum games through reinforcement learning
08/2022
Two Invited Talks were given during the summer holidays
CSML China 08/22:
A continuum of solutions to cooperative MARL.
CCDM China 07/22:
Training a Population of Agents.
07/2022
One paper got accepted at IROS 2022.
Fully Decentralized Model-based Policy Optimization for Networked Systems
We figured out how to do model-based MARL in networked systems.
05/2022
One paper got accepted at IJCAI 2022.
On the Convergence of Fictitious Play: A Decomposition Approach
We extend the convergence guarantee for the well-known fictitious play method.
04/2022
We open source two reinforcement learning projects:
We develop an optimisation tool in PyTorch where meta-gradients can be computed easily.
With TorchOpt, you can implement meta-RL algorithms easily; try our code! (A minimal usage sketch follows below.)
We develop an RL/MARL environment for bimanual dexterous hand manipulation.
BiDexHands is extremely fast: you can reach 40,000 FPS on a single GPU.
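As promised above, here is a minimal sketch of the meta-gradient pattern TorchOpt enables. The toy model, data, and meta_param are illustrative placeholders rather than anything from the project; the torchopt.MetaAdam / step(loss) calls follow TorchOpt's documented differentiable-optimizer API, but treat the snippet as a sketch, not canonical TorchOpt code.

```python
import torch
import torch.nn as nn
import torchopt

# Toy inner model and a meta-parameter (e.g., a learned loss weight) -- placeholders.
net = nn.Linear(4, 1)
meta_param = torch.tensor(1.0, requires_grad=True)

# MetaAdam performs differentiable updates, keeping the inner loop on the autograd graph.
inner_opt = torchopt.MetaAdam(net, lr=1e-2)

x, y = torch.randn(8, 4), torch.randn(8, 1)
for _ in range(3):  # inner loop: a few adaptation steps
    inner_loss = meta_param * ((net(x) - y) ** 2).mean()
    inner_opt.step(inner_loss)  # parameters are updated differentiably

# Outer objective after adaptation; one backward() yields the meta-gradient.
meta_loss = ((net(x) - y) ** 2).mean()
meta_loss.backward()
print(meta_param.grad)  # gradient flowed through all inner updates
```

The point is that inner_opt.step(inner_loss) keeps each parameter update on the autograd graph, so a single backward() on the outer loss differentiates through the entire inner loop.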
01/2022
Two papers got accepted at ICLR 2022.
1. Multi-Agent TRPO Methods
We show how to conduct trust-region updates in MARL settings (see the sketch after this list).
This is the state-of-the-art algorithm in the cooperative MARL space; try our code!
[English Blog] [Chinese Blog] [Code]
2. LIGS: Learnable Intrinsic-Reward Generation Selection for Multi-Agent Learning
The paper improves coordination in the MARL setting by learning intrinsic rewards that motivate exploration and coordination.
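The key enabler behind the trust-region result above, as I recall it from the paper (notation approximate), is the multi-agent advantage decomposition, which lets agents take trust-region steps sequentially, each conditioning on the updates already made:

$$
A^{i_{1:m}}_{\boldsymbol{\pi}}\big(s, \boldsymbol{a}^{i_{1:m}}\big) \;=\; \sum_{j=1}^{m} A^{i_j}_{\boldsymbol{\pi}}\big(s, \boldsymbol{a}^{i_{1:j-1}}, a^{i_j}\big).
$$

Because each summand is a single agent's advantage given its predecessors' actions, per-agent trust-region improvements compose into a monotonic joint improvement.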
01/2022
The Multi-Agent Center at the PKU Institute for AI is recruiting winter-research students worldwide; see the topics below.
01/2022
Invited talk at DAI 2021 on the topic of Training A Population of Reinforcement Learning Agents.
09/2021
Three papers get accepted at NeurIPS 2021:
We analysed the variance of gradient norm for multi-agent reinforcement learning and developed a minimal-variance policy gradient estimator.
We developed a rigorous way to generate diverse policies in population-based training and demonstrated impressive results on Google football.
We show it is entirely possible to make AI learn to learn how to solve zero-sum games without even telling it what a Nash equilibrium is.
08/2021
Invited tutorial talk at RLChina on Multi-Agent Learning.
07/2021
Invited talk hosted by Synced (机器之心) on my recent work on how to deal with non-transitivity in two-player zero-sum games.
06/2021
We open-source MALib: a bespoke high-performance framework for population-based multi-agent reinforcement learning.
05/2021
Two papers get accepted at ICML 2021.
Learning in Nonzero-Sum Stochastic Games with Potentials. This paper studies a generalised class of fully cooperative games, named stochastic potential games, and proposes a MARL solution for finding the Nash equilibrium in such games.
03/2021
Check out my recent talk on the topic of:
A general framework for solving two-player zero-sum games.
02/2021
Update: Our paper wins the Best Paper Award at the Blue Sky Idea track!!!
One paper gets accepted in AAMAS 2021.
11/2020
Check out my latest work on:
An Overview of Multi-Agent Reinforcement Learning from Game Theoretical Perspective. I hope this work offers a nice summary of game theory basics for MARL researchers, in addition to the deep RL hype :)
10/2020
Update: SMARTS won the Best System Paper Award at CoRL 2020!
We release SMARTS: a multi-agent reinforcement learning enabled autonomous driving platform.
Multi-agent interaction is a fundamental aspect of autonomous driving in the real world. Today we are excited to introduce a dedicated platform, SMARTS, that supports Scalable Multi-Agent Reinforcement learning Training for autonomous driving. With SMARTS, ML researchers can evaluate their new algorithms in self-driving scenarios, in addition to traditional video games. In turn, SMARTS can enrich social-vehicle behaviours and create increasingly realistic and diverse interactions for autonomous driving researchers, powered by RL techniques. Check out our code on GitHub and our paper at the Conference on Robot Learning (CoRL) 2020.
10/2020
One paper gets accepted at NeurIPS 2020!
Replica-Exchange Nosé-Hoover Dynamics for Bayesian Learning on Large Datasets. We introduce a new HMC sampler for large-scale Bayesian deep learning that suits multi-modal sampling; the noise from mini-batches can be absorbed by a special design of the Nosé-Hoover dynamics.
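For background (my summary of the standard stochastic-gradient Nosé-Hoover thermostat, not the paper's exact replica-exchange construction): the momentum is coupled to an auxiliary friction variable $\xi$ that adaptively absorbs unknown mini-batch gradient noise,

$$
d\theta = p\,dt, \qquad
dp = -\nabla \tilde U(\theta)\,dt - \xi p\,dt + \sqrt{2A}\,dW, \qquad
d\xi = \Big(\tfrac{1}{D}\,p^{\top} p - 1\Big)\,dt,
$$

where $\tilde U$ is the mini-batch energy estimate and $D$ the parameter dimension. Replica exchange, by definition, runs several such chains at different temperatures and occasionally swaps them, which is what helps the sampler reach multiple modes.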
09/2020
One paper gets accepted at CIKM 2020 !
Learning to infer user hidden states for online sequential advertising.
08/2020
A lecture was given at RL China Summer School.
Advances of Multi-agent Learning in Gaming AI.
06/2020
A talk was given at ISTBI, Fudan University.
Many-agent Reinforcement Learning.
06/2020
One paper gets accepted at ICML 2020
Multi-agent Determinantal Q-learning. We introduce a new function approximator, the Q-determinantal point process (Q-DPP), for multi-agent reinforcement learning. It helps learn the Q-function factorisation without the a priori structural constraints required by methods such as QMIX and VDN.
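As standard DPP background (not the paper's exact Q-DPP construction): a determinantal point process over a ground set with kernel $L \succeq 0$ assigns to each subset $S$ the probability

$$
P(Y = S) \;=\; \frac{\det(L_S)}{\det(L + I)},
$$

where $L_S$ is the submatrix of $L$ indexed by $S$. The determinant rewards selections that are both high-quality and mutually diverse, which is the property the learned Q-function factorisation exploits.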
05/2020
One paper gets accepted at IJCAI 2020
Modelling Bounded Rationality in Multi-Agent Interactions by Generalized Recursive Reasoning. We use a probabilistic graphical model to describe the recursive reasoning process of "I believe you believe I believe..." in multi-agent systems.
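Schematically (my illustration of level-k reasoning, not the paper's exact GR2 formulation), a level-$k$ agent soft-best-responds to its model of a level-$(k-1)$ opponent:

$$
\pi^{i}_{k}\big(a^{i}\mid s\big) \;\propto\; \exp\!\Big(\mathbb{E}_{a^{-i}\sim \pi^{-i}_{k-1}(\cdot\mid s)}\big[Q^{i}\big(s, a^{i}, a^{-i}\big)\big]\Big),
$$

with the recursion bottoming out at a non-strategic level-0 policy.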
02/2020
One paper gets accepted at AAMAS 2020
Alpha^Alpha-Rank: Practically Scaling Alpha-Rank through Stochastic Optimisation. Alpha-Rank is a replacement for Nash equilibrium in general-sum N-player games; importantly, its solution is P-complete. In this paper, we enhance its tractability by several orders of magnitude via a stochastic optimisation formulation.
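For reference (the standard Alpha-Rank setup as I understand it): the ranking is the stationary distribution $\pi$ of a Markov chain over joint strategy profiles with transition matrix $C$,

$$
\pi^{\top} C = \pi^{\top}, \qquad \sum_{\sigma} \pi_{\sigma} = 1,
$$

and the scaling challenge comes from the profile space growing exponentially with the number of players, which the stochastic optimisation formulation is designed to sidestep.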
