AI breaks human records in the Kissing Number Problem
PKU mathematicians used AI and reinforcement learning to explore the kissing number problem, achieving breakthroughs in higher dimensions.
Dr. Yaodong Yang is an Assistant Professor (Boya Young Scholar) at the Institute for Artificial Intelligence, Peking University, and Chief Scientist of the PKU–PsiBot Joint Laboratory. His research focuses on learning from experience and the alignment of AI and embodied agents, spanning reinforcement learning, AI alignment, and embodied intelligence, with the aim of advancing the trustworthy deployment and real-world alignment of large models.
He has published over 200 papers in leading journals and conferences, including Nature Machine Intelligence, Cell Matter, Artificial Intelligence Journal, and IEEE TPAMI, with more than 16,000 Google Scholar citations. Since 2022, he has been ranked as the top scholar in AI & ML at Peking University according to CSRankings.
Dr. Yang has received numerous honors, including the ACL 2025 Best Paper Award, UKRI 2026 Best Paper Award in AI, ICCV 2023 Best Paper Finalist, CoRL 2020 Best System Paper Award, and the AAMAS 2021 Blue Sky Idea Award.
He was named to the MIT Technology Review "AI 100 Young Innovators", the 2025 Forbes China Technology & Innovation Innovative Leader list, received the WAIC 2022 "Yunfan Star Award", and the ACM SIGAI China Rising Star Award. His work has been featured by CCTV, People's Daily, Xinhua News, the National Natural Science Foundation of China (NSFC), and MIT Technology Review.
He serves as an Area Chair for major conferences including ICML, ICLR, NeurIPS, AAAI, IJCAI, AAMAS, and IROS, and as an Associate Editor for Scientific Reports, Transactions on Machine Learning Research, and Neural Networks.
Previously, Dr. Yang was an Assistant Professor at King's College London, a Principal Researcher at Huawei Research U.K., and a Senior Manager at AIG. He received his B.Sc. from the University of Science and Technology of China, M.Sc. from Imperial College London, and Ph.D. from University College London, where he was the university's sole nominee for the ACM SIGAI Doctoral Dissertation Award.
Headlines · recent updates
Joint work with PKU–PsiBot Lab. A generalist world-action model for embodied agents, outperforming prior SOTA on spatial reasoning benchmarks.
The paper shows that post-aligned language models tend to revert to their pre-training distributions — a theoretical "elasticity" result with implications for RLHF and safety.
A comprehensive ICML tutorial covering RLHF, DPO, safe alignment, preference learning and super-alignment — delivered to a virtual audience of thousands.
A cross-disciplinary work applying LLMs to steer autonomous experimental synthesis of carbon nanotubes, featured in Cell Press's flagship materials journal Matter.
The first multi-agent RL paper led by a Chinese team to appear in a Nature sister journal. A scalable method for controlling 1,000+ networked agents, with real-world deployments.
Invited talk "Can LLMs be aligned?" at CNCC 2024.
Featured on CCTV's「焦点访谈」("Focus Report") — national TV coverage of AI safety.
Released the AI Alignment Survey.
TorchOpt officially joined the PyTorch Ecosystem.
NeurIPS 2022 MyoChallenge — 1st place (1 / 340 teams).
TorchOpt and Bi-DexHands open-sourced.
Five directions · methods, benchmarks, and open-source systems
RLHF, preference learning, safe alignment, red-teaming and interpretability. Principled methods and open benchmarks — BeaverTails, PKU-SafeRLHF, Stream Aligner, Libra-Leaderboard — to make LLMs robustly helpful and harmless.
Dexterous manipulation, vision-language-action models, and sim-to-real. From Bi-DexHands and ClutterDexGrasp to DexGraspVLA and Safe VLA — pursuing human-level generalist robotic agents.
Cooperative and competitive MARL, policy gradient theory, Nash equilibria. HARL, MAT, MARLlib, MALib — algorithms that scale to hundreds of agents.
LLM-based agents for macroeconomic modelling, social value orientation, negotiation and consensus. World models unifying physical and social dynamics.
RL and LLMs applied to mathematics, medicine, physics, materials (carbon-nanotube synthesis), and operations — featured in Cell iScience, Matter, and National Science Review.
National coverage · CCTV · Xinhua · NSFC · MIT Tech Review
Best papers · talent programs · academic honors · competitions
Efficient and Scalable Reinforcement Learning for Large-Scale Network Control · Nature Machine Intelligence
Language Models Resist Alignment: Evidence From Data Compression
UniDexGrasp: Universal Robotic Dexterous Grasping via Learning Diverse Proposal Generation and Goal-Conditioned Policy
Diverse Auto-Curriculum is Critical for Successful Real-World Multiagent Learning Systems
SMARTS: An Open-Source Scalable Multi-Agent RL Training School for Autonomous Driving
NSFC Excellent Young Scientist
Ministry of Human Resources — 30 nationwide
CAAI — 6 selected nationally
Global Top 2% career-impact ranking
MIT Technology Review · "AI 100 Young Innovators"
Forbes China · Innovation & Tech Leaders
ACM SIGAI China · 3 awardees nationwide
WAIC · 10 awardees nationwide
Wu Wenjun AI S&T Award · 2nd Prize — Knowledge-Enhanced Trustworthy Multimodal Interaction
CMSA · 1st Prize for Technological Invention — BeiDou + AI for Extreme-Wind Emergency Navigation
Physiologically realistic dexterous manipulation · 1 / 340 teams
Digital China Innovation Contest · AI Track · National 1st Prize
Highest PKU student honors · Apple & Tencent fellowships · NSFC grants
For the course "Foundations and Alignment of Large Language Models" (《大语言模型基础与对齐》).
2025 Digital China Innovation Competition · AI Track · National First Prize.
ICBC Teaching Award · PKU · 2025
Yuanpei College · Class Advisor & Curriculum Committee · AGI Experimental Class (2022 cohort)
Awarded three years in a row (2023, 2024, 2025) by Peking University.
Representative works · browse by topic below
Area Chair · Associate Editor · Program Chair
USTC · Imperial · UCL · AIG · KCL · PKU
RLHF / DPO / Safe-RLHF · reward modeling · interpretability · multi-modal & multilingual safety. Connecting alignment theory to practice at scale.
Sim-to-real policy learning for high-DoF dexterous manipulation; embodied foundation models that act in the physical world. Joint work with PsiBot.
Build world models that capture both physical and social dynamics; align simulators with the real world for downstream policy training. Joint work with Neo Matrix.
PAIR-Lab also welcomes master's students, visiting scholars, undergraduate research interns, and postdocs. If you are fascinated by reinforcement learning, LLM alignment, multi-agent systems, or embodied intelligence — and want to build safe and trustworthy AGI that ships — please read the starter materials above and reach out.