My Google Scholar page has the full list of papers, including pre-prints.
Publications
SRL: Scaling Distributed Reinforcement Learning to Over Ten Thousand Cores (ICLR 2024)
Zhiyu Mei, Wei Fu, Jiaxuan Gao, Guangju Wang, Huanchen Zhang, Yi Wu
Stylized Offline Reinforcement Learning: Extracting Diverse High-Quality Behaviors from Heterogeneous Datasets (ICLR 2024)
Yihuan Mao, Chengjie Wu, Xi Chen, Hao Hu, Ji Jiang, Tianze Zhou, Tangjie Lv, Changjie Fan, Zhipeng Hu, Yi Wu, Yujing Hu, Chongjie Zhang
OmniDrones: An Efficient and Flexible Platform for Reinforcement Learning in Drone Control (RA-L)
Botian Xu, Feng Gao, Chao Yu, Ruize Zhang, Yi Wu, Yu Wang
Quarl: A Learning-Based Quantum Circuit Optimizer (OOPSLA 2024)
Zikun Li, Jinjun Peng, Yixuan Mei, Sina Lin, Yi Wu, Oded Padon, Zhihao Jia
LLM-Powered Hierarchical Language Agent for Real-time Human-AI Coordination (AAMAS 2024)
Jijia Liu*, Chao Yu*, Jiaxuan Gao*, Yuqing Xie, Qingmin Liao, Yi Wu, Yu Wang
MESA: Cooperative Meta-Exploration in Multi-Agent Learning through Exploiting State-Action Space Structure (AAMAS 2024)
Zhicheng Zhang*, Yancheng Liang*, Yi Wu and Fei Fang
Accelerate Multi-Agent Reinforcement Learning in Zero-Sum Games with Subgame Curriculum Learning (AAAI 2024)
Jiayu Chen*, Zelai Xu*, Yunfei Li, Chao Yu, Jiaming Song, Huazhong Yang, Fei Fang, Yu Wang, Yi Wu
Understanding Curriculum Learning in Policy Optimization for Online Combinatorial Optimization (TMLR)
Runlong Zhou, Zelin He, Yuandong Tian, Yi Wu, Simon Shaolei Du
Iteratively Learn Diverse Strategies with State Distance Information (NeurIPS 2023)
Wei Fu, Weihua Du, Jingwei Li, Sunli Chen, Jingzhao Zhang, Yi Wu
Automatic Truss Design with Reinforcement Learning (IJCAI 2023)
Weihua Du*, Jinglun Zhao*, Chao Yu, Xingcheng Yao, Zimeng Song, Siyang Wu, Ruifeng Luo, Zhiyuan Liu, Xianzhong Zhao, Yi Wu
Learning Zero-Shot Cooperation with Humans, Assuming Humans Are Biased (ICLR 2023)
Chao Yu*, Jiaxuan Gao*, Weilin Liu, Botian Xu, Hao Tang, Jiaqi Yang, Yu Wang, Yi Wu
SpeedyZero: Mastering Atari with Limited Data and Time (ICLR 2023)
Yixuan Mei*, Jiaxuan Gao*, Weirui Ye, Shaohuai Liu, Yang Gao, Yi Wu
Efficient Bimanual Handover and Rearrangement via Symmetry-Aware Actor-Critic Learning (ICRA 2023)
Yunfei Li*, Chaoyi Pan*, Huazhe Xu, Xiaolong Wang, Yi Wu
Fictitious Cross-Play: Learning Nash Equilibrium in Mixed Cooperative-Competitive Games (AAMAS 2023)
Zelai Xu, Yancheng Liang, Chao Yu, Yu Wang and Yi Wu
Asynchronous Multi-Agent Reinforcement Learning for Efficient Real-Time Multi-Robot Cooperative Exploration (AAMAS 2023)
Chao Yu*, Xinyi Yang*, Jiaxuan Gao*, Jiayu Chen, Yunfei Li, Jijia Liu, Yunfei Xiang, Ruixin Huang, Huazhong Yang, Yi Wu and Yu Wang
Differentiable Arbitrating in Zero-sum Markov Games (AAMAS 2023)
Jing Wang*, Meichen Song*, Feng Gao*, Boyi Liu, Zhaoran Wang and Yi Wu
Beyond Information Gain: An Empirical Benchmark for Low-Switching-Cost Reinforcement Learning (TMLR)
Shusheng Xu, Yancheng Liang, Yunfei Li, Simon Shaolei Du, Yi Wu
Maximum Entropy Population-Based Training for Zero-Shot Human-AI Coordination (AAAI 2023)
Rui Zhao, Jinming Song, Yufeng Yuan, Hu Haifeng, Yang Gao, Yi Wu, Zhongqian Sun, Yang Wei
Pre-Trained Image Encoder for Generalizable Visual Reinforcement Learning (NeurIPS 2022)
Zhecheng Yuan*, Zhengrong Xue*, Bo Yuan, Xueqian Wang, Yi Wu, Yang Gao, Huazhe Xu
Phasic Self-Imitative Reduction for Sparse-Reward Goal-Conditioned Reinforcement Learning (ICML 2022)
Yunfei Li*, Tian Gao*, Jiaqi Yang, Huazhe Xu, Yi Wu
Continuously Discovering Novel Strategies via Reward-Switching Policy Optimization (ICLR 2022)
Zihan Zhou*, Wei Fu*, Bingliang Zhang, Yi Wu
Check out our code, full paper (with appendix) and discovered behaviors.
Learning Design and Construction with Varying-Sized Materials via Prioritized Memory Resets (ICRA 2022)
Yunfei Li, Tao Kong, Lei Li, Yi Wu
Sequence Level Contrastive Learning for Text Summarization (AAAI 2022)
Shusheng Xu, Xingxing Zhang, Yi Wu, Furu Wei
Variational Automatic Curriculum Learning for Sparse-Reward Cooperative Multi-Agent Problems (NeurIPS 2021)
Jiayu Chen, Yuanxin Zhang, Yuanfan Xu, Huimin Ma, Huazhong Yang, Jiaming Song, Yu Wang, Yi Wu
NovelD: A Simple yet Effective Exploration Criterion (NeurIPS 2021)
Tianjun Zhang, Huazhe Xu, Xiaolong Wang, Yi Wu, Kurt Keutzer, Joseph E. Gonzalez, Yuandong Tian
Native Chinese Reader: A Dataset Towards Native-Level Chinese Machine Reading Comprehension (NeurIPS 2021, dataset track)
Shusheng Xu*, Yichen Liu*, Xiaoyu Yi, Siyuan Zhou, Huizi Li, Yi Wu
Dataset, models and supplementary materials can be found at our project website.
Learning to Design and Construct Bridge without Blueprint (IROS 2021)
Yunfei Li, Tao Kong, Lei Li, Yifeng Li, Yi Wu
Temporal Induced Self-Play for Stochastic Bayesian Games (IJCAI 2021)
Weizhe Chen*, Zihan Zhou*, Yi Wu, Fei Fang
Discovering Diverse Multi-Agent Strategic Behavior via Reward Randomization (ICLR 2021)
Zhenggang Tang*, Chao Yu*, Boyuan Chen, Huazhe Xu, Xiaolong Wang, Fei Fang, Simon Shaolei Du, Yu Wang, Yi Wu
Solving Compositional Reinforcement Learning Problems via Task Reduction (ICLR 2021)
Yunfei Li, Yilin Wu, Huazhe Xu, Xiaolong Wang, Yi Wu
Multi-Task Reinforcement Learning with Soft Modularization (NeurIPS 2020)
Ruihan Yang, Huazhe Xu, Yi Wu, Xiaolong Wang
Unsupervised Extractive Summarization by Pre-training Hierarchical Transformers (Findings of EMNLP 2020)
Shusheng Xu, Xingxing Zhang, Yi Wu, Furu Wei, Ming Zhou
Source code can be found here.
Emergent Tool Use from Multi-Agent Autocurricula (ICLR 2020, Spotlight)
Bowen Baker*, Ingmar Kanitscheider*, Todor Markov*, Yi Wu*, Glenn Powell*, Bob McGrew*, Igor Mordatch* (* team project)
Check out our blog post with the most popular video ever in OpenAI's history :)
Evolutionary Population Curriculum for Scaling Multi-Agent Reinforcement Learning (ICLR 2020)
Qian Long*, Zihan Zhou*, Abhinav Gupta, Fei Fang, Yi Wu†, Xiaolong Wang† († equal advising)
Project website and source code can be found here.
Influence-Based Multi-Agent Exploration (ICLR 2020, Spotlight)
Tonghan Wang*, Jianhao Wang*, Yi Wu, Chongjie Zhang
The full version with appendix can be found here.
Robust Multi-Agent Reinforcement Learning via Minimax Deep Deterministic Policy Gradient (AAAI 2019, Oral)
Shihui Li, Yi Wu, Xinyue Cui, Honghua Dong, Fei Fang, Stuart Russell
Source code can be found here.
Deep Reinforcement Learning for Green Security Games with Real-Time Information (AAAI 2019, Oral)
Yufei Wang, Zheyuan Ryan Shi, Lantao Yu, Yi Wu, Rohit Singh, Lucas Joppa, Fei Fang
Discrete-Continuous Mixtures in Probabilistic Programming: Generalized Semantics and Inference Algorithms (ICML 2018)
Yi Wu, Siddharth Srivastava, Nicholas Hay, Simon S. Du, Stuart Russell
A short version was accepted at the PPS workshop, POPL 2017.
Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments (NIPS 2017)
Ryan Lowe*, Yi Wu*, Aviv Tamar, Jean Harb, Pieter Abbeel, Igor Mordatch (* equal contribution)
The OpenAI blog for this work is here.
Adversarial Training for Relation Extraction (EMNLP 2017)
Yi Wu, David Bamman, Stuart Russell
Source code can be found here.
A Nearly-Black-Box Online Algorithm for Joint Parameter and State Estimation in Temporal Models (AAAI 2017, Oral)
Yusuf B. Erol*, Yi Wu*, Lei Li, Stuart Russell (* equal contribution)
Supplementary materials are here.
Value Iteration Networks (NIPS 2016, Best Paper Award)
Aviv Tamar, Yi Wu, Garrett Thomas, Sergey Levine, Pieter Abbeel
Understanding and Evaluating Sparse Linear Discriminant Analysis (AISTATS 2015, Oral)
Yi Wu, David Wipf, Jeong-Min Yun
Dual Space Analysis of the Sparse Linear Model (NIPS 2012)
David Wipf, Yi Wu
Selected Workshop Papers
Building Generalizable Agents with a Realistic and Rich 3D Environment
(NIPS 2017 Deep Reinforcement Learning Symposium; ICLR 2018 Workshop Track)
Check the GitHub page of our House3D project.
BFiT: From Possible-World Semantics to Random-Evaluation Semantics in an Open Universe
(NIPS 2014 Workshop on Probabilistic Programming, spotlight)
Thesis
On Building Generalizable Learning Agents
Fall 2019 at UC Berkeley