Self-Bootstrapping on the Self-Reinforcing Mind
Resume
✉️ Email
🎓 Google Scholar
🧑‍💻 GitHub
💼 LinkedIn
🔎 Research and Personal Interests
- 🧠 Reinforcement Learning
- 🤖 Machine Learning
- Capybara
🚶 Experience
🏫 Education
- 09/2024 ~ Present: University of Southampton
- 09/2023 ~ 09/2024: University of Liverpool
- 09/2019 ~ 04/2022: Nanjing University of Aeronautics and Astronautics
- 09/2015 ~ 09/2019: Nanjing University of Aeronautics and Astronautics
- Undergraduate Student in Mathematics.
🏢 Work
- 04/2022 ~ 09/2023: **Parametrix.AI**
- Reinforcement Learning and Gaming AI Research Engineer (P2 Team).
📄 Publications
- Directly Forecasting Belief for Reinforcement Learning with Delays.
- Qingyuan Wu, Yuhui Wang, Simon Sinong Zhan, Yixuan Wang, Chung-Wei Lin, Chen Lv, Qi Zhu, Jürgen Schmidhuber, Chao Huang.
- [ICML 2025] International Conference on Machine Learning, 2025, Poster.
- [Paper] / [Code]
- Scaling Value Iteration Networks to 5000 Layers for Extreme Long-Term Planning.
- Yuhui Wang, Qingyuan Wu, Weida Li, Dylan R. Ashley, Francesco Faccio, Chao Huang, Jürgen Schmidhuber.
- [ICML 2025] International Conference on Machine Learning, 2025, Poster.
- [Paper]
- Variational Delayed Policy Optimization.
- Qingyuan Wu, Simon Sinong Zhan, Yixuan Wang, Yuhui Wang, Chung-Wei Lin, Chen Lv, Qi Zhu, Chao Huang.
- [NeurIPS 2024] Conference on Neural Information Processing Systems, 2024, Spotlight.
- [Paper] / [Code] / [Poster]
- Boosting Reinforcement Learning with Strongly Delayed Feedback Through Auxiliary Short Delays.
- Qingyuan Wu, Simon Sinong Zhan, Yixuan Wang, Yuhui Wang, Chung-Wei Lin, Chen Lv, Qi Zhu, Jürgen Schmidhuber, Chao Huang.
- [ICML 2024] International Conference on Machine Learning, 2024, Poster.
- [Paper] / [Code] / [Poster]
- Highway Value Iteration Networks.
- Yuhui Wang, Weida Li, Francesco Faccio, Qingyuan Wu, Jürgen Schmidhuber.
- [ICML 2024] International Conference on Machine Learning, 2024, Poster.
- [Paper]
- State-wise Safe Reinforcement Learning with Pixel Observations.
- Simon Sinong Zhan, Yixuan Wang, Qingyuan Wu, Ruochen Jiao, Chao Huang, Qi Zhu.
- [L4DC 2024] Learning for Dynamics and Control Conference, 2024, Poster.
- [Paper] / [Code]
📝 Preprints
- VSC-RL: Advancing Autonomous Vision-Language Agents with Variational Subgoal-Conditioned Reinforcement Learning.
- Qingyuan Wu, Jianheng Liu, Jianye Hao, Jun Wang, Kun Shao.
- [Paper] / [Code] / [Website]
- Inverse Delayed Reinforcement Learning.
- Simon Sinong Zhan, Qingyuan Wu, Zhian Ruan, Frank Yang, Philip Wang, Yixuan Wang, Ruochen Jiao, Chao Huang, Qi Zhu.
- [Paper]
- Model-based Reward Shaping for Adversarial Inverse Reinforcement Learning in Stochastic Environments.
- Simon Sinong Zhan, Qingyuan Wu, Philip Wang, Yixuan Wang, Ruochen Jiao, Chao Huang, Qi Zhu.
- [Paper]
- Highway Reinforcement Learning.
- Yuhui Wang, Miroslav Strupl, Francesco Faccio, Qingyuan Wu, Haozhe Liu, Michał Grudzień, Xiaoyang Tan, Jürgen Schmidhuber.
- [Paper]
- Expected-Max Ensemble Q-learning with Temporally-Varying Exploration.
- Qingyuan Wu, Yuhui Wang.
- [Paper]
- Greedy-Step Off-Policy Reinforcement Learning.
- Yuhui Wang, Qingyuan Wu, Pengcheng He, Xiaoyang Tan.
- [Paper]
Nothing more can be shared here 🙈 🙉 🙊