Hi, my name is Wei Liu (刘威). I am a final-year graduate student at ShanghaiTech University, advised by Prof. Kewei Tu, and an incoming PhD student at HKUST NLP, supervised by Prof. Junxian He.

My research focuses on natural language processing (NLP) and machine learning (ML).

To be more specific, my research interests lie in:

  • Efficiency in Large Language Models (LLMs): Improving the efficiency of LLM training and inference, and handling long-context scenarios.
  • Large Language Models (LLMs): Exploring emergent capabilities and applications in complex scenarios.
  • Structured Prediction: Parsing algorithms and their application to downstream tasks.
  • Deep Generative Models: Deep latent-variable models.

News

  • [07/2024] Is Your Model Really a Good Math Reasoner? Let’s use MathCheck to evaluate mathematical reasoning with a checklist! Unlike benchmarks that can be tackled merely through memorization, MathCheck reveals the more robust and comprehensive mathematical reasoning abilities of LLMs, and it represents mathematical intelligence more linearly!
  • [01/2024] Our Deita paper, What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning, has been accepted at ICLR 2024!
  • [12/2023] Our Deita (Data-Efficient Instruction Tuning for Alignment) project has been released! Using only 6K SFT samples selected by Deita, along with 10K randomly selected preference samples, our Deita-7B model achieves remarkable results: 7.55 on MT-Bench, 90.06% on AlpacaEval, and 69.86 on the OpenLLM Benchmark! Feel free to try it out!

πŸ“ Publications (* denotes equal contribution)

Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist

Zihao Zhou*, Shudong Liu*, Maizhen Ning, Wei Liu, Jindong Wang, Derek F. Wong, Xiaowei Huang, Qiufeng Wang, Kaizhu Huang

Preprint. project

What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning

Wei Liu*, Weihao Zeng*, Keqing He, Yong Jiang, Junxian He

In Proceedings of ICLR, 2024. project

MathAttack: Attacking Large Language Models Towards Math Solving Ability

In Proceedings of AAAI, 2024.

Zihao Zhou, Qiufeng Wang, Mingyu Jin, Jie Yao, Jianan Ye, Wei Liu, Wei Wang, Xiaowei Huang, Kaizhu Huang

SeqGPT: An Out-of-the-box Large Language Model for Open Domain Sequence Understanding

Tianyu Yu*, Chengyue Jiang*, Chao Lou*, Shen Huang*, Xiaobin Wang, Wei Liu, Jiong Cai, Yangning Li, Yinghui Li, Kewei Tu, Hai-Tao Zheng, Ningyu Zhang, Pengjun Xie, Fei Huang, Yong Jiang

In Proceedings of AAAI, 2024. code

Simple Hardware-Efficient PCFGs with Independent Left and Right Productions

Wei Liu*, Songlin Yang*, Yoon Kim, Kewei Tu

In Findings of EMNLP, 2023. code

Joint Entity and Relation Extraction with Span Pruning and Hypergraph Neural Networks

Zhaohui Yan, Songlin Yang, Wei Liu, Kewei Tu

In Proceedings of EMNLP, 2023. code

Structured Mean-Field Variational Inference for Higher-Order Span-Based Semantic Role Labeling

Wei Liu, Songlin Yang, Kewei Tu

In Findings of ACL, 2023. code

Dynamic Programming in Rank Space: Scaling Structured Inference with Low-Rank HMMs and PCFGs

Songlin Yang*, Wei Liu*, Kewei Tu

In Proceedings of NAACL, 2022 (Top-3 score in ARR Jan 2022). code

Knowledge-Based Chat Detection With False Mention Discrimination

Wei Liu, Peijie Huang, Dongzhu Liang, Zihao Zhou

In Proceedings of ICASSP, 2021.

A Knowledge-gated Mechanism for Utterance Domain Classification

Zefeng Du, Peijie Huang, Yuhong He, Wei Liu, Jiankai Zhu

In Proceedings of NLPCC, 2019.

👥 Service

  • Reviewer: ARR, ACL, NAACL, EMNLP, COLM

💻 Internships

  • 2023.11 - present, Shanghai AI Laboratory

  • 2022.07 - 2023.10, Alibaba DAMO Academy

πŸ† Awards

  • 2023 National Scholarship (China)

📖 Education

  • 2021.09 - Present, ShanghaiTech University, M.S. in Computer Science
  • 2017.09 - 2021.06, South China Agricultural University, B.E. in Computer Science