I am currently a researcher on the Alibaba Qwen Team. Before that, I was a PhD student in the CUHK Text Mining Group, supervised by Professor Wai Lam.
My research primarily focuses on applying reinforcement learning techniques to enhance the reasoning capabilities and alignment of large language models (LLMs).
📝 Preprints

Soft Adaptive Policy Optimization
Chang Gao, Chujie Zheng, Xiong-Hui Chen, Kai Dang, Shixuan Liu, Bowen Yu, An Yang, Shuai Bai, Jingren Zhou, Junyang Lin


Group Sequence Policy Optimization
Chujie Zheng, Shixuan Liu, Mingze Li, Xiong-Hui Chen, Bowen Yu, Chang Gao, Kai Dang, Yuqiong Liu, Rui Men, An Yang, Jingren Zhou, Junyang Lin

📝 Publications

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
Shenzhi Wang, Le Yu, Chang Gao, Chujie Zheng, Shixuan Liu, Rui Lu, Kai Dang, Xionghui Chen, Jianxin Yang, Zhenru Zhang, Yuqiong Liu, An Yang, Andrew Zhao, Yang Yue, Shiji Song, Bowen Yu, Gao Huang, Junyang Lin

Search Clarification Selection via Query-Intent-Clarification Graph Attention
Chang Gao, Wai Lam

📖 Education
- 2020.08 - 2025.03, PhD, The Chinese University of Hong Kong, Hong Kong, China
- 2018.09 - 2020.06, Master's, Harbin Institute of Technology, Harbin, China
- 2014.09 - 2018.06, Undergraduate, Harbin Institute of Technology, Weihai, China
🎖 Honors and Awards
- 2020.09 ACM SIGIR Student Travel Grant
- 2020.06 Outstanding Master Thesis Award
- 2018.06 Outstanding Graduate Award
- 2017.11 National Scholarship
- 2016.11 National Scholarship