Hello! This is Zhaopeng (pronounced: /dʒaʊ pʌŋ/). I am a Master’s student at Zhejiang University, advised by Prof. Zuozhu Liu (刘佐珠), with an expected graduation in 2026. In August 2026, I will join the School of Computing at the National University of Singapore (NUS) as a PhD student, supervised by Prof. Bryan Hooi. I’m currently a research intern at Alibaba Tongyi Lab (DeepResearch Team), advised by Xinyu Wang (王新宇).
My current research interests lie in Agent Context Learning and Reinforcement Learning for Agents, with a particular focus on small language models. Previously, my research focused on Large Language Models (LLMs), specifically in Reinforcement Learning, post-training, and agent-based applications in Machine Translation. I am highly self-motivated and a quick learner, with strong communication, collaboration, and practical engineering skills. I am driven by a desire to pursue research directions that are both general and impactful. If you are interested in any form of collaboration, please feel free to email me at zhaopengfeng424@gmail.com.
I earned my Bachelor’s degree in Automation from the Harbin Institute of Technology, Shenzhen (HITSZ). As an undergraduate, I had the valuable opportunity to be a research intern at the City University of Hong Kong under the guidance of Prof. Shiqi Wang (王诗淇). I’ve also been fortunate to gain industry experience at Alibaba and Xiaohongshu, and to collaborate with incredible mentors like Yu Cao (曹宇) and Shaosheng Cao (曹绍升). I have also worked with Prof. Haizhou Li (李海洲), Yan Zhang (张琰) and Baoliang Chen (陈宝亮), whose guidance has been invaluable.
My research has led to several publications at top-tier AI conferences, including ACL, EMNLP, and NAACL.
Outside of research, I am a strong believer in knowledge sharing. I previously co-ran the DLNLP community, which grew to over 100,000 subscribers. In 2024, I started my own academic paper-sharing account on Xiaohongshu (Rednote), “nlper今天读paper了吗”, which has quickly grown to a community of over 15,000 followers. I also stay active and energized through sports. I’m a certified National Level 2 Basketball Referee🏀, a fitness enthusiast🎽, a runner👟, and a former competitive rower🚣.
🔥 News
- 2026.02: 🎓 Received a PhD offer from NUS SoC and was honored to be nominated for the AISG PhD Fellowship.
- 2025.11: 🎉 One paper was accepted to AAAI 2026 as an Oral.
- 2025.10: 🏆 Awarded National Scholarship.
- 2025.08: 🎉 Two papers were accepted to EMNLP 2025 Findings.
- 2025.05: 🎉 One paper was accepted to ACL 2025.
- 2025.01: 📕 My academic sharing account on Xiaohongshu (Rednote) reached over 10,000 followers.
- 2025.01: 🎉 One paper was accepted to NAACL 2025 Findings.
- 2024.09: 🎉 One paper was accepted to EMNLP 2024.
- 2024.07: 🏆 Our team won 2nd Place in the KDD Cup 2024 Reasoning Track (Best Student Team Award; Team Lead).
- 2023.09: 🎉 One paper was accepted to EMNLP 2023 Findings.
📝 Selected Research Papers
My full paper list is available on my Google Scholar profile.
Post-train/Reinforcement Learning in MT
- EMNLP 2025 Findings. MT-R1-Zero: Advancing LLM-based Machine Translation via R1-Zero-like Reinforcement Learning, Zhaopeng Feng, Shaosheng Cao, Jiahan Ren, et al.
- EMNLP 2024. Ladder: A Model-Agnostic Framework Boosting LLM-based Machine Translation to the Next Level, Zhaopeng Feng*, Ruizhe Chen*, Yan Zhang, et al.
- Pre-print. MT^3: Scaling MLLM-based Text Image Machine Translation via Multi-Task Reinforcement Learning, Zhaopeng Feng*, Yupu Liang*, Shaosheng Cao, et al.
Agent/Multi-Agent in MT
- ACL 2025. M-MAD: Multidimensional Multi-Agent Debate for Advanced Machine Translation Evaluation, Zhaopeng Feng*, Jiayuan Su*, Jiamei Zheng, et al.
- NAACL 2025 Findings. TEaR: Improving LLM-based Machine Translation with Systematic Self-Refinement, Zhaopeng Feng*, Yan Zhang*, et al.
Text Embedding/Reasoning/Multimodal
- EMNLP 2025 Findings. MT-RewardTree: A Comprehensive Framework for Advancing LLM-Based Machine Translation via Reward Modeling, Zhaopeng Feng*, Jiahan Ren*, Jiayuan Su*, et al.
- EMNLP 2023 Findings. How Well Do Text Embedding Models Understand Syntax?, Yan Zhang*, Zhaopeng Feng*, Zhiyang Teng, Zuozhu Liu, Haizhou Li.
- AAAI 2026 (Oral). CP-Router: An Uncertainty-Aware Router Between LLM and LRM, Jiayuan Su*, Fulin Lin*, Zhaopeng Feng*, et al.
- CVPR 2026. CompBench: Benchmarking Complex Instruction-guided Image Editing, Bohan Jia*, Wenxuan Huang*, Yuntian Tang, et al.
- Pre-print. Med-U1: Incentivizing Unified Medical Reasoning in LLMs via Large-scale Reinforcement Learning, Xiaotian Zhang*, Yuan Wang*, Zhaopeng Feng*, et al.
🕊️ Interesting Experiences
- 2025.07 On vacation in Austria and Greece.
- 2025.06 Certified as a National Level 2 Basketball Referee.
- 2024.07 On vacation in Spain and Portugal.
- 2024.01 Certified as a National Level 3 Basketball Referee.
- 2023.12 Completed my first half marathon at Qiandao Lake.
- 2023.07 On vacation in Thailand.
- 2022.10 Secured 3rd Place in the Men’s Eight (M8+) at the Shenzhen Inter-collegiate Rowing League, representing HITSZ.
- 2020.01 Successfully lost weight (from 96.5 kg to 75 kg).
📖 Education
- 2023.09 - 2026.03, Master, Zhejiang University, Hangzhou.
- 2019.09 - 2023.06, Undergraduate, Harbin Institute of Technology (Shenzhen), Shenzhen.
💻 Internships
- 2025.11 - Now, Alibaba (Tongyi Lab), advised by Xinyu Wang, Hangzhou.
- 2025.03 - 2025.10, Xiaohongshu (Rednote), advised by Shaosheng Cao, Shanghai.
- 2024.11 - 2025.03, Alibaba (Quark, now the Qwen Business Group), advised by Yu Cao, Hangzhou.