Research Interests
I have broad interests in LLM reasoning, safety, and evaluation, with a recent focus on CoT monitorability in reasoning models and safety issues in coding agents.
Selected Publications & Manuscripts (* denotes equal contribution)
- On The Fragility of Benchmark Contamination Detection in Reasoning Models
Han Wang*, Haoyu Li*, Brian Ko*, Huan Zhang
ICLR 2026
- DecepChain: Inducing Deceptive Reasoning from Large Language Model
Wei Shen*, Han Wang*, Haoyu Li*, Huan Zhang
Preprint
- AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
Junyu Zhang*, Runpei Dong*, Han Wang, Xuying Ning, Haoran Geng,
Peihao Li, Xialin He, Yutong Bai, Jitendra Malik, Saurabh Gupta, Huan Zhang
EMNLP 2025 Main
- The Emperor's New Clothes in Benchmarking? A Rigorous Examination of Mitigation Strategies for LLM Benchmark Data Contamination
Yifan Sun*, Han Wang*, Dongbai Li*, Gang Wang, Huan Zhang
ICML 2025
- Steering Away from Harm: An Adaptive Approach to Defending Vision Language Model Against Jailbreaks
Han Wang, Gang Wang, Huan Zhang
CVPR 2025
- ALI-Agent: Assessing LLMs' Alignment with Human Values via Agent-based Evaluation
Jingnan Zheng*, Han Wang*, An Zhang, Tai D. Nguyen, Jun Sun, Tat-Seng Chua
NeurIPS 2024
Education
University of Illinois Urbana-Champaign, IL
Ph.D. • Aug. 2024 to Present
Zhejiang University, China
B.Eng. • Aug. 2020 to June 2024
Services
Conference Reviewer: NeurIPS 2025, ICLR 2026, ICML 2026, ACL ARR 2025, ACM CCS AISec Workshop 2025, NeurIPS MATH-AI Workshop 2025
Journal Reviewer: IEEE TNNLS 2025
Teaching Assistant: ECE 484 (Principles of Safe Autonomy), UIUC, Fall 2025
Website template from Jon Barron.
Last Updated: Jan 26th, 2025.