I am a Research Assistant at the EIT-NLP Lab, advised by Prof. Xiaoyu Shen (沈晓宇). My research focuses on efficient multimodal large language models (MLLMs), particularly MLLM compression and streaming LLMs. I plan to pursue a PhD in 2026 through a joint program between Eastern Institute of Technology, Ningbo (宁波东方理工大学) and Shanghai Jiao Tong University (上海交通大学).
My recent work explores compression techniques for MLLMs, spanning image, video, audio, 3D, and Omni LLMs. If you are interested in academic collaboration, please feel free to reach out via email (haowu.ai.research@gmail.com). We are always looking for motivated interns!
My research interests include MLLMs and World Models. Beyond accelerating models through compression, I aim to identify and reuse redundancy in representations, attention patterns, and multimodal interactions to build efficient architectures that improve both performance and computational efficiency. I have published 7 papers at top-tier AI conferences and journals (e.g., ICLR, CVPR, ECCV, TMM, TMI, and TGRS), including one Best Paper Finalist.
🔥 News
- 2026.03: 📰 Our ICLR 2026 paper on MLLM compression has received media coverage from MachineSapiens (机器之心), highlighting its impact on efficient MLLMs.
- 2026.03: 📰 Our CVPR 2026 paper on streaming video reasoning has received media coverage from QbitAI (量子位), highlighting its impact on efficient, real-time video reasoning.
- 2026.03: 📢 We release a systematic survey of streaming LLMs: “From Static Inference to Dynamic Interaction: Navigating the Landscape of Streaming Large Language Models”.
- 2026.02: 📢 We release a systematic survey of MLLM compression: “From Data to Model: A Survey of the Compression Lifecycle in MLLMs”.
- 2026.02: 🎉 Two papers accepted by CVPR 2026.
- 2026.02: 🎉 One paper accepted by TMM 2026.
- 2026.01: 🎉 One paper accepted by ICLR 2026.
- 2025.06: 🎉 One paper accepted by TMI 2025.
- 2024.10: 🎉 One paper accepted by ECCV 2024 and nominated as a 🏆 Best Paper Award Candidate.
- 2024.03: 🎉 One paper accepted by TGRS 2024.
📝 Selected Publications
See the full list in Publications.
MLLM Compression
- From Data to Model: A Survey of the Compression Lifecycle in MLLMs. Hao Wu*, Junlong Tong*, Xudong Wang, Yang Tan, Changyu Zeng, Anastasia Antsiferova, Xiaoyu Shen†.
- ViCA: Efficient Multimodal LLMs with Vision-Only Cross-Attention. Wenjie Liu*, Hao Wu*, Xin Qiu, Yingqi Fan, Yihan Zhang, Anhao Zhao, Yunpu Ma, Xiaoyu Shen†.
- UTPTrack: Towards Simple and Unified Token Pruning for Visual Tracking. Hao Wu*, Xudong Wang*, Jialiang Zhang, Junlong Tong, Xinghao Chen, Junyan Lin, Yunpu Ma, Xiaoyu Shen†.
- HiDrop: Hierarchical Vision Token Reduction in MLLMs via Late Injection, Concave Pyramid Pruning, and Early Exit. Hao Wu*, Yingqi Fan*, Jinyang Dai, Junlong Tong, Yunpu Ma, Xiaoyu Shen†.
Streaming LLMs
- From Static Inference to Dynamic Interaction: A Survey of Streaming Large Language Models. Junlong Tong, Zilong Wang, YuJie Ren, Peiran Yin, Hao Wu, Wei Zhang, Xiaoyu Shen†.
- Speak While Watching: Unleashing TRUE Real-Time Video Understanding Capability of Multimodal Large Language Models. Junyan Lin*, Junlong Tong*, Hao Wu*, Jialiang Zhang*, Jinming Liu, Xin Jin, Xiaoyu Shen†.
- Think-as-You-See: Streaming Chain-of-Thought Reasoning for Large Vision-Language Models. Jialiang Zhang*, Junlong Tong*, Junyan Lin*, Hao Wu, Yirong Sun, Yunpu Ma, Xiaoyu Shen†.
Pathology MLLMs
- PathBench: Advancing the Benchmark of Large Multimodal Models for Pathology Image Understanding at Patch and Whole Slide Level. Yuxuan Sun, Hao Wu, Chenglu Zhu, et al., Tao Lin†, Lin Yang†.
- (Best Paper Award Candidate) PathMMU: A Massive Multimodal Expert-Level Benchmark for Understanding and Reasoning in Pathology. Yuxuan Sun, Hao Wu, Chenglu Zhu, et al., Tao Lin†, Lin Yang†.
Others
- A Transformer-Based Tracker Integrating Motion and Representation Information. Yuanhui Wang, Ben Ye, Zhanchuan Cai†, Hao Wu.
- Crater-DETR: A Novel Transformer Network for Crater Detection Based on Dense Supervision and Multiscale Fusion. Yue Guo, Hao Wu, Shuojin Yang, Zhanchuan Cai†.
🏆 Honors and Awards
- 2024.10: ECCV 2024 Best Paper Award Candidate (top 0.2%).