Zihao Yue (岳子豪)

PhD Student @ AIM3 Lab
School of Information
Renmin University of China

Email: yzihao [at] ruc.edu.cn



Bio


I am currently a third-year PhD student at Renmin University of China (RUC), advised by Prof. Qin Jin. I received my B.E. degree in Computer Science from the University of Electronic Science and Technology of China (UESTC) in 2022. My research interests include language modeling and video understanding.



Research


Partial Vocabulary Learning

Partial Vocabulary Learning for Neural Text Generation
Invited Talk at CCAI (Chinese Congress on Artificial Intelligence) 2025
[Slides]

Learning Descriptive Image Captioning via Semipermeable Maximum Likelihood Estimation
NeurIPS 2023
[Paper] [Github] [Demo]

Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective
ACL 2024
[Paper] [Github]

Large Multimodal Models and Reasoning

MiMo-VL Technical Report
2025
[Report] [Github] [Huggingface]

R1-V: Reinforcing Super Generalization Ability in Vision Language Models
2025
[Report] [Github]

Time-R1: Post-Training Large Vision Language Model for Temporal Video Grounding
2025
[Paper] [Github]

Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement
2025
[Paper] [Github]

VideoOrion: Tokenizing Object Dynamics in Videos
ICCV 2025
[Paper]

Unified Multimodal Understanding via Byte-Pair Visual Encoding
ICCV 2025
[Paper]

Movie Understanding

Movie101: A New Movie Understanding Benchmark
ACL 2023
[Paper] [Website] [Github] [Huggingface]

Movie101v2: Improved Movie Narration Benchmark
ACL 2025
[Paper] [Website] [Github] [Huggingface]

Competitions

Video to Text Description @ TRECVID 2024, 1st Place
2024

Video to Text Description @ TRECVID 2023, 1st Place
2023

Video to Text Description @ TRECVID 2022, 1st Place
2022