Shangzhe Di   狄尚哲
Hi, I am a second-year PhD student at Shanghai Jiao Tong University (SJTU) under the guidance of Prof. Weidi Xie. My research focuses on video understanding and multimodal learning.
Before joining SJTU, I completed my master's and bachelor's degrees at Beihang University (BUAA). Under the supervision of Prof. Si Liu, I explored video BGM generation and visual object tracking.
Email / CV / GitHub / Google Scholar
Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos
Qirui Chen, Shangzhe Di, Weidi Xie
preprint, 2024.
paper / project page / code
Pinpoint scattered visual evidence in long egocentric videos while responding to questions.
Grounded Question-Answering in Long Egocentric Videos
Shangzhe Di, Weidi Xie
In CVPR, 2024.
paper / project page / code / bibtex
Simultaneous query grounding and answering in long, egocentric videos.
Linker: Learning Long Short-term Associations for Robust Visual Tracking
Zizheng Xun*, Shangzhe Di*, Yulu Gao, Zongheng Tang, Gang Wang, Si Liu, Bo Li
In IEEE Transactions on Multimedia (TMM), 2023.
paper
Video Background Music Generation with Controllable Music Transformer
Shangzhe Di*, Zeren Jiang*, Si Liu, Zhaokai Wang, Leyan Zhu, Zexin He, Hongming Liu, Shuicheng Yan
In ACM MM, 2021. (Best Paper Award)
paper / project page / code / colab notebook / bibtex
The first satisfying method for video background music generation.
The website template is borrowed from here.