Shangzhe Di

I am currently a first-year PhD student at Shanghai Jiao Tong University (SJTU), mentored by Prof. Weidi Xie. My research focuses on video understanding and multimodal learning.

Before joining SJTU, I earned both my master's and bachelor's degrees from Beihang University (BUAA). There, I worked on video background music (BGM) generation and visual object tracking under the supervision of Prof. Si Liu.

Email / CV / GitHub / Google Scholar

Education

  • PhD Student, Shanghai Jiao Tong University, Apr. 2023 - Present
  • M.Eng. in Computer Science, Beihang University, Sep. 2020 - Jan. 2023
  • Exchange Student, Technical University of Munich, Apr. 2019 - Aug. 2019
  • B.Eng. in Software Engineering, Beihang University, Sep. 2016 - Jun. 2020

Experience

  • Research Intern (Foundation Models & Face Recognition), SenseTime, Aug. 2022 - Jan. 2023
  • Research Intern (Music Generation and Retrieval), Tencent, Nov. 2021 - Feb. 2022
  • Research Intern (Generative AI & Virtual Try-on), Kuaishou Y-Tech, Oct. 2019 - Apr. 2021

Research

    Video Background Music Generation with Controllable Music Transformer
    Shangzhe Di*, Zeren Jiang*, Si Liu, Zhaokai Wang, Leyan Zhu, Zexin He, Hongming Liu, Shuicheng Yan
    ACM MM, 2021 (Best Paper Award)
    project page / arXiv / bibtex

    A Controllable Music Transformer for generating background music for a given video.

Honors and Awards

  • Best Paper Award, ACM MM 2021
  • Best Video Award, IJCAI 2021 Video Competition
  • First Prize Scholarship x 2 (Top 10%), Beihang University, 2019 & 2021
  • Full Scholarship for Exchange Program, China Scholarship Council, 2019
  • Special Prize Scholarship (Top 3%), Beihang University, 2018


