Yang Li's personal homepage

Bio

Dr. Yang Li is an associate professor at the School of Computer Science and Technology at East China Normal University. He leads the Visual Perception + X (VPX) research group, which focuses on the intersection of computer vision, computer graphics, and robotics. Dr. Li completed his PhD in the College of Computer Science at Zhejiang University with Prof. Jianke Zhu. In 2018, he visited the University of California, San Diego and worked with Prof. Michael Yip in the ECE department. During his PhD, he also worked as a part-time researcher at the Alibaba-Zhejiang University Joint Institute of Frontier Technologies. Prior to that, he spent some years in the video game industry at Virtuos.

Research Group

The mission of the Visual Perception + X (VPX) group is to develop visual perception for cross-disciplinary research, particularly methods for extracting meaningful information and structured data from videos and raw streaming sources. This visual information in turn enables AI-based downstream applications, including the metaverse, AIGC, and embodied intelligence. Currently, VPX focuses on vision-based AI technologies for video analysis, controllable video/image generation, and non-rigid object manipulation in robotics.

Research Areas

Computer Vision
Machine Learning
Computer Graphics (Neural Rendering)
Robotics (Visual Perception and Manipulation)

Selected Research Projects

Visual Object Tracking

Given a video sequence and a target selected in the first frame, visual object tracking aims to track the target robustly and accurately throughout the whole sequence. We are interested in the different geometric representations that a tracking algorithm can use. With various geometric representations, visual object tracking methods can serve as building blocks for many high-level AI applications, such as surveillance and video editing/analysis.
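
To make the idea of geometric representations concrete, here is a minimal sketch. It is not taken from any specific tracker listed on this page; it simply assumes the target state is expressed as a 3x3 planar transform applied to the corners of the box selected in the first frame. The function names (`translation`, `similarity`, `warp_points`) and the example matrices are hypothetical and chosen for illustration only.

```python
import math

def translation(tx, ty):
    """2-DoF representation: the target box only shifts."""
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]

def similarity(scale, angle, tx, ty):
    """4-DoF representation: uniform scale, in-plane rotation, shift."""
    c = scale * math.cos(angle)
    s = scale * math.sin(angle)
    return [[c, -s, tx], [s, c, ty], [0.0, 0.0, 1.0]]

def warp_points(H, points):
    """Apply a 3x3 planar transform H to a list of (x, y) points."""
    warped = []
    for x, y in points:
        u = H[0][0] * x + H[0][1] * y + H[0][2]
        v = H[1][0] * x + H[1][1] * y + H[1][2]
        w = H[2][0] * x + H[2][1] * y + H[2][2]
        warped.append((u / w, v / w))  # divide by w for the projective case
    return warped

# Corners of the box selected in the first frame (illustrative values).
box = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0)]

# An 8-DoF homography is any invertible 3x3 matrix (defined up to scale);
# this hypothetical one adds a perspective term in the bottom row.
homography = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.5, 0.0, 1.0]]

shifted = warp_points(translation(3.0, 4.0), box)
scaled = warp_points(similarity(2.0, 0.0, 1.0, 1.0), box)
projected = warp_points(homography, box)

print(shifted)
print(scaled)
print(projected)
```

All three representations share one interface here, so the same tracked corners could feed a downstream application regardless of how many degrees of freedom the tracker estimates.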

3D Motion Capture

Taking this one step further, estimating the geometric representation of a visual object in an RGB-D sequence leads to 3D motion capture. This topic is closely related to 3D dynamic reconstruction, RGB-D fusion, and 3D visual tracking. The algorithm outputs geometric properties for every pixel in the sequence. It can be viewed as a fundamental 3D perception method that is useful in AR/VR and robotics. The 3D reconstruction technique can also be applied to game and film production, geography, and other fields.

Scene Reconstruction & Neural Rendering

Once the geometric properties are available, the next step is to visualize them in a form that humans can understand. To this end, reconstruction and rendering are among our research directions. With deep learning, we can bring the real world back onto the screen without a traditional computer graphics pipeline. We regard these topics as next-generation technology for games and the metaverse.

Visual Perception for Robots

With the capability to perceive visual information in a sequence or video stream, we have applied computer vision algorithms to surgical robots that automatically manipulate biological tissue. Going forward, we plan to continue exploring interdisciplinary projects related to AGI and embodied intelligence.

Selected Publications

GT²-GS: Geometry-aware Texture Transfer for Gaussian Splatting
Wenjie Liu, Zhongliang Liu, Junwei Shu, Changbo Wang, Yang Li
AAAI, 2026
[PAPER] [Project Webpage]

Audio-VLA: Adding Contact Audio Perception to Vision-Language-Action Model for Robotic Manipulation
Xiangyi Wei, Haotian Zhang, Xinyi Cao, Siyu Xie, Weifeng Ge, Yang Li, Changbo Wang
arXiv, 2025
[PAPER] [Project Webpage]

TimeSoccer: An End-to-End Multimodal Large Language Model for Soccer Commentary Generation
Ling You, Wenxuan Huang, Xinni Xie, Xiangyi Wei, Bangyan Li, Yang Li, Shaohui Lin, Changbo Wang
ACM Multimedia, 2025
[PAPER] [Project Webpage]

ABC-GS: Alignment-Based Controllable Style Transfer for 3D Gaussian Splatting
Wenjie Liu, Zhongliang Liu, Xiaoyan Yang, Man Sha, Yang Li
IEEE International Conference on Multimedia and Expo (ICME), 2025
[PAPER] [Project Webpage]

Warped convolutional neural networks for large homography transformation with psl(3) algebra
Xinrui Zhan, Wenyu Liu, Risheng Yu, Jianke Zhu and Yang Li
Neurocomputing, 2025
[PAPER] [Early Arxiv Version]

Open-World Reinforcement Learning over Long Short-Term Imagination
Jiajian Li, Qi Wang, Yunbo Wang, Xin Jin, Yang Li, Wenjun Zeng, Xiaokang Yang
ICLR, 2025
[PAPER] [Project Webpage]

Motion-Zero: A Zero-Shot Trajectory Control Framework of Moving Object for Diffusion-Based Video Generation
Changgu Chen, Junwei Shu, Gaoqi He, Changbo Wang, Yang Li
AAAI, 2025
[PAPER] [Project Webpage]

ChatTracker: Enhancing Visual Tracking Performance via Chatting with Multimodal Large Language Model
Yiming Sun, Fan Yu, Shaoxiang Chen, Yu Zhang, Junwei Huang, Yang Li, Chenhui Li, Changbo Wang
NeurIPS, 2024
[PAPER] [Project Webpage]

FIND: Fine-tuning Initial Noise Distribution with Policy Optimization for Diffusion Models
Changgu Chen, Libing Yang, Xiaoyan Yang, Lianggangxu Chen, Gaoqi He, Changbo Wang, Yang Li
ACM Multimedia, 2024
[PAPER] [Project Webpage]

ClothPPO: A Proximal Policy Optimization Enhancing Framework for Robotic Cloth Manipulation with Observation-Aligned Action Spaces
Libing Yang, Yang Li, Long Chen
International Joint Conference on Artificial Intelligence (IJCAI), 2024
[PAPER] [Project Webpage]

Multi-Prototype Space Learning for Commonsense-Based Scene Graph Generation
Lianggangxu Chen, Youqi Song, Yiqing Cai, Jiale Lu, Yang Li, Yuan Xie, Changbo Wang, Gaoqi He
AAAI Conference on Artificial Intelligence (AAAI), 2024
[PAPER]

RAGT: Learning Robust Features for Occluded Human Pose and Shape Estimation with Attention-Guided Transformer
Ziqing Li, Yang Li, Shaohui Lin
CAD&Graphics, 2023
[PAPER]

AdaptMVSNet: Efficient Multi-View Stereo with Adaptive Convolution and Attention Fusion
Pengfei Jiang, Xiaoyan Yang, Yuanjie Chen, Wenjie Song, Yang Li
Computers & Graphics, 2023
[PAPER] [CODE]

Contact-conditioned Hand-held Object Reconstruction from Single-View Images
Xiaoyuan Wang, Yang Li, Adnane Boukhayma, Changbo Wang, Marc Christie
Computers & Graphics, 2023
[PAPER]

InvVis: Large-Scale Data Embedding for Invertible Visualization
Huayuan Ye, Chenhui Li, Yang Li, Changbo Wang
IEEE Transactions on Visualization and Computer Graphics, 2023
[PDF] [CODE]

Multi-Source Templates Learning for Real-Time Aerial Tracking
Yiming Sun, Yang Li, Changbo Wang
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023
[PAPER] [CODE]

Homography Decomposition Networks for Planar Object Tracking
Xinrui Zhan, Yueran Liu, Jianke Zhu and Yang Li
AAAI Conference on Artificial Intelligence (AAAI), 2022
[PDF] [CODE] [Project Webpage]

SuPer Deep: A Surgical Perception Framework for Robotic Tissue Manipulation using Deep Learning for Feature Extraction
Jingpei Lu, Ambareesh Jayakumari, Florian Richter, Yang Li and Michael C. Yip
IEEE International Conference on Robotics and Automation (ICRA), 2021
[PDF] [Project Webpage]

Attribute-aware Pedestrian Detection in a Crowd
Jialiang Zhang, Lixiang Lin, Jianke Zhu, Yang Li, Yun-chen Chen, Yao Hu, Steven C.H. Hoi
IEEE Trans. on Multimedia, 2020
[PDF] [CODE]

SuPer: A Surgical Perception Framework for Endoscopic Tissue Manipulation with Surgical Robotics
Yang Li, Florian Richter, Jingpei Lu, Emily K. Funk, Ryan K. Orosco, Jianke Zhu and Michael C. Yip
IEEE Robotics and Automation Letters, 2020
[PDF] [Project Webpage]

DeepFacade: A deep learning approach to facade parsing with symmetric loss
Hantang Liu, Yinghao Xu, Jialiang Zhang, Jianke Zhu, Yang Li, Steven C.H. Hoi
IEEE Trans. on Multimedia, 2020
[PDF]

Robust Estimation of Similarity Transformation for Visual Object Tracking
Yang Li, Jianke Zhu, Steven C.H. Hoi, Wenjie Song, Zhefeng Wang, Hantang Liu
AAAI Conference on Artificial Intelligence (AAAI), 2019
[PDF] [CODE] [Project Webpage]

Temporally-Adjusted Correlation Filter-based Tracking
Wenjie Song, Yang Li, Jianke Zhu, Chun Chen
Neurocomputing, 2018
[PDF]

CFNN: Correlation Filter Neural Network for Visual Object Tracking
Yang Li, Zhan Xu and Jianke Zhu
International Joint Conference on Artificial Intelligence (IJCAI), 2017
[PDF] [CODE]

Reliable Patch Trackers: Robust Visual Tracking by Exploiting Reliable Patches
Yang Li, Jianke Zhu, Steven C.H. Hoi
Computer Vision and Pattern Recognition (CVPR), 2015
[PDF] [ABSTRACT] [CODE]

Image Alignment by Online Robust PCA via Stochastic Gradient Descent
Wenjie Song, Jianke Zhu, Yang Li, Chun Chen
IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2015
[PDF]

A Scale Adaptive Kernel Correlation Filter Tracker with Feature Integration
Yang Li, Jianke Zhu
European Conference on Computer Vision, Workshop VOT2014 (ECCVW), 2014
[PDF] [CODE]

The Visual Object Tracking VOT2014 Challenge Results
M. Kristan, R. Pflugfelder, et al. (Co-author)
In ECCV 2014 Workshops, Workshop on Visual Object Tracking Challenge 2014
[PDF] We won second place in the VOT2014 challenge.

The Visual Object Tracking VOT2013 Challenge Results
M. Kristan, R. Pflugfelder, et al. (Co-author)
In ICCV 2013 Workshops, Workshop on Visual Object Tracking Challenge 2013
[PDF]

Adaptive lattice-based light rendering of participating media.
Changbo Wang, Chenhui Li, Jinqiu Dai, Yang Li
Computer Animation and Virtual Worlds, 22(6): 487-498, 2011
[PDF]

Real-time realistic rendering of under seawater scene.
Chenhui Li, Changbo Wang, Yang Li, Min Zhao, et al.
Journal of Image and Graphics, 16(8): 1497-1502, 2011
[PDF]

Professional Services

Program Committee Member of AAAI (2018-2022)
Program Committee Member of IJCAI (2018-2020)
Program Committee Member of ICONIP (2020)
Invited Reviewer for International Journal of Computer Vision
Invited Reviewer for IEEE Transactions on Image Processing
Invited Reviewer for IEEE Transactions on Multimedia
Invited Reviewer for IEEE Robotics and Automation Letters
Invited Reviewer for IEEE Transactions on Circuits and Systems for Video Technology
Invited Reviewer for Neurocomputing
Invited Reviewer for IEEE Signal Processing Letters
Invited Reviewer for International Journal of Advanced Robotic Systems
Reviewer of CVPR, ECCV, MM, NeurIPS

Yang Li

李洋

Associate Professor
