About Me
I am an associate professor in the School of Computer Science and Technology at East China Normal University. I completed my PhD in the College of Computer Science at Zhejiang University under the supervision of Prof. Jianke Zhu. In 2018, I visited the University of California, San Diego, where I worked with Prof. Michael Yip in the ECE department. During my PhD, I also worked as a part-time researcher at the Alibaba-Zhejiang University Joint Institute of Frontier Technologies. Prior to that, I spent some years in the video game industry at Virtuos.
Research Interests
My research interests lie in Computer Vision, Computer Graphics and Robotics, particularly in methods for extracting meaningful information and structural data from videos and raw streaming sources. I am also very interested in finding and defining structural information in video frames to provide visual perception for cross-disciplinary research, e.g. robotic manipulation. Currently, my focus is on visual object tracking, 3D reconstruction/fusion/MVS, neural rendering/diffusion and cloth manipulation. In general, my research areas include:
- Computer Vision
- Machine Learning
- Computer Graphics (Neural Rendering)
- Robotics (Visual Perception and Manipulation)
Selected Research Projects
Visual Object Tracking
Given a video sequence and a selected target in the first frame, visual object tracking aims to track the target object robustly and accurately throughout the whole sequence. We are interested in the different geometric representations a tracking algorithm can use. With various geometric representations, visual object tracking methods can serve as basic building blocks for many high-level AI applications, such as surveillance and video editing/analysis.
3D Motion Capture
Going one step further, estimating the geometric representation of a visual object in an RGB-D sequence leads us to the topic of 3D motion capture. It is closely related to 3D dynamic reconstruction, RGB-D fusion and 3D visual tracking. The algorithm outputs the geometric properties of every pixel in the sequence. It can be viewed as a fundamental 3D perception method that is very useful in AR/VR and Robotics. The 3D reconstruction technique can also be applied to game and film making, geography, and so on.
Scene Reconstruction & Neural Rendering
With all geometric properties available, the next step is to visualize them in a form that humans can understand. To this end, reconstruction and rendering have become part of our research directions. Based on deep learning, we can bring the real world back onto the screen without a traditional computer graphics pipeline. We regard these topics as next-generation technology for games and the Metaverse.
Visual Perception for Robots
With the ability to perceive visual information in a sequence or video stream, we have succeeded in applying computer vision algorithms to surgical robots so that they can manipulate bio-tissue automatically. In the future, we plan to continue exploring interdisciplinary projects related to AGI and Embodied Intelligence.
Selected Publications
ChatTracker: Enhancing Visual Tracking Performance via Chatting with Multimodal Large Language Model
Yiming Sun, Fan Yu, Shaoxiang Chen, Yu Zhang, Junwei Huang, Yang Li, Chenhui Li, Changbo Wang
NeurIPS, 2024
[PAPER]
[Project Webpage]
FIND: Fine-tuning Initial Noise Distribution with Policy Optimization for Diffusion Models
Changgu Chen, Libing Yang, Xiaoyan Yang, Lianggangxu Chen, Gaoqi He, Changbo Wang, Yang Li
ACM Multimedia, 2024
[PAPER]
[Project Webpage]
ClothPPO: A Proximal Policy Optimization Enhancing Framework for Robotic Cloth Manipulation with Observation-Aligned Action Spaces
Libing Yang, Yang Li, Long Chen
International Joint Conference on Artificial Intelligence (IJCAI), 2024
[PAPER]
[Project Webpage]
Multi-Prototype Space Learning for Commonsense-Based Scene Graph Generation
Lianggangxu Chen, Youqi Song, Yiqing Cai, Jiale Lu, Yang Li, Yuan Xie, Changbo Wang, Gaoqi He
AAAI Conference on Artificial Intelligence (AAAI), 2024
[PAPER]
RAGT: Learning Robust Features for Occluded Human Pose and Shape Estimation with Attention-Guided Transformer
Ziqing Li, Yang Li, Shaohui Lin
CAD&Graphics, 2023
[PAPER]
Contact-conditioned Hand-held Object Reconstruction from Single-View Images
Xiaoyuan Wang, Yang Li, Adnane Boukhayma, Changbo Wang, Marc Christie
Computers & Graphics, 2023
[PAPER]
Homography Decomposition Networks for Planar Object Tracking
Xinrui Zhan, Yueran Liu, Jianke Zhu and Yang Li
AAAI Conference on Artificial Intelligence (AAAI), 2022
[PDF]
[CODE]
[Project Webpage]
SuPer Deep: A Surgical Perception Framework for Robotic Tissue Manipulation using Deep Learning for Feature Extraction
Jingpei Lu, Ambareesh Jayakumari, Florian Richter, Yang Li and Michael C. Yip
IEEE International Conference on Robotics and Automation (ICRA), 2021
[PDF]
[Project Webpage]
SuPer: A Surgical Perception Framework for Endoscopic Tissue Manipulation with Surgical Robotics
Yang Li, Florian Richter, Jingpei Lu, Emily K. Funk, Ryan K. Orosco, Jianke Zhu and Michael C. Yip
IEEE Robotics and Automation Letters, 2020
[PDF]
[Project Webpage]
DeepFacade: A deep learning approach to facade parsing with symmetric loss
Hantang Liu, Yinghao Xu, Jialiang Zhang, Jianke Zhu, Yang Li, Steve C.H. Hoi
IEEE Trans. on Multimedia, 2020
[PDF]
Robust Estimation of Similarity Transformation for Visual Object Tracking
Yang Li, Jianke Zhu, Steven C.H. Hoi, Wenjie Song, Zhefeng Wang, Hantang Liu
AAAI Conference on Artificial Intelligence (AAAI), 2019
[PDF]
[CODE]
[Project Webpage]
Temporally-Adjusted Correlation Filter-based Tracking
Wenjie Song, Yang Li, Jianke Zhu, Chun Chen
Neurocomputing, 2018
[PDF]
Image Alignment by Online Robust PCA via Stochastic Gradient Descent
Wenjie Song, Jianke Zhu, Yang Li, Chun Chen
IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2015
[PDF]
The Visual Object Tracking VOT2014 Challenge Results
M. Kristan, R. Pflugfelder, et al. (Co-author)
In ECCV 2014 Workshops, Workshop on Visual Object Tracking Challenge 2014
[PDF]
We won second place in the VOT2014 challenge.
The Visual Object Tracking VOT2013 Challenge Results
M. Kristan, R. Pflugfelder, et al. (Co-author)
In ICCV 2013 Workshops, Workshop on Visual Object Tracking Challenge 2013
[PDF]
Adaptive lattice-based light rendering of participating media.
Changbo Wang, Chenhui Li, Jinqiu Dai, Yang Li
Computer Animation and Virtual Worlds, 22(6): 487-498, 2011
[PDF]
Real-time realistic rendering of under seawater scene.
Chenhui Li, Changbo Wang, Yang Li, Min Zhao, et al.
Journal of Image and Graphics, 16(8): 1497-1502, 2011
[PDF]
Professional Services
- Program Committee Member of AAAI (2018-2022)
- Program Committee Member of IJCAI (2018-2020)
- Program Committee Member of ICONIP (2020)
- Invited Reviewer for International Journal of Computer Vision
- Invited Reviewer for IEEE Transactions on Image Processing
- Invited Reviewer for IEEE Transactions on Multimedia
- Invited Reviewer for IEEE Robotics and Automation Letters
- Invited Reviewer for IEEE Transactions on Circuits and Systems for Video Technology
- Invited Reviewer for Neurocomputing
- Invited Reviewer for IEEE Signal Processing Letters
- Invited Reviewer for International Journal of Advanced Robotic Systems
- Reviewer for CVPR, ECCV, MM, NeurIPS