Chengyao Wang
I am a PhD student in the Department of Computer Science and Engineering, The Chinese University of Hong Kong (CUHK), advised by Prof. Jiaya Jia and Prof. Bei Yu.
Prior to that, I obtained my B.E. degree in Computer Science from Sun Yat-Sen University.
I am particularly interested in building human-like multimodal agents that can actively interact with the physical world and have long-term memory.
Recently, my research has mainly focused on vision-language models.
Before that, I also gained some experience in 3D scene understanding and label-efficient computer vision.
Google Scholar  / 
GitHub  / 
Twitter  / 
LinkedIn  / 
Email
News
- [2024-07] LLaMA-VID is accepted to ECCV 2024, Milano.
- [2024-03] We release Mini-Gemini, an open-source vision-language model that supports high-resolution image understanding, reasoning, and generation.
- [2024-02] GroupContrast is accepted to CVPR 2024, Seattle.
- [2023-11] We release LLaMA-VID, an open-source vision-language model that supports hour-long video understanding and reasoning.
Research
* indicates equal contribution
Selected Awards
- Gold Medal × 3, International Collegiate Programming Contest (ICPC), Regional
- Gold Medal, Chinese Collegiate Programming Contest (CCPC)