Chengyao Wang

I am a PhD student at the Department of Computer Science and Engineering, The Chinese University of Hong Kong (CUHK), advised by Prof. Jiaya Jia and Prof. Bei Yu. Prior to that, I obtained my B.E. degree in Computer Science from Sun Yat-Sen University.

I am particularly interested in building human-like multimodal agents that can actively interact with the physical world and have long-term memory. Recently, my research has mainly focused on vision-language models. Prior to that, I also worked on 3D scene understanding and label-efficient computer vision.

Google Scholar  /  GitHub  /  Twitter  /  LinkedIn  /  Email

News
  • [2024-07] LLaMA-VID is accepted to ECCV 2024, Milano.
  • [2024-03] We release Mini-Gemini, an open-source vision-language model that supports high-resolution image understanding, reasoning, and generation.
  • [2024-02] GroupContrast is accepted to CVPR 2024, Seattle.
  • [2023-11] We release LLaMA-VID, an open-source vision-language model that supports hour-long video understanding and reasoning.
Research

* indicates equal contribution

Selected Awards
  • Gold Medal x 3, International Collegiate Programming Contest (ICPC), Regional
  • Gold Medal, China Collegiate Programming Contest (CCPC)