Chengyao Wang (王程钥)
I am a PhD student at the Department of Computer Science and Engineering, The Chinese University of Hong Kong (CUHK), advised by Prof. Jiaya Jia and Prof. Bei Yu.
Prior to that, I obtained my B.E. degree in Computer Science from Sun Yat-Sen University (SYSU).
I am particularly interested in building Human-like Multimodal Intelligence that can actively interact with the physical world, learn from interaction, and maintain long-term memory.
Recently, my research has mainly focused on Multi-modal Large Language Models (MLLMs); representative works include LLaMA-VID, Mini-Gemini, and MGM-Omni.
Prior to that, I also worked on visual perception and representation learning.
I am seeking Research Scientist / Member of Technical Staff opportunities in industry for Fall 2026, working on Multimodal Foundation Models and related applications (Computer Use Agents, Embodied AI), and I am open to any location. Feel free to contact me if you are interested.
Research discussions and collaborations are always welcome; feel free to set up a coffee chat.
Google Scholar  / 
GitHub  / 
X  / 
LinkedIn  / 
Email
News
- [2025-08] We release MGM-Omni, an open-source omni-modal model supporting long-form speech understanding, generation, and zero-shot voice cloning.
- [2025-08] Concerto is accepted at NeurIPS 2025, San Diego.
- [2025-06] Lyra is accepted at ICCV 2025, Hawaii.
- [2025-03] VisionZip and DreamOmni are accepted at CVPR 2025, Nashville.
- [2024-12] We release Lyra, an open-source multi-modal large language model that supports long speech comprehension, omni-modal understanding, and cross-modality efficiency.
- [2024-07] LLaMA-VID is accepted at ECCV 2024, Milan.
- [2024-03] We release Mini-Gemini, an open-source vision-language model that supports high-resolution image understanding and reasoning-driven image generation.
- [2024-02] GroupContrast is accepted at CVPR 2024, Seattle.
- [2023-11] We release LLaMA-VID, an open-source vision-language model that supports hour-long video understanding and reasoning.
Research
* indicates equal contribution
Omni-modal Large Language Models (Omni MLLMs)
|
Vision Language Models (VLMs)
Visual Perception & Representation
Selected Awards
- Gold Medal x 3, International Collegiate Programming Contest (ICPC), Regional
- Gold Medal, Chinese Collegiate Programming Contest (CCPC)
Academic Service
Reviewer / Program Committee Member
- IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- IEEE International Conference on Computer Vision (ICCV)
- Conference on Neural Information Processing Systems (NeurIPS)
- International Conference on Learning Representations (ICLR)
- Association for the Advancement of Artificial Intelligence (AAAI)
- IEEE Winter Conference on Applications of Computer Vision (WACV)