About Me

I am Gao Wei (Victor, 高伟).

I’m an Algorithm Engineer at ByteDance, specializing in Multimodal Large Language Models (MLLMs) and Diffusion Models. Before joining ByteDance, I worked at ZEEKR and Alibaba (Amap), where I focused on perception algorithms for mapping and navigation.

I am always open to academic discussions and potential collaborations. Please feel free to reach out to me at vasgaowei@gmail.com.

Research Interests

Understanding: MLLM, Unified MLLM, MLLM-Embedding
Generation: Diffusion Model, Flow Matching, Autoregressive Model

News and Updates

Aug 2021: Discrepant multiple instance learning for weakly supervised object detection was accepted by Pattern Recognition
July 2021: TS-CAM: Token Semantic Coupled Attention Map for Weakly Supervised Object Localization was accepted by ICCV 2021

Education

Sep 2014 - July 2018: Department of Automation, Tsinghua University
Sep 2018 - July 2021: University of Chinese Academy of Sciences

Work Experience

Oct 2024 - Present: Algorithm Engineer at ByteDance, Beijing, China
Jun 2024 - Oct 2024: Algorithm Engineer at ZEEKR, Beijing, China
Nov 2021 - Jun 2024: Computer Vision Algorithm Engineer at Alibaba Group (Amap), Beijing, China
- BEV perception, cloud-based lane line vectorization