About Me

I am Gao Wei (Victor, 高伟).
I’m an Algorithm Engineer at ByteDance, specializing in Multimodal Large Language Models (MLLMs) and Diffusion Models. Before joining ByteDance, I worked at ZEEKR and Alibaba (Amap), where I focused on perception algorithms for mapping and navigation.
I am always open to academic discussions and potential collaborations. Please feel free to reach out to me at vasgaowei@gmail.com.
Research Interests
- Understanding: MLLM, Unified MLLM, MLLM-Embedding
- Generation: Diffusion Model, Flow Matching, Autoregressive Model
News and Updates
- Aug 2021: "Discrepant multiple instance learning for weakly supervised object detection" was accepted by Pattern Recognition
- July 2021: "TS-CAM: Token Semantic Coupled Attention Map for Weakly Supervised Object Localization" was accepted by ICCV 2021
Education
- Sep 2014 - July 2018: Department of Automation, Tsinghua University
- Sep 2018 - July 2021: University of Chinese Academy of Sciences
Work Experience
- Oct 2024 - Present: Algorithm Engineer at ByteDance, Beijing, China
- Jun 2024 - Oct 2024: Algorithm Engineer at ZEEKR, Beijing, China
- Nov 2021 - Jun 2024: Computer Vision Algorithm Engineer at Alibaba Group (Amap), Beijing, China
  - BEV perception, cloud-based lane line vectorization