2025-09-22 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)11/24 Report--
Thanks to CTOnews.com readers West window past and South China Daniel Wu for the tips! CTOnews.com, August 25: Aliyun (Alibaba Cloud) today launched Qwen-VL, a large-scale vision-language model, which is now open source on ModelScope. As CTOnews.com reported earlier, Aliyun had previously open-sourced the general-purpose 7-billion-parameter model Qwen-7B and the dialogue model Qwen-7B-Chat.
Qwen-VL is reported to be a vision-language (Vision Language, VL) model that supports Chinese, English, and other languages. Compared with previous VL models, it offers not only the basic capabilities of image and text recognition, description, question answering, and dialogue, but also new capabilities such as visual positioning and understanding of Chinese text in images.
▲ Image source: arXiv paper

Qwen-VL uses Qwen-7B as its base language model and adds a visual encoder to the model architecture so that the model can accept visual input. The model supports an image input resolution of 448, whereas previous open-source LVLMs usually supported only a resolution of 224.
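To give a rough sense of what the jump from 224 to 448 means for the visual encoder, the sketch below counts the image patches a ViT-style encoder would process at each resolution. The patch size of 14 pixels, the square input, and the helper name `patch_grid` are assumptions made for illustration; none of them is stated in this article.

```python
def patch_grid(resolution: int, patch_size: int = 14) -> tuple[int, int]:
    """Return (patches per side, total patches) for a square image.

    patch_size=14 is an assumed value for illustration, not a figure
    taken from the article.
    """
    side = resolution // patch_size
    return side, side * side


for res in (224, 448):
    side, total = patch_grid(res)
    print(f"{res}x{res} input: {side}x{side} grid, {total} patches")
```

Under these assumptions, doubling the resolution quadruples the number of patches the encoder sees, which is one plausible reason a higher input resolution helps with fine-grained tasks such as reading text in images.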
According to the developers, the model can be used in scenarios such as knowledge question answering, image captioning, image question answering, document question answering, and fine-grained visual positioning. In mainstream multimodal task benchmarks and multimodal chat capability evaluations, it is said to perform far better than general-purpose models of the same scale.
▲ Image source: ModelScope

In addition, building on Qwen-VL, the Tongyi Qianwen team used an alignment mechanism to create Qwen-VL-Chat, an LLM-based visual AI assistant that lets developers quickly build dialogue applications with multimodal capabilities.
The Tongyi Qianwen team also said that, to test the model's multimodal dialogue ability, it built a test suite called "TouchStone" based on a GPT-4 scoring mechanism and compared Qwen-VL-Chat with other models. Qwen-VL-Chat achieved the best results among open-source LVLMs in both the Chinese and English alignment tests.
▲ Image source: ModelScope