【AI】Ubuntu 22.04: Locally Deploying Qwen-VL, an Image-Capable Large Language Model (AI Vision) 【3】 The Qwen-VL-Chat-Int4 Version + 4060 Ti 16G
hkNaruto · 2024-07-07 16:31:04
Continued from the previous posts:
【AI】Ubuntu 22.04: Locally Deploying Qwen-VL, an Image-Capable Large Language Model (AI Vision) - CSDN Blog
【AI】Ubuntu 22.04: Locally Deploying Qwen-VL, an Image-Capable Large Language Model (AI Vision) 【2】 Even a 4060 Ti 16G Can't Keep Up - CSDN Blog
Download the Qwen-VL-Chat-Int4 model:
cd ~/Downloads/ai
git lfs install
git clone https://www.modelscope.cn/qwen/Qwen-VL-Chat-Int4.git
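If you would rather let the modelscope SDK manage the download and cache, the equivalent in Python looks like this (a sketch; note the cached path will differ from the explicit paths used below):

# Download via the modelscope SDK instead of git lfs; files land in
# the modelscope cache (~/.cache/modelscope by default).
from modelscope import snapshot_download

model_dir = snapshot_download('qwen/Qwen-VL-Chat-Int4')
print(model_dir)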
This version of the model is considerably smaller.
Even so, a 2060 6G still cannot start the web demo.
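For reference, a quick way to confirm how much VRAM a card actually exposes, since VRAM is the limiting factor here:

# print the GPU model name and its total VRAM
import torch
props = torch.cuda.get_device_properties(0)
print(f"{props.name}: {props.total_memory / 1024**3:.1f} GiB")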
Following the README.md shipped with the model, I tried writing a script that loads it with quantization.
test.py
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

from modelscope import (
    AutoModelForCausalLM, AutoTokenizer, GenerationConfig,
)
from transformers import BitsAndBytesConfig
import torch

model_dir = "/home/yeqiang/Downloads/ai/Qwen-VL-Chat-Int4"
torch.manual_seed(1234)

# NF4 4-bit quantization; lm_head and the visual attention pool are
# skipped so they stay in higher precision.
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type='nf4',
    bnb_4bit_use_double_quant=True,
    llm_int8_skip_modules=['lm_head', 'attn_pool.attn'])

tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_dir, device_map="auto",
                                             trust_remote_code=True, fp16=True,
                                             quantization_config=quantization_config).eval()
model.generation_config = GenerationConfig.from_pretrained(model_dir, trust_remote_code=True)

# Turn 1: describe an image ("这是什么" = "what is this?")
query = tokenizer.from_list_format([
    {'image': 'https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg'},
    {'text': '这是什么'},
])
response, history = model.chat(tokenizer, query=query, history=None)
print(response)

# Turn 2: ask for the dog's bounding box ("输出狗的检测框") and render it
response, history = model.chat(tokenizer, '输出狗的检测框', history=history)
print(response)
image = tokenizer.draw_bbox_on_latest_picture(response, history)
image.save('output_chat2.jpg')
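Note that from_list_format also accepts a local file path in place of the URL (per the model's README), which avoids the network fetch; the path below is just a hypothetical example:

# query a local image instead of fetching one over HTTP
query = tokenizer.from_list_format([
    {'image': '/home/yeqiang/Downloads/ai/demo.jpeg'},  # hypothetical local path
    {'text': '这是什么'},
])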
Running it errors out at startup:
2024-04-08 13:40:52,816 - modelscope - INFO - PyTorch version 2.2.2 Found.
2024-04-08 13:40:52,816 - modelscope - INFO - Loading ast index from /home/yeqiang/.cache/modelscope/ast_indexer
2024-04-08 13:40:52,840 - modelscope - INFO - Loading done! Current index file version is 1.13.3, with md5 1c4da6103bff77f1a134ac63a6cb75b9 and a total number of 972 components indexed
/home/yeqiang/Downloads/src/Qwen-VL/venv/lib/python3.10/site-packages/transformers/utils/generic.py:260: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
torch.utils._pytree._register_pytree_node(
/home/yeqiang/Downloads/src/Qwen-VL/venv/lib/python3.10/site-packages/transformers/utils/generic.py:260: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
torch.utils._pytree._register_pytree_node(
Traceback (most recent call last):
File "/home/yeqiang/Downloads/src/Qwen-VL/test.py", line 11, in <module>
quantization_config = BitsAndBytesConfig(
File "/home/yeqiang/Downloads/src/Qwen-VL/venv/lib/python3.10/site-packages/transformers/utils/quantization_config.py", line 212, in __init__
self.post_init()
File "/home/yeqiang/Downloads/src/Qwen-VL/venv/lib/python3.10/site-packages/transformers/utils/quantization_config.py", line 238, in post_init
if self.load_in_4bit and not version.parse(importlib.metadata.version("bitsandbytes")) >= version.parse(
File "/usr/lib/python3.10/importlib/metadata/__init__.py", line 996, in version
return distribution(distribution_name).version
File "/usr/lib/python3.10/importlib/metadata/__init__.py", line 969, in distribution
return Distribution.from_name(distribution_name)
File "/usr/lib/python3.10/importlib/metadata/__init__.py", line 548, in from_name
raise PackageNotFoundError(name)
importlib.metadata.PackageNotFoundError: No package metadata was found for bitsandbytes
The script's dependencies differ somewhat from the project's requirements.txt:
pip install modelscope -U
pip install transformers accelerate tiktoken -U
pip install einops transformers_stream_generator -U
pip install "pillow==9.*" -U
pip install torchvision
pip install matplotlib -U
Installed the missing pieces manually, following the documentation (the analysis below shows bitsandbytes 0.43.0 ended up in the environment).
The failure persists, now as a different error: a missing get_loading_attributes method, discussed below.
AI analysis:
Based on the terminal output you provided, the library versions installed in your current virtual environment are:

transformers == 4.39.3
bitsandbytes == 0.43.0

However, according to the earlier error message, the BitsAndBytesConfig object appears to lack a get_loading_attributes method, which caused an exception while loading the pretrained model. This likely means that the way your installed transformers version cooperates with bitsandbytes for quantized model loading has changed, or that some version of bitsandbytes introduced an incompatible change.

To fix this, you can try the following steps:

1. Consult the official transformers and bitsandbytes documentation for best-practice guidance on how the two work together and on quantization configuration.
2. Check the compatibility matrix between transformers and bitsandbytes to make sure your version pairing has been tested and validated.
3. Try upgrading or downgrading bitsandbytes or transformers to versions known to be compatible.
4. If none of that works, search the transformers GitHub page or community forums for similar issues, or report the problem to the maintainers.

In short, you need transformers and bitsandbytes to cooperate cleanly in order to avoid loading failures like the one caused by the missing get_loading_attributes method.
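Before chasing version combinations, it helps to print exactly what is installed; a trivial check (accelerate included since it also participates in quantized loading):

# print the package versions relevant to quantized model loading
import importlib.metadata as md

for pkg in ("transformers", "bitsandbytes", "accelerate"):
    try:
        print(pkg, md.version(pkg))
    except md.PackageNotFoundError:
        print(pkg, "not installed")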
Python dependency management here seems less robust than a typical Maven project's: version relationships like this really ought to be pinned down in requirements.txt.
Switching to the non-Int4 model:
model_dir = "/home/yeqiang/Downloads/ai/Qwen-VL-Chat"
This errors out:
Traceback (most recent call last):
File "/home/yeqiang/Downloads/src/Qwen-VL/test.py", line 19, in <module>
model = AutoModelForCausalLM.from_pretrained(model_dir, device_map="auto",
File "/home/yeqiang/Downloads/src/Qwen-VL/venv/lib/python3.10/site-packages/modelscope/utils/hf_util.py", line 113, in from_pretrained
module_obj = module_class.from_pretrained(model_dir, *model_args,
File "/home/yeqiang/Downloads/src/Qwen-VL/venv/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
return model_class.from_pretrained(
File "/home/yeqiang/Downloads/src/Qwen-VL/venv/lib/python3.10/site-packages/modelscope/utils/hf_util.py", line 76, in from_pretrained
return ori_from_pretrained(cls, model_dir, *model_args, **kwargs)
File "/home/yeqiang/Downloads/src/Qwen-VL/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3481, in from_pretrained
hf_quantizer.validate_environment(device_map=device_map)
File "/home/yeqiang/Downloads/src/Qwen-VL/venv/lib/python3.10/site-packages/transformers/quantizers/quantizer_bnb_4bit.py", line 86, in validate_environment
raise ValueError(
ValueError:
Some modules are dispatched on the CPU or the disk. Make sure you have enough GPU RAM to fit the
quantized model. If you want to dispatch the model on the CPU or the disk while keeping these modules
in 32-bit, you need to set `load_in_8bit_fp32_cpu_offload=True` and pass a custom `device_map` to
`from_pretrained`. Check
https://huggingface.co/docs/transformers/main/en/main_classes/quantization#offload-between-cpu-and-gpu
for more details.
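I moved to a bigger card instead (below), but for reference, the workaround the message describes looks roughly like the sketch that follows. Untested here, and two caveats: in the transformers version used in this post the flag appears to be spelled llm_int8_enable_fp32_cpu_offload on BitsAndBytesConfig (the name in the message is older), and CPU-offloaded layers will be slow.

# Sketch of the CPU-offload workaround suggested by the error above.
import torch
from transformers import BitsAndBytesConfig
from modelscope import AutoModelForCausalLM

model_dir = "/home/yeqiang/Downloads/ai/Qwen-VL-Chat"
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    llm_int8_enable_fp32_cpu_offload=True,  # keep CPU-offloaded modules in fp32
)
model = AutoModelForCausalLM.from_pretrained(
    model_dir,
    device_map="auto",  # or a hand-written module -> device map
    trust_remote_code=True,
    quantization_config=quantization_config,
).eval()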
4060 Ti 16G
Following the hints, install the required Python packages:
pip install optimum
pip install auto-gptq
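These are needed because the Int4 checkpoint is GPTQ-quantized: optimum provides the GPTQ integration for transformers, and auto-gptq supplies the kernels. With them installed, the checkpoint loads directly, without the BitsAndBytesConfig from the earlier script; a minimal sketch following the model card's usage:

# Load the GPTQ Int4 checkpoint directly; the quantization is baked
# into the weights, so no BitsAndBytesConfig is needed.
from modelscope import AutoModelForCausalLM, AutoTokenizer

model_dir = "/home/yeqiang/Downloads/ai/Qwen-VL-Chat-Int4"
tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_dir, device_map="cuda", trust_remote_code=True
).eval()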
Startup succeeds in about 10 seconds.
GPU status:
Accessing the web UI:
Testing a chat: responses come back instantly, with no waiting (but the answers don't match the questions!).
The online version for comparison (this is the correct content; the online version has become noticeably slower recently):
Testing image recognition:
A complex CAPTCHA like this cannot be recognized.
Essay writing (far too weak):
The online version for comparison:
Overall assessment:
This Int4 model cannot produce much useful content, so its practical value is limited. You need more VRAM running a larger model to get high-quality output.