[AI] Ubuntu 22.04: Running Qwen-VL Locally, a Large Language Model with Image Recognition (AI Vision) [3]: Qwen-VL-Chat-Int4 + 4060 Ti 16G

hkNaruto · 2024-07-07 16:31:04

Continued from the previous posts:

[AI] Ubuntu 22.04: Running Qwen-VL Locally, a Large Language Model with Image Recognition (AI Vision)

[AI] Ubuntu 22.04: Running Qwen-VL Locally, a Large Language Model with Image Recognition (AI Vision) [2]: Even a 4060 Ti 16G Can't Handle It

Download the Qwen-VL-Chat-Int4 model

```
cd ~/Downloads/ai
git lfs install
git clone https://www.modelscope.cn/qwen/Qwen-VL-Chat-Int4.git
```
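Alternatively, the same checkpoint can be fetched through ModelScope's Python SDK; a minimal sketch using `snapshot_download` (the model id follows the repo path above):

```python
# Hedged alternative to git lfs: download via the ModelScope SDK.
# snapshot_download caches the files and returns the local directory.
from modelscope import snapshot_download

model_dir = snapshot_download('qwen/Qwen-VL-Chat-Int4')
print(model_dir)
```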

This version of the model is quite a bit smaller.

The 2060 6G still cannot start the web demo.

Following the README.md shipped with the model, I tried writing code that loads it with quantization.

test.py:

```python
import os

os.environ['CUDA_VISIBLE_DEVICES'] = '0'

from modelscope import (
    AutoModelForCausalLM, AutoTokenizer, GenerationConfig,
)
from transformers import BitsAndBytesConfig
import torch

model_dir = "/home/yeqiang/Downloads/ai/Qwen-VL-Chat-Int4"

torch.manual_seed(1234)

# 4-bit NF4 quantization with double quantization; skip modules that are
# sensitive to quantization, per the model README.
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type='nf4',
    bnb_4bit_use_double_quant=True,
    llm_int8_skip_modules=['lm_head', 'attn_pool.attn'])

tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_dir, device_map="auto",
                                             trust_remote_code=True, fp16=True,
                                             quantization_config=quantization_config).eval()
model.generation_config = GenerationConfig.from_pretrained(model_dir, trust_remote_code=True)

# Round 1: image + question ("What is this?")
query = tokenizer.from_list_format([
    {'image': 'https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg'},
    {'text': '这是什么'},
])
response, history = model.chat(tokenizer, query=query, history=None)
print(response)

# Round 2: ask for the dog's bounding box ("Output the detection box for
# the dog") and draw it on the image.
response, history = model.chat(tokenizer, '输出狗的检测框', history=history)
print(response)
image = tokenizer.draw_bbox_on_latest_picture(response, history)
image.save('output_chat2.jpg')
```
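As a quick sanity check after loading, transformers models expose `get_memory_footprint()`, which reports how much memory the quantized weights occupy; a small sketch:

```python
# Rough memory check for the loaded model; useful when judging whether a
# given card (6 GB vs 16 GB) can hold the quantized weights at all.
print(f"model footprint: {model.get_memory_footprint() / 1024**3:.2f} GiB")
```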

Running it fails at startup:

```
2024-04-08 13:40:52,816 - modelscope - INFO - PyTorch version 2.2.2 Found.
2024-04-08 13:40:52,816 - modelscope - INFO - Loading ast index from /home/yeqiang/.cache/modelscope/ast_indexer
2024-04-08 13:40:52,840 - modelscope - INFO - Loading done! Current index file version is 1.13.3, with md5 1c4da6103bff77f1a134ac63a6cb75b9 and a total number of 972 components indexed
/home/yeqiang/Downloads/src/Qwen-VL/venv/lib/python3.10/site-packages/transformers/utils/generic.py:260: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  torch.utils._pytree._register_pytree_node(
Traceback (most recent call last):
  File "/home/yeqiang/Downloads/src/Qwen-VL/test.py", line 11, in <module>
    quantization_config = BitsAndBytesConfig(
  File "/home/yeqiang/Downloads/src/Qwen-VL/venv/lib/python3.10/site-packages/transformers/utils/quantization_config.py", line 212, in __init__
    self.post_init()
  File "/home/yeqiang/Downloads/src/Qwen-VL/venv/lib/python3.10/site-packages/transformers/utils/quantization_config.py", line 238, in post_init
    if self.load_in_4bit and not version.parse(importlib.metadata.version("bitsandbytes")) >= version.parse(
  File "/usr/lib/python3.10/importlib/metadata/__init__.py", line 996, in version
    return distribution(distribution_name).version
  File "/usr/lib/python3.10/importlib/metadata/__init__.py", line 969, in distribution
    return Distribution.from_name(distribution_name)
  File "/usr/lib/python3.10/importlib/metadata/__init__.py", line 548, in from_name
    raise PackageNotFoundError(name)
importlib.metadata.PackageNotFoundError: No package metadata was found for bitsandbytes
```

The dependencies this script needs differ somewhat from requirements.txt:

```
pip install modelscope -U
pip install transformers accelerate tiktoken -U
pip install einops transformers_stream_generator -U
pip install "pillow==9.*" -U
pip install torchvision
pip install matplotlib -U
```
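Note that bitsandbytes, the package the traceback complains about, is missing from this list; presumably it has to be installed separately (the analysis below shows that 0.43.0 did end up in the venv):

```
pip install bitsandbytes -U
```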

I installed the packages manually per the documentation, but the failure persists.

AI analysis:

Based on the terminal output you provided, the library versions installed in your current virtual environment are:

transformers 4.39.3, bitsandbytes 0.43.0

However, judging from the earlier error message, the BitsAndBytesConfig object appears to lack a get_loading_attributes method, which raises an exception when loading the pretrained model. This may mean that your installed transformers version has changed how it cooperates with bitsandbytes for quantized model loading, or that some bitsandbytes release introduced an incompatible change.

To fix this, you can try the following steps:

- Consult the official documentation for transformers and bitsandbytes for best-practice guidance on using them together and on quantization configuration.
- Check the compatibility matrix between transformers and bitsandbytes, and make sure your version pairing has been tested and verified.
- Try upgrading or downgrading bitsandbytes or transformers to versions known to be compatible.
- If none of this helps, search the transformers GitHub page or community forums for similar issues, or report the problem to the maintainers.

In short, transformers and bitsandbytes must work together seamlessly to avoid load failures like the missing get_loading_attributes method.

Dependency management in this Python project seems less stable than in a typical Maven project; version constraints like these really ought to be pinned in requirements.txt.
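For reporting or comparing environments, the actual versions in the venv can be dumped with the standard library alone (a small helper, nothing project-specific):

```python
# Print the installed versions of the packages involved in this setup.
import importlib.metadata as md

for pkg in ("transformers", "bitsandbytes", "accelerate", "optimum", "auto-gptq"):
    try:
        print(pkg, md.version(pkg))
    except md.PackageNotFoundError:
        print(pkg, "NOT INSTALLED")
```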

Not using the Int4 version:

```python
model_dir = "/home/yeqiang/Downloads/ai/Qwen-VL-Chat"
```

It fails with:

```
Traceback (most recent call last):
  File "/home/yeqiang/Downloads/src/Qwen-VL/test.py", line 19, in <module>
    model = AutoModelForCausalLM.from_pretrained(model_dir, device_map="auto",
  File "/home/yeqiang/Downloads/src/Qwen-VL/venv/lib/python3.10/site-packages/modelscope/utils/hf_util.py", line 113, in from_pretrained
    module_obj = module_class.from_pretrained(model_dir, *model_args,
  File "/home/yeqiang/Downloads/src/Qwen-VL/venv/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
    return model_class.from_pretrained(
  File "/home/yeqiang/Downloads/src/Qwen-VL/venv/lib/python3.10/site-packages/modelscope/utils/hf_util.py", line 76, in from_pretrained
    return ori_from_pretrained(cls, model_dir, *model_args, **kwargs)
  File "/home/yeqiang/Downloads/src/Qwen-VL/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3481, in from_pretrained
    hf_quantizer.validate_environment(device_map=device_map)
  File "/home/yeqiang/Downloads/src/Qwen-VL/venv/lib/python3.10/site-packages/transformers/quantizers/quantizer_bnb_4bit.py", line 86, in validate_environment
    raise ValueError(
ValueError:
    Some modules are dispatched on the CPU or the disk. Make sure you have enough GPU RAM to fit the
    quantized model. If you want to dispatch the model on the CPU or the disk while keeping these modules
    in 32-bit, you need to set `load_in_8bit_fp32_cpu_offload=True` and pass a custom `device_map` to
    `from_pretrained`. Check
    https://huggingface.co/docs/transformers/main/en/main_classes/quantization#offload-between-cpu-and-gpu
    for more details.
```
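The error itself points at a CPU-offload escape hatch. A sketch of that route, not tested here (in current transformers the config flag is `llm_int8_enable_fp32_cpu_offload`, and throughput with CPU offload is generally poor):

```python
# Sketch of the offload route the error message suggests (assumption: not
# verified on this machine). Modules kept in fp32 are placed on the CPU.
import torch
from modelscope import AutoModelForCausalLM
from transformers import BitsAndBytesConfig

model_dir = "/home/yeqiang/Downloads/ai/Qwen-VL-Chat"
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    llm_int8_enable_fp32_cpu_offload=True,  # allow fp32 modules on CPU
)
model = AutoModelForCausalLM.from_pretrained(
    model_dir,
    device_map="auto",  # accelerate splits layers across GPU and CPU RAM
    trust_remote_code=True,
    quantization_config=quantization_config,
).eval()
```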

Back on the 4060 Ti 16G. Following the error hint, install the Python packages:

```
pip install optimum
pip install auto-gptq
```
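With optimum and auto-gptq in place, the GPTQ checkpoint no longer needs a hand-written BitsAndBytesConfig; per the model README, loading reduces to a plain from_pretrained (a sketch; the quantization settings are picked up from the checkpoint's own config):

```python
# The Int4 checkpoint is GPTQ-quantized; auto-gptq handles dequantization,
# so no BitsAndBytesConfig is passed here.
from modelscope import AutoModelForCausalLM, AutoTokenizer

model_dir = "/home/yeqiang/Downloads/ai/Qwen-VL-Chat-Int4"
tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_dir, device_map="auto", trust_remote_code=True
).eval()
```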

It starts successfully in about 10 seconds.

GPU status:

Accessing the web UI:

Testing chat: responses come back almost instantly, with no waiting (but the answers don't match the questions!).

Compare the online version (which does answer correctly; the online service has become noticeably slower recently):

Testing image recognition:

It cannot recognize complex CAPTCHAs like this one.

Essay writing (far too weak):

Compare the online version:

Overall assessment:

This Int4 model just can't produce much substantive content, so its practical value is limited. Producing high-quality output takes more VRAM and a larger model.


