欢迎来到尧图网

客户服务 关于我们

您的位置:首页 > 房产 > 家装 > 昇腾910B部署Qwen2-7B-Instruct进行流式输出【pytorch框架】NPU推理

昇腾910B部署Qwen2-7B-Instruct进行流式输出【pytorch框架】NPU推理

2024/10/26 11:15:44 来源:https://blog.csdn.net/weixin_46398647/article/details/140214117  浏览:    关键词:昇腾910B部署Qwen2-7B-Instruct进行流式输出【pytorch框架】NPU推理

目录

  • 前情提要
    • torch_npu框架
    • mindsport框架
    • mindnlp框架
  • 下载模型
    • 国外
    • 国内
  • 环境设置
  • 代码适配(非流式)
    • Main
    • Branch
    • 结果展示
  • 代码适配(流式)

前情提要

torch_npu框架

官方未适配
在这里插入图片描述

mindsport框架

官方未适配
在这里插入图片描述

mindnlp框架

官方适配了,但是速度非常非常慢,10秒一个字
在这里插入图片描述

下载模型

国外

Hugging FaceHugging Face

国内

在这里插入图片描述modelscope

环境设置

pip install transformers==4.39.2
pip3 install torch==2.1.0
pip3 install torch-npu==2.1.0.post4
pip3 install accelerate==0.24.1
pip3 install transformers-stream-generator==0.0.5

代码适配(非流式)

Main

import torch
import torch_npu
import os
import platform
torch_device = "npu:1" # 0~7
torch.npu.set_device(torch.device(torch_device))
torch.npu.set_compile_mode(jit_compile=False)
option = {}
option["NPU_FUZZY_COMPILE_BLACKLIST"] = "Tril"
torch.npu.set_option(option)
from transformers import AutoModelForCausalLM, AutoTokenizer
# device = "cuda" # the device to load the model onto
DEFAULT_CKPT_PATH = '/root/.cache/modelscope/hub/qwen/Qwen2-7B-Instruct'
model = AutoModelForCausalLM.from_pretrained(DEFAULT_CKPT_PATH,torch_dtype=torch.float16,device_map=torch_device
).npu().eval()
tokenizer = AutoTokenizer.from_pretrained(DEFAULT_CKPT_PATH)
while True:prompt = input("user:")if prompt == "exit":breakmessages = [{"role": "system", "content": "You are a helpful assistant."},{"role": "user", "content": prompt}]text = tokenizer.apply_chat_template(messages,tokenize=False,add_generation_prompt=True)model_inputs = tokenizer([text], return_tensors="pt").to(torch_device)generated_ids = model.generate(model_inputs.input_ids,max_new_tokens=512)generated_ids = [output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)]response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]print("Qwen2-7B-Instruct:",response)

Branch

找到自己虚拟环境

which python

我的是/root/anaconda3/envs/sakura/bin/python
找到/lib/python3.9/site-packages/transformers/generation/utils.py示例:

/root/anaconda3/envs/sakura/lib/python3.9/site-packages/transformers/generation/utils.py

找到第2708行,注释掉2708行~2712行
在2709行添加

next_token_scores = outputs.logits[:, -1, :]

示例:
在这里插入图片描述
出错就是在这里,如果进行了pre-process distribution,就会报错

/root/anaconda3/envs/sakura/lib/python3.9/site-packages/transformers/generation/logits_process.py:455: UserWarning: AutoNonVariableTypeMode is deprecated and will be removed in 1.10 release. For kernel implementations please use AutoDispatchBelowADInplaceOrView instead, If you are looking for a user facing API to enable running your inference-only workload, please use c10::InferenceMode. Using AutoDispatchBelowADInplaceOrView in user code is under risk of producing silent wrong result in some edge cases. See Note [AutoDispatchBelowAutograd] for more details. (Triggered internally at build/CMakeFiles/torch_npu.dir/compiler_depend.ts:74.)sorted_indices_to_remove[..., -self.min_tokens_to_keep :] = 0
Traceback (most recent call last):File "/root/Qwen_test.py", line 63, in <module>generated_ids = model.generate(File "/root/anaconda3/envs/sakura/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_contextreturn func(*args, **kwargs)File "/root/anaconda3/envs/sakura/lib/python3.9/site-packages/transformers/generation/utils.py", line 1576, in generateresult = self._sample(File "/root/anaconda3/envs/sakura/lib/python3.9/site-packages/transformers/generation/utils.py", line 2736, in _samplenext_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
RuntimeError: Sync:build/CMakeFiles/torch_npu.dir/compiler_depend.ts:158 NPU error, error code is 507018
[Error]: The aicpu execution is abnormal.Rectify the fault based on the error information in the ascend log.
E39999: Inner Error!
E39999: 2024-07-02-14:14:50.735.070  An exception occurred during AICPU execution, stream_id:23, task_id:2750, errcode:21008, msg:inner error[FUNC:ProcessAicpuErrorInfo][FILE:device_error_proc.cc][LINE:730]TraceBack (most recent call last):rtStreamSynchronizeWithTimeout execute failed, reason=[aicpu exception][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:53]synchronize stream failed, runtime result = 507018[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]DEVICE[1] PID[864803]:
EXCEPTION TASK:Exception info:TGID=864803, model id=65535, stream id=23, stream phase=SCHEDULE, task id=2750, task type=aicpu kernel, recently received task id=2750, recently send task id=2749, task phase=RUNMessage info[0]:aicpu=0,slot_id=0,report_mailbox_flag=0x5a5a5a5a,state=0x5210Other info[0]:time=2024-07-02-14:14:50.091.974, function=proc_aicpu_task_done, line=970, error code=0x2a
[W compiler_depend.ts:368] Warning: NPU warning, error code is 507018[Error]:
[Error]: The aicpu execution is abnormal.Rectify the fault based on the error information in the ascend log.
EZ9999: Inner Error!
EZ9999: 2024-07-02-14:14:50.743.702  Kernel task happen error, retCode=0x2a, [aicpu exception].[FUNC:PreCheckTaskErr][FILE:task_info.cc][LINE:1776]TraceBack (most recent call last):Aicpu kernel execute failed, device_id=1, stream_id=23, task_id=2750, errorCode=2a.[FUNC:PrintAicpuErrorInfo][FILE:task_info.cc][LINE:1579]Aicpu kernel execute failed, device_id=1, stream_id=23, task_id=2750, fault op_name=[FUNC:GetError][FILE:stream.cc][LINE:1512]rtDeviceSynchronize execute failed, reason=[aicpu exception][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:53]wait for compute device to finish failed, runtime result = 507018.[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161](function npuSynchronizeDevice)
[W compiler_depend.ts:368] Warning: NPU warning, error code is 507018[Error]:
[Error]: The aicpu execution is abnormal.Rectify the fault based on the error information in the ascend log.
EH9999: Inner Error!rtDeviceSynchronize execute failed, reason=[aicpu exception][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:53]
EH9999: 2024-07-02-14:14:50.745.695  wait for compute device to finish failed, runtime result = 507018.[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]TraceBack (most recent call last):(function npuSynchronizeDevice)
[W compiler_depend.ts:368] Warning: NPU warning, error code is 507018[Error]:
[Error]: The aicpu execution is abnormal.Rectify the fault based on the error information in the ascend log.
EH9999: Inner Error!rtDeviceSynchronize execute failed, reason=[aicpu exception][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:53]
EH9999: 2024-07-02-14:14:50.747.300  wait for compute device to finish failed, runtime result = 507018.[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]TraceBack (most recent call last):(function npuSynchronizeDevice)
[W compiler_depend.ts:368] Warning: NPU warning, error code is 507018[Error]:
[Error]: The aicpu execution is abnormal.Rectify the fault based on the error information in the ascend log.
EH9999: Inner Error!rtDeviceSynchronize execute failed, reason=[aicpu exception][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:53]
EH9999: 2024-07-02-14:14:50.814.377  wait for compute device to finish failed, runtime result = 507018.[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]TraceBack (most recent call last):(function npuSynchronizeDevice)
[W compiler_depend.ts:368] Warning: NPU warning, error code is 507018[Error]:
[Error]: The aicpu execution is abnormal.Rectify the fault based on the error information in the ascend log.
EH9999: Inner Error!rtDeviceSynchronize execute failed, reason=[aicpu exception][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:53]
EH9999: 2024-07-02-14:14:50.816.023  wait for compute device to finish failed, runtime result = 507018.[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]TraceBack (most recent call last):(function npuSynchronizeDevice)
[W compiler_depend.ts:368] Warning: NPU warning, error code is 507018[Error]:
[Error]: The aicpu execution is abnormal.Rectify the fault based on the error information in the ascend log.
EH9999: Inner Error!rtDeviceSynchronize execute failed, reason=[aicpu exception][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:53]
EH9999: 2024-07-02-14:14:50.817.628  wait for compute device to finish failed, runtime result = 507018.[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]TraceBack (most recent call last):(function npuSynchronizeDevice)
[W compiler_depend.ts:368] Warning: NPU warning, error code is 507018[Error]:
[Error]: The aicpu execution is abnormal.Rectify the fault based on the error information in the ascend log.
EH9999: Inner Error!rtDeviceSynchronize execute failed, reason=[aicpu exception][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:53]
EH9999: 2024-07-02-14:14:50.819.236  wait for compute device to finish failed, runtime result = 507018.[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]TraceBack (most recent call last):(function npuSynchronizeDevice)
[W compiler_depend.ts:368] Warning: NPU warning, error code is 507018[Error]:
[Error]: The aicpu execution is abnormal.Rectify the fault based on the error information in the ascend log.
EH9999: Inner Error!rtDeviceSynchronize execute failed, reason=[aicpu exception][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:53]
EH9999: 2024-07-02-14:14:50.820.843  wait for compute device to finish failed, runtime result = 507018.[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]TraceBack (most recent call last):(function npuSynchronizeDevice)
[W compiler_depend.ts:368] Warning: NPU warning, error code is 507018[Error]:
[Error]: The aicpu execution is abnormal.Rectify the fault based on the error information in the ascend log.
EH9999: Inner Error!rtDeviceSynchronize execute failed, reason=[aicpu exception][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:53]
EH9999: 2024-07-02-14:14:50.822.422  wait for compute device to finish failed, runtime result = 507018.[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]TraceBack (most recent call last):(function npuSynchronizeDevice)

结果展示

最后运行Main文件
在这里插入图片描述

代码适配(流式)

未完待续

版权声明:

本网仅为发布的内容提供存储空间,不对发表、转载的内容提供任何形式的保证。凡本网注明“来源:XXX网络”的作品,均转载自其它媒体,著作权归作者所有,商业转载请联系作者获得授权,非商业转载请注明出处。

我们尊重并感谢每一位作者,均已注明文章来源和作者。如因作品内容、版权或其它问题,请及时与我们联系,联系邮箱:809451989@qq.com,投稿邮箱:809451989@qq.com