Configuration
Atlas 800I A2 (910B4, 8*32G)
OS: openEuler 22.03-LTS
Driver: 24.rc3
Reference article: https://www.hiascend.com/developer/ascendhub/detail/07a016975cc341f3a5ae131f2b52399d
Prepare the models and the model-enablement image:
mindie_docker_images/800IA2-mis-tei-6.0.RC3.tar
embedding-rerank-models
docker load -i mindie_docker_images/800IA2-mis-tei-6.0.RC3.tar   # load the image
tar -xvf BAAI.tar   # extract into the directory of your choice; I use /www/down
This deployment uses bge-large-zh-v1.5 and bge-reranker-large.
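A quick sanity check of the extracted directory can save trouble later. The layout below is my assumption of what BAAI.tar contains; the BAAI/... paths must match the model arguments passed to the containers in the next step:
ls /www/down/BAAI
# expected (assumption): bge-large-zh-v1.5  bge-reranker-large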
Deployment
I create a separate container for each model; the last three arguments passed to start.sh are the model path under the mounted model directory, the listen address, and the listen port:
docker run -u root -e ASCEND_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 -itd --name=tei-reranker --net=host \
-e HOME=/home/HwHiAiUser \
--privileged=true \
-v /www/down/:/home/HwHiAiUser/model \
-v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
-v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
--entrypoint /home/HwHiAiUser/start.sh \
swr.cn-south-1.myhuaweicloud.com/ascendhub/mis-tei:6.0.0-800I-A2-aarch64 \
BAAI/bge-reranker-large 127.0.0.1 8085
docker run -u root -e ASCEND_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 -itd --name=tei-large --net=host \
-e HOME=/home/HwHiAiUser \
--privileged=true \
-v /www/down/:/home/HwHiAiUser/model \
-v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
-v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
--entrypoint /home/HwHiAiUser/start.sh \
swr.cn-south-1.myhuaweicloud.com/ascendhub/mis-tei:6.0.0-800I-A2-aarch64 \
BAAI/bge-large-zh-v1.5 127.0.0.1 8086
Run docker logs <container name or ID> to check whether startup succeeded; once ready appears at the end of the log, the service is up.
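Concretely, using the container names set via --name above (the exact wording of the ready line may vary with the image version):
docker ps --filter name=tei-        # both containers should show as Up
docker logs -f tei-reranker         # follow the log until the ready line appears
docker logs -f tei-large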
Testing the APIs
Rerank model test:
curl 127.0.0.1:8085/rerank \
    -X POST \
    -d '{"query":"What is Deep Learning?", "texts": ["Deep Learning is not...", "Deep learning is..."]}' \
    -H 'Content-Type: application/json'
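A healthy service returns one relevance score per input text. The field names below follow the standard text-embeddings-inference rerank response and are my assumption for this image; the actual scores will differ:
# [{"index":1,"score":0.98},{"index":0,"score":0.02}]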
Embedding model test:
curl 127.0.0.1:8086/embed \
    -X POST \
    -d '{"inputs":"What is Deep Learning?"}' \
    -H 'Content-Type: application/json'
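The embed endpoint should return one vector per input; bge-large-zh-v1.5 produces 1024-dimensional embeddings, so the (truncated) response looks roughly like:
# [[0.0123, -0.0456, ..., 0.0789]]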
Connecting to Dify
bge-reranker-large model API: http://IP:8085/rerank, and the API key can be anything
bge-large-zh-v1.5 model API: http://IP:8086/embed, and the API key can be anything
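Before adding the models in Dify, it helps to confirm both endpoints are reachable from the Dify host (replace IP with the server address; if Dify runs on the same machine, 127.0.0.1 works too). These are the same requests as the tests above, just sent over the network:
curl http://IP:8086/embed -X POST -d '{"inputs":"connectivity test"}' -H 'Content-Type: application/json'
curl http://IP:8085/rerank -X POST -d '{"query":"test", "texts": ["a", "b"]}' -H 'Content-Type: application/json'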