ollama
https://github.com/ollama/ollama
open-webui
https://github.com/open-webui/open-webui
Deployment steps:
1. **One-command Docker install of open-webui**
docker run -d -p 3000:8080 --gpus=all -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:ollama
docker exec -it open-webui bash
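On the host, you can confirm the container came up cleanly before working inside it (a quick sketch, not required by the steps above; the UI should be reachable at http://localhost:3000):
docker ps --filter name=open-webui   # the container should show a status of "Up"
docker logs open-webui --tail 20     # peek at the startup logs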
2. **Download the ollama repository**
git clone https://github.com/ollama/ollama.git
cd ollama
3. **Fetch the llama.cpp submodule**
git submodule init
git submodule update llm/llama.cpp
4. **Create a virtual environment and install dependencies**
python -m venv llm/llama.cpp/.venv
source llm/llama.cpp/.venv/bin/activate
pip install -r llm/llama.cpp/requirements.txt
5. **Build the quantization tool**
apt update
apt install make
make -C llm/llama.cpp quantize
6. **Convert the model format (if the model is in .safetensors format)**
(replace ./model with your local model directory)
python llm/llama.cpp/convert-hf-to-gguf.py ./model --outtype f16 --outfile converted.bin
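Depending on the llama.cpp revision the submodule checks out, the converter may also accept a quantized output type directly, which skips the separate quantize step below for that format (a hedged sketch; converted-q8.bin is a hypothetical output name):
python llm/llama.cpp/convert-hf-to-gguf.py ./model --outtype q8_0 --outfile converted-q8.bin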
7. **Quantize the model**
(optional)
llm/llama.cpp/quantize converted.bin quantized.bin q4_0
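q4_0 is only one of the supported type names; the exact list depends on the llama.cpp revision, but variants like these are common (a sketch; the output file names are arbitrary):
llm/llama.cpp/quantize converted.bin quantized-q5_0.bin q5_0
llm/llama.cpp/quantize converted.bin quantized-q4_k_m.bin q4_K_M
Smaller types save memory at some cost in quality; q4_K_M is a common middle ground.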
8. **Create a new Modelfile**
apt install vim
vim xxx.Modelfile
**Modelfile contents**
# replace llama3 with the path to your converted model; in the simplest case this single FROM line is enough
FROM llama3
# sets the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1
# sets the context window size to 4096; this controls how many tokens the LLM can use as context to generate the next token
PARAMETER num_ctx 4096
# sets a custom system message to specify the behavior of the chat assistant
SYSTEM You are Mario from super mario bros, acting as an assistant.
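To register the model converted and quantized above rather than a stock llama3, point FROM at the local file. A minimal sketch (quantized.bin is the output name from step 7):
# FROM accepts a local model file path as well as a model name
FROM ./quantized.bin
PARAMETER temperature 1
SYSTEM You are a helpful assistant.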
9. **Create and run the model**
ollama create xxx -f xxx.Modelfile
ollama list
ollama run xxx
10. After running the above commands inside the open-webui Docker container, open the web UI; the custom model you created will show up in the model list.
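Besides checking in the web UI, you can query the model directly over ollama's HTTP API from inside the container (a sketch; assumes the default ollama port 11434 and the model name xxx created in step 9):
curl http://localhost:11434/api/generate -d '{"model": "xxx", "prompt": "Hello", "stream": false}'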
Configure intranet penetration (frp) for public-network access
1. Server-side configuration
Prerequisite: an Alibaba Cloud server with port 7000 opened in the firewall/security group (search online for a tutorial on setting this up).
wget https://github.com/fatedier/frp/releases/download/v0.48.0/frp_0.48.0_linux_amd64.tar.gz
tar -xzf frp_0.48.0_linux_amd64.tar.gz
cd frp_0.48.0_linux_amd64
Add execute permission:
chmod +x frps
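The start command below references ./frps.ini; the frp 0.48.0 release archive includes a sample one you can edit. A minimal sketch that matches the client config further down (the token value xxxx is a placeholder and is optional, but if set it must match on both sides):
frps.ini:
[common]
bind_port = 7000
token = xxxx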
Start command:
nohup ./frps -c ./frps.ini &
2. Client-side configuration
wget https://github.com/fatedier/frp/releases/download/v0.48.0/frp_0.48.0_linux_amd64.tar.gz
tar -xzf frp_0.48.0_linux_amd64.tar.gz
cd frp_0.48.0_linux_amd64
vim frpc.ini
frpc.ini:
[common]
server_addr = xxx.xxx.xxx.xxx  # your Alibaba Cloud public IP
server_port = 7000
# optional shared secret for authenticating the client to the server; must match the server's token (frp 0.48.0 uses the key name token; privilege_token is the legacy name)
token = xxxx
[ssh1]
type = tcp
local_ip = 127.0.0.1  # the client machine's IP; 127.0.0.1 works when frpc runs on the same machine as the web app
local_port = 3000  # the web app's port
remote_port = 3000
Save and exit with :wq, then add execute permission:
chmod +x frpc
Start command:
nohup ./frpc -c ./frpc.ini &
Access: Alibaba Cloud public IP + port to reach the local app from the public network.
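A quick way to confirm the tunnel end to end from any machine (a sketch; substitute the real public IP):
curl -s -o /dev/null -w "%{http_code}\n" http://xxx.xxx.xxx.xxx:3000
A 200 here means frps, frpc, and the web app are all wired up correctly.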