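DeepSeek R1, Part 3: Building a Local AI Knowledge Base by Deploying the RAGFlow AI Hosting Platform in Docker
Table of Contents
- DeepSeek R1, Part 3: Building a Local AI Knowledge Base by Deploying the RAGFlow AI Hosting Platform in Docker
- 1. What Is RAGFlow
- 1. Key Features
- 1. **"Quality in, quality out"**
- 2. 🍱 **Template-based text chunking**
- 3. 🌱 **Grounded answers with minimal hallucination**
- 4. 🍔 **Compatibility with heterogeneous data sources**
- 5. 🛀 **A worry-free, automated RAG workflow**
- 2. System Architecture
- 2. GitHub Address
- 1. GitHub address
- 2. Quick start documentation
- 3. Deploying RAGFlow in Docker
- 1. Downloading the source code
- 2. Deploying RAGFlow with Docker
- 1. Extracting the source code
- 2. Starting the containers with docker-compose
- 3. Container instances after startup
- 4. Viewing the startup log with docker logs
- 3. Accessing RAGFlow
- 4. Usage
- 1. Registering an account
- 2. Entering the registration details
- 3. Creating the account and logging in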
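1. What Is RAGFlow
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine built on deep document understanding. It offers businesses of any size, as well as individuals, a streamlined RAG workflow that combines large language models (LLMs) to deliver reliable question answering with well-grounded citations over users' data in a wide range of complex formats.
1. Key Features
1. "Quality in, quality out"
- Built on deep document understanding, it extracts genuine insights from unstructured data in all kinds of complex formats.
- Quickly passes needle-in-a-haystack tests even with effectively unlimited context (tokens).
2. 🍱 Template-based text chunking
- Not just intelligent but, more importantly, controllable and explainable.
- A variety of chunking templates to choose from.
3. 🌱 Grounded answers with minimal hallucination
- The chunking process is visualized and can be adjusted manually.
- Grounded: answers come with snapshots of the key references and can be traced back to their sources.
4. 🍔 Compatibility with heterogeneous data sources
- Supports a rich set of file types, including Word documents, PPT, Excel spreadsheets, txt files, images, PDF, scanned copies, photocopies, structured data, web pages, and more.
5. 🛀 A worry-free, automated RAG workflow
- A fully optimized RAG workflow that supports ecosystems ranging from personal applications to very large enterprises.
- Both the large language model (LLM) and the embedding model are configurable.
- Based on multi-path recall with fused re-ranking.
- Provides easy-to-use APIs that integrate readily into enterprise systems.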
2. System Architecture
2. GitHub Address
1. GitHub address
https://github.com/infiniflow/ragflow
2. Quick start documentation
Quick start | RAGFlow: https://ragflow.io/docs/dev
3. Deploying RAGFlow in Docker
1. Downloading the source code
- Use git clone
git clone https://github.com/infiniflow/ragflow.git
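- Alternatively, download the source archive for a specific tag
Direct download by tag: https://codeload.github.com/infiniflow/ragflow/zip/refs/tags/v0.16.0
The downloaded source archive is named, for example, ragflow-0.16.0.zip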
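The tag-based link above follows a regular pattern, so the same link can be generated for other releases. The sketch below assumes that this pattern holds for every tag and that the archive is named after the project plus the tag without its leading "v", as is the case for v0.16.0 above. The helper names tag_zip_url, tag_zip_name, and download_tag are made up for illustration.

```python
import urllib.request
from pathlib import Path
from urllib.parse import quote

REPO = "infiniflow/ragflow"  # repository from the GitHub address above


def tag_zip_url(tag: str, repo: str = REPO) -> str:
    """Build the codeload URL for the source ZIP of a tag (pattern taken from the link above)."""
    return f"https://codeload.github.com/{repo}/zip/refs/tags/{quote(tag, safe='')}"


def tag_zip_name(tag: str, repo: str = REPO) -> str:
    """Guess the archive name: project name plus the tag without its leading 'v' (assumption)."""
    project = repo.rsplit("/", 1)[-1]
    version = tag[1:] if tag.startswith("v") else tag
    return f"{project}-{version}.zip"


def download_tag(tag: str, dest_dir: str = ".") -> Path:
    """Download the archive for `tag` into dest_dir and return the path of the saved file."""
    target = Path(dest_dir) / tag_zip_name(tag)
    urllib.request.urlretrieve(tag_zip_url(tag), str(target))
    return target
```

Under these assumptions, download_tag("v0.16.0") would save ragflow-0.16.0.zip into the current directory.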
Pulling the minio image with docker pull:
C:\Users\jinshengyuan>docker pull quay.io/minio/minio:RELEASE.2023-12-20T01-00-02Z
RELEASE.2023-12-20T01-00-02Z: Pulling from minio/minio
f72461870632: Pull complete
683391db8929: Pull complete
ba8b8055313f: Pull complete
a8e0787fb7ed: Pull complete
fd20cadb8d39: Pull complete
3738ac54d510: Pull complete
128c59a31db4: Pull complete
Digest: sha256:5702ea3614203466e8e6616469e460567dc0c82def5a024a90426b25ee4a4d23
Status: Downloaded newer image for quay.io/minio/minio:RELEASE.2023-12-20T01-00-02Z
quay.io/minio/minio:RELEASE.2023-12-20T01-00-02Z
What's next:
    View a summary of image vulnerabilities and recommendations → docker scout quickview quay.io/minio/minio:RELEASE.2023-12-20T01-00-02Z
C:\Users\jinshengyuan>
2. Deploying RAGFlow with Docker
1. Extracting the source code
Extract the ragflow-0.16.0.zip source archive to a location of your choice, e.g. D:\2025AICode\ragflow-0.16.0
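The extraction step can also be scripted. The following is a minimal sketch using Python's standard zipfile module; it assumes the archive wraps everything in a single top-level folder (ragflow-0.16.0 in this case), and the function name extract_source is hypothetical.

```python
import zipfile
from pathlib import Path


def extract_source(zip_path, dest_dir) -> Path:
    """Extract a source archive into dest_dir and return its single top-level folder."""
    dest = Path(dest_dir)
    with zipfile.ZipFile(zip_path) as archive:
        # Assumption: the archive wraps everything in one folder such as ragflow-0.16.0/
        top_levels = {name.split("/", 1)[0] for name in archive.namelist()}
        if len(top_levels) != 1:
            raise ValueError(f"expected one top-level folder, found {sorted(top_levels)}")
        archive.extractall(dest)
    return dest / top_levels.pop()
```

Extracting ragflow-0.16.0.zip into D:\2025AICode this way would produce the D:\2025AICode\ragflow-0.16.0 folder used in the following steps.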
2. Starting the containers with docker-compose
Go to the D:\2025AICode\ragflow-0.16.0 directory, open a terminal there, and run docker compose -f docker/docker-compose.yml up -d, as follows:
D:\2025AICode\ragflow-0.16.0>docker compose -f docker/docker-compose.yml up -d
time="2025-02-07T13:19:45+08:00" level=warning msg="The \"HF_ENDPOINT\" variable is not set. Defaulting to a blank string."
time="2025-02-07T13:19:45+08:00" level=warning msg="The \"MACOS\" variable is not set. Defaulting to a blank string."
[+] Running 10/10
 ✔ Network docker_ragflow Created 0.1s
 ✔ Volume "docker_mysql_data" Created 0.0s
 ✔ Volume "docker_minio_data" Created 0.0s
 ✔ Volume "docker_redis_data" Created 0.0s
 ✔ Volume "docker_esdata01" Created 0.0s
 ✔ Container ragflow-mysql Healthy 21.6s
 ✔ Container ragflow-minio Started 1.1s
 ✔ Container ragflow-es-01 Started 1.0s
 ✔ Container ragflow-redis Started 0.8s
 ✔ Container ragflow-server Started
Note: the first docker compose run can take a long time, because the Docker images that RAGFlow depends on have to be pulled first.
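For repeated deployments, the start-up command can be wrapped in a small script. This is only a sketch: it runs exactly the docker compose command shown above from the source root and first checks that the compose file is where the command expects it. The names compose_up_command and start_ragflow are illustrative.

```python
import subprocess
from pathlib import Path

# Path relative to the source root, as used in the command above
COMPOSE_FILE = "docker/docker-compose.yml"


def compose_up_command(compose_file: str = COMPOSE_FILE) -> list:
    """Return the argument list for the docker compose command used above."""
    return ["docker", "compose", "-f", compose_file, "up", "-d"]


def start_ragflow(source_dir) -> None:
    """Run docker compose from the source root, failing early if the compose file is missing."""
    root = Path(source_dir)
    if not (root / COMPOSE_FILE).is_file():
        raise FileNotFoundError(f"{COMPOSE_FILE} not found under {root}")
    # check=True raises CalledProcessError if docker exits with a non-zero status
    subprocess.run(compose_up_command(), cwd=root, check=True)
```

With the layout used here, the call would be start_ragflow(r"D:\2025AICode\ragflow-0.16.0").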
3. Container instances after startup
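The container states can also be checked from a script rather than from the Docker Desktop window. The sketch below parses the output of docker compose -f docker/docker-compose.yml ps --format json. It assumes that this output is either a JSON array or one JSON object per line (this appears to differ between Compose versions) and that each entry carries "Name" and "State" fields; verify both against your local Compose version. The expected container names are taken from the startup output above.

```python
import json

# Container names reported by docker compose in the startup output above
EXPECTED_CONTAINERS = (
    "ragflow-mysql",
    "ragflow-minio",
    "ragflow-es-01",
    "ragflow-redis",
    "ragflow-server",
)


def parse_compose_ps(output: str) -> dict:
    """Map container name to state from `docker compose ps --format json` output."""
    text = output.strip()
    if not text:
        return {}
    if text.startswith("["):
        entries = json.loads(text)  # a single JSON array
    else:
        # one JSON object per line
        entries = [json.loads(line) for line in text.splitlines() if line.strip()]
    return {entry["Name"]: entry.get("State", "unknown") for entry in entries}


def not_running(states: dict, expected=EXPECTED_CONTAINERS) -> list:
    """Return the expected containers that are missing or not in the 'running' state."""
    return [name for name in expected if states.get(name) != "running"]
```

An empty list from not_running means that all five containers are up.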
4. Viewing the startup log with docker logs
D:\2025AICode\ragflow-0.16.0>docker logs ragflow-server
2025-02-07 13:20:08,281 INFO 20 ragflow_server log path: /ragflow/logs/ragflow_server.log, log levels: {'peewee': 'WARNING', 'pdfminer': 'WARNING', 'root': 'INFO'}
2025-02-07 13:20:13,564 INFO 20 init database on cluster mode successfully
2025-02-07 13:20:23,052 INFO 20
[ASCII-art banner: RAGFlow]
2025-02-07 13:20:23,053 INFO 20 RAGFlow version: v0.16.0 slim
2025-02-07 13:20:23,054 INFO 20 project base: /ragflow
2025-02-07 13:20:23,055 INFO 20 Current configs, from /ragflow/conf/service_conf.yaml:
    ragflow: {'host': '0.0.0.0', 'http_port': 9380}
    mysql: {'name': 'rag_flow', 'user': 'root', 'password': '********', 'host': 'mysql', 'port': 3306, 'max_connections': 100, 'stale_timeout': 30}
    minio: {'user': 'rag_flow', 'password': '********', 'host': 'minio:9000'}
    es: {'hosts': 'http://es01:9200', 'username': 'elastic', 'password': '********'}
    infinity: {'uri': 'infinity:23817', 'db_name': 'default_db'}
    redis: {'db': 1, 'password': '********', 'host': 'redis:6379'}
2025-02-07 13:20:23,056 INFO 20 Use Elasticsearch http://es01:9200 as the doc engine.
2025-02-07 13:20:23,125 INFO 20 GET http://es01:9200/ [status:200 duration:0.068s]
2025-02-07 13:20:23,130 INFO 20 HEAD http://es01:9200/ [status:200 duration:0.004s]
2025-02-07 13:20:23,134 INFO 20 Elasticsearch http://es01:9200 is healthy.
2025-02-07 13:20:23,141 WARNING 20 Load term.freq FAIL!
2025-02-07 13:20:23,145 WARNING 20 Realtime synonym is disabled, since no redis connection.
2025-02-07 13:20:23,149 WARNING 20 Load term.freq FAIL!
2025-02-07 13:20:23,151 WARNING 20 Realtime synonym is disabled, since no redis connection.
2025-02-07 13:20:23,152 INFO 20 MAX_CONTENT_LENGTH: 134217728
2025-02-07 13:20:23,152 INFO 20 SERVER_QUEUE_MAX_LEN: 1024
2025-02-07 13:20:23,153 INFO 20 SERVER_QUEUE_RETENTION: 3600
2025-02-07 13:20:23,153 INFO 20 MAX_FILE_COUNT_PER_USER: 0
2025-02-07 13:20:26,486 INFO 19 task_executor_0 log path: /ragflow/logs/task_executor_0.log, log levels: {'peewee': 'WARNING', 'pdfminer': 'WARNING', 'root': 'INFO'}
2025-02-07 13:20:26,746 INFO 19 init database on cluster mode successfully
2025-02-07 13:20:32,827 INFO 19 TextRecognizer det uses CPU
2025-02-07 13:20:32,894 INFO 19 TextRecognizer rec uses CPU
2025-02-07 13:20:32,915 INFO 19
[ASCII-art banner: Task Executor]
2025-02-07 13:20:32,916 INFO 19 TaskExecutor: RAGFlow version: v0.16.0 slim
2025-02-07 13:20:32,916 INFO 19 Use Elasticsearch http://es01:9200 as the doc engine.
2025-02-07 13:20:32,923 INFO 19 GET http://es01:9200/ [status:200 duration:0.005s]
2025-02-07 13:20:32,926 INFO 19 HEAD http://es01:9200/ [status:200 duration:0.002s]
2025-02-07 13:20:32,927 INFO 19 Elasticsearch http://es01:9200 is healthy.
2025-02-07 13:20:32,932 WARNING 19 Load term.freq FAIL!
2025-02-07 13:20:32,935 WARNING 19 Realtime synonym is disabled, since no redis connection.
2025-02-07 13:20:32,938 WARNING 19 Load term.freq FAIL!
2025-02-07 13:20:32,940 WARNING 19 Realtime synonym is disabled, since no redis connection.
2025-02-07 13:20:32,941 INFO 19 MAX_CONTENT_LENGTH: 134217728
2025-02-07 13:20:32,941 INFO 19 SERVER_QUEUE_MAX_LEN: 1024
2025-02-07 13:20:32,942 INFO 19 SERVER_QUEUE_RETENTION: 3600
2025-02-07 13:20:32,942 INFO 19 MAX_FILE_COUNT_PER_USER: 0
2025-02-07 13:20:32,948 WARNING 19 RedisDB.queue_info rag_flow_svr_queue got exception: no such key
2025-02-07 13:20:32,949 INFO 19 task_consumer_0 reported heartbeat: {"name": "task_consumer_0", "now": "2025-02-07T13:20:32.948+08:00", "boot_at": "2025-02-07T13:20:32.915+08:00", "pending": 0, "lag": 0, "done": 0, "failed": 0, "current": null}
2025-02-07 13:20:38,972 INFO 20 init web data success:5.778308153152466
2025-02-07 13:20:38,973 INFO 20 RAGFlow HTTP server start...
2025-02-07 13:20:38,975 INFO 20 WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on all addresses (0.0.0.0)
 * Running on http://127.0.0.1:9380
 * Running on http://172.19.0.6:9380
2025-02-07 13:20:38,975 INFO 20 Press CTRL+C to quit
D:\2025AICode\ragflow-0.16.0>
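A log of this length is easier to judge with a small filter. The sketch below assumes the line layout seen above (timestamp, level, process id, message) and treats the message "RAGFlow HTTP server start" as the sign that the server has come up; both the regular expression and the function name summarize_log are illustrative, not part of RAGFlow.

```python
import re

# Matches lines such as "2025-02-07 13:20:23,141 WARNING 20 Load term.freq FAIL!"
LOG_LINE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+(?P<pid>\d+)\s*(?P<message>.*)$"
)
# Message that appears near the end of the log above
READY_MARKER = "RAGFlow HTTP server start"


def summarize_log(log_text: str) -> dict:
    """Collect distinct WARNING messages and report whether the ready marker was seen."""
    warnings = []
    ready = False
    for line in log_text.splitlines():
        match = LOG_LINE.match(line)
        if match is None:
            continue  # banner art and continuation lines have no timestamp prefix
        if match["level"] == "WARNING" and match["message"] not in warnings:
            warnings.append(match["message"])
        if READY_MARKER in match["message"]:
            ready = True
    return {"ready": ready, "warnings": warnings}
```

The input can be the text captured from docker logs ragflow-server. Depending on which stream the container wrote to, that text may arrive on stdout or stderr, so capturing both is the safer choice.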
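3. Accessing RAGFlow
After deployment, RAGFlow is served on port 80 by default. Open a browser and enter the address http://localhost, as shown in the figure below.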
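Since the containers need some time to finish starting, the page may not respond right away. A minimal sketch of a readiness check follows: it polls the address until it answers or a timeout expires. The timeout and interval values are arbitrary choices, and the names http_ok and wait_until_up are hypothetical.

```python
import http.client
import time
import urllib.error
import urllib.request


def http_ok(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with a 2xx or 3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeouts, and HTTP error statuses all count as "not ready"
        return False


def wait_until_up(url: str = "http://localhost", timeout: float = 120.0,
                  interval: float = 2.0, probe=http_ok) -> bool:
    """Poll the URL with `probe` until it succeeds or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        if probe(url):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The probe parameter exists so that the check can be swapped out, for example for a stricter test of the login page content.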
4. Usage
1. Registering an account
- Click Sign up to open the account creation page