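surya-ocr is an open-source OCR model. It is free for personal use, while commercial use requires a license; the pricing is somewhat complicated, so check the official website for details. Its main features are:
- OCR in more than 90 languages
- Line-level text detection in any language
- Layout analysis (detection of tables, images, headers, etc.)
- Reading order detection
- Table recognition (detecting rows/columns)
OCR text recognition
First, install the dependency: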
!pip install surya-ocr
Recognize the text:
from PIL import Image
from surya.ocr import run_ocr
from surya.model.detection.model import load_model as load_det_model, load_processor as load_det_processor
from surya.model.recognition.model import load_model as load_rec_model
from surya.model.recognition.processor import load_processor as load_rec_processor

image = Image.open("slide_19.jpg")
langs = ["zh"] # Replace with your languages - optional but recommended
det_processor, det_model = load_det_processor(), load_det_model()
rec_model, rec_processor = load_rec_model(), load_rec_processor()

predictions = run_ocr([image], [langs], det_model, det_processor, rec_model, rec_processor)
The recognition result contains both the text and its position information.
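To make the "text plus position" output concrete, here is a minimal sketch of walking one result. It assumes, based on the surya README, that each prediction exposes `text_lines` and that each line has `text`, `bbox` (x1, y1, x2, y2) and `confidence`; these attribute names may differ between surya versions. The helper `summarize_lines` and the `SimpleNamespace` stand-ins are hypothetical, used only so the snippet runs without loading the models.

```python
from types import SimpleNamespace


def summarize_lines(prediction, min_confidence=0.0):
    """Return (text, bbox) pairs for one page, ordered top to bottom.

    Assumes `prediction.text_lines` holds objects with `text`, `bbox`
    and `confidence` attributes (an assumption; check your surya version).
    """
    rows = []
    for line in prediction.text_lines:
        if line.confidence is not None and line.confidence < min_confidence:
            continue  # skip low-confidence lines
        rows.append((line.text, tuple(line.bbox)))
    # Sort by the top edge (y1), then by the left edge (x1).
    rows.sort(key=lambda item: (item[1][1], item[1][0]))
    return rows


# Hypothetical stand-in for one element of `predictions`.
fake_prediction = SimpleNamespace(
    text_lines=[
        SimpleNamespace(text="second line", bbox=[10, 60, 200, 90], confidence=0.97),
        SimpleNamespace(text="first line", bbox=[10, 20, 180, 50], confidence=0.99),
        SimpleNamespace(text="noise", bbox=[5, 100, 30, 110], confidence=0.20),
    ]
)

for text, bbox in summarize_lines(fake_prediction, min_confidence=0.5):
    print(bbox, text)
```

With real output, `summarize_lines(predictions[0])` would be the equivalent call, provided the attribute names match your installed version.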
Layout detection
from PIL import Image
from surya.detection import batch_text_detection
from surya.layout import batch_layout_detection
from surya.model.detection.model import load_model, load_processor
from surya.settings import settings

image = Image.open("a.jpg")
model = load_model(checkpoint=settings.LAYOUT_MODEL_CHECKPOINT)
processor = load_processor(checkpoint=settings.LAYOUT_MODEL_CHECKPOINT)
det_model = load_model()
det_processor = load_processor()

# layout_predictions is a list of dicts, one per image
line_predictions = batch_text_detection([image], det_model, det_processor)
layout_predictions = batch_layout_detection([image], model, processor, line_predictions)
The image is a simple table.
As you can see, layout detection correctly identifies the corresponding regions.
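One way to inspect the layout result is to group the detected regions by label. The sketch below assumes, following the surya README, that each layout prediction exposes `bboxes` and that each box has `label` and `bbox` attributes; the names may vary by version. The helper `group_by_label` and the stand-in data are hypothetical and only there so the snippet runs on its own.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_label(layout_prediction):
    """Map each layout label to the list of its bounding boxes.

    Assumes `layout_prediction.bboxes` holds objects with `label` and
    `bbox` attributes (an assumption; check your surya version).
    """
    grouped = defaultdict(list)
    for box in layout_prediction.bboxes:
        grouped[box.label].append(tuple(box.bbox))
    return dict(grouped)


# Hypothetical stand-in for one element of `layout_predictions`.
fake_layout = SimpleNamespace(
    bboxes=[
        SimpleNamespace(label="Title", bbox=[40, 20, 560, 60]),
        SimpleNamespace(label="Table", bbox=[40, 80, 560, 400]),
        SimpleNamespace(label="Text", bbox=[40, 420, 560, 460]),
        SimpleNamespace(label="Text", bbox=[40, 470, 560, 510]),
    ]
)

for label, boxes in group_by_label(fake_layout).items():
    print(label, len(boxes), boxes)
```

For a page like the simple table above, a "Table" entry in the grouped result is the quickest check that the region was found.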
Table recognition
Tables are recognized with a command-line tool, and the recognized table information is saved to a file.
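Once the table information is on disk, it can be turned back into rows of text. The following is only a sketch: the schema of the saved file is not shown in this post and depends on the surya version, so the record shape used here (a list of cell dicts with zero-based `row_id` and `col_id` plus `text`) is an assumption, and `cells_to_grid` is a hypothetical helper. Inspect the generated file before relying on these keys.

```python
import json


def cells_to_grid(cells):
    """Arrange cell records into a list of rows of text.

    Each record is assumed to be a dict with zero-based `row_id` and
    `col_id` keys and a `text` key (hypothetical; inspect the file
    your surya version writes).
    """
    if not cells:
        return []
    n_rows = max(c["row_id"] for c in cells) + 1
    n_cols = max(c["col_id"] for c in cells) + 1
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for c in cells:
        grid[c["row_id"]][c["col_id"]] = c.get("text") or ""
    return grid


# Hypothetical sample in the assumed shape; with a real run, load the
# saved file with json.load instead of parsing this inline string.
sample = json.loads("""
[
  {"row_id": 0, "col_id": 0, "text": "Name"},
  {"row_id": 0, "col_id": 1, "text": "Score"},
  {"row_id": 1, "col_id": 0, "text": "Alice"},
  {"row_id": 1, "col_id": 1, "text": "92"}
]
""")

for row in cells_to_grid(sample):
    print(" | ".join(row))
```

Missing cells are left as empty strings, so ragged tables still produce a rectangular grid.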
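Summary
surya-ocr recognizes text well. Its results are somewhat better than the GOT results I looked at a couple of days ago, and they are on par with PaddleOCR.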