Installing tesseract and fixing common errors

Error 1

pytesseract.pytesseract.TesseractNotFoundError: tesseract is not installed or it's not in your path

Solution

Download and install Tesseract-OCR, choosing the build that matches your system.

I downloaded the Windows build of tesseract.

In the pytesseract source (pytesseract.py), find the line

tesseract_cmd = 'tesseract'

and change it to the full path of the installed executable:

tesseract_cmd = r'D:\Program Files (x86)\Tesseract-OCR\tesseract.exe'
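The same setting can also be overridden at runtime, so the installed package does not have to be edited. The following is a minimal sketch, not the only way to do it: find_tesseract and DEFAULT_TESSERACT are hypothetical names introduced for illustration, and the fallback path is just the install location used above. The pytesseract import is guarded so the helper still runs when the package is missing.

```python
import os
import shutil

# Assumed install location (taken from the path above); adjust for your machine.
DEFAULT_TESSERACT = r"D:\Program Files (x86)\Tesseract-OCR\tesseract.exe"


def find_tesseract(fallback=DEFAULT_TESSERACT):
    """Return a usable path to the tesseract executable, or None."""
    # Prefer whatever is already reachable through PATH.
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Otherwise fall back to the explicit install location, if it exists.
    return fallback if os.path.isfile(fallback) else None


cmd = find_tesseract()

try:
    import pytesseract
except ImportError:
    pytesseract = None  # package not installed; nothing to configure

if pytesseract is not None and cmd is not None:
    # Override the module-level default ('tesseract') for this process only.
    pytesseract.pytesseract.tesseract_cmd = cmd
```

If cmd is None, tesseract is neither on PATH nor at the fallback location, which is the situation that produces TesseractNotFoundError.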

Error 2

E:\BuildFolder\tesseract-ocr\testing>tesseract-dlld.exe eurotext.tif eurotext
Error opening data file ./tessdata/eng.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory.
Failed loading language 'eng'
Tesseract couldn't load any languages!
Could not initialize tesseract.

Solution

  • Put the tessdata directory in the same directory as tesseract.exe
  • Add TESSDATA_PREFIX=D:\Program Files (x86)\Tesseract-OCR as an environment variable

    To test, set the environment variable temporarily in cmd:

set TESSDATA_PREFIX=D:\Program Files (x86)\Tesseract-OCR
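When tesseract is launched from Python, the variable can also be set for the current process before the first OCR call. This is a rough sketch that assumes the layout described in the error message, where TESSDATA_PREFIX is the parent of the tessdata directory. Some Tesseract versions may expect the tessdata directory itself, so check what your build reports. set_tessdata_prefix is a hypothetical helper, not part of any library.

```python
import os


def set_tessdata_prefix(prefix, lang="eng"):
    """Set TESSDATA_PREFIX for this process if the language file is present.

    Assumes the data file lives at <prefix>/tessdata/<lang>.traineddata.
    Returns True when the variable was set, False otherwise.
    """
    data_file = os.path.join(prefix, "tessdata", lang + ".traineddata")
    if not os.path.isfile(data_file):
        # Leave the environment untouched rather than pointing at a bad path.
        return False
    os.environ["TESSDATA_PREFIX"] = prefix
    return True


# Example call with the install directory used in this post.
ok = set_tessdata_prefix(r"D:\Program Files (x86)\Tesseract-OCR")
```

Child processes started afterwards inherit the variable, so it has to be set before tesseract is invoked. A False result means eng.traineddata is not where the prefix says it should be.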