Baidu OCR (text recognition) documentation:
https://ai.baidu.com/docs#/OCR-Python-SDK/top
Install the SDK:
pip install baidu-aip
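First create an application to obtain its App ID (the code below also needs the application's API Key and Secret Key).
The table image to recognize:
Code example: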
from aip import AipOcr

# Your APPID, AK (API Key) and SK (Secret Key)
APP_ID = 'your App ID'
API_KEY = 'your Api Key'
SECRET_KEY = 'your Secret Key'

client = AipOcr(APP_ID, API_KEY, SECRET_KEY)

# Read the image as raw bytes and send it to the general text recognition endpoint
with open("names.png", "rb") as f:
    image = f.read()

result = client.basicGeneral(image)
print(result)
Recognition result:
{
    "log_id": 3213553909522465362,
    "words_result_num": 20,
    "words_result": [
        { "words": "表格1:" },
        { "words": "姓名" },
        { "words": "年龄" },
        { "words": "性别" },
        { "words": "李雷" },
        { "words": "20男" },
        { "words": "韩梅梅" },
        { "words": "23女" },
        { "words": "赵小三" },
        { "words": "25女" },
        { "words": "Table2." },
        { "words": "Name" },
        { "words": "ge" },
        { "words": "Gender" },
        { "words": "Tom" },
        { "words": "30 Male" },
        { "words": "Jack" },
        { "words": "33 Male" },
        { "words": "one" },
        { "words": "31Female" }
    ]
}
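The result is not very satisfactory: in every data row the age and gender were merged into a single string (for example "20男" and "30 Male").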
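One possible workaround is to post-process the returned strings. The following is only a minimal sketch, not part of the Baidu SDK: the helper name split_merged, the pattern MERGED and the variable sample_result are hypothetical. It assumes that a merged cell always looks like digits followed by a non-digit label, as in the output above. It does not repair misread text such as "ge".

```python
import re

# Assumption: a merged cell is a run of digits, optional whitespace,
# then a non-digit tail, e.g. "20男", "30 Male", "31Female".
MERGED = re.compile(r"^(\d+)\s*(\D+)$")


def split_merged(words):
    """Return the tokens with merged age+gender strings split in two."""
    tokens = []
    for word in words:
        match = MERGED.match(word)
        if match:
            # Emit the age and the gender label as separate tokens.
            tokens.extend([match.group(1), match.group(2).strip()])
        else:
            tokens.append(word)
    return tokens


# A trimmed copy of the response shown above (hypothetical variable name).
sample_result = {
    "words_result": [
        {"words": "李雷"}, {"words": "20男"},
        {"words": "Tom"}, {"words": "30 Male"},
        {"words": "one"}, {"words": "31Female"},
    ]
}

words = [item["words"] for item in sample_result["words_result"]]
print(split_merged(words))
# ['李雷', '20', '男', 'Tom', '30', 'Male', 'one', '31', 'Female']
```

The same function can be applied to the real response by building the list from result["words_result"]. Strings that do not start with digits, such as "表格1:" or "Table2.", are left unchanged.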