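Alibaba Cloud Vision Intelligence Open Platform: Text Recognition (OCR) Tutorial

Overview

Text recognition builds on Alibaba Cloud's deep learning technology to provide general-purpose printed text recognition and document structuring. It can be applied flexibly to industry scenarios such as ID and certificate recognition, invoice recognition, and document recognition and organization, and it serves business needs such as identity verification, authentication, and invoice circulation and review.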

Step By Step

1. Activate the service. See: A Concise Guide to the Alibaba Cloud Vision Intelligence Open Platform

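2. Text recognition currently offers 21 APIs in 5 categories: personal card and ID recognition, asset certificate recognition, general text recognition, vehicle and traffic recognition, and, presumably, industry bill and invoice recognition (the original list names the vehicle and traffic category twice; the VAT invoice API used below belongs to the bill and invoice category).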

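3. Code Sample

The VAT invoice recognition API is used as the example, once with a local image and once with an image stored in OSS.

The other APIs are used in a similar way. Note that, at the time of writing, the QR code recognition API RecognizeQrCode does not support uploading local files but does accept public image URLs; the other APIs support both OSS addresses and local image upload.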
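The rule above (URL-only for RecognizeQrCode, URL or local upload for everything else) can be checked up front before a request is built. The following is a minimal sketch and not part of the SDK: the class ImageSourceRouter, its Route enum, and the way API names are passed as strings are all hypothetical names introduced here for illustration.

```java
import java.util.Locale;

/** Hypothetical helper (not part of the SDK): decides how an image reference should be submitted. */
final class ImageSourceRouter {

    /** URL_REQUEST: plain request with a URL; LOCAL_UPLOAD: "Advance" request with a stream. */
    enum Route { URL_REQUEST, LOCAL_UPLOAD, UNSUPPORTED }

    /** The API that, according to the note above, accepts only public image URLs. */
    static final String URL_ONLY_API = "RecognizeQrCode";

    /** Returns true when the reference looks like an http(s) URL rather than a file path. */
    static boolean isHttpUrl(String image) {
        String lower = image.toLowerCase(Locale.ROOT);
        return lower.startsWith("http://") || lower.startsWith("https://");
    }

    /** Picks the submission route for the given API name and image reference. */
    static Route route(String apiName, String image) {
        if (isHttpUrl(image)) {
            return Route.URL_REQUEST;      // OSS or other public URL
        }
        if (URL_ONLY_API.equals(apiName)) {
            return Route.UNSUPPORTED;      // local file, but this API has no upload variant
        }
        return Route.LOCAL_UPLOAD;         // local file, sent through the *Advance method
    }
}
```

For example, `ImageSourceRouter.route("RecognizeQrCode", "C:\\qr.png")` yields `UNSUPPORTED`, which lets the caller fail early with a clear message instead of sending a request that cannot succeed.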


3.1 pom.xml
    <dependencies>
        <dependency>
            <groupId>com.aliyun</groupId>
            <artifactId>ocr20191230</artifactId>
            <version>0.0.3</version>
        </dependency>
        <dependency>
            <groupId>com.alibaba</groupId>
            <artifactId>fastjson</artifactId>
            <version>1.2.52</version>
        </dependency>
    </dependencies>

3.2 Java Code

import com.alibaba.fastjson.JSON;
import com.aliyun.ocr20191230.Client;
import com.aliyun.ocr20191230.models.RecognizeVATInvoiceAdvanceRequest;
import com.aliyun.ocr20191230.models.RecognizeVATInvoiceRequest;
import com.aliyun.ocr20191230.models.RecognizeVATInvoiceResponse;
import com.aliyun.tearpc.models.Config;
import com.aliyun.teautil.models.RuntimeOptions;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.InputStream;

public class RecognizeVATInvoiceDemo {

    public static void main(String[] args) throws Exception {

        // For how to obtain accessKeyId and accessKeySecret, see: https://yq.aliyun.com/articles/693979
        Config config = new Config();
        config.accessKeyId= "LTAIOZZg********";
        config.accessKeySecret= "v7CjUJCMk7j9aK****************";
        config.regionId="cn-shanghai";
        config.protocol = "https";
        config.endpoint="ocr.cn-shanghai.aliyuncs.com";

        Client client = new Client(config);
        String filePath = "C:\\Users\\Administrator\\Desktop\\2019041500152001020003743286_0.jpg";
        String fileURL = "https://viapi-test.oss-cn-shanghai.aliyuncs.com/test/ant_ai/vat_invoice/2019041500152001020003743286_0.jpg";

        recognizeVATInvoiceAdvance(client, filePath);
        recognizeVATInvoice(client, fileURL);
    }

    /**
     * VAT invoice recognition using a local image
     * @param client   the OCR client
     * @param filePath path of the local image
     */
    public static void recognizeVATInvoiceAdvance(Client client, String filePath)
    {
        RecognizeVATInvoiceAdvanceRequest recognizeVATInvoiceAdvanceRequest = new RecognizeVATInvoiceAdvanceRequest();

        // Open the local file; try-with-resources closes the stream once the call returns
        try (InputStream inputStream = new FileInputStream(new File(filePath))) {
            // Set the request parameters
            recognizeVATInvoiceAdvanceRequest.fileURLObject = inputStream;
            recognizeVATInvoiceAdvanceRequest.fileType = "jpg";

            RecognizeVATInvoiceResponse recognizeVATInvoiceResponse = client.recognizeVATInvoiceAdvance(recognizeVATInvoiceAdvanceRequest, new RuntimeOptions());
            System.out.println(JSON.toJSONString(recognizeVATInvoiceResponse)); // print the response
        } catch (FileNotFoundException e) {
            // The file is missing: report it and skip the request instead of sending a null stream
            System.err.println("File not found: " + filePath);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    /**
     * VAT invoice recognition using an image stored in OSS
     * @param client  the OCR client
     * @param fileURL URL of the OSS image
     */
    public static void recognizeVATInvoice(Client client, String fileURL)
    {
        RecognizeVATInvoiceRequest recognizeVATInvoiceRequest = new RecognizeVATInvoiceRequest();

        recognizeVATInvoiceRequest.fileType = "jpg";
        recognizeVATInvoiceRequest.fileURL = fileURL;

        try {
            RecognizeVATInvoiceResponse recognizeVATInvoiceResponse = client.recognizeVATInvoice(recognizeVATInvoiceRequest,new RuntimeOptions());
            System.out.println(JSON.toJSONString(recognizeVATInvoiceResponse)); // print the response
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
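The demo above writes the AccessKey pair directly into the source. As a side note, here is a minimal sketch of reading the pair from environment variables instead, so that it stays out of version control. The class AccessKeyLoader and its require method are hypothetical names introduced for illustration; the variable names ALIBABA_CLOUD_ACCESS_KEY_ID and ALIBABA_CLOUD_ACCESS_KEY_SECRET follow the convention used by Alibaba Cloud's credential tooling and are assumed to be set by whoever runs the program.

```java
import java.util.Map;

/** Hypothetical helper (not part of the SDK): reads an AccessKey pair from an environment map. */
final class AccessKeyLoader {

    // Assumed variable names; adjust them if your deployment uses different ones.
    static final String ID_VAR = "ALIBABA_CLOUD_ACCESS_KEY_ID";
    static final String SECRET_VAR = "ALIBABA_CLOUD_ACCESS_KEY_SECRET";

    /** Returns the trimmed value of the variable, or fails fast when it is missing or blank. */
    static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing environment variable: " + name);
        }
        return value.trim();
    }
}
```

In the demo's main method this would be used as `config.accessKeyId = AccessKeyLoader.require(System.getenv(), AccessKeyLoader.ID_VAR);`, and likewise for the secret. Taking the environment as a Map keeps the helper easy to test without touching the real process environment.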
3.3 Result
{"data":{"box":{"checkers":[589.0,1003.0,662.0,1033.0],"clerks":[986.0,1003.0,1060.0,1033.0],"invoiceAmounts":[1364.0,776.0,1438.0,804.0],"invoiceCodes":[1378.0,41.0,1520.0,68.0],"invoiceDates":[1376.0,115.0,1596.0,145.0],"invoiceFakeCodes":[1376.0,153.0,1640.0,181.0],"invoiceNoes":[1377.0,78.0,1478.0,105.0],"payeeAddresses":[355.0,909.0,734.0,939.0],"payeeBankNames":[354.0,947.0,938.0,977.0],"payeeNames":[356.0,833.0,633.0,865.0],"payeeRegisterNoes":[356.0,873.0,571.0,902.0],"payees":[189.0,1003.0,264.0,1033.0],"payerAddresses":[0.0,0.0,0.0,0.0],"payerBankNames":[0.0,0.0,0.0,0.0],"payerNames":[354.0,222.0,700.0,255.0],"payerRegisterNoes":[358.0,262.0,567.0,290.0],"sumAmounts":[532.0,774.0,629.0,805.0],"taxAmounts":[1606.0,721.0,1658.0,748.0],"withoutTaxAmounts":[1265.0,721.0,1339.0,749.0]},"content":{"antiFakeCode":"02702870934284730434","checker":"赵弯弯","clerk":"赵弯弯","invoiceAmount":"200.00","invoiceCode":"031001600311","invoiceDate":"20190415","invoiceNo":"80625433","payee":"赵弯弯","payeeAddress":"上海虹桥机场迎宾二路161号22342185","payeeBankName":"上海浦东发展银行空港支行076389-98910158000000030-22","payeeName":"上海机场(集团)有限公司","payeeRegisterNo":"91310000132284295X","payerAddress":"","payerBankName":"","payerName":"百特医疗用品贸易(上海)有限公司","payerRegisterNo":"91310000607402073L","sumAmount":"200.00","taxAmount":"9.52","withoutTaxAmount":"190.48"}},"requestId":"75E88483-753C-4D5E-9EF5-5E132FF67DED"}
{"data":{"box":{"checkers":[589.0,1003.0,662.0,1033.0],"clerks":[986.0,1003.0,1060.0,1033.0],"invoiceAmounts":[1364.0,776.0,1438.0,804.0],"invoiceCodes":[1378.0,41.0,1520.0,68.0],"invoiceDates":[1376.0,115.0,1596.0,145.0],"invoiceFakeCodes":[1376.0,153.0,1640.0,181.0],"invoiceNoes":[1377.0,78.0,1478.0,105.0],"payeeAddresses":[355.0,909.0,734.0,939.0],"payeeBankNames":[354.0,947.0,938.0,977.0],"payeeNames":[356.0,833.0,633.0,865.0],"payeeRegisterNoes":[356.0,873.0,571.0,902.0],"payees":[189.0,1003.0,264.0,1033.0],"payerAddresses":[0.0,0.0,0.0,0.0],"payerBankNames":[0.0,0.0,0.0,0.0],"payerNames":[354.0,222.0,700.0,255.0],"payerRegisterNoes":[358.0,262.0,567.0,290.0],"sumAmounts":[532.0,774.0,629.0,805.0],"taxAmounts":[1606.0,721.0,1658.0,748.0],"withoutTaxAmounts":[1265.0,721.0,1339.0,749.0]},"content":{"antiFakeCode":"02702870934284730434","checker":"赵弯弯","clerk":"赵弯弯","invoiceAmount":"200.00","invoiceCode":"031001600311","invoiceDate":"20190415","invoiceNo":"80625433","payee":"赵弯弯","payeeAddress":"上海虹桥机场迎宾二路161号22342185","payeeBankName":"上海浦东发展银行空港支行076389-98910158000000030-22","payeeName":"上海机场(集团)有限公司","payeeRegisterNo":"91310000132284295X","payerAddress":"","payerBankName":"","payerName":"百特医疗用品贸易(上海)有限公司","payerRegisterNo":"91310000607402073L","sumAmount":"200.00","taxAmount":"9.52","withoutTaxAmount":"190.48"}},"requestId":"9B97F14B-0970-45C7-BE9E-CD3204BB3E1B"}
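Once the response is available as JSON, the recognized fields can be sanity-checked before they enter a review workflow. The sketch below uses only the JDK and assumes the flat string layout of the "content" object shown above (values without escaped quotes); the class InvoiceResultCheck and its methods are hypothetical names. In a real project the fastjson dependency from the pom.xml would be the more robust way to read the fields.

```java
import java.math.BigDecimal;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: basic plausibility checks on a VAT invoice recognition result. */
final class InvoiceResultCheck {

    /** Extracts a flat string field such as "sumAmount":"200.00"; returns null when absent. */
    static String field(String json, String name) {
        Pattern p = Pattern.compile("\"" + Pattern.quote(name) + "\"\\s*:\\s*\"([^\"]*)\"");
        Matcher m = p.matcher(json);
        return m.find() ? m.group(1) : null;
    }

    /** Checks that withoutTaxAmount + taxAmount equals sumAmount, using exact decimal arithmetic. */
    static boolean amountsConsistent(String json) {
        BigDecimal net = new BigDecimal(field(json, "withoutTaxAmount"));
        BigDecimal tax = new BigDecimal(field(json, "taxAmount"));
        BigDecimal total = new BigDecimal(field(json, "sumAmount"));
        return net.add(tax).compareTo(total) == 0;
    }

    /** Parses the invoiceDate field, which uses the yyyyMMdd form (e.g. 20190415). */
    static LocalDate invoiceDate(String json) {
        return LocalDate.parse(field(json, "invoiceDate"), DateTimeFormatter.BASIC_ISO_DATE);
    }
}
```

For the sample result above, the amounts are consistent (190.48 + 9.52 = 200.00) and the invoice date parses to 2019-04-15. BigDecimal is used rather than double so that the comparison is not affected by binary floating-point rounding.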

References

Introduction to Text Recognition
A Concise Guide to the Alibaba Cloud Vision Intelligence Open Platform
