ML Kit OCR

6.3.4+ Experimental

gmlkit.ocr(img, language)

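Performs text recognition on the given image.

img {Image} The image to recognize.
language {String} The recognition language. Possible values:
- zh Chinese
- sa Sanskrit
- ja Japanese
- ko Korean
- Other languages
return {Result} The text recognition result.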

// Recognize Chinese text
let result = gmlkit.ocr(img, "zh");
console.log(result.text);

gmlkit.ocrText(img, language)

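Performs text recognition on the given image and returns the recognized text as a string.

img {Image} The image to recognize.
language {String} The recognition language. Possible values:
- zh Chinese
- sa Sanskrit
- ja Japanese
- ko Korean
- Other languages
return {String} The recognized text.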

// Recognize Chinese text
let result = gmlkit.ocrText(img, "zh");
console.log(result);

Result

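Represents the result returned by Google ML Kit text recognition. It has the following properties:

level {Number} The level of this result in the hierarchy.
confidence {Number} The confidence of the recognition result.
text {String} The recognized text.
language {String} The recognized language.
bounds {Rect} The position of the text in the image.
children {Array} The child results, which contain more detailed content.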
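
Because each Result carries a children list, a result forms a tree. As a rough illustration, the sketch below walks such a tree and collects the text found at one level. It runs on plain objects used as stand-ins for real Result values; the sample data, the level numbers, and the helper name collectTextAtLevel are assumptions made for this example and are not part of the gmlkit API.

```javascript
// Hypothetical helper: collect the text of every node at a given level.
// Works on any object that exposes `level`, `text`, and `children`.
function collectTextAtLevel(node, level) {
    let out = [];
    if (node.level === level) {
        out.push(node.text);
    }
    let children = node.children || [];
    for (let i = 0; i < children.length; i++) {
        out = out.concat(collectTextAtLevel(children[i], level));
    }
    return out;
}

// Stand-in data shaped like the properties listed above (not real OCR output).
let sampleTree = {
    level: 0,
    text: "Hello world\nBye",
    children: [
        { level: 1, text: "Hello world", children: [] },
        { level: 1, text: "Bye", children: [] }
    ]
};

let level1Texts = collectTextAtLevel(sampleTree, 1);
console.log(level1Texts.join(" | "));
```

On a device, children may be a Java collection instead of a JavaScript array, so the indexing in the loop may need adjusting.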

Result.find(predicate)

Finds the first element that matches the condition. Returns null if none is found.

predicate {Function} The test function. It receives a Result object as its argument.
return {Result}

Result.find(level,predicate)

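Finds the first element at the specified level that matches the condition. Returns null if none is found.

level {Number} The level to search.
predicate {Function} The test function. It receives a Result object as its argument.
return {Result}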
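
A predicate is an ordinary function that receives a Result and returns true or false. The sketch below wraps Result.find(predicate) in a small helper. The helper name findByText and the stub object are hypothetical; the stub only imitates find so that the snippet runs outside the app, where the result would normally come from gmlkit.ocr.

```javascript
// Hypothetical helper: first element whose text contains `keyword`, or null.
function findByText(result, keyword) {
    return result.find(function (r) {
        return r.text.indexOf(keyword) !== -1;
    });
}

// Stub standing in for a real Result. Only find(predicate) is imitated,
// over a flat list of items.
let findStub = {
    items: [{ text: "Settings" }, { text: "Sign in" }],
    find: function (predicate) {
        for (let i = 0; i < this.items.length; i++) {
            if (predicate(this.items[i])) {
                return this.items[i];
            }
        }
        return null;
    }
};

let hit = findByText(findStub, "Sign");
console.log(hit ? hit.text : "not found");
```

Checking the return value against null before reading its properties avoids an error when nothing matches.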

Result.filter(predicate)

Finds all elements that match the condition.

predicate {Function} The test function. It receives a Result object as its argument.
return {Array} A Java array

Result.filter(level,predicate)

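Finds all elements at the specified level that match the condition.

level {Number} The level to search.
predicate {Function} The test function. It receives a Result object as its argument.
return {Array} A Java array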
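
One common use of filter is to drop low-confidence matches. The sketch below calls Result.filter(level, predicate) and copies the texts into a JavaScript array by index, since the return value is documented as a Java array. The helper name confidentTexts, the stub, the level numbers, and the 0 to 1 confidence range are assumptions for illustration only.

```javascript
// Hypothetical helper: texts of all elements at `level` whose confidence
// is at least `minConfidence`.
function confidentTexts(result, level, minConfidence) {
    let matches = result.filter(level, function (r) {
        return r.confidence >= minConfidence;
    });
    // Copy by index instead of relying on JavaScript array methods.
    let texts = [];
    for (let i = 0; i < matches.length; i++) {
        texts.push(matches[i].text);
    }
    return texts;
}

// Stub imitating filter(level, predicate) over a flat list of items.
let filterStub = {
    items: [
        { level: 2, confidence: 0.95, text: "Total" },
        { level: 2, confidence: 0.40, text: "T0ta1" },
        { level: 3, confidence: 0.99, text: "T" }
    ],
    filter: function (level, predicate) {
        return this.items.filter(function (r) {
            return r.level === level && predicate(r);
        });
    }
};

let kept = confidentTexts(filterStub, 2, 0.8);
console.log(kept.join(", "));
```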

Result.toArray()

Converts the result to an array.

return {Array} A Java array

Result.toArray(level)

Converts the results at the specified level to an array.

level {Number} The level.
return {Array} A Java array

Result.sort()

Sorts the original result according to the position of bounds.

Result.sorted()

Same as sort(), but returns the sorted Result object.

return {Result}

paddle

5.6.1 Experimental

paddle.ocr(image,useSlim)

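Runs OCR with the specified OCR model.

image {Image} The image to run OCR on.
useSlim {Boolean} The model to load. Possible values:
- true ocr_v2_for_cpu(slim): fast model
- false ocr_v2_for_cpu: accurate model
return {Array} An array of recognition results. Each element is an OcrResult.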

paddle.ocr(image,[cpuThreadNum,useSlim])

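Runs OCR with the specified number of CPU cores and OCR model.

image {Image} The image to run OCR on.
cpuThreadNum {Number} The number of CPU cores used for OCR. Default: the number of CPU cores on the system.
useSlim {Boolean} The model to load. Possible values:
- true ocr_v2_for_cpu(slim): fast model (default)
- false ocr_v2_for_cpu: accurate model
return {Array} An array of recognition results. Each element is an OcrResult.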

paddle.ocr(image,cpuThreadNum,myModelPath)

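Runs OCR with the specified number of CPU cores and a custom OCR model.

image {Image} The image to run OCR on.
cpuThreadNum {Number} The number of CPU cores used for OCR.
myModelPath {String} The absolute path of the custom OCR model.
return {Array} An array of recognition results. Each element is an OcrResult.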

paddle.ocr(image,myModelPath)

Runs OCR with a custom OCR model.

image {Image} The image to run OCR on.
myModelPath {String} The absolute path of the custom OCR model.
return {Array} An array of recognition results. Each element is an OcrResult.
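
The four paddle.ocr call shapes differ only in which optional arguments are passed. As a rough sketch, the helper below picks one of them from an options object. The names runOcr and opts, the placeholder model path, and the recorder stub are hypothetical; the sketch also assumes that cpuThreadNum and useSlim are passed as separate positional arguments, which is one reading of the bracketed signature.

```javascript
// Hypothetical dispatcher over the four documented paddle.ocr call shapes.
// `paddleApi` is passed in so the sketch can run against a stub.
function runOcr(paddleApi, image, opts) {
    opts = opts || {};
    let hasThreads = typeof opts.cpuThreadNum === "number";
    if (opts.myModelPath) {
        return hasThreads
            ? paddleApi.ocr(image, opts.cpuThreadNum, opts.myModelPath)
            : paddleApi.ocr(image, opts.myModelPath);
    }
    // Assumption: use the fast model unless the caller asks otherwise.
    let useSlim = opts.useSlim !== false;
    return hasThreads
        ? paddleApi.ocr(image, opts.cpuThreadNum, useSlim)
        : paddleApi.ocr(image, useSlim);
}

// Stub that returns the arguments it was called with, instead of running OCR.
let recorder = {
    ocr: function () {
        return Array.prototype.slice.call(arguments);
    }
};

// "/path/to/model" is a placeholder, not a real model location.
let customCall = runOcr(recorder, "img", { cpuThreadNum: 2, myModelPath: "/path/to/model" });
let accurateCall = runOcr(recorder, "img", { useSlim: false });
console.log(customCall.length, accurateCall.length);
```

In a script, the first argument would be the paddle module itself and the second an Image.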

paddle.ocrText(image,useSlim)

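Runs OCR with the specified OCR model and returns the recognized text.

image {Image} The image to run OCR on.
useSlim {Boolean} The model to load. Possible values:
- true ocr_v2_for_cpu(slim): fast model
- false ocr_v2_for_cpu: accurate model
return {String} The recognized text.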

paddle.ocrText(image,[cpuThreadNum,useSlim])

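Runs OCR with the specified number of CPU cores and OCR model, and returns the recognized text.

image {Image} The image to run OCR on.
cpuThreadNum {Number} The number of CPU cores used for OCR. Default: the number of CPU cores on the system.
useSlim {Boolean} The model to load. Possible values:
- true ocr_v2_for_cpu(slim): fast model (default)
- false ocr_v2_for_cpu: accurate model
return {String} The recognized text.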

paddle.ocrText(image,cpuThreadNum,myModelPath)

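Runs OCR with the specified number of CPU cores and a custom OCR model, and returns the recognized text.

image {Image} The image to run OCR on.
cpuThreadNum {Number} The number of CPU cores used for OCR.
myModelPath {String} The absolute path of the custom OCR model.
return {String} The recognized text.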

paddle.ocrText(image,myModelPath)

Runs OCR with a custom OCR model and returns the recognized text.

image {Image} The image to run OCR on.
myModelPath {String} The absolute path of the custom OCR model.
return {String} The recognized text.

OcrResult

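OcrResult is a class that represents an OCR result. It has the following fields:

confidence {Number} The confidence of the recognition.
preprocessTime {Number} The preprocessing time.
inferenceTime {Number} The inference time.
text {String} The recognized text.
bounds {Rect} The position of the text in the image.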
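
Since paddle.ocr returns a list of OcrResult values, a script usually post-processes that list. The sketch below joins the texts that pass a confidence threshold and reports the mean confidence. It runs on plain objects that mimic the fields above; the helper name summarize, the sample values, and the 0 to 1 confidence range are assumptions for illustration.

```javascript
// Hypothetical helper: join confident texts and compute the mean confidence.
function summarize(results, minConfidence) {
    let kept = [];
    let sum = 0;
    for (let i = 0; i < results.length; i++) {
        sum += results[i].confidence;
        if (results[i].confidence >= minConfidence) {
            kept.push(results[i].text);
        }
    }
    return {
        text: kept.join("\n"),
        meanConfidence: results.length ? sum / results.length : 0
    };
}

// Stand-in data shaped like OcrResult (not real OCR output).
let sampleResults = [
    { confidence: 0.9, preprocessTime: 3, inferenceTime: 12, text: "Invoice" },
    { confidence: 0.5, preprocessTime: 3, inferenceTime: 11, text: "1nv0ice" }
];

let summary = summarize(sampleResults, 0.8);
console.log(summary.text);
```

On a device, the returned list may be a Java collection, in which case the index-based loop may need adjusting.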

Tesseract

6.2.9 Experimental

See GitHub for a complete example: Tesseract OCR