How to get Tesseract running inside a project in Visual Studio 2010
I have a C++ project in Visual Studio 2010 and want to use OCR. I came across many "tutorials" for Tesseract, but sadly all I got was a headache and wasted time.
In my project I have an image stored as a Mat. One solution to my problem is to save this Mat as an image file (image.jpg, for example) and then call the Tesseract executable like this:
system("tesseract.exe image.jpg out");
This produces an output file, out.txt, and then I call
infile.open ("out.txt");
to read the output from Tesseract.
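For reference, the save / call / read workaround described above could be wrapped up roughly as follows. This is only a sketch: the helper names buildTesseractCommand, readWholeFile and ocrViaExecutable are made up for illustration, and it assumes tesseract.exe is reachable from the working directory or the PATH and that the image has already been written to disk.

```cpp
#include <cstdlib>
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: builds the command line for the Tesseract executable.
// Quoting the arguments keeps paths that contain spaces intact.
std::string buildTesseractCommand(const std::string& imagePath,
                                  const std::string& outputBase) {
    return "tesseract.exe \"" + imagePath + "\" \"" + outputBase + "\"";
}

// Hypothetical helper: reads a whole text file into a string.
// Returns an empty string if the file cannot be opened.
std::string readWholeFile(const std::string& path) {
    std::ifstream infile(path.c_str());
    if (!infile) return std::string();
    std::ostringstream buffer;
    buffer << infile.rdbuf();
    return buffer.str();
}

// Hypothetical helper: runs the executable on an image file that is already
// on disk and returns the recognized text (empty on failure).
std::string ocrViaExecutable(const std::string& imagePath) {
    const std::string outputBase = "out";
    const std::string command = buildTesseractCommand(imagePath, outputBase);
    if (std::system(command.c_str()) != 0) return std::string();
    // Tesseract appends ".txt" to the output base name.
    return readWholeFile(outputBase + ".txt");
}
```

Calling ocrViaExecutable("image.jpg") starts a new process and touches the disk twice for every frame, which is exactly the overhead that hurts at video frame rates.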
This all works like a charm, but it is not an optimal solution. My project processes video, so a save / call .exe / write / read cycle at 10+ FPS is not what I am looking for. I want to integrate Tesseract into my existing code so that I can pass a Mat as an argument and immediately get the result as a string.
Do you know of any good tutorial (preferably step-by-step) for using Tesseract OCR from Visual Studio 2010? Or do you have a solution of your own?
Recommended answer
OK, I figured it out, but it works only for the Release and Win32 configuration (no Debug or x64). There are many linker errors under the Debug configuration.
So:
1. First of all, download the prepared library folder (Tesseract + Leptonica) here:
Mirror 1 (Google Drive)
Mirror 2 (MediaFire)
2. Extract tesseract.zip to C:\
3. In Visual Studio, go to C/C++ > General > Additional Include Directories
Insert C:\tesseract\include
4. Under Linker > General > Additional Library Directories
Insert C:\tesseract\lib
5. Under Linker > Input > Additional Dependencies
Add:
liblept168.lib
libtesseract302.lib
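As a possible alternative to step 5, MSVC can also take the library names from the source file itself through #pragma comment(lib, ...). The fragment below is a sketch that assumes the same library file names as above; the library directory from step 4 still has to be configured, and other compilers do not honor this pragma.

```cpp
// MSVC-specific: ask the linker to pull in these libraries, instead of
// listing them under Linker > Input > Additional Dependencies.
#pragma comment(lib, "liblept168.lib")
#pragma comment(lib, "libtesseract302.lib")
```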
Sample code should look like this:
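#include <tesseract/baseapi.h>
#include <leptonica/allheaders.h>
#include <cstdlib>
#include <iostream>
#include <string>
using namespace std;

int main(void) {
    tesseract::TessBaseAPI api;
    // Init returns 0 on success.
    if (api.Init("", "eng", tesseract::OEM_DEFAULT) != 0) {
        cerr << "Could not initialize Tesseract." << endl;
        return 1;
    }
    // PSM_SINGLE_LINE (value 7): treat the image as a single line of text.
    api.SetPageSegMode(tesseract::PSM_SINGLE_LINE);
    api.SetOutputName("out");

    cout << "File name:";
    string image;  // std::string avoids overflowing a fixed-size buffer
    cin >> image;

    // ProcessPages loads the file itself, so the separate pixRead call
    // (whose result was never used or freed) has been dropped.
    STRING text_out;
    api.ProcessPages(image.c_str(), NULL, 0, &text_out);
    cout << text_out.string();

    system("pause");
    return 0;
}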
For interaction with OpenCV and Mat-type images, look HERE