1. Install build dependencies
yum install -y gcc gcc-c++ autoconf automake libtool libjpeg libpng libtiff zlib libjpeg-devel libpng-devel libtiff-devel zlib-devel
2. Upgrade gcc
Building Tesseract 5 requires a compiler with C++17 support, which the stock gcc 4.8.5 does not provide.
yum install -y centos-release-scl
yum install -y devtoolset-8-gcc*
mv /usr/bin/gcc /usr/bin/gcc-4.8.5
ln -s /opt/rh/devtoolset-8/root/bin/gcc /usr/bin/gcc
mv /usr/bin/g++ /usr/bin/g++-4.8.5
ln -s /opt/rh/devtoolset-8/root/bin/g++ /usr/bin/g++
gcc --version
g++ --version
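The version output alone does not prove that C++17 works, so a quick compile test can help. The sketch below is not part of the original steps; the function name supports_cxx17 and the tiny test program are illustrative assumptions.

```shell
#!/bin/sh
# supports_cxx17 COMPILER
# Returns 0 if COMPILER can build a small C++17 program, non-zero otherwise.
# (Hypothetical helper, not part of Tesseract or devtoolset.)
supports_cxx17() {
    cxx="$1"
    tmpdir=$(mktemp -d) || return 1
    # The test program uses two C++17 features: "if constexpr" and std::optional.
    cat > "$tmpdir/check.cpp" <<'EOF'
#include <optional>
int main() {
    if constexpr (sizeof(int) >= 2) {
        std::optional<int> v = 17;
        return v.value() == 17 ? 0 : 1;
    }
    return 1;
}
EOF
    "$cxx" -std=c++17 "$tmpdir/check.cpp" -o "$tmpdir/check" >/dev/null 2>&1
    status=$?
    rm -rf "$tmpdir"
    return $status
}

if supports_cxx17 g++; then
    echo "g++ supports C++17"
else
    echo "g++ does not support C++17 (or is not installed)"
fi
```

If the check fails after the symlinks are in place, confirm that /usr/bin/g++ really points at the devtoolset-8 binary.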
3. Build and install Leptonica
Download leptonica-1.76.0.tar.gz from GitHub.
tar zxvf leptonica-1.76.0.tar.gz
cd leptonica-1.76.0
./autobuild
./configure
make && make install
Configure the environment so that the Tesseract build can find Leptonica.
Add the following to /etc/profile:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
export LIBLEPT_HEADERSDIR=/usr/local/include
export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig
source /etc/profile
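Editing /etc/profile by hand works, but running the setup twice can leave duplicate lines. As a minimal sketch, the export lines above could be appended only when they are missing. The helper names append_once and add_tesseract_env are hypothetical, chosen for this example.

```shell
#!/bin/sh
# append_once LINE FILE
# Appends LINE to FILE unless an identical line is already present.
append_once() {
    line="$1"
    file="$2"
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# add_tesseract_env FILE
# Adds the three export lines from this step to FILE (hypothetical helper).
# Single quotes keep $LD_LIBRARY_PATH unexpanded until the profile is sourced.
add_tesseract_env() {
    append_once 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib' "$1"
    append_once 'export LIBLEPT_HEADERSDIR=/usr/local/include' "$1"
    append_once 'export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig' "$1"
}

# Example (run as root):
#   add_tesseract_env /etc/profile
#   source /etc/profile
```

Running the helper a second time leaves the file unchanged, because each line is matched exactly before it is appended.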
4. Build and install Tesseract
Download tesseract-5.1.0.tar.gz from GitHub.
tar zxvf tesseract-5.1.0.tar.gz
cd tesseract-5.1.0
./autogen.sh
./configure --with-extra-includes=/usr/local/include --with-extra-libraries=/usr/local/lib
make && make install
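After the install finishes, it may be worth confirming that the new binary is the one found on PATH. The following is a small sketch with a hypothetical helper, report_tool; it relies only on the tool accepting --version, which tesseract does.

```shell
#!/bin/sh
# report_tool NAME
# Prints where NAME is found and the first line of its version output.
# Returns 1 if NAME is not on PATH. (Hypothetical helper for this example.)
report_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        printf '%s: %s\n' "$1" "$(command -v "$1")"
        "$1" --version 2>&1 | head -n 1
    else
        printf '%s: not found on PATH\n' "$1"
        return 1
    fi
}

report_tool tesseract || echo "check that /usr/local/bin is on PATH"
```

With the default configure prefix, make install places the binary under /usr/local/bin, so that is the path to expect here.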
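5. Install language data (optional)
Download the language packs from https://github.com/tesseract-ocr/tessdata_fast (typically eng.traineddata and chi_sim.traineddata) and place them in the /usr/local/share/tessdata/ directory.
List the supported languages:
tesseract --list-langs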
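Copying the downloaded files into place can be scripted. This is a sketch only: install_traineddata is a hypothetical helper, and the download directory in the example is an assumption.

```shell
#!/bin/sh
# install_traineddata SRC_DIR DEST_DIR
# Copies every *.traineddata file from SRC_DIR into DEST_DIR and prints
# how many files were copied. (Hypothetical helper for this example.)
install_traineddata() {
    src_dir="$1"
    dest_dir="$2"
    mkdir -p "$dest_dir" || return 1
    count=0
    for f in "$src_dir"/*.traineddata; do
        # Skip the unexpanded pattern when no file matches.
        [ -e "$f" ] || continue
        cp "$f" "$dest_dir"/ || return 1
        count=$((count + 1))
    done
    echo "$count"
}

# Example (run as root; ~/Downloads is an assumed download location):
#   install_traineddata ~/Downloads /usr/local/share/tessdata
#   tesseract --list-langs
```

Files without the .traineddata extension are ignored, so the download directory can hold other files.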
6. Run a test
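The original notes give no test command. One possible smoke test is to run OCR on an image and print the text to the terminal. The wrapper ocr_image and the file name sample.png below are hypothetical; the tesseract invocation itself uses the standard form "tesseract IMAGE OUTPUTBASE -l LANG", with stdout as the output base.

```shell
#!/bin/sh
# ocr_image IMAGE [LANG]
# Runs tesseract on IMAGE and writes the recognized text to standard output.
# LANG defaults to eng; several languages can be joined with "+",
# for example chi_sim+eng. (Hypothetical wrapper for this example.)
ocr_image() {
    image="$1"
    lang="${2:-eng}"
    if [ ! -f "$image" ]; then
        echo "image not found: $image" >&2
        return 1
    fi
    tesseract "$image" stdout -l "$lang"
}

# Example (sample.png is a placeholder for any image that contains text):
#   ocr_image sample.png
#   ocr_image sample.png chi_sim+eng
```

If tesseract reports that a language cannot be loaded, check that the matching .traineddata file is in /usr/local/share/tessdata/.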