Tesseract
Summary #
Tesseract is an open source OCR engine, originally developed at Hewlett-Packard and later sponsored by Google.
github #
https://github.com/tesseract-ocr/tesseract

Adding language support with training data #
To add a language such as Arabic, download its trained data file from https://github.com/tesseract-ocr/tessdata. For example, you can grab eng.traineddata from GitHub:
wget https://github.com/tesseract-ocr/tessdata/raw/master/eng.traineddata
Check https://github.com/tesseract-ocr/tessdata for a full list of trained language data.
After downloading the file(s), move them to the /usr/local/share/tessdata folder. Warning: some Linux distributions (such as openSUSE and Ubuntu) may expect them in /usr/share/tessdata instead.
# If you got the data from Google, unzip it first!
gunzip eng.traineddata.gz
# Move the data
sudo mv -v eng.traineddata /usr/local/share/tessdata/
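The download-and-move steps above can be wrapped in a small script. This is only a sketch: the function names (tessdata_url, find_tessdata_dir, install_lang) are made up for illustration, the URL pattern is copied from the eng.traineddata example in this note and is assumed to hold for other language codes, and the two candidate directories are the ones mentioned above.

```shell
#!/bin/sh
# Build the download URL for a language code, following the URL pattern
# used for eng.traineddata in this note (assumed to apply to other codes).
tessdata_url() {
    printf 'https://github.com/tesseract-ocr/tessdata/raw/master/%s.traineddata\n' "$1"
}

# Print the first directory that exists among the given candidates.
# With no arguments, fall back to the two locations mentioned in this note.
find_tessdata_dir() {
    if [ "$#" -eq 0 ]; then
        set -- /usr/local/share/tessdata /usr/share/tessdata
    fi
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Hypothetical helper: download one language file and move it into place.
# Usage: install_lang LANGCODE
install_lang() {
    lang=$1
    dest=$(find_tessdata_dir) || {
        echo "install_lang: no tessdata directory found" >&2
        return 1
    }
    wget -O "$lang.traineddata" "$(tessdata_url "$lang")" &&
        sudo mv -v "$lang.traineddata" "$dest/"
}
```

For Arabic, the call would be install_lang ara. Afterwards, tesseract --list-langs should show the new language if the file landed in the directory your build actually reads.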
OCR of Images #
2024-02-13_17-49-25_screenshot.png #

W TU A T githuo.comlesseract-ocrlesseract 16 tesseract-ocr I tesseract 81 <> Code Issues 391 81 Pull requests 25 Actions Projects a Wiki Security tesseract Public Watch 1.7k V e Fork 9.1k V L7 Star 56.5k / P main V P S Go to file + <> Code About Tesseract Open Source OCR Engine (main repository) la egorpugin Merge pu... : X becd395 yesterday 6,309 Commits .github Update issue-bug.yml 3 months ago € tesseract-ocrgithub.o) machine-learning ocr tesseract Istm tesseract-ocr hacktoberfest ocr-engine cmake cmake: check_Jeptonica_ti.. 7 months ago doc Rename BibTex file to plea... 6 months ago include/tesseract Update publictypes.h last month Readme 44 Apache-2.0 license java Correct indefinite articles : 3 months ago
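A minimal sketch of extracting text from a screenshot like the one above, assuming the tesseract command-line tool is installed and the English data file is in place. The ocr_image function name is hypothetical. The tesseract invocation itself uses the standard form of image, output base, and -l for the language.

```shell
#!/bin/sh
# Hypothetical wrapper: run Tesseract on one image and print the text.
# Usage: ocr_image IMAGE [LANG]   (LANG defaults to eng)
ocr_image() {
    image=$1
    lang=${2:-eng}
    if [ ! -f "$image" ]; then
        echo "ocr_image: no such file: $image" >&2
        return 2
    fi
    if ! command -v tesseract >/dev/null 2>&1; then
        echo "ocr_image: tesseract is not installed" >&2
        return 127
    fi
    # Passing "stdout" as the output base prints the recognized text
    # to standard output instead of writing a .txt file.
    tesseract "$image" stdout -l "$lang"
}
```

For example, ocr_image 2024-02-13_17-49-25_screenshot.png prints the recognized text to the terminal. Expect recognition errors on dense web-page screenshots, as the sample above shows.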