Update OCR.txt

Deepraj Pandey 2016-12-02 14:01:53 +05:30 committed by GitHub
parent dd9243d459
commit 902da70ee3

@@ -4,14 +4,14 @@ Overview
 OCR (Optical Character Recognition) is a technique used to
 extract text from images. In the World of Subtitle, subtitle stored
 in bitmap format are common and even necessary for converting subtitle
-in bitmap format to subtitle in text format ocr is used.
+in bitmap format to subtitle in text format OCR is used.
 Dependency
 ==========
 Tesseract (OCR library by Google)
-Leptonica (image processing library)
+Leptonica (Image processing library)
-How to compile ccextractor on linux with OCR
+How to compile CCExtractor on Linux with OCR
 =============================================
 Download and Install Leptonnica.
@@ -50,12 +50,12 @@ you can download tesseract training data from https://github.com/tesseract-ocr/t
-Compile CCextractor passing flags like following
+Compile CCExtractor passing flags like following
 -------------------------------------------------
 make ENABLE_OCR=yes
-How to compile ccextractor on Windows with OCR
+How to compile CCExtractor on Windows with OCR
 ===============================================
 Download prebuild library of leptonica and tesseract from following link
@@ -72,23 +72,23 @@ Step 5) Add path of Directory where you have kept uncompressed library of lepton
 Set preprocessor flag ENABLE_OCR=1
-Step 1)In visual studio 2013 right click <Project> and select property.
-Step 2)In the left panel, select Configuration Properties, C/C++, Preprocessor.
-Step 3)In the right panel, in the right-hand column of the Preprocessor Definitions property, open the drop-down menu and choose Edit.
-Step 4)In the Preprocessor Definitions dialog box, add ENABLE_OCR=1. Choose OK to save your changes.
+Step 1) In visual studio 2013 right click <Project> and select property.
+Step 2) In the left panel, select Configuration Properties, C/C++, Preprocessor.
+Step 3) In the right panel, in the right-hand column of the Preprocessor Definitions property, open the drop-down menu and choose Edit.
+Step 4) In the Preprocessor Definitions dialog box, add ENABLE_OCR=1. Choose OK to save your changes.
 Add library in linker
-step 1)Open property of project
-Step 2)Select Configuration properties
-Step 3)Select Linker in left panel(column)
-Step 4)Select Input
-Step 5)Select Additional dependencies in right panel
-Step 6)Add libtesseract304d.lib in new line
-Step 7)Add liblept172.lib in new line
+step 1) Open property of project
+Step 2) Select Configuration properties
+Step 3) Select Linker in left panel(column)
+Step 4) Select Input
+Step 5) Select Additional dependencies in right panel
+Step 6) Add libtesseract304d.lib in new line
+Step 7) Add liblept172.lib in new line
 Download language data from following link
 https://code.google.com/p/tesseract-ocr/downloads/list
 after downloading the tesseract-ocr-3.02.eng.tar.gz extract the tar file and put
-tessdata folder where you have kept ccextractor executable
+tessdata folder where you have kept CCExtractor executable
 Copy the tesseract and leptonica dll from lib folder downloaded from above link to folder of executable or in system32.