diff --git a/docs/OCR.txt b/docs/OCR.txt
index ed75abe2..e0bf2d0f 100644
--- a/docs/OCR.txt
+++ b/docs/OCR.txt
@@ -4,14 +4,14 @@ Overview
 OCR (Optical Character Recognition) is a technique used to extract text from images.
 In the World of Subtitle, subtitle stored in bitmap format are common
 and even necessary for converting subtitle
-in bitmap format to subtitle in text format ocr is used.
+in bitmap format to subtitle in text format OCR is used.
 
 Dependency
 ==========
 Tesseract (OCR library by Google)
-Leptonica (image processing library)
+Leptonica (Image processing library)
 
-How to compile ccextractor on linux with OCR
+How to compile CCExtractor on Linux with OCR
 =============================================
 
 Download and Install Leptonnica.
@@ -50,12 +50,12 @@ you can download tesseract training data from https://github.com/tesseract-ocr/t
 
 
 
-Compile CCextractor passing flags like following
+Compile CCExtractor passing flags like following
 -------------------------------------------------
 make ENABLE_OCR=yes
 
 
-How to compile ccextractor on Windows with OCR
+How to compile CCExtractor on Windows with OCR
 ===============================================
 
 Download prebuild library of leptonica and tesseract from following link
@@ -72,23 +72,23 @@ Step 5) Add path of Directory where you have kept uncompressed library of lepton
 
 
 Set preprocessor flag ENABLE_OCR=1
-Step 1)In visual studio 2013 right click and select property.
-Step 2)In the left panel, select Configuration Properties, C/C++, Preprocessor.
-Step 3)In the right panel, in the right-hand column of the Preprocessor Definitions property, open the drop-down menu and choose Edit.
-Step 4)In the Preprocessor Definitions dialog box, add ENABLE_OCR=1. Choose OK to save your changes.
+Step 1) In visual studio 2013 right click and select property.
+Step 2) In the left panel, select Configuration Properties, C/C++, Preprocessor.
+Step 3) In the right panel, in the right-hand column of the Preprocessor Definitions property, open the drop-down menu and choose Edit.
+Step 4) In the Preprocessor Definitions dialog box, add ENABLE_OCR=1. Choose OK to save your changes.
 
 Add library in linker
-step 1)Open property of project
-Step 2)Select Configuration properties
-Step 3)Select Linker in left panel(column)
-Step 4)Select Input
-Step 5)Select Additional dependencies in right panel
-Step 6)Add libtesseract304d.lib in new line
-Step 7)Add liblept172.lib in new line
+step 1) Open property of project
+Step 2) Select Configuration properties
+Step 3) Select Linker in left panel(column)
+Step 4) Select Input
+Step 5) Select Additional dependencies in right panel
+Step 6) Add libtesseract304d.lib in new line
+Step 7) Add liblept172.lib in new line
 
 Download language data from following link
 https://code.google.com/p/tesseract-ocr/downloads/list
 after downloading the tesseract-ocr-3.02.eng.tar.gz extract the tar file and put
-tessdata folder where you have kept ccextractor executable
+tessdata folder where you have kept CCExtractor executable
 
 Copy the tesseract and leptonica dll from lib folder downloaded from above link to folder of executable or in system32.