Overview
========
Subtitles that are burned into the video (hard subbed) can be extracted using
the -hardsubx flag. The system processes the video frames, isolates the
subtitle regions in them, and then runs OCR on those regions using Tesseract.

Dependencies
============
Tesseract (OCR library by Google)
Leptonica (C image processing library)
FFmpeg (video processing library)

Compilation
===========

Linux
-----
Make sure Tesseract, Leptonica and FFmpeg are installed, and that their
libraries can be found using pkg-config. Refer to OCR.txt for installation
details.

FFmpeg from packages (on Debian), plus a couple of other dependencies you
will need:

    sudo apt-get install libavcodec-dev libavformat-dev libavutil-dev libswscale-dev libxcb-shm0-dev liblzma-dev

FFmpeg from source:
To install FFmpeg (libav), follow the steps at:

    https://trac.ffmpeg.org/wiki/CompilationGuide/Ubuntu  - For Ubuntu, Debian and Linux Mint
    https://trac.ffmpeg.org/wiki/CompilationGuide/Generic - For generic Linux compilation

To validate your FFmpeg installation, make sure you can run the following
commands in your terminal:

    pkg-config --cflags libavcodec
    pkg-config --cflags libavformat
    pkg-config --cflags libavutil
    pkg-config --cflags libswscale
    pkg-config --libs libavcodec
    pkg-config --libs libavformat
    pkg-config --libs libavutil
    pkg-config --libs libswscale

On success, each command prints the include directory path (--cflags) or the
linker flags (--libs) for that library.

To build the program with hardsubx support:

== From the Linux directory, run:

    ./configure --enable-hardsubx
    make ENABLE_HARDSUBX=yes

== Using cmake, from the root directory:

    mkdir build
    cd build
    cmake -DWITH_OCR=on -DWITH_HARDSUBX=on ../src/
    make

NOTE: The build has been tested with FFmpeg version 3.1.0 and Tesseract 3.04.

Windows
-------
Coming Soon
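
Appendix: checking the FFmpeg libraries (Linux)
===============================================
The eight pkg-config commands listed in the Linux section can be condensed
into one check. The following is only a minimal sketch, not part of the build
system: the function name check_ffmpeg_libs and the PKG_CONFIG override
variable are assumptions made for this example. It relies on the standard
pkg-config --exists option, which succeeds only when the named library is
known to pkg-config.

```shell
#!/bin/sh
# Sketch: report which of the four FFmpeg libraries pkg-config cannot find.
# check_ffmpeg_libs and PKG_CONFIG are illustrative names, not project names.

check_ffmpeg_libs() {
    missing=""
    for lib in libavcodec libavformat libavutil libswscale; do
        # --exists returns 0 only if pkg-config can locate this library
        if ! "${PKG_CONFIG:-pkg-config}" --exists "$lib" 2>/dev/null; then
            missing="$missing $lib"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "all FFmpeg libraries found"
    return 0
}
```

After sourcing the snippet, run check_ffmpeg_libs. If it reports a missing
library, install the matching -dev package or adjust PKG_CONFIG_PATH before
building with hardsubx support.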