Added new dictionaries, lots of corrections in documents.

Carlos Fernandez 2016-12-02 11:27:37 -08:00
commit 3704111fb0
17 changed files with 230 additions and 71 deletions

@ -0,0 +1,54 @@
Ancient Psychic Tandem War Elephant
Banana Guard
Candy Kingdom
Candy People
Choose Goose
Cinnamon Bun
City of Thieves
Colonel Candycorn
Cosmic Owl
Crab Princess
Dr. Donut
Dr. Ice Cream
Duchess of Nuts
Earl of Lemongrab
Everything Burrito
Finn the Human
Fire Kingdom
Flame Princess
Flying Lettuce Bros.
Ghost Princess
Hotdog Knight
Ice King
Ice Kingdom
Jake the Dog
Lady Rainicorn
Lake Butterscotch
Land of Ooo
Lumpy Space Princess
Marauder Village
Marshmallow Kid
Mr. Cream Puff
Muscle Princess
Nice King
Nice Knights
Nightosphere
Nurse Poundcake
Old Lady Princess
Party Pat
Peppermint Butler
Pillow World
Princess Bubblegum
Raggedy Princess
Root Beer Guy
Sir Slicer
Skeleton Princess
Slime Princess
Snow Golem
The Enchiridion
The Lich
Toast Princess
Tree Fort
Tree Trunks
Wildberry Princess
Wizard Battle

@ -6,12 +6,37 @@ Thatcher Grey
Derek Shepherd
Amelia Shepherd
Owen Hunt
Maggie Pierce
Teddy
Dr. Altman
Dr. Margaret Pierce
Dr. Teddy Altman
Alex Karev
Callie Torres
Izzie Stevens
Cristina Yang
Mark Sloan
Jackson Avery
Leah Murphy
April Kepner
Arizona Robbins
George O'Malley
Preston Burke
Miranda Bailey
Denny Duquette
Dr. Addison Montgomery
Richard Webber
Adele Webber
Jo Wilson
Andrew Deluca
Nathan Riggs
Erica Hahn
Sadie Harris
Stephanie Edwards
Jason Myers
Dr. Nicole Herman
Hannah Davies
Shane Ross
Seattle Grace Hospital
Mercy West Medical Center
Seatle Grace Mercy West Hospital
Seattle Grace Mercy West Hospital
Denny Duquette Memorial Clinic
Grey Sloan Memorial Hospital
Mayo Clinic

@ -0,0 +1,11 @@
Dev
Rachel
Go-Gurt
Arnold
Brian
Denise
The Sickening
Nina
Nashville
Paro
Benjamin

@ -1,5 +1,9 @@
Mr. Robot
Elliot Alderson
Darlene
Angela Moss
Tyrell Wellick
Joanna Wellick
Phillip Price
Federal Bureau of Investigation
Fun Society
@ -8,4 +12,4 @@ New York
Evil Corp Headquarters
Allsafe Cybersecurity
Rons Coffee
Python
Python

@ -0,0 +1,11 @@
Jess
Jessica Day
Nick Miller
Winston Bishop
Schmidt
Cece Parekh
Coach
Latvian Basketball League
Ferguson
True American
Los Angeles middle school

@ -0,0 +1,16 @@
Amethyst
Beach City
Cookie Cat
Crying Breakfast Friends
Crystal Gems
Crystal Temple
Earthlings
Fryman
Garnet
Lion
Pearl
Peridot
Rose Quartz
Ruby
Sapphire
Steven Universe

@ -1,15 +1,30 @@
The Big Bang Theory
Penny
Leonard Hofstadter
Sheldon Cooper
Raj Koothrappali
Bernadette Rostenkowski
Howard Wolowitz
Amy Farrah Fowler
Leslie Winkle
Stuart Bloom
Arthur Jeffries
Mrs. Wolowitz
Barry Kripke
Priya Koothrappali
Mrs. Koothrappali
Mr. Koothrappali
Lucy
Sheldons Spot
The Apartment Building
Apartment 4A/B
The Laundry Room
The Roof
Wolowitz House
Wolowitzs' House
Capitol Comics
The Cheesecake Factory
The Comic Center of Pasadena
California Institute of Technology
Massachusetts Institute of Technology
Jet Propulsion Laboratory
Pasadena
Pasadena

@ -0,0 +1,23 @@
Arsenal Football Club
Aunt Irma
Big Ben
Countdown
Dragon's Den
Emergency Services
Employee of the Month
Friendface
Gay: A Gay Musical
Information Technology
Jen Barber
Lonely Hearts
Maurice Moss
Random Access Memory
Sea Parks
Spaceology
The Banner
The Evening Informer
The Internet
The London Echo
Tnetennba
Windows Vista
Word

@ -156,7 +156,7 @@ version of CCExtractor.
- Display end time
- Display caption mode
- Display caption channel
- Use a relative timestamp ( relative to the sample)
- Use a relative timestamp (relative to the sample)
- Display XDS info
- Use colors
Examples:
@ -209,7 +209,7 @@ version of CCExtractor.
.raw, which depends on padding. Fixed.
- MythTV's branch had a fixed size buffer that could not be enough
some times. Made dynamic.
- Better support for PAT changing mid stream.
- Better support for PAT changing mid-stream.
- Removed quotes in Start in .smi (format fix).
- Added multicast support (Chris Small)
- Added ability to select IP address to bind in UDP (Chris Small)
@ -239,10 +239,10 @@ version of CCExtractor.
their PMT entry.
- Added -datastreamtype to manually selecting a stream based on
its type instead of its PID. Useful if your recording program
always hides the caption under the stream stream type.
always hides the caption under the stream type.
- Added -streamtype so if an elementary stream is selected manually
for processing the streamtype can be selected too. This can be
needed if you process for example a stream that is declared as
for processing, the streamtype can be selected too. This can be
needed if you process, for example, a stream that is declared as
"private MPEG" in the PMT, so CCExtractor can't tell what it is.
Usually you'll want -streamtype 2 (MPEG video) or -streamtype 6
(MPEG private data).
@ -251,10 +251,10 @@ version of CCExtractor.
- Fixes in roll-up, cursor was being moved to column 1 if a
RU2, RU3 or RU4 was received even if already in roll-up mode.
- Added -autoprogram. If a multiprogram TS is processed and
-autoprogram is used CCExtractor will analyze all PMTs and use
-autoprogram is used, CCExtractor will analyze all PMTs and use
the first program that has a suitable data stream.
- Timed transcript (ttxt) now also exports the caption mode
(roll-up, paint-on, etc) next to each line, as it's useful to
(roll-up, paint-on, etc.) next to each line, as it's useful to
detect things like commercials.
- Content Advisory information from XDS is now decoded if it's
transmitted in "US TV parental guidelines" or "MPA".
@ -522,12 +522,12 @@ version of CCExtractor.
- Removed -autopad and -goppad, no longer needed.
- In preparation to a new binary format we have
renamed the current .bin to .raw. Raw files
have only CC data (with no header, timing, etc).
have only CC data (with no header, timing, etc.).
- The input file format (when forced) is now
specified with
-in=format
such as -in=ts, -in=raw, -in=ps ...
The old switches (-ts, -ps, etc) still work.
The old switches (-ts, -ps, etc.) still work.
The only exception is -bin which has been removed
(reserved for the new binary format). Use
-in=raw to process a raw file.
@ -569,7 +569,7 @@ version of CCExtractor.
0.46 (2008-11-24)
-----------------
- Added support for live streaming, ccextractor
- Added support for live streaming, CCExtractor
can now process files that are being recorded
at the same time.
@ -619,7 +619,7 @@ version of CCExtractor.
- Fixed a bug in the read loop (no less)
that caused some files to fail when
reading without buffering (which is
the default in the linux build).
the default in the Linux build).
- Several improvements in the GUI, such as
saving current options as default.
@ -642,7 +642,7 @@ version of CCExtractor.
deaf people know if the person talking is
at the left or the right of the screen, i.e.
they aren't useless. But if they annoy
you go ahead...
you, go ahead...
0.40 (2008-05-20)
-----------------
@ -661,7 +661,7 @@ version of CCExtractor.
- Fixed a bug in the CC decoder that could cause
the first line not to be cleared in roll-up
mode.
- ccextractor can now follow number sequences in
- CCExtractor can now follow number sequences in
file names, by suffixing the name with +.
For example,
@ -698,7 +698,7 @@ version of CCExtractor.
that have been added because old behaviour was
annoying to most people: _1 and _2 at the end
of the output file names is now added ONLY if
-12 is used (ie when there are two output
-12 is used (i.e. when there are two output
files to produce). So
ccextractor -srt sopranos.mpg
@ -800,7 +800,7 @@ version of CCExtractor.
0.32 (unreleased)
-----------------
- Added -delay ms, which adds (or substracts)
- Added -delay ms, which adds (or subtracts)
a number of milliseconds to all times in
.srt/.sami files. For example,
@ -811,7 +811,7 @@ version of CCExtractor.
-delay -400
causes all substitles to appear 400 ms before
causes all subtitles to appear 400 ms before
they would normally do.
- Added -startat and -endat, which let you
select just a portion of data to be processed,
@ -837,7 +837,7 @@ version of CCExtractor.
0.29 (unreleased)
-----------------
- Minor bugfix.
- Minor bug fix.
0.28 (unreleased)
-----------------
@ -851,7 +851,7 @@ version of CCExtractor.
0.27 (unreleased)
-----------------
- Modified sanitizing code, it's less aggresive
- Modified sanitizing code, it's less aggressive
now. Ideally it should mean that characters
won't be missed anymore. We'll see.
@ -906,7 +906,7 @@ version of CCExtractor.
many others (bttv) with the same closed caption recording
format.
This is the result of hacking MythTV's MPEG parser into
ccextractor. Integration is not very good (to put it
CCExtractor. Integration is not very good (to put it
mildly) but it seems to work. Depending on the feedback I
may continue working on this or just leave it 'as is'
(good enough).
@ -923,7 +923,7 @@ version of CCExtractor.
It's fixed now at least for the samples I have, if it's not
completely fixed let me know. Credit for this goes to
Jack Ha who sent me a couple of samples and a first
implementation of a semiworking fix.
implementation of a semi-working fix.
- Added support for several input files (see help screen for
details).
- Added Unicode and Latin-1 encoding.
@ -955,7 +955,7 @@ version of CCExtractor.
0.07 (2007-04-19)
-----------------
- Added MPEG reference clock parsing.
- Added autopadding in TS. Does miracles with timing.
- Added auto padding in TS. Does miracles with timing.
- Added video information (as extracted from sequence header).
- Some code clean-up.
- FF sanity check enabled by default.

@ -1,8 +1,8 @@
Overview
========
FFmpeg Intigration was done to support multiple encapsulations.
FFmpeg Integration was done to support multiple encapsulations.
Dependecy
Dependency
=========
FFmpeg libraries
@ -35,24 +35,24 @@ make ENABLE_FFMPEG=yes
On Windows
----------
put the path of libs/include of ffmpeg library in library paths.
step 1) In visual studio 2013 right click <Project> and select property.
step 2) Select Configuration properties in left panel(column) of property.
step 3) Select VC++ Directory.
step 4) In the right pane, in the right-hand column of the VC++ Directory property,
Step 1) In Visual Studio 2013, right-click <Project> and select Properties.
Step 2) Select Configuration properties in left panel(column) of property.
Step 3) Select VC++ Directory.
Step 4) In the right pane, in the right-hand column of the VC++ Directory property,
open the drop-down menu and choose Edit.
Step 5) Add path of Directory where you have kept uncompressed library of FFmpeg.
Set preprocessor flag ENABLE_FFMPEG=1
Step 1)In visual studio 2013 right click <Project> and select property.
Step 2)In the left panel, select Configuration Properties, C/C++, Preprocessor.
Step 3)In the right panel, in the right-hand column of the Preprocessor Definitions property, open the drop-down menu and choose Edit.
Step 4)In the Preprocessor Definitions dialog box, add ENABLE_FFMPEG=1. Choose OK to save your changes.
Step 1) In Visual Studio 2013, right-click <Project> and select Properties.
Step 2) In the left panel, select Configuration Properties, C/C++, Preprocessor.
Step 3) In the right panel, in the right-hand column of the Preprocessor Definitions property, open the drop-down menu and choose Edit.
Step 4) In the Preprocessor Definitions dialog box, add ENABLE_FFMPEG=1. Choose OK to save your changes.
Add library in linker
step 1)Open property of project
Step 2)Select Configuration properties
Step 3)Select Linker in left panel(column)
Step 4)Select Input
Step 5)Select Additional dependencies in right panel
Step 6)Add all FFmpeg's lib in new line
Step 1) Open property of project
Step 2) Select Configuration properties
Step 3) Select Linker in left panel(column)
Step 4) Select Input
Step 5) Select Additional dependencies in right panel
Step 6) Add all FFmpeg's lib in new line

@ -1,4 +1,4 @@
Starting with version 0.51, ccextractor has a mode
Starting with version 0.51, CCExtractor has a mode
that allows frontends and other programs to know what
the current progress is as well as get information
on interesting events, such as a file being open

@ -46,13 +46,13 @@ The possible color values are:
And the possible font values are:
R => Regular
I => Italic
I => Italics
U => Underlined
B => Underlined + italic
B => Underlined + Italics
If a 'E' is found in ether color or font that means a bug in CCExtractor. Should you ever get
If an 'E' is found in either color or font, that means there is a bug in CCExtractor. Should you ever get
an E please send us a .bin file that causes it.
This format is intended for post processing tools that need to represent the output of a 608
decoder accurately but that don't want to deal with the madness of other more generic subtitle
formats.
formats.

@ -36,7 +36,7 @@ pkg-config --libs libswscale
On success, you should see the correct include directory path and the linker flags.
To build the program with hardsubx support, from the linux directory run:-
To build the program with hardsubx support, from the Linux directory run:-
make ENABLE_HARDSUBX=yes
NOTE: The build has been tested with FFMpeg version 3.1.0, and Tesseract 3.04.
@ -44,4 +44,4 @@ NOTE: The build has been tested with FFMpeg version 3.1.0, and Tesseract 3.04.
Windows
-------
Coming Soon
Coming Soon

@ -3,7 +3,7 @@ A mailing list is now available from sourceforge:
https://lists.sourceforge.net/lists/listinfo/ccextractor-users
I expect it to be very low traffic (right now there's around 10
people actively helping with ccextractor in one way or
people actively helping with CCExtractor in one way or
another), so almost everything goes here:
- Bug reports

@ -4,14 +4,14 @@ Overview
OCR (Optical Character Recognition) is a technique used to
extract text from images. In the world of subtitles, subtitles stored
in bitmap format are common and even necessary. To convert subtitles
in bitmap format to subtitle in text format ocr is used.
in bitmap format to subtitle in text format OCR is used.
Dependency
==========
Tesseract (OCR library by Google)
Leptonica (image processing library)
Leptonica (Image processing library)
How to compile ccextractor on linux with OCR
How to compile CCExtractor on Linux with OCR
=============================================
Download and install Leptonica.
@ -50,12 +50,12 @@ you can download tesseract training data from https://github.com/tesseract-ocr/t
Compile CCextractor passing flags like following
Compile CCExtractor passing flags like following
-------------------------------------------------
make ENABLE_OCR=yes
How to compile ccextractor on Windows with OCR
How to compile CCExtractor on Windows with OCR
===============================================
Download the prebuilt libraries of Leptonica and Tesseract from the following link
@ -72,23 +72,23 @@ Step 5) Add path of Directory where you have kept uncompressed library of lepton
Set preprocessor flag ENABLE_OCR=1
Step 1)In visual studio 2013 right click <Project> and select property.
Step 2)In the left panel, select Configuration Properties, C/C++, Preprocessor.
Step 3)In the right panel, in the right-hand column of the Preprocessor Definitions property, open the drop-down menu and choose Edit.
Step 4)In the Preprocessor Definitions dialog box, add ENABLE_OCR=1. Choose OK to save your changes.
Step 1) In Visual Studio 2013, right-click <Project> and select Properties.
Step 2) In the left panel, select Configuration Properties, C/C++, Preprocessor.
Step 3) In the right panel, in the right-hand column of the Preprocessor Definitions property, open the drop-down menu and choose Edit.
Step 4) In the Preprocessor Definitions dialog box, add ENABLE_OCR=1. Choose OK to save your changes.
Add library in linker
step 1)Open property of project
Step 2)Select Configuration properties
Step 3)Select Linker in left panel(column)
Step 4)Select Input
Step 5)Select Additional dependencies in right panel
Step 6)Add libtesseract304d.lib in new line
Step 7)Add liblept172.lib in new line
Step 1) Open property of project
Step 2) Select Configuration properties
Step 3) Select Linker in left panel(column)
Step 4) Select Input
Step 5) Select Additional dependencies in right panel
Step 6) Add libtesseract304d.lib in new line
Step 7) Add liblept172.lib in new line
Download language data from following link
https://code.google.com/p/tesseract-ocr/downloads/list
After downloading tesseract-ocr-3.02.eng.tar.gz, extract the tar file and put the
tessdata folder where you have kept ccextractor executable
tessdata folder where you have kept CCExtractor executable
Copy the Tesseract and Leptonica DLLs from the lib folder downloaded from the above link to the folder of the executable or to system32.

@ -1,4 +1,4 @@
For building ccextractor using cmake follow steps below..
To build CCExtractor using CMake, follow the steps below.
Step 1) Check that you have the right version of CMake installed (version >= 3.0.2).
We are using CMP0037 policy of cmake which was introduced in 3.0.0

@ -1,4 +1,4 @@
/* CCExtractor, carlos at ccextractor org
/* CCExtractor, originally by carlos at ccextractor.org, now a lot of people.
Credits: See CHANGES.TXT
License: GPL 2.0
*/