0.74
-----------
- Fixed issue with -o1 -o2 and -12 parameters (where it would write output only in the o2 file)
- Fixed UCLA parameter issue. Now the UCLA parameter settings can't be overwritten anymore by later parameters that affect the custom transcript
- Switched order around for TLT and TT page number in custom transcript to match UCLA settings
- Added nobom parameter, for when files are processed by tools that can't handle the BOM. If using this, files might not be readable under Windows.
- Segfault fix when no input files were given
- No more bin output when sending to server + possibility to send TT to server for processing
- Windows: Added the Microsoft redistributable MSVCR120.DLL to both the installation package and the application zip.

0.73 - GSOC
-----------
- Added support of BIN format for Teletext
- Added start of librarisation. This will allow in the future for other programs to use encoder/decoder functions and more.

0.72 - GSOC
-----------
- Fix for WTV files with incorrect timing
- Added support for fps change using data from AVC video track in a H264 TS file.
- Added FFmpeg support to enable all encapsulators and decoders provided by FFmpeg

0.71 - GSOC
-----------
- Added feature to receive captions in BIN format according to CCExtractor's own protocol over TCP (-tcp port [-tcppassword password])
- Added ability to send captions to the server described above or to the online repository (-sendto host[:port])
- Added -stdin parameter for reading input stream from standard input
- Compilation in Cygwin using linux/Makefile
- Fix for .bin files when not using latin1 charset
- Correction of mp4 timing, when one timestamp gives the timing of two atoms

0.70 - GSOC
-----------
This is the first release that is part of Google's Summer of Code. Anshul, Ruslan and Willem joined CCExtractor to work on a number of things over the summer, and their work is already reaching the mainstream version of CCExtractor.
- Added a huge dictionary submitted by Matt Stockard.
- Added DVB subtitles decoder, spupng in output
- Added support for cdt2 media atoms in QT video files. Now multiple atoms in a single sample sequence are supported.
- Changed Makefile.
- Fixed some bugs.
- Added feature to print info about file's subtitles and streams (-out=report).
- Support Long PMT.
- Support Configuration file.
  - There is a sample configuration file in the doc/ folder named ccextractor.cnf.sample
  - For now, only files named ccextractor.cnf and kept beside the ccextractor executable are supported
  - For details of which options can be set using the configuration file, please look at the sample file.
- Added options for custom transcript output: new parameter (-customtxt format), where the format must be like this: 1100100 (7 digits). These indicate whether the following things should be displayed or not in the (timed) transcript:
  - Display start time
  - Display end time
  - Display caption mode
  - Display caption channel
  - Use a relative timestamp (relative to the sample)
  - Display XDS info
  - Use colors
  Examples:
  0000101 is the default setting for transcripts
  1110101 is the default for timed transcripts
  1111001 is the default setting for -ucla
  Make sure you use this parameter after others that might affect these settings (-out, -ucla, -xds, -txt, -ttxt, ...)
- Fixed Negative timing Bug

0.69
----
- A few patches from Christopher Small, including proper support for multiple multicast clients listening on the same port.
- GUI: Fixed teletext preview.
- GUI: Added a small indicator of data being received when reading from UDP.
- GUI: Added UTF-8 support to preview Window (used for teletext).
- Fixes in Makefile and build script, compilation in linux and OSX failed if another libpng was found in the system.
- WTV support directly in CCExtractor (no need for wtvccdump any more).
- Started refactoring and clean-up.
- Fix: MPEG clock rollover (happens each 26 hours) caused a time discontinuity.
- Windows GUI: Started work on HDHomeRun support. For now it just looks for HDHomeRun devices. Lots of other things will arrive in the next versions.
- Windows GUI: Some code refactoring, since the HDHomeRun support makes the code large enough to require more than one source file :-)

0.68
----
- A couple of shared variables between 608 decoders were causing problems when both fields were processed at the same time with -12, fixed.
- Added BOM for UTF-8 files.
- Corrected a few extended characters in the UTF-8 encoding, probably never used in real world captioning but since we got a good test sample file...
- Color and fonts in PAC commands were ignored, fixed (Helen Buus).
- Added a new output format, spupng. It consists of one .png file for each subtitle frame and one .xml with all the timing (Heleen Buus).
- Some fixes (Chris Small).

0.67
----
- Padding bytes were being discarded early in the process in 0.66, which is convenient for debugging, but it messes with timing in .raw, which depends on padding. Fixed.
- MythTV's branch had a fixed size buffer that was sometimes too small. Made dynamic.
- Better support for PAT changing mid stream.
- Removed quotes in Start in .smi (format fix).
- Added multicast support (Chris Small)
- Added ability to select IP address to bind in UDP (Chris Small)
- Fixes in -unixts and -delay for teletext.
- Added -autodash : When two people are talking, add a dash as needed (this is based on subtitle position). Only in .srt and with -trim. Quite experimental, feedback appreciated.
- Added -latin1 to select Latin 1 as encoding. Default is now UTF-8 (-utf8 still exists but it's not needed).
- Added -ru1, which emulates a (non-existing in real life) 1 line roll-up mode.

0.66
----
- Fixed bug in auto detection code that triggered a message about file being out of sync.
- Added -investigate_packets
  The PMT is used to select the most promising elementary stream to get captions from.
  Sometimes captions are where you least expect them, so -datapid allows you to select an elementary stream manually, in case the CC location is not obvious from the PMT contents. To assist looking for the right stream, the parameter "-investigate_packets" will have CCExtractor look inside each stream, looking for CC markers, and report the streams that are likely to contain CC data even if it can't be determined from their PMT entry.
- Added -datastreamtype to manually select a stream based on its type instead of its PID. Useful if your recording program always hides the caption under the same stream type.
- Added -streamtype so if an elementary stream is selected manually for processing the streamtype can be selected too. This can be needed if you process for example a stream that is declared as "private MPEG" in the PMT, so CCExtractor can't tell what it is. Usually you'll want -streamtype 2 (MPEG video) or -streamtype 6 (MPEG private data).
- PMT content listing improved, it now shows the stream type for more types.
- Fixes in roll-up, cursor was being moved to column 1 if a RU2, RU3 or RU4 was received even if already in roll-up mode.
- Added -autoprogram. If a multiprogram TS is processed and -autoprogram is used CCExtractor will analyze all PMTs and use the first program that has a suitable data stream.
- Timed transcript (ttxt) now also exports the caption mode (roll-up, paint-on, etc) next to each line, as it's useful to detect things like commercials.
- Content Advisory information from XDS is now decoded if it's transmitted in "US TV parental guidelines" or "MPA". Other encodings such as Canada's are not supported yet due to lack of samples.
- Copy Management information from XDS is now decoded.
- Added -xds. If present and export format is timed transcript (only), XDS information will be saved to file (same file as the transcript, with XDS being clearly marked).
  Note that for now all XDS data is exported even if it doesn't change, so the transcript file will be significantly larger.
- Added some PaintOn support, at least enough to prevent it from breaking things when the other modes are used.
- Removed afd_data() warning. AFD doesn't carry any caption related data. AFD is still detected in code in case we want to do something with it later anyway.
- Ported last changes from Petr Kutalek's telxcc. Current version is 2.4.4.
- In teletext mode when exporting to transcript (not .srt), an effort is made to detect and merge line duplicates. This is done by using the Levenshtein distance, which is the number of changes required to convert one string to another. To simplify things, strings are compared up to the length of the shortest one.
  There are 3 parameters that can be used to tweak the thresholds:
  -deblev: Enable debug so the calculated distance for each two strings is displayed. The output includes both strings, the calculated distance, the maximum allowed distance, and whether the strings are ultimately considered equivalent or not, i.e. the calculated distance is less than or equal to the max allowed.
  -levdistmincnt value: Minimum distance we always allow regardless of the length of the strings. Default 2. This means that if the calculated distance is 0, 1 or 2, we consider the strings to be equivalent.
  -levdistmaxpct value: Maximum distance we allow, as a percentage of the shortest string length. Default 10%.
  For example, consider a comparison of one string of 30 characters and one of 60 characters. We want to determine whether the first 30 characters of the longer string are more or less the same as the shortest string, i.e. whether the longest string is the shortest one plus new characters and maybe some corrections. Since the shortest string is 30 characters and the default percentage is 10%, we would allow a distance of up to 3 between the first 30 characters.
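  The duplicate check described above can be sketched in a few lines of Python. This is only an illustration, not CCExtractor's actual (C) code: the function names levenshtein and lines_equivalent are hypothetical, and combining the two thresholds by taking the larger of them (with integer division for the percentage) is an assumption based on the description of -levdistmincnt and -levdistmaxpct.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: insertions, deletions and substitutions all cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # delete ca
                           cur[j - 1] + 1,       # insert cb
                           prev[j - 1] + cost))  # substitute (or match)
        prev = cur
    return prev[-1]


def lines_equivalent(s1: str, s2: str, min_cnt: int = 2, max_pct: int = 10):
    """Return (equivalent, distance, allowed) for two transcript lines.

    Both strings are truncated to the length of the shorter one before
    comparing. The allowed distance is assumed to be the larger of
    min_cnt and max_pct percent of that length.
    """
    short_len = min(len(s1), len(s2))
    distance = levenshtein(s1[:short_len], s2[:short_len])
    allowed = max(min_cnt, short_len * max_pct // 100)
    return distance <= allowed, distance, allowed


if __name__ == "__main__":
    shorter = "THE QUICK BROWN FOX JUMPS OVER"               # 30 characters
    longer = "THE QUICK BRAWN FOX JUMPS OVER THE LAZY DOG"  # one correction
    print(lines_equivalent(shorter, longer))
```

  With the defaults, a 30 character line allows a distance of up to 3, matching the worked example above; the tuple returned mirrors the values that -deblev is described as printing.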
- Added -lf : Use UNIX line terminator (LF) instead of Windows (CRLF).
- Added -noautotimeref: Prevent UTC reference from being auto set from the stream data.

0.65
----
- Minor GUI changes for teletext
- Added end timestamps in timed transcripts
- Added support for SMPTE (patch by John Kemp)
- Initial support for MPEG2 video tracks inside MP4 files (thanks a lot to GPAC's Jean who assisted in analyzing the sample and doing the required changes in GPAC).
- Improved MP4 auto detection
- Support for PCR if PTS is not available (needed for some teletext samples, and probably useful for everything else).
- Support for UDP streaming - finally. Use "-udp $port" to have CCExtractor listen for a stream. I've only been able to test it with a European HDHomeRun, but it should work fine with any other tuner.
- Refactored PMT / PAT processing in transport streams, it now allows displaying their contents (-parsePAT and -parsePMT) which makes troubleshooting easier.

0.64
----
- Changed Windows GUI size (larger).
- Added Teletext options to GUI.
- Added -teletext to force teletext mode even if not detected
- Added -noteletext to disable teletext detection. This can be needed for streams that have both 608 data and teletext packets if you need to process the 608 data (if teletext is detected it will take precedence otherwise).
- Added -datapid to force a specific elementary stream to be used for data (bypassing detections).
- Added -ru2 and -ru3 to limit the number of visible lines in roll-up captions (bypassing whatever the broadcast says).
- Added support for a .hex (hexadecimal) dump of data.
- Added support for wtv in Windows. This is done by using a new program (wtvccdump.exe) and a new DirectShow filter (CCExtractorDump.dll) that process the .wtv using DirectShow's filters and export the line 21 data to a .hex file. The GUI calls wtvccdump.exe as needed.
- Added --nogoptime to force PTS timing even when CCExtractor would use GOP timing otherwise.

0.63
----
- Teletext support added, by integrating Petr Kutalek's telxcc. Integration is still quite basic (there's equivalent code from both CCExtractor and telxcc) and some clean up is needed, but it works. Petr has announced that he's abandoning telxcc so further development will happen directly in CCExtractor.
- Some bug fixes, as usual.

0.62
----
- Corrected Mac build "script" (needed to add GPAC includes). Thanks to the Mac users that sent this.
- Hauppauge mode now uses PES timing, needed for files that don't have caption data during all the video (such as in commercial breaks).
- Added -mp4 and -in:mp4 to force the input to be processed as MP4.
- CC608 data embedded in a separate stream (as opposed to in the video stream itself) in MP4 files is now supported (not heavily tested). This should be rather useful since closed captioned files from iTunes use this format.
- More CEA-708 work. The debugger is now able to dump the "TV" contents for the first time. Also, a .srt can be generated, however timing is not quite good yet (still need to figure out why).
- Added -svc (or --service) to select the CEA-708 services to be processed. For example, -svc 1,2 will process the primary and secondary language services. Valid values are 1-63, where 1 is the primary language, 2 is the secondary language (this is part of the specification) and 3-63 are provider defined.
- Rajesh Hingorani sent a fix for the MPEG decoder that fixes garbled output for certain samples (we had none like this in our test collection). Thanks, Rajesh.

0.61
----
- Fix: GCC 3.4.4 can now build CCExtractor.
- Fix: Damaged TS packets (those that come with 'error in transport' bit on) are now skipped.
- Fix: Part of the changes for MP4 support (CC packets buffering in particular) broke some stuff for other files, causing at least very annoying character duplication. We hope we've fixed it without breaking anything, but please report any problems.
- Some non-interesting cleanup.

0.60
----
- Add: MP4 support, using GPAC (a media library). Integration is currently "enough so it works", but needs some more work. There's some duplicate code, the stream must be a file (no streaming), etc.
- Fix: The Windows version was writing text files with double \r.
- Fix: Closed captions blocks with no data could cause a crash.
- Fix: -noru (to generate files without duplicate lines in roll-up) was broken, with complete lines being missing.
- Fix: bin format not working as input.

0.59
----
- More AVC/H.264 work. pic_order_cnt_type != 0 will be processed now.
- Fix: Roll-up captions with interruptions for Text (with ResumeTextDisplay in the middle of the caption data) were missing complete lines.
- Added a timed text transcript output format, probably only useful for roll-up captions. Use --timedtranscript or -ttxt. Output is like this:
  00:01:25,485 | HOST: LAST NIGHT THE REPUBLICAN
  00:01:29,522 | HOPEFULS INTRODUCE THEMSELVES TO
  00:01:30,623 | PRIMARY VOTERS.
- XDS parser. Not complete (no point in dealing with V-Chip stuff for example), but enough to extract program and station information.
- Input streams can now come from standard input using - (just a hyphen) as parameter.
- Added a new output format called 'null' (use -null or -out=null). This format means "Don't produce any file", and is useful to have CCExtractor process the stream (for XDS messages, debugging, etc) without actually generating anything.
- Updated Windows GUI.
- Added -quiet => If used, CCExtractor will not write any message.
- Added -stdout => If used, the captions will be sent to stdout (console) instead of file. Combined with -, CCExtractor can work as a filter in a larger process, receiving the stream from stdin and sending the captions to stdout.
- Some code clean up, minor refactoring.
- Teletext detection (not yet processing).

0.58
----
- Implemented new PTS based mode to order the caption information of AVC/H.264 data streams.
  The old pic_order_cnt_lsb based method is still available via the -poc or --usepicorder command switches.
- Removed a couple of those annoying "Impossible!" error messages that appear when processing some (possibly broken, unsure) files.
- Added -nots --notypesettings to prevent italics and underline codes from being displayed.
- Note to those not liking the paragraph symbol being used for the music note: Submit a VALID replacement in latin-1.
- Added preliminary support for multiple program TS files. The parameter --program-number (or -pn) will let you choose which program number to process. If no number is passed and the TS file contains more than one, CCExtractor will display a list of found programs and terminate.
- Added support (basic, because I only received one sample) for some Hauppauge cards that save CC data in their own format. Use the parameter -haup to enable it (CCExtractor will display a notice if it thinks that it's processing a Hauppauge capture anyway).
- Fixed bug in roll-up.
- More AVC work, now TS files from echostar that provided garbled output are processed OK.
- Updated Windows GUI.

0.57
----
- Bugfixes in the Windows version. Some debug code was unintentionally left in the released version.

0.56
----
- H264 support
- Other minor changes a lot less important

0.55
----
- Replace pattern matching code with improved parser for MPEG-2 elementary streams.
- Fix parsing of ReplayTV 5000 captions.
- Add ability to decode SCTE 20 encoded captions.
- Make decoding of TS files more error tolerant.
- Start implementation of EIA-708 decoding (not active yet).
- Add -gt / --goptime switch to use GOP timing instead of PTS timing.
- Start implementation of AVC/H.264 decoding (not active yet).
- Fixed: The basic problem is that when 24fps movie film gets converted to 30fps NTSC they repeat every 4th frame. Some pics have 3 fields of CC data, with field 3 CC data belonging to the same channel as field 1.
  The following pics have the fields reversed because of the odd number of fields. I used top_field_first to tell when the channels are reversed. See Table 6-1 of the SCTE 20 [Paul Fernquist]

0.54
----
- Add -nosync and -fullbin switches for debugging purposes.
- Remove -lg (--largegops) switch.
- Improve synchronization of captions for source files with jumps in their time information or gaps in the caption information.
- [R. Abarca] Changed Mac script, it now compiles/links everything from the /src directory.
- It's now possible to have CCExtractor add credits automatically.
- Added a feature to add start and end messages (for credits). See help screen for details.

0.53
----
- Force generated RCWT files to have the same length as source file.
- Fix documentation for -startat / -endat switches.
- Make -startat / -endat work with all output formats.
- Fix sync check for raw/rcwt files.
- Improve timing of dvr-ms NTSC captions.
- Add -in=bin switch to read CCExtractor's own binary format.
- Fix problem with short input files (smaller than 1MB).
- Clean up regular and debug output.
- Add -out=bin switch to write RCWT data.
- Remove -bo/--bufferoutput switch and functionality.
- [Volker] Added new generic binary format (RCWT for Raw Captions With Time). This new format allows one file to contain all the available closed caption data instead of just one stream.
- Added --no_progress_bar to disable status information (mostly used when debugging, as the progress information is annoying in the middle of debug logs).
- The Windows GUI was reported to freeze in some conditions. Fixed.
- The Windows GUI is now targeted for .NET 2.0 instead of 3.5. This allows Windows 2000 to run it (there's no .NET 3.5 for Windows 2000), as requested by a couple of key users.

0.51
----
- Removed -autopad and -goppad, no longer needed.
- In preparation for a new binary format we have renamed the current .bin to .raw. Raw files have only CC data (with no header, timing, etc).
- The input file format (when forced) is now specified with -in=format such as -in=ts, -in=raw, -in=ps ... The old switches (-ts, -ps, etc) still work. The only exception is -bin which has been removed (reserved for the new binary format). Use -in=raw to process a raw file.
- Removed -d, which, when producing a raw file, used a DVD format. This has been merged into a new output type "dvdraw". So now instead of using -raw -d as before, use -out=dvdraw if you need this.
- Removed --noff
- Added gui_mode_reports for frontend communications, see related file.
- Windows GUI rewritten. Source code now included, too.
- [Volker] Dish Network clean-up

0.50
----
- [Volker] Fix in DVR-MS NTSC timing
- [Volker] More clean-up
- Minor fixes

0.49
----
- [Volker] Major MPEG parser rework. Code much cleaner now.
- Some stations transmit broken roll-up captions, and for some reason don't send CRs but RUs... Added work-around code to make captions readable.
- Started work on EIA-708 (DTV). Right now you can add -debug-708 to get a dump of the 708 data. An actually useful decoder will come soon.
- Some of the changes MIGHT HAVE BROKEN MythTV's code. I don't use MythTV myself so I rely on other people's samples and reports. If MythTV is broken please let me know.
- Added new debug options.
- [Volker] Added support for DVR-MS NTSC files.
- Other minor bugfixes and changes.

0.46
----
- Added support for live streaming, ccextractor can now process files that are being recorded at the same time.
- [Volker] Added a new DVR-MS loop - this is completely new, DVR-MS specific code, so we no longer use the generic MPEG code for DVR-MS. DVR-MS should (or will be eventually at least) be as reliable as TS. Note: For now, it's only ATSC recordings, not NTSC (analog) recordings.

0.45
----
- Added autodetection of DVR-MS files.
- Added -asf to force DVR-MS mode.
- Added some specific support for DVR-MS files. This format used to work correctly in 0.34 (pure luck) but the MPEG code rework broke it.
  It should work as it used to.
- Updated Windows GUI to support the new options.
- Added -lg --largegops
  From the help screen:
  Each Group-of-Picture comes with timing information. When this info is too separate (for example because there are a lot of frames in a GOP) ccextractor may prefer not to use GOP timing. Use this option if you need ccextractor to use GOP timing in large GOPs.

0.44
----
- Added an option to the GUI to process individual files in batch, i.e. call ccextractor once per file. Use it if you want to process several unrelated files in one go.
- Added an option to prevent duplicate lines in roll-up captions.
- Several minor bugfixes.
- Updated the GUI to add the new options.

0.43
----
- Fixed a bug in the read loop (no less) that caused some files to fail when reading without buffering (which is the default in the linux build).
- Several improvements in the GUI, such as saving current options as default.

0.42
----
- The option switch "-transcript" has been changed to "--transcript". Also, "-txt" has been added as the short alias.
- Windows GUI
- Updated help screen

0.41
----
- Default output is now .srt instead of .bin, use -raw if you need the data dump instead of .srt.
- Added -trim, which removes blank spaces at the left and right of each line in .srt. Note that those spaces are there to help deaf people know if the person talking is at the left or the right of the screen, i.e. they aren't useless. But if they annoy you go ahead...

0.40
----
- Fixed a bug in the sanity check function that caused the Myth branch to abort.
- Fixed the OSX build script, it needed a new #define to work.

0.39
----
- Added -transcript. If used, the output will have no time information. Also, if in roll-up mode there will be no repeated lines.
- Lots of changes in the MPEG parser, most of them submitted by Volker Quetschke.
- Fixed a bug in the CC decoder that could cause the first line not to be cleared in roll-up mode.
- ccextractor can now follow number sequences in file names, by suffixing the name with +. For example, DVD0001.VOB+ means DVD0001.VOB, DVD0002.VOB, etc. This works for all files, so part001.ts+ does what you would expect.
- Added -90090 which changes the clock frequency from the MPEG standard 90000 to 90090. It *could* (remains to be seen) help if there are timing issues.
- Better support for Tivo files.
- By default ccextractor now considers the whole input file list as one large file, instead of several independent video files. This has been changed because most programs (for example DVDDecrypt) just cut the files by size. If you need the old behaviour (because you actually edited the video files and want to join the subs), use -ve.

0.36
----
- Fixed bug in SMI, nbsp was missing a ;.
- Footer for SAMI files was incorrect (
and