From 3377dfa9474d56e4f7f76e826190f76ff4088b17 Mon Sep 17 00:00:00 2001 From: Moritz Bunkus Date: Sun, 9 Aug 2009 18:01:07 +0200 Subject: [PATCH] Finally a full conversion including a couple of updates. --- doc/docbook/mkvmerge.xml | 651 ++++++++++++++++++++++++++++++++++++++- 1 file changed, 634 insertions(+), 17 deletions(-) diff --git a/doc/docbook/mkvmerge.xml b/doc/docbook/mkvmerge.xml index 1235331fa..d80e21d14 100644 --- a/doc/docbook/mkvmerge.xml +++ b/doc/docbook/mkvmerge.xml @@ -4,6 +4,7 @@ [ + mkvmerge1"> mkvinfo1"> @@ -12,12 +13,14 @@ Matroska"> OggVorbis"> +XML"> ]> &product; + &date; Developer @@ -31,7 +34,8 @@ &product; 1 &version; - http://www.bunkus.org/videotools/mkvtoolnix + &date; + www.bunkus.org http://www.bunkus.org/videotools/mkvtoolnix/doc @@ -97,10 +101,10 @@ - + file-name - Read global tags from the XML file file-name. See the section about tags below for + Read global tags from the &xml; file file-name. See the section about tags below for details. @@ -116,7 +120,7 @@ - Chapter handling: (global options) + Chapter and tag handling: (global options) @@ -135,7 +139,7 @@ - + character-set @@ -197,6 +201,16 @@ + + + file-name + + + Read global tags from the file file-name. See the section about tags below + for details. + + + @@ -572,7 +586,7 @@ - , n,m,... + , n,m,... Copy the video tracks n, m etc. The numbers are track IDs which can be obtained with the @@ -583,7 +597,7 @@ - , n,m,... + , n,m,... Copy the subtitle tracks n, m etc. The numbers are track IDs which can be obtained with @@ -594,7 +608,7 @@ - , n,m,... + , n,m,... Copy the button tracks n, m etc. The numbers are track IDs which can be obtained with @@ -615,6 +629,23 @@ + + , n:all|first,m:all|first,... + + + Copy the attachments with the IDs n, m etc to all or only the first output file. Each ID + can be followed by either ':all' (which is the default if neither is entered) or ':first'. 
If + splitting is active then those attachments whose IDs are specified with ':all' are copied to all of the resulting + output files while the others are only copied into the first output file. If splitting is not active then both variants have the same + effect. + + + + The default is to copy all attachments to all output files. + + + + , @@ -660,7 +691,7 @@ - + @@ -669,8 +700,8 @@ - - + + , Don't copy attachments from this file. @@ -1014,7 +1045,7 @@ - + TID:character-set @@ -1078,7 +1109,7 @@ - + character-set @@ -1101,7 +1132,7 @@ - + , file-name @@ -1112,7 +1143,7 @@ - + code @@ -1294,7 +1325,7 @@ $ mkvmerge -o MM-complete.mkv MyMovie-with-sound.mkv MyMovie-add-audio.ogg For text subtitles you can either use some Windows software (like SubRipper) or the subrip package found in - transcode1's sources in the + transcode1's sources in the 'contrib/subrip' directory. The general process is: @@ -1334,7 +1365,7 @@ $ mkvmerge -o MM-complete.mkv MyMovie-with-sound.mkv MyMovie-add-audio.ogg $ mkvmerge --list-languages - Search the list for the languages you need. Let's assume you have put two audio tracks into a Matroska file and want to set their + Search the list for the languages you need. Let's assume you have put two audio tracks into a &matroska; file and want to set their language codes and that their track IDs are 2 and 3. This can be done with @@ -1357,6 +1388,185 @@ $ mkvmerge -o MM-complete.mkv MyMovie-with-sound.mkv MyMovie-add-audio.ogg + + Track IDs + + + Some of the options for &mkvmerge; need a track ID to specify which track they should be applied to. Those track IDs are printed by the + readers when demuxing the current input file, or if &mkvmerge; is called with the option. An example for such output: + + + +$ mkvmerge -i v.mkv +File 'v.mkv': container: &matroska; +Track ID 1: video (V_MS/VFW/FOURCC, DIV3) +Track ID 2: audio (A_MPEG/L3) + + + + Track IDs are assigned like this: + + + + + + AVI files: The video track has the ID 0. 
The audio tracks get IDs in ascending order starting at 1. + + + + + + AAC, AC3, MP3, SRT and WAV files: The one 'track' + in that file gets the ID 0. + + + + + + Ogg/OGM files: The track IDs are assigned in the order the tracks are found in the file, starting at 0. + + + + + + &matroska; files: The track's ID is the track number as reported by &mkvinfo;. It is not the track UID. + + + + + + The special track ID '-1' is a wild card and applies the given switch to all tracks that are read from an input + file. + + + + The options that use the track IDs are the ones whose description contains 'TID'. + The following options use track IDs as well: , + , and . + + + + + Text files and character set conversions + + + This section applies to all programs in MkvToolNix even if it only mentions &mkvmerge;. + + + + + All text in a &matroska; file is encoded in UTF-8. This means that &mkvmerge; has to convert every text file it reads as well as every + text given on the command line from one character set into UTF-8. In return this also means that &mkvmerge;'s output has to be converted + back to that character set from UTF-8, e.g. if a non-English translation is used with or for text originating from a &matroska; file. + + + + &mkvmerge; does this conversion automatically based on the presence of a byte order marker (short: + BOM) or the system's current locale. How the character set is inferred from the locale depends on the operating system + that &mkvmerge; is run on. + + + + Text files that start with a BOM are already encoded in one representation of UTF. &mkvmerge; supports the following five modes: UTF-8, + UTF-16 Little and Big Endian, UTF-32 Little and Big Endian. Text files with a BOM are automatically converted to UTF-8. Any of the + parameters that would otherwise set the character set for such a file (e.g. ) is silently ignored. 
+ + + + On Unix-like systems &mkvmerge; uses the setlocale3 + system call which in turn uses the environment variables LANG, LC_ALL and + LC_CTYPE. The resulting character set is often one of UTF-8 or the ISO-8859-* family and is used for all text file + operations and for encoding strings on the command line and for output to the console. + + + + On Windows there are actually two different character sets that &mkvmerge; uses due to the way the Windows shell program + cmd.exe is implemented. The first character set is determined by a call to the GetCP() system + call. This character set is used as the default for text file conversions and for all elements displayed by the GUI + programs in the MkvToolNix package. cmd.exe uses another character set which is determined by a call to the + GetACP() system call. This is the default character set for all strings read from the command line and for all + strings output to the console. + + + + The following options exist that allow specifying the character sets: + + + + + + for text subtitle files and for text subtitle + tracks stored in container formats for which the character set cannot be determined unambiguously (e.g. Ogg files), + + + + + + for chapter text files and for chapters + and file titles stored in container formats for which the character set cannot be determined unambiguously (e.g. Ogg files for chapter + information, track and file titles etc; MP4 files for chapter information), + + + + + + for all strings on the command + line, + + + + + + for all strings written to the console or + to a file if the output has been redirected with the option. + + + + + + + Subtitles + + There are several text subtitle formats that can be embedded into &matroska;. At the moment &mkvmerge; supports only text, VobSub and Kate + subtitle formats. 
Text subtitles must be recoded to UTF-8 so that they can be displayed correctly by a player (see the section about + text files and character sets for an explanation how &mkvmerge; converts between + character sets). Kate subtitles are already encoded in UTF-8 and do not have to be re-encoded. + + + + The following subtitle formats are supported at the moment: + + + + + + Subtitle Ripper (SRT) files + + + + + + Substation Alpha (SSA) / Advanced Substation Alpha scripts (ASS) + + + + + + OggKate streams + + + + + + VobSub bitmap subtitle files + + + + File linking @@ -1396,6 +1606,413 @@ $ mkvmerge -o MM-complete.mkv MyMovie-with-sound.mkv MyMovie-add-audio.ogg + + Default values + + The &matroska; specification states that some elements have a default value. Usually an element is not written to the file if its value + is equal to its default value in order to save space. The elements that the user might miss in &mkvinfo;'s output are the + language and the default track flag elements. The default value for the + language is English ('eng'), and the default value for the default track + flag is true. Therefore if you used for a track then it will not + show up in &mkvinfo;'s output. + + + + + Attachments + + Maybe you also want to keep some photos along with your &matroska; file, or you're using SSA subtitles and need a + special TrueType font that's really rare. In these cases you can attach those files to the &matroska; + file. They will not be just appended to the file but embedded in it. A player can then show those files (the 'photos' case) or use them + to render the subtitles (the 'TrueType fonts' case). 
+ + + + Here's an example how to attach a photo and a TrueType font to the output file: + + + +$ mkvmerge -o output.mkv -A video.avi sound.ogg --attachment-description "Me and the band behind the stage in a small get-together" --attachment-mime-type image/jpeg --attach-file me_and_the_band.jpg --attachment-description "The real rare and unbelievably good looking font" --attachment-mime-type application/octet-stream --attach-file really_cool_font.ttf + + + + If a &matroska; file containing attachments is used as an input file then &mkvmerge; will copy the attachments into the new file. The + selection which attachments are copied and which are not can be changed with the options and . + + + + + Chapters + + The &matroska; chapter system is more powerful than the old known system used by OGM files. The full specifications can + be found at the &matroska; website. + + + + &mkvmerge; supports two kinds of chapter files as its input. The first format, called 'simple chapter + format', is the same format that the OGM tools expect. The second format is a &xml; based + chapter format which supports all of &matroska;'s chapter functionality. + + + + The simple chapter format + + + This format consists of pairs of lines that start with 'CHAPTERxx=' and 'CHAPTERxxNAME=' + respectively. The first one contains the start timecode while the second one contains the title. Here's an example: + + + +CHAPTER01=00:00:00.000 +CHAPTER01NAME=Intro +CHAPTER02=00:02:30.000 +CHAPTER02NAME=Baby prepares to rock +CHAPTER03=00:02:42.300 +CHAPTER03NAME=Baby rocks the house + + + + &mkvmerge; will transform every pair of lines into one &matroska; ChapterAtom. It does not set any + ChapterTrackNumber which means that the chapters all apply to all tracks in the file. + + + + As this is a text file, character set conversion may need to be done. See the section about text files and character sets for an explanation how &mkvmerge; converts between + character sets. 
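The pairing of 'CHAPTERxx=' and 'CHAPTERxxNAME=' lines described above can be sketched in a few lines of Python. This is only an illustration and not part of MkvToolNix: the helper names format_timestamp and simple_chapters are hypothetical, and the two-digit chapter numbering is an assumption taken from the example above.

```python
def format_timestamp(seconds: float) -> str:
    """Format a start time in seconds as HH:MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def simple_chapters(chapters) -> str:
    """Build simple chapter file content from (start_seconds, title) pairs.

    Hypothetical helper: chapters are numbered from 01 in list order,
    mirroring the example in the text.
    """
    lines = []
    for number, (start, title) in enumerate(chapters, start=1):
        lines.append(f"CHAPTER{number:02d}={format_timestamp(start)}")
        lines.append(f"CHAPTER{number:02d}NAME={title}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(simple_chapters([
        (0, "Intro"),
        (150, "Baby prepares to rock"),
        (162.3, "Baby rocks the house"),
    ]), end="")
```

The resulting text would still have to be saved in a character set that &mkvmerge; can detect or that is given explicitly, as explained in the section about text files and character sets.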
+ + + + + The &xml; based chapter format + + The &xml; based chapter format looks like this example: + + + +<?xml version="1.0" encoding="ISO-8859-1"?> +<!DOCTYPE Chapters SYSTEM "matroskachapters.dtd"> +<Chapters> + <EditionEntry> + <ChapterAtom> + <ChapterTimeStart>00:00:30.000</ChapterTimeStart> + <ChapterTimeEnd>00:01:20.000</ChapterTimeEnd> + <ChapterDisplay> + <ChapterString>A short chapter</ChapterString> + <ChapterLanguage>eng</ChapterLanguage> + </ChapterDisplay> + <ChapterAtom> + <ChapterTimeStart>00:00:46.000</ChapterTimeStart> + <ChapterTimeEnd>00:01:10.000</ChapterTimeEnd> + <ChapterDisplay> + <ChapterString>A part of that short chapter</ChapterString> + <ChapterLanguage>eng</ChapterLanguage> + </ChapterDisplay> + </ChapterAtom> + </ChapterAtom> + </EditionEntry> +</Chapters> + + + + With this format three things are possible that are not possible with the simple chapter format: + + + + The timestamp for the end of the chapter can be set, + chapters can be nested, + the language and country can be set. + + + + The mkvtoolnix distribution contains some sample files in the doc subdirectory which can be used as a basis. + + + + + General notes + + When splitting files &mkvmerge; will correctly adjust the chapters as well. This means that each file only includes the chapter entries + that apply to it, and that the timecodes will be offset to match the new timecodes of each output file. + + + + &mkvmerge; is able to copy chapters from &matroska; source files unless this is explicitly disabled with the option. The chapters from all sources (&matroska; files, + Ogg files, MP4 files, chapter text files) are usually not merged but end up in separate + ChapterEditions. Only if chapters are read from several &matroska; or &xml; files that share the + same edition UIDs will chapters be merged into a single ChapterEdition. 
If such a merge is desired in other + situations as well then the user has to extract the chapters from all sources with &mkvextract; first, merge the &xml; + files manually and mux them afterwards. + + + + + + Tags + + + Introduction + + + &matroska; supports an extensive set of tags that is deprecated and a new, simpler system like it is used in most other containers: + KEY=VALUE. However, in &matroska; these tags can also be nested, and both the KEY and the + VALUE are elements of their own. The example file example-tags-2.xml shows how to use this + new system. + + + + + Scope of the tags + + + &matroska; tags do not automatically apply to the complete file. They can, but they also may apply to different parts of the file: to one + or more tracks, to one or more chapters, or even to a combination of both. The &matroska; specification gives more details about this fact. + + + + One important fact is that tags are linked to tracks or chapters with the Targets &matroska; tag element, and + that the UIDs used for this linking are not the track IDs &mkvmerge; uses everywhere. Instead the numbers used are + the UIDs which &mkvmerge; calculates automatically (if the track is taken from a file format other than &matroska;) or which are copied + from the source file if the track's source file is a &matroska; file. Therefore it is difficult to know which UIDs to use in the tag + file before the file is handed over to &mkvmerge;. + + + + &mkvmerge; knows two options with which you can add tags to &matroska; files: The and the options. The difference is that the former option, , will make the tags apply to the complete file by + removing any of those Targets elements mentioned above. The latter option, , automatically inserts the UID that &mkvmerge; generates for the track + specified with the TID part of the + option. + + + + + Example + + Let's say that you want to add tags to a video track read from an AVI. 
mkvmerge --identify file.avi + tells you that the video track's ID (do not mix this ID with the UID!) is 0. So you create your tag file, leave out all + Targets elements and call &mkvmerge;: + + + +$ mkvmerge -o file.mkv --tags 0:tags.xml file.avi + + + + + Tag file format + + &mkvmerge; supports a &xml; based tag file format. The format is very closely modeled after the &matroska; specification. Both the binary and the source distributions + of MkvToolNix come with a sample file called example-tags-2.xml which simply lists all known tags and which can be + used as a basis for real life tag files. + + + + The basics are: + + + + The outermost element must be <Tags>. + + One logical tag is contained inside one pair of <Tag> &xml; tags. + + White spaces directly before and after tag contents are ignored. + + + + + Data types + + The new &matroska; tagging system only knows two data types, a UTF-8 string and a binary type. The first is used for the tag's name and + the <String> element while the binary type is used for the <Binary> element. + + + + As binary data itself would not fit into a &xml; file &mkvmerge; supports two other methods of storing binary data. If the contents of a + &xml; tag starts with '@' then the following text is treated as a file name. The corresponding file's content is + copied into the &matroska; element. + + + + Otherwise the data is expected to be Base64 encoded. This is an encoding that transforms binary data into + a limited set of ASCII characters and is used e.g. in email programs. &mkvextract; will output + Base64 encoded data for binary elements. + + + + The deprecated tagging system knows some more data types which can be found in the official &matroska; tag specs. As &mkvmerge; does not + support this system anymore these types aren't described here. + + + + + + &matroska; file layout + + The &matroska; file layout is quite flexible. &mkvmerge; will render a file in a predefined way. 
The resulting file looks like this: + + + + [EBML head] [segment {meta seek #1} {attachments} {chapters} [segment information] [track information] [cluster 1] {cluster 2} ... + {cluster n} {cues} {meta seek #2} {tags}] + + + + The elements in curly braces are optional and depend on the contents and options used. A couple of notes: + + + + + + meta seek #1 includes only a small number of level 1 elements, and only if they actually exist: attachments, chapters, cues, tags, meta + seek #2. Older versions of &mkvmerge; used to put the clusters into this meta seek element as well. Therefore some imprecise guessing + was necessary to reserve enough space. It often failed. Now only the clusters are stored in meta seek #2, and meta seek #1 refers to + the meta seek element #2. + + + + + Attachment, chapter and tag elements are only present if they were added. + + + + + The shortest possible Matroska file would look like this: + + + + [EBML head] [segment [segment information] [track information] [cluster 1]] + + + + This might be the case for audio-only files. + + + + + External timecode files + + &mkvmerge; allows the user to choose the timecodes for a specific track himself. This can be used in order to create files with variable + frame rate video or include gaps in audio. A frame in this case is the unit that &mkvmerge; creates separately per &matroska; block. For + video this is exactly one frame, for audio this is one packet of the specific audio type. E.g. for AC3 this would be a + packet containing 1536 samples. + + + + Timecode files that are used when tracks are appended to each other must only be specified for the first part in a chain of tracks. For + example if you append two files, v1.avi and v2.avi, and want to use timecodes then your command line must look something like this: + + + +mkvmerge ... --timecodes 0:my_timecodes.txt v1.avi +v2.avi + + + + There are four formats that are recognized by &mkvmerge;. The first line always contains the version number. 
Empty lines, lines + containing only whitespace and lines beginning with '#' are ignored. + + + + Timecode file format v1 + + This format starts with the version line. The second line declares the default number of frames per second. All following lines contain + three numbers separated by commas: the start frame (0 is the first frame), the end frame and the number of frames + per second in this range. The FPS is a floating point number with the dot '.' as the decimal point. The ranges + can contain gaps for which the default FPS is used. An example: + + + +# timecode format v1 +assume 27.930 +800,1000,25 +1500,1700,30 + + + + + Timecode file format v2 + + + In this format each line contains a timecode for the corresponding frame. This timecode must be given in millisecond precision. It can + be a floating point number, but it doesn't have to be. You have to give at least as many timecode lines as there + are frames in the track. The timecodes in this file must be sorted. Example for 25fps: + + + +# timecode format v2 +0 +40 +80 + + + + + Timecode file format v3 + + In this format each line contains a duration in seconds followed by an optional number of frames per second. Both can be floating point + numbers. If the number of frames per second is not present the default one is used. For audio you should let the codec calculate the + frame timecodes itself. For that you should be using 0.0 as the number of frames per second. You can also create + gaps in the stream by using the 'gap' keyword followed by the duration of the gap. Example for an audio file: + + + +# timecode format v3 +assume 0.0 +25.325 +7.530,38.236 +gap, 10.050 +2.000,38.236 + + + + + Timecode file format v4 + + This format is identical to the v2 format. The only difference is that the timecodes do not have to be sorted. This format should + almost never be used. 
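The relationship between the v1 and v2 formats described above can be illustrated with a short Python sketch that expands v1-style ranges into one millisecond timecode per frame. This is an illustration only and not how &mkvmerge; is implemented: the function name v1_to_v2 is hypothetical, and treating the end frame of each range as inclusive is an assumption.

```python
def v1_to_v2(default_fps, ranges, frame_count):
    """Expand v1-style (start_frame, end_frame, fps) ranges into
    per-frame timecodes in milliseconds, as used by the v2 format.

    Assumption: end_frame is inclusive. Frames outside every range
    use default_fps (the value from the 'assume' line).
    """
    fps_for_frame = [default_fps] * frame_count
    for start, end, fps in ranges:
        # Clamp the range so it never runs past the last frame.
        for frame in range(start, min(end, frame_count - 1) + 1):
            fps_for_frame[frame] = fps

    timecodes = []
    current_ms = 0.0
    for fps in fps_for_frame:
        timecodes.append(current_ms)
        # Each frame lasts 1000 / fps milliseconds.
        current_ms += 1000.0 / fps
    return timecodes


if __name__ == "__main__":
    print("# timecode format v2")
    for timecode in v1_to_v2(25.0, [], 3):
        print(f"{timecode:g}")
```

With a constant 25 frames per second and no ranges this reproduces the v2 example for 25fps given above.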
+ + + + + + Exit codes + + + &mkvmerge; exits with one of three exit codes: + + + + + + 0 -- This exit code means that muxing has completed successfully. + + + + + + 1 -- In this case &mkvmerge; has output at least one warning, but muxing did continue. A warning is prefixed with + the text 'Warning:'. Depending on the issues involved the resulting file might be ok or not. The user is urged to + check both the warning and the resulting file. + + + + + + 2 -- This exit code is used after an error occurred. &mkvmerge; aborts right after outputting the error message. + Error messages range from wrong command line arguments over read/write errors to broken files. + + + + + See also