Problematic were files for which DOS-style EOLs were
detected (carriage return followed by newline, \r\n) but which had
some lines terminated solely by a newline (\n). In such a case the
EOL was only detected upon seeing the next \r\n, and the value
returned from the `getline()` function would be something that
everyone would judge to be multiple lines of text.
Fixes #2594.
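The intended behavior can be sketched as follows. This is a minimal illustration, not the actual reader code: `split_lines` is a hypothetical helper that treats every `\n` as a line terminator and strips a preceding `\r` if there is one, so that files mixing both styles never yield multiple lines in one result.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not mkvtoolnix code: split text into lines, accepting
// both "\r\n" and a lone "\n" as the end of a line.
std::vector<std::string> split_lines(std::string const &text) {
  std::vector<std::string> lines;
  std::string current;

  for (char c : text) {
    if (c != '\n') {
      current += c;
      continue;
    }

    // Drop the '\r' of a DOS-style terminator if it is present.
    if (!current.empty() && (current.back() == '\r'))
      current.pop_back();

    lines.push_back(current);
    current.clear();
  }

  // Keep a final line that is not terminated by a newline.
  if (!current.empty())
    lines.push_back(current);

  return lines;
}
```

Calling `split_lines("a\r\nb\nc\r\n")` yields the three lines `a`, `b` and `c`, regardless of which terminator ends each one.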
This pulls the fixes for handling Unicode code points >
U+FFFF. Also update the one test case with invalid data which is now
handled slightly differently than before.
Part of the fix of #2516.
Various places needed to differentiate better between "no duration is
known or set for this packet" and "this packet has a duration of
0ms". The former means that no duration element should be written to
the file, and the use of `SimpleBlock` instead of `BlockGroup` is
OK. The latter on the other hand means that a `BlockGroup` must be
used and a duration element must be added.
This core change uncovered a couple of subtle bugs in other places
where subtitles were handled:
• The Matroska reader passed 0 as the duration even if no duration
element was present making it impossible for other code to
differentiate between "no duration present" and "duration present
but set to 0".
• The DVB subtitle packetizer always enforced writing the duration
even if the duration wasn't known.
• The Kate packetizer used a wrong dummy value of 1us for the duration
for the "end of stream" packet as the core would not write a
duration of 0.
• The text subtitle packetizer was using the difference between the
current packet and the following one for packets where the duration
was 0 or unknown. The correct behavior is to do this only if the
duration is unknown, not if it is 0.
Fixes #2490.
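One way to model the distinction between "no duration" and "a duration of 0" is `std::optional` (C++17). The sketch below is illustrative only: `packet`, `block_kind` and both functions are hypothetical names, and other factors that may influence the real decision (such as a track's default duration) are ignored.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical types for illustration; these are not mkvmerge's own.
enum class block_kind { simple_block, block_group };

struct packet {
  std::int64_t timestamp_ns{};
  std::optional<std::int64_t> duration_ns; // empty: no duration known or set
};

// A duration element is written whenever a duration is known, even if it is 0.
bool needs_duration_element(packet const &p) {
  return p.duration_ns.has_value();
}

// A known duration (including 0) requires a BlockGroup; without a duration a
// SimpleBlock is acceptable.
block_kind choose_block_kind(packet const &p) {
  return needs_duration_element(p) ? block_kind::block_group
                                   : block_kind::simple_block;
}
```

Testing with `has_value()` instead of comparing against 0 is what keeps the two cases apart: a packet whose `duration_ns` holds 0 still gets a `BlockGroup` and a duration element.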
Unsupported edits such as dwells were simply ignored. If all of them
were ignored, then the new timeline was empty resulting in no data
being copied for that track. Instead simply ignore edit lists whose
new timeline ends up empty after the entries have been processed.
Fixes #2487.
When closing files that were opened for writing, cached data will not
be flushed to storage automatically anymore. This reverts the
workaround implemented for #2469. A new option was added to both
programs (`--flush-on-close`) that re-enables flushing for people who
are affected by data loss such as described in #2469.
The reason is that automatic flushing causes long delays in processing
queues when the output by mkvmerge/mkvextract isn't the final product but
just an intermediate result to be processed further.
Implements #2480.
Writing level 1 elements can lead to the situation that a one-byte gap
must be covered. In that case `kax_analyzer_c` can move the head of
the following element by shrinking or enlarging its size field.
If that following element happens to be a cluster, there may be cues
that refer to that cluster. They must be updated in order to reflect
the cluster's new position.
Fixes #2408.
The `CodecPrivate` Matroska element contains AAC's
`AudioSpecificConfig` structure. That structure can contain a
`GASpecificConfig` structure which in turn can contain a
`program_config_element` (short: PCE).
The PCE carries vital information about the number of
channels in certain situations and must be present in the first raw
AAC packet if it is present in the `AudioSpecificConfig`. Otherwise
the number of channels cannot be determined.
mkvextract will now check whether the first packet contains the PCE
already. If it doesn't and if there's a PCE in the
`AudioSpecificConfig`, mkvextract will now prepend the first audio
packet with that PCE (right behind the ADTS headers).
Fixes #2205 and #2433.
When generating chapters mkvmerge has to take into account things such
as splitting and file linking. This requires shifting chapter
timestamps to match file timestamps. However, for files which don't
start at 0 generated chapters would be wrongfully shifted down to
below 0 causing invalid timestamps.
Fixes #2432.
Older libEBML or libMatroska versions don't validate the parent/child
sizes properly. This means that tests running on those older versions
cause mkvinfo to fail (harmlessly, with an exception).
The `EbmlElement::Read` function returns two values via reference
parameters. They're called `UpperEltFound` (an integer) and
`FoundElt` (a pointer to an EBML element). They're used for passing
back the first element found (if any) that is not a child of the
element currently being read so that the calling code can continue
parsing the file using the upper-level element.
If the calling code doesn't need that element, it has to delete it
itself. However, the code must not simply rely on the `FoundElt`
pointer being not null as the `Read` function assigns temporary
results to that variable. Depending on the file content, that
temporary element may have already been deleted by the `Read`
function. When the calling code then simply deletes `FoundElt` itself,
this leads to a typical case of use-after-free.
Instead the calling code must only work with the returned `FoundElt`
pointer if the other returned value, `UpperEltFound`, is trueish in the
C++ sense (i.e. if it isn't 0). Then and only then may the calling code
attempt to delete the object `FoundElt` points to.
This vulnerability allows arbitrary code execution via specially
crafted Matroska files. It was reported by Cisco TALOS on 2018-10-25
and is known as TALOS 2018-0694.
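The ownership rule described above can be sketched like this. Nothing here is the libEBML API: `element`, `mock_read` and `claim_upper_element` are hypothetical stand-ins that only mimic the described contract, namely that the pointer is trustworthy only when the integer is non-zero.

```cpp
#include <memory>

// Hypothetical element type and reader; NOT the libEBML API.
struct element {
  int id{};
};

// Mimics the described contract: `found_elt` may be assigned a temporary
// that has already been deleted again when `upper_elt_found` stays 0.
void mock_read(bool has_upper_element, int &upper_elt_found, element *&found_elt) {
  upper_elt_found = 0;
  found_elt       = new element{1};

  if (!has_upper_element) {
    delete found_elt; // found_elt is left dangling on purpose
    return;
  }

  upper_elt_found = 1;
}

// Take ownership of the upper-level element only if the reader reported one.
// The stale pointer value is overwritten without ever being inspected.
std::unique_ptr<element> claim_upper_element(int upper_elt_found, element *&found_elt) {
  if (upper_elt_found == 0) {
    found_elt = nullptr;
    return nullptr;
  }

  std::unique_ptr<element> owned{found_elt};
  found_elt = nullptr;
  return owned;
}
```

Wrapping the claimed element in a `std::unique_ptr` is a design choice of this sketch; it makes the "delete it yourself" obligation automatic once the `UpperEltFound` check has passed.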
The two header fields `delta_frame_id_length_minus2` and
`additional_frame_id_length_minus1` are only present if
`reduced_still_picture_header` is not set but
`frame_id_numbers_present_flag` is.
Part of the fix for #2410.
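A sketch of the conditional parsing follows. It isolates the three fields from the rest of the header, which in a real stream has further fields in between. The bit widths (1, 4 and 3 bits) and the rule that `frame_id_numbers_present_flag` is itself only read when `reduced_still_picture_header` is not set are taken from my reading of the AV1 sequence header syntax and should be treated as assumptions. `bit_reader`, `frame_id_info` and `parse_frame_id_info` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Minimal MSB-first bit reader, written for this example only.
class bit_reader {
public:
  explicit bit_reader(std::vector<std::uint8_t> data) : m_data{std::move(data)} {}

  // Read `count` bits; bits beyond the end of the data read as 0.
  unsigned get_bits(unsigned count) {
    unsigned value = 0;
    while (count-- > 0) {
      std::size_t byte = m_pos / 8;
      unsigned bit     = byte < m_data.size() ? (m_data[byte] >> (7 - (m_pos % 8))) & 1u : 0u;
      value            = (value << 1) | bit;
      ++m_pos;
    }
    return value;
  }

private:
  std::vector<std::uint8_t> m_data;
  std::size_t m_pos{};
};

struct frame_id_info {
  bool frame_id_numbers_present_flag{};
  unsigned delta_frame_id_length_minus2{};
  unsigned additional_frame_id_length_minus1{};
};

// Assumed bit widths: flag 1 bit, delta length 4 bits, additional length 3 bits.
frame_id_info parse_frame_id_info(bit_reader &r, bool reduced_still_picture_header) {
  frame_id_info info;

  if (!reduced_still_picture_header)
    info.frame_id_numbers_present_flag = r.get_bits(1) != 0;

  // The two length fields exist only if the flag is set.
  if (info.frame_id_numbers_present_flag) {
    info.delta_frame_id_length_minus2      = r.get_bits(4);
    info.additional_frame_id_length_minus1 = r.get_bits(3);
  }

  return info;
}
```

With `reduced_still_picture_header` set, no bits are consumed at all and both length fields keep their default of 0.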
When surrounding elements have been written using eight-byte size
length fields, the analyzer cannot enlarge those fields any
further. Instead, it can shrink such a field by one byte and move the
element's head up. That way the former one-byte gap becomes a two-byte
gap. A new, empty EBML void element can then be placed in that
gap.
libavformat from ffmpeg/libav writes most level 1 elements with
eight-byte size length fields. Files created by it are therefore the
prime candidate for hitting this bug.
Fixes #2406.
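To illustrate why the same size value can occupy a different number of bytes, and why eight bytes is the upper limit, here is a minimal sketch of EBML's variable-length size encoding. `encode_ebml_size` is a hypothetical helper written for this example, not a function of `kax_analyzer_c` or libEBML.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper: encode `value` as an EBML size field that is exactly
// `length` bytes long (1 to 8). The position of the first set bit in the
// first byte tells the parser how long the field is.
std::vector<std::uint8_t> encode_ebml_size(std::uint64_t value, unsigned length) {
  if ((length < 1) || (length > 8))
    throw std::invalid_argument{"length must be between 1 and 8"};

  // 7 usable bits per byte; the all-ones pattern is reserved for "unknown size".
  std::uint64_t max_value = (std::uint64_t{1} << (7 * length)) - 2;
  if (value > max_value)
    throw std::invalid_argument{"value does not fit into the requested length"};

  std::vector<std::uint8_t> bytes(length);
  for (unsigned i = 0; i < length; ++i)
    bytes[length - 1 - i] = static_cast<std::uint8_t>((value >> (8 * i)) & 0xff);

  bytes[0] |= static_cast<std::uint8_t>(1u << (8 - length)); // length marker
  return bytes;
}
```

A size of 5 encodes as `0x85` in one byte and as `0x40 0x05` in two, which is how a head can be moved by one byte without changing the element's content. A field that already uses eight bytes has no longer form, so only shrinking remains.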
With a low buffer limit, it's possible that mkvmerge hits the limit
while looking for the "end of display" conditions for teletext
subtitles. In such a case mkvmerge starts writing out buffered audio &
video packets even though there's no packet available for the subtitle
track.
Once mkvmerge does find the "end of display" conditions, the formerly
incomplete subtitle packet will be written. However, at that point the
timestamps of audio & video packets are higher already, causing the
subtitle packet to be interleaved wrongly.
The higher the limit, the less likely it is mkvmerge will run into
such a situation. With 50 MB the problem disappears for the provided
test file.
Workaround for #2393.
If there's no duration, the current entry will be buffered. As soon as
the following entry is found, the difference between the start
timestamps of the current and buffered blocks will be used as the
buffered block's duration.
Second part of the implementation of #2397.
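The buffering scheme described above can be sketched as follows (C++17). `entry` and `duration_filler` are hypothetical names used for illustration only. What happens to an entry that is still buffered at the end of the stream is not covered by the description and is therefore left out.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical entry type; an empty duration means "no duration".
struct entry {
  std::int64_t start_ns{};
  std::optional<std::int64_t> duration_ns;
};

class duration_filler {
public:
  // Feed the next entry; returns the entries that are now complete.
  std::vector<entry> add(entry const &current) {
    std::vector<entry> ready;

    // A buffered entry receives the distance to the current start timestamp.
    if (m_buffered.has_value()) {
      m_buffered->duration_ns = current.start_ns - m_buffered->start_ns;
      ready.push_back(*m_buffered);
      m_buffered.reset();
    }

    // Entries with a duration (even one of 0) pass through unchanged;
    // entries without one wait for their successor.
    if (current.duration_ns.has_value())
      ready.push_back(current);
    else
      m_buffered = current;

    return ready;
  }

private:
  std::optional<entry> m_buffered;
};
```

Feeding an entry starting at 1000 ns without a duration returns nothing; feeding a following entry starting at 3000 ns then returns the first entry with a duration of 2000 ns.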
If there's no duration, the current entry will be buffered. As soon as
the following entry is found, the difference between the start
timestamps of the current and buffered blocks will be used as the
buffered block's duration.
Part of the implementation of #2397.