GUI: multiplexer: be much stricter when detecting chapter/tags/segment info files

When adding files the GUI has special handling for
chapter/tags/segment info files, whose paths must be specified in the
corresponding text inputs. Those files don't show up like other
regular source files do.

This type detection is done by comparing their content to certain
patterns via regular expressions. This recognition could wrongfully be
triggered if any such file was embedded in another file verbatim,
e.g. with a chapter XML file attachment in a Matroska file. When
trying to add that Matroska file, the GUI would treat it as a chapter
file instead of a regular one.

Fixes #3487.
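
A minimal sketch of the failure mode and of the stricter matching, using `std::regex` as a stand-in for `QRegularExpression` so that it builds without Qt. The function names and sample strings are hypothetical, and the start-of-buffer anchoring that the new patterns express with `^` is approximated here with `std::regex_constants::match_continuous`.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Old-style pattern: unanchored, so it may match anywhere in the buffer.
static const std::regex unanchoredChaptersRE{
  R"(<\?xml[^>]+version[\s\S]*\?>[\s\S]*<Chapters>)"};

// New-style pattern: optional leading XML comments, then the XML declaration.
// The leading "^" is left out here; anchoring is done via match_continuous.
static const std::regex anchoredChaptersRE{
  R"((<!--.*?-->\s*)*<\?xml[^>]+version[\s\S]*?\?>[\s\S]*?<Chapters>)"};

// Hypothetical helper: true if chapter XML occurs anywhere in the content.
bool matchesAnywhere(std::string const &content) {
  return std::regex_search(content, unanchoredChaptersRE);
}

// Hypothetical helper: true only if the match begins at the first character.
bool matchesAtStart(std::string const &content) {
  return std::regex_search(content, anchoredChaptersRE,
                           std::regex_constants::match_continuous);
}
```

With these definitions, a buffer consisting of container data followed by a verbatim chapter XML document satisfies `matchesAnywhere` but not `matchesAtStart`, while a plain chapter XML file, optionally preceded by XML comments, satisfies both.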
Moritz Bunkus 2023-02-21 20:38:55 +01:00
parent 95c05a6989
commit 8009c92fa6
2 changed files with 21 additions and 4 deletions

@@ -21,6 +21,13 @@
the operating system's language is not available for MKVToolNix. This might
also happen on Linux if e.g. `en_GB` is set, even though `en_US` is
available. Now English (`en_US`) will be selected instead. Fixes #3486.
+* MKVToolNix GUI: multiplexer: when adding files the GUI has special handling
+for chapter/tags/segment info files. This is done by comparing their content
+to certain patterns. This recognition could wrongfully be triggered if any
+such file was embedded in another file verbatim, e.g. with a chapter XML
+file attachment in a Matroska file. When trying to add that Matroska file,
+the GUI would treat it as a chapter file instead of a regular one. This
+content-based detection was fixed. Fixes #3487.
# Version 74.0.0 "You Oughta Know" 2023-02-12

@@ -9,6 +9,8 @@
#include <QRegularExpression>
#include <QTimer>
+#include "common/mm_proxy_io.h"
+#include "common/mm_text_io.h"
#include "common/qt.h"
#include "common/timestamp.h"
#include "mkvtoolnix-gui/merge/file_identification_thread.h"
@@ -39,9 +41,9 @@ FileIdentificationWorker::FileIdentificationWorker(QObject *parent)
{
auto p = p_func();
p->m_simpleChaptersRE = QRegularExpression{R"(^CHAPTER\d{2}=[\s\S]*CHAPTER\d{2}NAME=)"};
-p->m_xmlChaptersRE = QRegularExpression{R"(<\?xml[^>]+version[\s\S]*\?>[\s\S]*<Chapters>)"};
-p->m_xmlSegmentInfoRE = QRegularExpression{R"(<\?xml[^>]+version[\s\S]*\?>[\s\S]*<Info>)"};
-p->m_xmlTagsRE = QRegularExpression{R"(<\?xml[^>]+version[\s\S]*\?>[\s\S]*<Tags>)"};
+p->m_xmlChaptersRE = QRegularExpression{R"(^(<!--.*?-->\s*)*<\?xml[^>]+version[\s\S]*?\?>[\s\S]*?<Chapters>)"};
+p->m_xmlSegmentInfoRE = QRegularExpression{R"(^(<!--.*?-->\s*)*<\?xml[^>]+version[\s\S]*?\?>[\s\S]*?<Info>)"};
+p->m_xmlTagsRE = QRegularExpression{R"(^(<!--.*?-->\s*)*<\?xml[^>]+version[\s\S]*?\?>[\s\S]*?<Tags>)"};
}
FileIdentificationWorker::~FileIdentificationWorker() {
@@ -167,7 +169,15 @@ FileIdentificationWorker::determineIfFileThatShouldBeSelectedElsewhere(QString c
if (!file.open(QIODevice::ReadOnly))
return IdentificationPack::FileType::Regular;
-auto content = QString::fromUtf8(file.read(1024 * 10));
+auto contentBytes = file.read(1024 * 10);
+auto bytes = reinterpret_cast<unsigned char const *>(contentBytes.data());
+auto bom_type = byte_order_mark_e::none;
+unsigned int bom_length{};
+if (mm_text_io_c::detect_byte_order_marker(bytes, contentBytes.size(), bom_type, bom_length))
+bytes += bom_length;
+auto content = QString::fromUtf8(bytes);
if (content.contains(p->m_simpleChaptersRE) || content.contains(p->m_xmlChaptersRE))
return IdentificationPack::FileType::Chapters;
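
The byte order mark handling added above can be sketched in isolation as follows. This is an illustration under stated assumptions rather than MKVToolNix code: `utf8BomLength` and `startsWithSimpleChapters` are hypothetical names, only the UTF-8 BOM is handled, and `std::regex` with `match_continuous` stands in for the `^`-anchored `QRegularExpression`. The point it shows is that once a pattern must match at the very start of the buffer, a leading BOM has to be skipped first, otherwise a legitimate chapter file saved with a BOM would no longer be recognized.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper, not MKVToolNix API: returns the length of a UTF-8 BOM
// (bytes EF BB BF) at the start of the buffer, or 0 if there is none.
std::size_t utf8BomLength(std::string const &bytes) {
  return bytes.compare(0, 3, "\xEF\xBB\xBF") == 0 ? 3 : 0;
}

// Hypothetical helper: true if the buffer, after skipping an optional UTF-8
// BOM, begins with simple (OGM-style) chapter entries.
bool startsWithSimpleChapters(std::string const &bytes) {
  static const std::regex simpleChaptersRE{
    R"(CHAPTER\d{2}=[\s\S]*CHAPTER\d{2}NAME=)"};

  // Start matching right after the BOM, if any.
  auto begin = bytes.cbegin() + static_cast<std::ptrdiff_t>(utf8BomLength(bytes));

  // match_continuous forces the match to begin at `begin`, like a leading "^".
  return std::regex_search(begin, bytes.cend(), simpleChaptersRE,
                           std::regex_constants::match_continuous);
}
```

In this sketch a chapter file is accepted with or without a BOM, while the same text preceded by unrelated data is rejected.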