Created UTF-8 invalid sequence (markdown)

Moritz Bunkus 2015-02-07 21:13:10 +01:00
parent a9306d5b34
commit 025284fec3

19
UTF-8-invalid-sequence.md Normal file

@ -0,0 +1,19 @@
# mkvmerge outputs an error: "Error: cstrutf8_to_UTFstring: Invalid UTF-8 sequence encountered"
## The problem
mkvmerge reports the error message above. What does it mean? Why does it occur? And how do I get rid of it?
## The answer
This is caused by a combination of input files that contain certain pieces of invalid data (strings that are not properly encoded in UTF-8) and old mkvmerge versions being very strict about what they accept. Newer versions (starting with v4.5.0) simply discard those invalid pieces and process the rest of the file without a problem.
Users of a certain application (Popcorn MKV AudioConverter) have often complained that mkvmerge exits with the error message shown above. Analysis showed that the cause was invalidly encoded strings inside a Matroska file (all Unicode strings in Matroska must be encoded in UTF-8, and all non-Unicode strings may only include ASCII characters).
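To illustrate what "invalidly encoded" means here, the following is a minimal Python sketch of the two rules stated above. It is not mkvmerge's actual code; the function names `is_valid_utf8` and `is_valid_ascii` are made up for this example.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes form a well-formed UTF-8 sequence."""
    try:
        raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


def is_valid_ascii(raw: bytes) -> bool:
    """Return True if every byte is in the ASCII range (0x00-0x7F)."""
    return raw.isascii()


# A Unicode string that was written as Latin-1 instead of UTF-8 fails the check.
good = "Größe".encode("utf-8")
bad = "Größe".encode("latin-1")
print(is_valid_utf8(good), is_valid_utf8(bad))
```

A string such as `bad` is the kind of data that would trigger the error in old mkvmerge versions, assuming the file stores it in an element that is supposed to hold UTF-8.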
Popcorn MKV AudioConverter seems to be a frontend for various tools that are executed in the background in order to convert and encode from one format to another. One of the programs that is run prior to mkvmerge produces a Matroska file that is then used as an input file for mkvmerge. This Matroska file contains Unicode strings with invalid characters, and mkvmerge chokes on these invalid strings.
Therefore this error message does not really indicate a bug in mkvmerge. The proper solution would be to fix the tool that creates the invalid encoding (I don't know which tool that is).
However, I do recognize that this may be rather difficult depending on who that tool's author is. Therefore newer mkvmerge versions (starting with v4.5.0) will simply discard such strings with invalid encodings. You can [update mkvtoolnix](http://www.bunkus.org/videotools/mkvtoolnix/downloads.html), and it should work with those invalid files.
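The difference between the old and the new behavior can be sketched roughly as follows. This is an illustration in Python under the assumption that a string with an invalid encoding is dropped as a whole; it is not how mkvmerge is actually implemented, and the names `read_string_strict`, `read_string_lenient` and `read_all` are hypothetical.

```python
from typing import Iterable, List, Optional


def read_string_strict(raw: bytes) -> str:
    """Old-style behavior: an invalid sequence aborts processing."""
    # Raises UnicodeDecodeError on malformed input.
    return raw.decode("utf-8")


def read_string_lenient(raw: bytes) -> Optional[str]:
    """New-style behavior: an invalid string is discarded (None)."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


def read_all(strings: Iterable[bytes]) -> List[str]:
    """Keep every valid string and silently skip the invalid ones."""
    decoded = (read_string_lenient(raw) for raw in strings)
    return [text for text in decoded if text is not None]


# The second entry is not valid UTF-8 and is skipped; the rest survive.
print(read_all([b"Title", b"\xff\xfe", "Ünicode".encode("utf-8")]))
```

With the strict variant the whole run fails on the first bad string; with the lenient variant only the bad string is lost, which matches the behavior described above for v4.5.0 and newer.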
Categories: [merging](Category-merging)