20 Commits

Author SHA1 Message Date
Moritz Bunkus
e64e58c0a3
BCP47: only normalize DCNC tags really part of BCP47 "private use" range 2022-05-04 20:54:18 +02:00
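To illustrate the range mentioned in commit e64e58c0a3: BCP 47 (RFC 5646) reserves the primary language subtags `qaa` through `qtz` for private use. The following is a minimal sketch of such a range check. The function name `is_private_use_language` is hypothetical and not taken from the project, and the actual check in the commit may look different.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper, not the project's code: returns true if `subtag` is a
// three-letter language subtag within the private use range qaa..qtz.
// The comparison is case-insensitive, as BCP 47 tags are.
bool is_private_use_language(std::string const &subtag) {
  if (subtag.size() != 3)
    return false;

  std::string lower;
  for (unsigned char c : subtag) {
    if (!std::isalpha(c))
      return false;
    lower += static_cast<char>(std::tolower(c));
  }

  // First letter must be 'q', second letter 'a'..'t', third letter is any letter.
  return (lower[0] == 'q') && (lower[1] >= 'a') && (lower[1] <= 't');
}
```

With this sketch `QMS` lies inside the range, while `qua` and `eng` do not.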
Moritz Bunkus
39529c226b
languages/scripts/regions/IANA lists: use different method of initialization
The prior method was to generate one line of
`g_container.emplace_back(…)` per entry in the list and let the
compiler chew on that. Each string argument in that call was written as
`u8"Some Name"s`, meaning as a std::string instance.

Drawbacks:

• takes the compiler ages to compile, even forcing me to drop all
  optimizations for the ISO-639 language list file

• even smaller files such as the IANA language subtag registry lists
  take more than 30s to compile

• because optimizations are disabled, initialization is actually not as
  fast as it could be

The new method uses a plain C-style array of structs with `char
const *` entries for the initial list. The initialization method then
copies the entries from that list to the actual container, again using
`emplace_back(…)`.

This yields sub-1s compilation times even with the longest file, the
ISO-639 language list, and the runtime initialization is actually
faster.
2022-04-23 00:00:15 +02:00
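The initialization scheme described in commit 39529c226b can be sketched roughly as follows. This is an illustration under assumptions, not the project's actual code: the type names `language_t` and `language_init_t`, the container `g_languages`, the function `init_languages` and the three sample entries are all hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical entry type stored in the real container.
struct language_t {
  std::string name;
  std::string alpha_3_code;

  language_t(std::string p_name, std::string p_code)
    : name{std::move(p_name)}, alpha_3_code{std::move(p_code)} {}
};

// Plain C-style struct holding only `char const *`, so the compiler does not
// have to emit one std::string construction call per list entry.
struct language_init_t {
  char const *name;
  char const *alpha_3_code;
};

// Sample data only; the generated lists are far longer.
static language_init_t const s_languages_init[] = {
  { "English", "eng" },
  { "French",  "fre" },
  { "German",  "ger" },
};

std::vector<language_t> g_languages;

// Copies the static array into the container at runtime via emplace_back.
void init_languages() {
  std::size_t const num_entries = sizeof(s_languages_init) / sizeof(s_languages_init[0]);

  g_languages.clear();
  g_languages.reserve(num_entries);

  for (auto const &entry : s_languages_init)
    g_languages.emplace_back(entry.name, entry.alpha_3_code);
}
```

Calling `init_languages()` once at startup fills `g_languages`. The static array is plain constant data, so the `std::string` instances are only created inside the short loop.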
Moritz Bunkus
c50e582fa4
GUI: BCP47: show warning if script should be suppressed
Part of the implementation of #3307.
2022-03-29 21:15:53 +02:00
Moritz Bunkus
d214fc30d0
build system: IANA updater: refactor to use ERB for cleaner code 2022-03-29 21:15:53 +02:00
Moritz Bunkus
14a0bec2cf
BCP47: normalize DCNC tags from BCP47 "private use" range to BCP47 equivalents
Replaces e.g. `QMS` with `cmn-Hans`.

Part of the implementation of #3307.
2022-03-28 18:46:09 +02:00
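The replacement described in commit 14a0bec2cf could be modeled as a simple lookup table. The sketch below is an assumption about the shape of such a mapping, not the project's implementation. The function name `normalize_dcnc_tag` is hypothetical, and the table contains only the single pair named in the commit message.

```cpp
#include <algorithm>
#include <cctype>
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical helper: returns the BCP 47 equivalent for a DCNC tag, or
// std::nullopt if the tag is not in the table. Lookup is case-insensitive.
std::optional<std::string> normalize_dcnc_tag(std::string tag) {
  static std::unordered_map<std::string, std::string> const s_equivalents{
    { "qms", "cmn-Hans" }, // the only pair given in the commit message
  };

  std::transform(tag.begin(), tag.end(), tag.begin(),
                 [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

  auto const itr = s_equivalents.find(tag);
  if (itr == s_equivalents.end())
    return std::nullopt;

  return itr->second;
}
```

Returning `std::optional` lets the caller keep the original tag unchanged when no equivalent is known.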
Moritz Bunkus
4ce17cfddb
BCP47: default to normalize to canonical form
Normalization can be turned off via the `--normalize-language-ietf
off` command line arguments.

Part of the implementation of #3307.
2022-03-28 18:46:09 +02:00
Moritz Bunkus
7767846416
BCP47: extlangs, variants: include whether entries are deprecated
Part of the implementation of #3307.
2022-03-26 13:40:09 +01:00
Moritz Bunkus
37d48d5d2f
IANA registry parser: parse & format entries for mapping to preferred values
Part of the implementation of #3307.
2022-03-26 00:21:58 +01:00
Moritz Bunkus
c0e49abf8e
IANA registry parser: refactor lambdas to methods 2022-03-25 23:43:34 +01:00
Moritz Bunkus
06dce25865
BCP 47: add support for grandfathered language tags
Part of the implementation of #3307.
2022-03-24 22:39:32 +01:00
Moritz Bunkus
2c5b3c27a8
IANA language subtag registry: fix reading continuation lines
Also shorten certain very long descriptions.
2021-08-04 22:47:48 +02:00
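For context on commit 2c5b3c27a8: in the IANA language subtag registry file, a field body may be folded across several lines, with each continuation line starting with whitespace. The following sketch shows one way to unfold such lines. It is not the project's parser, and the function name `unfold_lines` is hypothetical.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: appends each continuation line (one starting with a
// space or tab) to the preceding line, separated by a single space.
std::vector<std::string> unfold_lines(std::vector<std::string> const &lines) {
  std::vector<std::string> unfolded;

  for (auto const &line : lines) {
    bool const is_continuation = !line.empty() && ((line[0] == ' ') || (line[0] == '\t'));

    if (is_continuation && !unfolded.empty()) {
      // Skip the leading whitespace of the continuation line.
      auto const first = line.find_first_not_of(" \t");
      if (first != std::string::npos) {
        unfolded.back() += ' ';
        unfolded.back() += line.substr(first);
      }

    } else
      unfolded.push_back(line);
  }

  return unfolded;
}
```

A `Description:` field split over two lines becomes a single line after unfolding, while record separators and ordinary field lines pass through unchanged.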
Moritz Bunkus
20437cb0f6
build system: move file download handling to dedicated module 2021-07-17 12:26:48 +02:00
Moritz Bunkus
11584c416a
BCP 47: use emplace_back for initialization of IANA language subtag registries
It's much faster than using the initializer lists. See previous commit
for more details.
2021-01-26 14:53:31 +01:00
Moritz Bunkus
ed309582ce
BCP 47: various lists: cosmetics (remove superfluous space at end of row) 2021-01-26 14:53:30 +01:00
Moritz Bunkus
82b4c7eaf9
Rakefile: remove debug output 2020-09-07 17:53:37 +02:00
Moritz Bunkus
2156af77fa
Rakefile: add dev target for updating all lists 2020-09-07 17:49:33 +02:00
Moritz Bunkus
19e61a2d2e
IANA language subtag registry: make registry download & parsing reusable
Part of the implementation of #2919.
2020-09-07 17:49:32 +02:00
Moritz Bunkus
d5dbdb0a7e
replace outdated link to GPLv2 with current one 2020-08-01 18:03:54 +02:00
Moritz Bunkus
dff305d136
IANA Language Subtag Registry: add extended language subtags
Part of the implementation of #2419.
2020-07-05 11:35:31 +02:00
Moritz Bunkus
ef1a803687
Rakefile: add target for generating IANA language subtag registry 2020-07-02 19:09:47 +02:00