70 Commits

Author SHA1 Message Date
rlaphoenix
2b7fc929f6 Rework the HLS downloader, add support for new downloaders
- It now downloads all segment files multi-threaded first before any decryption or merging operations (excluding init data, which will be downloaded in sequence/order after all the segments are downloaded)
- Once all segments are downloaded it then starts to go through and do any merging/decryption/init data stuff/e.t.c afterwards.
- Segments are no longer decrypted one by one. If segments use the same EXT-X-KEY data, then they will be merged together and then decrypted. This should see a noticeable speed increase for Widevine DRM.
2024-02-15 17:26:39 +00:00
rlaphoenix
709901176e Use CRC32 instead of MD5 for Track IDs in DASH/HLS 2024-02-15 10:56:51 +00:00
rlaphoenix
bd185126b6 HLS: Skip merging continuity if all segments were skipped
If all segments of a continuity is skipped, i.e. by OnSegmentFilter, then this code fails as the folder wouldn't exist.
2024-02-13 17:03:42 +00:00
rlaphoenix
cd194e3192 Add new Track Event, OnSegmentDownloaded
Like OnDownloaded but called every time a DASH or HLS segment is downloaded. The path to the downloaded segment file is passed to the callable.
2024-02-10 18:10:09 +00:00
rlaphoenix
87779f4e7d Move Track OnDownloaded event before decryption 2024-02-10 18:05:35 +00:00
rlaphoenix
c18fe5706b Pass DRM and Segment objects to Track OnDecrypted event 2024-02-10 17:48:26 +00:00
rlaphoenix
439e376b38 No longer pass the track through track events
If you are setting a callable onto a track event, then you have access to the track variable, so just include/use that in your lambda/callable.
2024-02-10 17:47:12 +00:00
rlaphoenix
a544b1e867 Merge HLS segments first by discontinuity then via FFmpeg
HLS playlists where each segment is in an mp4 container seems to corrupt when the EXT-X-MAP is changed out, unless you first merge segments by discontinuity and then merge the merges via FFmpeg (which demuxes all the merged segment continuities and then concatanates them together, probably giving it new init data too).
2024-02-09 08:33:17 +00:00
rlaphoenix
167b45475e Only decode text direction entities in Sub files
Previously, all entities were decoded in Subtitle files because of a problem with SubtitleEdit and it's /ReverseRtlStartEnd option not being entity-aware.

It actually ends up reversing the `;` of `&rlm;`, instead of the actual value of `&rlm;`. Therefore, I decoded all entities before SubtitleEdit could have processed the Subtitle, but this has caused problems with more advanced formats like TTML and WebVTT as `&lt;` would decode to `<` causing syntax errors, among other problematic characters.

According to the TTML and WebVTT spec, html entity encoding is allowed, and that makes sense or you wouldn't be able to use `<` etc. Any failure for players to show the decoded character would be a player problem and be out of scope with Devine.
2024-02-05 12:37:21 +00:00
rlaphoenix
2056e056a4 Unescape HTML Entities in Subtitles after Downloading
This fixes some Subtitles having e.g., `&amp;` instead of just `&`, but especially for special entities like `&rlm;` which enables Right-to-Left mode on Hebrew and Arabic Subtitles.
2024-01-18 16:25:39 +00:00
rlaphoenix
e8e3d4a90f Remove 5-attempt loop from DASH and HLS Downloads
These are unnecessary now as all downloaders have retry functionality built-in.
2024-01-09 13:00:39 +00:00
rlaphoenix
cc4900a2ed Remove uses of the downloader's silent arg in DASH and HLS
This was originally done to prevent *all* aria2c logs unless on the last attempt, at which if it failed all attempts it would let aria2c log the error.

However, that's bad practice as aria2c may produce errors or warnings on say the 3rd attempt, and the 3rd attempt may have otherwise succeeded, with warnings or errors. It also generally shouldn't be necessary.
2024-01-09 12:54:27 +00:00
rlaphoenix
fa3cee11b7 Move Download Cancel/Skip Events to constants 2024-01-09 11:55:05 +00:00
rlaphoenix
ce457df151 Change wording from Download Stopped to Download Cancelled 2024-01-09 11:38:58 +00:00
rlaphoenix
d566aa2547 Show Licensing and Licensed Messages via Rich 2024-01-09 11:34:14 +00:00
rlaphoenix
f28a6dc28a Fix usage of __all__ 2024-01-09 02:31:22 +00:00
rlaphoenix
c0d940b17b Remove Track.needs_proxy
Ok, so there's a few reasons this was done.

1) Design-wise it isn't valid to have --proxy (or via config/otherwise) set a proxy, then unpredictably have it bypassed or disabled. If I specify `--proxy 127.0.0.1:8080`, I would expect it to use that proxy for all communication indefinitely, not switch in and out depending on the track or service.

2) With reason 1, it's also a security problem. The only reason I implemented it in the first place was so I could download faster on my home connection. This means I would authenticate and call APIs under a proxy, then suddenly download manifests and segments e.t.c under my home connection. A competent service could see that as an indicator of bad play and flag you.

3) Maintaining this setup across the codebase is extremely annoying, especially because of how proxies are setup/used by Requests in the Session. There's no way to tell a request session to temporarily disable the proxy and turn it back on later, without having to get the proxy from the session (in an annoying way) store it, then remove it, make the calls, then assuming your still in the same function you can add it back. If you're not in the same function, well, time for some spaghetti code.

---

tldr; -1 ux/design/expectations with CLI, -1 security aspect, -1 code maintenance, but only +1 for potentially increased download speeds in certain scenarios.
2023-12-29 20:25:57 +00:00
rlaphoenix
7cec16d8ab Validate track languages in HLS.to_tracks 2023-12-02 22:40:41 +00:00
Shivelight
c31ee338dc
Add option for automatic subtitle character encoding normalization (#68)
* Add option for automatic subtitle character encoding normalization

The rationale behind this function is that some services use ISO-8859-1
(latin1) or Windows-1252 (CP-1252) instead of UTF-8 encoding, whether
intentionally or accidentally. Some services even stream subtitles with
malformed/mixed encoding (each segment has a different encoding).

* Remove Subtitle parameter `auto_fix_encoding`

Just always attempt to fix encoding. If the subtitle is neither UTF-8 nor CP-1252, then it should realistically error out instead of producing garbage Subtitle data anyway.

* Move Subtitle encoding fixing code out of if drm tree

* Use chardet as a last ditch effort fixing Subs, or return original data

* Move Subtitle.fix_encoding method to utilities as try_ensure_utf8

* Add Shivelight as a contributor

---------

Co-authored-by: rlaphoenix <rlaphoenix@pm.me>
2023-12-02 11:00:55 +00:00
rlaphoenix
4b8cfabaac Fix all Ruff and isort linter errors 2023-12-02 09:57:13 +00:00
rlaphoenix
6cfbaa7db1 Pass cookies to the aria2c and requests downloaders
For aria2c I've simplified the operation by offloading most of the work for creating a cookie header by just re-doing what Python-requests does. This results in the exact same cookies Python-requests would have used in a requests.get() call or such. It supports multiple of the same-name cookies under different domains/paths based on the URI of the mock request.
2023-05-29 22:23:39 +01:00
rlaphoenix
fd52073605 Skip merging of HLS segments if --skip-dl is used
Partially fixes #61
2023-05-27 20:20:07 +01:00
rlaphoenix
df2f9b85ae Use urljoin instead of an if check and + op in HLS
This used to be used even before devine was public, but it was constantly changed back and forth between an urljoin(), another form of urljoin (something custom or something I can't remember), and an if check + addition.

However, I can confirm that a simple if check will not work as the Base URI might not even be in the same relative root. The if checks have also been inconsistent with some checking if it starts with http(s)://, and some checking if it does not have the base URI at the start of the string.

This if check method does not work as well as an urljoin() has the potential to. It also fixes some services as some HLS playlists would have the m3u8 URL on a completely different root, subdomain, or even domain, causing it to completely break when trying to download segments.
2023-05-21 00:06:30 +01:00
rlaphoenix
6e844409ae Set stop event & mark track failed if HLS Session DRM fails to license 2023-05-19 19:07:06 +01:00
rlaphoenix
c9ecab444f Use range offset when calculating HLS init map byte ranges 2023-05-19 18:38:33 +01:00
rlaphoenix
dd64212ad2 Move download_segment() from DASH/HLS download_track() to Class
Various overall small readability improvements have also been made.
2023-05-17 03:20:01 +01:00
rlaphoenix
6cdde3efb0 Override the downloader more efficiently in DASH/HLS when Range is used 2023-05-17 01:33:06 +01:00
rlaphoenix
a45c784569 Replace download speeds with "Downloaded" text when finished 2023-05-16 21:59:03 +01:00
rlaphoenix
cb82febb7c Add ability to choose downloader via config 2023-05-12 06:42:33 +01:00
rlaphoenix
b92708ef45 Alter behaviour of --skip-dl to allow DRM licensing
Most people used --skip-dl just to license the DRM pre-v1.3.0. Which makes sense, --skip-dl is otherwise a pointless feature. I've fixed it so that --skip-dl worked like before, allowing license calls, while still supporting the new per-segment features post-v1.3.0.

Fixes #37
2023-05-11 22:17:41 +01:00
rlaphoenix
71cf2b4016 Fix rare issue where DASH/HLS dl speed divides by 0 2023-03-26 14:30:12 +01:00
rlaphoenix
d8acdda044 Silence DASH and HLS logs unless it's the last attempt 2023-03-12 00:09:02 +00:00
rlaphoenix
055bc927f5 Add a 5-attempt retry system to DASH & HLS downloads 2023-03-11 19:28:02 +00:00
rlaphoenix
111dac9264 Fix association of preceding HLS EXT-X-KEYs with m3u8 fork
This will improve efficiency and accuracy of getting appropriate DRM systems when downloading segments.

This can dramatically improve download speed from less than 50 kb/s to full speed if the HLS playlist used a lot of AES-128 EXT-X-KEYs. E.g., a unique key for each segment.

This was caused because the HLS.get_drm function took EVERY EXT-X-KEY, checked for supported systems, loaded them, and returned the supported objects. This meant it could load possibly 100s of AES-128 ClearKey objects (likely requiring URL downloads for the key URI) causing a huge delay before downloading each segment.
2023-03-09 21:46:48 +00:00
rlaphoenix
abf6c71688 Specify HLS Track Key IDs to prepare_drm
This also moves the init data code before drm related code, just so it has the init data ready to retrieve the Key ID from.
2023-03-08 22:45:41 +00:00
rlaphoenix
d175ffaf15 Add support for byte-range on HLS init maps 2023-03-04 12:21:28 +00:00
rlaphoenix
1b1412d498 Fix byte range calculation on HLS downloads
It was off by one. The final calculation for the right-side range needed to be converted from one-index to zero-index.
2023-03-04 12:18:19 +00:00
rlaphoenix
318832e6b2 Store DRM in the track.drm property in HLS and DASH 2023-03-04 11:49:53 +00:00
rlaphoenix
f8166f098c Apply threading lock to HLS DRM preparation
Without this, if two threads started at the same time there was a very good chance they would run the code and license twice, which is unnecessary.
2023-03-04 11:41:10 +00:00
rlaphoenix
c3a22431f0 Fix possible soft-lock in HLS if Queue is left empty after error 2023-03-04 11:11:20 +00:00
rlaphoenix
9fff14af30 Fix regression that broke pproxy 2023-03-03 08:53:28 +00:00
rlaphoenix
a3efadf00b Fix aria2c's segmented check for DASH/HLS 2023-03-03 07:52:13 +00:00
rlaphoenix
9e23ee13bb Remove silent args in aria2c calls for HLS/DASH 2023-03-03 07:52:13 +00:00
rlaphoenix
9f48aab80c Shutdown HLS & DASH dl pool, pass exceptions to dl
This results in a noticeably faster speed cancelling segmented track downloads on CTRL+C and Errors. It's also reducing code duplication as the dl code will now handle the exception and cleanup for them.

This also simplifies the STOPPING/STOPPED and FAILING/FAILED status messages by quite a bit.
2023-03-01 11:08:52 +00:00
rlaphoenix
624bb6fe75 Only calculate DASH/HLS dl speed if dl sizes are available 2023-03-01 11:08:52 +00:00
rlaphoenix
840db6e689 Move segment merging from dl to DASH/HLS classes 2023-03-01 08:54:35 +00:00
rlaphoenix
f4ad7a2e6c Mark track as stopping when skipping segments 2023-02-28 18:14:03 +00:00
rlaphoenix
383e7d9647 Add full support for CTRL+C on HLS and DASH 2023-02-28 18:05:04 +00:00
rlaphoenix
9cfda3bb9c Don't shutdown pool or the for loop will lock
Since I'm using `futures.as_completed()`, it will never ever for loop over all tracks and segments and will forever be stuck in the primary thread of the operation. I.e., main thread for the download track threads, or the track thread for the download segment threads.

I've also removed all future cancelled checks as they will never be cancelled before they get the chance to run, because no future cancel calls are made anymore.
2023-02-28 18:05:03 +00:00
rlaphoenix
acead803bd Remove unnecessary sleep calls at start of download threads 2023-02-28 16:38:10 +00:00