Ripping HLS videos from Yuja*

Remote classes are starting, and you know what that means: long introduction videos, and plenty of equally long lectures to come. It feels like I always get a migraine trying to watch these videos. No, my professors aren’t particularly boring, and no, the material isn’t too stale… But the effort I have to put into simply loading the videos is just Sisyphean. The places these videos are uploaded to are so obscure and brittle that I really can’t imagine how they were found to begin with.

Of horse, the video isn’t just a file. Of horse, the JavaScript is enough to down the mightiest of PCs. Of horse, the proprietary video player is so much worse than the one that ships with your browser.

How could you expect any less?

It’s simply too much effort to upload a plain MP4 file and say, “Hey, watch this!” No, that’d be too frictionless, too easy. After all, suffering builds character! If something’s easy or reasonable, it’s worthless. Right?

Well, this time I had to download a video with subtitles from Yuja, and I learned about HLS. Maybe you’ll find it useful, or learn a bit yourself. If you don’t wanna read my devlog and just want to download from Yuja quick, skip to the end.

HLS videos

Yuja serves videos using HLS, which is slightly more annoying to download than a plain video file. (You can’t just use the “F12→Network and look for videos” trick.)

The video’s split into variable-sized chunks, each at a different URL, a list of which is stored in an M3U8 file. If you find the M3U8 file (F12→Network should do the trick this time), you can download the individual chunks and put them into a single video file.
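To make that concrete, here is a small sketch of what such a playlist looks like and how the chunk list can be pulled out of it. The playlist contents and the names sample_playlist and list_chunks are made up for illustration; a real Yuja playlist will have different tags and chunk names.

```shell
# A made-up playlist for illustration; real ones come from the server.
sample_playlist() {
cat <<'EOF'
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:10
#EXTINF:9.8,
chunk_000.ts
#EXTINF:10.0,
chunk_001.ts
#EXT-X-ENDLIST
EOF
}

# Print only the chunk URIs from the playlist on stdin:
# drop tag/comment lines (those starting with '#') and blank lines.
list_chunks() {
    grep -v -e '^#' -e '^[[:space:]]*$'
}

sample_playlist | list_chunks
```

Everything that starts with # is a tag or comment; whatever is left is a chunk location, in playback order.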

At first, I did something like this:

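for link in $(grep -v '^#' blah.m3u8); do
    # Fetch each chunk, then append it to the combined file
    wget -O part "$link"
    cat part >> whole
done

# Convert the concatenated chunks into an MP4
ffmpeg -i whole video.mp4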
… which works just fine! But, as it turns out, there’s a more elegant way to do it:

$ ffmpeg -i "$M3U8_URL" video.mp4

Yea, ffmpeg’ll handle it all for you. What a good boy. :)
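If re-encoding is slower than you'd like, a possible variant is to ask ffmpeg to copy the streams as they are. This is only a sketch: hls_copy is a hypothetical helper name, and whether a plain stream copy works depends on the codecs inside the chunks, which I'm assuming are MP4-friendly.

```shell
# Hypothetical helper: save an HLS stream ($1) to a file ($2) without re-encoding.
# "-c copy" tells ffmpeg to copy the audio/video streams instead of transcoding them.
hls_copy() {
    ffmpeg -i "$1" -c copy "$2"
}

# Usage (needs ffmpeg and network access):
#   hls_copy "$M3U8_URL" video.mp4
```

Quoting the URL matters here, since M3U8 URLs tend to contain ? and &, which the shell would otherwise interpret.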

Yuja, in particular

If you want to programmatically and easily get videos from Yuja (skipping out on the tedious F12→Network business), you’ll have to download and parse their video metadata.

Video URLs in Yuja are structured like so (where $SUB is your subdomain):

https://$SUB.yuja.com/V/Video?v=$ID
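If you're scripting this, both pieces can be pulled out of a video URL with plain shell string handling. A minimal sketch, assuming the URL really follows the pattern above; yuja_sub and yuja_id are hypothetical helper names.

```shell
# Print the subdomain ($SUB) of a Yuja video URL.
yuja_sub() (
    host=${1#*://}                 # strip the scheme ("https://")
    host=${host%%/*}               # strip the path and query
    printf '%s\n' "${host%%.*}"    # keep only the first label of the hostname
)

# Print the video ID ($ID), i.e. the value of the "v" query parameter.
yuja_id() (
    query=${1#*\?}                 # everything after the "?"
    printf '%s\n' "$query" | tr '&' '\n' | sed -n 's/^v=//p' | head -n 1
)

# Example with a made-up URL:
yuja_sub 'https://example.yuja.com/V/Video?v=12345'
yuja_id 'https://example.yuja.com/V/Video?v=12345'
```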

You can get a JSON file of the metadata on a given video from this URL:

https://$SUB.yuja.com/P/Data/GetVideoListNodeInfo?videoPID=$ID

From there, you can find:

  • videoHLSFileKey ($HLS_KEY), which is used for getting the M3U8 URL
  • captionFileKey ($SUB_KEY), which is used for getting the subtitle URL
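Here's a rough sketch of grabbing those two keys in shell. The helper names (yuja_info_url, json_string) and the sample response are made up, and I'm assuming both keys show up as plain string values somewhere in the JSON. The extraction is crude text matching rather than real JSON parsing; a tool like jq would be sturdier if you have it.

```shell
# Build the metadata URL for subdomain $1 and video ID $2.
yuja_info_url() {
    printf 'https://%s.yuja.com/P/Data/GetVideoListNodeInfo?videoPID=%s\n' "$1" "$2"
}

# Print the first string value stored under key $1 in the JSON on stdin.
# This is plain text matching, not a JSON parser.
json_string() {
    grep -o "\"$1\"[[:space:]]*:[[:space:]]*\"[^\"]*\"" |
        head -n 1 |
        sed 's/^"[^"]*"[[:space:]]*:[[:space:]]*"//; s/"$//'
}

# Made-up response, just to show the extraction:
sample='{"video":{"videoHLSFileKey":"abc123","captionFileKey":"cap456"}}'
printf '%s\n' "$sample" | json_string videoHLSFileKey
printf '%s\n' "$sample" | json_string captionFileKey

# Against the real thing (needs network access):
#   curl -s "$(yuja_info_url "$SUB" "$ID")" | json_string videoHLSFileKey
```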

Caption URLs are structured just like this:

https://$SUB.yuja.com/P/DataPage/CaptionFile/$SUB_KEY

M3U8 URLs seem to be structured as:

https://my.yuja.com/P/Data/VideoUrl/level${HLS_KEY}/720p/${HLS_KEY}.m3u8.m3u8?dist=yuja-edits&key=${HLS_KEY}/720p/${HLS_KEY}.m3u8

… no, that’s not a typo. I’m sure the back-end’s prettyyy.
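Both URL shapes are easy to build with printf. A minimal sketch, using exactly the patterns above; yuja_caption_url and yuja_m3u8_url are hypothetical helper names, and the 720p part is hard-coded because that's the only variant shown here.

```shell
# Caption URL for subdomain $1 and caption key $2 ($SUB_KEY).
yuja_caption_url() {
    printf 'https://%s.yuja.com/P/DataPage/CaptionFile/%s\n' "$1" "$2"
}

# M3U8 URL for HLS key $1 ($HLS_KEY); the key appears four times in the pattern.
yuja_m3u8_url() {
    printf 'https://my.yuja.com/P/Data/VideoUrl/level%s/720p/%s.m3u8.m3u8?dist=yuja-edits&key=%s/720p/%s.m3u8\n' \
        "$1" "$1" "$1" "$1"
}

# Example with made-up keys:
yuja_caption_url example cap456
yuja_m3u8_url abc123
```

The result of yuja_m3u8_url can be handed straight to ffmpeg as the input URL, as long as it stays quoted.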

Put it all together

That’s all you need, really. I put together a shell script that’ll do the whole thing for you, right over here: GitHub, Forgejo.

On bitrot

However, this is the sort of post (and script) that’s very vulnerable to bitrot. They might switch around their back-end at any moment, might swap around API calls, or maybe even dye their hair yellow. I’ll keep this post and the script updated for as long as I’m dogfooding — a few months, at least — but after that I won’t notice when it bitrots.

I don’t like leaving things to rot, though. So, if you can’t get it working with a video, please send me the URL (via GitHub or e-mail), and an as-detailed-as-you-can description of the problem. I’ll try and get things fixed up, if I’m able.†

† Edits †

On bitrot, again

As of 2024, I haven’t needed to use this script since mid-2022. I can no longer test it, but will still try my best to fix any issues. Ideally, the project would be forked (or an alternative would be devised), though. If you do want to file a bug report, please be very specific! Remember that I can’t do any testing at all!

As of mid-2022, the method up there works for some videos’ M3U8s, but not all. Here is a new method that works for every video (… that I tried). Note that the previous method for getting captions still works for everything (I think).

From the GetVideoListNodeInfo JSON you got captionFileKey from, you can also get:

  • videoListNodePID ($LN_PID)

From there, you can get more metadata from https://$SUB.yuja.com/P/Data/VideoJSON by passing the following POST data:

video=$ID&node=$LN_PID

From there you can get this value:

  • videoLink ($VFL)

And now we have enough info to get the M3U8 link! You just have to make a request like this (replacing $ID and $VFL, ofc):

https://$SUB.yuja.com/P/Data/VideoSource?video=$VFL&videoPID=$ID

Then you can get the M3U8 link(s) in a list called videoSources.
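Here's a rough sketch of that chain in shell. Only the endpoints and parameter names come from the steps above; the helper names (yuja_video_json, yuja_source_url, m3u8_links) are made up, and I'm assuming the VideoSource endpoint sits on your own $SUB like the others and answers a plain GET. Since the exact shape of videoSources isn't spelled out here, m3u8_links just greps for anything that looks like an M3U8 URL (after unescaping any \/ in the JSON) instead of parsing it properly.

```shell
# POST the video ID ($2) and list-node PID ($3) to VideoJSON on subdomain $1.
# curl's --data sends the string as a form-encoded POST body.
yuja_video_json() {
    curl -s --data "video=$2&node=$3" "https://$1.yuja.com/P/Data/VideoJSON"
}

# Build the VideoSource URL from subdomain $1, videoLink $2 ($VFL) and video ID $3.
yuja_source_url() {
    printf 'https://%s.yuja.com/P/Data/VideoSource?video=%s&videoPID=%s\n' "$1" "$2" "$3"
}

# Print every quoted string on stdin that looks like an M3U8 URL.
m3u8_links() {
    sed 's#\\/#/#g' | grep -o 'https*://[^"]*\.m3u8[^"]*'
}

# Usage (needs network access):
#   yuja_video_json "$SUB" "$ID" "$LN_PID"      # look for videoLink in the output
#   curl -s "$(yuja_source_url "$SUB" "$VFL" "$ID")" | m3u8_links
```

If $VFL contains characters that aren't URL-safe, it would need percent-encoding before going into the query string; this sketch doesn't handle that.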

Bam! Now you’re good to go! =w=
