I Want to Be Synthesized
- Posted by mariteaux on April 17th, 2021 filed in Modding
Something to break up the Nirvana stuff today! Still music-related (though just wait, there are three pages of a Colton story in the works), but regardless–more extracting exclusive audio from obscure video game formats. Seems to be a recurring theme around here.
So I’ve been playing a lot of Amplitude recently. (The PS2 one, not the remake.) The game has an absolutely nutso soundtrack that tosses you between house and trance, grimy rap and pop punk, drum and bass, shiny early 2000s pop–name it, it’s in there. Several of the songs are also exclusive to the game, and as such, I wanted a way to listen to them without also destroying my fingers.
Now, Amplitude actually has two of each song on-disc, just in different formats. The audio you hear in-game is built from samples of each of the individual stem parts (which can be reconstructed using onyxite’s Samplitude–wait for my “Dope Nose” custom, seriously), but there’s also a Soundtrack mode that lets you listen to just the songs without playing the game, and these sound like the normal stereo mixes of each song. You can find the former in the game’s ARK file and the latter in the AUDIO directory on-disc.
Where they are encoded in STR format. Which is, in fact, not the video STR format.
So let’s talk about PS2 hardware decoding for a brief moment. Sony packed in (on both the PS1 and PS2) chips for encoding and decoding two specific formats of audio and video, those being ADPCM (a compression scheme that also sees heavy use squeezing voice down for telephony) and various flavors of MPEG video (the PS2 naturally supports MPEG-2, being a DVD player, but it also supports MPEG-1 given that Guitar Hero II has MPEG-1 videos in PSS format). Because both machines are rather anemic, these are handled in hardware rather than software, freeing up the main CPU for other tasks.
Normally, STR video is encoded in motion JPEG, a very old, simplistic, and kinda shitty video codec where all the frames are simply JPEG images without any interframe compression between them. (Interframe compression works across frames, encoding only what changes from one to the next–ever seen a YouTube video glitch and smear? That’s the interframe compression fucking up.) Given the simplicity, plenty of tools have been written to extract STR video from games or play them in a special player.
Problem being, this isn’t STR video. This is just audio named STR.
I tried a variety of tools with supposed support to get these things to play. I first tried QuickBMS, which is what I used to rip the Madden soundtracks and supposedly had Amplitude support, but that didn’t do a whole lot. (I’m guessing it’s more for ripping the ARK apart than any audio files–more experimentation needed.) MFAudio had no clue what to do with them either. Then I learned from a forum discussion that a tool called Jaeder Naub had gotten Amplitude support, though it never found any PCM audio in them. (I’ve uploaded the copy I found on the Wayback Machine because it seems otherwise lost.) The Foobar2000 component for vgmstream, a pretty popular library for playing back game audio with supposed STR support, also refused to take them.
There was one avenue I had left to take. In an obscure thread on the XeNTaX forum, an unlikely face popped up with a small snippet of C code and a description of what the stereo mixes of each song are encoded in.
Maxton is kind of a legendary (and unfortunately recently deceased) name in a bunch of circles, but most relevantly to us, in rhythm game hacking circles, for all his work on (among other things) Forge-based games. (Forge is the next generation of Milo, the engine that everything from Amplitude 2003 to Rock Band 3 runs on.) It was both a total surprise and also not really surprising at all to see him trying to crack said STR files. And he left code behind! I had to try to compile it.
So out I ran to get the Visual Studio tools (you can get them separately from the IDE, fun fact), and a small bit of bugfixing later, I had playable audio.
Now, the tool is not exactly the most convenient thing (but it also wasn’t meant to be). No wildcard stuff, and more importantly, it doesn’t write the files in WAV, only in RAW. (WAV, while uncompressed, contains some header information to give a program reading it an idea of the sample rate, number of channels, and bit depth. RAW is only sample data, and you have to tell Audacity how exactly to decode it.) About an hour of reassembling and re-encoding later, though, I had shiny new FLAC files of every single track in the game, every single exclusive mix in full quality.
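If you'd rather skip the Audacity import dance, the RAW-to-WAV step can be scripted. Here's a rough Python sketch using the standard library's wave module. The function name raw_to_wav and every default in it (stereo, 48000 Hz, 16-bit little-endian PCM) are my placeholders for illustration, not confirmed values for what Maxton's tool spits out, so swap in whatever actually sounds right for your files.

```python
import wave


def raw_to_wav(raw_path, wav_path, channels=2, sample_rate=48000, sample_width=2):
    """Wrap headerless PCM samples in a WAV container.

    All defaults are assumptions (stereo, 48 kHz, 16-bit little-endian);
    a RAW file carries no such information, so you have to supply it.
    Returns the number of audio frames written.
    """
    with open(raw_path, "rb") as src:
        pcm = src.read()

    # One frame = one sample for every channel. Drop any trailing
    # partial frame so the header's frame count stays consistent.
    frame_size = channels * sample_width
    usable = len(pcm) - (len(pcm) % frame_size)

    with wave.open(wav_path, "wb") as dst:
        dst.setnchannels(channels)      # number of channels
        dst.setsampwidth(sample_width)  # bytes per sample (2 = 16-bit)
        dst.setframerate(sample_rate)   # samples per second
        dst.writeframes(pcm[:usable])   # the sample data itself

    return usable // frame_size
```

Calling something like raw_to_wav("song.raw", "song.wav") (hypothetical file names) gets you a file any player can open. If it comes out as noise or plays at the wrong speed, one of the three guesses above is wrong, which is exactly the information a WAV header exists to carry.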
To keep another one of Maxton’s tools from getting lost, I’ve reuploaded the source (with my one tweak in a separate file) and my compile. Whether or not anyone will need it, given that I just ripped everything and I’ll be adding the soundtrack to my soundtracks section soon enough, is another question, but hey, reference code is important. Saves someone having to reverse-engineer it again, potentially.
Later today, I’ll be streaming Amplitude and likely fighting the content ID bots for it. Worst case scenario, I’ll download it and cut any notable runs into their own videos.