I tried changing the value at the offset you mentioned from 22 56 (22050 in little-endian hex) to 44 AC (44100), but unfortunately it didn't change anything.
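For anyone who wants to double-check those byte values, here is a minimal Python sketch. It assumes the rate is stored as an unsigned 16-bit little-endian integer, which is only a guess based on the two bytes found at that offset; the helper name `rate_to_le_bytes` is made up for illustration.

```python
import struct

def rate_to_le_bytes(rate_hz: int) -> bytes:
    """Encode a sample rate as an unsigned 16-bit little-endian value.

    Assumption: the field is 2 bytes wide, as suggested by the two bytes
    seen at the offset. A wider field would need a different format code.
    """
    return struct.pack("<H", rate_hz)

old_bytes = rate_to_le_bytes(22050)  # bytes currently in the file
new_bytes = rate_to_le_bytes(44100)  # bytes written in their place

print(old_bytes.hex(" ").upper())
print(new_bytes.hex(" ").upper())
```

This only confirms that the patch wrote the intended number. It says nothing about whether the engine actually reads the rate from that location.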

The original in-game files aren't played at twice the speed, my files are still played with the same spectral folding as the original files, and they still have that slight DC offset building up over time. I don't know much about IDA (or disassembly in general), and I have neither the time nor the motivation to lose myself in such a task for hours...
kooz wrote: I may have misinterpreted what you were trying to say here, but I think you might be confused. The old analog line-in rip from 2001 suffers from the exact same frequency cutoff that my new digital rip does. Both rips use the game engine itself (which we have concluded is output at 22050Hz) as its source. Any appearance of higher frequency information in the original rip is a byproduct of the analog signal transfer, or possibly an artifact of sample rate conversion.
If you study the ~11kHz threshold in spectral analysis, you'll notice that the higher frequencies appear to simply mirror the frequencies below.
That's right, exactly. Deton24, you don't seem to be aware of what the Shannon-Nyquist theorem (also known as the sampling theorem) has to do with aliasing, do you?
Basically, the interpolation method you choose when upsampling (e.g. from 22k to 44k), or when converting to analog, will impact how much spectral folding there is in your audio (that "mirror" effect kooz is talking about). Zero-order hold ("stairsteps") causes strong mirroring and a big ringing effect (that's the principle of a bitcrusher, really), linear interpolation attenuates the effect, and with ideal sinc interpolation you get no mirroring at all (or very little, depending on how many neighboring sample points you use).
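To put rough numbers on that, here is a small pure-Python sketch (not taken from the game or from any ripping tool). It upsamples a test tone from 22050 Hz to 44100 Hz with zero-order hold and with linear interpolation, then compares the amplitude of the mirrored image to the amplitude of the tone itself. The 8000 Hz tone, the 441-sample length and all function names are arbitrary choices for illustration; sinc interpolation is left out to keep it short.

```python
import cmath
import math

FS_IN = 22050    # source sample rate (Hz)
FS_OUT = 44100   # rate after 2x upsampling (Hz)
N = 441          # 20 ms of audio, so every tested frequency fits a whole number of cycles
TONE_HZ = 8000   # arbitrary test tone below the 11025 Hz Nyquist limit

def tone(n_samples, freq_hz, fs):
    """Unit-amplitude sine wave."""
    return [math.sin(2 * math.pi * freq_hz * n / fs) for n in range(n_samples)]

def upsample_zoh(x):
    """2x zero-order hold: repeat every sample once ("stairsteps")."""
    out = []
    for s in x:
        out.extend((s, s))
    return out

def upsample_linear(x):
    """2x linear interpolation: insert the midpoint between neighbors.

    The last sample wraps around to the first, which is exact here
    because the test tone is periodic over the buffer.
    """
    out = []
    for i, s in enumerate(x):
        nxt = x[(i + 1) % len(x)]
        out.extend((s, 0.5 * (s + nxt)))
    return out

def amplitude_at(y, freq_hz, fs):
    """Amplitude of one frequency component (single-bin DFT)."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq_hz * n / fs)
              for n, s in enumerate(y))
    return 2 * abs(acc) / len(y)

def image_ratio(upsampler):
    """Mirrored-image amplitude divided by tone amplitude after upsampling."""
    y = upsampler(tone(N, TONE_HZ, FS_IN))
    image_hz = FS_IN - TONE_HZ  # the tone mirrored around 11025 Hz
    return amplitude_at(y, image_hz, FS_OUT) / amplitude_at(y, TONE_HZ, FS_OUT)

for name, upsampler in (("zero-order hold", upsample_zoh),
                        ("linear", upsample_linear)):
    print(f"{name}: image/tone amplitude ratio = {image_ratio(upsampler):.3f}")
```

The zero-order hold ratio should come out clearly larger than the linear one, which is the "mirror" showing up above 11 kHz. A long enough sinc filter would push the ratio toward zero.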
But I might be a bit too technical for you.
The thing you have to keep in mind is that what you call "quality" is whether or not there are frequencies in the high end of the spectrum. If you want to use the Crystalizer to strengthen the spectral folding because it sounds better to you, then good for you. But that's not the "quality" we're looking for. We (as in, at least Droolie and I) are (or were, now that it's done) looking for a bit-perfect extraction of the soundtrack, which should show absolutely nothing in the spectrum above 11kHz (since the tracks were sampled at 22kHz). The only frequencies we'd want above that threshold are the ones that got lost in the 44k > 22k resampling process, and those are irretrievable.
Anyway, I'm getting carried away. We've done awesome work, and I think there might still be things left to dig into.