WinRK progress, and new codecs
Time to update you all on my progress, so here goes…
First up is WinRK. There have been several comments and emails about problems allocating large model sizes when using WinRK on 32-bit Windows. These issues stem from fragmentation of the virtual address space, which makes it impossible for WinRK to allocate a large contiguous block of memory. I’m happy to report that I have found a fix for this issue; with it, WinRK is able to use up to 1.5GB of memory at one time. The fix will be in the next point release of WinRK. This issue doesn’t affect 64-bit platforms.
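I can’t show WinRK’s actual fix, but the usual way around address-space fragmentation is to stop demanding one contiguous block: split the model across several independently allocated segments and translate offsets on access. A minimal sketch of that idea (all names here are hypothetical, not WinRK internals):

```c
#include <stdlib.h>

/* Hypothetical sketch: instead of one huge contiguous allocation (which
 * fails when the 32-bit address space is fragmented), the model lives in
 * several smaller segments, and a flat offset is mapped to a segment on
 * each access. */
typedef struct {
    unsigned char **segments;  /* array of segment pointers */
    size_t segment_size;       /* bytes per segment */
    size_t count;              /* number of segments */
} SegmentedModel;

static SegmentedModel *model_alloc(size_t total, size_t segment_size) {
    SegmentedModel *m = malloc(sizeof *m);
    if (!m) return NULL;
    m->segment_size = segment_size;
    m->count = (total + segment_size - 1) / segment_size;
    m->segments = calloc(m->count, sizeof *m->segments);
    if (!m->segments) { free(m); return NULL; }
    for (size_t i = 0; i < m->count; i++) {
        m->segments[i] = malloc(segment_size);
        if (!m->segments[i]) {           /* roll back on failure */
            while (i) free(m->segments[--i]);
            free(m->segments);
            free(m);
            return NULL;
        }
    }
    return m;
}

/* Translate a flat model offset into (segment, offset within segment). */
static unsigned char *model_at(SegmentedModel *m, size_t off) {
    return &m->segments[off / m->segment_size][off % m->segment_size];
}
```

The trade-off is one extra division per access, which is why the segment size would normally be a power of two so the compiler reduces it to a shift and mask.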
I’ve also been spending quite a bit of time testing WinRK in as many corner cases as possible, most recently by compressing >20GB of files. I’m hoping to use this testing programme to hunt down some of the annoying stability issues that people have reported in the past, with the aim of improving stability and reliability.
Next up is something for the compressor heads out there. For some time now I have been looking for a good compression codec to replace the ‘fastest’ mode in WinRK. WinRK was originally designed for the best possible compression, and over the years I have tried to adapt it to a wider range of tasks. This has generally meant positioning it better in terms of efficiency - that is, the time taken versus the compression gained. There are many archiving tasks where ‘good enough’ compression at a reasonable speed is what matters (try compressing a hard drive of data, for example!).
So, I’ve been experimenting with new algorithms lately. My first attempt was to look at the DMC algorithm, which seems to have gained popularity recently. Hook made DMC look pretty good in terms of efficiency (great work Nania!). Unfortunately, it seems Hook gets a lot of its performance from its LZP implementation. Still, I have implemented a pure DMC model that works quite well. It might interest some that it uses 8 bytes per node for model sizes up to 512MB. (It would probably fit into paq8 well!)
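The 8-bytes-per-node figure is suggestive: a DMC node needs two successor links (one per input bit) plus two counts, and a 512MB model at 8 bytes per node holds 2^26 nodes, so each link fits in 26 bits. One plausible packing (my guess, not necessarily what the actual model does) squeezes two 26-bit links and two 6-bit counts into exactly 64 bits:

```c
#include <stdint.h>

/* Hypothetical 8-byte DMC node layout:
 *   bits  0-25  next-state index when the input bit is 0
 *   bits 26-51  next-state index when the input bit is 1
 *   bits 52-57  count for bit 0 (6 bits)
 *   bits 58-63  count for bit 1 (6 bits)
 * 2^26 nodes * 8 bytes = 512MB, matching the stated model-size limit. */
typedef uint64_t DmcNode;

#define LINK_BITS  26
#define LINK_MASK  ((1u << LINK_BITS) - 1)
#define COUNT_MASK 0x3Fu  /* 6 bits */

static inline uint32_t next_state(DmcNode n, int bit) {
    return (uint32_t)(n >> (bit ? LINK_BITS : 0)) & LINK_MASK;
}

static inline uint32_t count(DmcNode n, int bit) {
    return (uint32_t)(n >> (52 + (bit ? 6 : 0))) & COUNT_MASK;
}

static inline DmcNode make_node(uint32_t next0, uint32_t next1,
                                uint32_t c0, uint32_t c1) {
    return  (uint64_t)(next0 & LINK_MASK)
         | ((uint64_t)(next1 & LINK_MASK) << LINK_BITS)
         | ((uint64_t)(c0 & COUNT_MASK)  << 52)
         | ((uint64_t)(c1 & COUNT_MASK)  << 58);
}
```

With only 6 bits per count, the counts would need periodic halving to avoid saturation, which also gives the model a useful recency bias.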
Next up was BWT, the perennial favourite for efficiency. This turned out to be far more interesting! With BWT I was able to create a rather efficient multi-threaded codec that scales nicely from PocketRK up to my current Phenom X4 and beyond. With small block sizes (2-4MB) it achieves 15-20MB/s on the Phenom while still retaining good compression. My implementation uses divsufsort for sorting, and a symbol encoder inspired by Matt Mahoney’s BBB compressor. This means 5n memory during encode, and 4n (for n<=16MB) or 5n memory during decode.
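For readers less familiar with BWT, here is a deliberately naive sketch of the forward transform: sort every rotation of the block and emit the last column. A real codec like the one described above would instead build a suffix array with divsufsort, which is what keeps encode memory near 5n (n bytes of input plus a 4-byte suffix index per position); this toy version exists only to show the transform itself.

```c
#include <stdlib.h>
#include <string.h>

/* Naive BWT forward transform: sort all rotations of the input and
 * output the last character of each sorted rotation. O(n^2 log n) -
 * illustration only; production code uses a suffix array instead. */
static const unsigned char *g_buf;  /* comparator context (qsort has none) */
static int g_len;

static int rot_cmp(const void *a, const void *b) {
    int i = *(const int *)a, j = *(const int *)b;
    for (int k = 0; k < g_len; k++) {
        unsigned char ci = g_buf[(i + k) % g_len];
        unsigned char cj = g_buf[(j + k) % g_len];
        if (ci != cj) return ci < cj ? -1 : 1;
    }
    return 0;
}

/* Writes the transformed bytes into out[0..len-1] and returns the
 * primary index (the row holding the original string), which the
 * decoder needs to invert the transform. */
static int bwt_encode(const unsigned char *in, unsigned char *out, int len) {
    int *rot = malloc(len * sizeof *rot);
    if (!rot) return -1;
    for (int i = 0; i < len; i++) rot[i] = i;
    g_buf = in; g_len = len;
    qsort(rot, len, sizeof *rot, rot_cmp);
    int primary = -1;
    for (int i = 0; i < len; i++) {
        if (rot[i] == 0) primary = i;
        out[i] = in[(rot[i] + len - 1) % len];
    }
    free(rot);
    return primary;
}
```

The transform clusters similar contexts together - “banana” becomes “nnbaaa” - which is what makes the output so compressible for the symbol encoder that follows. Multi-threading falls out naturally because each block is transformed independently.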
I’ve attached the latest mcomp.exe below (both 32bit and 64bit versions). This is based on the PocketRK codebase, and as such doesn’t include FPW or PWCM algorithms yet.
Speaking of PocketRK: it is now feature complete and undergoing testing and refinement ready for release. PocketRK supports a subset of the codecs and filters of its big brother WinRK, chosen mainly to keep memory and time usage suitable for the form factor.
OK, last but not least is technology development. I am currently beginning work on a new contract to further develop some GPGPU-type technology. This work is pretty exciting, but unfortunately I cannot yet say much about it publicly. It will be consuming quite a bit of my time over the next month or so; however, I still have time scheduled for PocketRK as well. Hopefully I’ll be able to announce the first release soon!