GT1 compression
Posted: 29 Dec 2023, 23:46
One of Marcel's comments in the ROM code suggests to look into pucrunch or eximozer for ways to compress GT1 files and fit more in the ROM. As a reference point, simply using gzip on at67's repository of GT1 files https://forum.gigatron.io/viewtopic.php ... pbox#p3958 reduces the code size by about a factor two. However decoding gzipped files on the Gigatron is challenging because the gzip bitstream must be processed one bit at a time (something the Gigatron does not do very well) and because there is no way around using a fair amount of memory to store the Huffman tables that are provided at the beginning of the gzipped data. Memory is the most fundamental problem here. There is simply no area of memory that can safely be used for this purpose with all GT1 files. Therefore one would need each compressed file to specify which pieces of memory are safe for this use, etc. Otherwise there is very little memory available to the SYS_Exec system call: sysArgs[0-7], sysFn, vAC, and that's about it. All this made GT1 compression look very challenging.
I learned a couple weeks ago that it is possible to achieve decent compression without using a bitstream. Yann Collet's LZ4 algorithm is byte based https://github.com/lz4/lz4, and Charles Bloom's LZNIB algorithm uses 4 bits nibbles http://cbloomrants.blogspot.com/2019/01 ... rithm.html.
The LZ4 compression essentially alternates literal runs, that is sequence of uncompressed bytes, and match runs that copy data located at a specific offset from the current address. Interestingly, LZ4 achieves decent results by encoding these offsets with two bytes, which are enough to access any point of the Gigatron address space. In fact we can make this scheme Gigatron-friendly by making sure that runs do not cross page boundaries, and by encoding the match offsets using separate page offsets and byte offsets without dealing with carries. One also need codes to change the current address between GT1 segments. I tried implementing this and obtained encouraging results.
Then I remembered that LZNIB also uses direct matches that do not code an offset but reuse the offset of the last match. Although Charles Bloom made clear that properly using direct matches requires a much more complicated search at compression time, this search gets easier when matches cannot cross page boundaries because we can process each page separately. After a little bit of experimentation, I restricted the page offsets to the 0-127 range and used the 128-255 range to represent short offsets with single byte instead of two.
See https://github.com/lb3361/gigatron-rom/ ... Utils/gt1z for a complete encoder and decoder in C++ (see function "decompress()" for a good idea of how this works.) The compression performance of this scheme is substantially better than LZ4 and on average performs between GZIP -2 and GZIP -3. In the following table, "Apps" refers to all GT1 files in the "Apps" subdirectory of the DEV7ROM, 'Dropbox" refers to at67's GT1 repository, and "Images" is a collection of images encoded as GT1 files. The 'gt1.gz' column uses GZIP -3, the 'gt1.lz4' column uses LZ4 -9. Clearly, compression works much better on the graphics heavy applications found in Dropbox than the more code oriented ones found in Apps.
Note that all GT1Z files start with 0x00 0xff which is an unlikely start for a GT1 files. The fun part was to write a version of SYS_Exec that transparently loads either a regular GT1 files or a compressed GT1Z files. I implemented that in DEV7ROM using the FSM approach with a substantially more refined way to create FSM microcode https://forum.gigatron.io/viewtopic.php ... =FSM#p3714. I was then able to reintroduce Pictures (about 11k, uncompressed), and add Frogstroll (about 9k, compressed to 1.6K) and the Shuttle image (19k, compressed to 2.2K). The next step is to properly compressed the pictures in Pictures
.
I am willing to port this to the official DEVROM is there is demand. Note that this requires two important changes, increasing MaxTicks to 15, and using the FSM approach despite its slightly controversial "bge([fsmState])" instruction.
For the curious, the sys_Exec code is at https://github.com/lb3361/gigatron-rom/ ... m.py#L7072 and https://github.com/lb3361/gigatron-rom/ ... .py#L10475.
I learned a couple weeks ago that it is possible to achieve decent compression without using a bitstream. Yann Collet's LZ4 algorithm is byte based https://github.com/lz4/lz4, and Charles Bloom's LZNIB algorithm uses 4 bits nibbles http://cbloomrants.blogspot.com/2019/01 ... rithm.html.
The LZ4 compression essentially alternates literal runs, that is sequence of uncompressed bytes, and match runs that copy data located at a specific offset from the current address. Interestingly, LZ4 achieves decent results by encoding these offsets with two bytes, which are enough to access any point of the Gigatron address space. In fact we can make this scheme Gigatron-friendly by making sure that runs do not cross page boundaries, and by encoding the match offsets using separate page offsets and byte offsets without dealing with carries. One also need codes to change the current address between GT1 segments. I tried implementing this and obtained encouraging results.
Then I remembered that LZNIB also uses direct matches that do not code an offset but reuse the offset of the last match. Although Charles Bloom made clear that properly using direct matches requires a much more complicated search at compression time, this search gets easier when matches cannot cross page boundaries because we can process each page separately. After a little bit of experimentation, I restricted the page offsets to the 0-127 range and used the 128-255 range to represent short offsets with single byte instead of two.
See https://github.com/lb3361/gigatron-rom/ ... Utils/gt1z for a complete encoder and decoder in C++ (see function "decompress()" for a good idea of how this works.) The compression performance of this scheme is substantially better than LZ4 and on average performs between GZIP -2 and GZIP -3. In the following table, "Apps" refers to all GT1 files in the "Apps" subdirectory of the DEV7ROM, 'Dropbox" refers to at67's GT1 repository, and "Images" is a collection of images encoded as GT1 files. The 'gt1.gz' column uses GZIP -3, the 'gt1.lz4' column uses LZ4 -9. Clearly, compression works much better on the graphics heavy applications found in Dropbox than the more code oriented ones found in Apps.
Code: Select all
+------------------------------------------+----------+----------+----------+----------+
| filename | size | gt1.gz | gt1.lz4 | gt1z |
+------------------------------------------+----------+----------+----------+----------+
| Apps | 165001 | -29.3% | -17.5% | -26.2% |
| Dropbox | 1240815 | -56.1% | -47.7% | -55.7% |
| Images | 121362 | -68.0% | -57.0% | -62.2% |
+------------------------------------------+----------+----------+----------+----------+
| TOTAL | 1527178 | -54.1% | -45.2% | -53.1% |
+------------------------------------------+----------+----------+----------+----------+
.
I am willing to port this to the official DEVROM is there is demand. Note that this requires two important changes, increasing MaxTicks to 15, and using the FSM approach despite its slightly controversial "bge([fsmState])" instruction.
For the curious, the sys_Exec code is at https://github.com/lb3361/gigatron-rom/ ... m.py#L7072 and https://github.com/lb3361/gigatron-rom/ ... .py#L10475.