delpozzo wrote: ↑17 Jun 2020, 13:45
I was never great at Pacman, but managed to get to level 6 on my Gigatron
I've played it hundreds of times over the last few weeks and haven't got past level 7, so you're doing much better than I am; it's still nowhere near as brutal as the real PacMan though.
delpozzo wrote: ↑17 Jun 2020, 13:45
Out of curiosity, what improvements were made in the ROMv5a/DEVROM that allowed you to implement the sound FX vs ROMv4? My Gigatron is currently running v4 and I was thinking about getting an EPROM programmer and another EPROM chip to upgrade to the DEVROM.
There is a bunch of great stuff in ROMv5a and DEVROM that Marcel was working on, but the main thing that gtBASIC and PucMon utilise is the CALLI instruction.
CALLI allows vCPU code to CALL/JUMP anywhere in memory with a single 3 byte instruction, without destroying the contents of the accumulator (vAC).
In ROMv4 and lower, whenever gtBASIC wants to jump between 256 byte pages or smaller fragmented segments, such as the 96 byte offscreen video memory area, it must perform a thunk (page jump) like this:
Code:
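STW vAC_save    ; prologue: save vAC (2 bytes)
LDWI Label      ; load the 16 bit target address, overwriting vAC (3 bytes)
CALL vAC        ; jump to the address held in vAC (2 bytes)
.
.
Label:
LDW vAC_save    ; epilogue: restore the saved vAC (2 bytes)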
You'll notice the page jump requires a prologue and epilogue, which makes this thunk a rather costly 9 bytes for every call/jump that is required. The compiler does not require the programmer to care about segmentation or fragmentation from a coding standpoint, i.e. as a programmer you can merrily produce code until you run out of RAM (and you will).
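To make the 9 byte figure concrete, here is a small Python sketch that adds up the encoded sizes of the instructions in the thunk. The per-instruction sizes are the standard vCPU encodings (one opcode byte plus operand bytes). The name `sequence_size` is made up for this illustration, and reading the 5 byte call as a bare LDWI/CALL pair is an assumption based on the byte counts in this post.

```python
# Encoded size in bytes of each vCPU instruction used by the thunk:
# one opcode byte plus its operand bytes.
INSTRUCTION_BYTES = {
    "STW": 2,   # opcode + zero-page address
    "LDWI": 3,  # opcode + 16-bit immediate
    "CALL": 2,  # opcode + zero-page address
    "LDW": 2,   # opcode + zero-page address
}

def sequence_size(mnemonics):
    """Total size in bytes of a sequence of vCPU instructions."""
    return sum(INSTRUCTION_BYTES[m] for m in mnemonics)

# Page-jump thunk: save vAC, load the target, jump, restore vAC at the target.
THUNK = ["STW", "LDWI", "CALL", "LDW"]
# Assumed shape of an ordinary far call: no save/restore of vAC.
PLAIN_CALL = ["LDWI", "CALL"]

thunk_bytes = sequence_size(THUNK)
call_bytes = sequence_size(PLAIN_CALL)
print(thunk_bytes, call_bytes)  # 9 5
```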
To facilitate this, the compiler must be able to insert these page jumps/thunks anywhere in the code as you leave one page/segment and enter another. It achieves this by keeping a list of free RAM pages and segments that are allocated based on code size (individual instructions and groups/macros of instructions). As the virtual program counter approaches the end of a page/segment, the thunk is inserted and code is relocated and shuffled around. As you can imagine, a thunk inserted early in the code can cause a cascade effect that forces even more thunks and even more relocations further down the line.
GCL and raw vASM, on the other hand, expect you to micro-manage everything, including this aspect of your code. They give you the flexibility to design the memory layout of your project as you see fit to achieve the greatest possible efficiency, but this becomes extremely time consuming and tedious as your project gets bigger (I spent about 50% of my time on memory layout and organisation when coding Tetronis).
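The allocation scheme described above can be pictured with a toy model. The Python sketch below is not gtBASIC's actual allocator; `place_code` and its greedy strategy are hypothetical, and it ignores relocation, macros and labels. It only shows the core idea: fill each free segment while leaving room for the 9 byte thunk, and skip fragments too small to be useful.

```python
THUNK_BYTES = 9  # size of the ROMv4 page-jump thunk

def place_code(instruction_sizes, segment_sizes):
    """Greedily place instructions (given by size) into free RAM segments.

    Returns one list per segment actually used. Each list holds the sizes
    of the instructions stored there, followed by the marker "thunk" when
    execution must continue in a later segment.
    """
    pending = list(instruction_sizes)
    layout = []
    for free in segment_sizes:
        if not pending:
            break
        if sum(pending) <= free:
            # Everything left fits here, so no thunk is needed.
            layout.append(pending)
            pending = []
            break
        placed = []
        # Only place an instruction if the exit thunk still fits after it.
        while pending and pending[0] + THUNK_BYTES <= free:
            size = pending.pop(0)
            placed.append(size)
            free -= size
        if placed:
            placed.append("thunk")
            layout.append(placed)
        # Otherwise the fragment is too small to be worth using: skip it.
    if pending:
        raise MemoryError("out of free RAM segments")
    return layout

# Hypothetical instruction sizes and free segments, in bytes.
layout = place_code([2, 3, 2, 2, 3, 2, 2, 3], [16, 10, 40])
print(layout)  # [[2, 3, 2, 'thunk'], [2, 3, 2, 2, 3]]
```

In this made-up run the 16 byte segment holds only 7 bytes of real code because the thunk takes the other 9, and the 10 byte fragment is skipped entirely, which is the kind of waste that grows as RAM fragments.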
Once the remaining RAM becomes badly fragmented (segments <25 bytes in size), the thunk becomes a significant percentage of the code (normally <10%, but in this situation approaching 50%). This leads to drastic code inefficiency, wasted RAM and slower effective code execution. When your project gets to this stage, it's time to do some major re-shuffling: moving data structures around and re-organising code order can have a significant effect on how much relocation is performed and how many thunks are generated. (The compiler has many pragmas and options to control this aspect of code generation, which allow you to squeeze out every possible non-fragmented byte of RAM for code and data.)
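The percentages above follow directly from dividing the thunk size by the segment size. A minimal Python sketch, assuming exactly one exit thunk per segment (the function name `thunk_overhead` is made up for this illustration):

```python
THUNK_BYTES = 9  # size of the ROMv4 page-jump thunk

def thunk_overhead(segment_bytes, thunk_bytes=THUNK_BYTES):
    """Fraction of a segment consumed by the single thunk that exits it."""
    if segment_bytes < thunk_bytes:
        raise ValueError("segment too small to hold a thunk")
    return thunk_bytes / segment_bytes

for size in (256, 96, 24, 18):
    print(f"{size:3d} byte segment: {thunk_overhead(size):.1%} thunk overhead")
```

A full 256 byte page or the 96 byte offscreen segment stays under 10%, while an 18 byte fragment loses half of its space to the thunk.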
CALLI solves these issues by replacing the whole thunk with a single instruction of the form CALLI Label, with no prologue or epilogue.
So instead of a 9 byte thunk that must save and restore vAC (because it can be inserted literally anywhere in code), we have a 3 byte instruction that is able to both CALL a subroutine and jump anywhere in the full 64K address space. Thunks are reduced to 1/3 of their size, and CALLs of subroutines and the runtime (of which there can be hundreds) are reduced from 5 bytes to 3 bytes as well. gtBASIC actually has two completely separate versions of the runtime: one for the 9 byte thunk and 5 byte CALL, and one for the optimised 3 byte CALLI thunk and CALL. By specifying a ROM version pragma at the beginning of your code, you can automatically link with whichever version of the runtime you require.
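As a rough way to estimate what this buys a given program, the saving is 6 bytes per thunk and 2 bytes per call. The Python sketch below does that arithmetic. The counts passed in are invented for illustration and are not measurements from PucMon or gtBASIC, and `bytes_saved` is a hypothetical helper.

```python
LEGACY_THUNK_BYTES = 9  # ROMv4 and lower: save vAC, LDWI, CALL, restore vAC
LEGACY_CALL_BYTES = 5   # ROMv4 and lower: far call without save/restore
CALLI_BYTES = 3         # ROMv5a/DEVROM: one CALLI instruction for either job

def bytes_saved(num_thunks, num_calls):
    """Bytes saved by replacing legacy thunks and calls with CALLI."""
    per_thunk = LEGACY_THUNK_BYTES - CALLI_BYTES  # 6 bytes each
    per_call = LEGACY_CALL_BYTES - CALLI_BYTES    # 2 bytes each
    return num_thunks * per_thunk + num_calls * per_call

# Made-up counts, purely to show the arithmetic.
print(bytes_saved(num_thunks=50, num_calls=200))  # 700
```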
TLDR:
So, to finally answer your question:
CALLI saved me around 850 bytes of much less fragmented RAM, which was enough to allow for the programmatic generation of the sound effects. PucMon ROMv3 has around 400 bytes of badly fragmented RAM free and PucMon ROMv5a has around 650 bytes of much less fragmented RAM free.
P.S. The insertion of the 9 byte thunk at every page/segment boundary could be optimised by statically analysing the code and determining whether vAC actually needs to be saved and restored. The compiler could then switch between a 5 byte and a 9 byte thunk, producing more efficient code and RAM usage. Unfortunately this is not a trivial problem to solve, as it is much more complex than just analysing LDW/STW pairs, so the effort required to achieve potentially questionable gains deterred me from going down that path. A much simpler and better solution, with much higher efficiency, is to just get all Gigatron owners to update their ROMs to either 5a or DEVROM and make CALLI thunking the de facto standard, or do what I do now and just provide both versions.
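To give a feel for what such an analysis involves, here is a deliberately naive Python sketch of the straight-line case only. It is not something gtBASIC does. The names `needs_vac_save` and `thunk_size` are hypothetical, operands are ignored, and anything it does not recognise (including every branch) is conservatively treated as reading vAC. Real code has branches, labels that other code can jump to, and calls, which is where the problem stops being trivial.

```python
# vCPU instructions that overwrite vAC without reading its old value.
OVERWRITES_VAC = {"LDI", "LDWI", "LD", "LDW"}
# Instructions assumed here to neither read nor write vAC.
IGNORES_VAC = {"INC"}

def needs_vac_save(following_mnemonics):
    """Return True if a thunk placed before this code must preserve vAC.

    Scans straight-line code only: the old vAC value is dead if it is
    overwritten before anything reads it.
    """
    for mnemonic in following_mnemonics:
        if mnemonic in OVERWRITES_VAC:
            return False  # old value is replaced before any use
        if mnemonic in IGNORES_VAC:
            continue      # keep scanning
        return True       # reads vAC, or unknown: be conservative
    return True           # ran out of code without proof: be conservative

def thunk_size(following_mnemonics):
    """9 byte thunk when vAC must survive the jump, 5 byte thunk otherwise."""
    return 9 if needs_vac_save(following_mnemonics) else 5

print(thunk_size(["INC", "LDWI", "STW"]))  # 5
print(thunk_size(["STW"]))                 # 9
```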