New vCPU instructions 2.0

lb3361
Posts: 77
Joined: 17 Feb 2021, 23:07

Re: New vCPU instructions 2.0

Post by lb3361 »

I just ran some experiments with a batch of indirect-indexed instructions.

The encoding is as follows:

PREFIX VAR OPCODE OFFSET

where PREFIX = $B1 (which is at67's PREFX1) and OPCODE is one of LD/LDW/ST/STW/ADDW/SUBW/ANDW/ORW/XORW. Instead of accessing a 16-bit variable at address OFFSET in page zero, these instructions now access [ [VAR] + OFFSET ]. This is useful in the C compiler to access local variables allocated on the stack, e.g., LDW([SP,offset]), and also to access fields in a structure pointed to by a register variable, e.g., ANDW([StructPtr, FieldOffset]).
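To make the addressing mode concrete, here is a minimal Python model of the encoding and of the effective-address calculation. It is only a sketch, not the ROM implementation: the function names are made up for illustration, the offset is assumed to be an unsigned byte, the page-zero location used for SP is hypothetical, and the LDW opcode value $21 used in the usage line is assumed to match the standard vCPU opcode.

```python
# Sketch of the proposed PREFIX VAR OPCODE OFFSET form (not the ROM code).
PREFIX = 0xB1  # PREFX1, as stated in the post


def encode(var, opcode, offset):
    """Return the four instruction bytes in the order PREFIX VAR OPCODE OFFSET."""
    for value in (var, opcode, offset):
        if not 0 <= value <= 0xFF:
            raise ValueError("each field must fit in one byte")
    return bytes([PREFIX, var, opcode, offset])


def effective_address(ram, var, offset):
    """Compute [VAR] + OFFSET: read the 16-bit little-endian pointer stored
    in page zero at VAR, then add the offset (assumed unsigned here)."""
    pointer = ram[var] | (ram[(var + 1) & 0xFF] << 8)
    return (pointer + offset) & 0xFFFF


def load_word(ram, var, offset):
    """Model of LDW([var, offset]): fetch the 16-bit word at [[VAR] + OFFSET].
    Simplification: the high byte is read from addr + 1 with full 16-bit carry."""
    addr = effective_address(ram, var, offset)
    return ram[addr] | (ram[(addr + 1) & 0xFFFF] << 8)


ram = bytearray(0x10000)
SP = 0x1C                                # hypothetical page-zero slot holding SP
ram[SP], ram[SP + 1] = 0x00, 0x7F        # SP points to 0x7F00
ram[0x7F04], ram[0x7F05] = 0x34, 0x12    # a local variable at offset 4
value = load_word(ram, SP, 4)            # models LDW([SP, 4])
encoded = encode(SP, 0x21, 4)            # 0x21: assumed LDW opcode value
```

The model separates the address calculation from the final access, which mirrors the split described in the post between the PREFIX step and the restart that runs the actual instruction.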

This comes at a cost of an additional 42-44 cycles, which can be split in various ways: the PREFIX instruction does the full address calculation if it has enough time; otherwise it delegates the addition to a restart. Once the address is computed (stored in vLR), a final restart runs the actual instruction. This overhead is quite good because it is the same as computing the address with LDI(offset);ADDW(var). The code size benefit is quite small with LD/LDW, because one could do LDI(offset);ADDW(var);PEEK/DEEK(), but much more significant with STW or ADDW, because one replaces things like LDI(offset);ADDW(var);STW(tmpvar); <compute-something-in-vAC> ; DOKE(tmpvar) with a simple <compute-something-in-vAC> ; STW([var,offset]).
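The code size comparison can be tallied with a few lines of Python. This is a rough sketch: the byte sizes are those of the standard vCPU encodings (one opcode byte plus an operand byte, except for the operand-less PEEK/DEEK), the indexed form is counted as the four bytes of PREFIX VAR OPCODE OFFSET, and the <compute-something-in-vAC> part is left out because it is common to both versions. The names are illustrative only.

```python
# Encoded size in bytes of the vCPU instructions used in the sequences above.
SIZE = {"LDI": 2, "ADDW": 2, "STW": 2, "DOKE": 2, "DEEK": 1, "PEEK": 1}

# PREFIX VAR OPCODE OFFSET: four bytes for any indirect-indexed instruction.
INDEXED_SIZE = 4


def sequence_size(mnemonics):
    """Total encoded size in bytes of a sequence of vCPU instructions."""
    return sum(SIZE[m] for m in mnemonics)


# Word load: LDI(offset);ADDW(var);DEEK() versus LDW([var,offset]).
load_without = sequence_size(["LDI", "ADDW", "DEEK"])
load_saving = load_without - INDEXED_SIZE

# Word store: LDI(offset);ADDW(var);STW(tmpvar); ... ;DOKE(tmpvar)
# versus ... ;STW([var,offset]).
store_without = sequence_size(["LDI", "ADDW", "STW", "DOKE"])
store_saving = store_without - INDEXED_SIZE

print(f"load:  {load_without} bytes -> {INDEXED_SIZE} bytes")
print(f"store: {store_without} bytes -> {INDEXED_SIZE} bytes")
```

Under these assumptions a load saves one byte while a store saves four, and the store also no longer needs the temporary page-zero variable.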

The total gain with the C compiler is about 3-5% extra reduction with respect to at67's new instruction set. This is smaller than I expected because the C compiler often finds a way to use DEEKA/DOKEA/DEEKV/DOKE relatively efficiently and aggressively promotes local variables to registers. When it fails to promote, it resorts to using stack variables in a manner that costs a lot of opcodes. So indirect-indexed addressing helps a lot there. But when the compiler works well, or when the programmer uses the keyword 'register' smartly, the gain is more limited.

Another question is the potential gain with respect to the v5a instruction set. Without the competition of DOKEA/DEEKA/DEEKV, the benefits of indirect-indexed addressing are a lot more obvious.

Overall I believe this is a good idea. The implementation might have to be refined. In particular I am not sure at67 would like the idea of completely taking over the PREFX1 instruction page for just 8 instructions. I need to sleep on this...

After mulling these results, I concluded that this would be a nice improvement over rom v5a, but a much less compelling one over at67's rom, once released.
at67
Posts: 383
Joined: 14 May 2018, 08:29

Re: New vCPU instructions 2.0

Post by at67 »

lb3361 wrote: 07 Jul 2021, 02:51 I just ran some experiments with a batch of indirect-indexed instructions.

The encoding is as follows:

PREFIX VAR OPCODE OFFSET
I'm going to use this format (as you suggested) for PREFX3 to save a few cycles.
lb3361 wrote: 07 Jul 2021, 02:51 Overall I believe this is a good idea. The implementation might have to be refined. In particular I am not sure at67 would like the idea of completely taking over the PREFX1 instruction page for just 8 instructions. I need to sleep on this...

After mulling these results, I concluded that this would be a nice improvement over rom v5a, but a much less compelling one over at67's rom, once released.
We could just move one of the page3 instructions that is infrequently used (like I did with SEXT) and create a new PREFX instruction page that supports this format and performs the offset calculation as part of PREFX (if possible). If this is not possible, or if the page wastage is too great, then using the modified PREFX3 (as above) may be an alternative.

P.S. I predict a lot of potential new instructions that could use a signed 8-bit offset, so I think there would eventually be a lot more than 8, hopefully making the page wastage a moot point.
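For reference, a signed 8-bit offset would reach 128 bytes below and 127 bytes above the pointer. The following Python sketch shows the sign extension and the 16-bit wrap-around; the function names are hypothetical and nothing here is taken from either ROM.

```python
def sign_extend8(byte):
    """Interpret an operand byte (0..255) as a signed value (-128..127)."""
    return byte - 0x100 if byte & 0x80 else byte


def indexed_address(pointer, offset_byte):
    """Add a signed 8-bit offset to a 16-bit pointer, wrapping at 64K."""
    return (pointer + sign_extend8(offset_byte)) & 0xFFFF


below = indexed_address(0x7F00, 0xFC)  # offset byte $FC means -4
above = indexed_address(0x7F00, 0x04)  # offset byte $04 means +4
```

With a signed offset the same instruction can address data on both sides of the pointer, at the price of halving the forward reach compared with an unsigned byte.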