New vCPU instructions 2.0

at67
Site Admin
Posts: 647
Joined: 14 May 2018, 08:29

Re: New vCPU instructions 2.0

Post by at67 »

Update:
  • ANDB was removed
  • ORB was removed
  • XORI was moved from an external page back to page 3; its cycle time dropped from 20 back to 14
I fell for the same trap a second time, thinking I could move XORI out of page 3 to make room for more instructions. The instruction following XORI is BRA, which modifies [vPC], and the instruction preceding BRA overlaps with this [vPC] modification. So unless your new instruction is calculating a new vPC, you will probably end up chasing ghosts and mythical beings for the best part of a day like I did (twice).

Removing ANDB and ORB hurt, as each replaced a 3 instruction block (6 bytes, approx. 70 cycles) with 1 instruction (3 bytes, approx. 30 cycles), but it had to be done.

There is now one instruction slot free for future expansion and plenty of room in the ROM for new sys calls and instruction modifications.
at67

Update:

I managed to find another 2 instruction slots by optimising instructions that weren't returning to NEXTY and that had a 3 instruction prologue (under those circumstances the instructions can be moved to 2 instruction prologue opcodes).

I also tried to optimise the more commonly used instructions that had 2 instruction prologues and didn't return to NEXTY (this allowed a couple of 30 cycle instructions to be reduced to 28 cycles).
  • POKEI was added, pokes an immediate byte value to the address contained in [vAC], 20 cycles
  • DOKEI was added, dokes an immediate word value to the address contained in [vAC], 28 cycles
There is still one instruction slot free for future expansion and plenty of room in the ROM for new sys calls and instruction modifications.
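As a rough illustration of what POKEI and DOKEI do, here is a toy Python model. It is only a sketch: the `ram` array, the function names and the example addresses are made up for illustration, the word is assumed to be stored low byte first, and page-wrap behaviour and cycle timing are not modelled.

```python
# Toy model only: a flat 64K byte array standing in for Gigatron RAM.
ram = bytearray(0x10000)

def pokei(vac: int, imm: int) -> None:
    """Model of POKEI: store an immediate byte at the address held in vAC."""
    ram[vac & 0xFFFF] = imm & 0xFF

def dokei(vac: int, imm: int) -> None:
    """Model of DOKEI: store an immediate word (low byte first) at the address in vAC."""
    addr = vac & 0xFFFF
    ram[addr] = imm & 0xFF
    ram[(addr + 1) & 0xFFFF] = (imm >> 8) & 0xFF

# Hypothetical addresses, chosen only for the demonstration.
pokei(0x0800, 0x3F)
dokei(0x0900, 0x1234)
```

In both cases vAC only supplies the address; the value comes from the instruction's operand.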
at67

Update:

Filled the remaining free instruction slot with:
  • POKEV: writes the byte contained in vAC.lo to the address contained in the zero page variable [var], 28 cycles
walter
Site Admin
Posts: 160
Joined: 13 May 2018, 08:00


Nice work. These kinds of things take a lot of time, especially if you really think them through, as you did.
at67

Cheers Walter!

Update:

Don't write code when you are tired. If you do, sleep on it and then check it in the morning, before making forum posts about it.
  • POKEV removed as it was totally pointless...
We're back to one free instruction slot.
lb3361
Posts: 360
Joined: 17 Feb 2021, 23:07


What about
  • ADDB z : Adds the byte at zero page address z to the 16-bit vAC.
This would facilitate multibyte arithmetic for longs and floats with sequences such as:
LD(a)             # first byte
ADDB(b)
ST(a)
for i in range(1,3):
    LD(vAC+1)     # get carry from the high byte of vAC
    ADDB(a+i)     # add byte i of a
    ADDB(b+i)     # add byte i of b
    ST(a+i)
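The carry trick in that sequence can be checked with a small Python model. This is a sketch of the proposal only, not ROM code: `add_bytes` is a made-up name, the operands are little-endian byte lists, and the 16-bit accumulator is modelled as a plain integer whose high byte holds the carry.

```python
def add_bytes(a: list[int], b: list[int]) -> list[int]:
    """Add two little-endian byte lists the way the LD / ADDB / ADDB / ST loop would."""
    result = []
    vac = 0  # 16-bit accumulator; its high byte doubles as the carry
    for x, y in zip(a, b):
        vac = vac >> 8             # LD(vAC+1): fetch the carry left in the high byte
        vac = (vac + x) & 0xFFFF   # ADDB(a+i)
        vac = (vac + y) & 0xFFFF   # ADDB(b+i)
        result.append(vac & 0xFF)  # ST(a+i): only the low byte is stored
    return result

# 0x00FFFFFF + 1 ripples a carry through three bytes.
total = add_bytes(list((0x00FFFFFF).to_bytes(4, "little")),
                  list((1).to_bytes(4, "little")))
print(hex(int.from_bytes(bytes(total), "little")))
```

Because the largest intermediate value is 1 + 255 + 255 = 511, the high byte of the accumulator is always exactly 0 or 1, which is why it can be reloaded as the carry.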
Alternatively
  • ADDWV z : Adds vAC to the word at zero page location z and returns the carry (1 or 0) in vAC.

For compact and most likely faster code (for 4-byte ints only):
LDW(b)
ADDWV(a)
ADDW(b+2) ## we might miss a carry for another round here
ADDWV(a+2) ## or ADDW(a+2) STW(a+2) since we don't need the carry
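A Python sketch of the four-instruction sequence, assuming ADDWV behaves exactly as proposed above (add vAC into the word at z, leave the carry in vAC). The dictionary keys and function names are illustrative only.

```python
def addwv(mem: dict, z: str, vac: int) -> int:
    """Proposed ADDWV: mem[z] += vAC (16-bit); the returned value is the new vAC (carry)."""
    total = mem[z] + vac
    mem[z] = total & 0xFFFF
    return total >> 16

def add32(a: int, b: int) -> int:
    """32-bit a += b using the LDW / ADDWV / ADDW / ADDWV sequence."""
    mem = {"a": a & 0xFFFF, "a+2": (a >> 16) & 0xFFFF,
           "b": b & 0xFFFF, "b+2": (b >> 16) & 0xFFFF}
    vac = mem["b"]                      # LDW(b)
    vac = addwv(mem, "a", vac)          # ADDWV(a): low words, vAC = carry
    vac = (vac + mem["b+2"]) & 0xFFFF   # ADDW(b+2): this may wrap and drop a carry
    addwv(mem, "a+2", vac)              # ADDWV(a+2): high words
    return mem["a"] | (mem["a+2"] << 16)

print(hex(add32(0x0001FFFF, 0x00000001)))
```

In this model the wrap in ADDW(b+2) only loses the carry out of the top word, so the 32-bit result is still correct modulo 2^32; it would matter for operands longer than 4 bytes.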
at67

lb3361 wrote: 21 Mar 2021, 11:21 What about
  • ADDB z : Adds the byte at zero page address z to the 16 bits vAC.
I haven't updated the original list, but I have since added the following (vAC denotes the full 16 bits of vAC):

ADDBA var: vAC += var.lo, 28 cycles, (carry aware)
SUBBA var: vAC -= var.lo, 28 cycles, (borrow aware)
ADDBI var, imm: var.lo += imm, 28 cycles
SUBBI var, imm: var.lo -= imm, 28 cycles
ANDBI var, imm: var.lo &= imm, 28 cycles
ORBI var, imm: var.lo |= imm, 28 cycles
DEEKX var: Deek word at address contained in var, var.lo += 2, 30 cycles

I have found DEEKX, (as well as ADDBA), to be extremely useful.
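To make the DEEKX description concrete, here is a toy Python model of walking a table of words. It follows the one-line description literally (only var.lo is advanced); the `ram` array, the pointer location 0x30 and the table address are hypothetical.

```python
# Toy model only: a flat 64K byte array standing in for Gigatron RAM.
ram = bytearray(0x10000)

def deekx(var: int) -> int:
    """Model of DEEKX: read the word pointed to by the zero page word at var, then var.lo += 2."""
    addr = ram[var] | (ram[var + 1] << 8)
    value = ram[addr] | (ram[(addr + 1) & 0xFFFF] << 8)
    ram[var] = (ram[var] + 2) & 0xFF  # only the low byte of the pointer is advanced
    return value

# Hypothetical table of two words at 0x08A0, pointer held at zero page 0x30.
ram[0x08A0:0x08A4] = bytes([0x34, 0x12, 0xEF, 0xBE])
ram[0x30], ram[0x31] = 0xA0, 0x08
first = deekx(0x30)
second = deekx(0x30)
```

Each call returns the next word and leaves the pointer ready for the following one, which is the read-and-advance pattern a table walk needs.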

The XXXBI instructions replace the LD/LDW, OP, ST pattern (when src = dst) with one instruction, which means you can set and reset bit flags in byte vars with a single instruction. XOR and NOT are missing; I might try to find room for them at a later stage, but they haven't been a priority so far, as neither the code generated by my compiler nor my hand crafted stuff has needed them.
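A minimal Python sketch of the byte-immediate instructions acting on a zero page variable, showing the set and reset of a bit flag. The `zp` array, the `FLAGS` location and the bit masks are assumptions for illustration; timing is not modelled.

```python
# Toy model only: 256 bytes standing in for the zero page.
zp = bytearray(256)

def addbi(var: int, imm: int) -> None:
    zp[var] = (zp[var] + imm) & 0xFF   # var.lo += imm

def subbi(var: int, imm: int) -> None:
    zp[var] = (zp[var] - imm) & 0xFF   # var.lo -= imm

def andbi(var: int, imm: int) -> None:
    zp[var] &= imm & 0xFF              # var.lo &= imm

def orbi(var: int, imm: int) -> None:
    zp[var] |= imm & 0xFF              # var.lo |= imm

FLAGS = 0x30         # hypothetical zero page variable holding bit flags
orbi(FLAGS, 0x04)    # set bit 2 in one step, vAC untouched
andbi(FLAGS, 0xFB)   # reset bit 2 again
```

Without these, each of the last two lines would be a load, an operation and a store through vAC.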

I am still looking for one more free instruction slot for DOKEX, (but it's not a priority for now).
lb3361

As a matter of fact, since you have POKEI, DOKEI, PEEKV and DEEKV, you might like the following naming scheme:
  • MOVV --> POKEA(z) : Writes byte at zero page location z into address AC
  • MOVVW --> DOKEA(z) : Writes word at zero page location z,z+1 into addresses AC,AC+1.
  • MOVA --> PEEKA(z) : Reads byte at address AC into byte at zero page location z. Same as PEEK()+ST(z) but without trashing AC.
  • MOVAW --> DEEKA(z): Reads word at address AC,AC+1 into word at zero page location z,z+1. Same as DEEK()+STW(z) but without trashing AC.
One thing we cannot change is that the true reverse of POKE/DOKE is in fact PEEKV/DEEKV and not PEEK/DEEK.
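The four proposed names can be summarised with a toy Python model. This is a sketch of the semantics as described in the list above, nothing more: `ram` is a flat array standing in for memory, `vac` is passed in as the address held in AC, and page-wrap behaviour is ignored.

```python
# Toy model only: a flat 64K byte array standing in for Gigatron RAM.
ram = bytearray(0x10000)

def pokea(vac: int, z: int) -> None:
    ram[vac] = ram[z]                   # byte at zero page z -> [AC]

def dokea(vac: int, z: int) -> None:
    ram[vac] = ram[z]                   # word at z,z+1 -> [AC],[AC+1]
    ram[(vac + 1) & 0xFFFF] = ram[z + 1]

def peeka(vac: int, z: int) -> None:
    ram[z] = ram[vac]                   # byte at [AC] -> zero page z, AC unchanged

def deeka(vac: int, z: int) -> None:
    ram[z] = ram[vac]                   # word at [AC],[AC+1] -> z,z+1, AC unchanged
    ram[z + 1] = ram[(vac + 1) & 0xFFFF]

# Round trip through a hypothetical address: zero page 0x30 -> 0x0900 -> zero page 0x32.
ram[0x30], ram[0x31] = 0xCD, 0xAB
dokea(0x0900, 0x30)
deeka(0x0900, 0x32)
```

In every case AC is only read as an address, which is what distinguishes PEEKA/DEEKA from PEEK()+ST(z) and DEEK()+STW(z).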
lb3361

Hmm. I am working on the C code generator. I am at the same time targeting the current vCPU and trying to plan ahead for your additions as well.

All your additions change the character of the vCPU from a purely AC based machine to a machine that can use the whole zero page as registers. Only a few instructions really need vAC anymore. For instance, there is no XORBI, so one has to go through vAC. This is great. This is also a big shift for a code generator. Cool work.
at67
Site Admin
Posts: 647
Joined: 14 May 2018, 08:29

Re: New vCPU instructions 2.0

Post by at67 »

lb3361 wrote: 21 Mar 2021, 20:33
  • MOVV --> POKEA(z) : Writes byte at zero page location z into address AC
  • MOVVW --> DOKEA(z) : Writes word at zero page location z,z+1 into addresses AC,AC+1.
  • MOVA --> PEEKA(z) : Reads byte at address AC into byte at zero page location z. Same as PEEK()+ST(z) but without trashing AC.
  • MOVAW --> DEEKA(z): Reads word at address AC,AC+1 into word at zero page location z,z+1. Same as DEEK()+STW(z) but without trashing AC.
Done.
lb3361 wrote: 26 Mar 2021, 08:12 Hmm. I am working on the C code generator. I am at the same time targeting the current vCPU and trying to plan ahead for your additions as well.
Excellent, I can't wait to see how it turns out.
lb3361 wrote: 26 Mar 2021, 08:12 All your additions change the character of the vCPU from a purely AC based machine to a machine that can use the whole zero page as registers. Only a few instructions really need vAC anymore. For instance, there is no XORBI, so one has to go through vAC. This is great. This is also a big shift for a code generator. Cool work.
That was the basic idea. If I could, there would be vAC and general register versions of every instruction that uses vAC, but obviously that's not possible due to opcode limitations within page 3 and native cycle limitations.

I've done the best I can with re-organising and optimising to get as many new opcodes as possible, but there are probably 2 or 3 more that could be scrounged up by spending a lot more time carefully choosing opcodes and their positions. I really need to release the new ROM ASAP to get it out into the wild and get some testing and feedback, so that's my priority now (as soon as I finish these sprite routines).