vCPU instruction frequency

steve
Posts: 40
Joined: 08 Jul 2019, 19:40

Post by steve »

For those interested, I measured how often each vCPU instruction is executed.
Below is the output after a session of PucMon :)

tic hex count   vInstr  %   cum. description
18  1A  1770982 LD      17% 17%  Load byte from zero page (vAC=[D])
28  35  1340049 BCC     13% 30%  Test vAC and branch conditionally. CC can be
14  8C  1100337 XORI    11% 41%  Logical-XOR with small constant (vAC^=D)
20  2B  846162  STW     8%  49%  Store word in zero page ([D],[D+1]=vAC&255,vAC>>8)
20  21  730623  LDW     7%  56%  Word load from zero page (vAC=[D]+256*[D+1])
48  11  651299  LDWI    6%  63%  Load immediate word constant (vAC=$HHLL)
28  99  506221  ADDW    5%  68%  Word addition with zero page (vAC+=[D]+256*[D+1])
26  CF  481268  CALL    5%  72%  Goto address and remember vPC (vLR,vPC=vPC+2,[D]+256*[D+1]-2)
28  B8  470507  SUBW    5%  77%  Word subtraction with zero page (vAC-=[D]+256*[D+1])
99  B4  452052  SYS     4%  81%  Native function call using at most 2*T cycles, D=270-max(14,T)
16  FF  291126  RET     3%  84%  Leaf return (vPC=vLR-2)
16  93  204087  INC     2%  86%  Increment zero page byte ([D]++)
28  F8  180993  ANDW    2%  88%  Word logical-AND with zero page (vAC&=[D]+256*[D+1])
28  F6  179400  DEEK    2%  90%  Read word from memory (vAC=[vAC]+256*[vAC+1])
26  75  127153  PUSH    1%  91%  Push vLR on stack ([vSP-2],[vSP-1],vSP=vLR&255,vLR>>8,vSP-2)
26  63  127149  POP     1%  92%  Pop address from stack (vLR,vSP=[vSP]+256*[vSP+1],vSP+2)
28  E6  125744  SUBI    1%  93%  Subtract small positive constant (vAC-=D)
16  82  107527  ANDI    1%  94%  Logical-AND with small constant (vAC&=D)
16  59  105974  LDI     1%  95%  Load immediate small positive constant (vAC=D)
26  AD  89853   PEEK    1%  96%  Read byte from memory (vAC=[vAC])
28  E9  78232   LSLW    1%  97%  Shift left ('ADDW vAC' will not work!) (vAC<<=1)
16  5E  78071   ST      1%  98%  Store byte in zero page ([D]=vAC&255)
14  90  70709   BRA     1%  99%  Branch unconditionally (vPC=(vPC&0xff00)+D)
28  E3  46274   ADDI    0%  99%  Add small positive constant (vAC+=D)
26  FC  30074   XORW    0%  99%  Word logical-XOR with zero page (vAC^=[D]+256*[D+1])
28  F3  29383   DOKE    0%  100% Write word in memory ([[D+1],[D]],[[D+1],[D]+1]=vAC&255,vAC>>8)
28  FA  23663   ORW     0%  100% Word logical-OR with zero page (vAC|=[D]+256*[D+1])
28  F0  12151   POKE    0%  100% Write byte in memory ([[D+1],[D]]=vAC&255)
26  7F  4669    LUP     0%  100% ROM lookup, needs trampoline in target page (vAC=ROM[vAC+D])
14  88  706     ORI     0%  100% Logical-OR with small constant (vAC|=D)
26  CD  18      DEF     0%  100% Define data or code (vAC,vPC=vPC+2,(vPC&0xff00)+D)
14  DF  0       ALLOC   0%  100% Create or destroy stack frame (vSP+=D)
28  85  0       CALLI   0%  100% Goto immediate address and remember vPC (vLR,vPC=vPC+3,$HHLL-2)
28  1F  0       CMPHS   0%  100% Adjust high byte for signed compare (vACH=XXX)
28  97  0       CMPHU   0%  100% Adjust high byte for unsigned compare (vACH=XXX)
26  EE  0       LDLW    0%  100% Load word from stack frame (vAC=[vSP+D]+256*[vSP+D+1])
26  EC  0       STLW    0%  100% Store word in stack frame ([vSP+D],[vSP+D+1]=vAC&255,vAC>>8)
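
For anyone wanting to rebuild the % and cum. columns from raw counts, here is a minimal Python sketch. It is not the code that produced the table above, and the name `frequency_table` is made up for illustration. It assumes the cumulative column is derived from the running count rather than from the rounded percentages, which would explain why 56% followed by 6% shows up as 63% in the table.

```python
def frequency_table(counts):
    """Return (name, count, percent, cumulative_percent) rows, most frequent first.

    `counts` maps an instruction name to the number of times it was executed.
    """
    total = sum(counts.values())
    rows = []
    running = 0
    # Sort by count, highest first; ties fall back to the name for stable output.
    for name, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        running += count
        percent = 100.0 * count / total if total else 0.0
        # Cumulative share uses the exact running count, not the rounded percent.
        cumulative = 100.0 * running / total if total else 0.0
        rows.append((name, count, percent, cumulative))
    return rows


if __name__ == "__main__":
    # Hypothetical counts, only to show the output shape.
    sample = {"LD": 50, "BCC": 30, "XORI": 20}
    for name, count, percent, cumulative in frequency_table(sample):
        print(f"{name:<6}{count:>8} {percent:3.0f}% {cumulative:3.0f}%")
```

Feeding it the full count column of the table should give the same ordering; the rounded figures may differ by a point depending on how the original rounding was done.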
at67
Site Admin
Posts: 647
Joined: 14 May 2018, 08:29

Re: vCPU instruction frequency

Post by at67 »

It's extremely interesting, actually: PucMon's code was generated by a compiler, and it's fascinating to see where the hot-spots are. It would be even more interesting to compare compiler-generated vCPU code with hand-written vCPU code.

DavidHK did something similar, but at the native instruction level when he was optimising his emulator:
https://forum.gigatron.io/viewtopic.php ... p=916#p916
https://github.com/at67/gigatron-rom/bl ... mu/gtemu.c

Re: vCPU instruction frequency

Post by steve »

I also have the native instruction statistics for the same session:

hex	frequency	%	cumulative
5d	231666880	37%	37%
c2	43701745	7%	44%
01	33627338	5%	49%
0d	26085751	4%	54%
00	24871641	4%	58%
80	19154237	3%	61%
a0	18761968	3%	64%
fc	16725411	3%	66%
e4	15980318	3%	69%
de	15428968	2%	71%
81	13625961	2%	74%
e8	13368376	2%	76%
14	11513358	2%	78%
fe	10351837	2%	79%
d2	9912622	2%	81%
15	9513300	2%	82%
89	9359247	1%	84%
12	7767936	1%	85%
05	7694605	1%	86%
18	7208385	1%	87%
20	6609144	1%	88%
e0	6459318	1%	90%
ca	6239518	1%	91%
11	5908221	1%	91%
fd	5150552	1%	92%
21	4032157	1%	93%
30	3939783	1%	94%
8d	3903766	1%	94%
d6	3601267	1%	95%
f0	3539284	1%	95%
ec	3413593	1%	96%
09	3125761	1%	96%
69	3119773	0%	97%
29	3119749	0%	97%
c6	2228788	0%	98%
e1	2078775	0%	98%
90	1422152	0%	98%
85	1216529	0%	98%
61	1132588	0%	99%
c0	1011892	0%	99%
02	994378	0%	99%
40	884117	0%	99%
a5	822970	0%	99%
f4	738596	0%	99%
19	718560	0%	99%
91	718482	0%	100%
60	550923	0%	100%
a1	407287	0%	100%
25	352185	0%	100%
41	279181	0%	100%
82	258459	0%	100%
45	217599	0%	100%
b0	127153	0%	100%
e2	114880	0%	100%
f8	103784	0%	100%
ce	66678	0%	100%
16	41534	0%	100%
94	12284	0%	100%
95	12284	0%	100%
c3	5988	0%	100%
dc	98	0%	100%
92	64	0%	100%
49	24	0%	100%
c1	1	0%	100%
cd	1	0%	100%
03	0	0%	100%
04	0	0%	100%
06	0	0%	100%
07	0	0%	100%
08	0	0%	100%
0a	0	0%	100%
0b	0	0%	100%
0c	0	0%	100%
0e	0	0%	100%
0f	0	0%	100%
10	0	0%	100%
13	0	0%	100%
17	0	0%	100%
1a	0	0%	100%
1b	0	0%	100%
1c	0	0%	100%
1d	0	0%	100%
1e	0	0%	100%
1f	0	0%	100%
22	0	0%	100%
23	0	0%	100%
24	0	0%	100%
26	0	0%	100%
27	0	0%	100%
28	0	0%	100%
2a	0	0%	100%
2b	0	0%	100%
2c	0	0%	100%
2d	0	0%	100%
2e	0	0%	100%
2f	0	0%	100%
31	0	0%	100%
32	0	0%	100%
33	0	0%	100%
34	0	0%	100%
35	0	0%	100%
36	0	0%	100%
37	0	0%	100%
38	0	0%	100%
39	0	0%	100%
3a	0	0%	100%
3b	0	0%	100%
3c	0	0%	100%
3d	0	0%	100%
3e	0	0%	100%
3f	0	0%	100%
42	0	0%	100%
43	0	0%	100%
44	0	0%	100%
46	0	0%	100%
47	0	0%	100%
48	0	0%	100%
4a	0	0%	100%
4b	0	0%	100%
4c	0	0%	100%
4d	0	0%	100%
4e	0	0%	100%
4f	0	0%	100%
50	0	0%	100%
51	0	0%	100%
52	0	0%	100%
53	0	0%	100%
54	0	0%	100%
55	0	0%	100%
56	0	0%	100%
57	0	0%	100%
58	0	0%	100%
59	0	0%	100%
5a	0	0%	100%
5b	0	0%	100%
5c	0	0%	100%
5e	0	0%	100%
5f	0	0%	100%
62	0	0%	100%
63	0	0%	100%
64	0	0%	100%
65	0	0%	100%
66	0	0%	100%
67	0	0%	100%
68	0	0%	100%
6a	0	0%	100%
6b	0	0%	100%
6c	0	0%	100%
6d	0	0%	100%
6e	0	0%	100%
6f	0	0%	100%
70	0	0%	100%
71	0	0%	100%
72	0	0%	100%
73	0	0%	100%
74	0	0%	100%
75	0	0%	100%
76	0	0%	100%
77	0	0%	100%
78	0	0%	100%
79	0	0%	100%
7a	0	0%	100%
7b	0	0%	100%
7c	0	0%	100%
7d	0	0%	100%
7e	0	0%	100%
7f	0	0%	100%
83	0	0%	100%
84	0	0%	100%
86	0	0%	100%
87	0	0%	100%
88	0	0%	100%
8a	0	0%	100%
8b	0	0%	100%
8c	0	0%	100%
8e	0	0%	100%
8f	0	0%	100%
93	0	0%	100%
96	0	0%	100%
97	0	0%	100%
98	0	0%	100%
99	0	0%	100%
9a	0	0%	100%
9b	0	0%	100%
9c	0	0%	100%
9d	0	0%	100%
9e	0	0%	100%
9f	0	0%	100%
a2	0	0%	100%
a3	0	0%	100%
a4	0	0%	100%
a6	0	0%	100%
a7	0	0%	100%
a8	0	0%	100%
a9	0	0%	100%
aa	0	0%	100%
ab	0	0%	100%
ac	0	0%	100%
ad	0	0%	100%
ae	0	0%	100%
af	0	0%	100%
b1	0	0%	100%
b2	0	0%	100%
b3	0	0%	100%
b4	0	0%	100%
b5	0	0%	100%
b6	0	0%	100%
b7	0	0%	100%
b8	0	0%	100%
b9	0	0%	100%
ba	0	0%	100%
bb	0	0%	100%
bc	0	0%	100%
bd	0	0%	100%
be	0	0%	100%
bf	0	0%	100%
c4	0	0%	100%
c5	0	0%	100%
c7	0	0%	100%
c8	0	0%	100%
c9	0	0%	100%
cb	0	0%	100%
cc	0	0%	100%
cf	0	0%	100%
d0	0	0%	100%
d1	0	0%	100%
d3	0	0%	100%
d4	0	0%	100%
d5	0	0%	100%
d7	0	0%	100%
d8	0	0%	100%
d9	0	0%	100%
da	0	0%	100%
db	0	0%	100%
dd	0	0%	100%
df	0	0%	100%
e3	0	0%	100%
e5	0	0%	100%
e6	0	0%	100%
e7	0	0%	100%
e9	0	0%	100%
ea	0	0%	100%
eb	0	0%	100%
ed	0	0%	100%
ee	0	0%	100%
ef	0	0%	100%
f1	0	0%	100%
f2	0	0%	100%
f3	0	0%	100%
f5	0	0%	100%
f6	0	0%	100%
f7	0	0%	100%
f9	0	0%	100%
fa	0	0%	100%
fb	0	0%	100%
ff	0	0%	100%
And yes, the "pixel burst" OUT instruction (which ORs the two sync bits into the pixel and increments X to move on to the next pixel) is by far the most used, since the Gigatron spends most of its time generating the physical VGA signal!

As for the comparison, it will be very easy: I've already made the changes to your emulator to get all the stats I wanted! :D
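
For readers who want to gather similar numbers themselves, the counting step could look roughly like the Python sketch below. This is an illustration only, not the change made to the emulator (which is written in C/C++); `count_opcodes` and `format_rows` are hypothetical names, and the input is assumed to be the sequence of opcode bytes the emulated CPU executed. All 256 opcodes are pre-seeded with zero so that unused ones still appear in the listing, as in the table above.

```python
from collections import Counter


def count_opcodes(executed_opcodes):
    """Count how often each 8-bit native opcode was executed."""
    counts = Counter({opcode: 0 for opcode in range(256)})
    counts.update(executed_opcodes)
    return counts


def format_rows(counts):
    """Return tab-separated lines, most frequent first, ties in ascending hex order."""
    lines = ["hex\tfrequency"]
    for opcode, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        lines.append(f"{opcode:02x}\t{n}")
    return lines


if __name__ == "__main__":
    # Hypothetical, very short trace: two pixel-burst instructions and one store.
    trace = [0x5D, 0x5D, 0xC2]
    for line in format_rows(count_opcodes(trace))[:4]:
        print(line)
```

In a real emulator the increment would sit in the instruction fetch path, one counter bump per executed instruction.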

Re: vCPU instruction frequency

Post by at67 »

steve wrote: 19 Jul 2020, 02:07 And yes, the "pixel burst" OUT instruction (superimposing the first two bits for the sync and incrementing X to pass to next pixel) is by far the most used, since the gigatron is spending most of his time generating the physical VGA signal!

Regarding the comparison will be very easy: I've done the changes to your emulator to get all the stats I wanted! :D
Yeah, I have a bunch of native asm stats-gathering code ifdef'd out in cpu.cpp (not sure if you came across it). It's pretty cool seeing where the native hotspots are.

I also wrote some simple code in the emulator (also ifdef'd out) that shows the pixel OUT instructions visually with a cursor/cross-hair. It is nonsensical at full-speed emulation but pretty cool when single-step debugging.

Re: vCPU instruction frequency

Post by steve »

at67 wrote: 19 Jul 2020, 03:22 I also created some simple code in the emulator, (that's ifdef'd out), that shows the pixel OUT instructions visually with a cursor/cross hair that is nonsensical at full speed emulation but pretty cool when single step debugging.
Yes, I came across it... Unfortunately only at the end, too late for the overall changes, but it helped me in writing the output! :)

Re: vCPU instruction frequency

Post by steve »

I just updated the native instruction frequency table with the actual instruction mnemonics:

hex frequency   %  cum instruction
--- ---------  --- --- ---------------
5d  231666880  37% 37% ora [Y,X++],out
c2   43701745   7% 44% st [nn]
01   33627338   5% 49% ld [nn]
0d   26085751   4% 54% ld [Y,X]
00   24871641   4% 58% ld nn
80   19154237   3% 61% adda nn
a0   18761968   3% 64% suba nn
fc   16725411   3% 66% bra nn
e4   15980318   3% 69% bgt nn
de   15428968   2% 71% st [Y,X++]
81   13625961   2% 74% adda [nn]
e8   13368376   2% 76% blt nn
14   11513358   2% 78% ld nn,Y
fe   10351837   2% 79% bra AC
d2    9912622   2% 81% st [nn],X
15    9513300   2% 82% ld [nn],Y
89    9359247   1% 84% adda [Y,nn]
12    7767936   1% 85% ld AC,X
05    7694605   1% 86% ld [X]
18    7208385   1% 87% ld nn,out
20    6609144   1% 88% anda nn
e0    6459318   1% 90% jmp Y,nn
ca    6239518   1% 91% st [Y,nn]
11    5908221   1% 91% ld [nn],X
fd    5150552   1% 92% bra [nn]
21    4032157   1% 93% anda [nn]
30    3939783   1% 94% anda nn,X
8d    3903766   1% 94% adda [Y,X]
d6    3601267   1% 95% st [nn],Y
f0    3539284   1% 95% beq nn
ec    3413593   1% 96% bne nn
09    3125761   1% 96% ld [Y,nn]
69    3119773   0% 97% xora [Y,nn]
29    3119749   0% 97% anda [Y,nn]
c6    2228788   0% 98% st [X]
e1    2078775   0% 98% jmp Y,[nn]
90    1422152   0% 98% adda nn,X
85    1216529   0% 98% adda [X]
61    1132588   0% 99% xora [nn]
c0    1011892   0% 99% st nn,[nn]
02     994378   0% 99% nop
40     884117   0% 99% ora nn
a5     822970   0% 99% suba [X]
f4     738596   0% 99% bge nn
19     718560   0% 99% ld [nn],out
91     718482   0% 100 adda [nn],X
60     550923   0% 100 xora nn
a1     407287   0% 100 suba [nn]
25     352185   0% 100 anda [X]
41     279181   0% 100 ora [nn]
82     258459   0% 100 adda AC
45     217599   0% 100 ora [X]
b0     127153   0% 100 suba nn,X
e2     114880   0% 100 jmp Y,AC
f8     103784   0% 100 ble nn
ce      66678   0% 100 st [Y,X]
16      41534   0% 100 ld AC,Y
94      12284   0% 100 adda nn,Y
95      12284   0% 100 adda [nn],Y
c3       5988   0% 100 st in,[nn]
dc         98   0% 100 st nn,[Y,X++]
92         64   0% 100 adda AC,X
49         24   0% 100 ora [Y,nn]
c1          1   0% 100 ctrl nn
cd          1   0% 100 ctrl Y,X
03          0   0% 100 ld in
10          0   0% 100 ld nn,X
2d          0   0% 100 anda [Y,X]
4d          0   0% 100 ora [Y,X]
50          0   0% 100 ora nn,X
65          0   0% 100 xora [X]
6d          0   0% 100 xora [Y,X]
70          0   0% 100 xora nn,X
7d          0   0% 100 xora [Y,X++],out
a9          0   0% 100 suba [Y,nn]
ad          0   0% 100 suba [Y,X]
b1          0   0% 100 suba [nn],X
b2          0   0% 100 suba AC,X
b4          0   0% 100 suba nn,Y
b5          0   0% 100 suba [nn],Y
dd          0   0% 100 ctrl Y,X++
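
The mnemonics in the table above follow from the bit layout of the native opcode byte: the top three bits select the operation, the next three the addressing mode, and the low two the bus source. The Python sketch below decodes an opcode along those lines. It is not the code used to generate the table; the function name `decode` and the lookup tables are written for illustration, and the spelling of each mnemonic is simply matched to the rows listed here. Store instructions in mode 6 do not occur in the table, so their rendering is an assumption.

```python
ALU_OPS = ["ld", "anda", "ora", "xora", "adda", "suba"]
BRANCHES = ["jmp", "bgt", "blt", "bne", "beq", "bge", "ble", "bra"]
# Addressing mode -> RAM address expression and destination suffix.
ADDRESS = ["[nn]", "[X]", "[Y,nn]", "[Y,X]", "[nn]", "[nn]", "[nn]", "[Y,X++]"]
DEST = ["", "", "", "", ",X", ",Y", ",out", ",out"]
# Bus source: 0 = operand byte, 1 = RAM, 2 = accumulator, 3 = input port.
BUS = ["nn", None, "AC", "in"]


def decode(opcode):
    """Return a mnemonic for an 8-bit Gigatron native opcode."""
    op = (opcode >> 5) & 7
    mode = (opcode >> 2) & 7
    bus = opcode & 3

    if op == 7:  # branches: the mode field holds the condition
        source = "[nn]" if bus == 1 else BUS[bus]
        if mode == 0:
            return f"jmp Y,{source}"
        return f"{BRANCHES[mode]} {source}"

    if op == 6:  # stores: the mode field gives the target address
        target = ADDRESS[mode]
        suffix = DEST[mode] if mode in (4, 5) else ""
        if bus == 1:  # RAM cannot be read and written at once: listed as ctrl
            return "ctrl " + target.strip("[]")
        if bus == 2:  # storing AC is the default, so the source is omitted
            return f"st {target}{suffix}"
        return f"st {BUS[bus]},{target}{suffix}"

    if opcode == 0x02:  # ld AC into AC does nothing
        return "nop"
    source = ADDRESS[mode] if bus == 1 else BUS[bus]
    return f"{ALU_OPS[op]} {source}{DEST[mode]}"


if __name__ == "__main__":
    for opcode in (0x5D, 0xC2, 0xFC, 0xE0, 0xDC, 0xC1):
        print(f"{opcode:02x}  {decode(opcode)}")
```

Read this way, `5d` is operation 2 (ora), mode 7 (`[Y,X++]` with output) and bus 1 (RAM), which is the pixel burst instruction at the top of the list.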