PurpleGirl wrote: ↑18 Nov 2019, 11:25
It would be nice to find a way to chain 2 Gigatrons. I'm not sure how one would exactly do that. Sure, one could use dual-ported RAM to get the information out of RAM. Maybe the Out of the first one could be used to send special video and sound opcodes to a modified In of the second one, perhaps buffered through a shift register. The 2 ROMs would be modified to be fit for specialization.
[This came up in a different thread, but I prefer to share my thoughts on chaining here.]
From the very beginning, in early 2017, the OUT and IN ports were designed so that they can be connected for high-speed data transfer. It's no coincidence that the IN and OUT blocks are at the same vertical position in the block diagram. It is a bit of a coincidence that the instruction set gives such a high significance to the /IE signal. But because of that, the processor core can input a byte, do an ALU operation on it, and then output the result, all in the same 160 ns clock cycle. That makes it a simple DSP.
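To make the "simple DSP" remark concrete, here is a minimal Python model of one such cycle. It is only a sketch: the function name `dsp_cycle` and the operation labels are made up for illustration, and the model assumes the result goes to OUT as an 8-bit value while the accumulator stays unchanged.

```python
# Behavioural model of one clock cycle: read IN, combine with AC, write OUT.
# Names are hypothetical; this is not taken from the Gigatron ROM sources.
ALU_OPS = {
    "ld":  lambda ac, bus: bus,        # pass the input straight through
    "and": lambda ac, bus: ac & bus,
    "or":  lambda ac, bus: ac | bus,
    "xor": lambda ac, bus: ac ^ bus,
    "add": lambda ac, bus: ac + bus,
    "sub": lambda ac, bus: ac - bus,
}

def dsp_cycle(op, ac, in_byte):
    """Return the byte that would appear on OUT after one cycle."""
    return ALU_OPS[op](ac, in_byte) & 0xFF   # 8-bit data path, wraps around

def dsp_stream(op, ac, samples):
    """Process a stream of input bytes, one byte per modelled cycle."""
    return [dsp_cycle(op, ac, s) for s in samples]
```

For example, `dsp_stream("xor", 0xFF, data)` inverts every byte passing through, at one byte per modelled cycle.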
In the kit edition, the 74LS244 non-inverting tri-state buffer 'BUS-IN' got replaced with the 74HC595 shift-register. Many hobbyists are familiar with it. The breadboard prototype still has a 74LS244 on that location. The block diagram wasn't even updated: today it still shows there are 8 parallel input bits into the system:
[Attachment: Parallel IN.png]
The only reason we have the 74HC595 now, instead of a 74HCT244, is that we found those nice game controllers with a shift register inside. That was late 2017, and at that point in the project we were really not in the mood to shoehorn software shifting in the video loop (which would have been the "proper" solution, more in Gigatron style). "Perfect is the enemy of good", so we put in the 74HC595: it was cheaper, had fewer pins, saved us a month of work and nobody would care that it really doesn't belong there (from a purist point of view). But if you look closely at that chip, it has the same 74xx244 on the inside, with a shift register next to it. (In my mind I consider only the buffer as part of the processor proper.) Even though it's not quite a UART yet, with its 16 internal flip-flops it's already an ugly and overly complex thing:
[Attachment: 74HC595.png]
Knowing that, it has become trivial to chain up two Gigatrons: replace one 74HC595 with a 74HCT244 and connect its A pins directly to the Q pins of the other system's OUT register. You may or may not want to share the system CLK lines as well, to make it a negative chip count modification. The keyboard or game controller goes on the right, where we still have a 74HCT595. But now we only trigger it whenever we need to poll it, and use two XOUT lines for that instead of the video sync signals. I think visually, so like this:
[Attachment: Chained.png]
I'm sure it will work just like that. To do the same video tricks, the left processor maintains the display list (videoTable at page 1) and the pixel memory. The right processor sends update commands over some protocol. The left one can spend all of its vCPU time listening and processing those. It can run an almost standard ROM version with just that 1 application replacing the main menu. The software on the right processor can start from scratch with an empty EPROM: the world is its oyster. It has no timing responsibilities and with that it can run native code without restrictions. It can even fake the Loader protocol and load software this way (which needs to be interpreted because of the Harvard setup, so I would keep vCPU around for that). BabelFish needs to understand there's no 60 Hz signal anymore, those are small things.
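The "update commands over some protocol" part is left open above. Purely as an illustration of how little is needed, here is a Python sketch of a receiver for a made-up two-command byte protocol. The opcodes `CMD_SET_ADDR` and `CMD_WRITE` and the function `process_stream` are hypothetical and not part of any Gigatron ROM; a real version would run as the vCPU application on the left processor.

```python
# Hypothetical byte protocol; none of these opcodes exist in the Gigatron ROM.
CMD_SET_ADDR = 0x01   # followed by: address high byte, address low byte
CMD_WRITE    = 0x02   # followed by: count, then `count` data bytes

def process_stream(stream, ram):
    """Apply update commands from an iterable of bytes to `ram` (a bytearray)."""
    it = iter(stream)
    addr = 0
    for cmd in it:
        if cmd == CMD_SET_ADDR:
            hi, lo = next(it), next(it)
            addr = (hi << 8) | lo
        elif cmd == CMD_WRITE:
            count = next(it)
            for _ in range(count):
                ram[addr] = next(it)
                addr = (addr + 1) & 0xFFFF   # stay within a 16-bit address space
        else:
            raise ValueError(f"unknown command byte {cmd:#04x}")
    return ram
```

With a set-address command, the same stream can target either the display list or pixel memory, so the sender needs no knowledge of video timing.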
For any dual-ported RAM concept I'd like to see some form of detail before I can join that part of the discussion. I'd like to see a data sheet of an actual component under consideration, and some sort of timing and layout concept. The reason is this: microcomputers and homecomputers were blessed with very fast RAM. Or actually they weren't, and it only appears that way because they had processors made with very slow transistors inside. That made them cheap compared to "real" computers. In such a system, it makes sense to connect all kinds of external support logic (video, sound) to the same memory bus: those memories could easily handle multiple data streams because the processor was not using them to the fullest. In my mind, our situation is the opposite: we have very fast BJT-based logic organised in a way that saturates the 70 ns RAM chip's bandwidth (which is already very fast itself). Therefore expanding directly on the bus looks difficult to me.
Even if I carbon copy all the writes above address 0x100 to a second RAM, I don't know how to get it out in a clean way for display. I can double its speed and alternate access cycles. But then I can also just speed up the first RAM to begin with. And the one or two data sheets I've seen for dual-port memories have asymmetric access.
Edit: with a proper SYS extension, and in a shared clock configuration, the RAM-to-RAM transfer can be in bursts of 1 byte per clock cycle:
Sender:
...
ld [y,x++],out
ld [y,x++],out
ld [y,x++],out
...
Receiver:
...
st in,[y,x++]
st in,[y,x++]
st in,[y,x++]
...
That is almost DMA already.
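To see what such a burst amounts to, here is a small Python model of the two loops running in lockstep. It is a sketch under assumptions: the sender's OUT register is modelled as latching at the end of each cycle, so the receiver sees each byte one cycle later, and the names `burst_transfer` and `CLOCK_NS` are made up for this example.

```python
CLOCK_NS = 160  # clock period quoted earlier in this post

def burst_transfer(src):
    """Model sender `ld [y,x++],out` and receiver `st in,[y,x++]` in lockstep."""
    out_reg = None   # sender's OUT register, wired to the receiver's IN
    dst = []
    # One extra cycle at the end drains the last byte out of the register.
    for cycle in range(len(src) + 1):
        # Receiver stores whatever OUT holds during this cycle.
        if out_reg is not None:
            dst.append(out_reg)
        # Sender's load is latched into OUT at the end of the cycle (assumed).
        out_reg = src[cycle] if cycle < len(src) else None
    return dst

# Peak rate during a burst: one byte per clock cycle.
BYTES_PER_SECOND = 1_000_000_000 // CLOCK_NS
```

At one byte per 160 ns cycle, the peak rate inside a burst is 6.25 million bytes per second. The sustained rate would be lower, since the receiving side still has to fit its bursts around video generation.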