Possible Video Separation Ideas
Posted: 01 Oct 2021, 15:41
Many of us have pondered ways to get the video processing out of native code. I've pondered different ideas over the past couple of years. This thread is intended to collect them in one place.
Partial Measures
First, we start with partial measures.
For those who don't want to use the Gigatron for gaming, one option is to modify it to include a monochrome mode. With a shift register added, the native code could send 8 pixels at a time and have 7 free cycles between pixel groups. An entire frame would then fit in 2400 bytes, with only 20 bytes sent per active row. It is still bit-banged, but there would be far less data to bit-bang. This would require a ROM rewrite and compatible software.
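To check the arithmetic, here is a rough Python sketch of the packing. The 160x120 resolution follows from the 20 bytes per row and 120 rows mentioned in this post; the bit order (leftmost pixel in the most significant bit) and the name pack_row are my own assumptions, since the real order would depend on how the shift register is wired.

```python
WIDTH = 160            # pixels per active row (20 bytes * 8 pixels)
ROWS = 120             # rows of data actually sent
PIXELS_PER_BYTE = 8    # one byte feeds the shift register with 8 pixels


def pack_row(pixels):
    """Pack a row of 0/1 pixels into bytes.

    Assumption: the leftmost pixel goes into the most significant bit.
    """
    if len(pixels) % PIXELS_PER_BYTE:
        raise ValueError("row length must be a multiple of 8")
    out = bytearray()
    for i in range(0, len(pixels), PIXELS_PER_BYTE):
        byte = 0
        for bit in pixels[i:i + PIXELS_PER_BYTE]:
            byte = (byte << 1) | (bit & 1)
        out.append(byte)
    return bytes(out)


bytes_per_row = WIDTH // PIXELS_PER_BYTE   # 20 bytes per active row
frame_bytes = bytes_per_row * ROWS         # 2400 bytes per frame
```

The 7 free cycles come from the same ratio: one cycle loads the byte, and the shift register clocks out the other 7 pixels without help from the CPU.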
The line repeater project makes it possible to have a full screen while only sending 120 rows of data, allowing the Gigatron to send in the fastest mode while still filling the screen. If one wanted to, they could rewrite the native ROM to remove the mode selection to save a few native cycles, or they could somehow code in a speed selection for compatibility since some of the games are more challenging to play in the fastest rendering mode.
Moving from there, you could have an external frame buffer that could incorporate the ideas above. Then the Gigatron wouldn't have to send video every frame, or necessarily stay in sync with the display. One would likely need to rewrite the ROM so that nothing is sent to the port when vCPU changes nothing in the video. The open question is how the video interface would know when to overwrite the external buffer. Even with that solved, this would still be inefficient in that changing a single byte could trigger an entire external frame overwrite.
More Comprehensive Changes
Over the past few days, an idea for using an FPGA to make a "passive" video coprocessor/controller/accelerator came to mind. On the native code side, this would require rewriting the ROM to strip out the majority of the video handling. The bit-banging from RAM to the port would be removed. As for syncs, one would want to leave them. Hsync would be needed for sound (unless moved to the coprocessor) and Vsync would be needed for the keyboard/game controller and any software that simulates an RTC. As for installing it into the Gigatron, it could go on a card that replaces the RAM and includes everything in the I/O controller project. It could carry its own DACs, with ribbon cables to replace the resistors or resistor packs on the board.
The way this would work is to snoop the address and data lines, shadow the video (or even sound) I/O addresses, and use them from the card's own memory. The card stores what goes into the Gigatron's RAM as it is written and imposes no demands on the Gigatron. It would have its own frame buffer and sync circuitry and work completely autonomously from the Gigatron. The line-quadding would be inherent to the design. The native software would not have to generate the syncs for it (though it may need them for other reasons).
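Here is a small Python model of the snooping idea, just to make the behavior concrete. It is not HDL and not a real design. The class name, the 32K RAM size, and the video address range (one page per row starting at page 0x08, 160 bytes used per page) are assumptions I'm making for illustration.

```python
class ShadowVideoMemory:
    """Models a card that watches bus writes and mirrors the video bytes."""

    def __init__(self, first_page=0x08, last_page=0x7F, width=160):
        # Assumed layout: each screen row lives in its own page,
        # using offsets 0..width-1 within that page.
        self.first_page = first_page
        self.last_page = last_page
        self.width = width
        self.shadow = bytearray(0x8000)  # assumed 32K address space

    def on_bus_write(self, address, data):
        """Call for every RAM write seen on the bus.

        Returns True if the write landed in the shadowed video range.
        The Gigatron is never stalled or otherwise aware of this.
        """
        page, offset = address >> 8, address & 0xFF
        if self.first_page <= page <= self.last_page and offset < self.width:
            self.shadow[address] = data & 0xFF
            return True
        return False
```

The point of the model is that on_bus_write only ever listens. The card's own scan-out logic would read from shadow on its own clock, which is why the native code no longer has to feed pixels to the port.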
The above is just the start. It can be made to do more than the Gigatron does. I've wondered how to include things like display lists, text mode, and more, and it could be simple. One could repurpose the redirection table. I don't know of any reason why the redirection table should point at Page 0, so putting Page 0 in the first redirection table entry could be the "magic number" that lets the controller know that something other than the default is happening. The offset of that first entry could then be the mode or "opcode." This could tell the controller to use text-only mode or to use a display list. Or maybe the last entry could be used for this instead, to minimize artifacts when switching modes. And some of the modes could be non-immediate so you can keep the last screen displayed while switching modes.
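A rough sketch of how the controller might decode that first entry, in Python for readability. I'm assuming each table entry is a (page, offset) byte pair, and the mode numbers and names here are made up purely for illustration.

```python
MAGIC_PAGE = 0x00  # Page 0 in the first entry signals "not the default mode"

# Hypothetical mode numbers; nothing here is defined by any existing ROM.
MODES = {
    0x01: "text",
    0x02: "display_list",
}


def decode_mode(table):
    """Return the video mode selected by the first redirection table entry.

    table is the shadowed redirection table as a bytes-like object, with
    the first entry's page at index 0 and its offset at index 1 (assumed).
    """
    page, offset = table[0], table[1]
    if page != MAGIC_PAGE:
        return "default"          # ordinary entry, behave like a stock Gigatron
    return MODES.get(offset, "default")  # unknown opcodes fall back safely
```

Falling back to the default for unknown opcodes is a design choice on my part, so that software written for a newer controller still shows something on an older one.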
Plus, the above idea could possibly add non-video features. For instance, the sound could be moved there. If doing that, I'd keep the note table and software wavetables in ROM/RAM. That way, custom sounds can still be added. But the tables could be shadowed by default and be used from the controller.
As for adding I/O features or math coprocessing, that would be more complicated. The above can all be done in a passive read-only fashion and requires no DMA. But if you need DMA writes to handle reads from external devices or to pass back a mathematical result, it would take some interesting ROM changes. Essentially, one way to do this is to have the Gigatron native code wait in a spinlock. When the external controller sees this, it could put a fake value on the bus or latch the bus while the Gigatron is reading it. The controller then directly accesses the RAM, which is not connected to the Gigatron at this point, and afterward sends a signal to reconnect the RAM. At that point the condition needed to exit the spinlock is met. An example would be math coprocessing. Let's say you have an address to pass the FPU opcode and addresses for the operands and results. The FPU could change the opcode byte to 0 when it is done. In the spinlock, the Gigatron is stuck in a loop until the byte is 0. So you could latch the opcode so that is all the Gigatron sees, then disconnect and manipulate the RAM, and switch the RAM back with the opcode byte cleared. This is just a workaround so that true DMA or a Halt line would not need to be added.
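To make the handshake easier to follow, here is a sequential Python simulation of it. In hardware the controller would run concurrently with the CPU; here it is simply called from inside the polling loop. The mailbox addresses, the single OP_ADD opcode, and all the names are hypothetical.

```python
# Hypothetical mailbox layout in RAM.
OPCODE_ADDR, OPERAND_A, OPERAND_B, RESULT = 0x30, 0x31, 0x32, 0x33
OP_ADD = 1  # hypothetical opcode: 8-bit add


class Bus:
    def __init__(self):
        self.ram = bytearray(256)
        self.latched = None  # value the CPU sees while RAM is disconnected

    def cpu_read(self, addr):
        # While latched, the CPU only ever sees the held value.
        return self.latched if self.latched is not None else self.ram[addr]


def coprocessor_step(bus):
    """One pass of the controller: service a pending request, if any."""
    op = bus.ram[OPCODE_ADDR]
    if op == 0:
        return
    bus.latched = op                      # hold the opcode; take over the RAM
    if op == OP_ADD:
        bus.ram[RESULT] = (bus.ram[OPERAND_A] + bus.ram[OPERAND_B]) & 0xFF
    bus.ram[OPCODE_ADDR] = 0              # mark the request as done
    bus.latched = None                    # reconnect the RAM to the CPU


def cpu_request_add(bus, a, b, max_spins=1000):
    """CPU side: post a request, then spin until the opcode byte reads 0."""
    bus.ram[OPERAND_A], bus.ram[OPERAND_B] = a & 0xFF, b & 0xFF
    bus.ram[OPCODE_ADDR] = OP_ADD
    spins = 0
    while bus.cpu_read(OPCODE_ADDR) != 0:  # the spinlock
        coprocessor_step(bus)              # concurrent in real hardware
        spins += 1
        if spins > max_spins:
            raise TimeoutError("coprocessor never cleared the opcode")
    return bus.ram[RESULT]
```

The only thing the CPU relies on is that the opcode byte eventually reads 0, which is why no Halt line or true DMA is needed in this scheme.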
New Designs
The most drastic option would be to create a new Gigatron-like machine that is fully compatible. One could include concurrent DMA. Thus you'd have at least 2 "threads" that run all the time. Most or even all of the previous ideas could be included. Some things could be added to make even bit-banging easier. For instance, if the Gigatron had at least 2 more registers, then bit-banging the current resolution at 12.5 MHz or faster would be more feasible since vCPU instructions could be interleaved with sending the video data. That would require integrating the rest of the I/O to prevent I/O races. Everything could be done the Gigatron way. However, this method could introduce software races. One way to handle that is to have an I/O controller that selectively throttles the CPU. If the video changes get ahead of what the video controller can handle, it could throttle the CPU so the code won't get ahead of the video and sound.