Could a Propeller 2 make a good I/O controller for a modified Gigatron?

Posted: 06 Aug 2022, 14:41
by Sugarplum
Overview
I've been thinking about how to offload I/O from the Gigatron. I imagine a Propeller 2 could take over all of the bit-banged hardware. It has enough hub RAM to hold all of the Gigatron's RAM, and it has 8 cogs, so each task could get its own cog. My approach would be asynchronous snooping, with the option of lazy RAM writes and a command mode. The memory isn't in use all the time, so the controller could do its writes when the CPU is not using the memory. There could be jumpers to select external or internal syncs (so there could be stock ROM compatibility).

Features & Comments
The output circuitry for sound and video could be simpler since there are 4 DACs per cog.

More video modes could be added. One way to do this could be to use the indirection table to pass commands since snooping would make this feasible. To tell the controller to use the indirection table for other purposes, one could overwrite the page number with 0. Surely, there is no need to use the zero page as a frame buffer. That could be a way to pass parameters to the controller. The controller would have its own frame buffer. That could be used to move the rest of the indirection table, change video modes, add a text mode, or even use the area to pass I/O commands.
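The page-0 trick above can be sketched in Python as a model of the controller's snooping logic. This is only an illustration under stated assumptions: the table address `0x0100`, the 120 two-byte entries, and the name `classify_write` are placeholders I picked, not something taken from a ROM listing or from P2 code.

```python
# Sketch of a snooping controller watching the video indirection table.
VIDEO_TABLE_BASE = 0x0100   # assumed location of the indirection table
VIDEO_TABLE_LINES = 120     # assumed: one (page, x-offset) pair per line
COMMAND_PAGE = 0            # page 0 is reinterpreted as "this entry is a command"


def classify_write(addr, value, shadow):
    """Record a snooped write and classify the table entry it touches.

    `shadow` is the controller's private copy of the table (a bytearray).
    Returns None for writes outside the table, otherwise either
    ("scanline", line, page, xoffset) or ("command", line, parameter).
    """
    offset = addr - VIDEO_TABLE_BASE
    if not 0 <= offset < 2 * VIDEO_TABLE_LINES:
        return None
    shadow[offset] = value
    line = offset // 2
    page = shadow[2 * line]
    second = shadow[2 * line + 1]
    if page == COMMAND_PAGE:
        # The second byte is free to carry a parameter for the controller.
        return ("command", line, second)
    return ("scanline", line, page, second)
```

Note that a zeroed shadow table makes every entry look like a command, so a real controller would probably need an explicit initial state before it trusts the table.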

Stereo could be added to the sound using only 2 GPIO lines. Different stereo simulation strategies could be offered, such as a slight time delay between the channels, slight detuning, alternating vibrato, etc. There could be commands to override this and provide direct access. Of course, to get the full benefit of the new features, a different vCPU would need to be written, along with software for it. One might be able to do higher-resolution sound with a better frequency response if one wanted to. For instance, the controller could use internal waveform tables, detect any writes to the Gigatron's tables, and then use what the software wrote instead. That is one way to get better sound while still letting software define its own waveform tables. Or, for expediency, when the sound tables are not changed, use the hardware waveform abilities. So if you are using triangle or ramp waves on a single channel, the I/O pin itself can take over the job.
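As a rough illustration of the simplest strategy, the time delay, here is a Python sketch. The function name and the use of 0 as the silence value are my own assumptions; a real controller would do this per sample in a cog.

```python
def pseudo_stereo(mono, delay_samples, silence=0):
    """Return a list of (left, right) pairs from a mono sample list.

    The left channel is the mono signal as-is; the right channel is the
    same signal delayed by `delay_samples`, padded with `silence`.
    """
    if delay_samples < 0:
        raise ValueError("delay_samples must not be negative")
    mono = list(mono)
    right = ([silence] * delay_samples + mono)[:len(mono)]
    return list(zip(mono, right))
```

For example, `pseudo_stereo([1, 2, 3, 4], 1)` pairs each sample with the previous one on the right channel.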

If one is pushed for pins, they could Charlieplex the LEDs. You can treat the GPIO lines as floating and take advantage of the tri-state I/O, and the clock would be fast enough to make the lights appear constant. The controller ROM can also convert between the LED formats. But I think that would only save 2 pins.
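To make the pin count concrete, here is a small Python sketch of the usual Charlieplexing arithmetic, where n tri-state pins address n x (n - 1) LEDs. The function names are mine, and the H/L/Z strings are just a notation for high, low, and floating.

```python
from itertools import permutations


def charlieplex_states(num_pins):
    """List the pin states that light each LED, one string per LED.

    Each string has one character per pin: 'H' (driven high),
    'L' (driven low) or 'Z' (floating / high impedance).
    """
    table = []
    for anode, cathode in permutations(range(num_pins), 2):
        states = ["Z"] * num_pins
        states[anode] = "H"
        states[cathode] = "L"
        table.append("".join(states))
    return table


def min_pins(num_leds):
    """Smallest pin count n with n * (n - 1) >= num_leds."""
    pins = 1
    while pins * (pins - 1) < num_leds:
        pins += 1
    return pins
```

By this count, four LEDs fit on three pins, and only one LED is lit at any instant, which is why the fast clock matters.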

File I/O could be handled here too. The shift register could be eliminated on the Gigatron side. It could have both the microcontroller assistance of the Pluggy and the wider I/O pipe. If the controller needs more data paths to the Gigatron, the ports could be repurposed. Input can be done DMA style either via command sequences and spinlocks or via lazy writing. So the keyboard/controller data can be written directly into memory whenever it can be.

Math coprocessing may be an option with this approach. The Propeller can do a 16x16-bit multiplication with a 32-bit result in 2 cycles. It can do wider multiplication (32x32 to 64 bits) and division (64 by 32 to 32 bits), but those would take up to 58 cycles longer since they need the slower CORDIC solver in the hub. Still, those would be cycles at 200+ MHz. Also, trig and other functions can be done.
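To show how the operand widths relate, here is a Python sketch that builds a 32x32 to 64-bit product out of four 16x16 multiplies. This is the generic schoolbook method, not a claim about how the P2 does it; `mul16` is only a stand-in for a fast 16-bit hardware multiply.

```python
MASK16 = 0xFFFF


def mul16(a, b):
    """Stand-in for a 16x16 -> 32-bit hardware multiply."""
    assert 0 <= a <= MASK16 and 0 <= b <= MASK16
    return a * b


def mul32(a, b):
    """Unsigned 32x32 -> 64-bit product from four 16-bit partial products."""
    a_lo, a_hi = a & MASK16, (a >> 16) & MASK16
    b_lo, b_hi = b & MASK16, (b >> 16) & MASK16
    low = mul16(a_lo, b_lo)                       # weight 2**0
    mid = mul16(a_lo, b_hi) + mul16(a_hi, b_lo)   # weight 2**16
    high = mul16(a_hi, b_hi)                      # weight 2**32
    return (high << 32) + (mid << 16) + low
```

Whether four short multiplies plus the adds would beat one CORDIC operation on real hardware is something that would have to be measured.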

Such a solution could be a stepping stone to a new vCPU and memory map if one is willing to code it. If I were to redo the memory map, I'd say the first 32K should be only for things like LUTs, important "registers," frame-buffer, sound tables, I/O-passing regions, math coprocessor registers, a new indirection table system, etc. Plus, I'd want to try to align things more. If you try to move to a 16-bit model, you find that too many things straddle word boundaries. Many 16-bit values start at odd addresses. That is not a problem as long as the native core is 8-bits (or if going to 16-bits, a 16-bit memory model is used). So to address this, a new memory map would be needed.
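The alignment problem can be checked mechanically. Here is a minimal Python sketch, with hypothetical helper names, that tests whether a value straddles a word boundary and where the next aligned address would be.

```python
def straddles_word(addr, size=2, word_bytes=2):
    """True if `size` bytes starting at `addr` cross a word boundary."""
    return addr // word_bytes != (addr + size - 1) // word_bytes


def align_up(addr, word_bytes=2):
    """Round `addr` up to the next multiple of `word_bytes`."""
    return (addr + word_bytes - 1) // word_bytes * word_bytes
```

A 16-bit value at an odd address such as `0x31` straddles two words, while moving it to `align_up(0x31)`, which is `0x32`, does not.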

Addressing more memory could be a part of such a controller, as with the memory expansion boards. There could be a new register if one wants, or more than one with the appropriate calls to the controller. In that case, those registers could be kept in cog memory without the limitations and performance hit of using hub memory. Most cog instructions take 2 cycles, but the chip would be running at 200 MHz (it is not rated for anything faster, though 320 MHz is possible).

Even random numbers are possible with this arrangement. The Gigatron looks for entropy during the porch times and updates its random value then. With the P2, one could probably have it do lazy writes of random values to memory. If you need more frequent random numbers, then one could make a function call to get them on demand, with the overhead of function calls. The Propeller 2 can return unique random numbers per cog and per pin. Of course, this could be a way to make the noise waveform work without using the table in memory.

If one wanted to, having sound, video, and random numbers in one place would allow emulating another aspect of the 80s experience. Imagine, when you turn on the Gigatron, you hear a faint 15.75 kHz sound (the horizontal whine of the yoke coil) and white noise, and you see a screen with "snow" on it, then the familiar blue screen appears. The higher pitches could even have a separate sound channel, maybe driving a piezo speaker on the board. And such an extra channel could emulate other aspects of 80s machines. The Atari clicked when you touched keys and had an internal "alarm" that was separate from the sound coming through the television. So if there were memory or cartridge problems, it would go off. And BASIC programmers could access that extra speaker. It was limited since I think that used the timer in the PIA chip and not Pokey. Even the IBM PC did something similar, where it had a 1-bit channel that programmers could control. Of course, I installed a Sound Blaster to get decent sounds in games on a PC.

Gigatron Changes
If a respin is made using the Propeller, changes would be needed on the Gigatron side. For instance, everything can be done using snooping, lazy writes, spinlocks, and command sequences. The Out port could be used to send commands to the controller and not conflict with the Page 0 commands. The Input port could be repurposed to receive status bits. The Propeller would have the actual, dedicated I/O lines.

Adding such a controller could allow the Gigatron to run at more arbitrary speeds. They all should do 8 MHz, and 10-12.5 MHz would be possible with some mods. I think if I were to do a respin around the Propeller, I'd do the ALU mod I've proposed before that adds 2 more chips. Thus bit 7 is returned faster (and nibble skew is reduced) to make the higher speeds more stable.

Propeller Considerations & Additional Comments
I'd like to see it work asynchronously. So one GPIO line would likely need to be the Gigatron's clock, and the Gigatron's clock could also be the base clock for the Propeller. You can program the VCO and PLL to generate any Propeller clock you want from nearly any base clock. Spinlocks could be used on the Propeller side to sync itself to the Gigatron writes. I'd probably run the Propeller as fast as possible to allow more granularity with the Gigatron clock and to make up for the Propeller being more CISC in nature. The video could be clocked higher than necessary to ensure there is enough time to do everything and stream from the hub memory. Features to make use of would be the neighbor LUT-sharing mechanism and the ability to transfer doublewords from the hub.

Right now, I'm just exploring. I will pose that question to the Parallax community to see what they have to say about how feasible this would be. It would be nice to have a means to off-load all major I/O to other devices. I do have some design questions in mind. For instance, I'd need to see if it is possible to do I/O snooping, put things in the hub memory, and get them out of the hub memory fast enough. I might have to use 2 cogs for doing just that. Since the I/O across the data and address lines is rather sporadic, putting the hub in streaming mode is not really an option. Plus, streaming mode would block other hub traffic. Also, with the other expansion ideas, this could write non-memory accesses to memory. However, the cog that manages reading the pins might not be able to read the pins and commit the reads to the hub memory fast enough to avoid missing reads. There can be up to a 7-cycle penalty for doing this. So it might be better to read from the pins into the cog's LUT memory and use the neighboring cog to write from the LUT to the hub. In other words, it may need a cache. There are 4 pairs of cogs that can share their LUTs. I'd also consider doing async reads and coordinating them with the Gigatron's clock. There might need to be multiple reads to ensure stable data on the lines. So that sounds like a job for 2 cogs. Writing to the hub memory would be more of an issue than reading, since cogs are free to read doublewords from it with no additional penalty. So reading 4 bytes takes as much time as reading one.
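The two-cog cache idea amounts to a small FIFO between a snooping side and a draining side. Here is a Python model of that hand-off, purely as a sketch: the class name, the default size, and the drop counter are my assumptions, and a real version would live in shared LUT RAM and be written in P2 assembly.

```python
class LutFifo:
    """Fixed-size ring buffer standing in for a shared LUT region."""

    def __init__(self, size=16):
        self.slots = [None] * size
        self.head = 0      # next slot the snooping side writes
        self.tail = 0      # next slot the draining side reads
        self.dropped = 0   # snooped writes lost because the buffer was full

    def push(self, addr, data):
        """Snooping side: queue one captured bus write."""
        nxt = (self.head + 1) % len(self.slots)
        if nxt == self.tail:   # full: one slot is kept empty on purpose
            self.dropped += 1
            return False
        self.slots[self.head] = (addr, data)
        self.head = nxt
        return True

    def drain(self, hub, max_items):
        """Draining side: commit up to `max_items` queued writes to `hub`."""
        done = 0
        while done < max_items and self.tail != self.head:
            addr, data = self.slots[self.tail]
            hub[addr] = data
            self.tail = (self.tail + 1) % len(self.slots)
            done += 1
        return done
```

The `dropped` counter is the interesting number here: if it ever goes above zero in a simulation of worst-case bus traffic, the buffer is too small or the draining cog is too slow.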

Any thoughts?

Re: Could a Propeller 2 make a good I/O controller for a modified Gigatron?

Posted: 10 Aug 2022, 19:08
by Sugarplum
I asked at the Parallax forum, and one person said that I probably could do this.

Others changed the direction of the thread to how to do an entire Gigatron inside a P2 chip. One gave a quick assembly mockup for emulating the Gigatron itself, clocking the P2 at 300 MHz. Due to the variability of the hub memory, you kinda have to do things faster than needed much of the time and then use the hub streamer feature and gate the code by it somehow. So you'd write for the slowest case and then lock things to the streamer to prevent things from going faster than the streamer. When I looked at their code, I noticed they were emulating the native Gigatron. And there was an area to paste in the ROM, and they only allocated 32K for RAM. So they had a stock Gigatron.

I mentioned in that thread that it would make more sense, if one was after high performance, to emulate vCPU instead and to split out the software peripherals into other cogs. As for the memory map, I guess hardcode it where possible in the peripherals that do the I/O and maybe in an init routine.

Re: Could a Propeller 2 make a good I/O controller for a modified Gigatron?

Posted: 14 Aug 2022, 18:48
by Sugarplum
I've checked the Parallax forum, and one member is about finished with his Gigatron emulation on the Propeller 2. The result emulates a Gigatron at 6.25 MHz, has SPI support, and can bank up to 128K.

https://forums.parallax.com/discussion/ ... omputer/p1

Re: Could a Propeller 2 make a good I/O controller for a modified Gigatron?

Posted: 18 Aug 2022, 04:14
by Sugarplum
Roglow over at the Parallax forum emulated a stock Gigatron on a Propeller 2, and even one of the expansion boards:

Image

He got 64K with version 4 ROM and with 5A, he's getting 128K. His approach emulates the stock machine, so it requires a ROM, and it seems the Gigatron ROMs work with it. While he is clocking the P2 at 325 MHz, the ROM is being executed at a rather stable 6.25 MHz.

He had it working before, though only with red colors. However, that was a simple bug in the code where he updated the same register 3 times when he meant to update 3 different registers. I caught it and he quickly found the bug.

The nice thing about this is that it simplifies testing when adding other features. For instance, he may attempt my idea of splitting out the video from the code. He may be using a cog for that now (but not used in the above pic), where it reads the indirection table and outputs directly from there. On the P2, that is possible since concurrent DMA is inherent to the design. The 8 cogs take turns accessing the hub memory. And on the P2, the hub is split into banks, so if the next cog reads the next address, it should get it at the same time as the other cog. You still have a bottleneck when accessing hub memory, but it isn't as bad on the P2 in comparison to the P1.
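Here is a toy Python model of that banked, take-turns access, to show where a wait of up to 7 clocks can come from. The rotation rule (at clock t, cog c is connected to bank (c + t) mod 8) and the choice of one 32-bit long per bank step are assumptions made for illustration, not something taken from the P2 documentation.

```python
NUM_COGS = 8         # also the number of hub RAM banks in this model
BYTES_PER_LONG = 4


def bank_of(addr):
    """Bank that holds the long containing byte address `addr`."""
    return (addr // BYTES_PER_LONG) % NUM_COGS


def wait_cycles(cog, addr, clock):
    """Clocks until `cog` is connected to the bank holding `addr`.

    Assumed rotation: at clock t, cog c is connected to bank (c + t) % 8.
    """
    current = (cog + clock) % NUM_COGS
    return (bank_of(addr) - current) % NUM_COGS
```

In this model, cog 0 reading address 0 and cog 1 reading address 4 on the same clock both get through with no wait, and a single cog reading consecutive longs on consecutive clocks never waits either. A random access, however, can wait anywhere from 0 to 7 clocks.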

Re: Could a Propeller 2 make a good I/O controller for a modified Gigatron?

Posted: 18 Aug 2022, 05:22
by bmwtcu
He got 64K with version 4 ROM and with 5A, he's getting 128K.
ROMv5a had a memory detection bug that would erroneously report 128K on my 64K FPGA.

Re: Could a Propeller 2 make a good I/O controller for a modified Gigatron?

Posted: 20 Aug 2022, 10:14
by Sugarplum
bmwtcu wrote: 18 Aug 2022, 05:22
He got 64K with version 4 ROM and with 5A, he's getting 128K.
ROMv5a had a memory detection bug that would erroneously report 128K on my 64K FPGA.
So it could go either way, depending on how much of the hub RAM he set aside and how well he emulated the expansion. What version should he use since he's planning on including the expanders?

Re: Could a Propeller 2 make a good I/O controller for a modified Gigatron?

Posted: 20 Aug 2022, 10:34
by bmwtcu
If you're trying to use SD browser for SPI expander then I'm guessing DEVROM, otherwise v5a is the latest official ROM available publicly. I skimmed through the P2 thread. Quite impressive how quickly Roger turned it around.

Re: Could a Propeller 2 make a good I/O controller for a modified Gigatron?

Posted: 20 Aug 2022, 15:53
by Sugarplum
Come to think of it, 325 MHz sounds appropriate based on how things work. Okay, the most common instructions take 2 cycles, you have 8 cogs competing for the common hub memory, and emulating at 3 times the target speed is probably a good rule of thumb. So multiplying all those (2 x 8 x 3), you get 48. But in a worst-case scenario, the hub RAM bottleneck might in practice be worse than that figure implies, so 52x sounds about right for extra headroom (52 x 6.25 MHz = 325 MHz), provided you have a reference rate to govern that and prevent things from going faster than intended. And apparently, the P2 has a means for doing that. So you'd have things even more coupled to the pixel clock. If your hub fetch mechanism is slaved to the pixel clock, then variations in reading the memory due to the 8-channel concurrent DMA won't pose a problem.
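The rule-of-thumb arithmetic can be written out in a few lines of Python. The constant names are mine; the numbers are the ones from this thread.

```python
CYCLES_PER_INSTRUCTION = 2   # most common cog instructions
COGS_SHARING_HUB = 8         # cogs competing for hub memory
EMULATION_MARGIN = 3         # rule of thumb: emulate at 3x the target speed
TARGET_MHZ = 6.25            # emulated Gigatron clock


def p2_clock_mhz(multiplier, target_mhz=TARGET_MHZ):
    """P2 clock needed for a given multiple of the emulated clock."""
    return multiplier * target_mhz


base_multiplier = CYCLES_PER_INSTRUCTION * COGS_SHARING_HUB * EMULATION_MARGIN

print(base_multiplier)                # 48
print(p2_clock_mhz(base_multiplier))  # 300.0
print(p2_clock_mhz(52))               # 325.0
```

So the plain 48x figure lands on 300 MHz, the same clock as the early mockup, and the 52x figure lands on the 325 MHz of the finished emulator.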

If one wants to use my idea for the vCPU-only emulator, then a way to do that might be to use a P2 cog as a management engine and do all the startup, I/O management, memory mapping, the menu, RNG, keyboard polling, and the loader in P2 native code, and use another P2 cog for vCPU emulation, with a third cog for standard I/O, possibly a fourth for Pluggy, and possibly a fifth for other I/O. The vCPU emulator could also include a halt line for compatibility reasons and let the video cog gate that when/if desired (or do a selective halt mode where indirection table, sound, and frame buffer writes only happen during the porches, so you get improved processing while reducing possible sound issues or skipped frames). And the syscalls would need to be implemented as well. There is really only one way to know if this would work...