Beyond the Gigatron

Sugarplum
Posts: 81
Joined: 30 Sep 2020, 22:19

Beyond the Gigatron

Post by Sugarplum »

I've thought of other ways to mod the Gigatron. For instance, with an I/O expander board, why not connect the ports to it and let it provide the keyboard, video, and sound? The ports could be repurposed in the ROM to send and receive commands to and from the expander/controller. You could even emulate interrupts that way: every so often, the ROM reads the In port, takes the data from it, and uses the jump-to-address trick to go to the relevant handler. A device could even request a functional "halt" or DMA access this way. If a device wants the Gigatron to halt, it sends a signal requesting the halt/DMA time; the ROM sees that, maybe outputs an acknowledge code, and then spins, reading the In port until a clear signal is found. The CPU leaves the RAM untouched during this time, so external devices are free to manipulate it. The controller would be free to snoop the bus for the video and sound, understand the indirection table system, provide its own syncs, accept input and place it directly into the RAM, produce its own sound and video, and more. All Pluggy and Pluggy Reloaded functionality could go on that board, and file I/O assistance could be added as well. So you have the much wider parallel pipe and microcontroller assistance in one place, which would allow more communication with an outside controller that works mostly out of memory.
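
To make the halt/DMA handshake a bit more concrete, here is a rough Python simulation of the polling loop. The port codes (REQ_HALT, REQ_CLEAR, ACK_HALT) and the function name are made up for illustration; nothing here comes from an actual ROM.

```python
# Hypothetical codes seen on the In port / written to the Out port.
REQ_NONE, REQ_HALT, REQ_CLEAR = 0x00, 0x01, 0x02
ACK_HALT = 0xA5


def run_cpu(in_port_samples, ram):
    """Simulate the ROM polling the In port once per 'every so often'.

    in_port_samples is a scripted sequence of In-port reads.
    Returns (bytes written to the Out port, polls spent halted).
    """
    out_log = []
    halted_polls = 0
    samples = iter(in_port_samples)
    for value in samples:
        if value == REQ_HALT:
            out_log.append(ACK_HALT)      # acknowledge the request
            for value in samples:         # spin: keep reading, never touch RAM
                if value == REQ_CLEAR:
                    break
                halted_polls += 1
        else:
            ram[0] = (ram[0] + 1) & 0xFF  # stand-in for normal work
    return out_log, halted_polls


ram = [0]
print(run_cpu([0, 0, REQ_HALT, 0, 0, 0, REQ_CLEAR, 0], ram), ram)
```

While the inner loop spins, the simulated CPU never writes to `ram`, which is the window an external device would use for its own accesses.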

If you can move all the bit-banging to a controller board, you would be free to clock the base machine at any speed you want without new ROMs. The controller could update at least the vertical sync in memory, or otherwise make the base machine aware of when it changes, so that software that relies on a real-time clock could still work. That would also allow dynamic profiling on boot to find out how many machine cycles there are per frame, and maybe per raster line as well. The ROM would then be free to alter its behavior dynamically based on the speed difference between the syncs and the base machine.
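
The boot-time profiling could look something like the sketch below: spin in a loop of known cost between two vertical syncs and multiply back. The 60 Hz frame rate, 525 lines per frame, and the 4-cycle loop cost are assumptions picked for the example, not values from any ROM.

```python
def profile_boot(cpu_hz, frame_hz=60, lines_per_frame=525, loop_cost=4):
    """Estimate machine cycles per frame and per raster line.

    cpu_hz is only used to fake what the hardware would do; a real ROM
    would simply count loop iterations between two sync changes.
    """
    cycles_between_syncs = cpu_hz // frame_hz        # what actually elapses
    iterations = cycles_between_syncs // loop_cost   # what the ROM can count
    cycles_per_frame = iterations * loop_cost
    cycles_per_line = cycles_per_frame // lines_per_frame
    return cycles_per_frame, cycles_per_line


print(profile_boot(6_250_000))    # stock clock speed
print(profile_boot(12_500_000))   # base machine clocked twice as fast
```

The estimate is only as fine as the loop cost, so a short counting loop gives a tighter number.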

I haven't forgotten about my 75+ MHz Giga-similar machine idea, but I likely will never do it. It does sound neat: a 3-4 stage pipeline, plus more native instructions like full shifting, multiplication, division, random numbers, more registers, some native 16-bit support, etc. The stages would be Fetch, Decode, Access, and Execute. Access would come before Execute because instructions only ever operate on values read from memory, never on values being written. So if you need to write/store, that could be done in the next instruction, using its Access stage. Plus, it would be neat to also have an auxiliary ALU in the Access stage to use that slot another way when RAM is not needed. So you could natively do 16-bit addition/subtraction/logic, but only using registers. The extra "ALU" could also provide random numbers when it is not being used for instructions and allow the result to be manipulated in the next pipeline slot (such as inverting it or adding an offset). Additional registers would need to be added if you still want to do bit-banging at 75+ MHz. That way, the video thread and the vCPU thread would both have their contexts live at the same time and could switch without penalty. So you could spend 1 clock on a pixel, then 11 instructions on vCPU, then output another pixel, etc. At such speeds, you really don't need much external assistance, as you'd have more power than you need. But my idea for how to make fast, custom control units would be costly, inefficient, and require SMD parts for most things. The idea would be to use memory for the CU and the ALU(s), copy from ROM to fast SRAMs on boot, and use LUTs for everything.
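
The "1 clock for a pixel, 11 for vCPU" split works out neatly at 75 MHz. A quick check in Python, where the function name and the 11-slot default are just for illustration:

```python
def interleave(clock_hz, vcpu_slots_per_pixel=11):
    """Split a clock into one pixel slot plus N vCPU slots per group."""
    slots = vcpu_slots_per_pixel + 1
    pixel_hz = clock_hz / slots
    vcpu_hz = clock_hz * vcpu_slots_per_pixel / slots
    return pixel_hz, vcpu_hz


pixel_hz, vcpu_hz = interleave(75_000_000)
print(pixel_hz, vcpu_hz)
```

At 75 MHz the pixel slot comes around 6.25 million times a second, which is the same rate as the stock Gigatron's 6.25 MHz clock, and the other 68.75 million slots per second are left for vCPU.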

***
Moving on

The more I think about things, I might want to just mess with a Propeller 2 chip and make my own ISA, memory map, etc. With 8 cogs, that is enough to have at least one CPU, one or more coprocessors, sound, I/O, etc. But I don't know what instruction set and features I'd like to add.

Instructions and ISA
While I could use the native P2 instructions, I think it might be more fun to make my own. I don't know what all to include. Probably include most of what is in the 6502 and/or vCPU instruction set, and if there is any space in the opcode map left over, add things like RNG, Mult, Div, and maybe a trig function or 2.

As for the ISA size, I haven't worked that out yet. I'd love to get to a point where I can use word memory with 20 address lines as external memory. That sounds like a challenge. With 16 data lines and up to 5 control lines (word and wider memory add a control line per byte), that would take 41 GPIO lines out of 56 non-shared lines (64 in total). That isn't too bad. The 15 left would go to something like 5 for video (built-in DAC), 2 for keyboard, 5 or whatever for SD, and maybe 2 for sound. If more are needed, maybe the external memory could be multiplexed.
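
The pin count can be tallied in a few lines of Python. The numbers are the ones from the paragraph above; the function and dictionary names are made up.

```python
def pin_budget(address=20, data=16, control=5, usable_pins=56):
    """Tally GPIO use for external word memory plus peripherals."""
    peripherals = {"video": 5, "keyboard": 2, "sd": 5, "sound": 2}
    memory = address + data + control           # lines tied up by the RAM
    left_after_memory = usable_pins - memory    # what the peripherals share
    spare = left_after_memory - sum(peripherals.values())
    return memory, left_after_memory, spare


print(pin_budget())
```

With these guesses there is one pin to spare, so anything more (a second SD line, extra control signals) is where multiplexing the memory bus would come in.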

As for the ISA, I'm not sure. If I want to use external 16-bit RAM, maybe have instructions with 8 and 16-bit opcodes. The byte instructions can have a byte for an operand. An 8-bit opcode with a 16-bit operand will tie up a doubleword, and I'm not sure what to do with the other byte: whether to let it access up to 256 byte registers, make it part of a 24-bit operand, or do both. The 24-bit operand might be a good thing, since it would allow an absolute jump across the entire range as an immediate. That might be better than how Intel did things, since even protected mode didn't access memory in a flat plane. Oh, it was presented to the user like that as an emulation, but it always used segment:offset under the hood, even when the user didn't see it, and despite 32-bit operands. I'm not sure what instructions I'd like to have beyond the basics. I'd like Mult, Div, RND, bounded RND, and, due to the overhead of it being an emulation, probably block instructions and maybe loop instructions. I'm not sure if I'd want elaborate memory instructions like ternary memory ops (e.g., [mem]+[mem]=[mem]).
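
One possible way to pack an 8-bit opcode and a 24-bit operand into two 16-bit words is sketched below. The layout (opcode in the high byte of the first word, operand split 8/16) is purely an assumption for illustration, and the opcode value is arbitrary.

```python
def encode_long(opcode, operand24):
    """Pack an 8-bit opcode and 24-bit operand into two 16-bit words."""
    if not 0 <= opcode <= 0xFF:
        raise ValueError("opcode must fit in 8 bits")
    if not 0 <= operand24 <= 0xFFFFFF:
        raise ValueError("operand must fit in 24 bits")
    word0 = (opcode << 8) | (operand24 >> 16)   # opcode + top operand byte
    word1 = operand24 & 0xFFFF                  # low 16 operand bits
    return word0, word1


def decode_long(word0, word1):
    """Inverse of encode_long."""
    opcode = word0 >> 8
    operand24 = ((word0 & 0xFF) << 16) | word1
    return opcode, operand24


words = encode_long(0x4C, 0x0ABCDE)   # e.g. a hypothetical absolute jump
print([hex(w) for w in words], decode_long(*words))
```

A 24-bit immediate covers 16M locations, so it comfortably reaches everything behind 20 address lines in one flat jump.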

Of course, I'd need to decide on what to do for sound and video. I'd want no fewer than 4 sound channels, and probably 15-18 kHz as the top frequency. For accuracy, I'd likely want to use an external crystal. Sure, the P2 has an internal clock that does around 20 MHz, but that varies per chip. The exact frequency doesn't matter, so long as it is known and doesn't drift. One can code the ROM to use the internal PLL and VCO to get whatever is needed. I also don't know what waveforms and capabilities to provide. Obviously, I'd want square, ramp, triangle, and noise. Sine might be nice to have, as well as combination waveforms and near-instrument sounds. I might want a sound coprocessor besides just a sound generator to produce more complex sounds.
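
For the basic waveforms, a phase accumulator per channel is one common approach. The sketch below assumes a 16-bit phase, 8-bit output samples, and a 48 kHz sample rate; all of those are placeholders rather than decisions.

```python
def phase_step(freq_hz, sample_rate_hz, bits=16):
    """Amount to add to the phase accumulator per output sample."""
    return round(freq_hz * (1 << bits) / sample_rate_hz)


def square(phase, bits=16):
    return 255 if phase >> (bits - 1) else 0    # top bit picks high/low


def ramp(phase, bits=16):
    return phase >> (bits - 8)                  # top 8 bits of the phase


def triangle(phase, bits=16):
    r = phase >> (bits - 9)                     # 9-bit ramp, 0..511
    return r if r < 256 else 511 - r            # fold the upper half down


def noise_step(lfsr):
    """One step of a 16-bit Fibonacci LFSR (taps 16, 14, 13, 11)."""
    bit = (lfsr ^ (lfsr >> 2) ^ (lfsr >> 3) ^ (lfsr >> 5)) & 1
    return (lfsr >> 1) | (bit << 15)


step = phase_step(440, 48_000)
phase = 0
samples = []
for _ in range(8):
    samples.append((square(phase), ramp(phase), triangle(phase)))
    phase = (phase + step) & 0xFFFF             # wrap at 16 bits
print(step, samples)
```

Each channel only needs its own phase and step, so four or more channels cost little beyond four accumulators and a mixer.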

For video, I don't know if I want 320x240 or what. I'd want a text mode. I'm not sure what features I'd want. I'd probably want hardware scrolling and sprites. I don't know if I'd want to use 2 cogs for video or not. You could use 2 (preferably neighboring to use shared LUT RAM) and have one for rendering/effects and one for output. Some old computers did it that way, namely the Ataris. You had a chip to render on the fly and another to handle the output.

Any ideas? Wishlist?
Sugarplum
Posts: 81
Joined: 30 Sep 2020, 22:19

Re: Beyond the Gigatron

Post by Sugarplum »

Something else to consider, if one wants to move beyond the Gigatron and still keep compatibility with it, is alternatives to the GT1 format with new extensions. For instance, if one wants to use word RAM and move to a true 16-bit machine, one could use mostly the same format as the GT1 but use words for the cargo, with the segment length being the number of words.
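
Here is a rough Python reader for that idea. It follows the GT1 layout as I understand it (high address byte, low address byte, length byte where 0 means 256, then the data, with a zero high byte ending the segment list and a two-byte entry point after it). The `word_cargo` switch and the little-endian word order are assumptions made up for this sketch.

```python
def parse_gt1(blob, word_cargo=False):
    """Parse a GT1-style stream; with word_cargo, lengths count 16-bit words."""
    unit = 2 if word_cargo else 1
    segments = []
    pos = 0
    while True:
        hi = blob[pos]
        if hi == 0 and segments:          # zero high byte after the first
            pos += 1                      # segment ends the list
            break
        lo = blob[pos + 1]
        count = blob[pos + 2] or 256      # a length byte of 0 means 256
        pos += 3
        raw = blob[pos:pos + count * unit]
        if len(raw) != count * unit:
            raise ValueError("truncated segment")
        pos += count * unit
        if word_cargo:                    # assumed little-endian words
            cargo = [raw[i] | (raw[i + 1] << 8) for i in range(0, len(raw), 2)]
        else:
            cargo = list(raw)
        segments.append(((hi << 8) | lo, cargo))
    entry = (blob[pos] << 8) | blob[pos + 1]
    return segments, entry


byte_file = bytes([0x02, 0x00, 0x02, 0xAA, 0xBB, 0x00, 0x02, 0x00])
word_file = bytes([0x02, 0x00, 0x02, 0x34, 0x12, 0x78, 0x56, 0x00, 0x02, 0x00])
print(parse_gt1(byte_file))
print(parse_gt1(word_file, word_cargo=True))
```

The header and terminator stay the same in both cases; only the size of each cargo item changes.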

TBH, a new memory map should be used in the above case, preferably with the most important addresses on word-aligned boundaries. Then, a word-based memory map could be more feasible, and an extended machine with word memory and word native instructions would be possible. Since the native core machine is 8 bits, no thought was given to alignment and misalignment penalties, since those only apply when memory is wider than 8-9 bits. There is 8-bit, 16-bit, 32-bit, and the less common 24-bit memory. Plus there is parity RAM and FPGA BRAM, which can be 9, 18, 27, or 36 bits. You don't have to use it as parity RAM (where you have circuitry to count the 1s and set the extra bit if the number of 1s is even). You could use the extra bits for other things. If it is 8-16 (or 8-24, 8-32, 9-18, etc.) then you don't have to worry about alignment, as all fetches, loads, and stores are 8 bits at a time. So there is no additional overhead, since you are using the maximum overhead regardless. But if you have 16 bits and want to do a 16-bit transfer, you'd prefer to move 16 bits at a time. A misaligned access could add up to 2 cycles in some systems, though in this case it could add just one due to the load with increment feature. On an unaligned read, the microcode or core ROM would have to read the upper byte of the base address and the lower byte of the next address. So if I (or anyone) make something like the Gigatron with 16-bit RAM and native instructions to use it, then I'd want to use a new memory map that puts everything important on even addresses (if using a byte map and word memory) or requires 16-bit addresses.
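
The unaligned case looks like this in a small Python model. It assumes byte addressing on top of 16-bit words with the low byte at the even address; that ordering is an assumption for the example.

```python
def read16(words, byte_addr):
    """Read a 16-bit value at a byte address from word-wide memory.

    Returns (value, number of memory accesses needed).
    """
    index, odd = divmod(byte_addr, 2)
    if not odd:
        return words[index], 1            # aligned: one access
    low = words[index] >> 8               # upper byte of the base word
    high = words[index + 1] & 0xFF        # lower byte of the next word
    return (high << 8) | low, 2           # unaligned: two accesses


memory = [0x2211, 0x4433]                 # bytes in order: 11 22 33 44
print(read16(memory, 0), read16(memory, 1), read16(memory, 2))
```

The odd address costs a second access, which is the penalty a word-aligned memory map would avoid for the important locations.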

And then there are the allowed segment addresses. I mean, GT1 files only provide for pages and offsets. Yet I hear talk of loading programs larger than 64K. How do they do that? With a new format such as what I proposed above that uses words for program data, one could include a header to specify whether the code is intended to run on a machine with an explicit memory segment register or not. So it is a way of specifying whether there are 2-byte, 3-byte (or more) addresses.

Then, another consideration is how one might do overlays or DLL equivalents. Right now, there is no way of using files with more code than can fit into memory at once while still making use of all of that code (beyond using GT1 as an animated bitmap display format). Sure, GT1 files can overwrite any memory they have already written to, but they can only run what was placed there last. This hasn't really been discussed because folks are still working on mass storage peripherals and gaining speed there. There is no use in making a game with a huge playfield map if it takes two minutes to load the next map fragment every time you reach a threshold. But assuming overlays can be used, it would be nice to have a file extension, either for an overlay-specific file or even an extensible format that includes internal overlays. So how would such a program work? You'd have initialization code, common code and global variables, an overlay manager, and overlay modules. Any initialization code or splash screens would be fair game to evict once they are executed. There would likely need to be an API to select the file segments and jump to those. Maybe the overlay file should have a table in a fixed location within it with the addresses of each starting code fragment. Then from there, it could be similar to a GT1 file. So you have the master table of file locations for each code module (possibly with IDs if needed, or at least a count of how many exist), and then at each of those, you have the pages and offsets, lengths, and code for as much as needed, and then the entry point (like a GT1 file).
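
A minimal sketch of such a master table in Python follows. The layout (a one-byte module count at offset 0, then one 32-bit little-endian file offset per module, then the GT1-like module streams back to back) is invented for illustration, as are the function names.

```python
import struct


def build_overlay_file(modules):
    """Concatenate GT1-like module streams behind a master offset table."""
    table_size = 1 + 4 * len(modules)
    offsets = []
    pos = table_size
    for module in modules:
        offsets.append(pos)               # where this module starts in the file
        pos += len(module)
    table = bytes([len(modules)]) + b"".join(struct.pack("<I", o) for o in offsets)
    return table + b"".join(modules)


def read_overlay_table(blob):
    """Return the list of module offsets stored at the fixed location."""
    count = blob[0]
    return list(struct.unpack_from("<%dI" % count, blob, 1))


def overlay_bytes(blob, index):
    """Slice out one module so an overlay manager could load it."""
    offsets = read_overlay_table(blob)
    end = offsets[index + 1] if index + 1 < len(offsets) else len(blob)
    return blob[offsets[index]:end]


modules = [b"\x02\x00\x01\xAA\x00\x02\x00", b"\x03\x00\x01\xBB\x00\x03\x00"]
image = build_overlay_file(modules)
print(read_overlay_table(image), overlay_bytes(image, 1))
```

An overlay manager would then only need the module index: look up the offset, seek there, and load from that point the same way a GT1 file is loaded.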