10MHz, 12.5MHz and Beyond!

Sugarplum
Posts: 93
Joined: 30 Sep 2020, 22:19

Re: 10MHz, 12.5MHz and Beyond!

Post by Sugarplum »

To go much beyond 12 MHz, you have to start rethinking other things.

Redesigning the ALU as a carry-select adder would likely help, because both carry possibilities for the high nybble are calculated at the same time and picked with only a minor switching delay, rather than calculating the nybbles in series. That adds only 2 more chips. Once you get to 20 MHz, you'd likely need to build the adders yourself from many of the fastest AND and XOR gates, such as Drass is using in his 100 MHz TTL/CMOS 6502 project. But those parts would not be 5 V tolerant at all. Once you get really fast, you'd want to split the arithmetic and logic functions into 2 separate units. Or you could distribute the math/logic across even more chips, where each operation has its own circuit.
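Here is a minimal Python sketch of the adder idea above: compute the high nybble for both possible carries while the low nybble is still being added, then select the right one (the scheme usually called carry-select). The function names `add4` and `carry_select_add8` are made up for illustration, and this only models the logic, not the chip timing.

```python
def add4(a, b, carry_in):
    """One 4-bit adder slice: returns (4-bit sum, carry out)."""
    total = (a & 0xF) + (b & 0xF) + carry_in
    return total & 0xF, total >> 4


def carry_select_add8(a, b, carry_in=0):
    """8-bit add built from three 4-bit slices plus a selector."""
    # Low nybble is added normally.
    lo, lo_carry = add4(a, b, carry_in)
    # High nybble is added twice, once per possible carry. In hardware
    # these two adders run at the same time as the low-nybble adder.
    hi0, carry0 = add4(a >> 4, b >> 4, 0)
    hi1, carry1 = add4(a >> 4, b >> 4, 1)
    # The "switch" (multiplexer): the low carry picks the valid result.
    hi, carry_out = (hi1, carry1) if lo_carry else (hi0, carry0)
    return (hi << 4) | lo, carry_out


if __name__ == "__main__":
    result, carry = carry_select_add8(0x0F, 0x01)
    print(hex(result), carry)
```

The extra adder and the selector would be the 2 additional chips; the gain is that the high nybble no longer waits for the low nybble's carry to ripple through.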

Then you reach the idea of adding another pipeline stage. That would require a new ROM and inserting registers into the control lines to separate the control unit from the ALU. Theoretically, that should allow for up to 50% more speed, depending on how balanced your latencies are between the stages. By that point, your ROM will be the limiting factor.
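To show why stage balance matters, here is a rough sketch: the clock period can't be shorter than the slowest stage. The latencies below are invented round numbers, not measurements of a real Gigatron, and `max_clock_mhz` is just a helper name for this example.

```python
def max_clock_mhz(stage_latencies_ns):
    """Clock ceiling in MHz when the slowest stage sets the period."""
    return 1000.0 / max(stage_latencies_ns)


# Hypothetical latencies in nanoseconds (assumptions for illustration).
two_stage = [60.0, 90.0]           # ROM fetch | control unit + ALU
three_stage = [60.0, 45.0, 45.0]   # ROM fetch | control unit | ALU

before = max_clock_mhz(two_stage)
after = max_clock_mhz(three_stage)
speedup = after / before

if __name__ == "__main__":
    print(f"2 stages: {before:.1f} MHz")
    print(f"3 stages: {after:.1f} MHz")
    print(f"speedup:  {speedup:.2f}x")
```

With these made-up numbers the split stage is halved, but the gain is capped by the 60 ns ROM stage, which is now the slowest one.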

When the ROM becomes the limiting factor, one can add the fastest 16-bit SRAM and a circuit that copies the ROM to the SRAM on boot, and then execute out of the shadow SRAM. So with a 3-stage pipeline and 40 ns SRAM shadowing the ROM, that would put you closer to 25 MHz. If you are not afraid of 3.3 V and lower voltages or SMT parts, you might find 8-10 ns SRAM (with a theoretical maximum of 100-120 MHz, depending on the other stages).
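The clock figures above follow from taking the reciprocal of the memory access time. A quick sketch, assuming the memory access is the only thing in that stage (no setup time, decoding or wiring delay), with `ceiling_mhz` as a made-up helper name:

```python
def ceiling_mhz(access_time_ns):
    """Best-case clock in MHz if one access must fit in one cycle."""
    return 1000.0 / access_time_ns


if __name__ == "__main__":
    for access_time_ns in (40, 10, 8):
        print(f"{access_time_ns} ns -> {ceiling_mhz(access_time_ns):.1f} MHz")
```

The raw reciprocal for an 8 ns part comes out slightly above the 120 MHz mentioned, so that figure presumably leaves a little margin for everything else in the stage.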

If one is not interested in native mode compatibility, one could rework the ISA to simplify the control unit. That would be incompatible with the existing native code, but you could still keep vCPU compatibility. Finding a way around chaining decoders would be desirable. If you can't avoid that, then maybe one could borrow a cue from the adder arrangement above: calculate multiple values at the same time and use a "switch" (multiplexer) to put the correct one on the bus, as determined by the earlier decoder.

Of course, the clock rate is not everything. Other speedups will be fruitful. The line repeater would allow you to use mode 4 all the time. Separating the video generation from the CPU would be helpful too. Doing that will increase performance at lower clock rates, but depending on how you do it, that could limit higher clock rates (unless you get more sophisticated with caches). More usable native opcodes would help gain speed through improving code density. Adding more registers and instructions that work with multiple data would speed up things, as would being able to run multiple instructions at the same time. More cores could help, but that depends on the software.
Last edited by Sugarplum on 31 May 2021, 19:46, edited 1 time in total.
ImmortanJoe
Posts: 2
Joined: 31 May 2021, 04:25

Re: 10MHz, 12.5MHz and Beyond!

Post by ImmortanJoe »

Hi. I've held off posting until I had the PCBs in my hands. They arrived today so I want to get started.

Given I just have the mainboard and pluggy, I'm left with the opportunity to put together the board how I see fit.

I was curious about the FLIR heat maps I saw on here. I'm a bit hazy on latency, but I wanted to know if it's only worthwhile uprating the components that are being run harder?

I'd like to do a modest clock increase if it's permissible. If so, I might as well just buy the right parts straight off.

I guess what I'm looking for is some help with a modified BOM?
monsonite
Posts: 101
Joined: 17 May 2018, 07:17

Re: 10MHz, 12.5MHz and Beyond!

Post by monsonite »

Joe,

I increased the clock frequency to 8MHz, just using the standard components, and I was still able to get a much compressed picture from my monitor.

I increased the frequency to 10MHz and although the monitor would no longer sync to this, the Blinkenlights still flashed confirming that the machine was still running.

I then swapped out almost all the chips for 74F (Fast) and got up to 12.5MHz.

Someone suggested using 74VHC series chips, as these have very low propagation time, but unfortunately not all are available, and not available in the DIL package.

I was happy to achieve 12.5MHz, and Marcel supplied me a special ROM with modified sync timings to cope with the doubled clock frequency.

Marcel also tried an experimental 4-layer PCB, which provided 5V power and ground planes for better power distribution and better EMC noise reduction. With this I believe he got up to about 15MHz.
walter
Site Admin
Posts: 160
Joined: 13 May 2018, 08:00

Re: 10MHz, 12.5MHz and Beyond!

Post by walter »

I have the 15MHz version here. It has a different SRAM (AS7C256A-10JCN) that is not available in DIP28, so it uses a small SOJ-to-DIP adapter board. The ROM is a fast 27C1024 EPROM (containing the special ROM v3y). CLK2 is not used; CLK1 is split off to CLK2 instead. The caps (C1 and C2) in the Pierce oscillator are removed. (The clock uses a 74HCT04.) The PCB is a 4-layer version. Unfortunately, the KiCad files of the 4-layer version are lost. Dave from EEVblog also made a quick & dirty version of a 4-layer board.
monsonite
Posts: 101
Joined: 17 May 2018, 07:17

Re: 10MHz, 12.5MHz and Beyond!

Post by monsonite »

Walter,

Marcel and I collaborated over the high speed version. I supplied the RAM on its adaptor pcb, the blank ROM and all of the 74F series ICs that were still available - in particular the 74F283 adders and 74F153 multiplexers to speed up the ALU. The counters in the Program Counter, X-reg and the data selectors in the MAU were also upgraded to 74F parts.

The other modification that might not be immediately obvious is that the pull-up resistors in the instruction decoder diode matrix were lowered to 600 ohms.

It's a tribute to Marcel's bullet-proof design that even the standard version would still run at 10MHz!
walter
Site Admin
Posts: 160
Joined: 13 May 2018, 08:00

Re: 10MHz, 12.5MHz and Beyond!

Post by walter »

Thanks for the addition.
Sugarplum
Posts: 93
Joined: 30 Sep 2020, 22:19

Re: 10MHz, 12.5MHz and Beyond!

Post by Sugarplum »

I guess I can add this here rather than start a new thread. I believe that adding a new pipeline stage would push you right to 18 MHz. By that point, the CU and the ROM would be the bottlenecks. While a 3-adder ALU would cut some latency from the ALU stage, the other 2 stages would be the bottleneck. Even if you get the ROM to 40 ns, the CU might be slightly slower, assuming the fastest chips. If you could speed up the CU, then the ROM would be the bottleneck. In that case, you might want a reset circuit that copies the ROM to an 8-10 ns SRAM. Using low-voltage SMD parts might help latencies overall. So about 20 MHz might be the overall limit unless you could make a faster CU.

As for other ways to boost speed, a few ideas come to mind. For instance, there is the I/O snooper coprocessor idea. That would include the line repeater idea (saving up to 57,600 original-speed cycles per frame) and remove the video reads (saving 19,200 more cycles), for a total saving of 76,800 original-speed cycles per frame. Adding sound or I/O writes to the controller could possibly save more time. With enough board and ROM tricks, I/O reads might even be possible, and even the FPU idea I've been floating. That would require inserting spinlocks or NOPs in the ROM.
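The cycle counts above can be reproduced with a few lines of arithmetic. This sketch assumes 160 pixels per line, 120 unique lines shown across 480 VGA lines, and one original-speed cycle per pixel; those constants are my reading of the figures, not something stated in the post.

```python
# Assumed video geometry (labelled assumptions, see the text above).
PIXELS_PER_LINE = 160
UNIQUE_LINES = 120
VGA_LINES = 480


def cycles_saved_per_frame():
    """Return (line repeater saving, video read saving, total)."""
    # Lines that only repeat an earlier line: a repeater could send them.
    repeated = PIXELS_PER_LINE * (VGA_LINES - UNIQUE_LINES)
    # Lines the CPU still reads out once: a snooper could take these over.
    unique = PIXELS_PER_LINE * UNIQUE_LINES
    return repeated, unique, repeated + unique


if __name__ == "__main__":
    print(cycles_saved_per_frame())
```

Under those assumptions the three numbers come out as 57,600, 19,200 and 76,800, matching the figures quoted.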

A weird idea comes to mind when it comes to instructions. With a 3-stage pipeline (and presumably 2 delay slots), while one likely won't get past 20 MHz, it might be possible to clock the ALU stage twice as fast, so that it works on both edges of the slower clock. Doing that would need something such as Drass's 6.7 ns adder (a bunch of SMD parts). That could allow extending the native instruction set with some 2-cycle instructions. The 1-place left shift instruction could work twice for a native x4 multiplication. However, even 2 cycles could be enough for full multiplication (at least with a fast enough LUT). If the RAM can handle this speed, then there might be enough time for 2 RAM loads/stores for 16-bit instructions.
Sugarplum
Posts: 93
Joined: 30 Sep 2020, 22:19

Re: 10MHz, 12.5MHz and Beyond!

Post by Sugarplum »

I think I know how to get a discrete-chip Gigatron up to 50 MHz. Do it using SMD, do a LUT-based control unit shadowed into 8 ns SRAM, shadow the instruction ROM too, use 3 pipeline stages, use Drass's 6.7 ns ALU, and use 8 ns system SRAM. Unless the SRAM can be uncoupled, about 66.667 MHz would be the absolute maximum, though the clock-delay trick could work.

And to clock such a monstrosity, one might want a 50 (or 66) MHz oscillator can and a high-speed clock buffer/amp for isolation. If one still wants to use the Pierce clock, one might need to recalibrate the R/C network and use mica capacitors (anything better than ceramic for the calculated values).

For 50 MHz, bit-banging would work fine. That means if you keep the resolution, you can have 7 instructions between each pixel (unless you want 56.25 - 62.50 MHz and get 8 or 9). With the LUT-based CU, you could even add maybe 2 more index registers to make bit-banging more efficient. The line repeater would be nice to add too, or add the FPGA snooper idea (mainly for more modes, memory, and better sound).
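The instruction counts above work out if the pixel rate stays at the stock clock and one instruction per pixel period is spent on the pixel itself. A small sketch under those assumptions (the 6.25 MHz pixel rate and the helper name are mine, inferred from the 12.5 MHz build being the "doubled" clock):

```python
PIXEL_CLOCK_KHZ = 6250  # assumed stock pixel rate: 6.25 MHz


def instructions_between_pixels(cpu_clock_khz):
    """Spare single-cycle instructions per pixel at a given CPU clock."""
    cycles_per_pixel = cpu_clock_khz // PIXEL_CLOCK_KHZ
    # One cycle of each pixel period is the instruction that outputs it.
    return cycles_per_pixel - 1


if __name__ == "__main__":
    for khz in (50000, 56250, 62500):
        print(khz / 1000, "MHz ->", instructions_between_pixels(khz))
```

Integer kHz values are used only to keep the division exact.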

Going that fast, it would then be good to add multitasking to the ROM. That way one can get overly fast games down to playable speeds and have more I/O time for possible mass storage options. With that kind of speed, one would likely want to use program overlays and external media.
Sugarplum
Posts: 93
Joined: 30 Sep 2020, 22:19

Re: 10MHz, 12.5MHz and Beyond!

Post by Sugarplum »

On further thought, I think I can see how 120 MHz might be possible. That would require going to a purer RISC model and being more accumulator-centric. The best ALU+RAM latency is probably about 15 ns, but if an instruction only uses the ALU or the SRAM, you can get that down to 8 ns. That would complicate bit-banging, though, since a read would require a load and then a bit-manipulation to the port. For writes, one would need to do any ALU ops and then a store.
lb3361
Posts: 360
Joined: 17 Feb 2021, 23:07

Re: 10MHz, 12.5MHz and Beyond!

Post by lb3361 »

Sugarplum wrote: 26 Oct 2021, 11:41
For 50 Mhz, bit-banging would work fine. That means if you keep the resolution, you can have 7 instructions between each pixel (unless you want 56.25 - 62.50 Mhz and get 8 or 9). With the LUT-based CU, you could even add maybe 2 more index registers to make bit-banging more efficient. The line repeater would be nice to add too or add the FPGA snooper idea (mainly for more modes, memory, and better sound).
It is probably impossible to split vCPU instructions into 7-cycle subunits. Maybe one could use a FIFO chip (as in the Video Repeater). The Gigatron would fill the FIFO at the beginning of each scanline, and the FIFO would deliver the pixels on time for the VGA screen. That way, all the inter-pixel time is consolidated in a single chunk that can be used to run the vCPU...
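A toy model of the FIFO idea, to show how the spare cycles end up in one block. Everything numeric here is an assumption for illustration: 8 CPU cycles per pixel period (the 50 MHz case), 160 visible pixels, 200 pixel periods per scanline, and one CPU cycle to push each pixel. The function name is made up as well.

```python
from collections import deque


def scanline_with_fifo(cycles_per_pixel=8, visible_pixels=160,
                       periods_per_line=200):
    """Simulate one scanline: burst-fill a FIFO, drain at pixel rate."""
    total_cycles = cycles_per_pixel * periods_per_line
    fifo = deque()
    shown = []
    max_depth = 0
    for cycle in range(total_cycles):
        # CPU side: push one pixel per cycle at the start of the line.
        if cycle < visible_pixels:
            fifo.append(cycle)  # pixel index stands in for pixel data
        max_depth = max(max_depth, len(fifo))
        # Video side: pop one pixel at the end of each pixel period.
        last_cycle_of_period = cycle % cycles_per_pixel == cycles_per_pixel - 1
        if last_cycle_of_period and len(shown) < visible_pixels:
            shown.append(fifo.popleft())
    # Cycles not spent pushing pixels form one contiguous chunk.
    free_cycles = total_cycles - visible_pixels
    return shown, free_cycles, max_depth


if __name__ == "__main__":
    shown, free_cycles, max_depth = scanline_with_fifo()
    print(len(shown), "pixels shown,", free_cycles, "free cycles,",
          "FIFO depth needed:", max_depth)
```

In this model the free cycles are no longer chopped into 7-instruction slices between pixels, which is the point of the suggestion; the price is a FIFO deep enough to hold most of a line.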