Making stdio leaner
The stdio C library was designed for minicomputers such as the PDP-11. Compared to such machines, a Gigatron 32K leaves very little memory to store a program. Unfortunately, the traditional implementation of the stdio library makes it an all-or-nothing proposition. Consider for instance the famous "Hello world!" program:
#include <stdio.h>
int main() {
    printf("Hello world!");
    return 0;
}
- This program needs to link the code of the function printf(). This is a complex function that needs a lot of code to handle all the formatting features dictated by ANSI C, even though the program does not use these features.
- Since printf is able to print integers, one needs to import all the code that converts integers into strings.
- Since printf is able to print longs and doubles, one needs to link all the code that converts such numbers into strings. On the Gigatron, this requires a good chunk of the emulation code that manipulates longs and doubles.
- Since printf uses stdout, one needs to import the array of FILE descriptors which also contains stdin.
- Since one imports stdin, one needs to import all the code that reads lines from the console, understands backspace, etc.
- Since stdio is designed to handle filesystems and contains lots of features to handle errors, one also needs to import all the code that checks for error conditions, even though this program will never use them.
All this represents a lot of code that will never run.
This has been a struggle since the beginnings of GLCC. Early GLCC versions already contained a weak linking hack that keeps printf from importing the long and double emulation code unless that code is used elsewhere. Symbols named "__glink_weak_XXX" are always defined to be equal to symbol "XXX" if "XXX" is defined and to zero otherwise. For instance, when printf() calls the subroutine that prints a double, it calls __glink_weak_doprint_double() instead of _doprint_double(). But one still needs to ensure that the real _doprint_double() is imported when needed. This is achieved with the so-called conditional import logic supported by glink.
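The weak reference idea can be modelled on an ordinary host compiler. The sketch below is only an analogy: glink resolves "__glink_weak_XXX" symbols at link time, whereas here a function pointer is set by hand. The names doprint_double, weak_doprint_double, format_double and link_double_support are made up for the illustration, and printing "?" when the support code is missing is an arbitrary choice, not what GLCC does.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper standing in for the routine that formats a double.
   In a real library it would sit in its own module and be linked only
   when something references it directly. */
static int doprint_double(char *out, size_t n, double x)
{
    return snprintf(out, n, "%.2f", x);
}

/* Host-side stand-in for a "__glink_weak_XXX" symbol: either the address
   of the real routine or zero. This is an assumption made for the sketch;
   glink fills in the value at link time. */
typedef int (*double_printer)(char *, size_t, double);
static double_printer weak_doprint_double = NULL;

/* The formatter only goes through the weak reference, so by itself it
   never forces the double code into the program. */
static int format_double(char *out, size_t n, double x)
{
    if (weak_doprint_double == NULL)
        return snprintf(out, n, "?");   /* arbitrary fallback for the sketch */
    return weak_doprint_double(out, n, x);
}

/* Models the conditional import: the program uses doubles somewhere,
   so the real routine becomes reachable through the weak reference. */
static void link_double_support(void)
{
    weak_doprint_double = doprint_double;
}
```

Before link_double_support() runs, format_double() produces the fallback text; afterwards it produces the formatted number. The point is that the caller never names the heavy routine directly.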
Alas, this is insufficient because printf() still needs all the code that parses the % formatting specifications and calls their respective implementations. This is why the library offers alternate printf() functions such as mincprintf(), which only understands %s and %d without any adornment. The latest version of GLCC offers an intermediate version, named midcprintf(), which is a lot more useful because it handles a richer subset of formatting specifications, including field sizes such as "%04x" or "%-12s". More importantly, GLCC offers a new linker option --option=PRINTF_SIMPLE that uses the midcprintf formatting routine in all printf-like functions such as printf, cprintf, sprintf, etc.
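To give a feel for how little code a formatter needs once it gives up flags, widths and the long and double conversions, here is a minimal sketch of a formatter that understands only %s, %d and %%. It is not the GLCC source of mincprintf(); the names emit and tiny_format are hypothetical, and it writes into a buffer rather than to the console so that it runs on any host.

```c
#include <stdarg.h>
#include <stddef.h>

/* Append one character, always leaving room for the terminating NUL. */
static void emit(char *out, size_t n, size_t *pos, char c)
{
    if (*pos + 1 < n)
        out[(*pos)++] = c;
}

/* Hypothetical tiny formatter: only %s, %d and %% are recognized.
   Returns the number of characters stored, not counting the NUL. */
static int tiny_format(char *out, size_t n, const char *fmt, ...)
{
    va_list ap;
    size_t pos = 0;

    va_start(ap, fmt);
    for (; *fmt; fmt++) {
        if (*fmt != '%') {
            emit(out, n, &pos, *fmt);
            continue;
        }
        fmt++;                              /* look at the conversion letter */
        if (*fmt == '\0') {
            break;                          /* lone '%' at the end: stop */
        } else if (*fmt == 's') {
            const char *s = va_arg(ap, const char *);
            while (*s)
                emit(out, n, &pos, *s++);
        } else if (*fmt == 'd') {
            int v = va_arg(ap, int);
            /* Negate in unsigned arithmetic so INT_MIN is handled too. */
            unsigned int u = (v < 0) ? 0u - (unsigned int)v : (unsigned int)v;
            char digits[3 * sizeof(int)];   /* enough decimal digits for an int */
            int i = 0;
            if (v < 0)
                emit(out, n, &pos, '-');
            do {                            /* digits come out in reverse */
                digits[i++] = (char)('0' + u % 10);
                u /= 10;
            } while (u != 0);
            while (i > 0)
                emit(out, n, &pos, digits[--i]);
        } else if (*fmt == '%') {
            emit(out, n, &pos, '%');
        } else {                            /* unknown conversion: copy as is */
            emit(out, n, &pos, '%');
            emit(out, n, &pos, *fmt);
        }
    }
    va_end(ap);
    if (n > 0)
        out[pos] = '\0';
    return (int)pos;
}
```

A call such as tiny_format(buf, sizeof buf, "%s: %d%%", "x", -42) exercises all three conversions. Every feature added beyond this (field widths, padding, hexadecimal, longs) costs extra parsing code, which is the trade-off between mincprintf(), midcprintf() and the full printf().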
But this is not enough. Besides CR ('\r'), NL ('\n'), and BS ('\b'), stdio output supports a number of control characters such as TAB ('\t'), FF ('\f'), and even BELL ('\a'), which emits a beep when printed. GLCC now has a new option --option=CTRL_SIMPLE to prevent importing this extra code.
But this is still not enough. The latest implementation of stdio in GLCC breaks with tradition and defines stdin and stdout as separate variables, so that our hello world program does not need to import all the reading code. It also breaks with tradition by leaving buffering and error signaling entirely to the driver code. As long as one only uses file descriptors that hit the console, no buffering or error signaling is needed. This was inspired by picolibc (https://github.com/picolibc/picolibc) and adapted for the Gigatron.
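The following sketch illustrates the general shape of such a design: each stream is its own small object that points at driver routines, and the console output driver does no buffering and keeps no error state. This is a simplified model written for a host compiler, not the actual GLCC or picolibc data structures; struct stream, console_put, out_stream and stream_puts are all hypothetical names.

```c
#include <stddef.h>

/* Hypothetical stream layout: a stream is little more than a pair of
   pointers to the driver routines that do the real work. */
struct stream {
    int (*put)(struct stream *s, char c);   /* NULL if not writable */
    int (*get)(struct stream *s);           /* NULL if not readable */
};

/* Stand-in for the console: characters land in a small text buffer. */
static char console_text[64];
static size_t console_len = 0;

/* Console output driver: writes straight through, with no buffering
   and no error flags to maintain in the stream itself. */
static int console_put(struct stream *s, char c)
{
    (void)s;
    if (console_len + 1 >= sizeof console_text)
        return -1;
    console_text[console_len++] = c;
    console_text[console_len] = '\0';
    return (unsigned char)c;
}

/* The output stream is a separate object. Nothing here mentions an input
   stream, so a program that only writes never references the reading code. */
static struct stream out_stream = { console_put, NULL };

/* Generic layer: dispatches to whatever driver the stream carries. */
static int stream_puts(struct stream *s, const char *str)
{
    if (s->put == NULL)
        return -1;
    for (; *str; str++)
        if (s->put(s, *str) < 0)
            return -1;
    return 0;
}
```

With a single array holding all the standard streams, referencing the output entry would drag in the input entry and its line-editing driver as well. Keeping the objects separate lets the linker leave the unused ones out.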
How much progress does this represent? Here are two tables that report the size of the hello world program with a variety of options and implementations. The code for this is in https://github.com/lb3361/gigatron-lcc/ ... tuff/hello. The first table gives the sizes achieved with the latest GLCC version.
                           +---------------------------------+
                           |           GLCC-2.2-23           |
                           | -rom=v5a | -rom=v6  | -rom=dev7 |
+--------------------------+----------+----------+-----------+
| glcc                     |     4063 |     4025 |      3558 |
|   --option=CTRL_SIMPLE   |     3874 |     3836 |      3369 |
| \ --option=PRINTF_SIMPLE |     2657 |     2622 |      2274 |
+--------------------------+----------+----------+-----------+
| glcc -Dprintf=cprintf    |     3756 |     3722 |      3252 |
| \ --option=PRINTF_SIMPLE |     2536 |     2498 |      2154 |
+--------------------------+----------+----------+-----------+
| glcc -Dprintf=midcprintf |     2533 |     2498 |      2149 |
| glcc -Dprintf=mincprintf |     1955 |     1917 |      1622 |
+--------------------------+----------+----------+-----------+
| glcc -DUSE_CPUTS         |     1452 |     1452 |      1261 |
+--------------------------+----------+----------+-----------+
| glcc -DUSE_CONSOLE       |     1448 |     1448 |      1256 |
+--------------------------+----------+----------+-----------+
| glcc -DUSE_RAWCONSOLE    |      808 |      808 |       695 |
| \ --no-runtime-bss       |      641 |      641 |       539 |
+--------------------------+----------+----------+-----------+
The second table gives the GLCC-2.2 sizes for the options that GLCC-2.2 supports.
                           +---------------------------------+
                           |            GLCC-2.2             |
                           | -rom=v5a | -rom=v6  | -rom=dev7 |
+--------------------------+----------+----------+-----------+
| glcc                     |     5696 |     5625 |      4915 |
+--------------------------+----------+----------+-----------+
| glcc -Dprintf=cprintf    |     4382 |     4312 |      3800 |
+--------------------------+----------+----------+-----------+
| glcc -Dprintf=mincprintf |     2246 |     2209 |      1888 |
+--------------------------+----------+----------+-----------+
| glcc -DUSE_CONSOLE       |     1736 |     1736 |      1528 |
+--------------------------+----------+----------+-----------+
| glcc -DUSE_RAWCONSOLE    |      808 |      808 |       695 |
| \ --no-runtime-bss       |      641 |      641 |       539 |
+--------------------------+----------+----------+-----------+
This shows that without any option, the hello world program on ROMv6 shrinks from 5625 bytes to 4025 bytes because it no longer imports all the reading code and skips most of the error-setting code. Using --option=CTRL_SIMPLE reduces this to 3836 bytes, which gets close to the cprintf solution -Dprintf=cprintf, which weighs 3722 bytes. The difference between these two numbers, 114 bytes, reflects the cost of checking the error flags and dispatching the output request to the console driver. Adding --option=PRINTF_SIMPLE reduces both numbers to 2622 and 2498 bytes for the printf and cprintf solutions respectively.
The hello.c program https://github.com/lb3361/gigatron-lcc/ ... lo/hello.c also contains alternate hello world implementations. The most basic one, -DUSE_RAWCONSOLE, only uses the very low level functions _console_clear() and _console_printchars(), producing an executable that can be as small as 641 bytes (when one disables the bss initialization to save the last 160 bytes or so). Using console_print() instead with -DUSE_CONSOLE grows this to 1448 bytes because it adds cursor management and scrolling capabilities. Using conio's function cputs() with -DUSE_CPUTS yields about the same size. So there remains a gap of about 1000 bytes between these specialized hello world programs and the generic hello world program that uses printf. This is progress.