`char` to `int` in the generated object code, regardless of whether this made any particular sense. For example, if `x` and `y` were of type `char`, and were to be added with the sum placed in `z`, then the assembly output would execute something resembling the following steps (a C rendition, with this instruction sequence sketched in comments, appears after the list):

- Load `x` into an eight-bit register, which on the 8080 or Z80 would normally be the `a` register.
- Move the `a` register to the lower half of a sixteen-bit register, for example, the `l` portion of the `hl` register.
- Test the sign bit of the `a` register.
- Using one of a number of obscure instruction sequences, fill the upper half of the sixteen-bit register (in this example, `h`) with eight copies of this sign bit.
- Repeat the above steps to get a sign-extended copy of `y` into (say) the `de` register.
- Move `l` to `a`.
- Add `e` to `a` (which sets the carry bit in the condition-code register appropriately).
- Move `a` to (say) `c`.
- Repeat the preceding three steps to add (with carry) `h` to `d`, placing the result in `b`.
- Move `b` to `a`, never mind that this is where we just got `b` from.
- Store `a` into `z`.
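For concreteness, here is a minimal sketch in C of the case being discussed, with the naive instruction sequence from the list above rendered as a comment. The mnemonics are approximate 8080-style assembly and are illustrative only, not the output of any particular compiler:

```c
#include <stdio.h>

char x = 100;
char y = 27;
char z;

int main(void)
{
	/* With char promoted to int, an early compiler might emit
	 * something like the following for the statement below
	 * (approximate 8080 mnemonics, illustrative only):
	 *
	 *	LDA x		; load x into a
	 *	MOV L,A		; a to low half of hl
	 *	...		; sign-extend a into h
	 *	...		; likewise, y sign-extended into de
	 *	MOV A,L
	 *	ADD E		; add low bytes, setting carry
	 *	MOV C,A
	 *	MOV A,H
	 *	ADC D		; add high bytes with carry
	 *	MOV B,A
	 *	MOV A,B		; never mind that a already holds b
	 *	STA z		; store the result into z
	 */
	z = x + y;
	printf("%d\n", z);
	return 0;
}
```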
This sort of assembly code sequence was simply not what you wanted to see when attempting to jam 50,000 lines of source code into a 64KB address space, even when given a full 256K of physical memory and an overlay loader hacked up to do bank switching. Other long-dead and unlamented languages were able to generate a pair of loads, an add, and a store, which did much to limit C's uptake on 8-bit platforms. Of course, these other languages had their own charming limitations, such as lacking any variable-initialization syntax (use assembly instead!), insanely inefficient `for` loops, strange restrictions on their equivalent of `union`, and much else besides. After all, the compiler had to fit in 48K, since the operating system (such as it was) consumed the remaining 16K. (My application dispensed with the operating system, and hence could use the entire 64K.)
The past three decades have seen C grow up to be almost unrecognizable, which, believe me, is a very good thing. This growth allows us to enjoy a huge number of language features, including:
- identifiers longer than eight characters (and longer than seven for `extern` identifiers).
- type-checked function arguments and return values (a short example follows this list).
- the deprecation of the `gets()` function.
- 32-bit and 64-bit integers.
- inline functions.
- standard `printf()` formats (yes, these really did vary from compiler to compiler).
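To illustrate the second item, here is a small sketch (the function name `add_chars` is mine, invented for illustration) showing what type-checked function arguments buy you compared with pre-prototype C:

```c
#include <stdio.h>

/* With a prototype in scope, the compiler checks each argument's
 * type and inserts the needed conversions itself. */
static int add_chars(char x, char y)
{
	return x + y;
}

int main(void)
{
	char a = 100;
	char b = 27;

	printf("%d\n", add_chars(a, b));	/* prints 127 */

	/* Pre-prototype (K&R) C would silently accept a call such as
	 * add_chars("oops", 42); with prototypes, passing a pointer
	 * where a char is expected is a diagnosable error. */
	return 0;
}
```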
But one feature that I hadn't noticed until recently is the empty structure declaration, which Peter Zijlstra recently introduced me to in the form of the non-lockdep definition of `struct lock_class_key`. A recently detected hang in RCU under extreme conditions caused me to want an array of `struct lock_class_key`, which of course turns into an array of empty structures in non-lockdep builds. To my surprise, this actually works, at least in gcc:
```c
#include <stdio.h>

struct foo {};

struct foo a[10];

int main(int argc, char *argv[])
{
	/* The casts to void * keep the %p conversions well-defined. */
	printf("%p %p\n", (void *)&a[0], (void *)&a[1]);
	return 0;
}
```
This generates an object module whose bss and data really are of zero size:
```
   text    data     bss     dec     hex filename
     66       0       0      66      42 arrayemptystruct.o
```
And different elements of the array really do have identical addresses:
```
./arrayemptystruct
0x8049588 0x8049588
```
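As a follow-up check of my own (not part of the original test program), `sizeof` tells the same story under gcc, whose C dialect gives empty structures zero size:

```c
#include <stdio.h>

struct foo {};

struct foo a[10];

int main(void)
{
	/* gcc's C dialect: an empty structure has size zero, so the
	 * entire array occupies no storage.  (C++ differs: an empty
	 * class there has size one.) */
	printf("%zu %zu\n", sizeof(struct foo), sizeof(a));
	return 0;
}
```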
The C language certainly has come a long way in the past three decades. And yes, after all these years, I am still easily amused.
Comments
AFAIK, this is part of the C standard.
Such a shortcut might well be prohibited on machines with ones-complement arithmetic -- depending on exactly how such machines and compilers deal with truncation from 16 bits to 8 bits in cases where the result does not fit in 8 bits.
However, for twos-complement machines, this optimization does work. And my version of gcc actually employs a variant of this optimization for -O, producing three instructions: a sign-extended load, a single-byte add, then a single-byte store.
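To make that concrete, here is a sketch of the kind of code the commenter describes. The x86 mnemonics are illustrative of "sign-extended load, single-byte add, single-byte store" rather than a transcript of any particular gcc version's output:

```c
char x, y, z;

void add_chars(void)
{
	/* At -O, a modern gcc can collapse the promoted arithmetic
	 * to roughly three instructions, for example (illustrative
	 * x86, AT&T syntax):
	 *
	 *	movsbl	x, %eax		# sign-extended load of x
	 *	addb	y, %al		# single-byte add of y
	 *	movb	%al, z		# single-byte store into z
	 */
	z = x + y;
}
```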