C!! – Redefining the native types of C++ for C bang bang | Accidental Scientist

This article is going to be a lot simpler than the other articles on C!! (which hopefully are still percolating a bit). Mainly because I want to handle something simple before I dive into deeper waters – like generators.

So if you were going to redefine C++’s type system, how would you do it?

My goals are to make the types explicit sizes where possible, and obvious where not. And some of these are (as with everything else in C!! at this point) up for debate.

By the way, simple doesn’t mean short, sadly. This is a long one.

Integer Types

Let’s start with the explicitly sized types. This should be easy.

Explicit-size Integer Types

Name                   Description
uint8, byte, char      Unsigned 8-bit integer
uint16, wchar, ushort  Unsigned 16-bit integer
uint32, uint           Unsigned 32-bit integer
uint64, ulong          Unsigned 64-bit integer
uint128                Unsigned 128-bit integer
uint256                Unsigned 256-bit integer
uint512                Unsigned 512-bit integer
uintn                  Unsigned n-bit integer. Reserved; n must be a power of 2.
int8                   Signed 8-bit integer
int16, short           Signed 16-bit integer
int32, int             Signed 32-bit integer
int64, long            Signed 64-bit integer
int128                 Signed 128-bit integer
int256                 Signed 256-bit integer
int512                 Signed 512-bit integer
intn                   Signed n-bit integer. Reserved; n must be a power of 2.

The idea behind uintn and intn is that they're for future expansion; they just block off that part of the namespace. I'm still noodling on the 128/256/512-bit integers; they might be doable with SIMD instructions, but they're not exactly general purpose. We could leave them out and stop at (u)int64. Either way, we'll reserve the syntax.
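For comparison, most of the explicit-size names above can already be expressed in today's C++ via <cstdint>. A sketch (the extra aliases from the table are noted in comments, since some of them are C++ keywords):

```cpp
#include <cstdint>

// Sketch: the explicit-size C!! names expressed as C++ aliases.
// In C!!, byte, char, wchar, ushort, uint, ulong, short, int and long
// would name these same types; they're shown as comments here because
// several are reserved words in C++.
using uint8  = std::uint8_t;   // byte, char
using uint16 = std::uint16_t;  // wchar, ushort
using uint32 = std::uint32_t;  // uint
using uint64 = std::uint64_t;  // ulong
using int8   = std::int8_t;
using int16  = std::int16_t;   // short
using int32  = std::int32_t;   // int
using int64  = std::int64_t;   // long

static_assert(sizeof(uint8) == 1 && sizeof(uint64) == 8,
              "explicit sizes must hold");
```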

Platform-specific Integer Types

Name   Description
uintr  Unsigned, aligned atomic integer
intr   Signed, aligned atomic integer

By default, uintr and intr are register-sized for a given target platform. So on x64, they’re guaranteed to be 64-bit. On x86, they’d be 32-bit. These types take the old-school meaning of int and unsigned int – that is, the native CPU word size.

By default, they also have the minimum atomic platform alignment for the type. (These are meant for speed).
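A rough C++ equivalent, assuming uintptr_t tracks the register width on the targets we care about (strictly, it's only guaranteed to be big enough to hold a pointer):

```cpp
#include <atomic>
#include <cstdint>

// Sketch: register-width integers with atomic-friendly alignment.
// On x64 these come out 64-bit; on x86, 32-bit.
using uintr = std::uintptr_t;
using intr  = std::intptr_t;

// The "minimum atomic platform alignment" falls out of std::atomic:
// an atomic of the native word is lock-free on mainstream targets.
static_assert(alignof(std::atomic<uintr>) >= alignof(uintr),
              "atomic alignment is at least the natural alignment");
```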

Backwards-compatible Integer types

Because we’re keeping compatibility with C++ and C where it makes sense, I’m not going to redefine the wheel entirely. But I’m also not going to go too crazy. As a concession to C++ programmers, let’s keep unsigned, signed, char, short, int, long. Let’s also keep wchar, but drop the _t.

I’m philosophically opposed to long long. (Remember: we’re not going for code-level backwards compatibility here, so I can drop it).

I stuck them in the table above for brevity.

Real types

The reals (as opposed to the integers) come in three flavors – floating point, fixed point, and decimal. (Yeah, I know… decimal isn’t a game-y thing, but it’s needed for completeness).

Floating-point types

Name                            Description
float16, half                   16-bit floating-point value (5 exponent, 10 fraction bits)
float32, float, single, scalar  32-bit floating-point value (8 exponent, 23 fraction bits)
float64, double                 64-bit floating-point value (11 exponent, 52 fraction bits)
float80, long double            80-bit floating-point value (15 exponent, 63 fraction bits)
float128, quadruple             128-bit floating-point value (15 exponent, 112 fraction bits)
floatn                          Reserved for the future

The proliferation of names is quite horrible, particularly for float32. We might want to cut them down to size a bit (but it’s so tempting to include scalar for symmetry with vector).

Fixed-point types

Fixed point comes in with a slightly different naming scheme:

Name       Description
fixedn_m   Signed fixed point with n two's-complement integer bits and m fraction bits
ufixedn_m  Unsigned fixed point with n integer bits and m fraction bits

The space taken by the fixed-point value will be rounded up to the nearest machine word size, unless it’s packed into a bitfield, in which case it’ll use n + m bits.
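A minimal sketch of what a fixed16_16 (16 integer bits, 16 fraction bits) could desugar to; the struct and method names are illustrative, not part of the proposal:

```cpp
#include <cstdint>

// fixed16_16: the real value times 2^16, stored in a 32-bit
// two's-complement integer.
struct fixed16_16 {
    std::int32_t raw;

    static fixed16_16 from_int(int v) { return { v * 65536 }; }  // v << 16
    int to_int() const                { return raw / 65536; }

    fixed16_16 operator+(fixed16_16 o) const { return { raw + o.raw }; }

    // Multiplication needs a 64-bit intermediate so the extra
    // fraction bits aren't lost before shifting back down.
    fixed16_16 operator*(fixed16_16 o) const {
        return { static_cast<std::int32_t>(
            (static_cast<std::int64_t>(raw) * o.raw) >> 16) };
    }
};
```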

Decimal is simply decimal32, decimal64 and decimal128 from IEEE-754 2008. We reserve decimaln for the future, like the other definitions.

SIMD Types

You could make a case here for these all being striped structs, but there are some advantages to them being native types. We need two: vector and matrix.

I’m tempted to go for a variant on HLSL syntax here:

Name                                  Description
vector2                               2 float32s (x, y)
vector3                               3 float32s (x, y, z). When promoted to vector, w is set to 0.
vector, vector4                       4 float32s (x, y, z, w)
vector2i                              Vector of 2 signed int32s
vector2u                              Vector of 2 unsigned int32s
vector3i, vector3u, vectori, vectoru  Signed/unsigned int32 vector variants
vectorh, vector2h, vector3h, …        Half-precision float vector variants
vectord, vector2d, vector3d           Double-precision variants
vectore, vector2e, vector3e           Extended-precision variants
vectorq, vector2q, vector3q           Quadruple-precision variants
matrixnxm                             n x m matrix of float32s

That said, the template-style format allows much clearer definitions, and you could always alias them to shorter names:

Name                       Description
vector<type,indices>       Declares a vector type
matrix<type,width,height>  Declares a matrix type

I think we need the latter, but we’ll define aliases for the others. (At least this way, we can easily extend the types supported for each one).

We may need ways of handling unused components (initialize to 1.0, 0.0) – this mainly comes into play when we promote vector2 and vector3 to vector4, and is needed for operations such as 3-component dot products on hardware which always performs them on all 4 components. We also need to differentiate between contiguous vector3’s in memory (unaligned), and padded vector3’s (aligned).

We can synthesize the right behavior by forcing the type (when written to memory) to be aligned via the align keyword. But this makes for a lot more typing than is ideal. It might be better to use vector4, and make one of the rules that when you load a vector3 from memory, and then cast/promote it to a vector4, the uninitialized component is always initialized to zero.
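The zero-fill promotion rule, sketched in C++ (the struct layouts are illustrative; in C!!, vector4 would also carry SIMD alignment):

```cpp
// vector3 is the contiguous/unaligned in-memory form; vector4 is the
// padded, aligned form. Promotion zero-fills w, so a 4-component dot
// product on promoted vector3s yields the 3-component answer.
struct vector3 { float x, y, z; };
struct vector4 { float x, y, z, w; };

vector4 promote(const vector3& v) { return { v.x, v.y, v.z, 0.0f }; }

float dot(const vector4& a, const vector4& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}
```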

The other alternative here is to use a mapping to do the dirty work.

Boolean Types

This is an area that I'm in two minds about. I would like to see the default be that bool is either packed for cache/storage efficiency (so, 1-bit), or passed back in a register-width type. These do have slightly different usages though, so let's define a new type – flag – which is a single-bit value, and keep bool for the other one. We'll let packing rules handle the rest – so if you put multiple flag values side-by-side, you'll get all the benefits of using bit-fields with less of the slightly messy syntax. The compiler packs them appropriately based on the alignment requirements of everything around them, from LSB to MSB.

Name                      Description
flag                      Single bit (equivalent of bool x : 1 in C/C++)
bool8, bool16, bool32, …  Explicit-width bool (rarely used, but we can use the bool semantics here)
boolr                     Register-width bool (do we need this, or is this just bool?)

As with C++, true and false are literal values.
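The packing behaviour described above is roughly what C++ bit-fields already give you, minus the guaranteed LSB-to-MSB order. A sketch with illustrative field names:

```cpp
// Four adjacent one-bit flags share storage; the C!! flag type would
// do this implicitly, with a pinned-down LSB-to-MSB packing order
// (in C++, the bit-field layout order is implementation-defined).
struct Flags {
    bool visible : 1;
    bool dirty   : 1;
    bool locked  : 1;
    bool pinned  : 1;
};

static_assert(sizeof(Flags) <= sizeof(int), "flags pack into one word");
```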

Maybe Worth Including: Comparison Types

When performing comparisons, you typically end up with three possible outcomes: equality, less-than and greater-than – which map to 0, less than 0, and greater than 0.

We could spell this out if we wanted to, which would allow us to possibly optimize comparisons. It really depends on how good the compiler is at hoisting them out and recognizing that it can perform one comparison and branch effectively based on it. (Possibly combining those operations).
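A first-class comparison type might look like the following sketch (C++20 later standardized essentially this shape as std::strong_ordering): one comparison produces one three-valued result that every branch can reuse.

```cpp
// One comparison, three outcomes. A compiler that understands this
// type can emit a single cmp and then branch or cmov off the saved
// result, instead of re-comparing.
enum class ordering { less, equal, greater };

ordering compare(int a, int b) {
    if (a < b) return ordering::less;
    if (a > b) return ordering::greater;
    return ordering::equal;
}
```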

This is somewhat of a micro-optimization, but look at this code:

#include <tchar.h>

__declspec(noinline) int Select(int a, int b, int lt, int eq, int gt)
{
    if (a < b)
    {
        return lt;
    }
    else if (a == b)
    {
        return eq;
    }
    else //if (a > b)
    {
        return gt;
    }
}

int _tmain(int argc, _TCHAR* argv[])
{
    Select(argc, (int)argv[0], argc, argc-1, argc+1); // Trick the optimizer a little
    return 0;
}

Seems pretty simple, right?

Except it generates this assembly:

_TEXT    SEGMENT
_lt$ = 8                        ; size = 4
_eq$ = 12                        ; size = 4
_gt$ = 16                        ; size = 4
?Select@@YIHHHHHH@Z PROC                ; Select, COMDAT
; _a$ = ecx
; _b$ = edx

; 7    : {

    push    ebp
    mov    ebp, esp

; 8    :     if (a < b)

    cmp    ecx, edx
    jge    SHORT $LN3@Select

; 9    :     {
; 10   :         return lt;

    mov    eax, DWORD PTR _lt$[ebp]

; 13   :     {
; 14   :         return eq;
; 15   :     }
; 16   :     else //if (a > b)
; 17   :     {
; 18   :         return gt;
; 19   :     }
; 20   : }

    pop    ebp
    ret    0
$LN3@Select:

; 11   :     }
; 12   :     else if (a == b)

    mov    eax, DWORD PTR _gt$[ebp]
    cmp    ecx, edx
    cmove    eax, DWORD PTR _eq$[ebp]

; 13   :     {
; 14   :         return eq;
; 15   :     }
; 16   :     else //if (a > b)
; 17   :     {
; 18   :         return gt;
; 19   :     }
; 20   : }

    pop    ebp
    ret    0
?Select@@YIHHHHHH@Z ENDP                ; Select
_TEXT    ENDS

In the scheme of things, this is a tiny tiny complaint, but we end up performing two identical cmp's here. mov doesn't affect the rFlags, so the result of the first cmp is still valid when we reach the second one – somewhere, somehow, the optimizer got wrapped around the axle. (This was generated with VS 2013). What the code should look like is this:

    push   ebp
    mov    ebp, esp
    cmp    ecx, edx
    jge    SHORT $LN3@Select
    mov    eax, DWORD PTR _lt$[ebp]
    pop    ebp
    ret    0
$LN3@Select:
    mov    eax, DWORD PTR _gt$[ebp]
    cmove  eax, DWORD PTR _eq$[ebp]
    pop    ebp
    ret    0

(And you could even make the argument that it should really be this…)

    push   ebp
    mov    ebp, esp
    cmp    ecx, edx
    cmova  eax, DWORD PTR _gt$[ebp]
    cmovb  eax, DWORD PTR _lt$[ebp]
    cmove  eax, DWORD PTR _eq$[ebp]
    pop    ebp
    ret    0

… but I’ve not measure the instruction timings, so the previous version may after the cost of the jump is considered vs. the indirect loads, be faster. The big question here is: will the data be available in the cache/registers? If so, it’s probably cheaper to do the cmovXX version. Ah, I know what you’re thinking: what about branch prediction? Let’s assume that because we’re likely to be using this comparison to partition or sort data, it’s likely that branch prediction will be broken as we’re operating on a fundamentally random data stream – a good portion of the time it’s just going to plain guess wrong. It still may be cheaper to do a mov followed by the two cmovs though).

So we could make a case here for breaking out a comparison as its own type, kind of like bool. Or we could improve the optimizer so that it better understands when flags are affected by the result of an operation. (My guess here would be that because of the jump, the compiler considers the flags indeterminate on entry to [email protected], thus the repeated cmp).

Not In The Language: Trinary Bools

I’m tempted to include trinary booleans for completeness (yes/no/maybe). But to be honest they’re rare enough that you rarely need them. If you need to include them, the math is pretty simple. If you want to OR a series of bools to get a trinary boolean value, you just start at zero and increment for each bool that is true. If the value you get at the end is the same as the number of bools you’ve checked, it’s true. If the value is zero, it’s false. Any other value? It’s maybe. (Or a mixture/indefinite).
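The counting trick above, spelled out (tribool and tri_or are illustrative names, not proposed syntax):

```cpp
#include <vector>

enum class tribool { no, maybe, yes };

// Count the true inputs: all true -> yes, none true -> no,
// anything in between -> maybe (a mixture/indefinite).
tribool tri_or(const std::vector<bool>& bits) {
    int count = 0;
    for (bool b : bits) count += b ? 1 : 0;
    if (count == 0) return tribool::no;
    if (count == static_cast<int>(bits.size())) return tribool::yes;
    return tribool::maybe;
}
```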

The thing is, I’ve only really ever seen them used for UI in tri-state checkboxes. And it’s easy to implement the logic yourself (see previous paragraph). So we’ll leave it out.

Not In The Language: Complex Numbers (but in the library)

I’d like Complex numbers to be covered, but for performance reasons, I think they need to be defined outside of the native language types, and relegated to library code. The reason for this is that it’s likely to be necessary to define complex numbers either as simple vector2 tuples, or as striped structs. (I’ve seen enough FFT implementations to know that keeping the real and imaginary parts separate in memory affords performance gains in some cases).
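What "striped" means here, as a sketch (ComplexArray is an illustrative name): the real and imaginary parts live in two separate contiguous arrays rather than interleaved (re, im) pairs, which is friendlier to SIMD loads.

```cpp
#include <cstddef>
#include <vector>

// Structure-of-arrays complex storage: element i is re[i] + im[i]*i.
struct ComplexArray {
    std::vector<float> re;
    std::vector<float> im;

    explicit ComplexArray(std::size_t n) : re(n), im(n) {}

    // Element-wise complex multiply: (a+bi)(c+di) = (ac-bd) + (ad+bc)i
    void multiply_by(const ComplexArray& o) {
        for (std::size_t i = 0; i < re.size(); ++i) {
            float r = re[i] * o.re[i] - im[i] * o.im[i];
            float j = re[i] * o.im[i] + im[i] * o.re[i];
            re[i] = r;
            im[i] = j;
        }
    }
};
```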

We’ll reserve the name complex for it though, and require that every implementation include it.

Strings

Strings are implemented like OLE/Pascal strings (with a length prefix, and a terminating nul character). By default, they are stored in UTF-16 format with 16-bit characters, and the type name is string.

stringutf8 is also supplied by the language for scenarios that need it. It has 8-bit characters, and the data is stored as UTF-8.

I’m in two minds as to whether string should have a 16-bit length, or a 32-bit length parameter. 32-bit seems overkill (if your string is > 64KiB, maybe you should be using a different data type). I’m almost tempted to allow variable sizes for the length parameter, but that seems like a mess waiting to happen.
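A sketch of the layout being described, using 8-bit chars for brevity (the default string would hold 16-bit UTF-16 elements) and a hypothetical make_pstring helper:

```cpp
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Pascal/OLE-style string: length prefix, characters, terminating nul.
// A 16-bit length caps strings at 64Ki elements, per the trade-off
// discussed above.
struct PString {
    std::uint16_t length;  // element count, excluding the nul
    char data[1];          // flexible-array idiom: length+1 chars live here
};

// Allocate and fill a PString from a C string (caller frees).
PString* make_pstring(const char* src) {
    std::size_t n = std::strlen(src);
    PString* s = static_cast<PString*>(
        std::malloc(sizeof(std::uint16_t) + n + 1));
    s->length = static_cast<std::uint16_t>(n);
    std::memcpy(s->data, src, n + 1);  // copies the nul too
    return s;
}
```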

Type Traits

Types can have traits. They’re just constant literals embedded in the type’s namespace, and are standardized by the compiler. So uint8.max is 255, for example. (Or is it uint8::max? I’ve grown to hate the scope-resolution operator in C++. Maybe we’ll get rid of it in a later article).

Rather than spelling them out, let's just say for now that all of the type traits found in C++'s std::numeric_limits (the <limits> header) are allowed, but instead of needing an include, they just get put underneath the type's name and are provided by the compiler.

(We might also want to come up with more explicit definitions of some of the values – for example, max and min are horribly overloaded and have radically different meanings for reals vs. integer types).
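For reference, today's C++ spelling of these traits – including the overloaded meaning of min – looks like this:

```cpp
#include <limits>

// uint8.max in C!! would be the compiler-provided equivalent of:
static_assert(std::numeric_limits<unsigned char>::max() == 255,
              "8-bit unsigned max");

// The overloading problem in action: for floating-point types, min()
// is the smallest positive normal value, while the most negative
// representable value is lowest().
constexpr double smallest_normal = std::numeric_limits<double>::min();
constexpr double most_negative   = std::numeric_limits<double>::lowest();
static_assert(smallest_normal > 0.0 && most_negative < 0.0,
              "min() and lowest() mean different things for reals");
```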

What’s not covered here?

Pointers and pointer types (e.g. void*) – let's just say they work a lot like C/C++. The same goes for other literals, such as null: there's no need for nullptr, since we don't have to work around previous definitions, so we'll use null instead.

Decisions, decisions…

One thing that we need to figure out with the types is whether multiple names for the same type are a good idea when they're just aliases. The purist in me says no, get rid of the cruft; people can always define their own aliases. The realist in me says that they're fine, but we might have gone overboard a bit. I'll let this sit for a while and we can figure it out later.

This article is part of a series. The previous part is here.

About Simon Cooke

Simon Cooke is a video game developer, ex-freelance journalist, screenwriter, film-maker and all-round good egg in Seattle, WA. The views posted on this blog are his and his alone, and have no relation to anything he's working on, his employer, or anything else and are not an official statement of any kind.