Table of Contents
- Primitive Datatypes in C Laying the Foundation:
- Integer Datatypes in C:
- Floating Point Datatypes in C:
- Void Datatype in C:
- Primitive Datatype Table:
- What Datatype and Sign Should I use?
- C Datatype Code Examples:
- Datatype Recap and Closing Remarks:
Primitive Datatypes in C Laying the Foundation:
When starting with C, one of the most important fundamentals to learn is what I like to call the “building blocks”. The building blocks in C are known as Primitive Datatypes. Every other datatype in C is either derived or built from the Primitive Datatypes. There are a few pieces of metadata that we need to know about each datatype. See the list below for more information:
- Size: Each variable in C occupies a piece of memory. How much memory is reserved for the variable depends on the datatype. The smallest unit of allocation is a byte.
- Range: Some datatypes in C can be declared with signed or unsigned. These keywords, together with the size of the datatype, tell the compiler the range of values that a variable can represent. Refer to the section Integer Datatypes in C for more details, or for a quick reference refer to the Primitive Datatype Table section.
- Format: Some datatypes such as float and double have special formatting. Refer to the section Floating Point Datatypes in C for more details, or for a quick reference refer to the Primitive Datatype Table section.
As shown in the hierarchy drawing above, I would like to focus our attention on the Primitive section for the remainder of this article. I will create more in-depth content explaining the other two columns in the near future!
Integer Datatypes in C:
Before we go any further, I want to mention two keywords that are used in the C language: signed and unsigned. When used, these keywords tell the compiler the range of a particular variable. Either the variable can represent both positive and negative numbers (signed), or it can only represent non-negative numbers (unsigned). When a variable is declared as signed, the most significant bit tells you the sign: 0 means the value is positive (or zero) and 1 means it is negative.
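Note: The integer datatypes short, int and long are signed by default. This means if signed or unsigned is not specified when a variable is declared, it will be treated as a signed value. The one exception is char: whether a plain char is signed or unsigned is left up to the compiler, so specify signed char or unsigned char explicitly when the sign matters.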
Let’s take a look at a simple example for a byte using the hex value 0xC8, or decimal 200.
First, look at the signed value on the left. Here, the bit in position 7 is 1, which, as mentioned earlier, means a signed value is negative. To find out what this bit pattern represents, some additional work needs to be done. We have to take the two’s complement: invert every bit and then add one. See the picture below demonstrating this.
As you can see, after performing two’s complement we have a hex value of 0x38, or decimal 56. Remember, the bit in position 7 is 1 for the bit pattern 0xC8, so the final value is -56.
Next, look at the value on the right. This value is unsigned and can be between 0 and 255. In this case, the value 0xC8 is simply 200. This is much more straightforward than the previous example!
Floating Point Datatypes in C:
All of the primitive datatypes that we have talked about so far are integer types. This means all of the values are whole numbers with no fractional piece. So what do we do if we want to represent a decimal number in C like 3.14159? There are two main types of decimal representation in C, and on most platforms they follow the formats described in great detail in the IEEE 754-2019 Standard for Floating Point Arithmetic. I will provide a brief overview and save you from the gory details of reading the specification.
As mentioned earlier, there are two formats: single precision and double precision. Single precision is represented by the keyword float. A float in C is typically 32 bits, or 4 bytes, in size. Double precision is represented by the keyword double and is typically 64 bits, or 8 bytes, in size. Take a look at the drawings below to see how the data is formatted for these two datatypes.
There are three parts to both a single precision and double precision floating point number.
- Sign: Exactly as the name suggests, 0 is a positive value and 1 is a negative value.
- Exponent: The power that the mantissa is scaled by. In a number written in scientific notation such as 3.14 x 10^6, the exponent is 6 (in the binary format the base is 2 rather than 10). This component needs to represent both positive and negative values, so a bias is added before it is stored in the exponent field. The bias can be calculated with the formula 2^(k-1) - 1, where k is the number of bits in the exponent field. For single precision the bias is 127 and for double precision the bias is 1023.
- Mantissa: The mantissa is the significand part of a number when represented in scientific notation (i.e. 3.14 x 10^6). In this case, 3.14 is the mantissa.
Void Datatype in C:
The void datatype in C is an interesting datatype and should not be confused with the void* (pointer) datatype. Void is what we call an incomplete datatype. This type is a placeholder that has no value and occupies no memory (some compiler extensions, such as those in gcc, return a size of 1 for sizeof(void), but this is not standard). Here are some code snippets showing how void can be used in practice.
No Return Value:
Let’s use void in place of a return type. This means the function will not return a value. Functions like this are sometimes referred to as procedures.
#include <stdio.h>
// This is an example of a function with no return value.
void myExampleFunction1(const char* message)
{
    printf("The message is: %s\n", message);
}
No Input Parameters:
In this example, void is used in place of the parameter list. This means the function takes no inputs.
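#include <stdio.h>
#define OK 0x00
#define NOT_OK 0xFF
// unsigned char is used for the status so that 0xFF is stored as 255
// and still compares equal to NOT_OK where plain char is signed.
unsigned char myExampleFunction2(void)
{
    unsigned char status = NOT_OK; // Assume it's not ok.
    int input = getchar(); // getchar() returns an int so that EOF can be detected.
    // If the user's input is 'y' then
    // print a simple hello message and set the status to 'OK'.
    if(input == 'y')
    {
        printf("Hello World!!\n");
        status = OK;
    }
    return status;
}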
Ignore the Return Value:
Here, the caller function myExampleFunction3() intentionally ignores the return value of myExampleFunction2() by writing (void)myExampleFunction2();. This is a great coding habit to pick up! Explicitly discarding the return value of a function that actually returns something lets the reader know that the author meant to ignore it.
// This is an example of a function explicitly ignoring
// the return value of another function.
void myExampleFunction3(void)
{
    (void)myExampleFunction2();
}
Primitive Datatype Table:
This table provides a quick reference for C primitive datatypes.
DATATYPE | SIZE IN BYTE(S) | RANGE |
---|---|---|
unsigned char | 1 | 0 to +255 |
signed char | 1 | -128 to +127 |
unsigned short | 2 | 0 to +65,535 |
signed short | 2 | -32,768 to +32,767 |
signed int | 4 | -2,147,483,648 to +2,147,483,647 |
unsigned int | 4 | 0 to +4,294,967,295 |
signed long | 4 (32-bit and 64-bit Windows) | -2,147,483,648 to +2,147,483,647 |
signed long | 8 (64-bit Linux) | -9,223,372,036,854,775,808 to +9,223,372,036,854,775,807 |
unsigned long | 4 (32-bit and 64-bit Windows) | 0 to +4,294,967,295 |
unsigned long | 8 (64-bit Linux) | 0 to +18,446,744,073,709,551,615 |
float | 4 | ±1.18×10^-38 to ±3.4×10^38 |
double | 8 | ±2.23×10^-308 to ±1.80×10^308 |
void* | 4 (32-bit) | 4-byte memory address |
void* | 8 (64-bit) | 8-byte memory address |
void | None | None |
What Datatype and Sign Should I use?
One might ask: now that I know all this wonderful information, what datatype should I be using? Unfortunately, there is no straightforward answer. However, I will give some guidelines to consider when deciding.
Understand What the Datatype Represents
Understand what the datatype is trying to represent. If you are 100% certain it will NEVER go beyond a certain value, then try to make it as small as possible. For example, we know there are exactly 7 days in a week, and the number 7 cannot possibly exceed the range of a byte whether it is signed or unsigned. In this case we should be using a char. We don’t want to make it anything larger such as a short, int or long because the extra bytes will never be used and the program is just wasting memory.
Make a Reasonable Assumption
Try to make a reasonable assumption, if you are NOT certain of how large a value can get. Unfortunately, this is not an easy task and I would recommend making the value larger rather than smaller. A good example of not making a value large enough is IPv4. Many, many moons ago when IPv4 was in its infancy nobody could ever have guessed we would run out of IP addresses considering 32 bits gives you 4,294,967,296 unique possibilities! Well, fast forward a couple decades with the explosion of the internet and the digital age we are now approaching that limit!! Since then, a new version of the IP protocol called IPv6 was created to help fix the 32 bit IPv4 shortcomings. I honestly have to say I don’t think we will have to worry about running out of addresses anytime soon! IPv6 allows for 340 trillion trillion trillion unique addresses!!
Simple ASCII Strings
If you are working with simple ASCII encoded strings, then the char datatype will suffice.
Decimal Representation
If you want to represent decimal numbers, use a float or double. Remember, more precision comes with a cost: a double is twice the size of a float. For most applications, especially on embedded devices, a float will provide enough precision. If you are unsure, try the float type first, and if you aren’t getting the resolution you need then use the double datatype.
When to Use Unsigned or Signed:
- Use unsigned if the datatype value is never negative.
- Use unsigned when doing low level register settings or bit manipulation.
- If you need the maximum positive range of a datatype, make the value unsigned.
- In all other cases use signed.
C Datatype Code Examples:
Here is an example program that you can run to demonstrate the sizes of the different datatypes. I have compiled and run this program for win32, win64 and gcc 64 bit. See the code example and the console output for each program instance.
#include <stdio.h>
// A simple program to print the different
// datatype sizes.
// sizeof yields a size_t, so %zu is the matching format specifier.
int main(void)
{
    // Characters
    printf("size of unsigned char is: %zu \n", sizeof(unsigned char));
    printf("size of signed char is: %zu \n", sizeof(signed char));
    // Short
    printf("size of unsigned short is: %zu \n", sizeof(unsigned short));
    printf("size of signed short is: %zu \n", sizeof(signed short));
    // Integers
    printf("size of unsigned int is: %zu \n", sizeof(unsigned int));
    printf("size of signed int is: %zu \n", sizeof(signed int));
    // Long
    printf("size of unsigned long is: %zu \n", sizeof(unsigned long));
    printf("size of signed long is: %zu \n", sizeof(signed long));
    // Float
    printf("size of float is: %zu \n", sizeof(float));
    // Double
    printf("size of double is: %zu \n", sizeof(double));
    // Void pointer
    printf("size of a pointer is: %zu \n", sizeof(void*));
    // Return success
    return 0;
}
# The program output for Windows x86
size of unsigned char is: 1
size of signed char is: 1
size of unsigned short is: 2
size of signed short is: 2
size of unsigned int is: 4
size of signed int is: 4
size of unsigned long is: 4
size of signed long is: 4
size of float is: 4
size of double is: 8
size of a pointer is: 4
# The program output for Windows x64
size of unsigned char is: 1
size of signed char is: 1
size of unsigned short is: 2
size of signed short is: 2
size of unsigned int is: 4
size of signed int is: 4
size of unsigned long is: 4
size of signed long is: 4
size of float is: 4
size of double is: 8
size of a pointer is: 8
# The program output for Ubuntu x64
size of unsigned char is: 1
size of signed char is: 1
size of unsigned short is: 2
size of signed short is: 2
size of unsigned int is: 4
size of signed int is: 4
size of unsigned long is: 8
size of signed long is: 8
size of float is: 4
size of double is: 8
size of a pointer is: 8
Datatype Recap and Closing Remarks:
Just a quick recap of what we have covered so far.
- All datatypes in C can be derived or created from the Primitive Datatypes.
- The void datatype is an incomplete datatype and should not be confused with the void* (pointer) datatype.
- The exact same datatype on different platforms or different compilers can end up being different sizes. Refer to the code example section demonstrating this behavior.
- Integer datatypes (i.e. char, short, int, long) have a range that depends on whether they are signed or unsigned. If neither is specified, short, int and long are signed by default, while the sign of a plain char depends on the compiler.
- The float and double datatypes are used to represent decimal numbers.
I would love to hear from you so please leave a comment in the reply section below! Also, don’t forget to subscribe to receive the latest updates via email from Tsunami Code!
Until next time!
-Brandon Sloat