Starting With C Datatypes: An In Depth Explanation


Primitive Datatypes in C Laying the Foundation:

When starting with C, one of the most important fundamentals to learn is what I like to call the “building blocks”. The building blocks in C are known as Primitive Datatypes. Every other datatype in C is either derived or built from the Primitive Datatypes. Each datatype carries a few pieces of metadata that we need to know about. See the list below for more information regarding this metadata:

Size
Each variable in C occupies a piece of memory. How much memory is reserved for the variable depends on the datatype. The smallest unit of allocation is a byte.
Range
Integer datatypes in C can be declared with the keywords signed or unsigned. Together with the size of the datatype, these keywords tell the compiler the range of values a variable can represent. Refer to the section Integer Datatypes in C for more details, or for a quick reference see the Primitive Datatype Table section.
Format
Some datatypes, such as float and double, have a special internal format. Refer to the section Floating Point Datatypes in C for more details, or for a quick reference see the Primitive Datatype Table section.

[Figure: Primitive Datatypes hierarchy]

As shown in the hierarchy drawing above, I would like to focus our attention on the Primitive section for the remainder of this article. I will create more in-depth content explaining the other two columns in the near future!


Integer Datatypes in C:

Before we go any further, I want to take this time to mention two keywords that are used in the C language: signed and unsigned. When used, these keywords tell the compiler the range of a particular variable. Either the variable can represent both positive and negative numbers (signed), or it can only represent zero and positive numbers (unsigned). When the variable is declared as signed, the most significant bit tells you the sign of the value: 0 means positive and 1 means negative.

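Note: The integer datatypes short, int and long are signed by default. This means if neither signed nor unsigned is specified when a variable is declared, it will be treated as a signed value. The one exception is plain char: the C standard leaves it up to the compiler whether char is signed or unsigned, so state it explicitly when the sign matters.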

Let’s take a look at a simple example for a byte using the hex value 0xC8, which is 200 in decimal.

[Figure: Integer Datatype C]

First, look at the signed value on the left. Here, the bit in position 7 is a 1 which, as mentioned earlier, means the signed value is negative. To find out what this bit pattern represents, some additional work needs to be done. We have to take the two’s complement: invert every bit and then add 1. See the picture below demonstrating this.

[Figure: Twos Complement Example]

As you can see, after performing two’s complement we now have a hex value of 0x38 or decimal value of 56. Remember, the bit in position 7 is 1 for the bit pattern of 0xC8, so the final value is -56.

Next, look at the value on the right. This value is unsigned and can be between 0 and 255. In this case, the value 0xC8 is 200. This is much more straightforward than the previous example!


Floating Point Datatypes in C:

All of the primitive datatypes that we have talked about so far are integer types. This means all of the values are whole numbers; there is no fractional piece. Well, what do we do if we want to represent a number with a fractional part in C, like 3.14159? This article covers two floating point types. Their formats are described in great detail in the IEEE 754-2019 Standard for Floating-Point Arithmetic. I will provide you with a brief overview and save you from all the gory details of reading the specification.

As mentioned earlier, there are two formats: single precision and double precision. Single precision is represented by the keyword float. A float in C is typically 32 bits or 4 bytes in size. Double precision is represented by the keyword double and is typically 64 bits or 8 bytes in size. Take a look at the drawings below to see how the data is formatted for these two datatypes.

[Figure: Floating Point Datatype C]

There are three parts to both a single precision and double precision floating point number.

Sign
Exactly as the name suggests, 0 is a positive value and 1 is a negative value.
Exponent
The exponent is the power that the number is scaled by when written in scientific notation (i.e. 3.14 x 10^6). In this example, 6 is the exponent. The stored formats use powers of 2 rather than powers of 10, but the idea is the same.
This component needs to represent both positive and negative values, therefore a bias is added before the value is stored in the exponent field. The bias can be calculated with the formula 2^(k-1) - 1, where k is the number of bits in the exponent field. For single precision (k = 8) the bias is 127 and for double precision (k = 11) the bias is 1023.
Mantissa
The mantissa is the significand part of a number when represented in scientific notation (i.e. 3.14 x 10^6). In this case, 3.14 is the mantissa.


Void Datatype in C:

The void datatype in C is an interesting datatype and should not be confused with the void* (pointer) datatype. Void is what we call an incomplete datatype. This type is a placeholder that has no value and occupies no memory (as an extension, gcc accepts sizeof(void) and returns 1, but this is not standard C). Here are some code snippets showing how void can be used in practice.

No Return Value:

Let’s use void as the return type. This means the function will not return a value. Functions of this type are sometimes referred to as procedures.

#include <stdio.h>

// This is an example of a function with no return value.
// The message is only read, so it is passed as a const pointer.
void myExampleFunction1(const char* message)
{
  printf("The message is: %s", message);
}


No Input Parameters:

In this example, void is used in place of the parameter list. This means the function has no inputs to it.

#include <stdio.h>

#define OK     0x00
#define NOT_OK 0xFF

// Returns an unsigned char so that NOT_OK (0xFF) always fits.
unsigned char myExampleFunction2(void)
{
  unsigned char status = NOT_OK; // Assume it's not ok.
  int input = getchar();         // Get input from the user (getchar returns an int).

  // If the user's input is 'y' then
  // print a simple hello message and set the status to 'OK'.
  if(input == 'y')
  {
    printf("Hello World!!");
    status = OK;
  }

  return status;
}


Ignore the Return Value:

Here, the caller function myExampleFunction3() intentionally ignores the return value of myExampleFunction2() by writing (void)myExampleFunction2();. This is a great coding habit to pick up! Explicitly discarding the return value of a function that actually returns something tells the reader that ignoring it was the author’s intent.

// This is an example of a function explicitly ignoring 
// the return value of another function. 
void myExampleFunction3(void) 
{   
  (void)myExampleFunction2(); 
}


Primitive Datatype Table:

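This table provides a quick reference for the C primitive datatypes. The sizes and ranges shown are typical for the platforms listed; the C standard itself only guarantees minimum ranges.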

| DATATYPE | SIZE IN BYTE(S) | RANGE |
|---|---|---|
| unsigned char | 1 | 0 to +255 |
| signed char | 1 | -128 to +127 |
| unsigned short | 2 | 0 to +65,535 |
| signed short | 2 | -32,768 to +32,767 |
| signed int | 4 | -2,147,483,648 to +2,147,483,647 |
| unsigned int | 4 | 0 to +4,294,967,295 |
| signed long | 4 (32-bit Windows, 64-bit Windows) | -2,147,483,648 to +2,147,483,647 (32-bit) |
| signed long | 8 (64-bit Linux) | -9,223,372,036,854,775,808 to +9,223,372,036,854,775,807 (64-bit) |
| unsigned long | 4 (32-bit Windows, 64-bit Windows) | 0 to +4,294,967,295 (32-bit) |
| unsigned long | 8 (64-bit Linux) | 0 to +18,446,744,073,709,551,615 (64-bit) |
| float | 4 | ±1.18×10^−38 to ±3.4×10^38 |
| double | 8 | ±2.23×10^−308 to ±1.80×10^308 |
| void* | 4 (32-bit) | 4 Byte Memory Address |
| void* | 8 (64-bit) | 8 Byte Memory Address |
| void | None | None |


What Datatype and Sign Should I use?

One might ask: now that I know all this wonderful information, what datatype should I be using? Unfortunately, there is no straightforward answer. However, I will give some guidelines to consider when deciding.

Understand What the Datatype Represents

Understand what the variable is trying to represent. If you are 100% certain it will NEVER go beyond a certain value then try to make it as small as possible. For example, we know there are exactly 7 days in a week, and the number 7 cannot possibly exceed the range of a byte whether it is signed or unsigned. In this case we should be using a char. We don’t want to make it anything larger such as a short, int or long because the extra bytes will never be used and the program is just wasting memory.


Make a Reasonable Assumption

If you are NOT certain how large a value can get, try to make a reasonable assumption. Unfortunately, this is not an easy task, and I would recommend making the value larger rather than smaller. A good example of not making a value large enough is IPv4. Many, many moons ago when IPv4 was in its infancy, nobody could have guessed we would run out of IP addresses, considering 32 bits gives you 4,294,967,296 unique possibilities! Well, fast forward a couple of decades and, with the explosion of the internet and the digital age, we have reached that limit!! Since then, a new version of the IP protocol called IPv6 was created to help fix the 32 bit IPv4 shortcomings. I honestly have to say I don’t think we will have to worry about running out of addresses anytime soon! IPv6 uses 128 bit addresses, which allows for roughly 340 trillion trillion trillion unique addresses!!


Simple ASCII Strings

If you are doing simple ASCII encoded strings then a char datatype will suffice.


Decimal Representation

If you want to represent numbers with a fractional part, use a float or double. Remember, more precision comes with a cost: a double is twice the size of a float. For many applications, especially on embedded devices, a float will provide enough precision. If you are unsure, try the float type first, and if you aren’t getting the resolution you need then switch to the double datatype.


When to Use Unsigned or Signed:

  1. Use unsigned if the value can never be negative.
  2. Use unsigned for low level register settings and bit manipulation.
  3. Use unsigned if you need the maximum positive range of a datatype.
  4. In all other cases use signed.


C Datatype Code Examples:

Here is an example program that you can run to demonstrate the sizes of the different datatypes. I have compiled and run this program on 32-bit Windows, 64-bit Windows and 64-bit Ubuntu (gcc). See the code example and the console output for each program instance.

#include <stdio.h>

// A simple program to print the different
// datatype sizes.
// Note: sizeof yields a size_t, which is printed with %zu.

int main(void)
{
  // Characters
  printf("size of unsigned char is: %zu \n", sizeof(unsigned char));
  printf("size of signed char is: %zu \n", sizeof(signed char));

  // Short
  printf("size of unsigned short is: %zu \n", sizeof(unsigned short));
  printf("size of signed short is: %zu \n", sizeof(signed short));

  // Integers
  printf("size of unsigned int is: %zu \n", sizeof(unsigned int));
  printf("size of signed int is: %zu \n", sizeof(signed int));

  // Long
  printf("size of unsigned long is: %zu \n", sizeof(unsigned long));
  printf("size of signed long is: %zu \n", sizeof(signed long));

  // Float
  printf("size of float is: %zu \n", sizeof(float));

  // Double
  printf("size of double is: %zu \n", sizeof(double));

  // Void pointer
  printf("size of a pointer is: %zu \n", sizeof(void*));

  // Return success
  return 0;
}

# The program output for Windows x86

size of unsigned char is: 1
size of signed char is: 1
size of unsigned short is: 2
size of signed short is: 2
size of unsigned int is: 4
size of signed int is: 4
size of unsigned long is: 4
size of signed long is: 4
size of float is: 4
size of double is: 8
size of a pointer is: 4

# The program output for Windows x64

size of unsigned char is: 1
size of signed char is: 1
size of unsigned short is: 2
size of signed short is: 2
size of unsigned int is: 4
size of signed int is: 4
size of unsigned long is: 4
size of signed long is: 4
size of float is: 4
size of double is: 8
size of a pointer is: 8

# The program output for Ubuntu x64

size of unsigned char is: 1 
size of signed char is: 1 
size of unsigned short is: 2 
size of signed short is: 2 
size of unsigned int is: 4 
size of signed int is: 4 
size of unsigned long is: 8 
size of signed long is: 8 
size of float is: 4 
size of double is: 8 
size of a pointer is: 8


Datatype Recap and Closing Remarks:

Just a quick recap of what we have covered so far.

  1. All datatypes in C can be derived or created from the Primitive Datatypes.
  2. The void datatype is an incomplete datatype and should not be confused with the void* (pointer) datatype.
  3. The exact same datatype on different platforms or different compilers can end up being different sizes. Refer to the code example section demonstrating this behavior.
  4. Integer datatypes (i.e. char, short, int, long) can be declared as signed or unsigned. If neither is specified then short, int and long are signed by default, while the sign of plain char depends on the compiler.
  5. The float and double are used to represent decimal numbers.

I would love to hear from you so please leave a comment in the reply section below! Also, don’t forget to subscribe to receive the latest updates via email from Tsunami Code!

Until next time!

-Brandon Sloat