BetterOS.org : An attempt to make computer machines run better



Tutorials

C Tutorial

The things you should learn

Tutorial 4: Understanding Better


Introduction
 If you understood everything so far, you already have a pretty good understanding of the basics of C (except for one very important concept). There are some things I haven't yet made clear, and I would like to clear them up in this tutorial. I will also cover the one very important concept which I have neglected so far: pointers.
Also, I want to do something a little bit different in this tutorial by providing not just examples but also negative examples, which illustrate what doesn't work and why. With each negative example, I will provide a warning so that somebody quickly scanning through the tutorial won't mistake it for good code. However, attempting to compile the negative examples and observing the errors may help you to understand.


Function Prototypes
 We already learned a little about function prototypes and why they are important. A prototype tells the compiler what a function returns and what arguments it takes, which lets the compiler generate the correct call and check that your call is correct. However, you may have noticed that in our last example (tutorial 3) we wrote our own function and called it without a function prototype, and it worked fine. This worked because we defined our function above all the points in our code where we call it, so by the time we called it, the compiler already knew about the function and no prototype was needed. Actually, in that case, even if the function had been defined after main, an older compiler would still have accepted it, because it would make an assumption about how to call that function, and in that case the compiler would guess correctly. (Newer compilers warn about this, and some refuse to compile it at all.) However, the guess will not always be right, and sometimes there will be problems.
negative example:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(int argc, char **argv)
{
	int rnd;

	rnd = getRandomf(time(NULL));

	printf("Random number: %d\n",rnd);
}

float getRandomf(int i)
{
	srand(i);
	return rand();
}

 Now this code is clearly silly; there is no reason to return the output of rand() as a float. Nevertheless, this type of scenario will come up in real programming. If you try to compile this code, you will get an error. This is because the C compiler works from the beginning of the file downward. When it reaches your function call rnd = getRandomf(time(NULL));, it has no idea what getRandomf() returns or what argument types it requires. So the compiler takes its best guess (it assumes the function returns an int). This is called an implicit function declaration. When you later define the function properly, the definition differs from the implicit declaration, causing an error.
 Fixing this is easy: add a function prototype before your call. Normally, this is done either inside your own header file or at the beginning of your source code, right after your #include statements. Writing a function prototype is easy too. It looks just like your function definition, except that it has no code inside it and ends with a semicolon. The argument names are not needed in a prototype, only the argument types, because you don't actually reference the arguments until your definition. So both of the following function prototypes are valid:
float getRandomf(int i);

or
float getRandomf(int);

 Adding this prototype to the negative example I provided earlier will make it work as you would expect:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

float getRandomf(int);

int main(int argc, char **argv)
{
	int rnd;

	rnd = getRandomf(time(NULL));

	printf("Random number: %d\n",rnd);
}

float getRandomf(int i)
{
	srand(i);
	return rand();
}

 Of course the other way to fix this problem is to define your function before calling it, but this isn't always an option, and isn't possible once you start writing programs which cannot be contained in one source code file. You could also rewrite the function so that it matches the implicit declaration, but this is also not always possible.

Headers and Constants
 Headers are important when using functions from the standard C library like printf(), time() or rand() because without them, the compiler won't be able to check your function calls to make sure they are correct. Implicit function declarations are generally bad. Header files contain function prototypes and a few other things like structs and constants. However, if you know what you are doing, it is possible to write your programs without actually including anything. For example:
int printf(const char *, ...);
long time(long *);        //assumes time_t is a long, as on most Linux systems
int rand(void);
void srand(unsigned int);
float getRandomf(int);

int main(int argc, char **argv)
{
	int rnd;

	rnd = getRandomf(time(0));

	printf("Random number: %d\n",rnd);
}

float getRandomf(int i)
{
	srand(i);
	return rand();
}

 Here, we replaced the include statements with the function prototypes that we need from those headers. The code still works just as well as before; it's essentially the same code, just written in a different way. The catch is that each prototype has to match the real function exactly, and some of the types involved can differ from one system to another. Also, if you wanted to use more functions, you would need to write prototypes for those as well. The same goes for constants such as RAND_MAX or even NULL. If you wanted getRandomf to return a random number in the range 0.0-1.0, then your code would need to look like this:
int printf(const char *, ...);
long time(long *);        //assumes time_t is a long, as on most Linux systems
int rand(void);
void srand(unsigned int);
float getRandomf(int);

#define RAND_MAX 2147483647

int main(int argc, char **argv)
{
	float rnd;        //a float now, otherwise the fraction would be cut off

	rnd = getRandomf(time(0));

	printf("Random number: %f\n",rnd);
}

float getRandomf(int i)
{
	srand(i);
	float r = rand();
	return r / RAND_MAX;
}

 Here we wanted to use RAND_MAX, which is defined in a header, but instead of including the header, we used #define (a preprocessor directive) to make a constant called RAND_MAX equal to 2147483647 (which is the maximum value rand() can produce on my computer).
 WARNING: You should include stdlib.h to get RAND_MAX rather than defining it yourself, because your code might be run on somebody else's computer some day. If their rand() function generates values in a different range (the C standard only guarantees that RAND_MAX is at least 32767), then your code will not work properly.


Characters and Numbers
 In C, there is no string type. Instead of strings, we use character arrays. So in order to store the string "hello world" in memory, we store it as an array of characters: 'h' is the first element, 'e' is the second element, 'l' is the third, 'l' is the fourth, and so on. What we haven't covered is exactly what a character is. It is quite a simple concept: a character is just a number. In C, a character is denoted by single quotes (e.g. 'a'), but all the computer sees is a number. This might not seem important at first, but one of the implications is that you can actually do math on characters. For example, 'a'+1 is equal to 'b'. Also, if you want to determine whether a letter entered by the user is lower-case or capital, you can compare it with 'a', because in ASCII (the character encoding used on nearly every computer) all of the capital letters come before all of the lower-case letters.
	char letter;
	scanf("%c",&letter);
	//This check assumes that the user typed a letter
	if (letter >= 'a')
		printf("%c is lower-case\n",letter);
	else
		printf("%c is a capital\n",letter);



Memory Allocation
 Memory allocation is very important in C and it's one of the reasons why C is superior to most other languages. C is all about working with memory, which is why you can do so much with C: everything that happens in a computer happens in memory. However, on modern computers with modern operating systems, the OS restricts what memory your program can access, both as a security feature and for stability. This way you can't write a program that modifies another program's data (without permission from the OS). So by default, your program is only allowed to read and execute its own code. When you define a variable, your compiler makes room for that variable, and when your program is executed, your OS gives it enough memory to store that data, along with permission to access that memory.
 For instance, if you define an integer variable called i, and then a character array containing the string "Hello", your program will run with 10 bytes of memory allocated for those variables: 4 bytes for the int (on most current systems) and 1 byte for each letter in the string, plus one for the null character on the end.
 It would probably be laid out in memory like this:
1  2  3  4  5  6  7  8  9  10 11 12 13 14 15
|     i    |'H''e''l''l''o' 0  

 So if we took the address of i, moved 4 bytes past it, and then looked at what was inside the memory at that location, we would find 'H'. If we try to look at what is 11 bytes past the address of i, we are reading memory that we never set aside for our variables. The result is unpredictable: we might read garbage, or the OS might stop our program because that memory belongs to something else.


Pointers and Dynamic Memory Allocation
 Now that you have read the previous section and you understand what C is doing with variables, we can understand pointers pretty easily. A variable holds a value stored in memory; a pointer holds a memory address, a location in memory.
So if we create a variable called i containing 14, and then create a pointer called p and set it to point to i, then i would equal 14 and p would equal the address where i is stored.
 Usually, however, it is not all that useful to work with just the memory location. Usually we want to deal with what is inside that memory, and getting at it is called dereferencing. To dereference a pointer, add * before the pointer when accessing it. For instance, p is equal to the memory location of i, but *p is equal to 14. If we were to change i to 15 but not change p, then *p would be 15, not 14, because we changed the data stored in that memory location.
 Pointers might seem to only be useful in obscure circumstances at first, but in reality, they are a crucial part of the C language, and are used very frequently. Understanding pointers is key to writing good C code.
 It is also important to understand exactly how memory gets allocated when working with pointers. When we declare a regular variable, memory is allocated for that variable. When we declare a pointer, however, memory is allocated to store the address, but not the value it points to. This means if we were to do the following:
negative example:
	int *p;
	*p=100;
Then the program would probably crash. This is undefined behavior because p was never set to point anywhere, so there is no memory allocated to store that 100. However, we can allocate memory for that value by using dynamic memory allocation. The most common and probably best way to do this is with the C standard function malloc(). malloc() takes one argument, which is the amount of memory you want to allocate in bytes, and it returns a pointer to the newly allocated memory. Its prototype looks like this (size_t is an unsigned integer type that the standard headers define for holding sizes):
void *malloc(size_t size);

This probably looks odd at first, since "void" is a special type which takes no memory and therefore cannot be used to store a value. But void* is different from void: it is a pointer that doesn't specify what type of data is stored at the address, like a generic pointer.
 So now that we know this we can allocate space. However, to allocate enough space to store the value, we need to know how much space that value will take up in memory. Since it is an integer, I know that on my computer it takes up 32 bits of memory (4 bytes). However, it's usually not a good idea to hard code this into your program. Hard coding type sizes is not considered "safe" because the size might be different if you compile the program for a different processor or operating system. So, instead of hard coding the size, we can use "sizeof" to determine the correct size. sizeof is actually an operator like == or +, but it only takes one operand, making it a unary operator. sizeof gives you the size of its operand in bytes. For the operand, we could use a value, a variable, or a type name. For instance:
	int i;
	sizeof(100);
	sizeof(i);
	sizeof(int);

These are all equal to 4 on my computer. 100 is an integer, which takes 4 bytes, so sizeof(100) is 4; i is defined as an int, so sizeof(i) is 4; sizeof(int) is the same deal. This is very useful when we want to allocate memory because we can do something like this:
	int *p;

	p = malloc(sizeof(int));

	*p=100;

In this example, we create a pointer called p, then we allocate enough memory to store an integer, and then store the address of that memory in p. Then we store the value 100 in that memory location.


Freeing Memory
 Now dynamically allocating memory seems pretty great, but there are some disadvantages to it. First of all, it takes a little bit of time to allocate the memory. Statically allocated memory is set up when the program is loaded, but dynamic allocation happens while the program is running, and in some cases it can cause noticeable slowdowns. Second, you can't know how much memory your program uses at compile time or load time. Finally, you have to free the memory. See, dynamically allocated memory is not subject to scope (though your pointers are), so if you allocate memory dynamically inside a function, when the function ends, that memory is still reserved for your program, but you will probably no longer have any pointers telling you what location in memory you reserved. When this happens, it is called a "memory leak", and if your program continues to do it, eventually your computer will run out of memory completely, malloc will begin to fail, and nobody will be happy.
 To fix this problem, we have to make sure that we free the memory using free(). Freeing the memory tells the system that your program is done with it, so that it can be reused for something else, which is really good. After calling free(), you must not use that memory again. You must also be careful because trying to free memory which you have not allocated, or freeing the same memory twice, is undefined behavior and will probably cause your program to crash.
 The free() function returns nothing and takes one argument, a pointer to the memory which has been allocated with malloc(). You don't have to worry about telling free() how much memory you allocated because malloc() keeps a record of it in its internal state.
	int *p;

	//Allocate some memory
	p = malloc(sizeof(int));

	//Now we can use the memory
	*p=100;

	//Give the memory back to the OS
	free(p);



Pointers and Arrays
 Pointers become most useful in C when dealing with arrays and strings. In C, there is no "string" type; a string is an array of characters, and in most expressions the name of an array acts as a pointer to the first element of the array. When a string is stored in memory, it is stored contiguously, so that the first letter is stored in one memory location, the next letter is stored in the next address, and so on until the end of the string. So to get the first character of a string, we just have to look at the location pointed to by the array variable. In order to get the second letter of the string, we can look at the array variable + 1.
For instance, if a string is stored in a variable called str (char *str; ), then an algorithm to print the string with each character on a new line might look like this:
	int i;
	for (i=0; *(str+i) != 0; i++)
	{
		//Print one character and then a new-line
		printf("%c\n",*(str+i));
	}

However, there is another notation for addressing array contents which looks nicer in most cases. The name of the array followed by a number in square brackets is the same as that pointer + the number then dereferenced. The above example using this notation is as follows:
	int i;
	for (i=0; str[i] != '\0'; i++)
	{
		//Print one character and then a new-line
		printf("%c\n",str[i]);
	}

Notice that in this case there is no need to use the indirection operator (*) to dereference str, and we do not need to add i to str using the addition operator (+). str[i] is used instead to access element i of the array. *(str+i) is the same as str[i].
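 You can also take an existing variable and get its address. To do this, we use the operator & before the variable name. This is called the address-of operator (you will also see it called the reference operator). In the following example, it is used to set a pointer equal to the address of another variable. Because p points to i, the program prints 999, the value i holds at the time of the printf() call.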
	int i;
	int *p;

	i=14;

	p=&i;

	i=999;

	printf("%d\n",*p);


Function Pointers
 In my opinion, function pointers are the best part of C. They can be used to do very silly, inefficient things, but in many cases they can be used to avoid unnecessary comparisons and increase efficiency. This is especially true when writing programs that need to perform some action continuously, such as most games.
 The concept of function pointers is really quite simple, but can be hard for beginners to grasp. We have a tendency to think of variables as data being stored, and our program acting on those variables. However, what may not be as obvious is that your actual program code is stored in memory also. This means that a function name really stands for an address, and when you call a function, what you are really doing is telling the computer to execute the code stored in a specific place in memory. Thus, we can even create a pointer that points to the location where a function is stored. Then we can tell the computer to execute the code at the address the pointer points to.
 Function pointers can be created using the following syntax:
return type (*function name)(arguments);
 For instance, the following example creates a function pointer, sets it equal to the address of a function, and then calls it.
int func(char arg)
{
	return arg + 1;		//A char is just a number, so this is the next character
}

int main(int argc, char ** argv)
{
	int r;
	int (*func_point)(char);

	func_point = func;

	r = func_point('A');

	return r;
}


Example Program Source
Notice: I do not endorse numerology, this is just an example to illustrate the concepts described here.
#include <stdio.h>        //printf, scanf
#include <stdlib.h>        //malloc, free

#define MAX_NAME_LEN 64                               //The maximum length of a name

int get_name_number(char *);          //Destiny Number
int get_soulurge_number(char *);      //Soul Urge
int get_personality_number(char *);   //Personality
int get_char_number(char);            //Get the number of a character
int reduce(int);                      //Reduce a number

int main(int argc, char **argv)
{
        char *name;
        int number;

        name = malloc(sizeof(char) * MAX_NAME_LEN);
        if (name == NULL)        //malloc returns NULL if it fails
                return 1;

        printf("Welcome to this example numerology name calculator.\n\n");
        printf("Please enter your full birth name:\n");

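        //Get the user's name (at most MAX_NAME_LEN-1 characters, leaving room for '\0')
        if (scanf("%63[^\n]",name) != 1)
                name[0]='\0';        //Nothing was entered, so treat the name as empty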

        //Create function pointer and set up a "default" function in case of an error
        int (*get_number)(char *)=get_name_number;

        printf("\nWhat would you like to calculate?\n");
        printf("(d)estiny number, (s)oul urge number, or (p)ersonality number:\n");

        char choice;

        //Get the user's selection
        getchar();        //This eats the '\n' left on the input stream by scanf
        scanf("%c",&choice);

        //Modify function pointer based on user's selection
        if (choice=='d')
                get_number=get_name_number;
        if (choice=='s')
                get_number=get_soulurge_number;
        if (choice=='p')
                get_number=get_personality_number;

        //Find each name and calculate the number of the name and update the total
        int name_start=0, name_end=0, total=0;
        while (name[name_start]!='\0')
        {
                if (name[name_end]=='\0')        //Check if the end of full name is reached
                {
                        number = get_number(&name[name_start]);
                        printf("%s : %d / %d\n",&name[name_start],number,reduce(number));
                        total+=number;        //Update total
                        name_start=name_end;
                }
                if (name[name_end]==' ')        //Check if end of name is reached
                {
                        name[name_end]='\0';        //Add '\0' to mark the end of this name
                        number = get_number(&name[name_start]);        //Get the number for this name
                        printf("%s : %d / %d\n",&name[name_start],number,reduce(number));
                        total+=number;        //Update total
                        name_start=name_end+1;        //Start the next name
                }
                name_end++;        //Continue on to the next character
        }

        //Print out the result
        printf("\nNumber: %d / %d\n",total,reduce(total));

        //Name is no longer needed, so we can free the memory
        free(name);

        return 0;
}

//calculate total name number
int get_name_number(char *str)
{
        int n=0;

        int i;
        for (i=0; str[i] != '\0'; i++)
                n+=get_char_number(str[i]);

        return n;
}

//calculate soul urge number
int get_soulurge_number(char *str)
{
        int n=0;

        int i;
        for (i=0; str[i] != '\0'; i++)
        {
                //Only count vowels
                if (str[i]=='A' || str[i]=='a' ||
                        str[i]=='E' || str[i]=='e' ||
                        str[i]=='I' || str[i]=='i' ||
                        str[i]=='O' || str[i]=='o' ||
                        str[i]=='U' || str[i]=='u' ||
                        str[i]=='Y' || str[i]=='y')
                        n+=get_char_number(str[i]);
        }

        return n;
}

int get_personality_number(char *str)
{
        int n=0;

        int i;
        for (i=0; str[i] != '\0'; i++)
        {
                //Only count consonants
                if (str[i]!='A' && str[i]!='a' &&
                        str[i]!='E' && str[i]!='e' &&
                        str[i]!='I' && str[i]!='i' &&
                        str[i]!='O' && str[i]!='o' &&
                        str[i]!='U' && str[i]!='u' &&
                        str[i]!='Y' && str[i]!='y')
                        n+=get_char_number(str[i]);
        }

        return n;
}

int get_char_number(char c)
{
        if (c>='a')
                return ((c-'a')%9)+1;
        else
                return ((c-'A')%9)+1;
}

//Reduce number by adding digits together
int reduce(int n)
{
        if (n<10 || n==11 || n==22)
                return n;
        else
        {
                int sum=0;
                while (n!=0)
                {
                        int r = n%10;
                        sum+=r;
                        n=n/10;
                }
                return reduce(sum);
        }
}


Quiz
 Now let's do a quick quiz to see how much you remember.
Answer the following questions:

1. What is stored in a header file?
2. When are header files needed?
3. What is a character in C?
4. How much memory does a character take?
5. What does arr[i] mean?
6. What does the & operator before a variable (reference operator) do?
7. Why do we need to "free" dynamically allocated memory?
8. How do function pointers work?
9. What is a pointer?
10. If a pointer to a char is created (char *p;), how much memory is allocated and what is stored there?
*hint: you can use sizeof() to check

That's all for now, hopefully there will be more in the near future, though at this point you should already have a pretty good understanding of the most important C concepts. There is still more to learn though.