Bad programming practices in : C
mashhoor.d[at]gmail.com
Updated April 11th 2005
Thanks to :
Ben Hinrichs (corrected a linguistic error) and Orion (corrected typos).
Preface
Programmers , especially new ones , tend to fall to bad programming practices without noticing. Those bad practices range from ones that put your programs at risk to ones that make your code less readable or even confusing. This tutorial will cover some of those and introduce alternates for some others. I thought this would be more usefull for novice C programmers , so i've tried my best to represent each issue as simple as i can.
Contents
This is what lots of people argue on especially new programmers. Using 'int main()' is always better , because function 'main()' must return a value to the invoker to report successful program execution. The behavior of 'void main' is undefined , so even if it works , it doesn't mean that its right and would still be considred a bad practice. Besides , the standard says that function 'main()' should return an 'int'.
Also note that using 'main()' alone without specifying it's return type is treated differently in different compilers. Some will assume its 'int main()' while others will see it as 'void main()', so you must explicitly specify that.
2. Clearing '\n' with fflush(stdin)
You might've noticed , that whenever you use 'scanf()' followed by 'fgets()' , the program would skip 'fgets()' as if you've hit <enter> without any input. Why does that happen?? Basically because scanf() can't read the '\n' (newline) character therefore it leaves it in the standard input stream.
More : Think of it as a queue (stdin) of people (characters) waiting to be served by you (scanf). You know how to serve all of them except for a certian guy ('\n'). You'll simply leave that guy (character) in the queue untill someone else (another function) serves (reads) him .
Now when you use fgets() later in your program , it expects to take characters from the standard input stream (the keyboard here) untill a newline character is reached (the user hit enter ) , but in our case the new line would be already there (remember scanf() couldn't read it and left it in the queue) , thus fgets() will terminate right away.
This problem occurs with the function 'getchar()', too :
int ch = getchar();
getchar() returns the next character from stdin. So if you input the letter 'a' then hit enter (wich produces '\n') , not only 'a' will be saved in the stream , but the newline character '\n' as well (if you call getchar() again , it will read the '\n' from the stream and assign it to ch). Now since getchar() reads only one character from stdin , it would read the 'a' only and leave the '\n' in the stream wich would terminate fgets() if you used it while the '\n' is still in stdin.
As far as i know , there is no standard method to clear the newline character from the stream in C , but here's a little trick that will do so :
while( (ch = getchar()) != '\n' && ch != EOF);
As you might know , getchar() reads a character , converts it to an INT and return return its value (so it doesn't actually return a character. Read No.8 for more) . Now this will let 'ch' (of type INT) take inputs that are NOT a new line character therefore going through the whole input stream clearing the '\n' left there by 'scanf()' untill EOF (End Of File) is reached. You won't have to hit any thing when the program runs and reaches that line. Becareful though when using that line to clear the input stream, because if there weren't anything to clear , the program will wait untill there is , wich might lead to unexpected results.
So you wanna know why is fflush(stdin) is bad?? in short words .. it was built to clear the output stream (stdout) not the input .Using it for flushing stdin is undefind behavior. I know it might work but that doesn't mean it was built for it. Smacking someone's forehead with a hammer can kill him , but that doesn't mean hammers were built to kill people.
Books , tutorials and college teachers usually introduce this function as a way to input strings , explain it's weakness and recommend using 'fgets()' afterwards. The problem is that some people insist on using it (including teachers themselves) either because its "quick" or they know its weak but they can't\won't use fgets() for some reason.
the problem with gets() is that it has only one argument wich is a char pointer that points to your char array (string) , so it has no idea how long the array is. Consider the following program :
#include <stdio.h> int main(void) { char array[20]; gets(array); return 0; }
if you input a string that has more than the compiler expects (more than 20 characters) gets() will try to fit all of the characters into the array , and since the array's size is less than the input , your program would crash. To avoid this problem, you can use 'fgets()' wich is declared in 'stdio.h' as :
char *fgets(char *s, int n, FILE *stream);
first argument is a pointer to your array , second argument takes the maximum numebr of characters allowed as an input , third is for the stream you'd use wich would be 'stdin' (STanDard INput) to get an input from the keyboard. Now we could replace the 'gets()' function in the previous program with :
fgets(array , sizeof(array), stdin);
'sizeof(array)' is equivalent to 20 (wich is the size of array here). But fgets() will only take the first 19 characters from your string as the last element is reserved for the string null byte '\0'. And as you know , every string in C should end with a null byte.
4. fgets() and the newline character
This issue might not fall under bad practices , but you should be aware of it anyway.
Assume that we've declared a string called 'str' that can hold 5 characters. It would look something like this in memory :
Str : [ ][ ][ ][ ][ ] 0 1 2 3 4
When you input a string consisting of 4 or more characters , fgets() will store the first 4 characters only and place the null byte in the last slot available. But what if your input was less than 4 characters? in this case , fgets() will store all of your characters in the array , followed with the a new line character and finally adds the null byte to the end of the array.
for example , lets say you've entered "cat". fgets() will store it this way :
Str : [ c ][ a ][ t ][\n ][\0 ] 0 1 2 3 4
"How would that be a problem" you might ask? consider this code :
#include <stdio.h> int main(void) { char str[10], str2[10]; printf("Enter the first name : "); fgets(str, sizeof(str), stdin); printf("Enter the second name : "); fgets(str2, sizeof(str2), stdin); printf("You've entered : %s and %s", str, str2); return 0; }If you enter 'joe' as the first name and 'mike as the second for example , You'll notice the output being :
You've entered : joe and mike
You've expected the output to be in one line , but fgets() stored a newline character at the end of each string therefore the output showed in two lines. Just imagine how much mess there would be in a bigger output.
So , how do we get rid of that newline? there are couple of ways to do this , one of them is to search the string for a newline character ('\n') and replace it with a null byte. We'll go with this way by using a function and a pointer. The function we'll use is 'strchr()' , defined in 'string.h', which scans a string forward for a given character and returns a pointer to it if it exist , otherwise it returns NULL :
#include <stdio.h> #include <string.h> /* for strchr() */ int main(void) { char str[10], str2[10]; char *replacer; printf("Enter the first name : "); fgets(str, sizeof(str), stdin); printf("Enter the second name : "); fgets(str2, sizeof(str2), stdin); /* lets 'replacer' point to the newline if found or to NULL if it doesn't exist */ replacer = strchr(str, '\n'); /* if the newline is found replace it with a null byte */ if(replacer != NULL) *replacer = '\0'; printf("You've entered : %s and %s", str, str2); return 0; }
Pretty simple.
Well, using scanf() to input numbers isn't a bad practice ofcourse , but it has a little problem with validating input when dealing with numbers , it wouldn't mind taking alphabet characters where it's supposed to take digits for example. A combination of letters and numbers as one input would be accepted as well leading to erroneous results. What we need is a method to strict the way a user enters number to our program. So the input MUST be a number , otherwise the program will display a message "Invalid input. please enter a valid number." (for example). here it is :
#include <stdio.h> #include <stdlib.h> /* for strtol() */ int main() { char input[BUFSIZ]; char *p; long result; printf("Please enter a number : "); fgets(input, sizeof(input), stdin); result = strtol(input , &p, 10); if(input[0] != '\n' && (*p == '\n' || *p == '\0')) printf("Correct Input : %d", result); else printf("Invalid input. Please enter a number."); return 0; }
This seems complicated at first glance , but with a little practicing it'll be piece of cake.assuming that you already know about 'fgets()' and what precedes it ,ill start explaining the rest.
we are using a function called 'strtol()' (strtol = STRing TO Long). It converts a string to a number of type LONG. It also give us some control over the input. Why would we use such function when we wanna deal with numbers?? as you see , we used 'fgets()' to store the number as a string of characters to split up the input and check each element for non-digit characters , So if you input the number '57' for example , it'll be stored in the array as :
input : [ '5' ][ '7' ][ '\0'] 0 1 2
Thats why we need a function to convert the string back to the number the user has input , and since the function converts the string to a number of type LONG , we'll declare and use a LONG variable to hold 'strtol()'s return value.
the function strtol() is declared as the following :
long strtol(const char *s, char **endptr, int base);
first argument is a pointer that we'll use to point to our 'input' array. Third argument is the base we wish to use which is base 10 (decimal). With this argument , strol() is going to stop scanning the string once it encounters a value that doesn't belong to the base we're using and sets **endptr to point to the elemnt holding that value.
To elaborate on this , lets say we've input '546m7' in our program , fgets() will store it this way in our 'input' array :
input : [ '5' ][ '4' ][ '6' ][ 'm' ][ '7' ][ '\0'] 0 1 2 3 4 5
as you see , element 4 (input[3]) is 'm'. With this input, strtol() will stop scanning once 'm' is reached and **endptr will point to it , because it's the first occurance of a non-digit character.
Lets get back to our example. The final line we need to study is :
if(input[0] != '\n' && (*p == '\n' || *p == '\0'));
this will strictly validate the input. Remember when i said that inputs are stored as characters with fgets()?? the first condition in the IF statement will insure that the user won't hit <enter> without typing anything. The second condition insures that the last non-digit character in the user input must be either a terminating NULL character '\0' OR a new line '\n', so '35645k' or '2344d432' aren't acceptable.
scanf() doesn't validate the input in anyway. It only return the number of values successfly read , converted and stored. Although scanf() can't validate your input , some programmers prefer to use it because it allows multiple inputs to be scanned in one line. In our method you'll only be able to scan one input in each call.
Two things new programmers usually ask about, pausing the output and clearing the screen. There are multiple ways to do both tasks , but many people use the simplest-one-lined-method which is using system() with the corresponding shell command. For pausing you'd use 'system("PAUSE")', and for clearing 'system("CLS")' would be your choice (assuming you are on Windows).
Personally i've been using 'system()' for a long while thinking its the way to go untill some programmers warned me about it. The cons of using 'system()' mainly revolve around these points :
It's not portable
As you might know, different platforms use different console commands. So using a DOS command with system() might not work with someone who's trying to compile your code under his Linux machine, thus your application becomes non-portable and your code will need modifications in order to work under other platforms.
While this might not be a problem for some programmers , but it's somthing you should always put in mind.
It's very slow
Let's say you've used 'system("CLS")'. Whenever your program calls that function with the "CLS" argument , it's going to suspend its self to execute CLS.EXE and it takes a really long time compared to other functions. if you want to make sure by your-self , make a program that prints 5 strings using 5 printf()'s seperated by 'system("CLS")' and see how slow the process goes.
Some people could take advantage of it
As we said in our previous example , your program would call the file CLS.EXE from your computer to clear the screen. The problem is , 'system()' can't tell whether it executed the correct CLS.EXE file or not, it just returns a value telling whether the requested command was successfuly found and executed or not. This way, someone can replace CLS.EXE with another malicious program under the same name and your program would execute it instead thinking its the right EXE.
I think those were some good reasons to stop using 'system()' in your programs. If you still need a method to clear\pause the output then search google for them. I'm sure this topic has been discussed hundreds of times.
This one is pretty simple if you know what the keyword typedef and pointers are. If you don't , then read this when you do.
"hiding pointers" basically means declaring a pointer type with typedef and use it in your program to declare variables as pointers. For example :
/* this code declare pointers typically (the recommended way) */ int variable = 5; int *p; p = &variable;
/* this code hides pointers */ int variable = 5; typedef int * my_pointer_type; my_pointer_type p; p = &variable;
The second code would be confusing to you and to whoever reads your code. Why hiding pointers when the only thing you get is confusion? if you still insist on declaring your own pointer types , then at least use names that distinguish them such as "mytype_p" or "Pmytype".
8. getchar() doesn't return a 'char'
getchar() returns a character from stdin ; for this reason , we should use a 'char' to hold its return value , right? well, wrong. getchar() returns an 'int' as you see in it's prototype :
int getchar(void);
This might look silly , but its done for a reason.
getchar() returns EOF (a special value returned to report End Of File) when there is no more input or when an error occur. EOF is defined in <stdio.h> and has a value larger than a 'char' can hold , so it won't be mistaken for a valid character. You shouldn't care about EOF's value for now , just know that its an integer that uses more bits than a 'char' can hold , wich is why we should use an 'int' to hold getchar()'s return value.
Omitting values returned by functions is a real bad programming practice. Checking return values (known as "Error checking") is important for both the programmer and the consumer. It tells the consumer what happened if things went wrong , and makes tracking down the problem easier for you. Here's a simple "error checking" procedure for 'malloc()' :
#include <stdio.h> #include <stdlib.h> // for malloc() and free() int main(void) { int *var; var = malloc(sizeof(int)); if( var == NULL) { printf("Error : could not allocate space\n"); return 1; } *var = 5; printf("var points to : %d\n", *var); free(var); return 0; }
You can refer to http://www.cppreference.com/ for values that standard functions return. Keep in mind that even your own functions should return a certain value in case of error.
Its common to see new programmers relying on 'conio.h' in their programs , especially for inputs with getch() and getche()
, which is fine till you know that 'conio.h' isn't standard and not supported by many compilers (i heard only Borland compilers support it , i'm not sure about this though).
Now you have a reason to think twice before including
it to your program next time.
11. Using malloc() without free()
What's messing in the this code? :
#include <stdio.h> #include <stdlib.h> int main(void) { int *var; var = malloc(sizeof(int)); if( var == NULL ) { printf("Insufficient memory. Terminating program..\n"); exit(1); } *var = 5; printf("%d\n", *var); return 0; }
Apparently , i didn't use 'free()', but why should i use it?
The previous code allocated space in memory for 'var' and never free'd it for the system to reclaim it. This means that , even after terminatnig your program , the 4 (or 2) bytes we've allocated for 'var' will stay there (in memory) as long as your computer is running. This is considred a serious bug in bigger programs and is referred to as a "Memory leak".
Memory leaks occur when a program fails to free allocated space or when the programmer forgets to explicitly free a manually allocated space with free(). Programs with memory leaks will cause your computer to run slower as they fill the memory with more and more useless data. Languages like Java and VB use what's called "garbage collection" that implicitly free allocated memory thats no longer in use by their applications. C\C++ doesn't offer this feature , but i heard some libraries offer it.
To see this bug in action , make a program that allocates large chunks of memory within a long\infinite for loop and watch your computer's performance degrade. don't worry though , once you restart your computer everything will get back to normal.
That's all for this tutorial. If you need to contact me to correct some errors/typos/misconception (or think something should be modified\changed) , for help or whatever else , feel free to e-mail me.
Site designed and run by Mashhoor Al Dubayan.
© Mashhoor 2005