Tutorial : C - Introduction

Introduction
C is a practical and still-current software tool; it remains one of the most popular programming languages in existence, particularly in areas such as embedded systems. C facilitates writing code that is very efficient and powerful and, given the ubiquity of C compilers, can be easily ported to many different platforms. Also, there is an enormous code-base of C programs developed over the last 30 years, and many systems that will need to be maintained and extended for many years to come.

Programming and Programming Languages : 
The native language of a computer is binary (ones and zeros), and all instructions and data must be provided to it in this form. Native binary code is called machine language. The earliest digital electronic computers were programmed directly in binary, typically via punched cards, plug-boards, or front-panel switches. Later, with the advent of terminals with keyboards and monitors, such programs were written as sequences of hexadecimal numbers, where each hexadecimal digit represents a sequence of four binary digits. Developing correct programs in machine language is tedious and complex, and practical only for very small programs.

In order to express operations more abstractly, assembly languages were developed. These languages have simple mnemonic instructions that map directly to a sequence of machine language operations. For example, the MOV instruction moves data into a register, and the ADD instruction adds the contents of two registers together. Programs written in assembly language are translated to machine code using an assembler program. While assembly languages are a considerable improvement on raw binary, they are still very low-level and unsuited to large-scale programming. Furthermore, since each processor provides its own assembler dialect, assembly language programs tend to be non-portable; a program must be rewritten to run on a different machine.

The 1950s and 60s saw the introduction of high-level languages, such as Fortran and Algol. These languages provide mechanisms, such as subroutines and conditional looping constructs, which greatly enhance the structure of a program, making it easier to express the progression of instruction execution; that is, easier to visualise program flow. Also, these mechanisms are an abstraction of the underlying machine instructions and, unlike assembler, are not tied to any particular hardware. Thus, ideally, a program written in a high-level language may be ported to a different machine and run without change. To produce executable code from such a program, it is translated to machine-specific assembler language by a compiler program, and the result is then converted to machine code by an assembler (see Appendix B for details on the compilation process).

Compiled code is not the only way to execute a high-level program. An alternative is to translate the program on-the-fly using an interpreter program (e.g., Matlab or Python). Given a text-file containing a high-level program, the interpreter reads a high-level instruction and then executes the necessary set of low-level operations. While usually slower than a compiled program, interpreted code avoids the overhead of compilation-time and so is good for rapid implementation and testing. Another alternative, intermediate between compiled and interpreted code, is provided by a virtual machine (e.g., the Java virtual machine), which behaves as an abstract-machine layer on top of a real machine. A high-level program is compiled to a special byte-code rather than machine language, and this intermediate code is then interpreted by the virtual machine program. Interpreting byte code is usually much faster than interpreting high-level code directly. Each of these representations has its relative advantages: compiled code is typically fastest, interpreted code is highly portable and quick to implement and test, and a virtual machine offers a combination of speed and portability.

The primary purpose of a high-level language is to permit more direct expression of a programmer's design. The algorithmic structure of a program is more apparent, as is the flow of information between different program components. High-level code modules can be designed to "plug" together piece-by-piece, allowing large programs to be built out of small, comprehensible parts. It is important to realise that programming in a high-level language is about communicating a software design to programmers, not to the computer. Thus, a programmer's focus should be on modularity and readability rather than speed. Making the program run fast is (mostly) the compiler's concern.

The C Programming Language : 
C is a general-purpose programming language, and is used for writing programs in many different domains, such as operating systems, numerical computing, and graphical applications. It is a small language, with just 32 keywords in the original ANSI C standard. It provides "high-level" structured programming constructs such as statement grouping, decision making, and looping, as well as "low-level" capabilities such as the ability to manipulate bytes and addresses.

Since C is relatively small, it can be described in a small space, and learned quickly. A programmer can reasonably expect to know, understand, and indeed regularly use the entire language.

C achieves its compact size by providing spartan services within the language proper, forgoing many of the higher-level features commonly built into other languages. For example, C provides no operations to deal directly with composite objects such as lists or arrays. There are no memory management facilities apart from static definition and stack-allocation of local variables. And there are no input/output facilities, such as for printing to the screen or writing to a file.

Much of the functionality of C is provided by way of software routines called functions. The language is accompanied by a standard library of functions that provide a collection of commonly used operations. For example, the standard function printf() prints text to the screen (or, more precisely, to standard output—which is typically the screen). The standard library will be used extensively throughout this text; it is important to avoid writing your own code when a correct and portable implementation already exists.

A First Program
A C program, whatever its size, consists of functions and variables. A function contains statements that specify the computing operations to be done, and variables store values used during the computation. The following program is the traditional first program presented in introductory C courses and textbooks.

1 /* First C program: Hello World */
2 #include <stdio.h>
3
4 int main(void)
5 {
6     printf("Hello World!\n");
7 }

1. Comments in C start with /* and are terminated with */. They can span multiple lines and are not nestable. For example,

/* this attempt to nest two comments /* results in just one comment,
ending here: */ and the remaining text is a syntax error. */

2. Inclusion of a standard library header-file. Most of C’s functionality comes from libraries. Header-files contain the information necessary to use these libraries, such as function declarations and macros.

4. All C programs have main() as the entry-point function. This function comes in two forms:

int main(void)
int main(int argc, char *argv[])

The first takes no arguments, and the second receives command-line arguments from the environment in which the program was executed, typically a command-shell. The function returns a value of type int (i.e., an integer).

5 and 7. The braces { and } delineate the extent of the function block. When a function completes, the program returns to the calling function. In the case of main(), the program terminates and control returns to the environment in which the program was executed. The integer return value of main() indicates the program’s exit status to the environment, with 0 meaning normal termination.

6. This program contains just one statement: a function call to the standard library function printf(), which prints a character string to standard output (usually the screen). Note, printf() is not a part of the C language, but a function provided by the standard library (declared in header stdio.h). The standard library is a set of functions mandated to exist on all systems conforming to the ISO C standard. In this case, the printf() function takes one argument (or input parameter): the string constant "Hello World!\n". The \n at the end of the string is an escape character to start a new line. Escape characters provide a mechanism for representing hard-to-type or invisible characters (e.g., \t for tab, \b for backspace, \" for double quotes). Finally, the statement is terminated with a semicolon (;). C is a free-form language, with program meaning unaffected by whitespace in most circumstances. Thus, statements are terminated by ; not by a new line.

Variants of Hello World : 
The following program produces identical output to the previous example. It shows that a new line is not automatic with each call to printf(), and subsequent strings are simply abutted together until a \n escape character occurs.

1 /* Hello World version 2 */
2 #include <stdio.h>
3
4 int main(void)
5 {
6     printf("Hello ");
7     printf("World!");
8     printf("\n");
9 }

The next program also prints “Hello World!” but, rather than printing the whole string in one go, it prints it one character at a time. This serves to demonstrate several new concepts, namely: types, variables, identifiers, pointers, arrays, array subscripts, the \0 (NUL) escape character, relational operators, increment operators, while-loops, and string formatting.

This may seem a lot, but don’t worry—you don’t have to understand it all now, and all will be explained in subsequent chapters. For now, suffice to understand the basic structure of the code: a string, a loop, an index parameter, and a print statement.

1 /* Hello World version 3 */
2 #include <stdio.h>
3
4 int main(void)
5 {
6     int i = 0;
7     char *str = "Hello World!\n";
8
9     /* Print each character until reach '\0' */
10     while (str[i] != '\0')
11         printf("%c", str[i++]);
12
13     return 0;
14 }

6–7. All variables must be declared before they are used. In C89 they must be declared at the top of a block, before any statements (a block is a section of code enclosed in braces { and }); C99 and later standards relax this restriction. Variables may be initialised by a constant or an expression when declared.

6. The variable with identifier i is of type int, an integer, initialised to zero.

7. The variable with identifier str is of type char *, which is a pointer to a character. In this case, str refers to the characters in a string constant.

10–11. A while-loop iterates through each character in the string and prints them one at a time. The loop executes for as long as the expression (str[i] != '\0') is non-zero. (Non-zero corresponds to TRUE and zero to FALSE.) The operator != means NOT EQUAL TO. The term str[i] refers to the character at index i in the string (where str[0] is 'H'). All string constants are implicitly terminated with a NUL character, specified by the escape character '\0'.

11. The while-loop executes the following statement for as long as the loop expression is TRUE. In this case, the printf() takes two arguments (a format string "%c" and a parameter str[i++]) and prints the character of str at index i. The operator in the expression i++ is called the post-increment operator; the expression yields the current value of i and then increments it (i = i + 1).

13. Unlike the previous versions of this program, this one includes an explicit return statement for the program’s exit status.

Style note : Throughout this text, take notice of the formatting style used in the example code, particularly indentation. Indentation is a critical component in writing clear C programs. The compiler does not care about indentation, but it makes the program easier to read for programmers.

A Numerical Example

1 /* Fahrenheit to Celsius conversion table (K&R page 12) */
2 #include <stdio.h>
3
4 int main(void)
5 {
6     float fahr, celsius;
7     int lower, upper, step;
8
9     /* Set lower and upper limits of the temperature table (in Fahrenheit) along with the
10      * table increment step-size */
11     lower = 0;
12     upper = 300;
13     step = 20;
14
15     /* Create conversion table using the equation: C = (5/9)(F - 32) */
16     fahr = lower;
17     while (fahr <= upper) {
18         celsius = (5.0/9.0) * (fahr-32.0);
19         printf("%3.0f \t%6.1f\n", fahr, celsius);
20         fahr += step;
21     }
22 }

6–7. This program uses several variables. In C89 these must be declared at the top of a block, before any statements. Each variable is declared with a type; the types used in this example are int and float.

9–10. Note, the * beginning line 10 is not required and is there for purely aesthetic reasons.

11–13. These first three statements in the program initialise the three integer variables.

16. The floating-point variable fahr is initialised from the integer variable lower. Notice that the two variables are of different types (float and int). The compiler performs automatic type conversion for compatible types.

17–21. The while-loop executes for as long as the expression (fahr <= upper) is TRUE. The operator <= means LESS THAN OR EQUAL TO. This loop executes a compound statement enclosed in braces: the three statements on lines 18–20.

18. This statement performs the actual numerical computations for the conversion and stores the result in the variable celsius.

19. The printf() statement here consists of a format string and two variables, fahr and celsius. The format string has two conversion specifiers, %3.0f and %6.1f, and two escape characters, tab and new-line. (The conversion specifier %6.1f, for example, formats a floating-point number in a field at least six characters wide, with one digit after the decimal point. See Section 13.1.1 for more information on printf() and conversion specifiers.)

20. The assignment operator += produces an expression equivalent to fahr = fahr + step.

Style note : Comments should be used to clarify the code where necessary. They should explain intent and point out algorithm subtleties. They should avoid restating code idioms. Careful choice of identifiers (i.e., variable names, etc.) can greatly reduce the number of comments required to produce readable code.

Another Version of the Conversion Table Example : 
This variant of the conversion table example produces identical output to the first, but serves to introduce symbolic constants and the for-loop.

1 /* Fahrenheit to Celsius conversion table (K&R page 15) */
2 #include <stdio.h>
3
4 #define LOWER 0 /* lower limit of temp. table (in Fahrenheit) */
5 #define UPPER 300 /* upper limit */
6 #define STEP 20 /* step size */
7
8 int main(void)
9 {
10     int fahr;
11
12     for (fahr = LOWER; fahr <= UPPER; fahr += STEP)
13         printf("%3d \t%6.1f\n", fahr, (5.0/9.0) * (fahr-32.0));
14 }

4–6. Symbolic constants are names that represent numerical constants. These are specified by #define, and mean that we can avoid littering our code with numbers. Numbers scattered through code are called "magic numbers" and should generally be avoided. (There are rare exceptions where a literal constant is okay; the most common example is the number 0 to begin a loop over an array.)

12–13. The for-loop has three components separated by two semicolons (;). The first initialises the loop, the second tests the condition (identical to the while-loop), and the third is an expression executed after each loop iteration. Notice that the actual conversion expression appears inside the printf() statement; an expression can be used wherever a variable can. 

Style note : Variable names should always begin with a lowercase letter, and multi-word names should be written either like_this or likeThis. Symbolic constants should always be UPPERCASE to distinguish them from variables.