Tutorial : C - Types, Operators, And Expressions

Identifiers : 
Identifiers (i.e., variable names, function names, etc.) are made up of letters and digits, and are case-sensitive. The first character of an identifier must be a letter, which includes underscore (_). The original ANSI C standard defines 32 keywords (later standards add more) which are reserved and may not be used as identifiers (e.g., int, while, etc.). Furthermore, it is a good idea to avoid redefining identifiers used by the C standard library (such as standard function names).

 

Style Note. Use lowercase for variable names and uppercase for symbolic constants. Local variable names should be short and external names should be longer and more descriptive. Variable names can begin with an underscore (_), but this should be avoided as such names, by convention, are reserved for library implementations.

 

Types : 
C is a typed language. Each variable is given a specific type which defines what values it can represent, how its data is stored in memory, and what operations can be performed on it. By forcing the programmer to explicitly define a type for all variables and interfaces, the type system enables the compiler to catch type-mismatch errors, thereby preventing a significant source of bugs. 

There are three basic types in the C language: characters, integers, and floating-point numbers. The numerical types come in several sizes. The table below lists the C types and their typical sizes, although the sizes may vary from platform to platform. Nearly all current machines represent an int with at least 32 bits; on most 64-bit platforms an int is still 32 bits. Traditionally, the size of an int represents the natural word-size of a machine: the native size with which the CPU handles instructions and data.

C Data Types
char          usually 8 bits (1 byte)
int           usually the natural word size for a machine or OS (e.g., 16, 32, 64 bits)
short int     at least 16 bits
long int      at least 32 bits
float         usually 32 bits
double        usually 64 bits
long double   usually at least 64 bits

With regard to size, the standard merely states that a short int be at least 16 bits, a long int at least 32 bits, and

short int ≤ int ≤ long int

The standard says nothing about the size of floating-point numbers except that

float ≤ double ≤ long double.
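The orderings above can be checked on a particular platform with sizeof and the CHAR_BIT constant from limits.h. The following is only a minimal sketch; the function name sizes_follow_standard is invented here for illustration and is not part of any library.

```c
#include <limits.h>  /* CHAR_BIT: number of bits in a char */

/* Returns 1 if the sizes on this platform obey the orderings and
   minimum widths described above, and 0 otherwise.
   (Hypothetical helper for illustration.) */
int sizes_follow_standard(void)
{
    int int_order = sizeof (short) <= sizeof (int)
                 && sizeof (int) <= sizeof (long);
    int float_order = sizeof (float) <= sizeof (double)
                   && sizeof (double) <= sizeof (long double);
    int min_bits = sizeof (short) * CHAR_BIT >= 16
                && sizeof (long) * CHAR_BIT >= 32;

    return int_order && float_order && min_bits;
}
```

A call to sizes_follow_standard() should return 1 on any conforming platform; the interest lies in inspecting the individual sizeof values when porting code.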

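A program to print the range of values for certain data types is shown below. Parameters such as INT_MIN can be found in the standard headers limits.h and float.h. Note that FLT_MIN, DBL_MIN, and LDBL_MIN are the smallest positive normalised values, not the most negative ones.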

#include <stdio.h>
#include <limits.h> /* integer specifications */
#include <float.h>  /* floating-point specifications */

/* Look at range limits of certain types */
int main (void)
{
    printf("Integer range:\t%d\t%d\n", INT_MIN, INT_MAX);
    printf("Long range:\t%ld\t%ld\n", LONG_MIN, LONG_MAX);
    printf("Float range:\t%e\t%e\n", FLT_MIN, FLT_MAX);
    printf("Double range:\t%e\t%e\n", DBL_MIN, DBL_MAX);
    /* long double values need the L length modifier */
    printf("Long double range:\t%Le\t%Le\n", LDBL_MIN, LDBL_MAX);
    printf("Float-Double epsilon:\t%e\t%e\n", FLT_EPSILON, DBL_EPSILON);
    return 0;
}

 

Note. The size of a type in number of characters (which is usually equivalent to number of bytes) can be found using the sizeof operator. This operator is not a function, although it often appears like one, but a keyword. It returns an unsigned integer of type size_t, which is defined in header-file stddef.h.

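#include <stdio.h>

/* Print the size of various types in "number-of-chars" */
int main (void)
{
    /* sizeof yields a size_t, so each value is cast to match %lu.
       (sizeof (void) is omitted: void has no size in standard C.) */
    printf("char\tshort\tint\tlong\tfloat\tdouble\n");
    printf("%3lu\t%3lu\t%3lu\t%3lu\t%3lu\t%3lu\n",
        (unsigned long) sizeof (char), (unsigned long) sizeof (short),
        (unsigned long) sizeof (int), (unsigned long) sizeof (long),
        (unsigned long) sizeof (float), (unsigned long) sizeof (double));
    return 0;
}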

The keywords short and long are known as type qualifiers because they affect the size of a basic int type. (The qualifier long may also be applied to type double.) Note, short and long, when used on their own as in

short a;
long x;

are equivalent to writing short int and long int, respectively. Other type qualifiers are signed, unsigned, const, and volatile. The qualifiers signed or unsigned can apply to char or any integer type. A signed type may represent negative values; the most-significant-bit (MSB) of the number is its sign-bit, and the value is typically encoded in 2's-complement binary. An unsigned type is always non-negative, and the MSB is part of the numerical value, which doubles the maximum representable value compared to an equivalent signed type. For example, a 16-bit signed short can represent the numbers -32768 to 32767, while a 16-bit unsigned short can represent the numbers 0 to 65535.
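One consequence of unsigned types is that their arithmetic wraps around rather than overflowing. The sketch below illustrates this with two small functions whose names are made up for this example; UINT_MAX and USHRT_MAX come from limits.h.

```c
#include <limits.h>  /* UINT_MAX, USHRT_MAX */

/* Adds one to an unsigned value. Unsigned arithmetic wraps around
   (modulo UINT_MAX + 1), so incrementing UINT_MAX yields 0. */
unsigned int increment_unsigned(unsigned int x)
{
    return x + 1u;
}

/* Returns 1 if unsigned short can hold at least the range 0..65535. */
int ushort_covers_16_bits(void)
{
    return USHRT_MAX >= 65535;
}
```

For instance, increment_unsigned(UINT_MAX) returns 0, whereas adding 1 to INT_MAX in a signed int has no reliable result.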

 

Note. Integer types are signed by default (e.g., writing short is equivalent to writing signed short int). However, whether plain chars are signed or unsigned by default is machine dependent.

The qualifier const means that the variable to which it refers cannot be changed.

const int DoesNotChange = 5;
DoesNotChange = 6; /* Error: will not compile */

The qualifier volatile refers to variables whose value may change in a manner beyond the normal control of the program. This is useful for, say, multi-threaded programming or interfacing to hardware; topics which are beyond the scope of this text. The volatile qualifier is not directly relevant to standard-conforming C programs, and so will not be addressed further in this text.

Finally, there is a type called void, which specifies a “no value” type. It is used as an argument for functions that have no arguments, and as a return type for functions that return no value.

 

Constants : 
Constants can have different types and representations. This section presents various constant types by example. First, an integer constant 1234 is of type int. A constant of type long int is suffixed by an L, as in 1234L (integer constants too big for int are implicitly taken as long). An unsigned int is suffixed by a U, as in 1234U, and UL specifies unsigned long.

Integer constants may also be specified by octal (base 8) or hexadecimal (base 16) values, rather than decimal (base 10). Octal numbers are preceded by a 0 and hex by 0x. Thus, 1234 in decimal is equivalent to 02322 and 0x4D2. It is important to remember that these three constants represent exactly the same value (0100 1101 0010 in binary). For example, the following code

int x = 1234, y = 02322, z = 0x4D2;
printf("%d\t%o\t%x\n", x, x, x);
printf("%d\t%d\t%d\n", x, y, z);
----------------
prints
1234 2322 4d2
1234 1234 1234

Notice that C does not provide a direct binary representation. However, the hex form is very useful in practice as it breaks down binary into blocks of four bits.
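Since there is no binary form for constants or for printf, the bit pattern of a value has to be built by hand if it is wanted. The function below is a minimal sketch of one way to do this; the name to_binary and its parameter order are assumptions made for this example, and it uses the bitwise operators described later in this tutorial.

```c
/* Writes the lowest `bits` bits of `value` into `out` as '0'/'1'
   characters, most-significant bit first, followed by '\0'.
   `bits` must be between 1 and the width of unsigned int, and `out`
   must have room for bits + 1 characters.
   (Hypothetical helper for illustration.) */
void to_binary(unsigned int value, int bits, char *out)
{
    int i;

    for (i = 0; i < bits; ++i) {
        unsigned int mask = 1u << (bits - 1 - i);
        out[i] = (value & mask) ? '1' : '0';
    }
    out[bits] = '\0';
}
```

Calling to_binary(0x4D2, 12, buf) fills buf with 010011010010, which reads off as the hex digits 4, D, and 2 in blocks of four bits.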

Floating-point constants are specified by a decimal point after a number. For example, 1. and 1.3 are of type double, 3.14f and 2.f are of type float, and 7.L is of type long double. Floating-point numbers can also be written using scientific notation, such as 1.65e-2 (which is equivalent to 0.0165). Constant expressions, such as 3+7+9.2, are evaluated at compile-time and replaced by a single constant value, 19.2. Thus, constant expressions incur no runtime overhead.

Character constants, such as 'a', '\n', '7', are specified by single quotes. Character constants are noteworthy because they are, in fact, not of type char, but of int. Thus, sizeof('Z') will equal 4 on a machine with 32-bit ints, not 1. Most platforms represent characters using the ASCII character set, which associates the integers 0 to 127 with specific characters (e.g., the character 'T' is represented by the integer 84). Tables of the ASCII character set are readily found.

There are certain characters that cannot be represented directly, but rather are denoted by an “escape sequence”. It is important to recognise that these escape characters still represent single characters. A selection of key escape characters are the following: \0 for NUL (used to terminate character strings), \n for newline, \t for tab, \v for vertical tab, \\ for backslash, \' for single quotes, \" for double quotes, and \b for backspace.

String constants, such as "This is a string" are delimited by quotes (note, the quotes are not actually part of the string constant). They are implicitly appended with a terminating '\0' character. Thus, in memory, the above string constant would comprise the following character sequence: This is a string\0.

 

Note. It is important to differentiate between a character constant (e.g., 'X') and a NUL terminated string constant (e.g., "X"). The latter is the concatenation of two characters X\0. Note also that sizeof('X') is four (on a machine with 32-bit ints) while sizeof("X") is two.
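The difference between the two kinds of constant can be confirmed with sizeof and strlen. The following is a small sketch; the three function names are invented for this example.

```c
#include <stddef.h>  /* size_t */
#include <string.h>  /* strlen */

/* Size of a character constant: it has type int in C. */
size_t size_of_char_constant(void)
{
    return sizeof 'X';
}

/* Size of the string constant "X": the character X plus the '\0'. */
size_t size_of_string_constant(void)
{
    return sizeof "X";
}

/* Length as reported by strlen, which does not count the '\0'. */
size_t length_of_string_constant(void)
{
    return strlen("X");
}
```

The first function returns the same value as sizeof (int), the second returns 2, and the third returns 1.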

 

Symbolic Constants
Symbolic constants represent constant values, from the set of constant types mentioned above, by a symbolic name. For example

#define BLOCK_SIZE 100
#define TRACK_SIZE (16*BLOCK_SIZE)
#define HELLO "Hello World\n"
#define EXP 2.7183

Wherever a symbolic constant appears in the code, it is equivalent to direct text-replacement with the constant it defines. For example,

printf(HELLO);

prints the string Hello World. The reason for using symbolic constants rather than constant values directly, is that it prevents the proliferation of “magic numbers” (numerical constants scattered throughout the code). This is very important as magic numbers are error-prone and are the source of major difficulty when attempting to make code-changes. Symbolic constants keep constants together in one place so that making changes is easy and safe.
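Because a symbolic constant is substituted as text, the parentheses in the definition of TRACK_SIZE matter. The sketch below contrasts it with an unparenthesised variant; TRACK_SIZE_BARE and the two function names are hypothetical and exist only for this illustration.

```c
#define BLOCK_SIZE 100
#define TRACK_SIZE (16*BLOCK_SIZE)     /* parenthesised, as in the text */
#define TRACK_SIZE_BARE 16*BLOCK_SIZE  /* hypothetical unparenthesised variant */

/* Expands to 3200 / (16*100), which is 2. */
int tracks_in_3200(void)
{
    return 3200 / TRACK_SIZE;
}

/* Expands to 3200 / 16*100, which groups as (3200 / 16) * 100. */
int tracks_in_3200_bare(void)
{
    return 3200 / TRACK_SIZE_BARE;
}
```

The first function returns 2 as intended, while the second returns 20000, so any symbolic constant defined as an expression is best wrapped in parentheses.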

Note. The #define directive, like the #include directive for file inclusion, is a preprocessor command (see Section 10.2). As such, it is subject to different rules than the core C language. Importantly, the # must be the first non-whitespace character on a line; by convention it is placed in the first column, which some pre-ANSI compilers required.

Another form of symbolic constant is an enumeration, which is a list of constant integer values. For example,

enum Boolean { FALSE, TRUE };

The enumeration tag Boolean defines the “type” of the enumeration list, such that a variable may be declared of the particular type.

enum Boolean x = FALSE;

If an enumeration list is defined without an explicit tag, it assumes the type int. For example,

enum { RED=2, GREEN, BLUE, YELLOW=4, BLACK };
int y = BLUE;

The value of enumeration lists starts from zero by default, and increments by one for each subsequent member (e.g., FALSE is 0 and TRUE is 1). List members can also be given explicit integer values, and non-specified members are each one greater than the previous member (e.g., RED is 2, GREEN is 3, BLUE is 4, YELLOW is 4, and BLACK is 5).
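The numbering rules can be checked directly, since enumeration constants are ordinary integers. The following sketch repeats the two enumerations from the text and adds two small functions, negate and sum_of_colours, whose names are invented for this example.

```c
enum Boolean { FALSE, TRUE };
enum { RED = 2, GREEN, BLUE, YELLOW = 4, BLACK };

/* Returns the opposite truth value (hypothetical helper). */
enum Boolean negate(enum Boolean b)
{
    return (b == FALSE) ? TRUE : FALSE;
}

/* Enumeration constants can be used in arithmetic like any int. */
int sum_of_colours(void)
{
    return RED + GREEN + BLUE + YELLOW + BLACK;  /* 2+3+4+4+5 */
}
```

Note that BLUE and YELLOW share the value 4; C permits duplicate values in an enumeration, although they cannot both appear as case labels in the same switch statement.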

 

Style Note. Symbolic constants and enumerations are by convention given uppercase names. This makes them distinct from variables and functions, which, according to good practice, should always begin with a lowercase letter. Variables qualified by const behave like constants and so should also be identified with uppercase names, or with the first letter uppercase.

 

printf Conversion Specifiers
The standard function printf() facilitates formatted text output. It merges numerical values of any type into a character string using various formatting operators and conversion specifiers.

printf("Character values %c %c %c\n", 'a', 'b', 'c');
printf("Some floating-point values %f %f %f\n", 3.556, 2e3, 40.1f);
printf("Scientific notation %e %e %e\n", 3.556, 2e3, 40.1f);
printf("%15.10s\n", "Hello World\n"); 
/* Right-justify string with space for
15 chars, print only first 10 letters */

Important. A conversion specifier and its associated variable must be of matching type. If they are not, the program will either print garbage or crash. For example,

printf("%f", 52); 
/* Mismatch: floating point specifier, integer value */
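A mismatch like the one above is usually repaired by converting the argument to the type the specifier expects. The sketch below uses sprintf, the standard relative of printf that writes into a character array, so that the formatted text can be inspected; the function names format_int_as_float and format_truncated are assumptions made for this example.

```c
#include <stdio.h>  /* sprintf */

/* Formats an integer with %f by first converting it to double, so the
   argument type matches the conversion specifier. `buf` must be large
   enough for the result. (Hypothetical helper for illustration.) */
void format_int_as_float(int value, char *buf)
{
    sprintf(buf, "%f", (double) value);
}

/* Formats at most the first 10 characters of `text`, right-justified
   in a field of 15 characters, as in the %15.10s example above. */
void format_truncated(const char *text, char *buf)
{
    sprintf(buf, "%15.10s", text);
}
```

With value equal to 52 the first function produces 52.000000, and with the text "Hello World\n" the second produces five spaces followed by Hello Worl.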

 

Declarations
All variables must be declared before they are used. In C89 they must be declared at the top of a block (a section of code enclosed in braces { and }) before any statements; C99 and later also permit declarations after statements. They may be initialised by a constant or an expression when declared. The following are a set of example declarations.

{ /* brace signifies top of a block */
    int lower, upper, step; /* 3 uninitialised ints */
    char tab = '\t'; /* a char initialised with '\t' */
    char buf[10]; /* an uninitialised array of chars */
    int m = 2+3+4; /* constant expression: 9 */
    int n = m + 5; /* initialised with 9+5 = 14 */
    float limit = 9.34f;
    const double PI = 3.1416;
    /* statements follow the declarations */
}

The general form of a declaration is

    qualifier type identifier1 = value1, identifier2 = value2, ... ;

where the assignment to an initial value is optional.

 

Arithmetic Operations
The arithmetic (or numerical) operators come in two varieties: unary and binary. The binary operators are plus +, minus -, multiply *, divide /, and the modulus operator %. The first four operators can be used on integer or floating-point types, although it is important to notice that integer division truncates any fractional part (e.g., 17/5 is equal to 3). The modulus operator is valid only for non-floating-point types (e.g., char, int, etc.), and x%y produces the remainder from the division x/y (e.g., 18 % 7 is equal to 4).

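Note. For negative operands, the direction of truncation for / and the sign of the result of % are implementation-defined in C89 (i.e., they may have different results on different platforms). C99 and later require / to truncate toward zero, so the result of % takes the sign of the left operand.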
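The relationship between / and % can be sketched as follows. The function names are invented for this example; the identity in the third function is guaranteed by the language whenever y is non-zero and x / y is representable.

```c
/* Integer quotient: any fractional part is discarded. */
int quotient(int x, int y)
{
    return x / y;
}

/* Remainder of the integer division x / y. */
int remainder_of(int x, int y)
{
    return x % y;
}

/* The two results are tied together by (x / y) * y + x % y == x,
   whatever the signs of x and y. Returns 1 if the identity holds. */
int division_identity_holds(int x, int y)
{
    return (x / y) * y + x % y == x;
}
```

So quotient(17, 5) is 3 and remainder_of(18, 7) is 4, and the identity holds for negative operands as well, whichever way the platform truncates.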

The unary operators plus + and minus - can be used on integer or floating-point types, and are used as follows.

int ispositive = +34;
double isnegative = -56.3;

The unary + is a redundant operator as numbers are positive by default. It exists only for symmetry with the unary - operator.

An important set of unary operators are the increment ++ and decrement -- operators. These operators add 1 to a variable and subtract 1 from a variable, respectively. Thus, the expression x++ is equivalent to x=x+1. An unusual quality of ++ and -- is that they may be used prefix ++x or postfix x++ with different characteristics. For example,

double x = 3.2;
double y = ++x;
double z = x++;

In the first case, called preincrement, the value of x is increased to 4.2 and then assigned to y, which then also equals 4.2. In the second case, called postincrement, the value of x is first assigned to z, and subsequently increased by 1; so, z equals 4.2 and x equals 5.2.
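The same distinction can be isolated in two small functions. This is only a sketch: the function names are made up, and the functions take a pointer to the variable (pointers are not covered in this section) so that the change to x is visible to the caller.

```c
/* Preincrement: *x is increased first, and the new value is returned. */
int pre_increment(int *x)
{
    return ++(*x);
}

/* Postincrement: the old value is returned, then *x is increased. */
int post_increment(int *x)
{
    return (*x)++;
}
```

Starting from x equal to 3, pre_increment(&x) returns 4 and leaves x at 4; a following post_increment(&x) also returns 4 but leaves x at 5.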

The precedence of the arithmetic operators is as follows: ++, --, and unary + and - have the highest precedence; next comes *, /, and %; and finally, binary + and - have the lowest precedence.

int a=2, b=7, c=5, d=9;
printf("a*b + c*d = %d\n", a*b + c*d); 
/* prints a*b + c*d = 59 */

Two common errors can occur with numerical operations: divide-by-zero and overflow. The first occurs during a division operation z=x/y where y is equal to zero; this is the case for integer or floating-point division. Divide-by-zero errors can also occur with the modulus operator if the second operand is 0. The second error, overflow, occurs when the result of a mathematical operation cannot be represented by the result type. For example, 

int z = x + 1;

will overflow if the value of x is the largest representable value of type int. The value of z following a divide-by-zero or overflow error will be erroneous, and may be different on different platforms.
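Since the result of an overflowing or divide-by-zero operation cannot be relied upon, such errors are best detected before the operation is performed. The following is a minimal sketch of such checks for type int; the function names are assumptions made for this example, and INT_MAX and INT_MIN come from limits.h.

```c
#include <limits.h>  /* INT_MAX, INT_MIN */

/* Returns 1 if x + y cannot be represented as an int, 0 otherwise.
   The test is made before adding, using only operations that
   cannot themselves overflow. (Hypothetical helper.) */
int add_would_overflow(int x, int y)
{
    if (y > 0 && x > INT_MAX - y)
        return 1;
    if (y < 0 && x < INT_MIN - y)
        return 1;
    return 0;
}

/* Returns 1 if x / y is safe to evaluate for ints, 0 otherwise. */
int divide_is_safe(int x, int y)
{
    if (y == 0)
        return 0;   /* divide-by-zero */
    if (x == INT_MIN && y == -1)
        return 0;   /* on 2's-complement machines the result exceeds INT_MAX */
    return 1;
}
```

A caller would test add_would_overflow(x, 1) before evaluating x + 1, and divide_is_safe(x, y) before evaluating x / y or x % y.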

 

Relational and Logical Operations
There are six relational operators: greater-than >, less-than <, greater-than-or-equal-to >=, less-than-or-equal-to <=, equal-to == and not-equal-to !=. Relational expressions evaluate to 1 if they are TRUE and 0 if they are FALSE. For example, 2.1 < 7 evaluates to one, and x != x evaluates to zero.

Note. A very common programming error is to mistakenly type = (assignment) for == (equality). For example, consider a loop that is to execute while x == 3. If it is written as

while (x = 3) {
/* various statements here */
}

then x will be assigned the value 3 and this value will be the loop conditional, which is always non-zero (and therefore TRUE) resulting in an infinite loop.

The three logical operators are AND &&, OR ||, and NOT !. All the relational and logical operators are binary except the !, which is unary. The && and || operators connect pairs of conditional expressions, with && being TRUE only if both expressions are TRUE, and || being TRUE if either expression is TRUE. They can be used to chain together multiple expressions, as in the following example where, given the integer values a=1, b=2, c=3, d=3,

(a < b && b < c && c < d) /* FALSE */
(a < b && b < c && c <= d) /* TRUE */
((a < b && b < c) || c < d) /* TRUE */

The order of evaluation of && and || is left-to-right, and evaluation stops as soon as the truth or falsehood of the result is known—leaving the remaining expressions unevaluated. This feature results in several common idioms in C programs. For example, given an array of length SIZE, it is incorrect to evaluate array[SIZE], which is one-beyond the end of the array. The idiom

i = 0;
while (i < SIZE && array[i] != val)
++i;

ensures that, when i == SIZE, the conditional expression terminates before evaluating array[i].
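Wrapped in a function, the idiom becomes a simple linear search. The following is a sketch; the name find_index and the convention of returning size when the value is absent are choices made for this example.

```c
/* Returns the index of the first element equal to val, or size if
   val is not present. Because && stops as soon as i < size is false,
   array[size] is never read. (Hypothetical helper.) */
int find_index(const int array[], int size, int val)
{
    int i = 0;

    while (i < size && array[i] != val)
        ++i;
    return i;
}
```

Reversing the two operands of && would read array[i] before checking i, which defeats the purpose of the idiom.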

The unary operator ! simply converts a non-zero expression to zero and vice-versa. For example, the statement

if (!valid)
x = y;

performs the assignment x=y only if valid equals 0. The unary ! tends to be used infrequently as it can lead to obscure code, and typically == or != provide a more readable alternative.

if (valid == 0)
x = y;

The precedence of the relational and logical operators is lower than the arithmetic operators, except for the unary !, which has equal precedence to the unary + and -. Of the others, >, <, >=, and <= have highest precedence; followed by == and !=; then &&; and finally, ||.

Style Note. C has precedence rules for all its operators. However, for correctness and readability, it is good practice to make minimal use of these rules (e.g., * and / are evaluated before + and -) and use parentheses everywhere else.

The following example is a segment of code where the intuitive precedence is not correct, and the code is faulty. This code is intended to copy the characters of a string t to a character array s, an operation which is complete when the terminating '\0' is copied.

while (s[i] = t[i] != '\0')
++i;

However, the != has higher precedence than the =, and so s[i] will not be assigned t[i] but the result of t[i] != '\0', which is 1 except for the final iteration when it will be 0. The correct result is obtained using parentheses.

while ((s[i] = t[i]) != '\0')
++i;
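Placed in a complete function, the corrected loop looks as follows. This is a sketch only: the name copy_string and its return value are choices made for this example (the standard library provides strcpy for real use).

```c
/* Copies the string t, including its terminating '\0', into s and
   returns the number of characters copied before the '\0'.
   s must have room for the whole string. (Hypothetical helper.) */
int copy_string(char s[], const char t[])
{
    int i = 0;

    while ((s[i] = t[i]) != '\0')
        ++i;
    return i;
}
```

With the parentheses in place, each character of t is stored in s before it is compared against '\0', so the terminator itself is copied on the final iteration.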

 

Bitwise Operators
C possesses a number of bitwise operators that permit operations on individual bits (i.e., binary 1s and 0s). These are essential for low-level programming, such as controlling hardware.

The operators are the bitwise AND &, bitwise OR |, bitwise exclusive OR ^, left shift <<, right shift >>, and one’s complement operator ~. It is important to realise that & is not &&, | is not ||, and >> does not mean “much-greater-than”. The purpose and usage of the logical and bitwise operators are quite disparate and may not be used interchangeably.
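Typical uses of these operators are setting, clearing, inverting, and testing a single bit. The four functions below are a minimal sketch with invented names; bit 0 is taken to be the least-significant bit, and n is assumed to be smaller than the number of bits in an unsigned int.

```c
/* Returns value with bit number n set to 1. */
unsigned int set_bit(unsigned int value, int n)
{
    return value | (1u << n);
}

/* Returns value with bit number n cleared to 0. */
unsigned int clear_bit(unsigned int value, int n)
{
    return value & ~(1u << n);
}

/* Returns value with bit number n inverted. */
unsigned int toggle_bit(unsigned int value, int n)
{
    return value ^ (1u << n);
}

/* Returns 1 if bit number n of value is set, 0 otherwise. */
int test_bit(unsigned int value, int n)
{
    return (value >> n) & 1u;
}
```

The difference between the bitwise and logical forms shows in an expression such as 6 & 1, which is 0 because the two operands share no set bits, whereas 6 && 1 is 1 because both operands are non-zero.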

 

Assignment Operators
Expressions involving the arithmetic or bitwise operators often involve the assignment operator = (for example, z=x+y). Sometimes in these expressions, the left-hand-side variable is repeated immediately on the right (e.g., x=x+y). These types of expression can be written in the compressed form x += y, where the operator += is called an assignment operator.

The binary arithmetic operators, +, -, *, /, and %, each have a corresponding assignment operator +=, -=, *=, /=, and %=. Thus, we can write x *= y + 1 rather than x = x * (y + 1). For completeness, we mention also the bitwise assignment operators: &=, |=, ^=, <<=, and >>=.
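The grouping of the right-hand side is the detail most easily misread, so it is sketched below together with one bitwise assignment. The function names scale and add_flags are hypothetical.

```c
/* Multiplies x by (y + 1). The whole right-hand side is evaluated
   first, so this is x = x * (y + 1), not x = x * y + 1. */
int scale(int x, int y)
{
    x *= y + 1;
    return x;
}

/* Sets the bits of mask in flags using a bitwise assignment operator. */
unsigned int add_flags(unsigned int flags, unsigned int mask)
{
    flags |= mask;
    return flags;
}
```

For example, scale(3, 4) returns 15 rather than 13.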

 

Type Conversions and Casts : 
When an operator has operands of different types, they are converted to a common type according to a small number of rules.

For a binary expression such as a*b, the following rules are followed (assuming neither operand is unsigned):
• If either operand is long double, convert the other to long double.
• Otherwise, if either operand is double, convert the other to double.
• Otherwise, if either operand is float, convert the other to float.
• Otherwise, convert char and short to int, and, if either operand is long, convert the other to long.

If the two operands consist of a signed and an unsigned version of the same type, then the signed operand will be promoted to unsigned, with strange results if the previously signed value was negative.
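The "strange results" are easiest to see in a comparison. The sketch below performs the conversion in an explicit step so that it is visible; the function name is invented for this example, and the same conversion happens implicitly when an int is compared directly with an unsigned int.

```c
/* Compares a signed value with an unsigned one the way the usual
   conversions do: the signed operand is first converted to unsigned.
   Returns 1 if the converted value is less than b, 0 otherwise.
   (Hypothetical helper for illustration.) */
int less_than_after_conversion(int a, unsigned int b)
{
    unsigned int converted = a;  /* -1 becomes UINT_MAX */

    return converted < b;
}
```

Thus less_than_after_conversion(-1, 1u) returns 0: after conversion, -1 is the largest unsigned value and is therefore not less than 1.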

A simple example of type promotion is shown in the following code.

short a = 5;
int b = 10;
float c = 23.1f;
double d = c + a*b;

Here the multiply is performed first, so a is promoted to int and multiplied with b. The integer result of this expression is promoted to float and added to c. This result is then promoted to double and assigned to d.
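The same expression can be placed in a function whose parameter types match the example, with the conversions noted step by step. This is a sketch; the name mixed_expression is an assumption made for this example.

```c
/* Evaluates c + a*b with the same operand types as the example
   above and returns the result as a double. */
double mixed_expression(short a, int b, float c)
{
    /* a*b:        a is promoted to int; int * int gives int.
       c + (a*b):  the int is converted to float; float + float.
       return:     the float result is converted to double. */
    return c + a * b;
}
```

With the values from the example (5, 10, and 23.1f) the result is approximately 73.1; it is not exact because 23.1 cannot be represented exactly as a float.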

Note. The promotion from char to int is implementation-dependent, since whether a plain char is signed or unsigned depends on the compiler. Some platforms will perform “sign extension” if the left-most bit is 1, while others will fill the high-order bits with zeros, so the value is always positive.

Assignment to a “narrower” operand is possible, although information may be lost. Conversion to a narrower type should elicit a warning from good compilers. Conversion from a larger integer to a smaller one results in truncation of the higher-order bits, and conversion from floating-point to integer causes truncation of any fractional part. For example,

int iresult = 0.5 + 3/5.0;

The division 3/5.0 is promoted to type double so that the final summation equals 1.1. The result then is truncated to 1 in the assignment to iresult. Note, a conversion from double to float is implementation dependent and might be either truncated or rounded.

Narrowing conversions should be avoided. For the cases where they are necessary, they should be made explicit by a cast. For example,

int iresult = (int)(0.5 + 3/5.0);
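The effect of such a cast can be sketched with a one-line function; the name to_int is invented for this example.

```c
/* Converts a double to int with an explicit cast. The fractional
   part is discarded (truncation toward zero), not rounded. */
int to_int(double value)
{
    return (int) value;
}
```

Here to_int(0.5 + 3/5.0) is 1 and to_int(-1.9) is -1. Note also that to_int(0.5 + 3/5) is 0, because 3/5 is an integer division that yields 0 before any conversion to double takes place.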

Casts can also be used to coerce a conversion, such as going against the promotion rules specified above. For example, the expression

result = (float)5.0 + 3.f;

will add the two terms as floats rather than doubles.