Modern C++ Basics, Part 2

In this chapter we will learn about variables: their types, constants, literals, initialization, and scopes.

1.2 Variables

C++ is a strongly typed language (in contrast to many scripting languages). This means that every variable has a type and this type never changes. A variable is declared by a statement beginning with a type followed by a variable name with optional initialization—or a list thereof:

int     i1= 2;             // Alignment for readability only
int     i2, i3= 5;
float   pi= 3.14159;
double  x= -1.5e6;         // -1500000
double  y= -1.5e-6;        // -0.0000015
char    c1= 'a', c2= 35;
bool    cmp= i1 < pi,      // -> true
        happy= true;

The two slashes // here start a single-line comment; i.e., everything from the double slashes to the end of the line is ignored. In principle, this is all that really matters about comments. So as not to leave you with the feeling that something important on the topic is still missing, we will discuss it a little more in Section 1.9.1.

Back to the variables! Their basic types—also called Intrinsic Types—are given in Table 1–1.

Table 1–1: Intrinsic Types

Name	Semantics
char	letter and very short integer number
short	rather short integer number
int	regular integer number
long	long integer number
long long	very long integer number
unsigned	unsigned versions of all the former
signed	signed versions of all the former
float	single-precision floating-point number
double	double-precision floating-point number
long double	long floating-point number
bool	boolean

The first five types are integer numbers of non-decreasing length. For instance, int is at least as long as short; i.e., it is usually but not necessarily longer. The exact length of each type is implementation-dependent; e.g., int could be 16, 32, or 64 bits. All these types can be qualified as signed or unsigned. The former has no effect on integer numbers (except char) since they are signed by default.

When we declare an integer type as unsigned, we will have no negative values but twice as many positive ones (plus one when we consider zero as neither positive nor negative). signed and unsigned can be considered adjectives for the nouns short, int, et cetera with int as the default noun when the adjective only is declared.
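The length and signedness rules above can be checked by the compiler itself. The following is a minimal sketch, not part of the original text; it assumes a common platform where unsigned uses all of its bits for the value, which is why the last check is marked as an assumption.

```cpp
#include <limits>
#include <type_traits>

// Lengths are non-decreasing along char, short, int, long, long long.
static_assert(sizeof(char)  <= sizeof(short),     "char is not longer than short");
static_assert(sizeof(short) <= sizeof(int),       "short is not longer than int");
static_assert(sizeof(int)   <= sizeof(long),      "int is not longer than long");
static_assert(sizeof(long)  <= sizeof(long long), "long is not longer than long long");

// signed has no effect on int; unsigned alone means unsigned int.
static_assert(std::is_same<signed int, int>::value,        "int is signed by default");
static_assert(std::is_same<unsigned, unsigned int>::value, "int is the default noun");

// No negative values for unsigned ...
static_assert(std::numeric_limits<unsigned>::min() == 0, "unsigned starts at zero");

// ... but twice as many positive ones plus one (assumption: holds on
// common platforms, not spelled out by the text for every architecture).
static_assert(std::numeric_limits<unsigned>::max()
              == 2u * std::numeric_limits<int>::max() + 1u,
              "unsigned range is twice the positive int range plus one");
```

If one of these assertions failed, the program would not compile, so the checks cost nothing at run time.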

The type char can be used in two ways: for letters and for rather short numbers. Except on really exotic architectures, it almost always has a length of 8 bits. Thus, we can represent values either from -128 to 127 (signed) or from 0 to 255 (unsigned) and perform all numeric operations on them that are available for integers. When neither signed nor unsigned is declared, it depends on the implementation of the compiler which one is used. We can also represent any letter whose code fits into 8 bits. The two uses can even be mixed; e.g., 'a' + 7 usually leads to 'h', depending on the underlying coding of the letters. We strongly recommend not playing with this since the potential confusion will likely lead to a perceivable waste of time.

Using char or unsigned char for small numbers, however, can be useful when there are large containers of them.
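As a rough illustration of both uses of char, the sketch below compares the memory needed for a thousand small numbers and shifts a letter by a numeric offset. The names plain_char_is_signed and shift_letter are hypothetical helpers introduced only for this example, and the letter arithmetic assumes an ASCII-compatible coding.

```cpp
#include <limits>

// sizeof counts in units of char, so a char always has size 1.
static_assert(sizeof(char) == 1, "char is the unit of sizeof");

// A thousand small numbers need 1000 bytes as unsigned char,
// but 1000 * sizeof(int) bytes (typically 4000) as int.
static_assert(sizeof(unsigned char[1000]) == 1000, "one byte per element");
static_assert(sizeof(int[1000]) == 1000 * sizeof(int), "sizeof(int) bytes per element");

// Whether plain char is signed depends on the compiler.
constexpr bool plain_char_is_signed= std::numeric_limits<char>::is_signed;

// Hypothetical helper: treats the letter as a small number and adds n.
constexpr char shift_letter(char c, int n)
{
    return static_cast<char>(c + n);
}
```

With an ASCII-compatible coding, shift_letter('a', 7) yields 'h'; on other codings the result may differ, which is exactly why the text advises against relying on it.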

Logic values are best represented as bool. A boolean variable can store true and false.

The non-decreasing length property applies in the same manner to floating-point numbers: float is shorter than or equally as long as double, which in turn is shorter than or equally as long as long double. Typical sizes are 32 bits for float, 64 bits for double, and 80 bits for long double.
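The same kind of compile-time check works for the floating-point types. This is a small sketch under the assumption that comparing both the size and the number of mantissa digits is a fair reading of "length"; the actual numbers depend on the platform.

```cpp
#include <limits>

// Sizes are non-decreasing from float to double to long double.
static_assert(sizeof(float)  <= sizeof(double),      "float is not longer than double");
static_assert(sizeof(double) <= sizeof(long double), "double is not longer than long double");

// The same holds for the number of mantissa digits (in base radix).
static_assert(std::numeric_limits<float>::digits
              <= std::numeric_limits<double>::digits,
              "double is at least as precise as float");
static_assert(std::numeric_limits<double>::digits
              <= std::numeric_limits<long double>::digits,
              "long double is at least as precise as double");
```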

In the following section, we show operations that are often applied to integer and float types. In contrast to other languages like Python, where ' and " are used for both characters and strings, C++ distinguishes between the two of them. The C++ compiler considers 'a' as the character "a" (it has type char), whereas "a" is the string containing "a" and a binary 0 as termination (i.e., its type is const char[2]). If you are used to Python, please pay attention to this.
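The difference between the two kinds of quotes can be made visible with a few compile-time checks. The sketch below relies on decltype and std::is_same from the standard library, which the text has not introduced yet.

```cpp
#include <type_traits>

// Single quotes: one character of type char.
static_assert(std::is_same<decltype('a'), char>::value, "'a' is a char");
static_assert(sizeof('a') == 1, "one byte");

// Double quotes: an array holding the letter plus the terminating 0.
static_assert(sizeof("a") == 2, "letter and binary 0");
static_assert(std::is_same<decltype("a"), const char(&)[2]>::value,
              "\"a\" is an array of two constant chars");
```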

Advice

Declare variables as late as possible, usually right before using them the first time and whenever possible not before you can initialize them.

This makes programs more readable when they grow long. It also allows the compiler to use the memory more efficiently with nested scopes.

C++11 can deduce the type of a variable for us, e.g.:

auto i4= i3 + 7;

The type of i4 is the same as that of i3 + 7, which is int. Although the type is automatically determined, it remains the same, and whatever is assigned to i4 afterward will be converted to int. We will see later how useful auto is in advanced programming. For simple variable declarations like those in this section it is usually better to declare the type explicitly. auto will be discussed thoroughly in Section 3.4.
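To illustrate that the deduced type stays fixed, here is a minimal sketch. The function name auto_demo is hypothetical, and constexpr (C++14 style) is used only so that the result can be checked at compile time.

```cpp
#include <type_traits>

// Hypothetical demo function; returns the final value of i4.
constexpr int auto_demo()
{
    int  i3= 5;
    auto i4= i3 + 7;      // deduced as int, value 12
    static_assert(std::is_same<decltype(i4), int>::value, "i4 is an int");

    double d= 3.9;
    i4= d;                // converted to int: the fractional part is dropped
    return i4;
}
```

A call to auto_demo() returns 3, not 3.9, because i4 remains an int after its declaration.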

1.2.1 Constants

Syntactically, constants are like special variables in C++ with the additional attribute of constancy.

const int    ci1= 2;
const int    ci3;            // Error: no value
const float  pi= 3.14159;
const char   cc= 'a';
const bool   cmp= ci1 < pi;

As they cannot be changed, it is mandatory to set their values in the declaration. The second constant declaration violates this rule, and the compiler will not tolerate such misbehavior.

Constants can be used wherever variables are allowed—as long as they are not modified, of course. On the other hand, constants like those above are already known during compilation. This enables many kinds of optimizations, and the constants can even be used as arguments of types (we will come back to this later in §5.1.4).
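As a small sketch of a constant serving as an argument of a type, the example below uses std::array, whose size is part of the type. The alias name int_pair is an assumption made for illustration only.

```cpp
#include <array>

const int ci1= 2;                        // known during compilation

// The constant is used as an argument of a type:
// int_pair is an array type holding exactly ci1 ints.
using int_pair= std::array<int, ci1>;

static_assert(ci1 == 2, "the value is available to the compiler");
static_assert(int_pair{}.size() == 2, "the array type has two entries");
```

With a non-constant int in place of ci1, the alias would not compile because the size must be known at compile time.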

1.2.2 Literals

Literals like 2 or 3.14 are typed as well. Simply put, decimal integer literals are treated as int, long, or long long (since C++11), whichever is the first that can hold the value. Every number with a dot or an exponent (e.g., 3e12 ≡ 3 · 10¹²) is considered a double.

Literals of other types can be written by adding a suffix from the following table:

Literal	Type
2	int
2u	unsigned
2l	long
2ul	unsigned long
2.0	double
2.0f	float
2.0l	long double
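The table can be verified entry by entry with decltype, which yields the type of an expression. This is a sketch using std::is_same from <type_traits>, not something the text itself relies on.

```cpp
#include <type_traits>

static_assert(std::is_same<decltype(2),    int>::value,           "2 is an int");
static_assert(std::is_same<decltype(2u),   unsigned>::value,      "2u is unsigned");
static_assert(std::is_same<decltype(2l),   long>::value,          "2l is a long");
static_assert(std::is_same<decltype(2ul),  unsigned long>::value, "2ul is an unsigned long");
static_assert(std::is_same<decltype(2.0),  double>::value,        "2.0 is a double");
static_assert(std::is_same<decltype(2.0f), float>::value,         "2.0f is a float");
static_assert(std::is_same<decltype(2.0l), long double>::value,   "2.0l is a long double");
```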

In most cases, it is not necessary to declare the type of literals explicitly since the implicit conversion (a.k.a. Coercion) between built-in numeric types usually yields the values the programmer expects.

There are, however, three major reasons why we should pay attention to the types of literals:

Availability: The standard library provides a type for complex numbers where the type for the real and imaginary parts can be parameterized by the user:

std::complex<float> z(1.3, 2.4), z2;

Unfortunately, operations are only provided between the type itself and the underlying real type (and arguments are not converted here). As a consequence, we cannot multiply z by an int or a double, only by a float:

z2= 2 * z;       // Error: no int * complex<float>
z2= 2.0 * z;     // Error: no double * complex<float>
z2= 2.0f * z;    // Okay: float * complex<float>

Ambiguity: When a function is overloaded for different argument types (§1.5.4), an argument like 0 might be ambiguous whereas a unique match may exist for a qualified argument like 0u.
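A minimal sketch of such an ambiguity follows. The overloaded function which is hypothetical; it merely reports which overload was chosen.

```cpp
// Two hypothetical overloads that differ only in the argument type.
constexpr int which(long)     { return 1; }
constexpr int which(unsigned) { return 2; }

// which(0);   // Error: 0 is an int and converts equally well to long and unsigned
static_assert(which(0u) == 2, "0u matches the unsigned overload exactly");
static_assert(which(0l) == 1, "0l matches the long overload exactly");
```

The unqualified call is rejected because neither conversion of the int literal is preferred over the other, whereas the suffixed literals each have an exact match.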

Accuracy: The accuracy issue comes up when we work with long double. Since the non-qualified literal is a double, we might lose digits before we assign it to a long double variable:

long double third1= 0.3333333333333333333;     // may lose digits
long double third2= 0.3333333333333333333l;    // accurate
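Whether digits are actually lost depends on the platform, since long double is not more precise than double everywhere. The sketch below, with the hypothetical constant long_double_is_wider, only compares the two values when long double really offers more mantissa digits.

```cpp
#include <limits>

constexpr long double third1= 0.3333333333333333333;    // double literal, then converted
constexpr long double third2= 0.3333333333333333333l;   // long double literal

// Hypothetical flag: true when long double has more mantissa digits than double.
constexpr bool long_double_is_wider=
    std::numeric_limits<long double>::digits > std::numeric_limits<double>::digits;

// Where long double is wider, the two constants differ in the last digits.
static_assert(!long_double_is_wider || third1 != third2,
              "the double literal lost digits");
```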

If the previous three paragraphs were too brief for your taste, there is a more detailed version in Section A.2.1.

Non-decimal Numbers: Integer literals starting with a zero are interpreted as octal numbers, e.g.:

int o1= 042;         // int o1= 34;
int o2= 084;         // Error! No 8 or 9 in octals!

Hexadecimal literals can be written by prefixing them with 0x or 0X:

int h1= 0x42;        // int h1= 66;
int h2= 0xfa;        // int h2= 250;

C++14 introduces binary literals which are prefixed by 0b or 0B:

int b1= 0b11111010;  // int b1= 250;

To improve readability of long literals, C++14 allows us to separate the digits with apostrophes:

long              d=   6'546'687'616'861'129l;
unsigned long     ulx= 0x139'ae3b'2ab0'94f3;
int               b=   0b101'1001'0011'1010'1101'1010'0001;
const long double pi=  3.141'592'653'589'793'238'462l;
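The values stated in the comments above can be confirmed at compile time. The following sketch assumes a compiler with C++14 support for the binary literals and digit separators.

```cpp
// Octal, hexadecimal, and binary notation denote ordinary integer values.
static_assert(042 == 34,         "octal 42 is decimal 34");
static_assert(0x42 == 66,        "hexadecimal 42 is decimal 66");
static_assert(0xfa == 250,       "hexadecimal fa is decimal 250");
static_assert(0XFA == 0xfa,      "prefix and digits are case-insensitive");
static_assert(0b11111010 == 250, "binary 11111010 is decimal 250");

// Apostrophes only group digits; they do not change the value.
static_assert(1'000'000 == 1000000,  "separators in decimal literals");
static_assert(0b1111'1010 == 0xfa,   "separators in binary literals");
```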

String literals are typed as arrays of char:

char s1[]= "Old C style"; // better not

However, these arrays are everything but convenient and we are better off with the true string type from the library <string>. It can be created directly from a string literal:

#include <string>

std::string s2= "In C++ better like this";

Very long text can be split into multiple sub-strings:

std::string s3= "This is a very long and clumsy text"
                "that is too long for one line.";

For more details on literals, see for instance [43, §6.2].

1.2.3 Non-narrowing Initialization

Say we initialize a long variable with a long number:

long l2= 1234567890123;

This compiles just fine and works correctly—when long takes 64 bits as on most 64-bit platforms. When long is only 32 bits long (we can emulate this by compiling with flags like -m32), the value above is too long. However, the program will still compile (maybe with a warning) and runs with another value, e.g., where the leading bits are cut off.

C++11 introduces an initialization that ascertains that no data is lost or in other words that the values are not Narrowed. This is achieved with the Uniform Initialization or Braced Initialization that we only touch upon here and expand in Section 2.3.4. Values in braces cannot be narrowed:

long l= {1234567890123};

Now, the compiler will check if the variable l can hold the value on the target architecture.

The compiler's narrowing protection allows us to verify that values do not lose precision in initializations. An ordinary initialization of an int by a floating-point number is allowed due to implicit conversion:

int i1= 3.14;        // compiles despite narrowing (our risk)
int i1n= {3.14};     // Narrowing ERROR: fractional part lost

The new initialization form in the second line forbids this because it cuts off the fractional part of the floating-point number. Likewise, assigning negative values to unsigned variables or constants is tolerated with traditional initialization but denounced in the new form:

unsigned u2= -3;     // Compiles despite narrowing (our risk)
unsigned u2n= {-3};  // Narrowing ERROR: no negative values

In the previous examples, we used literal values in the initializations and the compiler checks whether a specific value is representable with that type:

float f1= {3.14};    // okay

Well, the value 3.14 cannot be represented with absolute accuracy in any binary floating-point format, but the compiler can set f1 to the value closest to 3.14. When a float is initialized from a double variable or constant (not a literal), we have to consider all possible double values and whether they are all convertible to float in a loss-free manner.

double d;
...
float f2= {d};       // narrowing ERROR

Note that the narrowing can be mutual between two types:

unsigned u3= {3};
int      i2= {2};

unsigned u4= {i2};   // narrowing ERROR: no negative values
int      i3= {u3};   // narrowing ERROR: not all large values

The types signed int and unsigned int have the same size, but not all values of each type are representable in the other.
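The following sketch collects braced initializations that pass the narrowing check, with the rejected ones left as comments. It uses long long instead of long (an assumption made so the example compiles on 32-bit platforms too) and constexpr so that the values can be verified at compile time.

```cpp
// The literal's value is checked against the target type.
constexpr long long big= {1234567890123};   // fits: long long has at least 64 bits
constexpr unsigned  u3=  {3};               // fits: 3 is not negative
constexpr float     f1=  {3.14};            // okay: set to the closest float value

// unsigned u2n= {-3};    // narrowing ERROR: no negative values
// int      i1n= {3.14};  // narrowing ERROR: fractional part lost

static_assert(big == 1234567890123, "no leading bits were cut off");
static_assert(u3 == 3u,             "value preserved");
static_assert(f1 > 3.13f && f1 < 3.15f, "close to 3.14");
```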

1.2.4 Scopes

Scopes determine the lifetime and visibility of (non-static) variables and constants and contribute to establishing a structure in our programs.

1.2.4.1 Global Definition

Every variable that we intend to use in a program must have been declared with its type specifier at an earlier point in the code. A variable can be located in either the global or local scope. A global variable is declared outside all functions. After their declaration, global variables can be referred to from anywhere in the code, even inside functions. This sounds very handy at first because it makes the variables easily available, but when your software grows, it becomes more difficult and painful to keep track of the global variables’ modifications. At some point, every code change bears the potential of triggering an avalanche of errors.

Advice

Do not use global variables.

If you do use them, sooner or later you will regret it. Believe us. Global constants like

const double pi= 3.14159265358979323846264338327950288419716939;

are fine because they cannot cause side effects.

1.2.4.2 Local Definition

A local variable is declared within the body of a function. Its visibility/availability is limited to the { }-enclosed block of its declaration. More precisely, the scope of a variable starts with its declaration and ends with the closing brace of the declaration block.

If we define π in the function main:

#include <iostream>

int main ()
{
    const double pi= 3.14159265358979323846264338327950288419716939;
    std::cout << "pi is " << pi << ".\n";
}

the variable π only exists in the main function. We can define blocks within functions and within other blocks:

int main ()
{
    {
        const double pi= 3.14159265358979323846264338327950288419716939;
    }
    std::cout << "pi is " << pi << ".\n"; // ERROR: pi is out of scope
}

In this example, the definition of π is limited to the block within the function, and an output in the remainder of the function is therefore an error:

"pi" is not defined in this scope.

because π is Out of Scope.
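A common way to keep a local definition in a nested block and still use its effect afterward is to store the result in a variable of the enclosing scope. This is a minimal sketch; the function name twice_pi is hypothetical, and constexpr is used only to allow a compile-time check.

```cpp
// Hypothetical demo: pi lives only in the inner block, result outlives it.
constexpr double twice_pi()
{
    double result= 0.0;            // declared in the function's scope
    {
        const double pi= 3.14159265358979323846;
        result= 2.0 * pi;          // pi is visible here
    }                              // end of pi's scope
    // pi cannot be used here anymore, but result still holds the value
    return result;
}
```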

1.2.4.3 Hiding

When a variable with the same name exists in nested scopes, then only one variable is visible. The variable in the inner scope hides the homonymous variables in the outer scopes. For instance:

int main ()
{
    int a= 5;           // define a#1
    {
        a= 3;           // assign a#1, a#2 is not defined yet
        int a;          // define a#2
        a= 8;           // assign a#2, a#1 is hidden
        {
            a= 7;       // assign a#2
        }
    }                   // end of a#2's scope
    a= 11;              // assign to a#1 (a#2 out of scope)

    return 0;
}

Due to hiding, we must distinguish the lifetime and the visibility of variables. For instance, a#1 lives from its declaration until the end of the main function. However, it is only visible from its declaration until the declaration of a#2 and again after closing the block containing a#2. In fact, the visibility is the lifetime minus the time when it is hidden.
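To observe which variable each assignment reaches, the sketch below varies the example slightly: a#2 is initialized at its declaration, and the hypothetical variable inner_seen records its last value. The function name hiding_demo is likewise an assumption for illustration.

```cpp
// Hypothetical demo: encodes the final values of a#1 and a#2 in one number.
constexpr int hiding_demo()
{
    int a= 5;               // define a#1
    int inner_seen= 0;
    {
        a= 3;               // assign a#1, a#2 is not defined yet
        int a= 8;           // define a#2, a#1 is hidden from here on
        {
            a= 7;           // assign a#2
        }
        inner_seen= a;      // reads a#2, which is 7
    }                       // end of a#2's scope
    return 10 * a + inner_seen;   // a is a#1 again, which is 3
}
```

The function returns 37: a#1 kept the value 3 from the first inner assignment, and none of the later assignments inside the block touched it.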

Defining the same variable name twice in one scope is an error.

The advantage of scopes is that we do not need to worry about whether a variable is already defined somewhere outside the scope. It is just hidden but does not create a conflict. Unfortunately, the hiding makes the homonymous variables in the outer scope inaccessible. We can cope with this to some extent with clever renaming. A better solution, however, to manage nesting and accessibility is namespaces; see Section 3.2.1.

static variables are the exception that confirms the rule: they live till the end of the execution but are only visible in the scope. We are afraid that their detailed introduction is more distracting than helpful at this stage and have postponed the discussion to Section A.2.2.


Reference Book by Peter Gottschling
