# Modern C++ Basics Part -8 Arrays, Pointers, and References

modern c++,Arrays, Pointers, and References

## 1.8 Modern c++,Arrays, Pointers, and References

### 1.8.1 Arrays

The intrinsic array support of modern C++ has certain limitations and some strange behaviors. Nonetheless, we feel that every C++ programmer should know it and be aware of its problems.

An array is declared as follows:

`int x[10];`

The variableÂ xÂ is an array with 10Â intÂ entries. In standard C++, the size of the array must be constant and known at compile time. Some compilers (e.g.,Â gcc) support run-time sizes.

Arrays are accessed by square brackets:Â x[i]Â is a reference to theÂ i-th element ofÂ x. The first element isÂ x[0]; the last one isÂ x[9]. Arrays can be initialized at the definition:

`float v[]= {1.0, 2.0, 3.0}, w[]= {7.0, 8.0, 9.0};`

In this case, the array size is deduced.

The list initialization in C++11 cannot be narrowed any further. This will rarely make a difference in practice. For instance, the following:

`int v[]= {1.0, 2.0, 3.0};    // Error in C++11: narrowing`

was legal in C++03 but not in C++11 since the conversion from a floating-point literal toÂ intÂ potentially loses precision. However, we would not write such ugly code anyway.

Operations on arrays are typically performed in loops; e.g., to computeÂ xÂ =Â vÂ â€“ 3wÂ as a vector operation is realized by

```float x[3];
for (int i= 0; i < 3; ++i)
x[i]= v[i] - 3.0 * w[i];```

We can also define arrays of higher dimensions:

```float A[7][9];     // a 7 by 9 matrix
int   q[3][2][3];  // a 3 by 2 by 3 array```

The language does not provide linear algebra operations upon the arrays. Implementations based on arrays are inelegant and error-prone. For instance, a function for a vector addition would look like this:

```void vector_add(unsigned size, const double v1[], const double v2[],
double s[])
{
for (unsigned i= 0; i < size; ++i)
s[i]= v1[i] + v2[i];
}```

Note that we passed the size of the arrays as first function parameter whereas array parameters donâ€™t contain size information.11Â In this case, the functionâ€™s caller is responsible for passing the correct size of the arrays:

```int main ()
{
double x[]= {2, 3, 4}, y[]= {4, 2, 0}, sum[3];
...
}```

Since the array size is known during compilation, we can compute it by dividing the byte size of the array by that of a single entry:

`vector_add(sizeof x / sizeof x[0], x, y, sum);`

With this old-fashioned interface, we are also unable to test whether our arrays match in size. Sadly enough, C and Fortran libraries with such interfaces where size information is passed as function arguments are still realized today. They crash at the slightest user mistake, and it can take enormous efforts to trace back the reasons for crashing. For that reason, we will show in this book how we can realize our own math software that is easier to use and less prone to errors. Hopefully, future C++ standards will come with more higher mathematics, especially a linear-algebra library.

Arrays have the following two disadvantages:

• Indices are not checked before accessing an array, and we can find ourselves outside the array and the program crashes with segmentation fault/violation. This is not even the worst case; at least we see that something goes wrong. The false access can also mess up our data; the program keeps running and produces entirely wrong results with whatever consequence you can imagine. We could even overwrite the program code. Then our data is interpreted as machine operations leading to any possible nonsense.
• The size of the array must be known at compile time.12Â For instance, we have an array stored to a file and need to read it back into memory:`ifstream ifs("some_array.dat"); ifs â‰« size; float v[size]; // Error: size not known at compile time`This does not work because the size needs to be known during compilation.

The first problem can only be solved with new array types and the second one with dynamic allocation. This leads us to pointers.

### 1.8.2 Pointers

A pointer is a variable that contains a memory address. This address can be that of another variable that we can get with the address operator (e.g.,Â &x) or dynamically allocated memory. Letâ€™s start with the latter as we were looking for arrays of dynamic size.

`int* y= new int[10];`

This allocates an array of 10Â int. The size can now be chosen at run time. We can also implement the vector reading example from the previous section:

```ifstream ifs("some_array.dat");
int size;
ifs â‰« size;
float* v= new float[size];
for (int i= 0; i < size; ++i)
ifs â‰« v[i];```

Pointers bear the same danger as arrays: accessing data out of range which can cause program crashes or silent data invalidation. When dealing with dynamically allocated arrays, it is the programmerâ€™s responsibility to store the array size.

Furthermore, the programmer is responsible for releasing the memory when not needed anymore. This is done by

`delete[] v;`

Since arrays as function parameters are treated internally as pointers, theÂ vector_addÂ function from page 47 works with pointers as well:

```int main (int argc, char* argv[])
{
double *x= new double[3], *y= new double[3], *sum= new double[3];
for (unsigned i= 0; i < 3; ++i)
x[i]= i+2, y[i]= 4-2*i;
...
}```

With pointers, we cannot use theÂ sizeofÂ trick; it would only give us the byte size of the pointer itself which is of course independent of the number of entries. Other than that, pointers and arrays are interchangeable in most situations: a pointer can be passed as an array argument (as in the previous listing) and an array as a pointer argument. The only place where they are really different is the definition: whereas defining an array of sizeÂ nÂ reserves space forÂ nÂ entries, defining a pointer only reserves the space to hold an address.

Since we started with arrays, we took the second step before the first one regarding pointer usage. The simple use of pointers is allocating one single data item:

`int* ip= new int;`

Releasing this memory is performed by

`delete ip;`

Note the duality of allocation and release: the single-object allocation requires a single-object release and the array allocation demands an array release. Otherwise the run-time system will handle the deallocation incorrectly and most likely crash at this point. Pointers can also refer to other variables:

```int  i= 3;
int* ip2= &i;```

The operatorÂ &Â takes an object and returns its address. The opposite operator isÂ *Â which takes an address and returns an object:

`int  j= *ip2;`

This is calledÂ Dereferencing. Given the operator priorities and the grammar rules, the meaning of the symbolÂ *Â as dereference or multiplication cannot be confusedâ€”at least not by the compiler.

Pointers that are not initialized contain a random value (whatever bits are set in the corresponding memory). Using uninitialized pointers can cause any kind of error. To say explicitly that a pointer is not pointing to something, we should set it to

```int* ip3= nullptr;    // >= C++11
int* ip4{};           // ditto```

or in old compilers:

```int* ip3= 0;          // better not in C++11 and later
int* ip4= NULL;       // ditto```

The address 0 is guaranteed never to be used for applications, so it is safe to indicate this way that the pointer is empty (not referring to something). Nonetheless the literal 0 does not clearly convey its intention and can cause ambiguities in function overloading. The macroÂ NULLÂ is not better: it just evaluates toÂ 0. C++11 introducesÂ nullptrÂ as a keyword for a pointer literal. It can be assigned to or compared with all pointer types. As it cannot be confused with other types and is self-explanatory, it is preferred over the other notations. The initialization with an empty braced list also sets aÂ nullptr.

The biggest danger of pointers isÂ Memory Leaks. For instance, our arrayÂ yÂ became too small and we want to assign a new array:

`int* y= new int[15];`

We can now use more space inÂ y. Nice. But what happened to the memory that we allocated before? It is still there but we have no access to it anymore. We cannot even release it because this requires the address too. This memory is lost for the rest of our program execution. Only when the program is finished will the operating system be able to free it. In our example, we only lost 40 bytes out of several gigabytes that we might have. But if this happens in an iterative process, the unused memory grows continuously until at some point the whole (virtual) memory is exhausted.

Even if the wasted memory is not critical for the application at hand, when we write high-quality scientific software, memory leaks are unacceptable. When many people are using our software, sooner or later somebody will criticize us for it and eventually discourage other people from using our software. Fortunately, there are tools to help you to find memory leaks, as demonstrated in Section B.3.

The demonstrated issues with pointers are not intended as fun killers. And we do not discourage the use of pointers. Many things can only be achieved with pointers: lists, queues, trees, graphs, et cetera. But pointers must be used with utter care to avoid all the really severe problems mentioned above.

There are three strategies to minimize pointer-related errors:

• Use standard containers:Â from the standard library or other validated libraries.Â std::vectorÂ from the standard library provides us all the functionality of dynamic arrays, including resizing and range check, and the memory is released automatically.
• Encapsulate:Â dynamic memory management in classes. Then we have to deal with it only once per class.13Â When all memory allocated by an object is released when the object is destroyed, then it does not matter how often we allocate memory. If we have 738 objects with dynamic memory, then it will be released 738 times. The memory should be allocated in the object construction and deallocated in its destruction. This principle is calledÂ Resource Allocation Is InitializationÂ (RAII). In contrast, if we calledÂ newÂ 738 times, partly in loops and branches, can we be sure that we have calledÂ deleteÂ exactly 738 times? We know that there are tools for this but these are errors that are better to prevent than to fix.14Â Of course, the encapsulation idea is not idiot-proof but it is much less work to get it right than sprinkling (raw) pointers all over our program. We will discuss RAII in more detail in Section 2.4.2.1.
• Use smart pointers:Â which we will introduce in the next section (Â§1.8.3).

Pointers serve two purposes:

• Referring to objects; and
• Managing dynamic memory.

The problem with so-calledÂ Raw PointersÂ is that we have no notion whether a pointer is only referring to data or also in charge of releasing the memory when it is not needed any longer. To make this distinction explicit at the type level, we can useÂ Smart Pointers.

### 1.8.3 Smart Pointers

Three new smart-pointer types are introduced in C++11:Â unique_ptr,Â shared_ptr, andÂ weak_ptr. The already existing smart pointer from C++03 namedÂ auto_ptrÂ is generally considered as a failed attempt on the way toÂ unique_ptrÂ since the language was not ready at the time. It should not be used anymore. All smart pointers are defined in the headerÂ <memory>. If you cannot use C++11 features on your platform (e.g., in embedded programming), the smart pointers in Boost are a decent replacement.

#### 1.8.3.1 Unique Pointer

This pointerâ€™s name indicatesÂ Unique OwnershipÂ of the referred data. It can be used essentially like an ordinary pointer:

```#include <memory>

int main ()
{
unique_ptr<double> dp{new double};
*dp= 7;
...
}```

The main difference from a raw pointer is that the memory is automatically released when the pointer expires. Therefore, it is a bug to assign addresses that are not allocated dynamically:

```double d;
unique_ptr<double> dd{&d}; // Error: causes illegal deletion```

The destructor of pointerÂ ddÂ will try to deleteÂ d.

Unique pointers cannot be assigned to other pointer types or implicitly converted. For referring to the pointerâ€™s data in a raw pointer, we can use the member functionÂ get:

`double* raw_dp= dp.get();`

It cannot even be assigned to another unique pointer:

```unique_ptr<double> dp2{dp}; // Error: no copy allowed
dp2= dp;                    // ditto```

It can only be moved:

```unique_ptr<double> dp2{move(dp)}, dp3;
dp3= move(dp2);```

We will discuss move semantics in Section 2.3.5. Right now let us just say this much: whereas a copy duplicates the data, aÂ MoveÂ transfers the data from the source to the target. In our example, the ownership of the referred memory is first passed fromÂ dpÂ toÂ dp2Â and then toÂ dp3.Â dpÂ andÂ dp2Â areÂ nullptrÂ afterward, and the destructor ofÂ dp3Â will release the memory. In the same manner, the memoryâ€™s ownership is passed when aÂ unique_ptrÂ is returned from a function. In the following example,Â dp3Â takes over the memory allocated inÂ f():

```std::unique_ptr<double> f()
{    return std::unique_ptr<double>{new double}; }

int main ()
{
unique_ptr<double> dp3;
dp3= f();
}```

In this case,Â move()Â is not needed since the function result is a temporary that will be moved (again, details in Â§2.3.5).

Unique pointer has a special implementation15Â for arrays. This is necessary for properly releasing the memory (withÂ delete[]). In addition, the specialization provides array-like access to the elements:

```unique_ptr<double[]> da{new double[3]};
for (unsigned i= 0; i < 3; ++i)
da[i]= i+2;```

In return, theÂ operator*Â is not available for arrays.

An important benefit ofÂ unique_ptrÂ is that it has absolutely no overhead over raw pointers: neither in time nor in memory.

Further reading:Â An advanced feature of unique pointers is to provide our ownÂ Deleter; for details see [26, Â§5.2.5f], [43, Â§34.3.1], or an online reference (e.g.,Â cppreference.com).

#### 1.8.3.2 Shared Pointer

As its name indicates, aÂ shared_ptrÂ manages memory that is used in common by multiple parties (each holding a pointer to it). The memory is automatically released as soon as noÂ shared_ptrÂ is referring the data any longer. This can simplify a program considerably, especially with complicated data structures. An extremely important application area is concurrency: the memory is automatically freed when all threads have terminated their access to it.

In contrast to aÂ unique_ptr, aÂ shared_ptrÂ can be copied as often as desired, e.g.:

```shared_ptr<double> f()
{
shared_ptr<double> p1{new double};
shared_ptr<double> p2{new double}, p3= p2;
cout â‰ª "p3.use_count() = " â‰ª p3.use_count() â‰ª endl;
return p3;
}

int main ()
{
shared_ptr<double> p= f();
cout â‰ª "p.use_count() = " â‰ª p.use_count() â‰ª endl;
}```

In the example, we allocated memory for twoÂ doubleÂ values: inÂ p1Â and inÂ p2. The pointerÂ p2Â is copied intoÂ p3Â so that both point to the same memory as illustrated inÂ Figure 1â€“1.

Figure 1â€“1:Â Shared pointer in memory

We can see this from the output ofÂ use_count:

```p3.use_count() = 2
p.use_count() = 1```

When the function returns, the pointers are destroyed and the memory referred to byÂ p1Â is released (without ever being used). The second allocated memory block still exists sinceÂ pÂ from theÂ mainÂ function is still referring to it.

If possible, aÂ shared_ptrÂ should be created withÂ make_shared:

`shared_ptr<double> p1= make_shared<double>();`

Then the management and business data are stored together in memoryâ€”as shown inÂ Figure 1â€“2â€”and the memory caching is more efficient. SinceÂ make_sharedÂ returns a shared pointer, we can use automatic type detection (Â§3.4.1) for simplicity:

`auto p1= make_shared<double>();`

Figure 1â€“2: Shared pointer in memory afterÂ make_shared

We have to admit that aÂ shared_ptrÂ has some overhead in memory and run time. On the other hand, the simplification of our programs thanks toÂ shared_ptrÂ is in most cases worth some small overhead.

Further reading:Â For deleters and other details ofÂ shared_ptrÂ see the library reference [26, Â§5.2], [43, Â§34.3.2], or an online reference.

#### 1.8.3.3 Weak Pointer

A problem that can occur with shared pointers isÂ Cyclic ReferencesÂ that impede the memory to be released. Such cycles can be broken byÂ weak_ptrs. They do not claim ownership of the memory, not even a shared one. At this point, we only mention them for completeness and suggest that you read appropriate references when their need is established: [26, Â§5.2.2], [43, Â§34.3.3], orÂ cppreference.com.

For managing memory dynamically, there is no alternative to pointers. To only refer to other objects, we can use another language feature calledÂ ReferenceÂ (surprise, surprise), which we introduce in the next section.

### 1.8.4 References

The following code introduces a reference:

```int i= 5;
int& j= i;
j= 4;
std::cout â‰ª "j = " â‰ª j â‰ª '\n';```

The variableÂ jÂ is referring toÂ i. ChangingÂ jÂ will also alterÂ iÂ and vice versa, as in the example.Â iÂ andÂ jÂ will always have the same value. One can think of a reference as an alias: it introduces a new name for an existing object or sub-object. Whenever we define a reference, we must directly declare what it is referring to (other than pointers). It is not possible to refer to another variable later.

So far, that does not sound extremely useful. References are extremely useful for function arguments (Â§1.5), for referring to parts of other objects (e.g., the seventh entry of a vector), and for building views (e.g., Â§5.2.3).

As a compromise between pointers and references, the new standard offers aÂ reference_wrapperÂ class which behaves similarly to references but avoids some of their limitations. For instance, it can be used within containers; see Â§4.4.2.

### 1.8.5 Comparison between Pointers and References

The main advantage of pointers over references is the ability of dynamic memory management and address calculation. On the other hand, references are forced to refer to existing locations.16Â Thus, they do not leave memory leaks (unless you play really evil tricks), and they have the same notation in usage as the referred object. Unfortunately, it is almost impossible to construct containers of references.

In short, references are not fail-safe but are much less error-prone than pointers. Pointers should be only used when dealing with dynamic memory, for instance when we create data structures like lists or trees dynamically. Even then we should do this via well-tested types or encapsulate the pointer(s) within a class whenever possible. Smart pointers take care of memory allocation and should be preferred over raw pointers, even within classes. The pointer-reference comparison is summarized in Table 1-9.

### 1.8.6 Do Not Refer to Outdated Data!

Function-local variables are only valid within the functionâ€™s scope, for instance:

```double& square_ref(double d) // DO NOT!
{
double s= d * d;
return s;
}```

Here, our function result refers the local variableÂ sÂ which does not exist anymore. The memory where it was stored is still there and we might be lucky (mistakenly) that it is not overwritten yet. But this is nothing we can count on. Actually, such hidden errors are even worse than the obvious ones because they can ruin our program only under certain conditions and then they are very hard to find.

Such references are calledÂ Stale References. Good compilers will warn us when we are referring to a local variable. Sadly enough, we have seen such examples in web tutorials.

The same applies to pointers:

```double* square_ptr(double d) // DO NOT!
{
double s= d * d;
return &s;
}```

This pointer holds a local address that has gone out of scope. This is called aÂ Dangling Pointer.

Returning references or pointers can be correct in member functions when member data is referred to; see Section 2.6.

Only return pointers and references to dynamically allocated data, data that existed before the function was called, or static data.

### 1.8.7 Containers for Arrays

As alternatives to the traditional C arrays, we want to introduce two container types that can be used in similar ways.

#### 1.8.7.1 Standard Vector

Arrays and pointers are part of the C++ core language. In contrast,Â std::vectorÂ belongs to the standard library and is implemented as a class template. Nonetheless, it can be used very similarly to arrays. For instance, the example from Section 1.8.1 of setting up two arraysÂ vÂ andÂ wÂ looks for vectors as follows:

```#include <vector>

int main ()
{
std::vector<float> v(3), w(3);
v[0]= 1; v[1]= 2; v[2]= 3;
w[0]= 7; w[1]= 8; w[2]= 9;
}```

The size of the vector does not need to be known at compile time. Vectors can even be resized during their lifetime, as will be shown in Section 4.1.3.1.

The element-wise setting is not particularly compact. C++11 allows the initialization with initializer lists:

`std::vector<float> v= {1, 2, 3}, w= {7, 8, 9};`

In this case, the size of the vector is implied by the length of the list. The vector addition shown before can be implemented more reliably:

```void vector_add(const vector<float>& v1, const vector<float>& v2,
vector<float>& s)
{
assert(v1.size() == v2.size());
assert(v1.size() == s.size());
for (unsigned i= 0; i < v1.size(); ++i)
s[i]= v1[i] + v2[i];
}```

In contrast to C arrays and pointers, theÂ vectorÂ arguments know their sizes and we can now check whether they match. Note: The array size can be deduced with templates, which we leave as an exercise for later (see Â§3.11.9).

Vectors are copyable and can be returned by functions. This allows us to use a more natural notation:

```vector<float> add(const vector<float>& v1, const vector<float>& v2)
{
assert(v1.size() == v2.size());
vector<float> s(v1.size());
for (unsigned i= 0; i < v1.size(); ++i)
s[i]= v1[i] + v2[i];
return s;
}

int main ()
{
std::vector<float> v= {1, 2, 3}, w= {7, 8, 9}, s= add (v, w);
}```

This implementation is potentially more expensive than the previous one where the target vector is passed in as a reference. We will later discuss the possibilities of optimization: both on the compiler and on the user side. In our experience, it is more important to start with a productive interface and deal with performance later. It is easier to make a correct program fast than to make a fast program correct. Thus, aim first for a good program design. In almost all cases, the favorable interface can be realized with sufficient performance.

The containerÂ std::vectorÂ is not a vector in the mathematical sense. There are no arithmetic operations. Nonetheless, the container proved very useful in scientific applications to handle non-scalar intermediate results.

#### 1.8.7.2Â valarray

AÂ valarrayÂ is a one-dimensional array with element-wise operations; even the multiplication is performed element-wise. Operations with a scalar value are performed respectively with each element of theÂ valarray. Thus, theÂ valarrayÂ of a floating-point number is a vector space.

The following example demonstrates some operations:

```#include <iostream>
#include <valarray>

int main ()
{
std::valarray<float> v= {1, 2, 3}, w= {7, 8, 9}, s= v + 2.0f * w;
v= sin(s);
for (float x : v)
std::cout â‰ª x â‰ª ' ';
std::cout â‰ª '\n';
}```

Note that aÂ valarray<float>Â can only operate with itself orÂ float. For instance,Â 2 * wÂ would fail since it is an unsupported multiplication ofÂ intÂ withÂ valarray<float>.

A strength ofÂ valarrayÂ is the ability to access slices of it. This allows us toÂ EmulateÂ matrices and higher-order tensors including their respective operations. Nonetheless, due to the lack of direct support of most linear-algebra operations,Â valarrayÂ is not widely used in the numeric community. We also recommend using established C++ libraries for linear algebra. Hopefully, future standards will contain one.

In Section A.2.9, we make some comments onÂ Garbage CollectionÂ which is essentially saying that we can live well enough without it.

Reference Book byÂ Peter Gottschling

(Visited 11 times, 1 visits today)