Introduction to C++

As a statistician and software development enthusiast, I believe that C++ is an excellent tool to speed up code written in your preferred language for data analysis, such as R, Python or Julia. There are several reasons why C++ is so useful, so I will share some notes for learning it, based on the edX course Introduction to C++.

What is C++?

The C++ programming language was created by Bjarne Stroustrup, who defines it as a general-purpose programming language with a bias towards systems programming that

  • is a better C
  • supports data abstraction
  • supports object-oriented programming
  • supports generic programming.

Basic Program Structure of C++ Code

The structure of a C++ program is quite similar to that of other languages, including R: first we declare the libraries used by the application, and then we write the body. The famous first program, “Hello World”, is shown below.

#include <iostream>

int main()
{
  std::cout << "Hello World!";
  return 0;
}
  • Line 1: Preprocessor directive that includes the header of the iostream library.
  • Line 3: The main function is required in every C++ program; it is the starting point of the application. int indicates that the function returns an integer.
  • Lines 4 and 7: Mark the beginning and end of the body of main.
  • Line 5: cout is the standard output stream object, used here to print “Hello World!”. The scope operator :: indicates that it is found in the std namespace.
  • Line 6: Returns 0 to signal that everything inside the body of main executed fine.

Compilation and Code Formatting

Once a C++ source file is written, it is built, which means that it goes through a process consisting of the work of the preprocessor, the compiler and the linker. In general terms, the preprocessor takes the code and makes some small modifications, such as expanding #include directives. The compiler then takes this output, checks the syntax and semantic rules, and accepts the promises that the things being used are defined in other files. Finally, the linker links the resulting object files into an executable program, ensuring that all the promises are kept.

On Linux, a C++ program can be built and executed as follows:

g++ hello-world.cpp # build the program
./a.out # executes the output

The first line uses g++ to compile the program, saving the output in the executable file a.out, and the second line executes this file.

C++ is case sensitive. A source file is made up of the following elements:

  • Preprocessor: prior tasks to code compiling.
  • Directives: namespaces to include.
  • Function header: Return type, function name and parameters.
  • Function body: code performed by the function.
  • Statements
  • Comments
  • Curly braces to enclose bodies of statements.
  • Arbitrary use of whitespace.

C++ Statements

  • Declarations: variables and constants.
  • Assignments: values to variables.
  • Preprocessor directives: prior tasks to code compiling.
  • Comments
  • Function declaration
  • Executable statements: e.g. cout << "Hello World!"

Data Types

Numeric Data

Name | Bytes | Alias | Approximate Range
--- | --- | --- | ---
int | 4 | signed | $-2 \times 10^9$ to $2 \times 10^9$
unsigned int | 4 | unsigned | $0$ to $4 \times 10^9$
__int8 | 1 | char | $-128$ to $127$
unsigned __int8 | 1 | unsigned char | $0$ to $255$
__int16 | 2 | short, short int, signed short int | $-32,768$ to $32,767$
unsigned __int16 | 2 | unsigned short, unsigned short int | $0$ to $65,535$
__int32 | 4 | signed, signed int, int | $-2 \times 10^9$ to $2 \times 10^9$
unsigned __int32 | 4 | unsigned, unsigned int | $0$ to $4 \times 10^9$
__int64 | 8 | long long, signed long long | $-9 \times 10^{18}$ to $9 \times 10^{18}$
unsigned __int64 | 8 | unsigned long long | $0$ to $18 \times 10^{18}$
short | 2 | short int, signed short int | $-32,768$ to $32,767$
unsigned short | 2 | unsigned short int | $0$ to $65,535$
long | 4 | long int, signed long int | $-2 \times 10^9$ to $2 \times 10^9$
unsigned long | 4 | unsigned long int | $0$ to $4 \times 10^9$
long long | 8 | none | $-9 \times 10^{18}$ to $9 \times 10^{18}$
unsigned long long | 8 | none | $0$ to $18 \times 10^{18}$
float | 4 | none | 3.4E +/- 38 (7 digits)
double | 8 | none | 1.7E +/- 308 (15 digits)
long double | 8 | none | 1.7E +/- 308 (15 digits)

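Note: 3.4E +/- 38 (7 digits) means that:

  • the smallest positive value is on the order of $10^{-38}$,
  • the largest positive value is about $3.4 \times 10^{38}$,
  • only about 7 significant decimal digits can be represented,
  • the same limits apply, with the opposite sign, to negative values.

The type names that start with __ are non-standard, compiler-specific types. The sizes listed are typical of a Microsoft compiler; the C++ standard only fixes minimum sizes, and long, for example, takes 8 bytes on most 64-bit Linux systems.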

Character Data

Name | Bytes | Alias | Approximate Range
--- | --- | --- | ---
char | 1 | none | -128 to 127 or 0 to 255
signed char | 1 | none | -128 to 127
unsigned char | 1 | none | 0 to 255
wchar_t | 2 or 4 | __wchar_t | 0 to $65 \times 10^3$ or $4 \times 10^9$

Other Data

Name | Bytes | Alias | Approximate Range
--- | --- | --- | ---
bool | 1 | none | true or false
enum | varies | none | dependent on the enclosed data types

Variables and Constants

Variables are named memory locations. When creating one, you must provide its data type. Variable names are case sensitive and must start with a letter or an underscore. Custom types can also be created.

int myVar = 0;
int youVar{1};

Be careful when assigning a value of a different data type than the one declared for a variable (for example, assigning a decimal number to an integer variable), because information can be lost.

Constants are also named memory locations, but their values cannot be changed. They are created with const, and the value must be assigned when the constant is created.

Explicit type conversion is done with a C-style cast or with the static_cast operator.

int myInt = 65;
float f = 2.5f;
long myLong = (long)myInt;              // C-style cast
long otherLong = long(myInt);           // function-style cast
char ch = static_cast<char>(myInt);     // int to char
double dbl = static_cast<double>(f);    // float to double
auto half = 3.0/2;                      // auto deduces double

Complex Data Types

Compound data types store more than one piece of data or more than one data type.

Arrays

An array is a collection of elements of the same type. Arrays of one, two and three dimensions can be seen as a list, a table and a cube, respectively. An array has the following features:

  • Every element contains a value.
  • Indexation starts from 0.
  • Its size is the number of elements.
  • Single or multi-dimensional.
  • Its rank is the dimension of the array.

The code below shows how to initialize, access and iterate through an array.

// Initialize
int emptyArray[10];                                 // values are not initialized
int arrayName[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};  // with values
int partialArray[10] = {1, 2, 3};                   // only some values, the rest are 0
// Accessing
int number = arrayName[2]; // value 3
// Iterating over an array
for (int i = 0; i < 10; i++)
{
     int element = arrayName[i];
     // ...
}

Strings

The last character of a C-style string is the null character \0.

// Basic way with a character array.
char myString[5] = {'c', 'h', 'a', 'r', '\0'};
cout << myString << endl;
// Create and initialize from a string literal.
char isAString[6] = "Hello";
char alsoAString[] = "Hello"; // not necessary to add the length
// With the string class.
std::string myNewString = "Much easier!";

Structures

Arrays can only store data of the same type. A structure groups data of different types under one name.

// Declare a new structure called user.
struct user
{
     std::string name;
     std::string country;
     int age;
};
// Create a user object with initial values.
user newUser = {"David", "Peru", 13};
// Create an object without initial values and fill in its fields.
user unkUser;
unkUser.name = "David";
unkUser.country = "Peru";
unkUser.age = 13;
// An int cannot be added to a string with +, so the pieces are chained with <<.
std::cout << "User " << newUser.name << " is from " << newUser.country << " and is " << newUser.age << " years old." << std::endl;

Unions

Similar to structures, but can only store one piece of data at a time.

union numericUnion
{
     int intValue;
     long longValue;
};
numericUnion myUnion;
myUnion.intValue = 3;
cout << myUnion.intValue << endl;
myUnion.longValue = 4; // longValue is now the only valid field
cout << myUnion.longValue << endl;
// cout << myUnion.intValue << endl; // undefined behavior: only the last field written is stored.

Enumerations

Enumerations create symbolic constants. A common case is the days of the week.

// By default the values start from 0:
// enum Day {Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday};
// but you can specify the starting value.
enum Day {Sunday = 2, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday};
Day payDay;
payDay = Thursday;
cout << payDay << endl; // shows 6, because internally they are numbers.

Control Statements

C++ Operators

Operator Description
+ addition
- subtraction
* multiplication
/ division
% modulo
+= (y += x) same as y = y + x
-= (y -= x) same as y = y - x
*= (y *= x) same as y = y * x
++ increment by 1
-- decrement by 1
== equal to
!= not equal to
> greater than
< less than
>= greater than or equal to
<= less than or equal to
&& logical AND
|| logical OR
! logical NOT

Decision Statements

Decision statements use conditions to control the behaviour of the program.

if statement

char response = 'y';
if (response == 'y' || response == 'Y')
{
    cout << "Positive response received" << endl;
} // without curly braces, only the next statement belongs to the if.

if else statement

string response;
if (response == "connection_failed")
{
    // Block of code executes if the value of the response variable is "connection_failed".
}
else
{
    // Block of code executes if the value of the response variable is not "connection_failed".
}

else if statement

string response;
if (response == "connection_failed")
{
    // Block of code executes if the value of the response variable is "connection_failed".
}
else if (response == "connection_error")
{
    // Block of code executes if the value of the response variable is "connection_error".
}
else
{
    // Block of code executes if the value of the response variable matches neither of the above.
}

switch statement

A switch statement can include a default case that runs when no other case is matched. C++ supports integral types, such as int and char, and enumerations in case labels.

char response = 'y';
switch (response)
{
   case 'y':
      // Block of code executes if the value of response is y.
      break;
   case 'Y':
      // Block of code executes if the value of response is Y.
      break;
   case 'n':
      // Block of code executes if the value of response is n.
      break;
   default:
      // Block executes if none of the above conditions are met.
      break;
}

The conditional operator

Similar to an if else statement, but more compact, using three operands. The first operand is the condition; the second operand is evaluated if the condition holds; otherwise, the third operand is evaluated.

#include <iostream>
using namespace std;
int main()
{
     int i = 1, j = 2;
     cout << ( i > j ? i : j ) << " is greater." << endl;
}

Repetition statements

Repetition through the use of loops.

for loop

A for loop has three parts separated by semicolons. The first part indicates how to get started; the second, when to continue the loop; and the last one, how to move on. This third part is executed at the end of each iteration.

for ([initializer(s)]; [condition]; [iterator])
{
   // code to repeat goes here
}
For example:
for (int i = 0 ; i < 10; i++)
{
  std::cout << i << std::endl;
}

while loop

Executes the code while a certain condition holds. It only includes the condition; it is up to you to initialize the variables and to make sure the loop eventually finishes.

string response;
cout << "Enter menu choice " << endl << "More" << endl << "Quit" << endl;
cin >> response;

while (response != "Quit")
{
    // Code to execute if Quit is not entered

    // Prompt user again with menu choices until Quit is entered
    cout << "Enter menu choice " << endl << "More" << endl << "Quit" << endl;
    cin >> response;
}

do loop

Similar to a while loop, but the do loop will always execute the block of code at least once.

string response; // response should be defined outside the loop

do
{
     cout << "Enter menu choice " << endl << "More" << endl << "Quit" << endl;
     cin >> response;

     // Process the data.

} while (response != "Quit"); // this semicolon is required

Functions and Objects

Introduction to Functions

A function is basically a block of code with a given name. It can be called in order to execute the block of code inside it. It can accept arguments and may return a value when it is executed.

int Sum(int x, int y)
{
     return x + y;
}

A function can also be overloaded, which means that you can define another function with the same name but a different number or type of parameters. The compiler then picks the function to call according to the arguments passed.

Usually, when declaring a function (the function prototype) you have to specify its

  • storage class,
  • return type,
  • name,
  • parameters.

Function Parameters

A function can accept values through its parameters, which are used in the block of code inside the function. The values passed in are called arguments. The next function has two parameters, a and b, of type int.

int Sum(int a, int b)
{
     return a + b;
}

The function can be called as follows:

int result = Sum(2, 3);

Inline Functions

Inline functions avoid the overhead associated with traditional function calls.

inline void swap(int & a, int & b)
{
  int temp = a;
  a = b;
  b = temp;
}
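
The effect of inlining on a call looks like this:

int a = 5, b = 6; // swap takes references, so it needs variables, not literals

// A traditional call results in a function call
swap(a, b);

// With inlining, the compiler may replace the previous line with the body of the function
int temp = a;
a = b;
b = temp;

Inlining is suggested only for small functions that are used frequently; the inline keyword is a hint that the compiler may ignore.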

Storage Classes and Scope

#include <iostream>
int main()
{
    int total = 0;
    for(int i = 1; i <= 10; i++)
    {
         total += i;
    }
    std::cout << "The sum of the numbers 1 to 10 is " << total << std::endl;
    std::cout << "Current value of i is " << i << std::endl;
    // The line above results in a compile error saying that i is not declared, because i only exists inside the for loop.
    return 0;
}

Usually you write your functions in a .cpp file and declare their signatures in the main .cpp file. It is more useful to declare the signatures in a header file and then use #include to make them available in any file.

Objects

Classes

Classes allow you to create your own data types. They can be seen as a blueprint of the type of object you want to define. Methods and fields can be defined for a class.

//Declaring a Class
class Rectangle
{
public:
    int _width;
    int _height;
}; // necessary semicolon

A rectangle class with two public (accessible) variables to represent the width and height of the rectangle.

Initialize

Instances of a rectangle can be created.

int main()
{
     Rectangle outer;
     Rectangle inner;

     outer._width = 10;
     outer._height = 10;

     inner._width = 5;
     inner._height = 5;

     Rectangle small{3, 4};
     Rectangle empty{}; // _width = 0; _height = 0
     return 0;
}

You should always initialize your variables in C++.

Encapsulation

http://www.tutorialspoint.com/cplusplus/cpp_data_encapsulation.htm

“Encapsulation is an Object Oriented Programming concept that binds together the data and functions that manipulate the data, and that keeps both safe from outside interference and misuse.”

Any C++ program where you implement a class with public and private members is an example of data encapsulation and data abstraction.

#include <iostream>
using namespace std;

class Adder{
   public:
      // constructor
      Adder(int i = 0) {
         total = i;
      }

      // interface to outside world
      void addNum(int number) {
         total += number;
      }

      // interface to outside world
      int getTotal() {
         return total;
      };

   private:
      // hidden data from outside world
      int total;
};

int main( ) {
   Adder a;

   a.addNum(10);
   a.addNum(20);
   a.addNum(30);

   cout << "Total " << a.getTotal() <<endl;
   return 0;
}

constant objects

You can declare an object as constant, and mark in the class definition which member functions do not change the values of the object. This way the methods marked as const can still be called on the object, but its values cannot be changed.