The C++ Programming Language

by Tyler Swann

This book assumes you are at least using a C++11 compliant compiler but concepts and practices from C++11 to C++20 will be covered. See the "Installation" page of the "Getting Started" Chapter for more details.

This book is under active development. Much of the material is absent, incomplete or subject to change. If you have suggestions create a discussion or issue on GitHub.

Introduction

Welcome to The C++ Programming Language, an introductory book aimed at teaching C++. C++ is a high-level, general-purpose, multi-paradigm programming language that aims to give developers absolute control over their programs while also providing the means to design, build and use any kind of abstraction, making the language more ergonomic and expressive with zero penalty for what you don't use. This allows C++ programs to be performant as well as expressive.

Who/What is C++ for?

C++ was designed for building systems and embedded software in resource-constrained contexts. These are systems that prioritize performance, efficiency and flexibility of use, allowing the developer to write performant code that can run on, or make up, any kind of system. If you want the ability to build helpful abstractions but need your program to run in a constrained environment and make use of every resource as effectively as possible, C++ will get you there.

Who is this book for?

This book is generally aimed at people who have programmed in at least one other language, regardless of which one(s). That is to say, you should have an idea of what a program is, be familiar with common programming concepts, and have a rough idea of how a computer works, but the background in which you learnt these concepts does not matter. I aim to make the material as approachable as possible to anyone from any background. It is possible to read this book without any prior programming experience, but it is highly recommended you start with an introductory programming book or course if you have never programmed before. The purpose of this book is to showcase how the C++ language works and the various concepts and capabilities present within the language, as opposed to teaching you the basics of programming using C++ as the content medium. C++ is a very dense language in its entirety, and it can be much more difficult to become comfortable programming and utilising the concepts from this particular book effectively if you do not have a foundation in programming in general.

How to Use This Book

In general, this book assumes that you're reading it in sequence from front to back. Later chapters build on concepts in earlier chapters, and earlier chapters might not delve into details on a particular topic but will revisit the topic in a later chapter.

You'll find two kinds of chapters in this book: concept chapters and project chapters. In concept chapters, you'll learn about an aspect or concept from C++. In project chapters, we'll build small programs together, applying what you've learned so far. Chapters ... are project chapters; the rest are concept chapters. At the end of a concept chapter there will be challenges that you can complete. These are simple quiz-like questions that you can use to test your understanding of the concepts presented in that chapter.

Note: You can also search for specific content using the search button in the top left or by pressing the S key.

Synopsis

Getting Started, explains how to install the necessary tools for compiling C++ programs on various platforms like Windows, macOS and Linux. It also goes through writing a classic "Hello, world!" program, discusses the anatomy of a basic C++ program and introduces the CMake build system.
Project: Guessing Game, is the first project chapter where you will build a simple 'number guessing game'. This will introduce you to compiling and building a C++ program and utilising various pieces from C++ at a high level, with later chapters offering more details.

Planned

Common Programming Concepts, will cover the basics of the C++ language from variables and data types to creating functions and controlling the execution flow of a program.
Ownership, will cover C++'s ownership model and how you are able to share data or even transfer ownership of data.
Structured Data will look at how to create custom types using structs.
In Managing Projects we'll discuss how to compile multiple files together and how CMake makes this process easier.
Custom Types explores how to create more powerful custom types and how to manage the lifetime of data.
Error Handling will look at the various ways to verify the correctness of your programs at compile time. We will also look at recovering from errors to prevent crashes.
Templates covers C++'s metaprogramming capabilities that allow you to build generic code that applies to multiple types.
In Functional Language Features we will look
The IO chapter will briefly take a deeper look at C++'s IO capabilities using streams. We will also explore C++'s filesystem library.
Memory will showcase how to safely (and unsafely ... for science) control memory.
In Concurrency we will look at how to parallelize our programs using a myriad of concurrency concepts while ensuring safe access to and manipulation of shared data.
Appendices, The appendices hold extra information that may be of use to the reader but does not fit elsewhere in the book.
- A - Keywords
- B - Operators
- C - Standard Versions
- D - Compilation Pipeline
- E - Value Categories
- F - Compiler Vendors
- G - Challenge Answers

Possible Future Chapters

IO Project, will look at utilising ideas from previous chapters in order to build a tool that replicates a subset of the functionality of the command line tool grep.
Algorithms, will showcase a few of the common algorithms available in the C++ standard library and how they can be used to manipulate any of the standard containers in an expressive and generic manner. We will also cover the concepts of a range and a view and how they allow us to write composable algorithms.
Improved IO Project, will look at improving our IO project from Chapter 11 by utilising the standard algorithms.
Object Oriented Programming In C++, covers C++'s support for writing object oriented code and how it contrasts with the rest of the language's features and the object oriented principles you may be familiar with from other languages.
Date, Time and Localization, introduces C++'s support for working with times and dates and how to change the locale currently being used to express said times and dates.

There is no wrong way to read this book: if you want to skip ahead, go for it! You might have to jump back to earlier chapters if you experience any confusion. But do whatever works for you.

An important part of the process of learning any programming language is learning how to read the error messages the compiler displays, which can be challenging for large codebases, especially if they are written in C++ (although this is improving). Error messages no matter the language will offer key insight into where the compilation of a program failed and in the case of C++, why it failed, which will guide you toward working code. As such, I'll provide many examples that don't compile along with the error message the compiler will show you in each situation. Know that if you enter and run a random example, it may not compile! Make sure you read the surrounding text to see whether the example you're trying to run is meant to error.

Note: the error message style and content can be dramatically different given a different compiler, compiler version and standard of C++ being used.

Source Code

The source code from which this book is generated can be found on GitHub. Refer to the supporting docs in the book's repo for details on how to contribute changes, fix typos or create new content for this book.

External Resources

Getting Started

Let us begin our journey! In this chapter we will discuss:

Installing C++ on Linux, macOS and Windows
Creating a C++ program to print Hello, world!
Using CMake to create cross-platform builds.

Installation

Each platform or Operating System (OS) has a different set of compiler tools, so the following sub-chapters will outline how to get set up on each platform.

Available C++ Compilers

Compiler	Description	Windows	Linux	MacOS
GNU Compiler Collection (GCC)	A collection of compiler technologies for many different languages including C, C++, Objective-C, Ada, D and Go. Part of the GNU project and the default compiler on Linux.	✅¹	✅	✅
Clang	A compiler frontend and build runner that is a part of the LLVM Project. Used to compile C, C++ and Objective-C.	✅²	✅	✅
Microsoft Visual Compilers (MSVC)	Microsoft's proprietary compiler toolchain for building C and C++. Usually installed with the Visual Studio IDE.	✅	❌	❌

Info

Note: The use of $ or > as the first character on a line in any code block for a shell (commands etc.) is used to indicate the prompt with the command following. This is used to clarify a shell code block that contains commands and the (generally) expected output. You do not need to copy the $ or > when running commands.

¹ via MinGW or Cygwin
² via Visual Studio, MinGW or Cygwin

Linux

Installing GCC and Clang on most Linux systems is relatively trivial. Most of the time it requires just installing the GCC or Clang package and some supporting developer tooling packages. These are often bundled together to make installation as simple as possible.

Installing

Depending on your distribution you will use a different package manager and upstream package repository, so some package names might differ from those listed below. Consult your platform's docs for the most seamless way to install a C++ compiler if the commands below fail.

# Debian, Ubuntu, ElementaryOS, Linux Mint, Pop!_OS (APT)
$ sudo apt install build-essential gdb clang llvm cmake

# RedHat, CentOS, Fedora (DNF)
$ sudo dnf install make automake gcc gcc-c++ kernel-devel gdb clang llvm cmake

# Arch, Manjaro (Pacman)
$ sudo pacman -Sy base-devel gdb clang llvm cmake

# OpenSUSE (Zypper)
$ sudo zypper install -t pattern devel_basis
$ sudo zypper install gdb clang llvm cmake

Verifying Installation

To verify the install worked for either GCC or Clang we can run the compiler programs with the version flag and ensure the install has been successful. You should get something like the following output:

# Verify GCC
$ g++ --version
g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

# Verify Clang
$ clang++ --version
Ubuntu clang version 14.0.0-1ubuntu1.1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

Note

Any details displayed from verifying a given newly installed tool may differ to what is displayed in this book.

Tip

The name of GNU's compiler toolchain is 'GCC' aka GNU Compiler Collection. This is in contrast to the CLI tool called gcc which stands for GNU C Compiler.
The C++ compiler from GCC is called g++. Make sure to use this command when compiling C++ code.

Installing CMake

We will also want a tool to help manage larger projects and allow us to build on different machines from the same source. CMake is one such build tool for C++ projects. It is used to manage different configurations for a project. You will have already installed CMake when you installed the C++ compilers earlier, as we added CMake to the install list. You can verify this by running:

$ cmake --version
cmake version 3.25.1

CMake suite maintained and supported by Kitware (kitware.com/cmake).

Installing vcpkg

We will also need some way to install external libraries. While many different tools exist the tool vcpkg was chosen for this book. vcpkg is an open source tool developed by Microsoft used for downloading and managing C++ libraries with CMake. We can install, add it to our PATH and validate the install using the following commands:

cd ~
mkdir bin
cd bin
git clone https://github.com/Microsoft/vcpkg.git
./vcpkg/bootstrap-vcpkg.sh
printf '\n# >>> vcpkg >>>\nexport VCPKG_ROOT="$HOME/bin/vcpkg"\nexport PATH="$VCPKG_ROOT:$PATH"\n# <<< vcpkg <<<\n' >> ~/.bashrc
source ~/.bashrc

Verify vcpkg

$ vcpkg --version
vcpkg package management program version 2023-10-18-27de5b69dac4b6fe8259d283cd4011e6d20a84ce

See LICENSE.txt for license information.

Windows

Warning

🚧 Page Is Incomplete! 🏗️

Windows has many different compilers at its disposal. Some offer native support for building against the Windows runtime while others emulate a UNIX (the predecessor to Linux and BSD) environment to aid in porting software built for UNIX-like systems. As the specifics can get confusing, this book will only cover the installation of Windows' native compiler toolchain, MSVC.

MSVC Installation

The Microsoft Visual C++ (MSVC) compiler is Microsoft's official toolchain for building software natively on Windows. It is installed with the Visual Studio Integrated Development Environment (IDE). MSVC (and the whole Visual Studio suite) can be obtained from Microsoft's official download page. Make sure to select the correct edition (Community being the free version) and click 'Download'. This will download the setup program VisualStudioSetup.exe, which is used to install and configure the Visual Studio Installer (VSI). The VSI allows you to select which tools and technologies from the Visual Studio suite you want to install. Once you have installed the VSI, start the program and you should be presented with some default tool configurations (workflows). For developing with C++ you will need to select the 'Desktop development with C++' workflow. You will also want to tick a few optional features (found in the side bar).

VSI Options Reference Image

Finally, click the 'Install' button in the bottom right of the window to start the installation.

Microsoft's official installation instructions for C++

Verifying MSVC Installation

To verify you installed Visual Studio correctly you can open the newly installed 'Developer Command Prompt for VS'. This prompt is needed in order to load the MSVC tooling, as it is not included by default in CMD or PowerShell. Simply run the following command to verify the install of the compiler.

> cl
Microsoft (R) C/C++ Optimizing Compiler Version 19.37.32822 for x86
Copyright (C) Microsoft Corporation.  All rights reserved.

usage: cl [ option... ] filename... [ /link linkoption... ]

Note

Any details displayed from verifying a given newly installed tool may differ to what is displayed in this book.

Tip

Alternatively you can follow Microsoft's tutorial for creating a new C++ VS Project. This will be more convenient than opening a 'Developer Command Prompt' every time you want to compile a program and having to run the cl command manually but it takes more work setting compiler flags etc. for simple projects.

Installing Git

We will also need to install Git in order to install a particular package later. Git can be installed by going to the 'Git for Windows' installation page and selecting the correct version (eg. x64 for 64-bit systems) and following the installation Wizard. Be sure to select the option for adding Git to the PATH.

Installing CMake

CMake is a build tool for C++ projects. It is used to manage different configurations for a project. You can download the latest release from CMake's Release Page (scroll down to 'Latest Release' not 'Release Candidate'). You can verify it was installed correctly by opening CMD and running:

> cmake --version
cmake version 3.25.1

CMake suite maintained and supported by Kitware (kitware.com/cmake).

Installing `vcpkg`

We will also need some way to install external libraries. While many different tools exist the tool vcpkg was chosen for this book. vcpkg is an open source tool developed by Microsoft used for downloading and managing C++ libraries with CMake. We can install, add it to your PATH and validate the install using the following batch/CMD commands:

> cd %userprofile%
> mkdir bin
> cd bin
> git clone https://github.com/Microsoft/vcpkg.git
> .\vcpkg\bootstrap-vcpkg.bat -disableMetrics
> setx VCPKG_ROOT %userprofile%\bin\vcpkg
> setx PATH "%PATH%;%userprofile%\bin\vcpkg"
:: You must now reload CMD for the Environment Variables to refresh by closing and reopening the CMD.

> vcpkg --version
vcpkg package management program version 2023-10-18-27de5b69dac4b6fe8259d283cd4011e6d20a84ce

See LICENSE.txt for license information.

MacOS

To install GCC and Clang on MacOS we will need Apple's developer toolchain called Xcode and a package manager for MacOS called Homebrew.

Installation

To build almost anything on MacOS we need the Xcode developer suite. This is a set of libraries, environment configurations and binaries used at the core of all Apple software products. The full installation can be found on Apple's developer page (requires a login), but this is an extremely large package requiring ~40 GB of disk space. Luckily there is a much smaller CLI package that installs just the necessary tooling for working with software from the terminal. One of these tools is the Clang compiler. To install GCC you will need Homebrew, a package manager which will by default install the latest stable version of the GCC formula. If you need a different version you can check the GCC formula page for available versions. To install these packages, open the 'Terminal' app and run:

# Install Xcode CLI tools
$ xcode-select --install

# Install Homebrew
$ /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

# Add `brew` command to your PATH
$ (echo; echo 'eval "$(${HOMEBREW_PREFIX}/bin/brew shellenv)"') >> ${shell_profile}

# Install GCC
$ brew install gcc cmake

Verifying Installation

To verify the install worked for either GCC or Clang we can run the compiler programs with the version flag and ensure the install has been successful.

# Verify GCC
$ g++-13 --version
g++-13 (Homebrew GCC 13.2.0) 13.2.0
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

# Verify Clang
$ clang++ --version
Apple clang version 15.0.0 (clang-1500.0.40.1)
Target: x86_64-apple-darwin22.6.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

Note

Any details displayed from verifying a given newly installed tool may differ to what is displayed in this book.

Tip

The name of GNU's compiler toolchain is 'GCC' aka GNU Compiler Collection. This is in contrast to the CLI tool called gcc which stands for GNU C Compiler.
The C++ compiler from GCC is called g++. Make sure to use this command when compiling C++ code.
You must specify the versioned g++ command in order to use the Homebrew version of the command. You can find the default version installed by running brew info gcc. We must do this because the regular g++ command redirects back to Apple's Clang implementation.

Installing CMake

We will also want a tool to help manage larger projects and allow us to build on different machines from the same source. CMake is one such build tool for C++ projects. It is used to manage different configurations for a project. You will have already installed CMake when you installed the C++ compilers earlier with Homebrew, as we added CMake to the install list. You can verify this by running:

$ cmake --version
cmake version 3.25.1

CMake suite maintained and supported by Kitware (kitware.com/cmake).

Installing `vcpkg`

We will also need some way to install external libraries. While many different tools exist the tool vcpkg was chosen for this book. vcpkg is an open source tool developed by Microsoft used for downloading and managing C++ libraries with CMake. We can install, add it to our PATH and validate the install using the following commands:

cd ~
mkdir bin
cd bin
git clone https://github.com/Microsoft/vcpkg.git
./vcpkg/bootstrap-vcpkg.sh
printf '\n# >>> vcpkg >>>\nexport VCPKG_ROOT="$HOME/bin/vcpkg"\nexport PATH="$VCPKG_ROOT:$PATH"\n# <<< vcpkg <<<\n' >> ~/.bashrc
source ~/.bashrc

Verify vcpkg

$ vcpkg --version
vcpkg package management program version 2023-10-18-27de5b69dac4b6fe8259d283cd4011e6d20a84ce

See LICENSE.txt for license information.

Hello World

Now that you've installed a C++ compiler, it's time to write your first C++ program. It is tradition when learning a new programming language to write a program that prints "Hello, world!" to the screen, and we'll be doing the same.

Info

"Hello, world!" was first introduced as a teaching mechanism for people learning a new programming language in Brian Kernighan's 1972 "A Tutorial Introduction to the Language B".

Creating a Project Directory

First, you'll create a new directory to store your C++ code. It is a good idea to create a 'projects' or 'dev' directory within your home or user directory in order to store any project you might develop for this book and beyond. Open a terminal and run the following commands.

Linux, MacOS or PowerShell on Windows:

$ mkdir ~/projects
$ cd ~/projects
$ mkdir hello_world
$ cd hello_world

CMD on Windows:

> mkdir "%userprofile%\projects"
> cd "%userprofile%\projects"
> mkdir hello_world
> cd hello_world

Writing and Running a C++ Program

Within this new 'hello_world' directory we will create a new file called main.cxx. This is called a C++ source file. A C++ program is then built from one or more of these files. We use the file extension *.cxx to denote that this file contains C++ source code. If a filename contains multiple words the convention is to separate the words with an underscore eg. hello_world.cxx over helloworld.cxx. Now open the file you have just created and copy the code from Listing 1-1 into the file.

Listing 1-1: Program that prints Hello, world!

#include <iostream>

auto main() -> int {
  std::cout << "Hello, world!\n";
  return 0;
}

Info

C++ source files can have various extensions such as *.cpp or *.c++; however, for this book the *.cxx style will be used. It is also good practice to use the same extension across a project, so no matter which one you use, just be consistent.

Save the file and return to your terminal open to the ~/projects/hello_world directory and run the following command.

On Linux or MacOS

$ g++ -std=c++20 -o hello_world main.cxx
$ ./hello_world
Hello, world!

On Windows

:: Must be done in a 'Developer Command Prompt for VS ...' 
> cl /std:c++20 /EHsc /Fe: hello_world.exe main.cxx
> .\hello_world.exe
Hello, world!

If you see "Hello, world!" printed on your terminal, congratulations, you've officially written your first C++ program!

Note

You can swap the g++ command with the clang++ command if you want to use the Clang compiler instead of the GCC compiler.
The -std=c++20 (GCC/Clang) and /std:c++20 (MSVC) flags tell the compiler to use the C++20 (2020) version of C++. The -o <name> flag (GCC/Clang) is used to specify the name and/or directory for the compiled program; the MSVC command above uses /Fe: for the same purpose.

Anatomy of a C++ Program

Let's go into some more detail on the structure of our "Hello, world!" program. The first component to cover is:

auto main() -> int {

}

This declares a function called main. The main function is known as the program's entry point, meaning main is the very first function that runs in every executable C++ program. This declaration of main takes no parameters and returns an integer (int). If there were parameters they would be declared within the parentheses (). The body of the function is wrapped in curly braces {}.

The body of the function contains the following two lines:

  std::cout << "Hello, world!\n";
  return 0;

The second line returns a status code from main to the operating system (OS) indicating whether the program ran successfully or not. A status code of 0 indicates the program ran successfully, with any other value indicating the program failed.

The first line is where the action occurs! We start by accessing the symbol cout from the namespace std (usually pronounced "stood") using the scope resolution operator ::. cout is a global character output stream that is linked to stdout ie. your terminal's output (you'll learn more about streams and IO in later chapters).

We can push characters through the stream using the << operator where the left argument must be an output stream and the right argument is a series of characters, numbers or a string. In this case we are pushing the string literal "Hello, world!\n" through the stream. We use the \n character to specify a newline to be printed after our string has been written to the terminal.

You'll notice that we end the line with a semicolon ;. Semicolons are used to indicate the end of a statement.

Note

The operator << has been overloaded for use with cout (and other output streams); as such it is only defined to work with C++'s primitive and standard library types. You would need to provide your own definition for custom types.

You'll also notice at the top of the file the following line:

#include <iostream>

This is a preprocessor instruction; more specifically, it is an instruction used to import the 'iostream' library into our program. This is where the symbol for the cout output stream comes from. We include libraries by utilising the preprocessor directive #include, which basically copies and pastes the contents of the file indicated within the <> symbols into our program, which in this case is the file 'iostream'. Assume this file's location (and the location of any others used in the same manner) is known to your compiler unless specified otherwise. Files imported using #include are known as headers.

Compiling and Running Are Separate Steps

You may notice that it took two separate steps in order to run our program. This is because C++ is a compiled language, meaning that our source code is transformed into something else before it is run. In the case of C++, the compiler generates binary machine code for our target platform, which in this case is our own device. This means the generated (machine) code is specific to the target and cannot be transferred to and run on a different computer if its architecture is different. This allows the compiler to optimise your code for the target platform but does require the additional step.

This is in contrast to interpreted languages like Python, Ruby, JavaScript etc., which perform the conversion while the program is running. This in turn requires another program, the interpreter, to run alongside yours, taking up extra resources, but it usually means your programs are more portable as they can run anywhere the interpreter can. These are some of the trade-offs made when designing or using a language.

For simple programs, directly using a C++ compiler (like g++) is fine, but as your project grows you'll want to manage all the options and make it easy to share your code. Next, we'll introduce you to the CMake tool, which will help you write and manage much larger projects.

Hello, CMake

CMake is a third-party tool used to configure and build C++ projects. While there are other tools like CMake for configuring C++ compiler toolchains, CMake is the most ubiquitous within the C++ community. CMake allows us to define one or more targets that our project produces. Targets can be an executable, a library, documentation or even tests. This allows a single project to build many different outputs for different platforms from a single source. Targets can also be consumed by other targets, allowing more modular builds.

Creating a Project with CMake

To start off, go back to your projects/ directory and create a new directory called 'hello_cmake'.

$ mkdir hello_cmake
$ cd hello_cmake

Within this directory we will need to create three new files main.cxx, CMakeLists.txt and CMakePresets.json. For the main.cxx file you can copy the below code which is identical to the one found on the previous page except printing slightly different content.

#include <iostream>

auto main() -> int
{
    std::cout << "Hello, CMake!\n";
    return 0;
}

We will first look at the CMakeLists.txt file.

CMake Configuration Files

A CMake project is defined by a set of 'CMakeLists.txt' files located in the source tree (directories containing your source code). These describe your project's targets, source files etc. For a simple single file project we only need a single 'CMakeLists.txt' alongside our main.cxx source file. Copy the contents from Listing 1-2.

Listing 1-2: Basic CMake configuration file.

cmake_minimum_required(VERSION 3.22)

project(hello_cmake
    VERSION 0.1.0
    DESCRIPTION "Hello, CMake!"
    LANGUAGES CXX)

add_executable(hello_cmake main.cxx)
target_compile_features(hello_cmake PRIVATE cxx_std_20)

Let's break down our CMakeLists.txt file. First we specify the minimum required version of CMake this project uses. This helps to ensure that any CMake features used in the project's configuration are available to end users and collaborators.

cmake_minimum_required(VERSION 3.22)

We then define the basic information about our project such as its name, description, version and what languages it uses.

project(hello_cmake
    VERSION 0.1.0
    DESCRIPTION "Hello, CMake!"
    LANGUAGES CXX)

In order to mark our main.cxx as an executable we use the add_executable() function where we specify the executable's name ie. the name of the target created from the executable as well as the source file used to make the executable.

add_executable(hello_cmake main.cxx)

Finally, we can add compilation features; such as setting the C++ Standard to use for building the target, using the target_compile_features() function. Here we add the builtin CMake feature cxx_std_20 to our executable which ensures it is built using the 2020 C++ Standard.

target_compile_features(hello_cmake PRIVATE cxx_std_20)

Info

See Appendix C for more information on C++ Standards.

CMake Presets

We can also specify presets for CMake that define different configurations, each identified by a unique name. These presets can be used to configure your project to compile on multiple different platforms as well as set various flags and options depending on how you want the project to be built. This is better than writing large 'CMakeLists.txt' files with complicated conditional logic that makes just writing the configuration complicated. A minimalistic CMakePresets.json file would look similar to Listing 1-3.

Listing 1-3: Minimalistic CMake presets file.

{
    "version": 3,
    "cmakeMinimumRequired": {
        "major": 3,
        "minor": 22,
        "patch": 0
    },
    "configurePresets": [
        {
            "name": "default",
            "binaryDir": "${sourceDir}/build"
        }
    ]
}

A CMakePresets.json file starts with a key-value pair indicating the version of the preset engine to use from CMake. We also specify the minimum CMake version required for this project, similar to the first line of Listing 1-2.

    "version": 3,
    "cmakeMinimumRequired": {
        "major": 3,
        "minor": 22,
        "patch": 0
    },

We then have a configuration array which stores our presets objects used for configuring our projects for different targets. All presets must have a unique name used to identify them.

    "configurePresets": [
        // ... preset objects go here
    ]

In our preset named "default" we specify where we want the resulting binary to be put. In this case we specified it to be placed in the build/ directory at the root of our project.

        {
            "name": "default",
            "binaryDir": "${sourceDir}/build"
        }

One final thing to mention is that CMakePresets.json files support macro expansions, which allow you to obtain common variables. The syntax for expanding a macro is a dollar sign ($) followed by the variable's identifier surrounded by braces ({}). We can see one being used in Listing 1-3 when we specify where our binary should be built. Instead of hard-coding a path or using a relative path, we can leverage CMake knowing where our project's root is (which is where the root CMakeLists.txt file is located) and obtain the root of our source directory using the sourceDir variable, hence its expansion being used on line 11 ie. "binaryDir": "${sourceDir}/build". Variable names are always in camel case.

More information of CMake's presets can be found on CMake's official documentation cmake-presets(7).

Building and Running a CMake Project

When building a CMake project we have to perform two steps. The first step is to configure the project. This generates the build recipe(s) for your project according to your 'CMakeLists.txt' files. Recipes are the instructions used to actually compile your project, with a single recipe being used to build one or more targets. The second step is to build, where CMake builds one or more of these targets according to a recipe.

For our project we only have a single target, which also happens to correspond to our single preset, so we can simply run the following to generate our recipe.

$ cmake --preset=default
-- The CXX compiler identification is GNU 11.4.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Configuring done
-- Generating done
-- Build files have been written to: /home/user/projects/hello_cmake/build

Info

If you do not want to use presets you can manually configure the project with the following command.

$ cmake -S . -B build

We can then build the target using the following command:

$ cmake --build build
[ 50%] Building CXX object CMakeFiles/hello_cmake.dir/main.cxx.o
[100%] Linking CXX executable hello_cmake
[100%] Built target hello_cmake

This will produce a binary called hello_cmake in the build/ directory on Linux and macOS and the build/Debug/ directory on Windows. We can run our program like normal.

$ ./build/hello_cmake  # ... or .\build\Debug\hello_cmake.exe on Windows
Hello, CMake!

Tip

The reason Windows based builds have an additional intermediate directory Debug/ for the output is that the underlying builder(s) used on Windows can be configured to output both debug and release builds from the same recipe, which is controlled with CMake's --config=<config> flag during the build step. You can test creating a 'Release' build by running the following command, which should produce an executable in the build\Release\ directory.

> cmake --build build --config=Release

Compiling with Flags (Optional)

Often we want to have specific flags set for the compiler(s) we are using, but because each compiler has different flags available it can become difficult to have parity across compilers. Luckily presets make this much easier. Below I have created a preset for each platform with the correct flags set for the compiler(s) of that platform, ensuring some of the most common errors and bugs are caught by the compiler and reported to us. Listing 1-4 showcases these presets, which I'd recommend copying into your projects. There are also some hidden presets that are used to define settings shared across presets; for example, I have set the C++ standard to 20 for all presets via the "std-cxx" preset, which the non-hidden platform presets inherit through "common". Listing 1-5 demonstrates the commands needed to configure, build and run the executable target for each preset. From now on in the book, I will assume the use of presets for building C++.

Note

You'll have to specify the build directory using the -B flag, as shown in Listing 1-5, because the presets do not define it; however, this allows you to customize the build location.
These presets are tailored for a single executable target and may not be robust enough to handle exporting libraries.

Listing 1-4: Base CMakePresets.json for a single cross-platform executable target with compiler flags set.

{
    "version": 3,
    "cmakeMinimumRequired": {
        "major": 3,
        "minor": 22,
        "patch": 0
    },
    "configurePresets": [
        {
            "name": "vcpkg",
            "hidden": true,
            "toolchainFile": "$env{VCPKG_ROOT}/scripts/buildsystems/vcpkg.cmake"
        },
        {
            "name": "std-cxx",
            "hidden": true,
            "cacheVariables": {
                "CMAKE_CXX_EXTENSIONS": "OFF",
                "CMAKE_CXX_STANDARD": "20",
                "CMAKE_CXX_STANDARD_REQUIRED": "ON"
            }
        },
        {
            "name": "common",
            "hidden": true,
            "inherits": [
                "std-cxx",
                "vcpkg"
            ],
            "cacheVariables": {
                "CMAKE_EXPORT_COMPILE_COMMANDS": "ON"           
            }
        },
        {
            "name": "linux",
            "inherits": [
                "common"
            ],
            "description": "These flags are supported by both GCC and Clang",
            "cacheVariables": {
                "CMAKE_CXX_FLAGS": "-fstack-protector-strong -fcf-protection=full -fstack-clash-protection -Wall -Werror -Wextra -Wpedantic -Werror -Wnarrowing -Wconversion -Wsign-conversion -Wcast-qual -Wformat=2 -Wundef -Werror=float-equal -Wshadow -Wcast-align -Wunused -Wnull-dereference -Wdouble-promotion -Wimplicit-fallthrough -Wextra-semi -Woverloaded-virtual -Wnon-virtual-dtor -Wold-style-cast",
                "CMAKE_EXE_LINKER_FLAGS": "-Wl,--allow-shlib-undefined,--as-needed,-z,noexecstack,-z,relro,-z,now",
                "CMAKE_SHARED_LINKER_FLAGS": "-Wl,--allow-shlib-undefined,--as-needed,-z,noexecstack,-z,relro,-z,now"
            }
        },
        {
            "name": "apple-darwin",
            "inherits": [
                "common"
            ],
            "cacheVariables": {
                "CMAKE_CXX_FLAGS": "-fstack-protector-strong -Wall -Werror -Wextra -Wpedantic -Wnarrowing -Wconversion -Wsign-conversion -Wcast-qual -Wformat=2 -Wundef -Werror=float-equal -Wshadow -Wcast-align -Wunused -Wnull-dereference -Wdouble-promotion -Wimplicit-fallthrough -Wextra-semi -Woverloaded-virtual -Wnon-virtual-dtor -Wold-style-cast"
            }
        },
        {
            "name": "windows-x64",
            "inherits": [
                "common"
            ],
            "description": "Note that all the flags after /WX are required for MSVC to conform to the language standard",
            "cacheVariables": {
                "CMAKE_CXX_FLAGS": "/sdl /guard:cf /utf-8 /diagnostics:caret /w14165 /w44242 /w44254 /w44263 /w34265 /w34287 /w44296 /w44365 /w44388 /w44464 /w14545 /w14546 /w14547 /w14549 /w14555 /w34619 /w34640 /w24826 /w14905 /w14906 /w14928 /w45038 /WX /permissive- /volatile:iso /Zc:inline /Zc:preprocessor /Zc:enumTypes /Zc:lambda /Zc:__cplusplus /Zc:externConstexpr /Zc:throwingNew /EHsc",
                "CMAKE_EXE_LINKER_FLAGS": "/machine:x64 /guard:cf"
            }
        },
        {
            "name": "windows-x86",
            "inherits": [
                "common"
            ],
            "description": "Note that all the flags after /WX are required for MSVC to conform to the language standard",
            "cacheVariables": {
                "CMAKE_CXX_FLAGS": "/sdl /guard:cf /utf-8 /diagnostics:caret /w14165 /w44242 /w44254 /w44263 /w34265 /w34287 /w44296 /w44365 /w44388 /w44464 /w14545 /w14546 /w14547 /w14549 /w14555 /w34619 /w34640 /w24826 /w14905 /w14906 /w14928 /w45038 /WX /permissive- /volatile:iso /Zc:inline /Zc:preprocessor /Zc:enumTypes /Zc:lambda /Zc:__cplusplus /Zc:externConstexpr /Zc:throwingNew /EHsc",
                "CMAKE_EXE_LINKER_FLAGS": "/machine:x86 /guard:cf"
            }
        }
    ]
}

Listing 1-5: Commands for building with Listing 1-4's presets.

# Linux (debug)
$ cmake -S . -B build/linux/debug --preset=linux  # configure
$ cmake --build build/linux/debug                 # build
$ ./build/linux/debug/<exe-name>                  # execute

# Linux (release)
$ cmake -S . -B build/linux/release --preset=linux -DCMAKE_BUILD_TYPE="Release"  # configure
$ cmake --build build/linux/release                                              # build
$ ./build/linux/release/<exe-name>                                               # execute

# --------------------------------------------

# macOS (debug)
$ cmake -S . -B build/macos/debug --preset=apple-darwin  # configure
$ cmake --build build/macos/debug                        # build
$ ./build/macos/debug/<exe-name>                         # execute

# macOS (release)
$ cmake -S . -B build/macos/release --preset=apple-darwin -DCMAKE_BUILD_TYPE="Release"  # configure
$ cmake --build build/macos/release                                                     # build
$ ./build/macos/release/<exe-name>                                                      # execute

# --------------------------------------------

# Windows [x64] (debug)
$ cmake -S . -B build/windows-x64 --preset=windows-x64  # configure
$ cmake --build build/windows-x64 --config=Debug        # build
$ ./build/windows-x64/Debug/<exe-name>.exe              # execute

# Windows [x64] (release)
$ cmake -S . -B build/windows-x64 --preset=windows-x64   # configure
$ cmake --build build/windows-x64 --config=Release       # build
$ ./build/windows-x64/Release/<exe-name>.exe             # execute

# --------------------------------------------

# Windows [x86] (debug)
$ cmake -S . -B build/windows-x86 --preset=windows-x86  # configure
$ cmake --build build/windows-x86 --config=Debug        # build
$ ./build/windows-x86/Debug/<exe-name>.exe              # execute

# Windows [x86] (release)
$ cmake -S . -B build/windows-x86 --preset=windows-x86   # configure
$ cmake --build build/windows-x86 --config=Release       # build
$ ./build/windows-x86/Release/<exe-name>.exe             # execute

Hello, vcpkg

While CMake can be used to build your project and help to customize its configuration for different platforms and uses, it is not very good at managing packages. For this reason we will be using another tool built for this purpose called vcpkg. It is an open-source project developed at Microsoft that interacts directly with CMake.

Setting Up CMake Project with vcpkg

To get started we are going to create another new directory in our parent directory projects/.

$ mkdir hello_vcpkg
$ cd hello_vcpkg

We can then copy all files from the previous page's project into this new directory: the main.cxx, CMakeLists.txt and CMakePresets.json files. We can then initialise a new vcpkg project using the following command.

$ vcpkg new --application

This will create two new files, vcpkg.json and vcpkg-configuration.json. The vcpkg.json file will currently be empty but it is used to specify dependencies. It can also declare available features for downstream users of your project if it is set up as a library; however, this is not relevant to us right now. The vcpkg-configuration.json file is used to specify the source location of packages as well as lock that source to a particular version to make reproducible builds easier. This is important in production software; however, we can largely ignore that file for now.

The next thing we'll do is ensure that CMake is aware of vcpkg so the two tools can work together. CMake supports the notion of a toolchain file, which specifies the underlying tools CMake must use. This assists in building projects across different systems and helps determine how to build the packages your project requests. We can specify the vcpkg toolchain in our CMakePresets.json by adding the file's path to CMake's variable cache. This can be done by adding a "cacheVariables" object below the "binaryDir" entry in our CMakePresets.json file, with an entry in the new object for the variable CMAKE_TOOLCHAIN_FILE. This can be seen in Listing 1-6.

Listing 1-6: CMakePresets.json with vcpkg toolchain file added to CMake cache.

{
    "version": 3,
    "cmakeMinimumRequired": {
        "major": 3,
        "minor": 22,
        "patch": 0
    },
    "configurePresets": [
        {
            "name": "default",
            "binaryDir": "${sourceDir}/build",
            "cacheVariables": {
                "CMAKE_TOOLCHAIN_FILE": "$env{VCPKG_ROOT}/scripts/buildsystems/vcpkg.cmake"
            }
        }
    ]
}

This leverages the $env{} macro, which obtains environment variables; in this case the VCPKG_ROOT variable, which is where our vcpkg install lives.

Warning

Make sure to have a comma at the end of the line for the "binaryDir" entry.

Adding Packages

Let us add a package to our project. For this example we are going to use the fast formatting and I/O library {fmt}. To add dependencies we simply need to add an entry called "dependencies" to vcpkg.json, which is an array of objects or strings representing our project's dependencies. We can do this with the following command:

$ vcpkg add port fmt

We can also add a minimum version constraint to a dependency by converting the newly added dependency from a string into an object, similar to Listing 1-7.

Listing 1-7: vcpkg.json specifying {fmt} as a dependency with a minimum version constraint.

{
    "dependencies": [
        {
            "name": "fmt",
            "version>=": "10.1.0"
        }
    ]
}

The full reference for vcpkg.json contains more details related to controlling your project's setup and dependencies.

Next, we need to tell CMake that {fmt} exists and we'd like to use it. To do this we must tell CMake to find the package using the find_package() function in the project's CMakeLists.txt file.

cmake_minimum_required(VERSION 3.14)

project(hello_vcpkg
    VERSION 0.1.0
    DESCRIPTION "Hello, vcpkg from {fmt}!"
    LANGUAGES CXX)

find_package(fmt CONFIG REQUIRED)

add_executable(hello_vcpkg main.cxx)
target_compile_features(hello_vcpkg PRIVATE cxx_std_20)

Because {fmt} will be a required package for our project we must tell CMake to fail if it cannot be found, which is why we pass the REQUIRED keyword to the function. We also pass the CONFIG keyword to tell CMake to use the package's provided configuration file, which is what allows vcpkg to manipulate how the package's CMake targets are made. We then must add {fmt} (more specifically, a target from {fmt}) to our project's executable target. We can do this with the target_link_libraries() function. This function first takes the name of the target we want to add libraries to. We then specify a scope keyword, ie. one of PRIVATE, PUBLIC or INTERFACE, followed by the names of the targets we wish to link. Linking a library to a target makes the library's headers and compiled code available when building that target. In Listing 1-8 we use PRIVATE scoping for adding {fmt} to our executable because the usage of {fmt} is limited to the internals of our project's resulting binary and will not be exposed.

Listing 1-8: Linking {fmt} to the executable target in CMakeLists.txt.

# ... rest of CMakeLists.txt

add_executable(hello_vcpkg main.cxx)
target_compile_features(hello_vcpkg PRIVATE cxx_std_20)

target_link_libraries(hello_vcpkg PRIVATE fmt::fmt)

You may also notice we are linking fmt::fmt not just fmt in the call to target_link_libraries(). This is because the first fmt is a namespace for the package found through find_package(). We then access the target named fmt and link it to our executable.

With all that done we can now use {fmt} in our main.cxx file. Listing 1-9 is an example program that uses {fmt} which you can copy into the project's main.cxx.

Listing 1-9: Example executable using {fmt}.

#include <fmt/core.h>

auto main() -> int
{
    fmt::println("Hello, vcpkg from {{fmt}}");
    return 0;
}

Note

The use of two braces in the format string of fmt::println() is so that one pair is actually printed in the output. Normally braces have a special meaning in {fmt} but if we need a literal '{' or '}' we use two.

Building and Running with vcpkg

If we copy over the presets file from the bottom of the previous page, we can build our small package using presets. This works because of the "vcpkg" preset, which links CMake and vcpkg together such that CMake can find packages installed with vcpkg.

$ cmake -S . -B build/<platform> --preset=linux  # ... or whichever preset matches your platform
$ cmake --build build/<platform>
$ ./build/<platform>/hello_vcpkg  # ... or .\build\windows-x[86|64]\Debug\hello_vcpkg.exe on Windows
Hello, vcpkg from {fmt}

"Hello, Godbolt!"

Compiling Online

The ability to quickly test and prototype software is extremely useful; however, doing so in C++ is not so easy. There are a lot of steps that need to be taken to set up a project correctly, which is good for building robust software but slows prototyping down to a halt. Luckily there exists a tool that makes this process much easier. It is an online C++ compiler known as Godbolt. This site allows you to compile C++ using many different compilers, even at the same time, as well as execute the resulting binary and even see the assembly generated by the compiler. It also allows you to share your session with others so they can see not only the code you wrote but the exact compiler(s) and flags you have set. It is a massively useful tool that is invaluable to the C++ community. Here is an example "Hello, world!" on Godbolt which shows the generated assembly as well as the output from the executed binary.

Project: Guessing Game

Warning

🚧 Page Under Construction! 🏗️

Let us jump straight into C++ by developing a project together! This will help expose you to some common concepts from C++ and how they are used in an actual program. You'll learn how to create variables, control the flow of your program, take in user input, create functions and more! These concepts will be explored in more detail in future chapters while this one will focus on the fundamentals.

We'll be implementing a simple number guessing game. The program will generate a random integer between 1 and 100 (inclusive). It will then prompt the user to type in a guess. After the guess is entered the program will indicate whether the guess was too high or too low, or print a congratulatory message if the user got it right, and then exit.

Setting Up a New Project

To begin, create a new directory in your projects/ directory called guessing_game and enter it.

$ mkdir guessing_game
$ cd guessing_game

As usual, we'll need to create the files main.cxx, CMakeLists.txt and CMakePresets.json. Our main.cxx file can just be an empty main() function like Listing 2-1 and for our CMakeLists.txt file we must specify a minimal project configuration, detailed in Listing 2-2. As for our CMakePresets.json file, we can use either one from Chapter 1: Listing 1-3 or Listing 1-4.

Listing 2-1: Empty main() function in main.cxx.

auto main() -> int {
    return 0;
}

Listing 2-2: Simple 'CMakeLists.txt' for guessing game executable.

cmake_minimum_required(VERSION 3.14)

project(guessing_game
    VERSION 0.1.0
    DESCRIPTION "Number Guessing Game"
    LANGUAGES CXX)

add_executable(guessing_game main.cxx)
target_compile_features(guessing_game PRIVATE cxx_std_20)

Processing a Guess

First we need to ask the user for input, process that input and ensure it is in the form we expect. To start we'll simply take in the user's guess and print it back to them. Listing 2-3 shows the starting code.

Listing 2-3: Code for obtaining an input from the user and printing it.

#include <iostream>
#include <string>

auto main() -> int
{
    std::cout << "Guessing Game!\n";

    std::cout << "Please input your guess (1..100): ";

    auto guess = std::string {};

    std::getline(std::cin, guess);

    std::cout << "You guessed: " << guess << std::endl;

    return 0;
}

Let's briefly go over the new concepts introduced in Listing 2-3. We have included a new header <string> which contains the definition of the type std::string and its supporting functions.

#include <string>

We then prompt the user with the name of the game as well as request input from the user using the output stream std::cout, which we covered in Chapter 1.

    std::cout << "Guessing Game!\n";

    std::cout << "Please input your guess (1..100): ";

Storing Data with Variables

Next, we construct a new variable to store the user's input in.

    auto guess = std::string {};

Now this is where things begin to get interesting. This line is a variable declaration with an initializer, which binds a value to a new variable. Here is another!

auto boxes = 7;

Note

Note the lack of a type after the =. This is because we can initialize boxes with an integer literal and thus a type is not needed.

In C++ variables are mutable by default, which means we are allowed to change their value. This concept will be discussed more in Chapter 3 | Variables and Mutability. To make a variable constant ie. its value cannot change once it is set, we use the const keyword after/before auto (I choose after).

auto const boxes = 7;  // constant
auto crates = 4;  // mutable

Tip

The // syntax indicates a comment that continues until the end of the line. Everything in a comment is ignored by C++. You will learn more about them in Chapter 3 | Comments.

In the case of our variable guess in our guessing game program, we have (default) constructed a temporary value with the type std::string which we then bind to the variable named guess using the = operator. We have also used auto to allow the compiler to deduce the type that the variable guess should have. We could have explicitly written the type on the left-hand side (LHS) instead of auto, like the example below, but this would be more verbose as we have to express the type twice. It also means that if we change the type on the right-hand side (RHS) we must also change it on the LHS, but with auto the compiler will do that for us!

    std::string guess = std::string {};
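
To make the deduction a little more concrete, here is a minimal sketch that asks the compiler what type auto picked for a few initializers. The function name deduction_demo is hypothetical and only exists for this illustration; std::is_same_v comes from the standard <type_traits> header and static_assert is checked at compile time.

```cpp
#include <string>
#include <type_traits>

// Hypothetical helper used only for illustration: each static_assert checks,
// at compile time, which type `auto` deduced from the initializer.
auto deduction_demo() -> bool
{
    auto guess = std::string {};  // deduced as std::string
    auto boxes = 7;               // deduced as int
    auto ratio = 0.5;             // deduced as double
    auto const crates = 4;        // deduced as int, then made const

    static_assert(std::is_same_v<decltype(guess), std::string>);
    static_assert(std::is_same_v<decltype(boxes), int>);
    static_assert(std::is_same_v<decltype(ratio), double>);
    static_assert(std::is_same_v<decltype(crates), int const>);

    return guess.empty() && boxes == 7 && ratio == 0.5 && crates == 4;
}
```

If any of the deduced types were different the program would fail to compile, so simply building this snippet is the check.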

Tip

When constructing our std::string we have used what is known as brace initialization. This is a safer variant of regular construction (which uses parentheses ()) as it prevents narrowing conversions, which can silently truncate or otherwise change a value eg. converting a double to an int. We have also default constructed our std::string, which in this case means the std::string is constructed as an empty string, not as an invalid object.
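
As a rough sketch of the difference, the hypothetical functions below (the names are mine, not part of any library) convert a double to an int using parentheses, which compiles and silently drops the fractional part, and show what empty braces produce. The brace version of the narrowing conversion is left as a comment because a conforming compiler rejects it.

```cpp
#include <string>

// Hypothetical helper: converts using parentheses, which permits narrowing.
auto truncate_with_parentheses(double value) -> int
{
    // auto const rejected = int { value };  // error: narrowing conversion from double to int
    return int(value);  // compiles; the fractional part is silently discarded
}

// Hypothetical helper: empty braces give a well defined "empty" value.
auto default_constructed_values_are_empty() -> bool
{
    auto const count = int {};         // value-initialized to 0
    auto const text = std::string {};  // default constructed as an empty string
    return count == 0 && text.empty();
}
```

Calling truncate_with_parentheses(3.9) yields 3, which is exactly the kind of quiet data loss that brace initialization is designed to flag.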

Receiving User Input

There are a few different ways of handling user input from the terminal in C++. For this program we have used the std::getline() function.

    std::getline(std::cin, guess);

This function extracts characters from its first argument, which is an input stream of type std::basic_istream<>. In this case, the input stream is std::cin. Characters are extracted until none remain in the stream or the designated delimiter is encountered; the delimiter is the optional third argument and defaults to '\n'. The extracted characters (excluding the delimiter) are written to the second argument, which is a reference to a string of the same underlying character type. References allow functions to read and/or modify data passed to them and have the effects reflected on the caller's side. We'll cover references and ownership in C++ during Chapter 4. In effect this function reads an entire line and copies the characters into a string.
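
The optional third argument is easier to see with a stream we control. The sketch below swaps std::cin for a std::istringstream (an in-memory input stream from <sstream>) so the input is fixed; read_fields and split are hypothetical helper names used only for this example.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads everything from `in`, splitting on `delimiter`.
auto read_fields(std::istream& in, char delimiter = '\n') -> std::vector<std::string>
{
    auto fields = std::vector<std::string> {};
    auto field = std::string {};

    // std::getline() returns the stream, which converts to false once extraction fails.
    while (std::getline(in, field, delimiter)) {
        fields.push_back(field);
    }

    return fields;
}

// Hypothetical helper: wraps a string in an input stream and splits it.
auto split(std::string const& text, char delimiter) -> std::vector<std::string>
{
    auto in = std::istringstream { text };
    return read_fields(in, delimiter);
}
```

With these helpers split("12,abc,7", ',') produces the three strings "12", "abc" and "7", and the same read_fields() function would work unchanged if it were handed std::cin.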

Printing with Output Streams

As we first saw in "Hello, world!" we can output text using the std::cout global object with the << operator. You may be wondering why this "unique" syntax has been chosen for printing. This is because the Input/Output library is more generic than just a printing facility. As the name suggests it is a library for manipulating and using Input/Output (IO) streams. Streams can be thought of as a pipeline between two endpoints eg. a program and the terminal screen, where data can be pushed from one end (the program) and extracted at the other end (the terminal screen). The C++ IO library uses streams to model how data is transferred between various endpoints like a program, the terminal screen, files etc. with the << and >> operators being used to perform formatted IO ie. push formatted data to and extract formatted data from a stream respectively. These facilities were then used to wrap the low level IO handles stdin, stdout and stderr, which are used to take user input and print, in the global stream objects std::cin, std::cout and std::cerr. This means they can be manipulated using the same API and functionality provided by the rest of the standard C++ IO library.
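
To illustrate that << and >> are not tied to the terminal, here is a small sketch using the in-memory string streams from <sstream> as the endpoints instead of std::cout and std::cin. The function names describe_guess and sum_of_two are hypothetical.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: pushes formatted data into an in-memory stream with <<.
auto describe_guess(int guess) -> std::string
{
    auto out = std::ostringstream {};
    out << "You guessed: " << guess;
    return out.str();
}

// Hypothetical helper: extracts two whitespace separated integers with >>.
auto sum_of_two(std::string const& text) -> int
{
    auto in = std::istringstream { text };
    auto first = int {};
    auto second = int {};
    in >> first >> second;
    return first + second;
}
```

The operators are used exactly as they would be with std::cout and std::cin; only the endpoint on the other side of the stream has changed.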

Note

The C++23 Standard includes a new header <print> with functions like std::println() which use the C++20 formatting library which make printing much more intuitive and faster. This library was directly inspired by the {fmt} library.

If you are familiar with other languages you may be wondering why << is used to push to a stream, as this operator is normally used for left bit shift operations. We are able to use the << operator because it has been overloaded. Essentially this means the functionality of << has been customized for particular types. Within the C++ standard library, << has been overloaded to take a reference to a std::basic_ostream<> object as the left argument ie. the type of std::cout, and various builtin C++ types and standard library types as the right argument eg. int and std::string. This allows the << syntax to be used with many different types already in C++. Overloading will be covered in more detail in Chapter 3 | Functions.

In this program we have seen that we can chain the calls to <<.

    std::cout << "You guessed: " << guess << std::endl;

This is because each call to << returns a reference to the same stream passed as the left argument, allowing you to make subsequent calls to << one after another. This can make it easier to build up pipelines to and from streams as we can create arbitrarily long chains.
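
As a preview of how this fits together, the sketch below overloads << for a small type of our own. The Attempt type and to_text function are hypothetical and exist only for this example; the important detail is that the overload returns the stream it was given, which is what allows it to sit in the middle of a chain.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical type used only for illustration.
struct Attempt
{
    int number;
    int guess;
};

// Overload of << for Attempt. Returning the stream is what lets calls be chained.
auto operator<<(std::ostream& out, Attempt const& attempt) -> std::ostream&
{
    return out << "attempt " << attempt.number << ": " << attempt.guess;
}

// Hypothetical helper: uses the overload in the middle of a longer chain.
auto to_text(Attempt const& attempt) -> std::string
{
    auto out = std::ostringstream {};
    out << '[' << attempt << ']' << '\n';
    return out.str();
}
```

Because the overload takes a std::ostream&, the same code works with std::cout, a file stream or, as here, a string stream.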

Finally, you may notice the std::endl at the end of the chain. This is a stream manipulator. Stream manipulators are used to modify the stream to support different kinds of formatting. In this case, std::endl simply appends a '\n' to the stream and flushes the underlying buffer. So why not just use '\n'? Well, you should. Using std::endl repeatedly just to add newlines can dramatically degrade performance because each flush forces the buffered characters to be written out immediately instead of allowing the output to buffer ie. reach a large enough size to warrant making a system call. std::endl should only be used when you want to flush the stream's buffer and place a newline eg. at the end of a program, otherwise use an explicit '\n'.
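
The following sketch spells out the two steps that std::endl performs by writing them explicitly with '\n' and the std::flush manipulator. It writes to a std::ostringstream rather than std::cout so the resulting characters can be inspected; the two function names are hypothetical.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: ends the line with std::endl.
auto written_with_endl() -> std::string
{
    auto out = std::ostringstream {};
    out << "guess" << std::endl;  // appends '\n', then flushes
    return out.str();
}

// Hypothetical helper: performs the same two steps explicitly.
auto written_with_newline() -> std::string
{
    auto out = std::ostringstream {};
    out << "guess" << '\n' << std::flush;  // the newline and the flush as separate steps
    return out.str();
}
```

Both functions produce identical characters. The only thing std::endl adds over '\n' is the flush, which is why a plain '\n' is the better default.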

Generating a Secret Number

Now we want some way to generate a secret number that the player will try to guess. We also want the number to be different each time so the game is more fun but we'll keep it between 1 and 100 to ensure it is not too difficult. To generate our secret number we'll use a random number generator. The C++ standard library contains a header <random> which contains a bunch of facilities for generating random numbers. Update your main.cxx file according to Listing 2-4.

Listing 2-4: Added code to generate a random number.

#include <iostream>
#include <random>
#include <string>

auto main() -> int
{
    std::cout << "Guessing Game!\n";

    auto rd = std::random_device {};
    auto gen = std::mt19937 { rd() };
    auto distrib = std::uniform_int_distribution { 1, 100 };

    auto const secret_number = distrib(gen);

    std::cout << "The secret number is: " << secret_number << '\n';

    std::cout << "Please input your guess: ";

    auto input = std::string {};

    std::getline(std::cin, input);

    std::cout << "You guessed: " << input << std::endl;

    return 0;
}

First we include the new header <random> so we can access the (pseudo-) random number generation types. Next we add the lines

    auto rd = std::random_device {};
    auto gen = std::mt19937 { rd() };
    auto distrib = std::uniform_int_distribution { 1, 100 };

The first line (default) constructs a new std::random_device. This is a uniformly distributed, non-deterministic number generator. While we could generate random numbers by simply calling rd, this is considered bad practice as the performance of many std::random_device implementations degrades sharply once their entropy pool is exhausted. For this reason we only use it to seed a proper Pseudo-Random Number Generator (PRNG) such as std::mt19937, which is what we do on the second line. Finally we construct a std::uniform_int_distribution<> which is used to uniformly generate integers between the two provided bounds (inclusive).

This sets up our random number generator. To obtain a random number we call the distribution object, passing in the generator, which returns a new random value.

    auto const secret_number = distrib(gen);
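
One way to see the role the seed plays is to replace the std::random_device seed with a fixed number. The sketch below does this with two hypothetical helpers, draw_numbers and all_in_range; a fixed seed is an assumption made purely for demonstration and is not something the game itself should do.

```cpp
#include <random>
#include <vector>

// Hypothetical helper: draws `count` numbers in [1, 100] from a generator seeded with `seed`.
auto draw_numbers(std::mt19937::result_type seed, int count) -> std::vector<int>
{
    auto gen = std::mt19937 { seed };
    auto distrib = std::uniform_int_distribution<int> { 1, 100 };

    auto numbers = std::vector<int> {};
    for (auto i = 0; i < count; ++i) {
        numbers.push_back(distrib(gen));
    }

    return numbers;
}

// Hypothetical helper: checks every number lies within the inclusive bounds.
auto all_in_range(std::vector<int> const& numbers) -> bool
{
    for (auto const number : numbers) {
        if (number < 1 || number > 100) {
            return false;
        }
    }

    return true;
}
```

Within one build of the program, calling draw_numbers() twice with the same seed gives the same sequence. This is why the game seeds its generator from std::random_device, so that the secret number differs on each run.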

Comparing the Guess to the Secret Number

Next we want to compare our player's guess to the secret number. The updated code can be seen in Listing 2-5.

Listing 2-5: Added code to compare the player's input to the secret number.

#include <compare>
#include <iostream>
#include <random>
#include <string>

auto main() -> int
{
    std::cout << "Guessing Game!\n";

    auto rd = std::random_device {};
    auto gen = std::mt19937 { rd() };
    auto distrib = std::uniform_int_distribution { 1, 100 };

    auto const secret_number = distrib(gen);

    std::cout << "The secret number is: " << secret_number << '\n';

    std::cout << "Please input your guess: ";

    auto input = std::string {};

    std::getline(std::cin, input);

    auto guess = std::stoi(input);

    if (auto const cmp = guess <=> secret_number; std::is_eq(cmp)) {
        std::cout << "You guessed correctly!\n";
    } else if (std::is_lt(cmp)) {
        std::cout << "Too small!\n";
    } else if (std::is_gt(cmp)) {
        std::cout << "Too big!\n";
    }

    return 0;
}

Before we are able to compare the player's input to our secret number we must first convert the raw input into a number so they can be compared.

    auto guess = std::stoi(input);

C++ offers a few functions for converting strings into numbers which all start with the prefix std::sto* meaning 'string-to' followed by a designator for the conversion type. Because we want to parse our input as a plain int we can use std::stoi().
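
A few other members of this family are sketched below. std::stoul() and std::stod() are standard functions from <string>; the wrapper names parse_hex, parse_percentage and parse_guess are hypothetical and only there to give each call a descriptive label.

```cpp
#include <string>

// Hypothetical helper: std::stoul() accepts an optional numeric base as its third argument.
auto parse_hex(std::string const& text) -> unsigned long
{
    return std::stoul(text, nullptr, 16);
}

// Hypothetical helper: std::stod() parses floating point text.
auto parse_percentage(std::string const& text) -> double
{
    return std::stod(text) / 100.0;
}

// Hypothetical helper: std::stoi() skips leading whitespace and accepts a sign.
auto parse_guess(std::string const& text) -> int
{
    return std::stoi(text);
}
```

For example parse_hex("ff") gives 255 and parse_guess("  -17") gives -17.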

Next we compare the guess to our secret_number. Here we can make use of the spaceship operator (<=>), which performs a 'three-way comparison' whose result we can then query with the utility functions std::is_eq, std::is_lt, std::is_gt etc. In this case we store the result in a new object cmp and then use these 'named comparison' functions to check it. We use if and else if branches to test the comparison's result and run a separate piece of code depending on which branch succeeds.

    if (auto const cmp = guess <=> secret_number; std::is_eq(cmp)) {
        std::cout << "You guessed correctly!\n";
    } else if (std::is_lt(cmp)) {
        std::cout << "Too small!\n";
    } else if (std::is_gt(cmp)) {
        std::cout << "Too big!\n";
    }

We have also used an initialiser statement in the first if branch. This allows us to run an expression at the start of the if statement and store the result in a local variable (in this case cmp) which can only be accessed within the if and else if branches. This helps ensure that cmp is not modified or accessed outside the branches it belongs to.
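
The scoping of the initialiser statement can be sketched in isolation. The hypothetical function judge below mirrors the branches of Listing 2-5 but, as an assumption to keep the example compilable as C++17, it stores a plain subtraction in the initialiser instead of the result of <=>.

```cpp
#include <string>

// Hypothetical helper mirroring the branches in Listing 2-5.
// Assumption: plain subtraction stands in for <=> so the sketch also compiles as C++17.
// `difference` only exists inside the if / else if / else chain; naming it after the
// chain would be a compile error.
auto judge(int guess, int secret_number) -> std::string
{
    if (auto const difference = guess - secret_number; difference == 0) {
        return "You guessed correctly!";
    } else if (difference < 0) {
        return "Too small!";
    } else {
        return "Too big!";
    }
}
```

Every branch of the chain can read difference, but nothing outside of it can.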

Handling Parsing Errors with Exceptions

Our game is coming along quite nicely but it has one fundamental flaw. What happens if we give our game the input "abcd34" or "38574876546456476745"? We get the following two errors and our game crashes!

# input: "abcd34"
terminate called after throwing an instance of 'std::invalid_argument'
  what():  stoi
[1]    27989 IOT instruction  ./build/.../guessing_game

# input: "38574876546456476745"
terminate called after throwing an instance of 'std::out_of_range'
  what():  stoi
[1]    1513 IOT instruction  ./build/.../guessing_game

This is not ideal as it gives no way for the system to recover from the error and let the user try again. How do we fix this? Well notice in the error message it states that an instance of (either) std::invalid_argument (or) std::out_of_range was thrown. What are these objects? These are known as exceptions. They are a special object used to indicate that an exceptional event has occurred. These are pathways in our program that we do not expect to occur but might and exceptions allow us to recover the system without fully crashing. This is a useful mechanism for allowing systems to remain online and perform self recovery if an error does occur.

Before we look at how to handle thrown exceptions we'll first discuss what each of these exceptions means in the context of std::stoi(). std::invalid_argument is used to indicate that a general parsing error has occurred due to a bad input ie. prefixing the input with letters eg. "abcd34". The exception std::out_of_range is used to indicate that the input value cannot fit into the conversion type. For example, if "38574876546456476745" is passed to std::stoi() this exception is thrown because the max value that can fit inside a 32-bit int is 2147483647, which is much smaller than 38574876546456476745.

Tip

The std::sto* function family will 'successfully' parse inputs like "34abc" as they extract the number from the front and will discard the rest.

Catching Exceptions

So how do we handle an exception that has been thrown? We can use a try-catch block. When there is a chance for something to fail we place the potentially failing code in a try block. After a try block we put one or more catch blocks. These are used to define the exception handling pathway for that particular exception. For our simple program we can define a try-catch block like in Listing 2-6.

Listing 2-6: Added exception error handling for std::stoi call.

#include <compare>
// --snip--
#include <cstdlib>
#include <iomanip>
#include <iostream>
#include <random>
#include <stdexcept>
#include <string>
// --snip--

auto main() -> int
{
    std::cout << "Guessing Game!\n";

    auto rd = std::random_device {};
    auto gen = std::mt19937 { rd() };
    auto distrib = std::uniform_int_distribution { 1, 100 };

    auto const secret_number = distrib(gen);

    std::cout << "The secret number is: " << secret_number << '\n';

    std::cout << "Please input your guess: ";

    auto input = std::string {};

    std::getline(std::cin, input);

    // --snip--

    auto guess = int {};

    try {
        guess = std::stoi(input);
    } catch (std::invalid_argument const&) {
        std::cout << "Invalid input " << std::quoted(input) << "!\n";
        std::exit(0);
    } catch (std::out_of_range const&) {
        std::cout << "Input " << std::quoted(input) << " is too large!" << '\n';
        std::exit(0);
    }

    if (auto const cmp = guess <=> secret_number; std::is_eq(cmp)) {
        std::cout << "You guessed correctly!\n";
    } else if (std::is_lt(cmp)) {
        std::cout << "Too small!\n";
    } else if (std::is_gt(cmp)) {
        std::cout << "Too big!\n";
    }

    return 0;

    // --snip--
}

Warning

While try-catch blocks do model a form of control flow they are very different to regular control flow mechanisms like if statements. You should not use try-catch blocks to control the regular/expected execution pathway of a program, as they are much slower, nor should you throw exceptions in order to jump out to a particular scope. Exceptions should only be used to indicate that a recoverable error has occurred, with try-catch blocks being used to handle recovering from this event eg. giving any allocated resources back to the OS. As such, exceptions should be used only in exceptional (pun most definitely intended) cases and when appropriate for your domain (as they can be undesirable in many situations). The main purpose of showing exceptions now is to demonstrate how to handle them, not how to throw your own.

Allowing Multiple Guesses with a Loop

Now that we correctly handle the exceptional cases of parsing our player's input we can look at making the game more interactive. Only having one guess doesn't make our game very fun. Let's allow the player to make multiple guesses by introducing a loop! We will want this loop to run forever with explicit mechanisms for exiting the loop. We can use a while loop with its condition simply being true. This will create our infinite loop. But how and when do we exit the loop? We want the loop to be broken when the player guesses the correct number. We can do this by introducing a break statement in the first if branch when comparing the player's input to the secret number. break is used to break out of the enclosing loop block. We also need the program to run the next loop iteration if an exception occurs, skipping the comparisons. We can do this with a continue statement within each of the catch blocks to skip to the next iteration. Finally, be sure to move the prompt output and player input logic into the loop so they are run each iteration.

Listing 2-7: Placed game in an infinite loop to allow player multiple guesses.

#include <compare>
#include <iomanip>
#include <iostream>
#include <random>
#include <stdexcept>
#include <string>

// --snip--

auto main() -> int
{
    std::cout << "Guessing Game!\n";

    auto rd = std::random_device {};
    auto gen = std::mt19937 { rd() };
    auto distrib = std::uniform_int_distribution { 1, 100 };

    auto const secret_number = distrib(gen);

    std::cout << "The secret number is: " << secret_number << '\n';

    auto input = std::string {};

    auto guess = int {};

    // --snip--

    while (true) {
        std::cout << "Please input your guess: ";
        std::getline(std::cin, input);

        // --snip--

        try {
            guess = std::stoi(input);
        } catch (std::invalid_argument const&) {
            std::cout << "Invalid input " << std::quoted(input) << "!\n";
            continue;
        } catch (std::out_of_range const&) {
            std::cout << "Input " << std::quoted(input) << " is too large!" << '\n';
            continue;
        }

        if (auto const cmp = guess <=> secret_number; std::is_eq(cmp)) {
            std::cout << "You guessed correctly!\n";
            break;
        } else if (std::is_lt(cmp)) {
            std::cout << "Too small!\n";
        } else if (std::is_gt(cmp)) {
            std::cout << "Too big!\n";
        }
    }

    return 0;
}

Fantastic! With one final tweak we will have finished the guessing game. Our game is still printing the secret number! We can fix this by deleting the line that prints it. The final code is available in Listing 2-8.

Listing 2-8: Final game.

#include <compare>
#include <iomanip>
#include <iostream>
#include <random>
#include <stdexcept>
#include <string>

auto main() -> int
{
    std::cout << "Guessing Game!\n";

    auto rd = std::random_device {};
    auto gen = std::mt19937 { rd() };
    auto distrib = std::uniform_int_distribution { 1, 100 };

    auto const secret_number = distrib(gen);
    auto input = std::string {};
    auto guess = int {};

    while (true) {
        std::cout << "Please input your guess: ";
        std::getline(std::cin, input);

        try {
            guess = std::stoi(input);
        } catch (std::invalid_argument const&) {
            std::cout << "Invalid input " << std::quoted(input) << "!\n";
            continue;
        } catch (std::out_of_range const&) {
            std::cout << "Input " << std::quoted(input) << " is too large!" << '\n';
            continue;
        }

        if (auto const cmp = guess <=> secret_number; std::is_eq(cmp)) {
            std::cout << "You guessed correctly!\n";
            break;
        } else if (std::is_lt(cmp)) {
            std::cout << "Too small!\n";
        } else if (std::is_gt(cmp)) {
            std::cout << "Too big!\n";
        }
    }

    return 0;
}

Summary

This project offered a hands-on way to learn many of C++'s features: auto, variables, functions, if statements, exception handling and loops! In the upcoming chapters you will delve deeper into these concepts as well as explore many new ones. See you there!

Common Programming Concepts

Warning

🚧 Page Under Construction! 🏗️

Throughout this chapter we will cover some of the most common concepts that appear in many different programming languages and how they work in C++. None of these concepts are unique to C++ but they may work slightly differently to how you are used to.

Info

Keywords are words reserved for use by the language, meaning they cannot be used as identifier names for variables or functions. See Appendix A for the full list of keywords.

Variables and Mutability

Warning

🚧 Page Under Construction! 🏗️

We first saw variables in our mini guessing game project where we used them to store the guess of the user, create our PRNG etc. Let's explore what happens when we try to modify constant data and when we would want to allow mutations.

Danger

By default, variables are mutable, allowing you to modify them freely. While this offers great flexibility and ease of programming, it is beneficial to opt in to immutability using the const keyword so that data which does not need to change cannot change, removing the const keyword only when data needs to be mutable.

Create a new project (or use an existing one) with a main.cxx and CMakeLists.txt etc. like we did for our previous programs, and we'll explore mutability. Change the name of the target to main in the CMakeLists.txt, as I'll be using this as the target name for (nearly) all examples from now on in the book.

cmake_minimum_required(VERSION 3.22)

project(main
    VERSION 0.1.0
    DESCRIPTION "C++ Book Example"
    LANGUAGES CXX)

add_executable(main main.cxx)
target_compile_features(main PRIVATE cxx_std_20)

In your main.cxx, write the following program.

#include <iostream>

auto main() -> int {
    auto const x = 42;

    std::cout << x << std::endl;
    x = 43;
    std::cout << x << std::endl;
    
    return 0;
}

When we try to compile this we should get an error like so:

$ cmake -S . -B build --preset=<platform>
$ cmake --build build
[ 50%] Building CXX object CMakeFiles/main.dir/main.cxx.o
/home/user/projects/common/main.cxx: In function ‘int main()’:
/home/user/projects/common/main.cxx:7:7: error: assignment of read-only variable ‘x’
    7 |     x = 43;
      |     ~~^~~~
gmake[2]: *** [CMakeFiles/main.dir/build.make:76: CMakeFiles/main.dir/main.cxx.o] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:83: CMakeFiles/main.dir/all] Error 2
gmake: *** [Makefile:91: all] Error 2

It is vital that we catch errors like this at compile time as it prevents us from writing bad and vulnerable code. Constant data is also easier to reason about as we can assume that no part of the program will modify this piece of data. The benefits of this do not emerge properly until we introduce functions and have to share data across function boundaries, where we expect the function to not mutate data passed to it even though the surrounding scope might. More on this later.

Even though immutable data is easier to reason about, mutating data is where the fun parts of computation occur. We can see that by dropping the const we can mutate the variable freely.

#include <iostream>

auto main() -> int {
    auto x = 42;

    std::cout << x << std::endl;
    x = 43;
    std::cout << x << std::endl;
    
    return 0;
}

With it now compiling and running:

$ cmake -S . -B build --preset=<platform>
$ cmake --build build
$ ./build/main
42
43

Constant Expressions

C++ allows us to define constants whose value is computed at compile time using the constexpr keyword. This allows you to define variables that are the result of some computation but have the value ready before the program runs, instead of performing the computation at runtime. constexpr variables are implicitly const and so are naturally immutable.

To actually see this feature in action, we need to look at the assembly generated for code using constexpr and code without. In the example below we see two numbers: one is a constexpr variable initialized to some expression, even containing a function call, and the other is initialized to a simple number but immediately changed to the value of the same expression.

#include <iostream>

auto constexpr sum(auto const n) {
    auto acc = 0;
    for (auto i = 0; i < n; ++i) {
        acc += 1;
    }

    return acc;
}

auto main() -> int {
    auto constexpr x = (42 + 7) / sum(23);
    auto y = 6;
    y = (42 + 7) / sum(23);

    std::cout << x << std::endl;
    std::cout << y << std::endl;
    
    return 0;
}

This generates the following assembly (at least for GCC-14):

main:
        push    rbp
        mov     rbp, rsp
        sub     rsp, 16
        mov     DWORD PTR [rbp-4], 2
        mov     DWORD PTR [rbp-8], 6
        mov     edi, 23
        call    auto sum<int>(int)
        mov     ecx, eax
        mov     eax, 49
        cdq
        idiv    ecx
        mov     DWORD PTR [rbp-8], eax
        mov     esi, 2

The places of interest are the 5th and 6th lines and then lines 7-13. The first pair is the variables x and y being initialized. Line 6 makes sense because we initialized y with the literal 6, but line 5 shows x being initialized directly with 2, the already computed result of (42 + 7) / sum(23). Compare this to lines 7-13, which show the argument being loaded and sum() being called, the operands being moved into registers, a division instruction (idiv) and finally the result being stored into the variable on the stack frame. That's not even to mention the instructions needed to run sum() (take a look at the link below for the full assembly). The difference is quite distinguishable.

constexpr example

While the example above is simple (and a little contrived*), constexpr has become a very powerful feature of C++ and is capable of computing very complex expressions at compile time, even expressions involving objects that typically interact with runtime-only entities like the heap. We'll learn more about this in future chapters.

Abstract

*This initialization and immediate change is necessary to force the compiler to generate the unoptimized assembly I wanted to show off. Compilers have gotten so good that regardless of constexpr or no constexpr, a variable directly initialized to this expression will cause the compiler to optimize the whole thing away into the result of the expression and directly initialize the variable with that value.

In fact, it completely removes the definition of sum() as it is only used in these expressions which run at compile time, so there is no need to store the function's code in the resulting binary if it is never used again. Setting the second value to a temporary value prevents the compiler from making these optimizations.

It's amazing how much heavy lifting compilers are able to do for us.

Note

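It should be noted that, when applied to a function, constexpr only indicates to the compiler that calls to it can be evaluated at compile time; it makes no guarantee that any particular call will be, unless the result is used where a constant is required (such as initializing a constexpr variable). To require that every call is evaluated at compile time, consteval was introduced in C++20.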

Type Deduction

You may be wondering why I am using auto to declare variables instead of writing the type like below. C++ is a statically typed language after all... right?

int x = 5;
auto y = 6;

auto is a keyword that allows the compiler to perform type deduction, which means we tell the compiler to figure out the type of the variable or function return signature from the context it is given.

Storage Duration

Data in C++ falls into different storage duration categories, which dictate the lifetime of the data. So far we have seen data with automatic storage duration; this is data that is automatically freed when it goes out of scope. These are variables that do not allocate heap memory and instead live entirely on the stack, and thus are freed when stack frames are popped, which occurs naturally as functions return.

Data with dynamic storage duration is data that is created at runtime and must be deallocated manually before the program finishes. This is data that is usually stored on the heap or what C++ formally calls the free store.

One we haven't looked at yet is static storage duration. This is data that is encoded directly in the binary of a program and thus lives for the entire duration of the program. To give a local variable this storage duration we declare it with the static keyword. Variables declared outside of any function (global variables) implicitly have static storage duration.

Data Types

Warning

🚧 Page Under Construction! 🏗️

As we mentioned on the last page, C++ is a statically typed language, which means the type of data must be known (or deducible) to the compiler. C++ has a large selection of types available to use; some are language primitives and others are defined in the standard library. On this page we will look at four categories of types: scalar integrals, floating point, compound and special types.

Scalar Types

Scalar integrals are types encoded as whole numbers. This not only includes integer types but also C++'s character and Boolean types.

Integer Types

An integer is a whole number. C++ has a few different integer types which have different bit widths. The default int is 32 bits wide on most platforms. By default integer types are signed ie. they can represent both positive and negative numbers. If you need unsigned numbers you can use the unsigned qualifier.

int const x = -5;
unsigned int const y = 5;

If you need integers of different sizes you can use size qualifiers with the int type to dictate the minimum size the integer can be. All of these can be used in combination with the unsigned qualifier.

Type	Full Type	Minimum Size (bits)	Signed Value Range	Unsigned Value Range
`char`	`char`	at least 8	-128 to 127	0 to 255
`short`	`short int`	at least 16	-32,768 to 32,767	0 to 65,535
`int`	`int`	at least 16	-32,768 to 32,767	0 to 65,535
`long`	`long int`	at least 32	-2,147,483,648 to 2,147,483,647	0 to 4,294,967,295
`long long`	`long long int`	at least 64	-9,223,372,036,854,775,808 to 9,223,372,036,854,775,807	0 to 18,446,744,073,709,551,615

You can also use fixed width integer types (FWIT) from the <cstdint> header. FWIT have the form std::intN_t or std::uintN_t where N is the exact number of bits wide. The standard library defines FWIT (signed and unsigned) for 8, 16, 32 and 64 bit widths.

The bit width of an integer dictates how many values the integer can represent. As of C++20, all signed integers must be represented using two's complement, which means that for signed numbers the range of values is $-2^{N-1}$ to $+2^{N-1}-1$ eg. -128 to 127 for an 8-bit number, and for unsigned numbers the range is $0$ to $2^N-1$ eg. 0 to 255 for an 8-bit number.

In addition to these integer types there are std::size_t and std::ptrdiff_t, which are unsigned and signed types respectively that are typically as wide as the addresses of a given architecture, eg. 64 bits on a 64-bit architecture. std::size_t is the type used when indexing arrays or getting the size of objects. The odd name for std::ptrdiff_t comes from it being the type returned when subtracting two pointers; in practice it serves as the signed counterpart of std::size_t.

Literals

You can specify the type/width of an integer using a literal suffix from the table below with the u suffix being able to be used in combination with the other two.

Suffix	Type
`u` or `U`	`unsigned`
`l` or `L`	`long`
`ll` or `LL`	`long long`

Additionally you can write integer literals in a different base form by changing the prefix of the literal.

auto const decimal = 42;
auto const octal = 052;
auto const hex = 0x2a;
auto const Hex = 0X2A; // capital hex digits
auto const binary = 0b101010;

Integers can also be separated using a ' to make large numbers easier to read.

auto const x = 1'234'567'890;

Character Types

You'll notice that we have included the char type in the integer list above. This is because character types in C++ are represented using numbers, specifically char represents ASCII code points. Character literals are specified with single quotes like the example below.

char const x = 'a';
auto const y = 'b';

Boolean Type

C++'s Boolean type is called bool and can either hold the value true or false. Booleans are used mostly in conditional and loop statements eg. if and while.

bool x = false;
auto y = true;

Tip

The C language, C++'s mother language, originally did not have a native Boolean type, with Boolean expressions returning 1 for true and 0 for false. Later, in the 1999 standard of C (C99), the _Bool type was introduced to support Booleans.

Floating Point Types

C++ has three floating point types, all of which are based on the IEEE-754 standard. Floating point numbers are used to represent decimal numbers ie. numbers that can store fractional components. These types are float, double and long double; with float representing single precision (32-bit) numbers, double representing double precision (64-bit) numbers and long double being an extended or quadruple precision floating point number (up to 128 bits, depending on the platform).

With auto, floating point literals are deduced as double by default, with float and long double literals being specified by the f and l literal suffixes.

auto const f = -0.06f;
auto const d = 47.5768;
auto const l = -655456.457567l;

We can also initialize floating points using exponential form:

auto const f = -6e-2f;
auto const d = 475768e-4;
auto const l = -655456457567e-6l;

Arithmetic Operations

Integral and floating point types are categorized as arithmetic types, which means they support the common arithmetic operations like addition, subtraction etc.

auto main() -> int {
    // addition
    auto const sum = 4 + 6;

    // subtraction
    auto const diff = 10 - 5.5;

    // multiplication
    auto const mul = 5 * 3.2;

    // division
    auto const idiv = 10 / 3;
    auto const fdiv = 13.5 / 2.4;

    // remainder
    auto const rem = 23 % 4;

    return 0;
}

Tip

Division between two integrals performs integer division and truncates towards 0 while if one argument is a floating point then floating point division is performed.
Remainder is only valid between integral types.

Compound Data Types

Compound data types store multiple pieces of data or are data that can take multiple values.

Enumerations

Enumerations or enums are a construct that allows you to define a type whose value is restricted to a set of named variants or enumerators. These named constants have an underlying integral type. Specifying the underlying type is optional ie. you can omit the : type part of the enum declaration, in which case an enum class uses int.

enum class colour : char {
    red,
    green,
    blue
};

auto const c = colour::red;

Tuple

Tuples allow you to pack multiple pieces of data of different types into a single structure. Tuples have a fixed size/number of elements that cannot grow or shrink once declared. Tuples in C++ are not a language type but are provided by the standard library as std::tuple in the <tuple> header. We create a tuple using brace initialization (first line below) or using the helper function std::make_tuple() (second line).

auto const t = std::tuple { 5u, 5.34f, -345, "abc", false };
auto const u = std::make_tuple(5u, 5.f, -345, "abc", false);

Tuples can be accessed using std::get<I>(t) with I being the index of the value we want to access and t is the tuple object.

auto const e = std::get<2>(t);  // e := -345

You can also destructure a tuple into its constituent values like so.

auto const [v, w, x, y, z] = t;

There is a related type called std::pair (from the <utility> header) which holds exactly two values. The values of a pair can be extracted using the same methods as tuples but pairs also have the public members std::pair::first and std::pair::second, which allow you to access the data directly.

auto const p = std::pair {5, 'a'};
auto const [x, y] = p;
auto const z = p.second;

Special Types

C++ has a handful of special types that you won't use as directly as other types but which are fundamental to the language.

The first is the void type, an incomplete type that is used to indicate that a function does not return a value.

auto foo(auto const i) -> void {
   i + 5; 
}

The other type is std::nullptr_t, which is the type of nullptr, the value of a pointer pointing to nothing.

Array Types

C++'s array type is a fixed-size container whose elements are all of the same type. The array type is called std::array and is found in the <array> header. Array elements can be accessed using the subscript operator [] or the array::at() method, with indices starting at 0. The subscript operator does not perform bounds checking while array::at() does, meaning the latter will throw an exception (std::out_of_range) if an out of bounds index is used while the former is undefined behaviour, which may crash the program... sometimes.

auto const a = std::array { 1, 2, 3, 4, 5 };
auto const e1 = a[0]; // valid
auto const e2 = a.at(5); // exception std::out_of_range

Functions

Warning

🚧 Page Under Construction! 🏗️

Functions are fundamental to programming as they allow us to write reusable pieces of code. We have already been using a function in the examples we have shown so far, that is the main() function which is called by our OS to start the program. We have also seen a function in the constexpr example.

Functions are defined by introducing a type (or auto) followed by the function's name, an (optionally empty) comma-separated list of parameters surrounded by parentheses, followed by the body of the function in (curly) braces. We call a function by writing its name followed by parentheses.

#include <iostream>
// --snip--

auto another_one() {
    std::cout << "Another one!\n";
}

auto main() -> int {
    std::cout << "Main function!\n";

    another_one();
    return 0;
// --snip--
}

Abstract

A function must be declared before it can be used, as the compiler has to know that the function symbol (name + parameter and return types) exists; however, it does not have to be defined yet. Note that the return type must be stated explicitly in such a declaration, since there is no function body from which it could be deduced.

#include <iostream>
// --snip--

// declaration
auto another_one() -> void;

auto main() -> int {
    std::cout << "Main function!\n";

    another_one();
    return 0;
// --snip--
}

// definition
auto another_one() -> void {
    std::cout << "Another one!\n";
}

This mechanism is a result of how C and thus C++ code was and still is compiled and linked together. It allows you to state that a symbol exists in a public header (declare it) but define it later in a source file, which was usually built into a binary library, with the linker then connecting the calls to the function to its location in the library.

Parameters

Parameters are a way to pass information into functions. The type of each parameter must be specified, using the same syntax we saw to declare a variable (without an initializer).

#include <iostream>
// --snip--

auto another_one(int const x, int const y) {
    std::cout << "x: " << x << ", y: " << y << "\n";
}

auto main() -> int {
    std::cout << "Main function!\n";

    another_one(7, 6);
    return 0;
// --snip--
}

Tip

As we saw in the constexpr example from an earlier page, function parameters may also be declared with auto but this can sometimes make it hard to know what the type of the parameter is supposed to be.

Return Values

Functions can also return values using the return keyword. The type of the return value is indicated either before the function's name (C-style) or using a trailing return type, like we've been using for main(). When a function doesn't return a value, its return type is void.

#include <iostream>
#include <sstream>
#include <string>
// --snip--

auto another_one(int const x, int const y) -> std::string {
    auto ss = std::stringstream {};
    ss << "x: " << x << ", y: " << y << "\n";
    return ss.str();
}

auto main() -> int {
    std::cout << "Main function!\n";

    std::cout << another_one(7, 6);
    return 0;
// --snip--
}

Overloading

In C++ you can overload functions of the same name to have different implementations as long as the parameter types of each function are different. This is because the parameter types are part of the function's symbol, and thus functions with the same name but different parameters are entirely different functions. Note that functions cannot be overloaded on their return type alone.

#include <iomanip>
#include <iostream>
#include <sstream>
#include <string>
// --snip--

auto another_one(int const x, int const y) -> std::string {
    auto ss = std::stringstream {};
    ss << "x: " << x << ", y: " << y << "\n";
    return ss.str();
// --snip--
}

auto another_one(float const x, float const y) -> std::string {
    auto ss = std::stringstream {};
    ss << std::setprecision(4) 
       << "x: "
       << x
       << ", y: "
       << y
       << "\n";

    return ss.str();
}

auto main() -> int {
    std::cout << "Main function!\n";

    std::cout << another_one(7, 6);
    std::cout << another_one(7.456575654f, 6.0f);
    return 0;
// --snip--
}

This concept also extends to C++ operators, which can also be overloaded to have custom functionality between custom types. Operators are overloaded using the operator keyword as the function name, suffixed with the operator we wish to overload. When written as free functions like below, overloads of binary operators take exactly two parameters while overloads of unary operators take exactly one.

#include <iostream>
#include <ostream>
#include <utility>
// --snip--

auto operator<<(std::ostream& os, std::pair<int, int> p) -> std::ostream& {
    auto const [x, y] = p;
    os << "x: " << x << ", y: " << y << "\n";
    return os;
}

auto main() -> int {
    auto const p = std::pair {7, 6};

    std::cout << p << "\n";
    return 0;
// --snip--
}

Warning

There are a few operators that cannot be overloaded such as scope lookup (::), and member access operators (., ->, .* and ->*).

Comments

Warning

🚧 Page Under Construction! 🏗️

Comments are a way to document code for other people, and yourself. In C++ there are two types of comments, single line and multi-line. We've seen single line comments in many of the previous examples but to reiterate, a single line comment is started with // and any text written after it until a newline is ignored by the compiler.

// Comment on its own line

auto const x = 5; // Comment

Multi-line comments are specified using /* */ quoting ie. the comment extends from the /* comment opener and continues until */. This allows comments to span multiple lines or be nested amongst code (if you really want).

/*
multi-line comment
another line
*/

auto const /* int */ x = 5;

Control Flow

Warning

🚧 Page Under Construction! 🏗️

Control flow is how we get our programs to do interesting things; it allows us to write programs that do different things depending on conditions (branches) or easily repeat code (loops). C++ also has various relational and logical operators used to construct the conditional expressions used by the control flow statements. You can read about them in Appendix B.

Branches

if statements

An if statement is the simplest control flow structure; it allows us to execute a piece of code only if a condition is true. if statements are declared using the if keyword followed by the conditional expression in parentheses. The code to execute is contained in braces, like function definitions.

#include <iostream>
// --snip--

auto main() -> int {
    auto const x = 6;

    if (x % 2 == 0) {
        std::cout << "Even\n";
    }

    return 0;
// --snip--
}

We can add an alternative branch using the else keyword after the closing brace of the if block. This branch will run if the condition in the if statement is false.

#include <iostream>
// --snip--

auto main() -> int {
    auto const x = 5;

    if (x % 2 == 0) {
        std::cout << "Even\n";
    } else {
        std::cout << "Odd\n";
    }

    return 0;
// --snip--
}

We can create multiple branches based on various conditions using else if statements. These are declared after the initial if statement.

#include <iostream>
// --snip--

auto main() -> int {
    auto const x = 5;

    if (x % 2 == 0) {
        std::cout << "Even\n";
    } else if (x % 5 == 0) {
        std::cout << "5 multiple\n";
    } else {
        std::cout << "Odd\n";
    }

    return 0;
// --snip--
}

switch statements

switch statements are a way to mix control flow with enums. A switch statement is given an enum object which is then matched against different cases ie. enum variants. There is a default case that is used if no case is matched, the equivalent of else from if statements.

The cases of a switch statement automatically fall through to the next case if you do not use a break statement to escape from the switch.

#include <iostream>
// --snip--

enum class colour : char {
    red,
    green,
    blue
};

auto main() -> int {
    auto const c = colour::red;
    
    switch (c) {
        case colour::red:
            std::cout << "red\n";
            break;
        case colour::green:
            std::cout << "green\n";
            break;
        case colour::blue:
            std::cout << "blue\n";
            break;
        default:
            std::cout << "unknown\n";
            break;
    }

    return 0;
// --snip--
}

Note

Because enums are fundamentally based on an underlying integral type, switch statements can work on any integral type like char or int. However, you have to be sure to cover all the cases yourself as there is no formal notion of pattern matching over integral ranges.
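Fall through can be put to use when several integral values should share one piece of code. Here is a minimal sketch, assuming a hypothetical helper called days_in_month() that ignores leap years. Stacked case labels fall through to the first statement beneath them.

```cpp
// Hypothetical helper: maps a month number (1-12) to its number of days,
// ignoring leap years. Any other value hits the default case.
auto days_in_month(int const month) -> int {
    switch (month) {
        case 4:
        case 6:
        case 9:
        case 11:
            return 30; // cases 4, 6 and 9 fall through to here
        case 2:
            return 28;
        case 1:
        case 3:
        case 5:
        case 7:
        case 8:
        case 10:
        case 12:
            return 31;
        default:
            return 0;  // not a valid month
    }
}
```

No break is needed here because each return leaves the function, and with it the switch.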

Loops

while loop

while loops are the fundamental looping construct in C++. A while loop will repeat as long as its condition remains true.

#include <iostream>
// --snip--

auto main() -> int {
    auto i = 0uLL;
    auto acc = 0uLL;

    while (i < 10) {
        acc += i;
        i += 1;
    }

    std::cout << "Sum: " << acc << "\n";

    return 0;
// --snip--
}

There is another while loop called a do-while loop. This has the same semantics as a while loop but the loop condition is checked at the end of the loop instead of at the start. This has the effect of running the loop at least once.

#include <iostream>
// --snip--

auto main() -> int {
    auto i = 0uLL;
    auto acc = 0uLL;

    do {
        acc += i;
        i += 1;
    } while (i < 1);

    std::cout << "Sum: " << acc << "\n";

    return 0;
// --snip--
}
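The difference between the two loops only shows when the condition is false from the start. The sketch below uses two hypothetical helpers (count_while() and count_do_while() are names I made up for illustration) that count how many times each loop body runs.

```cpp
// Hypothetical helper: counts the iterations of a while loop.
auto count_while(int i, int const limit) -> int {
    auto runs = 0;
    while (i < limit) { // checked before the body runs
        runs += 1;
        i += 1;
    }
    return runs;
}

// Hypothetical helper: counts the iterations of a do-while loop.
auto count_do_while(int i, int const limit) -> int {
    auto runs = 0;
    do {
        runs += 1;
        i += 1;
    } while (i < limit); // checked after the body has run
    return runs;
}
```

With i = 5 and limit = 5 the while version runs zero times but the do-while version runs once. With i = 0 and limit = 3 both run three times.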

for loop

for loops further abstract the concept of a loop by providing dedicated syntax for initializing the loop counter and incrementing it, unlike a while loop which only has syntax for checking the loop condition. We saw a for loop in our constexpr example.

#include <iostream>
// --snip--

auto main() -> int {
    auto acc = 0uLL;

    for (auto i = 0; i < 10; i++) {
        acc += i;
    }

    std::cout << "Sum: " << acc << "\n";

    return 0;
// --snip--
}

range-for loop

In C++11, we got another for loop called a range-for loop. This loop is able to automatically traverse C++ standard container types like array. This is beneficial as it prevents us from incorrectly accessing or traversing the container ie. indexing outside of the container's bounds.

#include <iostream>
#include <array>
// --snip--

auto main() -> int {
    auto const a = std::array {1, 2, 3, 4, 5};
    auto acc = 0uLL;

    for (auto const x : a) {
        acc += x;
    }

    std::cout << "Sum: " << acc << "\n";

    return 0;
// --snip--
}
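To see what the range-for loop saves us from writing, here is a sketch comparing it with a counter-based for loop over the same array. The helper names sum_indexed() and sum_ranged() are hypothetical. In the indexed version we are responsible for the bounds, while the range-for version has no index to get wrong.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: sums the array with a counter-based for loop.
auto sum_indexed(std::array<int, 5> const a) -> int {
    auto acc = 0;
    for (std::size_t i = 0; i < a.size(); i++) {
        acc += a[i]; // we must keep i inside the bounds ourselves
    }
    return acc;
}

// Hypothetical helper: sums the array with a range-for loop.
auto sum_ranged(std::array<int, 5> const a) -> int {
    auto acc = 0;
    for (auto const x : a) {
        acc += x; // each element is visited exactly once
    }
    return acc;
}
```

Both helpers produce the same sum for the same array.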

Ownership

Ownership of data and resources is vital to consider when writing complex and sophisticated programs in C++ (or any other systems-level programming language) because resources like memory need to be managed manually. Having a clear picture of who owns what data and who has access to it ensures we write safer programs.

What is Ownership?

Warning

🚧 Page Under Construction! 🏗️

Ownership is the notion that some data is managed or owned by a particular variable, which is thus responsible for ensuring that its data lives long enough for all parts of the program that reference the data to access it correctly.

We first had a look at lifetimes in Common Concepts - Variables and Mutability when discussing storage duration of data but we are now going to discuss how this comes into effect in our programs.

The Stack and the Heap

You'll hear a lot about the stack and the heap when discussing C++ but what are they?

These are two regions of memory that your program can access during its execution. The stack is a fixed-size region that is utilised automatically by your program. When variables are created, the data is pushed onto the top of the stack and the stack pointer is adjusted by the size in bytes of the newly pushed variable. When that variable is no longer referenceable ie. it goes out of scope, the value is popped off the stack, thus deleting the data. Data with automatic storage duration lives on the stack, and it is where all the variables we have shown in the previous examples have been allocated.

Function calls also interact with the stack in an interesting way. When a function is called, a new stack frame is created which encapsulates all the data created during the function call, as well as information about parameters and how to get back to the function's call site ie. parameter data and return address storage. This is done so that when a function does return, the entire stack frame can be popped off, deallocating all data created during the function's execution.

All in all, the stack is super fast and automatically allocates and deallocates memory for us, thus allowing the lifetime of variables to be computed by the compiler, not us! So why don't we always use the stack? We can't, because the stack is a fixed size and cannot grow beyond its original capacity, which usually isn't very large because our OS wants to allow lots of programs to run at once.

This is where the heap comes in. The heap is slower but dynamic memory that our program requests at runtime. This allows us to create variable-sized memory regions that we can grow and shrink as need be. However, this comes at the cost of having to manually return this memory to the OS, otherwise it is leaked! This means we have to track the lifetime of the data we create and ensure it is freed correctly. Data of this kind is categorised as having dynamic storage duration.
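To make the cost of the heap concrete, here is a minimal sketch of requesting and returning heap memory by hand. The helper name sum_on_heap() is hypothetical, and it uses the new[] and delete[] operators along with a raw pointer, none of which have been introduced yet, so treat it purely as an illustration of the bookkeeping involved.

```cpp
#include <cstddef>

// Hypothetical helper: sums 0..n-1 using a buffer whose size is only
// known at runtime, so it cannot be a fixed-size stack variable.
auto sum_on_heap(std::size_t const n) -> unsigned long long {
    // Request n integers from the heap (dynamic storage duration).
    unsigned long long* data = new unsigned long long[n];

    for (std::size_t i = 0; i < n; i++) {
        data[i] = i;
    }

    auto acc = 0uLL;
    for (std::size_t i = 0; i < n; i++) {
        acc += data[i];
    }

    // Manually return the memory. Forgetting this line leaks it.
    delete[] data;

    return acc;
}
```

Nothing in the language forces the delete[] to be written, which is exactly the risk described above.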

Scope

Scopes define what set of symbols and objects are valid to reference in our program. We've encountered quite a few different uses of scope in our travels thus far, the obvious one being functions. Functions create an entirely new scope that isn't just semantic (ie. only enforced by the compiler for correctness' sake) but has an effect on the execution of a program. When a function is called it allocates a new stack frame, meaning the lifetime of all data created in that function is bound to that function's lifetime.

We can also see scope with control flow statements like for and range-for loops, as the initializer and the loop variable of each statement type respectively are bound only to the scope of the statement body. In fact, you can introduce an unnamed scope using a brace block.

{
    auto const x = 5;

    // do stuff with x
}
// x out of scope

So how do we share data? In C++, variables have copy semantics. This means the data of an object is copied when we bind a new variable to an existing variable. We can see this in the example below, with y being assigned the value of x, not x itself, and thus the address of each object is unique.

#include <iostream>
// --snip--

auto main() -> int {
    auto const x = 5;
    auto const y = x;

    std::cout << &x << "\n";
    std::cout << &y << "\n";
    return 0;
}
// --snip--

Tip

Notice the addresses of x and y are typically only 4 bytes apart. This is because they sit right next to each other on the stack, as we discussed above.
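Another way to observe copy semantics is to change the copy and check that the original is untouched. A small sketch, using a hypothetical helper called copy_is_independent():

```cpp
// Hypothetical helper: returns true if modifying a copy leaves the
// original unchanged and the two objects live at different addresses.
auto copy_is_independent() -> bool {
    auto x = 5;
    auto y = x; // y receives its own copy of the value 5

    y += 1;     // only y changes

    return x == 5 && y == 6 && &x != &y;
}
```

The helper returns true: x and y are two separate objects that merely started with the same value.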

The `std::string` Type

So what happens when data on the heap goes out of scope? To demonstrate we need to introduce the std::string type. string is more complex than the types introduced in Common Concepts - Data Types as it allocates its data on the heap and can change its size at runtime, as opposed to string literals which are encoded directly into the compiled binary. We even saw string in our guessing game!

Note

String literals are declared using a pair of double quotes ("") to surround the text. A string literal is an array of constant characters that readily converts to a pointer to its first character (const char*).

So how can we ensure that the memory allocated on the heap is automatically freed when the variable goes out of scope? Some languages use a Garbage Collector (GC) to clean up memory that is no longer in use. In C++ we do not have a GC so it is our responsibility to identify when memory is no longer needed. Or is it? C++ uses a concept known as Resource Acquisition Is Initialization or RAII. In essence, it is the idiom of binding the lifetime of a resource, like memory, to the variable or object that owns it, thus allowing the resource to be freed when the owning variable goes out of scope. This is how string, and every other standard library container, works.

#include <iostream>
#include <string>
// --snip--

auto main() -> int {
    {
        auto const s = std::string {"hello"};

        // s is in scope
    }
    // s out of scope and data freed

    return 0;
}
// --snip--
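We cannot directly watch string free its memory, but we can sketch the same idea with a type of our own. The Tracer type and run_scope() helper below are hypothetical. They rely on constructors and destructors, which are only covered later in the book, and on std::vector to record events, so take this as a rough illustration of RAII rather than how string is implemented.

```cpp
#include <string>
#include <vector>

// Hypothetical type: records when its "resource" is acquired and released.
struct Tracer {
    std::vector<std::string>& log;

    // Runs when a Tracer is created.
    explicit Tracer(std::vector<std::string>& l) : log(l) {
        log.push_back("acquire");
    }

    // Runs automatically when a Tracer goes out of scope.
    ~Tracer() {
        log.push_back("release");
    }
};

// Hypothetical helper: returns the order in which events happened.
auto run_scope() -> std::vector<std::string> {
    auto log = std::vector<std::string> {};
    {
        Tracer t { log };
        log.push_back("in scope");
    } // t goes out of scope here, so its destructor runs
    log.push_back("after scope");
    return log;
}
```

The recorded order is "acquire", "in scope", "release", "after scope": the release happens at the closing brace without us asking for it.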

References and Moves

Warning

🚧 Page Under Construction! 🏗️

Reference Semantics

So how do dynamic objects like string interact with C++'s copy semantics? Well, they obey the same rules: the data is copied into a new heap location, creating two distinct objects.

#include <iostream>
#include <string>
// --snip--

auto foo(std::string const s) {
    std::cout << "Address of s: " << static_cast<const void*>(s.data()) << "\n";
}

auto main() -> int {
    auto const s = std::string {"hello"};

    std::cout << "Address of s: " << static_cast<const void*>(s.data()) << "\n";

    foo(s);

    return 0;
// --snip--
}

This is fine for primitive values that are small in size eg. int, bool etc. but a string can get really big, and copying its data every time we, say, pass it to a function takes $O(n)$ time. What if we could refer to the same data without copying it? This is where references come into effect. As their name suggests, references allow us to refer to another object and treat the reference as if it were said object. References are declared by suffixing an ampersand (&) to the type in a variable or parameter declaration.

#include <iostream>
#include <string>
// --snip--

auto foo(std::string const& s) {
    std::cout << "Address of s: " << static_cast<const void*>(s.data()) << "\n";
}

auto main() -> int {
    auto const s1 = std::string {"hello"};
    auto const& s2 = s1; // s2 refers to s1, no copy is made

    std::cout << "Address of s1: " << static_cast<const void*>(s1.data()) << "\n";
    std::cout << "Address of s2: " << static_cast<const void*>(s2.data()) << "\n";
    foo(s2);

    return 0;
// --snip--
}

Note

Binding a reference to another reference doesn't create a reference to a reference. This is because references pass information through themselves, thus the new reference refers to the original object.
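Here is a small sketch of what this means in practice. The helper names are hypothetical. Both references end up naming the same original object, so a write through either one changes it.

```cpp
// Hypothetical helper: true if both references refer to the original object.
auto references_share_address() -> bool {
    auto a = 1;
    auto& r1 = a;  // r1 refers to a
    auto& r2 = r1; // r2 also refers to a, not to r1

    return &r1 == &a && &r2 == &a;
}

// Hypothetical helper: writes through the second reference.
auto write_through_reference() -> int {
    auto a = 1;
    auto& r1 = a;
    auto& r2 = r1;

    r2 = 5;   // modifies a itself
    return a;
}
```

references_share_address() returns true and write_through_reference() returns 5.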

References have a few special semantics. For one, a reference, once bound, cannot be rebound and thus will refer to the same object for the reference's lifetime. References also cannot refer to nothing; they must be bound at construction. This makes references super effective at sharing data safely. However, you do have to be careful as C++ does not guarantee that a reference does not outlive the object it refers to, and thus you can have a dangling reference which refers to a non-existent object and is invalid to use.

This is particularly important to consider when returning references from functions as we as programmers must ensure the object being referred to is not cleaned up when the function returns.


#include <iostream>
#include <sstream>
#include <string>
// --snip--

auto foo(std::string const& s) -> std::string const& {
    auto ss = std::stringstream {};
    ss << "Address of s: " << static_cast<const void*>(s.data()) << "\n";
    return ss.str(); // error: returning reference to temporary
}

auto main() -> int {
    auto const s = std::string {"hello"};

    std::cout << "Address of s: " << static_cast<const void*>(s.data()) << "\n";
    std::cout << foo(s);

    return 0;
// --snip--
}

cmake -S . -B build --preset=<platform>
cmake --build build
[ 50%] Building CXX object CMakeFiles/main.dir/main.cxx.o
/home/user/projects/ownership/main.cxx: In function ‘const std::string& foo(const std::string&)’:
/home/user/projects/ownership/main.cxx:9:18: error: returning reference to temporary [-Werror=return-local-addr]
    9 |     return ss.str(); // error: returning reference to temporary
      |            ~~~~~~^~
cc1plus: all warnings being treated as errors
gmake[2]: *** [CMakeFiles/main.dir/build.make:76: CMakeFiles/main.dir/main.cxx.o] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:83: CMakeFiles/main.dir/all] Error 2
gmake: *** [Makefile:91: all] Error 2

If you need to return something from a function and it was created during the function call, and so won't exist once the function returns, the return type should not be a reference but a plain value.
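As a sketch of the fix, the function below returns its result by value. The name describe() and the message it builds are my own choices for illustration. The parameter can remain a reference because the caller's string outlives the call; only the return type changes.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: builds a new string inside the function and
// returns it by value, so the caller receives its own object.
auto describe(std::string const& s) -> std::string {
    auto ss = std::stringstream {};
    ss << "Length of s: " << s.size() << "\n";
    return ss.str(); // OK: the value is handed to the caller
}
```

Calling describe("hello") yields the string "Length of s: 5\n" with no dangling reference involved.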

Move Semantics

C++ has another method for controlling data ownership called move semantics, which allows you to transfer ownership of data to another object. This leaves the previously owning object in a valid but unspecified state, which for standard library types like string is typically its empty state. Contrary to the name, moves don't move data but rather transfer ownership of it. To make an object movable we need to turn it into what is called an xvalue expression ie. treat it like a temporary value, such that the compiler can correctly resolve the move. This is done with the std::move() function found in the <utility> header.

#include <iostream>
#include <string>
#include <utility>
// --snip--

auto constexpr str_addr(std::string const& s) -> const void* {
    return static_cast<const void*>(s.data());
}

auto main() -> int {
    auto s1 = std::string {"hello this is a really long string"};
    std::cout << sizeof(s1) << "\n";
    
    std::cout << "String: " << s1 << " | addr: " << str_addr(s1) << "\n";
    auto const s2 = std::move(s1);

    std::cout << "String: " << s1 << " | addr: " << str_addr(s1) << "\n";
    std::cout << "String: " << s2 << " | addr: " << str_addr(s2) << "\n";

    return 0;
// --snip--
}

Note

We have to make s1 non-const to see the behaviour described above. If s1 were const, taking its stored data would mutate it and violate the invariant that s1 is const. Thus moving from const data will invoke a copy, not a move.

This restriction is due to moves not being destructive in C++. A destructive move would leave s1 an invalid object, and the compiler could then warn us if we accessed it after moving from it. Instead, s1 remains a valid object that has to be modified in order to give up its data.
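Both points can be checked with a short sketch. The helper names are hypothetical. The first shows that std::move() on a const string falls back to a copy, so the source keeps its text. The second shows that a moved-from string is still a valid object which can be given a new value.

```cpp
#include <string>
#include <utility>

// Hypothetical helper: "moving" from a const string copies instead.
auto move_from_const_copies() -> bool {
    auto const s1 = std::string {"hello this is a really long string"};
    auto const s2 = std::move(s1); // s1 is const, so the copy constructor is used

    return s1 == s2 && !s1.empty(); // s1 still holds its text
}

// Hypothetical helper: a moved-from string can be assigned to and reused.
auto reuse_after_move() -> std::string {
    auto s1 = std::string {"hello this is a really long string"};
    auto const s2 = std::move(s1); // ownership of the text passes to s2

    s1 = "reused"; // s1 is still a valid object
    return s1;
}
```

move_from_const_copies() returns true and reuse_after_move() returns "reused".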

The `span` and `string_view` types

Warning

🚧 Page Under Construction! 🏗️

`string_view`

Often we want to reference only part of a string. In the past we would use string::substr(), however this returns a newly allocated string. So in C++17 we got std::string_view, which is a reference to a series of characters that it does not own. string_view has almost all the same read-only operations as string, which makes it super versatile as a string substitute when needing to reference part of a string.

#include <iostream>
#include <string>
#include <string_view>
// --snip--

auto main() -> int {
    auto s = std::string { "hello" };
    auto sv = std::string_view { s.data() + 1, 3 };

    std::cout << s << "\n";
    std::cout << sv << "\n";
    
    return 0;
// --snip--
}

Note

The string::data() method is used to get the address of the first element in a string thus we can use it to get the starting address of our substring by offsetting it by the correct number of characters as seen above.
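string_view also has its own substr() method which, unlike the one on string, returns another view rather than a new string. A minimal sketch, assuming a hypothetical helper called middle():

```cpp
#include <string>
#include <string_view>

// Hypothetical helper: views up to three characters starting at index 1.
// Nothing is allocated; the result refers to the caller's characters.
auto middle(std::string_view const sv) -> std::string_view {
    return sv.substr(1, 3);
}
```

For a string holding "hello", middle() views "ell". The view is only valid while the original string is alive, and substr() throws std::out_of_range if the starting index is past the end of the view.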

We can also use string_view to handle string literals; these are the strings we create using double quotes (""). This makes string literals, which previously were just an address to the character data, much easier to use and much closer to strings, with the constraint that you cannot modify the text.

#include <iostream>
#include <string>
#include <string_view>

using namespace std::literals;

auto main() -> int {
    auto sv1 = std::string_view { "hello" };
    auto sv2 = "bye"sv;

    std::cout << sv1 << "\n";
    std::cout << sv2 << "\n";
    
    return 0;
// --snip--
}

Tip

We can create strings and string_views from string literals using literal operators. Literal operators are suffixes you attach to a literal, like the u suffix that makes an integer literal unsigned, and can be used to construct a custom type from the literal. In this case, we can make a string or string_view using the s or sv literal operators respectively. These are found in the namespace std::literals which we expose globally in the line above main().
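The types produced by the two suffixes can be checked at compile time. The sketch below assumes C++17 and uses a hypothetical helper called greeting_length().

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <type_traits>

using namespace std::literals;

// The s suffix produces an owning std::string.
static_assert(std::is_same_v<decltype("hello"s), std::string>);

// The sv suffix produces a non-owning std::string_view.
static_assert(std::is_same_v<decltype("hello"sv), std::string_view>);

// Hypothetical helper: both kinds of object know their own length.
auto greeting_length() -> std::size_t {
    auto const s = "hello"s;
    auto const sv = "hello"sv;
    return s.size() + sv.size();
}
```

greeting_length() returns 10, five characters from each object.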

Spans

We can generalise this notion of a view using the std::span type. Because a span is more general than a string_view there are far fewer methods available. However, they still cover all you need when working with a generalised view (or span) of a contiguous data structure.

spans are used for similar reasons to string_view: to easily access subslices of a contiguous data structure (ie. a subarray) or to adapt C-arrays into a safer type.

#include <iostream>
#include <array>
#include <span>

auto main() -> int {
    // --snip--

    auto a1 = std::array { 1, 2, 3, 4, 5 };
    auto s1 = std::span { a1.data() + 1, 3 };

    int a2[] = { 1, 2, 3, 4, 5 }; // C-array
    auto s2 = std::span { a2 };
    
    return 0;
}
    // --snip--

Danger

You don't need to worry about why C-arrays are unsafe for the purposes of this book. In a nutshell, C-arrays (and string literals for that matter) are very primitive structures that do not provide any guards against misuse.

This has the benefit of allowing clever uses of the structures for the sake of performance and optimization, which can be a good thing for systems languages, especially in the time period when C came onto the scene. However, when learning a systems language, guards help ensure correct practices are followed and ingrained early in your journey so they do not become footguns in the future. This is why this book does not cover content from C, as C++ has offered many safer alternatives for decades.

Structures

A structure or struct is a way to aggregate or group related data together while giving each piece of data a distinct name, unlike tuples. In this chapter, we'll explore how to define and instantiate structs, access their data and call special functions called methods on instances of structs.

Note

Variables of a struct are called member variables while functions are called methods. Together, these form the members of the struct.

Creating Structures

Warning

🚧 Page Under Construction! 🏗️

To declare a struct we use the struct keyword followed by the name of the new type. Members are defined inside curly braces using the same variable and function declaration syntax we have seen previously, although member variables do not need an initializer and cannot be declared with auto, so their types must be written out explicitly. The entire struct is capped by a semicolon.

#include <string>

struct Person {
    bool alive;
    std::size_t age;
    std::string name;
    std::string email;
};

auto main() -> int {
    return 0;
}

We can then create an instance of the struct using an aggregate initializer. This is the process of giving concrete values to the member variables using a brace-initializer list. The values are listed in the same order the member variables are declared in.

#include <string>

struct Person {
    bool alive;
    std::size_t age;
    std::string name;
    std::string email;
};

auto main() -> int {
    auto const p = Person {
        true,
        23,
        "John Doe",
        "johnd@example.com"
    };

    return 0;
}

To access member variables we use the member access operator (.). If your object is not constant you can also assign new values to members through the dot operator.

#include <string>

struct Person {
    bool alive;
    std::size_t age;
    std::string name;
    std::string email;
};

auto main() -> int {
    auto p = Person {
        true,
        23,
        "John Doe",
        "johnd@example.com"
    };

    p.email = "jdoe@sample.com";

    return 0;
}

Tip

You can mark individual member variables as const so only certain data can be modified.

Functions can return structs just like builtin types. Here we have a function that creates a Person.

#include <string>
#include <string_view>

struct Person {
    bool alive;
    std::size_t age;
    std::string name;
    std::string email;
};

auto make_person(std::string_view const name, std::string_view const email) -> Person {
    return Person {
        true,
        0,
        std::string{ name },
        std::string{ email }
    };
}

auto main() -> int {
    auto const p = make_person(
            "John Doe",
            "johnd@example.com"
    );

    return 0;
}

For simple structs like this, the compiler will generate a few constructors for us such as a default constructor and a copy constructor. These allow these simple types to be copied or constructed in a default state without having to specify this process ourselves.

#include <string>
#include <string_view>

struct Person {
    bool alive;
    std::size_t age;
    std::string name;
    std::string email;
};

auto make_person(std::string_view const name, std::string_view const email) -> Person {
    return Person {
        true,
        0,
        std::string{ name },
        std::string{ email }
    };
}

auto main() -> int {
    auto const p1 = make_person(
            "John Doe",
            "johnd@example.com"
    );
    
    // Default construct
    auto p2 = Person {};

    // Copy
    auto p3 = p1;

    return 0;
}
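To check what these generated constructors actually give us, here is a small sketch using a hypothetical helper called default_and_copy(). A Person created with empty braces has its members value-initialized, and a copy is independent of the object it was copied from.

```cpp
#include <cstddef>
#include <string>

struct Person {
    bool alive;
    std::size_t age;
    std::string name;
    std::string email;
};

// Hypothetical helper: returns true if the default and copy behave as described.
auto default_and_copy() -> bool {
    auto const p1 = Person { true, 23, "John Doe", "johnd@example.com" };

    // Default construct: false, 0 and two empty strings
    auto const p2 = Person {};

    // Copy, then change only the copy
    auto p3 = p1;
    p3.email = "jdoe@sample.com";

    return !p2.alive && p2.age == 0 && p2.name.empty() && p2.email.empty()
        && p1.email == "johnd@example.com"
        && p3.email == "jdoe@sample.com";
}
```

The helper returns true: the default state is zeroed or empty, and editing p3 leaves p1 as it was.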

Constructors

We will explore constructors in Chapter 8 - Custom Types and how we can use them to control the initialization of our own types. We will also explore how to disable certain constructors to disallow certain behaviours from our types.

Using Structures

Let us explore how structs can be used in everyday programs. We are going to create a simple program to calculate operations on a 3D vector type.

#include <cmath>
#include <iostream>

auto magnitude(auto const x, auto const y, auto const z) -> double {
    return std::sqrt(x * x + y * y + z * z);
}

auto main() -> int {
    auto const x = 2.;
    auto const y = 3.;
    auto const z = 5.;

    std::cout << "The magnitude of the vector is "
              << magnitude(x, y, z)
              << " units.\n";

    return 0;
}

Refactoring with Tuples

We can make this code more concise by packing the data into a tuple. This allows the type signature of magnitude() to be much simpler, taking a single parameter, and ensures all our data is collected together. However, using a tuple leaves room for ambiguity about which piece of data has which meaning as none of the elements have names.

#include <cmath>
#include <iostream>
#include <tuple>

using vec3 = std::tuple<double, double, double>;

auto magnitude(vec3 const vec) -> double {
    auto const& [x, y, z] = vec;
    return std::sqrt(x * x + y * y + z * z);
}

auto main() -> int {
    auto const v = vec3 { 2., 3., 5. };

    std::cout << "The magnitude of the vector is "
              << magnitude(v)
              << " units.\n";

    return 0;
}

Tip

The line starting with the using keyword introduces a type alias. This allows us to define a shorter name for a type we are using frequently. This is particularly useful for tuples as it lets a reader distinguish two tuples of the same underlying types but with different purposes (the compiler still treats them as the same type).
We could also have used std::make_tuple() to create our tuple object in main(). However, I used the brace-initialized form with the type alias to make it clearer what type v is supposed to be.
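For comparison, here is a sketch of the std::make_tuple() alternative together with std::get, which accesses tuple elements by position. The helper names make_vec3() and sum_components() are hypothetical.

```cpp
#include <tuple>

using vec3 = std::tuple<double, double, double>;

// Hypothetical helper: builds the tuple with std::make_tuple().
auto make_vec3(double const x, double const y, double const z) -> vec3 {
    return std::make_tuple(x, y, z);
}

// Hypothetical helper: reads each element by its position.
auto sum_components(vec3 const v) -> double {
    return std::get<0>(v) + std::get<1>(v) + std::get<2>(v);
}
```

Nothing in std::get<1>(v) says that position 1 is the y component, which is the ambiguity that named members remove.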

Refactoring with structs

We can add more meaning by creating a vec3 struct with named x, y and z data members. Now our magnitude() function is able to access the member variables by name.

#include <cmath>
#include <iostream>

struct vec3 {
    double x;
    double y;
    double z;
};

auto magnitude(vec3 const vec) -> double {
    return std::sqrt(vec.x * vec.x
                   + vec.y * vec.y
                   + vec.z * vec.z);
}

auto main() -> int {
    auto const v = vec3 { 2., 3., 5. };

    std::cout << "The magnitude of the vector is "
              << magnitude(v)
              << " units.\n";

    return 0;
}

Methods

As discussed before, methods are functions that are called on instances of a struct. This allows the method to access the member variables of the struct and just like regular functions we can pass parameters and return values from methods.

Defining Methods

Let's change our example program from before to use methods instead of a free function. We define methods within the struct's curly braces just like regular functions and call them using the dot syntax on an instance of the struct.

#include <cmath>
#include <iostream>

struct vec3 {
    double x;
    double y;
    double z;

    auto magnitude() const -> double {
        return std::sqrt(x * x + y * y + z * z);
    }
};

auto main() -> int {
    auto const v = vec3 { 2., 3., 5. };

    std::cout << "The magnitude of the vector is "
              << v.magnitude()
              << "\n";

    return 0;
}

Note

The const after the parameter list and before the trailing return arrow does not mean the return type is constant, but rather indicates that this method does not modify the member variables of this vec3 instance and thus can be called on const instances.
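To contrast the two kinds of method, here is a sketch with one const method and one non-const method. The sum() and scale() methods and the scaled_sum() helper are hypothetical additions for illustration and are not part of the running example.

```cpp
struct vec3 {
    double x;
    double y;
    double z;

    // const method: only reads the members, callable on const instances
    auto sum() const -> double {
        return x + y + z;
    }

    // non-const method: modifies the members, so the instance must not be const
    auto scale(double const factor) -> void {
        x *= factor;
        y *= factor;
        z *= factor;
    }
};

// Hypothetical helper: v has to be non-const for scale() to be callable.
auto scaled_sum() -> double {
    auto v = vec3 { 2., 3., 5. };
    v.scale(2.);
    return v.sum();
}
```

scaled_sum() returns 20. Declaring v as auto const would make the call to scale() a compile error while sum() would still be allowed.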

`this` keyword

Implicitly, all methods are passed an argument called this, which is a pointer to the instance of the struct the method was called on. this can be omitted in most cases, with names automatically being looked up in the struct instance. However, if a member variable is hidden by another name ie. there is a parameter of the same name, then you will need to access the member variable through this. Because this is a pointer you cannot use the dot operator but must use the -> operator to dereference the pointer.

#include <cmath>
#include <iostream>
#include <sstream>
#include <string>

struct vec3 {
    double x;
    double y;
    double z;

    auto magnitude() const -> double {
        return std::sqrt(x * x + y * y + z * z);
    }

    auto normalized() const -> vec3 {
        auto const sz = this->magnitude();
        return vec3 { x / sz, y / sz, z / sz };
    }

    // Helper method for stringifying vec3
    auto to_string() const -> std::string {
        auto ss = std::stringstream {};
        ss << "{ "
           << x
           << ", "
           << y
           << ", "
           << z
           << " }";

        return ss.str();
    }
};

auto main() -> int {
    auto const v = vec3 { 2., 3., 5. };

    std::cout << "The magnitude of the vector is "
              << v.magnitude()
              << "\n";

    auto const n = v.normalized();

    std::cout << "Vector v: "
              << v.to_string()
              << " is: "
              << n.to_string()
              << "\n";

    return 0;
}

Warning

We will discuss pointers properly and in detail in Chapter 13 - Memory but for now, think of pointers as like references but closer to a hardware concept.
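The case where this cannot be omitted is when a parameter hides a member of the same name. A minimal sketch, assuming a hypothetical set_x() method and updated_x() helper:

```cpp
struct vec3 {
    double x;
    double y;
    double z;

    // Inside this method the plain name x means the parameter,
    // so the member has to be reached through this.
    auto set_x(double const x) -> void {
        this->x = x;
    }
};

// Hypothetical helper: sets a new x and reads it back.
auto updated_x() -> double {
    auto v = vec3 { 2., 3., 5. };
    v.set_x(7.);
    return v.x;
}
```

updated_x() returns 7. Writing x = x; in set_x() would not compile here as the parameter is const, and with a non-const parameter it would only assign the parameter to itself.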

Taking Parameters

As stated before, we can declare parameters for methods so that they can take arguments. Parameters are declared the same way as with free functions.

#include <cmath>
#include <iostream>
#include <sstream>
#include <string>

struct vec3 {
    double x;
    double y;
    double z;

    auto magnitude() const -> double {
        return std::sqrt(x * x + y * y + z * z);
    }

    auto normalized() const -> vec3 {
        auto const sz = this->magnitude();
        return vec3 { x / sz, y / sz, z / sz };
    }

    auto dot(vec3 const& u) const -> double {
        return x * u.x + y * u.y + z * u.z;
    }

    // Helper method for stringifying vec3
    auto to_string() const -> std::string {
        auto ss = std::stringstream {};
        ss << "{ "
           << x
           << ", "
           << y
           << ", "
           << z
           << " }";

        return ss.str();
    }
};

auto main() -> int {
    auto const v = vec3 { 2., 3., 5. };

    std::cout << "The magnitude of the vector is "
              << v.magnitude()
              << "\n";

    auto const n = v.normalized();

    std::cout << "Vector v: "
              << v.to_string()
              << " is: "
              << n.to_string()
              << "\n";

    auto const u = vec3 { 2., -3., 5. };

    std::cout << "Dot product of v: "
              << v.to_string()
              << " and u: "
              << u.to_string()
              << " is: "
              << v.dot(u)
              << " units \n";

    return 0;
}

Operator Overloading

Just like we can define overloaded operators as free functions, we can define overloaded operators within a struct. However, the left-hand argument is always the struct instance the operator is called on.

#include <cmath>
#include <iostream>
#include <sstream>
#include <string>

struct vec3 {
    double x;
    double y;
    double z;

    auto magnitude() const -> double {
        return std::sqrt(x * x + y * y + z * z);
    }

    auto normalized() const -> vec3 {
        auto const sz = this->magnitude();
        return vec3 { x / sz, y / sz, z / sz };
    }

    auto dot(vec3 const& u) const -> double {
        return x * u.x + y * u.y + z * u.z;
    }

    auto operator*(vec3 const& u) const -> double {
        return this->dot(u);
    }

    // Helper method for stringifying vec3
    auto to_string() const -> std::string {
        auto ss = std::stringstream {};
        ss << "{ "
           << x
           << ", "
           << y
           << ", "
           << z
           << " }";

        return ss.str();
    }
};

auto main() -> int {
    auto const v = vec3 { 2., 3., 5. };

    std::cout << "The magnitude of the vector is "
              << v.magnitude()
              << "\n";

    auto const n = v.normalized();

    std::cout << "Vector v: "
              << v.to_string()
              << " is: "
              << n.to_string()
              << "\n";

    auto const u = vec3 { 2., -3., 5. };

    std::cout << "Dot product of v: "
              << v.to_string()
              << " and u: "
              << u.to_string()
              << " is: "
              << v * u
              << " units \n";

    return 0;
}
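The same pattern works for operators that return a new instance of the struct. As a sketch, here is a hypothetical operator+ for component-wise addition, which is not part of the example above.

```cpp
struct vec3 {
    double x;
    double y;
    double z;

    // Member operator: the left-hand side is this instance,
    // the right-hand side is the parameter u.
    auto operator+(vec3 const& u) const -> vec3 {
        return vec3 { x + u.x, y + u.y, z + u.z };
    }
};

// Hypothetical helper: v + u is shorthand for v.operator+(u).
auto add_example() -> vec3 {
    auto const v = vec3 { 1., 2., 3. };
    auto const u = vec3 { 4., 5., 6. };
    return v + u;
}
```

add_example() returns the vector { 5, 7, 9 }.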

If we want to reorder the parameters of an operator on our struct but keep the definition all together, we can use the friend keyword to define a free function inside a struct's definition. This also allows the friend function to access the members of the struct instance. The friend keyword becomes more relevant when discussing Access Modifiers in Chapter 8.

#include <cmath>
#include <iostream>
#include <ostream>

struct vec3 {
    double x;
    double y;
    double z;

    auto magnitude() const -> double {
        return std::sqrt(x * x + y * y + z * z);
    }

    auto normalized() const -> vec3 {
        auto const sz = this->magnitude();
        return vec3 { x / sz, y / sz, z / sz };
    }

    auto dot(vec3 const& u) const -> double {
        return x * u.x + y * u.y + z * u.z;
    }

    auto operator*(vec3 const& u) const -> double {
        return this->dot(u);
    }

    friend auto operator<<(std::ostream& os, vec3 const& v) -> std::ostream& {
        os << "{ "
           << v.x
           << ", "
           << v.y
           << ", "
           << v.z
           << " }";

        return os;
    }
};

auto main() -> int {
    auto const v = vec3 { 2., 3., 5. };

    std::cout << "The magnitude of the vector is "
              << v.magnitude()
              << "\n";

    auto const n = v.normalized();

    std::cout << "Vector v: "
              << v
              << " is: "
              << n
              << "\n";

    auto const u = vec3 { 2., -3., 5. };

    std::cout << "Dot product of v: "
              << v
              << " and u: "
              << u
              << " is: "
              << v * u
              << " units \n";

    return 0;
}
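Reordering matters most when the struct should appear on the right-hand side of an operator. As a sketch, here is a hypothetical scalar multiplication for vec3 (separate from the dot product operator above) that supports both v * 2. and 2. * v.

```cpp
struct vec3 {
    double x;
    double y;
    double z;

    // Member operator: handles vec3 * scalar
    auto operator*(double const s) const -> vec3 {
        return vec3 { x * s, y * s, z * s };
    }

    // Friend free function: handles scalar * vec3 by swapping the
    // arguments and reusing the member operator.
    friend auto operator*(double const s, vec3 const& v) -> vec3 {
        return v * s;
    }
};
```

A member operator alone could not support 2. * v because the left-hand argument, a double, is not a vec3 instance.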

Static Functions

We can also declare static methods on a struct which do not operate on an instance but are simply bound to the struct itself. We declare static methods with the static keyword and call them using the struct's name and the scope operator (::).

#include <cmath>
#include <iostream>
#include <ostream>

struct vec3 {
    double x;
    double y;
    double z;

    static auto unit_x() -> vec3 {
        return vec3 { 1., 0., 0. };
    }

    auto magnitude() const -> double {
        return std::sqrt(x * x + y * y + z * z);
    }

    auto normalized() const -> vec3 {
        auto const sz = this->magnitude();
        return vec3 { x / sz, y / sz, z / sz };
    }

    auto dot(vec3 const& u) const -> double {
        return x * u.x + y * u.y + z * u.z;
    }

    auto operator*(vec3 const& u) const -> double {
        return this->dot(u);
    }

    // Helper method for stringifying vec3
    friend auto operator<<(std::ostream& os, vec3 const& v) -> std::ostream& {
        os << "{ "
           << v.x
           << ", "
           << v.y
           << ", "
           << v.z
           << " }";

        return os;
    }
};

auto main() -> int {
    auto const v = vec3 { 2., 3., 5. };

    std::cout << "The magnitude of the vector is "
              << v.magnitude()
              << "\n";

    auto const n = v.normalized();

    std::cout << "Vector v: "
              << v
              << " normalized is: "
              << n
              << "\n";

    auto const u = vec3::unit_x();

    std::cout << "Dot product of v: "
              << v
              << " and u: "
              << u
              << " is: "
              << v * u
              << " units \n";

    return 0;
}
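
Static functions are often used as named ways of building common values. The following is a small sketch under that assumption. The zero and unit_y functions and the up_component helper are hypothetical names used only for illustration.

```cpp
struct vec3 {
    double x;
    double y;
    double z;

    // Static functions have no `this`, so they cannot read x, y or z
    // of any particular instance.
    static auto zero() -> vec3 {
        return vec3 { 0., 0., 0. };
    }

    static auto unit_y() -> vec3 {
        return vec3 { 0., 1., 0. };
    }

    // A non-static method, for contrast: it needs an instance to work on.
    auto dot(vec3 const& u) const -> double {
        return x * u.x + y * u.y + z * u.z;
    }
};

// Static functions are called through the struct name with `::`,
// no instance required.
auto up_component(vec3 const& v) -> double {
    return v.dot(vec3::unit_y());
}
```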

Summary

While this chapter has only a handful of pages, we covered a lot of new features and syntax: defining and creating structs, attaching methods to structs and even static methods!

Appendix

Warning

🚧 Page Under Construction! 🏗️

Useful info about C++ that doesn't fit into the model of the book.

A - Keywords

Warning

🚧 Page Under Construction! 🏗️

This is the list of keywords reserved by C++. These words cannot be used as identifiers for variables, functions, class/struct members, etc. Some are reserved but have no current use, or only a deprecated one.

Currently in Use

Keyword	Description
`alignas` (C++11)
`alignof` (C++11)
`and`
`and_eq`
`asm`
`auto`
`bitand`
`bitor`
`bool`
`break`
`case`
`catch`
`char`
`char8_t` (C++20)
`char16_t` (C++11)
`char32_t` (C++11)
`class`
`compl`
`concept` (C++20)
`const`
`consteval` (C++20)
`constexpr` (C++11)
`constinit` (C++20)
`const_cast`
`continue`
`co_await` (C++20)
`co_return` (C++20)
`co_yield` (C++20)
`decltype` (C++11)
`default`
`delete`
`do`
`double`
`dynamic_cast`
`else`
`enum`
`explicit`
`export`
`extern`
`false`
`float`
`for`
`friend`
`goto`
`if`
`inline`
`int`
`long`
`mutable`
`namespace`
`new`
`noexcept` (C++11)
`not`
`not_eq`
`nullptr` (C++11)
`operator`
`or`
`or_eq`
`private`
`protected`
`public`
`register`
`reinterpret_cast`
`requires` (C++20)
`return`
`short`
`signed`
`sizeof`
`static`
`static_assert` (C++11)
`static_cast`
`struct`
`switch`
`template`
`this`
`thread_local` (C++11)
`throw`
`true`
`try`
`typedef`
`typeid`
`typename`
`union`
`unsigned`
`using`
`virtual`
`void`
`volatile`
`wchar_t`
`while`
`xor`
`xor_eq`

Reserved In Specific Contexts

These identifiers have a special meaning only in specific positions, such as in a class declaration. Elsewhere they can be used as ordinary names.

Keyword	Description
`final` (C++11)	Specifies that a virtual member function cannot be overridden in a child class, or that a class cannot be derived from.
`override` (C++11)	Specifies that a virtual member function definition overrides the parent's definition.
`import` (C++20)	Module import declaration.
`module` (C++20)	Module and module fragment declaration.
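
To make the idea of context-dependent meaning concrete, here is a short sketch. The shape and triangle structs are hypothetical, and inheritance itself is not covered in the preceding chapters. The sketch shows final and override acting as specifiers in a declaration and as ordinary variable names in a function body.

```cpp
struct shape {
    virtual ~shape() = default;
    virtual auto sides() const -> int { return 0; }
};

// `final` here: no other struct may derive from triangle.
struct triangle final : shape {
    // `override` here: the compiler checks that this matches a virtual
    // function declared in shape.
    auto sides() const -> int override {
        return 3;
    }
};

// Outside those positions the same words are ordinary identifiers.
// This is legal, although not something we would recommend writing.
auto sum_of_oddly_named_variables() -> int {
    int final = 1;
    int override = 2;
    return final + override;
}
```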

Reserved for Future Use

These keywords are reserved for experimental features being tested in a Technical Specification.

Keyword	Technical Specification	Description
`atomic_cancel`	Transactional Memory (TM) TS	Starts atomic block that will restore data modified during atomic block for some exception types, otherwise it will call `std::abort`.
`atomic_commit`	Transactional Memory (TM) TS	Starts atomic block that commits data changes regardless of exceptions being thrown.
`atomic_noexcept`	Transactional Memory (TM) TS	Starts atomic block that will call `std::abort` if exception is thrown within the block.
`reflexpr`	Reflection TS	Provides meta info about an object by returning a meta-object.
`synchronized`	Transactional Memory (TM) TS	Starts a synchronized block.
`transaction_safe`	Transactional Memory (TM) TS	Indicates that a function is transaction-safe.
`transaction_safe_dynamic`	Transactional Memory (TM) TS	Indicates that a virtual function is transaction-safe.

B - Operators

Warning

🚧 Page Under Construction! 🏗️

This page is a high level overview of C++ operators and other symbols and what they do.

✅ - Fully overloadable
☑️ - Overloadable with constraints
⚠️ - Overloadable but not recommended
❌ - Not overloadable

Basic Operators

Operator	Example	Description	Overloadable
`+`	`+expr`	Unary plus	✅
`+`	`expr + expr`	Arithmetic addition	✅
`++`	`++expr`	Prefix increment	✅
`++`	`expr++`	Postfix increment	✅
`+=`	`var += expr`	Arithmetic addition and assignment	✅
`-`	`-expr`	Arithmetic negation	✅
`-`	`expr - expr`	Arithmetic subtraction	✅
`--`	`--expr`	Prefix decrement	✅
`--`	`expr--`	Postfix decrement	✅
`-=`	`var -= expr`	Arithmetic subtraction and assignment	✅
`*`	`*expr`	Pointer dereference	☑️
`*`	`expr * expr`	Arithmetic multiplication	✅
`*=`	`var *= expr`	Arithmetic multiplication and assignment	✅
`/`	`expr / expr`	Arithmetic division	✅
`/=`	`var /= expr`	Arithmetic division and assignment	✅
`%`	`expr % expr`	Arithmetic remainder	✅
`%=`	`var %= expr`	Arithmetic remainder and assignment	✅
`~`	`~expr`	Bitwise Complement	✅
`&`	`&expr`	Address of	✅
`&`	`type ident&`, `type ident const&`	Reference type	❌
`&`	`expr & expr`	Bitwise AND	✅
`&=`	`var &= expr`	Bitwise AND and assignment	✅
`&&`	`expr && expr`	Logical AND	☑️
`\|`	`expr \| expr`	Bitwise OR	✅
`\|=`	`var \|= expr`	Bitwise OR and assignment	✅
`\|\|`	`expr \|\| expr`	Logical OR	☑️
`^`	`expr ^ expr`	Bitwise XOR	✅
`^=`	`var ^= expr`	Bitwise XOR and assignment	✅
`<<`	`expr << expr`	Bitwise left shift	✅
`<<=`	`var <<= expr`	Bitwise left shift and assignment	✅
`>>`	`expr >> expr`	Bitwise right shift	✅
`>>=`	`var >>= expr`	Bitwise right shift and assignment	✅
`!`	`!expr`	Logical NOT	✅
`==`	`expr == expr`	Equality comparison	✅
`!=`	`expr != expr`	Inequality comparison	✅
`<`	`expr < expr`	Less than	✅
`<=`	`expr <= expr`	Less than or equal	✅
`>`	`expr > expr`	Greater than	✅
`>=`	`expr >= expr`	Greater than or equal	✅
`<=>`	`expr <=> expr`	Three way comparison (C++20)	✅
`[]`	`expr[expr, expr, ..]`	Subscript / array indexing (multi-argument since C++23)	✅
`()`	`expr(expr, expr, ..)`	Function object invocation	✅
`,`	`expr, expr`	Comma sequencing	⚠️
`=`	`var = expr`, `ident = expr`	Assignment / Binding	☑️
`?:`	`expr ? expr : expr`	Ternary expression	❌
`::`	`ident::ident`, `ident::var`	Scope resolution (namespace and member lookup)	❌
`...`	`typename types...`, `type T...`, `T... args`	Parameter type and value packs	❌
`.`	`expr.ident`	Member access	❌
`.*`	`expr.*ident`	Member access through a pointer to member	❌
`->`	`expr->ident`	Member access through a pointer	☑️
`->*`	`expr->*ident`	Member access through a pointer, using a pointer to member	☑️
`""`	`literal_suffix-ident`	User defined literal	☑️

Memory Operators

Operator	Example	Description
`new`	`new type (init-list)`	Allocate a heap memory object constructed with parameters in init-list
`new []`	`new type[size] {init-list}`	Allocate a heap memory block initialized with elements in init-list
`delete`	`delete expr`	Delete heap memory object
`delete []`	`delete [] expr`	Deletes heap memory block
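
As a rough illustration of how the four forms pair up, consider the sketch below. The point struct and sum_on_heap function are hypothetical names. Each new is matched by delete and each new[] by delete[]. Mixing the two forms is undefined behaviour.

```cpp
struct point {
    int x;
    int y;
};

// Allocates one object and one block, uses them, then releases both
// with the matching form of delete.
auto sum_on_heap() -> int {
    point* p = new point { 3, 4 };        // single object
    int* block = new int[3] { 1, 2, 3 };  // block of three ints

    auto const total = p->x + p->y + block[0] + block[1] + block[2];

    delete p;        // pairs with `new`
    delete[] block;  // pairs with `new[]`

    return total;
}
```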

Type Casting Operators

Operator	Example	Description
`static_cast`	`static_cast<T>(expr)`	Casts `expr` to type `T`
`dynamic_cast`	`dynamic_cast<T>(expr)`	Casts pointers and references to classes up, down and sideways through inheritance hierarchy
`reinterpret_cast`	`reinterpret_cast<T>(expr)`	Casts `expr` to type `T` by reinterpreting underlying bits of `expr`
`const_cast`	`const_cast<T>(expr)`	Can cast to or away `const` when type of `expr` and `T` are similar types
C-cast	`(type)expr`	Legacy type cast from C, uses a combination of above casts
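
The differences between the named casts are easiest to see side by side. This is a minimal sketch using hypothetical animal and dog structs and helper functions made up for illustration. It is not an exhaustive account of what each cast permits.

```cpp
struct animal {
    virtual ~animal() = default;
};

struct dog : animal {
    auto legs() const -> int { return 4; }
};

// static_cast: explicit conversion between related types.
auto truncate(double d) -> int {
    return static_cast<int>(d);
}

// dynamic_cast: checked downcast. For pointers it yields nullptr when
// `a` does not actually point at a dog.
auto legs_if_dog(animal const* a) -> int {
    auto const* d = dynamic_cast<dog const*>(a);
    return d != nullptr ? d->legs() : 0;
}

// const_cast: removes const. Writing through the result is only valid
// when the referenced object was not declared const to begin with.
auto bump(int const& n) -> void {
    const_cast<int&>(n) += 1;
}

// reinterpret_cast: views the same bytes as another type. Inspecting an
// object through unsigned char is one of the few portable uses.
auto first_byte(int const& n) -> unsigned char {
    return *reinterpret_cast<unsigned char const*>(&n);
}
```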

Other Operators

Operator	Example	Description
`sizeof`	`sizeof(expr)`, `sizeof(type)`	Obtains the size in bytes of a type or expression
`sizeof...`	`sizeof...(pack)`	Obtains the number of elements of a parameter pack
`typeid`	`typeid(expr)`, `typeid(type)`	Obtains a `std::type_info` object describing a type
`noexcept`	`noexcept(expr)`	Checks at compile time whether an expression is declared not to throw
`alignof`	`alignof(type)`	Obtains the alignment required by a type
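
Most of these operators are evaluated at compile time, so they can be checked with static_assert. The sketch below assumes a few hypothetical helpers (pack_size, may_throw, never_throws and same_type) purely to give each operator something to act on.

```cpp
#include <cstddef>
#include <typeinfo>

// sizeof... counts the elements of a parameter pack.
template <typename... Ts>
constexpr auto pack_size(Ts const&...) -> std::size_t {
    return sizeof...(Ts);
}

auto may_throw() -> void {}
auto never_throws() noexcept -> void {}

// sizeof and alignof are evaluated at compile time.
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
static_assert(alignof(char) == 1, "char has the weakest alignment");
static_assert(sizeof(int) >= alignof(int), "size is a multiple of alignment");

// noexcept(expr) inspects declarations only; expr is never executed.
static_assert(!noexcept(may_throw()), "not declared noexcept");
static_assert(noexcept(never_throws()), "declared noexcept");

// typeid yields a std::type_info object that can be compared.
auto same_type(int a, long b) -> bool {
    return typeid(a) == typeid(b);
}
```

Note that noexcept reports false for may_throw even though its body is empty, because only the declaration is consulted.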