The C++ Programming Language
by Tyler Swann
This book assumes you are at least using a C++11 compliant compiler but concepts and practices from C++11 to C++20 will be covered. See the "Installation" page of the "Getting Started" Chapter for more details.
This book is under active development. Much of the material is absent, incomplete or subject to change. If you have suggestions, create a discussion or open an issue on GitHub.
Introduction
Welcome to The C++ Programming Language, an introductory book aimed at teaching C++. C++ is a high-level, general-purpose, multi-paradigm programming language. It aims to give developers absolute control over their programs while also providing the means to design, build and use any kind of abstraction, making the language more ergonomic and expressive with zero penalty for what you don't use. This allows C++ programs to be performant as well as expressive.
Who/What is C++ for?
C++ was designed for building systems and embedded software in resource-constrained contexts. These are systems that prioritize performance, efficiency and flexibility of use, allowing the developer to write performant code that can run on, or make up, any kind of system. If you want the ability to build helpful abstractions but need your program to run in a constrained environment and use every resource as effectively as possible, C++ will get you there.
Who is this book for?
This book is generally aimed at people who have programmed in at least one other language, regardless of which one(s). That is to say, you should have an idea of what a program is, know common programming concepts, and have a rough idea of how a computer works, but there is no restriction on the background in which you learnt these concepts. I aim to make the material as approachable as possible to anyone from any background. It is possible to read this book without any prior programming experience, but it is highly recommended you start with an introductory programming book or course if you have never programmed before. The purpose of this book is to showcase how the C++ language works and the various concepts and capabilities present within the language, as opposed to teaching you the basics of programming using C++ as the content medium. C++ is a very dense language in its entirety, and it can be much more difficult to become comfortable programming and to utilise the concepts from this book effectively if you do not have a foundation in programming in general.
How to Use This Book
In general, this book assumes that you're reading it in sequence from front to back. Later chapters build on concepts in earlier chapters, and earlier chapters might not delve into details on a particular topic but will revisit the topic in a later chapter.
You'll find two kinds of chapters in this book: concept chapters and project chapters. In concept chapters, you'll learn about an aspect or concept from C++. In project chapters, we'll build small programs together, applying what you've learned so far. Chapters ... are project chapters; the rest are concept chapters. At the end of a concept chapter there will be challenges that you can complete. These are simple quiz-like questions that you can use to test your understanding of the concepts presented in that chapter.
Note: You can also search for specific content using the search button in the top left or by pressing the S key.
Synopsis
- Getting Started, explains how to install the necessary tools for compiling C++ programs on various platforms like Windows, macOS and Linux. It also goes through writing a classic "Hello, world!" program and will discuss the anatomy of a basic C++ program and using the CMake build system.
- Project: Guessing Game, is the first project chapter where you will build a simple 'number guessing game'. This will introduce you to compiling and building a C++ program and utilising various pieces from C++ at a high level, with later chapters offering more details.
Planned
- Common Programming Concepts, will cover the basics of the C++ language from variables and data types to creating functions and controlling the execution flow of a program.
- Ownership, will cover C++'s ownership model and how you are able to share data or even transfer data ownership.
- Structured Data will look at how to create custom types using structs.
- In Managing Projects we'll discuss how to compile multiple files together and how CMake makes this process easier.
- Custom Types explores how to create more powerful custom types and how to manage the lifetime of data.
- Error Handling will look at the various ways to verify the correctness of your programs at compile time. We will also look at recovering from errors to prevent crashes.
- Templates covers C++'s metaprogramming capabilities that allow you to build generic code that applies to multiple types.
- In Functional Language Features we will look
- The IO chapter will take a brief, deeper look at C++'s IO capabilities using streams. We also explore C++'s filesystem library.
- Memory will showcase how to safely (and unsafely ... for science) control memory.
- In Concurrency we will look at how to parallelize our programs using a myriad of concurrency concepts while ensuring safe access to and manipulation of shared data.
- Appendices, The appendices hold extra information that may be of use to the reader but does not fit elsewhere in the book.
- A - Keywords
- B - Operators
- C - Standard Versions
- D - Compilation Pipeline
- E - Value Categories
- F - Compiler Vendors
- G - Challenge Answers
Possible Future Chapters
- IO Project, will look at utilising ideas from previous chapters in order to build a tool that replicates a subset of the functionality of the command line tool grep.
- Algorithms, will showcase a few of the common algorithms available in the C++ standard library and how they can be used to manipulate any of the standard containers in an expressive and generic manner. We will also cover the concepts of a range and a view and how they allow us to write composable algorithms.
- Improved IO Project, will look at improving our IO project from Chapter 11 by utilising the standard algorithms.
- Object Oriented Programming in C++, covers C++'s support for writing object-oriented code and how it contrasts with the rest of the language's features and the object-oriented principles you may be familiar with from other languages.
- Date, Time and Localization, introduces C++'s support for working with times and dates and how to change the locale currently being used to express said times and dates.
There is no wrong way to read this book: if you want to skip ahead, go for it! You might have to jump back to earlier chapters if you experience any confusion. But do whatever works for you.
An important part of the process of learning any programming language is learning how to read the error messages the compiler displays, which can be challenging for large codebases, especially if they are written in C++ (although this is improving). Error messages no matter the language will offer key insight into where the compilation of a program failed and in the case of C++, why it failed, which will guide you toward working code. As such, I'll provide many examples that don't compile along with the error message the compiler will show you in each situation. Know that if you enter and run a random example, it may not compile! Make sure you read the surrounding text to see whether the example you're trying to run is meant to error.
Note: the error message style and content can be dramatically different given a different compiler, compiler version and standard of C++ being used.
Source Code
The source code from which this book is generated can be found on GitHub. Refer to the supporting docs in the book's repo for details on how to contribute changes, fix typos or create new content for this book.
External Resources
Getting Started
Let us begin our journey! In this chapter we will discuss:
- Installing C++ on Linux, macOS and Windows
- Creating a C++ program to print Hello, world!
- Using CMake to create cross-platform builds.
Installation
Each platform or Operating System (OS) has a different set of compiler tools, so the following sub-chapters will outline how to get set up on each platform.
Available C++ Compilers
Compiler | Description | Windows | Linux | MacOS |
---|---|---|---|---|
GNU Compiler Collection (GCC) | A collection of compiler technologies for many different languages including C, C++, Objective-C, Ada, D and Go. Part of the GNU project and the default compiler on Linux. | ✅1 | ✅ | ✅ |
Clang | A compiler frontend and build runner that is a part of the LLVM Project. Used to compile C, C++ and Objective-C. | ✅2 | ✅ | ✅ |
Microsoft Visual Compilers (MSVC) | Microsoft's proprietary compiler toolchain for building C and C++. Usually installed with the Visual Studio IDE. | ✅ | ❌ | ❌ |
Note: The use of $ or > as the first character on a line in any code block for a shell (commands etc.) indicates the prompt, with the command following. This is used to clarify a shell code block that contains commands and the (generally) expected output. You do not need to copy the $ or > when running commands.
- 1: via MinGW or Cygwin
- 2: via Visual Studio, MinGW or Cygwin
Linux
Installing GCC and Clang on most Linux systems is relatively trivial. Most of the time it requires just installing the GCC or Clang package and some supporting developer tooling packages. These are often bundled together to make installation as simple as possible.
Installing
Depending on your distribution you will use a different package manager and upstream package repository, so some package names might differ from what is listed below. Consult your platform's docs for the most seamless way to install a C++ compiler if the commands below fail.
# Debian, Ubuntu, ElementaryOS, Linux Mint, Pop!_OS (APT)
$ sudo apt install build-essential gdb clang llvm cmake
# RedHat, CentOS, Fedora (DNF)
$ sudo dnf install make automake gcc gcc-c++ kernel-devel gdb clang llvm cmake
# Arch, Manjaro (Pacman)
$ sudo pacman -Sy base-devel gdb clang llvm cmake
# OpenSUSE (Zypper)
$ sudo zypper install -t pattern devel_basis
$ sudo zypper install gdb clang llvm cmake
Verifying Installation
To verify the install worked for either GCC or Clang we can run the compiler programs with the version flag and ensure the install has been successful. You should get something like the following output:
# Verify GCC
$ g++ --version
g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
# Verify Clang
$ clang++ --version
Ubuntu clang version 14.0.0-1ubuntu1.1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
Any details displayed from verifying a given newly installed tool may differ to what is displayed in this book.
- The name of GNU's compiler toolchain is 'GCC' aka GNU Compiler Collection. This is in contrast to the CLI tool called gcc which stands for GNU C Compiler.
- The C++ compiler from GCC is called g++. Make sure to use this command when compiling C++ code.
Installing CMake
We will also want a tool to help manage larger projects and allow us to build on different machines from the same source. CMake is one such build tool for C++ projects. It is used to manage different configurations for a project. You will have already installed CMake when you installed the C++ compilers earlier, as we added CMake to the install list. You can verify by running:
$ cmake --version
cmake version 3.25.1
CMake suite maintained and supported by Kitware (kitware.com/cmake).
Installing vcpkg
We will also need some way to install external libraries. While many different tools exist, the tool vcpkg was chosen for this book. vcpkg is an open source tool developed by Microsoft used for downloading and managing C++ libraries with CMake. We can install it, add it to our PATH and validate the install using the following commands:
cd ~
mkdir bin
cd bin
git clone https://github.com/Microsoft/vcpkg.git
./vcpkg/bootstrap-vcpkg.sh
printf '\n# >>> vcpkg >>>\nexport VCPKG_ROOT="$HOME/bin/vcpkg"\nexport PATH="$VCPKG_ROOT:$PATH"\n# <<< vcpkg <<<\n' >> ~/.bashrc
source ~/.bashrc
Verify vcpkg
$ vcpkg --version
vcpkg package management program version 2023-10-18-27de5b69dac4b6fe8259d283cd4011e6d20a84ce
See LICENSE.txt for license information.
Windows
Windows has many different compilers at its disposal. Some offer native support for building against the Windows runtime while others emulate a UNIX (the predecessor to Linux and BSD) environment to aid in porting software built for UNIX-like systems. As the specifics can get confusing, this book will only cover the installation of Windows' native compiler toolchain, MSVC.
MSVC Installation
The Microsoft Visual C++ (MSVC) compiler is Microsoft's official toolchain for building software natively on Windows. It is installed with the Visual Studio Integrated Development Environment (IDE). MSVC (and the whole Visual Studio suite) can be obtained from Microsoft's official download page. Make sure to select the correct edition (Community being the free version) and click 'Download'. This will download the setup program VisualStudioSetup.exe, which is used to install and configure the Visual Studio Installer (VSI). The VSI allows you to select which tools and technologies from the Visual Studio suite you want to install. Once you have installed the VSI, start the program and you should be presented with some default tool configurations (workloads). For developing with C++ you will need to select the 'Desktop development with C++' workload. You will also want to tick a few optional features (found in the side bar).
Finally, click the 'Install' button in the bottom right of the window to start the installation.
Verifying MSVC Installation
To verify you installed Visual Studio correctly you can open the newly installed 'Developer Command Prompt for VS'. This prompt is needed in order to load the MSVC tooling, as it is not included by default in CMD or PowerShell. Simply run the following command to verify the install of the compiler.
> cl
Microsoft (R) C/C++ Optimizing Compiler Version 19.37.32822 for x86
Copyright (C) Microsoft Corporation. All rights reserved.
usage: cl [ option... ] filename... [ /link linkoption... ]
Any details displayed from verifying a given newly installed tool may differ to what is displayed in this book.
Alternatively you can follow Microsoft's tutorial for creating a new C++ VS Project. This will be more convenient than opening a 'Developer Command Prompt' every time you want to compile a program and having to run the cl command manually, but it takes more work to set compiler flags etc. for simple projects.
Installing Git
We will also need to install Git in order to install a particular package later. Git can be installed by going to the 'Git for Windows' installation page, selecting the correct version (e.g. x64 for 64-bit systems) and following the installation wizard. Be sure to select the option for adding Git to the PATH.
Installing CMake
CMake is a build tool for C++ projects. It is used to manage different configurations for a project. You can download the latest release from CMake's Release Page (scroll down to 'Latest Release' not 'Release Candidate'). You can verify it was installed correctly by opening CMD and running:
> cmake --version
cmake version 3.25.1
CMake suite maintained and supported by Kitware (kitware.com/cmake).
Installing vcpkg
We will also need some way to install external libraries. While many different tools exist, the tool vcpkg was chosen for this book. vcpkg is an open source tool developed by Microsoft used for downloading and managing C++ libraries with CMake. We can install it, add it to our PATH and validate the install using the following batch/CMD commands:
> cd %userprofile%
> mkdir bin
> cd bin
> git clone https://github.com/Microsoft/vcpkg.git
> .\vcpkg\bootstrap-vcpkg.bat -disableMetrics
> setx VCPKG_ROOT %userprofile%\bin\vcpkg
> setx PATH "%PATH%;%userprofile%\bin\vcpkg"
:: You must now reload CMD for the Environment Variables to refresh by closing and reopening the CMD.
> vcpkg --version
vcpkg package management program version 2023-10-18-27de5b69dac4b6fe8259d283cd4011e6d20a84ce
See LICENSE.txt for license information.
MacOS
To install GCC and Clang on MacOS we will need Apple's developer toolchain called Xcode and a package manager for MacOS called Homebrew.
Installation
To build almost anything on MacOS we need the Xcode developer suite. This is a set of libraries, environment configurations and binaries used at the core of all Apple software products. The full installation can be found on Apple's developer page (requires a login) but this is an extremely large package requiring ~40 GB of disk space. Luckily there is a much smaller CLI package that installs just the necessary tooling for working with software from the terminal. One of these tools is the Clang compiler. To install GCC you will need Homebrew, a package manager which will by default install the latest stable version of the GCC formula. If you need a different version you can check the GCC formula page for available versions. To install these packages, open the 'Terminal' app and run:
# Install Xcode CLI tools
$ xcode-select --install
# Install Homebrew
$ /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
# Add `brew` command to your PATH
$ (echo; echo 'eval "$(${HOMEBREW_PREFIX}/bin/brew shellenv)"') >> ${shell_profile}
# Install GCC
$ brew install gcc cmake
Verifying Installation
To verify the install worked for either GCC or Clang we can run the compiler programs with the version flag and ensure the install has been successful.
# Verify GCC
$ g++-13 --version
g++-13 (Homebrew GCC 13.2.0) 13.2.0
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
# Verify Clang
$ clang++ --version
Apple clang version 15.0.0 (clang-1500.0.40.1)
Target: x86_64-apple-darwin22.6.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin
Any details displayed from verifying a given newly installed tool may differ to what is displayed in this book.
- The name of GNU's compiler toolchain is 'GCC' aka GNU Compiler Collection. This is in contrast to the CLI tool called gcc which stands for GNU C Compiler.
- The C++ compiler from GCC is called g++. Make sure to use this command when compiling C++ code.
- You must specify the versioned g++ command (e.g. g++-13) in order to use the Homebrew version of the command. You can find the default version installed by running brew info gcc. We must do this because the regular g++ command redirects back to Apple's Clang implementation.
Installing CMake
We will also want a tool to help manage larger projects and allow us to build on different machines from the same source. CMake is one such build tool for C++ projects. It is used to manage different configurations for a project. You will have already installed CMake when you installed the C++ compilers earlier with Homebrew, as we added CMake to the install list. You can verify by running:
$ cmake --version
cmake version 3.25.1
CMake suite maintained and supported by Kitware (kitware.com/cmake).
Installing vcpkg
We will also need some way to install external libraries. While many different tools exist, the tool vcpkg was chosen for this book. vcpkg is an open source tool developed by Microsoft used for downloading and managing C++ libraries with CMake. We can install it, add it to our PATH and validate the install using the following commands:
cd ~
mkdir bin
cd bin
git clone https://github.com/Microsoft/vcpkg.git
./vcpkg/bootstrap-vcpkg.sh
printf '\n# >>> vcpkg >>>\nexport VCPKG_ROOT="$HOME/bin/vcpkg"\nexport PATH="$VCPKG_ROOT:$PATH"\n# <<< vcpkg <<<\n' >> ~/.bashrc
source ~/.bashrc
Verify vcpkg
$ vcpkg --version
vcpkg package management program version 2023-10-18-27de5b69dac4b6fe8259d283cd4011e6d20a84ce
See LICENSE.txt for license information.
Hello World
Now that you've installed a C++ compiler, it's time to write your first C++ program. It is tradition when learning a new programming language to write a program that prints "Hello, world!" to the screen and we'll be doing the same.
"Hello, world!" was first introduced as a teaching mechanism for people learning a new programming language in Brian Kernighan's 1972 "A Tutorial Introduction to the Language B".
Creating a Project Directory
First, you'll create a new directory to store your C++ code. It is a good idea to create a 'projects' or 'dev' directory within your home or user directory in order to store any project you might develop for this book and beyond. Open a terminal and run the following commands.
Linux, MacOS or PowerShell on Windows:
$ mkdir ~/projects
$ cd ~/projects
$ mkdir hello_world
$ cd hello_world
CMD on Windows:
> mkdir "%userprofile%\projects"
> cd "%userprofile%\projects"
> mkdir hello_world
> cd hello_world
Writing and Running a C++ Program
Within this new 'hello_world' directory we will create a new file called main.cxx. This is called a C++ source file. A C++ program is then built from one or more of these files. We use the file extension *.cxx to denote that this file contains C++ source code. If a filename contains multiple words the convention is to separate the words with an underscore, e.g. hello_world.cxx over helloworld.cxx. Now open the file you have just created and copy the code from Listing 1-1 into the file.
#include <iostream>
auto main() -> int {
    std::cout << "Hello, world!\n";
    return 0;
}
C++ source files can have various different extensions such as *.cpp or *.c++; however, for this book the *.cxx style will be used. It is also good practice to use the same extension type across a project so no matter which one you use, just be consistent.
Save the file and return to your terminal open to the ~/projects/hello_world directory and run the following command.
On Linux or MacOS
$ g++ -std=c++20 -o hello_world main.cxx
$ ./hello_world
Hello, world!
On Windows
:: Must be done in a 'Developer Command Prompt for VS ...'
> cl /std:c++20 /EHsc /Fe: hello_world.exe main.cxx
> .\hello_world.exe
Hello, world!
If you see "Hello, world!" printed on your terminal, congratulations, you've officially written your first C++ program!
- You can swap the g++ command with the clang++ command if you want to use the Clang compiler instead of the GCC compiler.
- The -std=c++20 (GCC/Clang) and /std:c++20 (MSVC) flags tell the compiler to use the C++20 (2020) version of C++. The -o <name> option (GCC/Clang) and the /Fe: <name> option (MSVC) specify the name and/or directory of the compiled program.
Anatomy of a C++ Program
Let's go into some more detail on the structure of our "Hello, world!" program. The first component to cover is:
auto main() -> int {
}
This declares a function called main. The main function is known as the program's entry point, meaning main is the very first function that runs in every executable C++ program. This declaration of main takes no parameters and returns an integer (int). If there were parameters they would be declared within the parentheses (). The body of the function is wrapped in curly braces {}.
The body of the function contains the following two lines:
std::cout << "Hello, world!\n";
return 0;
The second line returns a status code from main to the operating system (OS) indicating whether the program ran successfully or not. A status code of 0 indicates the program ran successfully, with any other value indicating the program failed.
The first line is where the action occurs! We start by accessing the symbol cout from the namespace std (usually pronounced 'stood') using the scope resolution operator ::. cout is a global character output stream that is linked to stdout, i.e. your terminal's output (you'll learn more about streams and IO in later chapters).
We can push characters through the stream using the << operator, where the left argument must be an output stream and the right argument is a series of characters, numbers or a string. In this case we are pushing the string literal "Hello, world!\n" through the stream. We use the \n character to specify a newline to be printed after our string has been written to the terminal.
You'll notice that we end the line with a semicolon ;. Semicolons are used to indicate the end of a statement.
The operator << has been overloaded for use with cout (and other output streams); as such, it is only defined to work with C++'s primitive and standard library types. You would need to provide your own definition for custom types.
You'll also notice at the top of the file the following line:
#include <iostream>
This is a preprocessor instruction; more specifically, it is an instruction used to import the 'iostream' library into our program. This is where the symbol for the cout output stream comes from. We include libraries by utilising the preprocessor directive #include, which basically copies and pastes the contents of the file indicated within the <> symbols into our program, which in this case is the file 'iostream'. Assume this file's location (and the location of any others used in the same manner) is known to your compiler unless specified otherwise. Files imported using #include are known as headers.
Compiling and Running Are Separate Steps
You may notice that it took two separate steps in order to run our program. This is because C++ is a compiled language, meaning that our source code is transformed into something else before it runs. In the case of C++, the compiler generates binary machine code for our target platform, which in this case is our own device. This means the generated (machine) code is specific to the target and cannot be transferred to and run on a different computer if its architecture is different. This allows the compiler to optimise your code for the target platform but does require the additional step.
This is in contrast to interpreted languages like Python, Ruby and JavaScript, which perform the conversion while the program is running. This in turn requires another program, the interpreter, to run alongside yours, taking up extra resources, but it usually means your programs are more portable as they can run anywhere the interpreter can. These are some of the trade-offs made when designing or using a language.
For simple programs, directly using a C++ compiler (like g++) is fine, but as your project grows you'll want to manage all the options and make it easy to share your code. Next, we'll introduce you to the CMake tool, which will help you write and manage much larger projects.
Hello, CMake
CMake is a third-party tool used to configure and build C++ projects. While there are other tools like CMake for configuring C++ compiler toolchains, CMake is the most ubiquitous within the C++ community. CMake allows us to define one or more targets that our project produces. Targets can be an executable, a library, documentation or even tests. This allows a single project to build many different outputs for different platforms from a single source. Targets can also be consumed by other targets, allowing more modular builds.
Creating a Project with CMake
To start off, go back to your projects/ directory and create a new directory called 'hello_cmake'.
$ mkdir hello_cmake
$ cd hello_cmake
Within this directory we will need to create three new files: main.cxx, CMakeLists.txt and CMakePresets.json. For the main.cxx file you can copy the code below, which is identical to the one found on the previous page except for printing slightly different content.
#include <iostream>
auto main() -> int
{
    std::cout << "Hello, CMake!\n";
    return 0;
}
We will first look at the CMakeLists.txt file.
CMake Configuration Files
A CMake project is defined by a set of 'CMakeLists.txt' files located in the source tree (directories containing your source code). These describe your project's targets, source files etc. For a simple single-file project we only need a single 'CMakeLists.txt' alongside our main.cxx source file. Copy the contents from Listing 1-2.
cmake_minimum_required(VERSION 3.22)
project(hello_cmake
    VERSION 0.1.0
    DESCRIPTION "Hello, CMake!"
    LANGUAGES CXX)
add_executable(hello_cmake main.cxx)
target_compile_features(hello_cmake PRIVATE cxx_std_20)
Let's break down our CMakeLists.txt file. First we specify the minimum required version of CMake this project uses. This helps to ensure that any CMake features used in the project's configuration are available to end users and collaborators.
cmake_minimum_required(VERSION 3.22)
We then define the basic information about our project such as its name, description, version and what languages it uses.
project(hello_cmake
VERSION 0.1.0
DESCRIPTION "Hello, CMake!"
LANGUAGES CXX)
In order to build our main.cxx into an executable we use the add_executable() function, where we specify the executable's name, i.e. the name of the target created for the executable, as well as the source file used to make the executable.
add_executable(hello_cmake main.cxx)
Finally, we can add compilation features, such as setting the C++ Standard to use for building the target, using the target_compile_features() function. Here we add the built-in CMake feature cxx_std_20 to our executable, which ensures it is built using at least the 2020 C++ Standard.
target_compile_features(hello_cmake PRIVATE cxx_std_20)
See Appendix C for more information on C++ Standards.
CMake Presets
We can also specify presets for CMake that define different configurations, each identified by a unique name. These presets can be used to configure your project to compile on multiple different platforms as well as set various flags and options depending on how you want the project to be built. This is better than writing large 'CMakeLists.txt' files with complicated conditional logic that makes even writing the configuration complicated. A minimal CMakePresets.json file would look similar to Listing 1-3.
{
  "version": 3,
  "cmakeMinimumRequired": {
    "major": 3,
    "minor": 22,
    "patch": 0
  },
  "configurePresets": [
    {
      "name": "default",
      "binaryDir": "${sourceDir}/build"
    }
  ]
}
A CMakePresets.json file starts with a key-value pair indicating the version of the presets schema to use from CMake. We also specify the minimum CMake version required for this project, similar to the first line of Listing 1-2.
"version": 3,
"cmakeMinimumRequired": {
"major": 3,
"minor": 22,
"patch": 0
},
We then have a configuration array which stores the preset objects used for configuring our project for different targets. All presets must have a unique name used to identify them.
"configurePresets": [
// ... preset objects go here
]
In our preset named "default" we specify where we want the resulting binary to be put. In this case we specified it to be placed in the build/ directory at the root of our project.
{
"name": "default",
"binaryDir": "${sourceDir}/build"
}
One final thing to mention is that CMakePresets.json files support macro expansions, which allow you to obtain common variables. The syntax for expanding a macro is a dollar sign ($) followed by the variable's identifier surrounded by braces ({}). We can see one being used in Listing 1-3 when we specify where our binary should be built. Instead of hard-coding a path or using a relative path, we can leverage CMake knowing where our project's root is (which is where the root CMakeLists.txt file is located) and obtain the root of our source directory using the sourceDir variable, hence its expansion being used on line 11, i.e. "binaryDir": "${sourceDir}/build". Variable names are always in camel case.
More information on CMake's presets can be found in CMake's official documentation, cmake-presets(7).
Building and Running a CMake Project
When building a CMake project we have to perform two steps. The first step is to configure the project. This generates the build recipe(s) for your project according to your 'CMakeLists.txt' files. Recipes are the instructions used to actually compile your project, with a single recipe being used to build one or more targets. The second step is the build itself, where CMake builds one or more of these targets according to a recipe.
For our project we only have a single target, which also happens to correspond to our single preset, so we can simply run the following to generate our recipe.
$ cmake --preset=default
-- The CXX compiler identification is GNU 11.4.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Configuring done
-- Generating done
-- Build files have been written to: /home/user/projects/hello_cmake/build
If you do not want to use presets you can manually configure the project with the following command.
$ cmake -S . -B build
We can then build the target using the following command:
$ cmake --build build
[ 50%] Building CXX object CMakeFiles/hello_cmake.dir/main.cxx.o
[100%] Linking CXX executable hello_cmake
[100%] Built target hello_cmake
This will produce a binary called hello_cmake in the build/ directory on Linux and macOS and
the build/Debug/ directory on Windows. We can run our program like normal.
$ ./build/hello_cmake # ... or .\build\Debug\hello_cmake.exe on Windows
Hello, CMake!
The reason Windows based builds have an additional intermediate directory Debug/
for the output is that the underlying builder(s) used on Windows can be configured to
output both debug and release builds from the same recipe, which is controlled with
CMake's --config=<config> flag during the build step. You can test creating a 'Release'
build by running the following command, which should now produce an executable in the
build\Release\ directory.
> cmake --build build --config=Release
Compiling with Flags (Optional)
Often we want to have specific flags set for the compiler(s) we are using but because
each compiler has different flags available it can become difficult to have parity across
compilers. Luckily presets make this much easier. Below I have created a preset for each
platform with the correct flags set for the compiler(s) of each platform, ensuring some
of the most common errors and bugs are caught by the compiler and reported to us.
Listing 1-4 showcases these presets, which I'd recommend copying over into
your projects. There are also some hidden presets that are used to define
settings across presets; for example, I have set the C++ standard to 20 for all presets
by inheriting the "std-cxx" preset (through the "common" preset) in the non-hidden platform presets.
Listing 1-5 demonstrates the commands needed to configure, build and run
the executable target for each preset. From now on in the book, I will assume the use
of presets for building C++.
- You'll have to specify the build directory using the -B flag, as shown in Listing 1-5, because the presets do not define it; however, this allows you to customize the build location.
- These presets are tailored for a single executable target and may not be robust enough to handle exporting libraries.
{
"version": 3,
"cmakeMinimumRequired": {
"major": 3,
"minor": 22,
"patch": 0
},
"configurePresets": [
{
"name": "vcpkg",
"hidden": true,
"toolchainFile": "$env{VCPKG_ROOT}/scripts/buildsystems/vcpkg.cmake"
},
{
"name": "std-cxx",
"hidden": true,
"cacheVariables": {
"CMAKE_CXX_EXTENSIONS": "OFF",
"CMAKE_CXX_STANDARD": "20",
"CMAKE_CXX_STANDARD_REQUIRED": "ON"
}
},
{
"name": "common",
"hidden": true,
"inherits": [
"std-cxx",
"vcpkg"
],
"cacheVariables": {
"CMAKE_EXPORT_COMPILE_COMMANDS": "ON"
}
},
{
"name": "linux",
"inherits": [
"common"
],
"description": "These flags are supported by both GCC and Clang",
"cacheVariables": {
"CMAKE_CXX_FLAGS": "-fstack-protector-strong -fcf-protection=full -fstack-clash-protection -Wall -Werror -Wextra -Wpedantic -Werror -Wnarrowing -Wconversion -Wsign-conversion -Wcast-qual -Wformat=2 -Wundef -Werror=float-equal -Wshadow -Wcast-align -Wunused -Wnull-dereference -Wdouble-promotion -Wimplicit-fallthrough -Wextra-semi -Woverloaded-virtual -Wnon-virtual-dtor -Wold-style-cast",
"CMAKE_EXE_LINKER_FLAGS": "-Wl,--allow-shlib-undefined,--as-needed,-z,noexecstack,-z,relro,-z,now",
"CMAKE_SHARED_LINKER_FLAGS": "-Wl,--allow-shlib-undefined,--as-needed,-z,noexecstack,-z,relro,-z,now"
}
},
{
"name": "apple-darwin",
"inherits": [
"common"
],
"cacheVariables": {
"CMAKE_CXX_FLAGS": "-fstack-protector-strong -Wall -Werror -Wextra -Wpedantic -Wnarrowing -Wconversion -Wsign-conversion -Wcast-qual -Wformat=2 -Wundef -Werror=float-equal -Wshadow -Wcast-align -Wunused -Wnull-dereference -Wdouble-promotion -Wimplicit-fallthrough -Wextra-semi -Woverloaded-virtual -Wnon-virtual-dtor -Wold-style-cast"
}
},
{
"name": "windows-x64",
"inherits": [
"common"
],
"description": "Note that all the flags after /WX are required for MSVC to conform to the language standard",
"cacheVariables": {
"CMAKE_CXX_FLAGS": "/sdl /guard:cf /utf-8 /diagnostics:caret /w14165 /w44242 /w44254 /w44263 /w34265 /w34287 /w44296 /w44365 /w44388 /w44464 /w14545 /w14546 /w14547 /w14549 /w14555 /w34619 /w34640 /w24826 /w14905 /w14906 /w14928 /w45038 /WX /permissive- /volatile:iso /Zc:inline /Zc:preprocessor /Zc:enumTypes /Zc:lambda /Zc:__cplusplus /Zc:externConstexpr /Zc:throwingNew /EHsc",
"CMAKE_EXE_LINKER_FLAGS": "/machine:x64 /guard:cf"
}
},
{
"name": "windows-x86",
"inherits": [
"common"
],
"description": "Note that all the flags after /WX are required for MSVC to conform to the language standard",
"cacheVariables": {
"CMAKE_CXX_FLAGS": "/sdl /guard:cf /utf-8 /diagnostics:caret /w14165 /w44242 /w44254 /w44263 /w34265 /w34287 /w44296 /w44365 /w44388 /w44464 /w14545 /w14546 /w14547 /w14549 /w14555 /w34619 /w34640 /w24826 /w14905 /w14906 /w14928 /w45038 /WX /permissive- /volatile:iso /Zc:inline /Zc:preprocessor /Zc:enumTypes /Zc:lambda /Zc:__cplusplus /Zc:externConstexpr /Zc:throwingNew /EHsc",
"CMAKE_EXE_LINKER_FLAGS": "/machine:x86 /guard:cf"
}
}
]
}
# Linux (debug)
$ cmake -S . -B build/linux/debug --preset=linux # configure
$ cmake --build build/linux/debug # build
$ ./build/linux/debug/<exe-name> # execute
# Linux (release)
$ cmake -S . -B build/linux/release --preset=linux -DCMAKE_BUILD_TYPE="Release" # configure
$ cmake --build build/linux/release # build
$ ./build/linux/release/<exe-name> # execute
# --------------------------------------------
# macOS (debug)
$ cmake -S . -B build/macos/debug --preset=apple-darwin # configure
$ cmake --build build/macos/debug # build
$ ./build/macos/debug/<exe-name> # execute
# macOS (release)
$ cmake -S . -B build/macos/release --preset=apple-darwin -DCMAKE_BUILD_TYPE="Release" # configure
$ cmake --build build/macos/release # build
$ ./build/macos/release/<exe-name> # execute
# --------------------------------------------
# Windows [x64] (debug)
$ cmake -S . -B build/windows-x64 --preset=windows-x64 # configure
$ cmake --build build/windows-x64 --config=Debug # build
$ ./build/windows-x64/Debug/<exe-name>.exe # execute
# Windows [x64] (release)
$ cmake -S . -B build/windows-x64 --preset=windows-x64 # configure
$ cmake --build build/windows-x64 --config=Release # build
$ ./build/windows-x64/Release/<exe-name>.exe # execute
# --------------------------------------------
# Windows [x86] (debug)
$ cmake -S . -B build/windows-x86 --preset=windows-x86 # configure
$ cmake --build build/windows-x86 --config=Debug # build
$ ./build/windows-x86/Debug/<exe-name>.exe # execute
# Windows [x86] (release)
$ cmake -S . -B build/windows-x86 --preset=windows-x86 # configure
$ cmake --build build/windows-x86 --config=Release # build
$ ./build/windows-x86/Release/<exe-name>.exe # execute
Hello, vcpkg
While CMake can be used to build your project and help to customize its configuration for different platforms and uses, it is not very good at managing packages. For this reason we will be using another tool built for this purpose called vcpkg. It is an open-source project developed at Microsoft that interacts directly with CMake.
Setting Up CMake Project with vcpkg
To get started we are going to create another new directory in our parent directory
projects/
.
$ mkdir hello_vcpkg
$ cd hello_vcpkg
We can then copy all the files from the previous page's project into this new directory:
main.cxx, CMakeLists.txt and CMakePresets.json. We can then
initialise a new vcpkg project using the following command.
$ vcpkg new --application
This will create two new files, vcpkg.json and vcpkg-configuration.json. The
vcpkg.json file will currently be empty but it is used to specify dependencies. It can also
declare available features for downstream users of your project if it is set up as a
library; however, this is not relevant to us right now. The vcpkg-configuration.json file is
used to specify the source location of packages as well as lock the source
to a particular version to make reproducible builds easier, which is important in
production software; however, we can largely ignore that file for now.
The next thing we'll do is ensure that CMake is aware of vcpkg so the two tools can work
together. CMake supports the notion of a toolchain file, which specifies the underlying
tools CMake must use. This assists in building projects across different systems and helps
determine how to build the packages your project requests. We can specify the vcpkg
toolchain in our CMakePresets.json by adding the file's path to CMake's variable cache.
This can be done by adding a "cacheVariables" object below the "binaryDir" entry in
our CMakePresets.json file, with an entry in the new object for the variable
CMAKE_TOOLCHAIN_FILE. This can be seen in Listing 1-6.
{
"version": 3,
"cmakeMinimumRequired": {
"major": 3,
"minor": 22,
"patch": 0
},
"configurePresets": [
{
"name": "default",
"binaryDir": "${sourceDir}/build",
"cacheVariables": {
"CMAKE_TOOLCHAIN_FILE": "$env{VCPKG_ROOT}/scripts/buildsystems/vcpkg.cmake"
}
}
]
}
This leverages the $env{} macro, which obtains environment variables, in this case the
VCPKG_ROOT variable, which is where our vcpkg install lives.
Adding Packages
Let us add a package to our project. For this example we are going to use the fast
formatting and I/O library {fmt}. To add dependencies
we simply need to add an entry to vcpkg.json called "dependencies", which is an
array of objects or strings representing our project's dependencies. We can do this with
the following command:
$ vcpkg add port fmt
We can also add a minimum version constraint to a dependency by converting the newly added dependency from a string into an object, similar to Listing 1-7.
{
"dependencies": [
{
"name": "fmt",
"version>=": "10.1.0"
}
]
}
The full reference for vcpkg.json contains more details related to controlling your project's setup and dependencies.
Next, we need to tell CMake that {fmt} exists and we'd like to use it. To do this we
must tell CMake to find the package using the find_package() function in the project's
CMakeLists.txt file.
cmake_minimum_required(VERSION 3.14)
project(hello_vcpkg
VERSION 0.1.0
DESCRIPTION "Hello, vcpkg from {fmt}!"
LANGUAGES CXX)
find_package(fmt CONFIG REQUIRED)
add_executable(hello_vcpkg main.cxx)
target_compile_features(hello_vcpkg PRIVATE cxx_std_20)
Because {fmt} will be a required package for our project we must tell CMake to fail if
it cannot be found, which is why we pass the REQUIRED keyword to the function. We also
pass in the CONFIG keyword to tell CMake to use the package's provided
configuration file, which is what allows vcpkg to manipulate how the package's CMake
targets are made. We then must add {fmt} (more specifically, a target from {fmt}) to
our project's executable target. We can do this with the target_link_libraries()
function. This function first takes the name of the target we want to add libraries to.
We then specify the names of the targets we wish to link, ensuring we specify a
scope keyword, ie. one of PRIVATE, PUBLIC or INTERFACE. Linking a library simply
means adding it to another target. In Listing 1-8 we use PRIVATE
scoping for adding {fmt} to our executable because the usage of {fmt} is limited to
the internals of our project's resulting binary and will not be exposed.
# ... rest of CMakeLists.txt
add_executable(hello_vcpkg main.cxx)
target_compile_features(hello_vcpkg PRIVATE cxx_std_20)
target_link_libraries(hello_vcpkg PRIVATE fmt::fmt)
You may also notice we are linking fmt::fmt, not just fmt, in the call to
target_link_libraries(). This is because the first fmt is a namespace for the package
found through find_package(). We then access the target named fmt and link it to our
executable.
With all that done we can now use {fmt} in our main.cxx file.
Listing 1-9 is an example program that uses {fmt} which you can copy
into the project's main.cxx.
#include <fmt/core.h>
auto main() -> int
{
fmt::println("Hello, vcpkg from {{fmt}}");
return 0;
}
The use of two braces in the format string of fmt::println() is so that one pair is
actually printed in the output. Normally braces have a special meaning in {fmt}, but if
we need a literal '{' or '}' we use two.
Building and Running with vcpkg
If we copy over the presets file from the bottom of the previous page, we can build our
small package using presets. This works because of the "vcpkg" preset, which links CMake
and vcpkg together such that CMake can find packages installed with vcpkg.
$ cmake -S . -B build/<platform> --preset=linux
$ cmake --preset=default
$ cmake --build build/<platform>
$ ./build/<platform>/hello_vcpkg # ... or .\build\windows-x[86|64]\Debug\hello_vcpkg.exe on Windows
Hello, vcpkg from {fmt}
"Hello, Godbolt!"
Compiling Online
The ability to quickly test and prototype software is extremely useful; however, doing so in C++ is not so easy. There are a lot of steps that need to be taken to set up a project correctly, which is good for building robust software but slows prototyping to a crawl. Luckily there exists a tool that makes this process much easier: an online C++ compiler known as Godbolt. This site allows you to compile C++ using many different compilers, even at the same time, as well as execute the resulting binary and even see the assembly generated by the compiler. It also allows you to share your session with others so they can see not only the code you wrote but also the exact compiler(s) and flags you have set. It is a massively useful tool that is invaluable to the C++ community. Here is an example "Hello, world!" on Godbolt which shows the generated assembly as well as the output from the executed binary. You can also see the Godbolt instance embedded below.
Project: Guessing Game
Let us jump straight into C++ by developing a project together! This will help expose you to some common concepts from C++ and how they are used in an actual program. You'll learn how to create variables, control the flow of your program, take in user input, create functions and more! These concepts will be explored in more detail in future chapters while this one will focus on the fundamentals.
We'll be implementing a simple number guessing game. The program will generate a random integer between 1 and 100 (inclusive). It will then prompt the user to type in a guess. After the guess is entered the program will indicate whether the guess was too high or too low, or print a congratulatory message if the user got it right, and then exit.
Setting Up a New Project
To begin, create a new directory in your projects/
directory called guessing_game
and
enter it.
$ mkdir guessing_game
$ cd guessing_game
As usual, we'll need to create the files main.cxx, CMakeLists.txt and
CMakePresets.json. Our main.cxx file can just be an empty main() function like
Listing 2-1, and for our CMakeLists.txt file we must specify a minimum
project configuration, detailed in Listing 2-2. As for our
CMakePresets.json file, we can use either one from Chapter 1:
Listing 1-3 or Listing 1-4.
auto main() -> int {
return 0;
}
cmake_minimum_required(VERSION 3.14)
project(guessing_game
VERSION 0.1.0
DESCRIPTION "Number Guessing Game"
LANGUAGES CXX)
add_executable(guessing_game main.cxx)
target_compile_features(guessing_game PRIVATE cxx_std_20)
Processing a Guess
First we need to ask the user for input, process that input and ensure it is in the form we expect. To start we'll simply take in the user's guess and echo it back to them. Listing 2-3 shows the starting code.
#include <iostream>
#include <string>
auto main() -> int
{
std::cout << "Guessing Game!\n";
std::cout << "Please input your guess (1..100): ";
auto guess = std::string {};
std::getline(std::cin, guess);
std::cout << "You guessed: " << guess << std::endl;
return 0;
}
Let's briefly go over the new concepts introduced in Listing 2-3. We have
included a new header <string> which
contains the definition of the type std::string and its supporting functions.
#include <string>
We then prompt the user with the name of the game as well as request input from the user
using the output stream std::cout
, which we covered in Chapter 1.
std::cout << "Guessing Game!\n";
std::cout << "Please input your guess (1..100): ";
Storing Data with Variables
Next, we construct a new variable to store the user's input in.
auto guess = std::string {};
Now this is where things begin to get interesting. This line is a variable declaration with an initializer, which is used to bind a value to a variable. Here is another!
auto boxes = 7;
Note the lack of a type after the =. This is because we can initialize boxes with an
integer literal, from which the type int is deduced, and thus an explicit type is not needed.
In C++ variables are mutable by default, which means we are allowed to change their value.
This concept will be discussed more in Chapter 3 | Variables and Mutability.
To make a variable constant, ie. its value cannot change once it is set, we use the
const keyword after/before auto (I choose after).
auto const boxes = 7; // constant
auto crates = 4; // mutable
The //
syntax indicates a comment that continues until the end of the line. Everything
in a comment is ignored by C++. You will learn more about them in
Chapter 3 | Comments.
In the case of our variable guess in our guessing game program, we have (default)
constructed a temporary value with the type std::string which we then bind to the
variable named guess using the = operator. We have also used auto to allow the
compiler to deduce the type that the variable guess should have. We could have written
the type explicitly on the left-hand side (LHS) instead of auto, like the example below, but
this would be more verbose as we have to express the type twice. It also means that if
we change the type on the right-hand side (RHS) we must also change it on the LHS, but with auto the
compiler will do that for us!
std::string guess = std::string {};
When constructing our std::string we have used what is known as brace initialization. This is a safer
variant of regular construction (which uses parentheses ()) as it prevents narrowing
conversions, where a value is implicitly converted to a type that cannot represent it
exactly, eg. a double being truncated to an int. We have also default
constructed our std::string, which in this case means the std::string is constructed
as an empty string, not as an invalid object.
Receiving User Input
There are a few different ways of handling user input from the terminal in C++. For this
program we have used std::getline().
std::getline(std::cin, guess);
This function extracts characters from the first argument, which is of type
std::basic_istream<>. In this case, the input stream is std::cin. Extraction stops once no
characters remain in the stream or the designated delimiter is encountered, which
defaults to '\n' (third argument). The extracted characters are written to the
second argument, which is a reference to a string of the same underlying character type.
References allow functions to read and/or modify data passed to them and have the effects
reflected on the caller's side. We'll cover references and ownership in C++ during
Chapter 4. In effect this function reads an entire line and
copies the characters into a string.
Printing with Output Streams
As we first saw in "Hello, world!" we can output text using the
std::cout global object and the operator <<.
You may be wondering why this "unique" syntax has been chosen for printing. This
is because the Input/Output library is more
generic than just a printing facility. As the name suggests it is a library for
manipulating and using Input/Output (IO) streams. Streams can be thought of as a pipeline
between two endpoints eg. a program and the terminal screen where data can be pushed from
one end (the program) and extracted at the other end (the terminal screen). The C++ IO
library uses streams to model how data is transferred between various endpoints like a
program, the terminal screen, files etc. with the <<
and >>
operators being used to
perform formatted IO ie. push formatted data to and/or extract formatted data from a
stream respectively. These facilities were then used to wrap low level IO handles such as
stdin
, stdout
and stderr
; which are used to print and take user input, in global
stream objects eg. std::cin
, std::cout
and std::cerr
which meant they could be
manipulated using the same API and functionality provided by the standard C++ IO library.
The C++23 Standard includes a new header <print> with functions like std::println(),
which use the C++20 formatting library and make
printing much more intuitive and faster. This library was directly inspired by the
{fmt} library.
If you are familiar with other languages you may be wondering why << is used to push to
a stream, as this operator is normally used for left bit shifting
operations. We are able to use the << operator because it has been overloaded.
Essentially this means the functionality of <<
has been changed and customized for
particular types. Within the C++ standard library, <<
has been overloaded to support
taking a reference to a std::basic_ostream<>
object as the left argument; ie. the type of std::cout
, and various builtin C++ types
and library types from the standard library as the right argument eg. int
and
std::string
, which allows the <<
syntax to be used with many different types already
in C++. Overloading will be covered in more detail in
Chapter 3 | Functions.
In this program we have seen that we can chain the calls to <<
.
std::cout << "You guessed: " << guess << std::endl;
This is because each call to <<
returns a reference to the same stream passed as the
left argument, allowing you to make subsequent calls to <<
one after another. This can
make it easier to build up pipelines to and from streams as we can create arbitrarily
long chains.
Finally, you may notice the std::endl at the end of the chain. This is a
stream manipulator. Stream manipulators are used to modify the stream to support
different kinds of formatting. In this case, std::endl simply appends a '\n' to the
stream and flushes the underlying buffer. So why not just use '\n'? Well, you should.
Using std::endl repeatedly just to add newlines can dramatically degrade performance
because each flush forces the buffered characters to be written out immediately
instead of allowing the output to accumulate, ie. reach a large enough size to
warrant making a system call. std::endl should only be used when you want to flush the
stream's buffer and place a newline, eg. at the end of a program; otherwise use an explicit
'\n'.
Generating a Secret Number
Now we want some way to generate a secret number that the player will try to guess. We
also want the number to be different each time so the game is more fun but we'll keep it
between 1 and 100 to ensure it is not too difficult. To generate our secret number we'll
use a random number generator. The C++ standard library contains a header
<random>
which contains a bunch of
facilities for generating random numbers. Update your main.cxx
file according to
Listing 2-4.
#include <iostream>
#include <random>
#include <string>
auto main() -> int
{
std::cout << "Guessing Game!\n";
auto rd = std::random_device {};
auto gen = std::mt19937 { rd() };
auto distrib = std::uniform_int_distribution { 1, 100 };
auto const secret_number = distrib(gen);
std::cout << "The secret number is: " << secret_number << '\n';
std::cout << "Please input your guess: ";
auto input = std::string {};
std::getline(std::cin, input);
std::cout << "You guessed: " << input << std::endl;
return 0;
}
First we include the new header <random>
so we can access the (pseudo-) random number
generation types. Next we add the lines
auto rd = std::random_device {};
auto gen = std::mt19937 { rd() };
auto distrib = std::uniform_int_distribution { 1, 100 };
The first line (default) constructs a new std::random_device.
This is a uniformly distributed, non-deterministic number generator. While we could
generate a random number by simply calling rd, this is considered bad practice as
the performance of many std::random_device implementations degrades with use once their
entropy pool is used up. For this reason we simply use it to seed a proper Pseudo-Random Number Generator
(PRNG) such as std::mt19937, which is what we do on the second line. Finally we construct a
std::uniform_int_distribution<>, which is used to uniformly generate integers between the two provided bounds (inclusive).
This sets up our random number generator. To obtain a random number we call the distribution object, passing in the generator, which returns a new random value.
auto const secret_number = distrib(gen);
Comparing the Guess to the Secret Number
Next we want to compare our player's guess to the secret number. The updated code can be seen in Listing 2-5.
#include <compare>
#include <iostream>
#include <random>
#include <string>
auto main() -> int
{
std::cout << "Guessing Game!\n";
auto rd = std::random_device {};
auto gen = std::mt19937 { rd() };
auto distrib = std::uniform_int_distribution { 1, 100 };
auto const secret_number = distrib(gen);
std::cout << "The secret number is: " << secret_number << '\n';
std::cout << "Please input your guess: ";
auto input = std::string {};
std::getline(std::cin, input);
auto guess = std::stoi(input);
if (auto const cmp = guess <=> secret_number; std::is_eq(cmp)) {
std::cout << "You guessed correctly!\n";
} else if (std::is_lt(cmp)) {
std::cout << "Too small!\n";
} else if (std::is_gt(cmp)) {
std::cout << "Too big!\n";
}
return 0;
}
Before we are able to compare the player's input to our secret number we must first convert the raw input into a number so they can be compared.
auto guess = std::stoi(input);
C++ offers a few functions for converting strings into numbers, which all start with the
prefix std::sto*, meaning 'string-to', followed by a designator for the conversion type.
Because we want to parse our input as a plain int we can use std::stoi().
Next we compare the guess to our secret_number. Here we can make use of the spaceship
operator (<=>), which allows us to perform a '3 way comparison' whose result we can then query
with the utility functions std::is_eq, std::is_lt, std::is_gt etc.
In this case we create a new object cmp and then use these 'named comparison' functions
to check the result. We use if and else if branches to test the
comparison's result and run a separate piece of code if that branch succeeds.
if (auto const cmp = guess <=> secret_number; std::is_eq(cmp)) {
std::cout << "You guessed correctly!\n";
} else if (std::is_lt(cmp)) {
std::cout << "Too small!\n";
} else if (std::is_gt(cmp)) {
std::cout << "Too big!\n";
}
We have also used an initialiser statement in the first if branch. This allows us to run
a statement at the start of the if branches and store the result in a local variable
(in this case cmp) which can only be accessed within the if branches. This helps
ensure that cmp is not modified or accessed outside the if branches it belongs to.
Handling Parsing Errors with Exceptions
Our game is coming along quite nicely but it has one fundamental flaw. What happens if we give our game the input "abcd34" or "38574876546456476745"? We get the following two errors and our game crashes!
# input: "abcd34"
terminate called after throwing an instance of 'std::invalid_argument'
what(): stoi
[1] 27989 IOT instruction ./build/.../guessing_game
# input: "38574876546456476745"
terminate called after throwing an instance of 'std::out_of_range'
what(): stoi
[1] 1513 IOT instruction ./build/.../guessing_game
This is not ideal as it gives no way for the program to recover from the error and let the
user try again. How do we fix this? Well, notice that the error message states that an
instance of (either) std::invalid_argument (or) std::out_of_range was thrown.
What are these objects? These are known as exceptions. They are special objects used to
indicate that an exceptional event has occurred. These are pathways in our program that
we do not expect to occur but might, and exceptions allow us to recover the system without
fully crashing. This is a useful mechanism for allowing systems to remain online and
perform self recovery if an error does occur.
Before we look at how to handle thrown exceptions we'll first discuss what each of these
exceptions means in the context of std::stoi(). std::invalid_argument is used to
indicate that no conversion could be performed due to a bad input, eg. an
input prefixed with letters such as "abcd34". The exception std::out_of_range is used to indicate
that the input value cannot fit into the conversion type. For example, if
"38574876546456476745" is passed to std::stoi() this exception is thrown because
the max value that can fit inside an int is (on most platforms) 2147483647, which is much smaller than
38574876546456476745.
The std::sto*
function family will 'successfully' parse inputs like "34abc" as they
extract the number from the front and will discard the rest.
Catching Exceptions
So how do we handle an exception that has been thrown? We can use a try-catch
block.
When there is a chance for something to fail we place the potentially failing code in a
try
block. After a try block we put one or more catch
blocks. These are used to
define the exception handling pathway for that particular exception. For our simple
program we can define a try-catch
block like in Listing 2-6.
#include <compare>
// --snip--
#include <cstdlib>
#include <iomanip>
#include <iostream>
#include <random>
#include <stdexcept>
#include <string>
// --snip--
auto main() -> int
{
std::cout << "Guessing Game!\n";
auto rd = std::random_device {};
auto gen = std::mt19937 { rd() };
auto distrib = std::uniform_int_distribution { 1, 100 };
auto const secret_number = distrib(gen);
std::cout << "The secret number is: " << secret_number << '\n';
std::cout << "Please input your guess: ";
auto input = std::string {};
std::getline(std::cin, input);
// --snip--
auto guess = int {};
try {
guess = std::stoi(input);
} catch (std::invalid_argument const&) {
std::cout << "Invalid input " << std::quoted(input) << "!\n";
std::exit(0);
} catch (std::out_of_range const&) {
std::cout << "Input " << std::quoted(input) << " is too large!" << '\n';
std::exit(0);
}
if (auto const cmp = guess <=> secret_number; std::is_eq(cmp)) {
std::cout << "You guessed correctly!\n";
} else if (std::is_lt(cmp)) {
std::cout << "Too small!\n";
} else if (std::is_gt(cmp)) {
std::cout << "Too big!\n";
}
return 0;
// --snip--
}
While try-catch blocks do model a form of control flow they are very different to
regular control flow mechanisms like if statements. You should not use try-catch
blocks to control the regular/expected execution pathway of a program, as they are much
slower, nor should you throw exceptions in order to jump out to a particular scope.
Exceptions should only be used to indicate that a recoverable error has occurred, with
try-catch blocks being used to handle recovering from this event, eg. giving any
allocated resources back to the OS. As such, exceptions should be used only in
exceptional (pun most definitely intended) cases and when appropriate for your domain
(as they can be undesirable in many situations). The main purpose of showing exceptions
now is to demonstrate how to handle them, not how to throw your own.
Allowing Multiple Guesses with a Loop
Now that we correctly handle the exceptional cases of parsing our player's input we can look at making the game more interactive. Only having one guess doesn't make our game very fun. Let's allow the player to make multiple guesses by introducing a loop! We want this loop to run forever, with explicit mechanisms for exiting it. We can use a while loop whose condition is simply true, which creates an infinite loop. But how and when do we exit the loop? We want to leave the loop when the player guesses the correct number. We can do this by introducing a break statement in the first if branch, where the player's guess is compared to the secret number; break exits the enclosing loop. We also need the program to move on to the next loop iteration if an exception occurs, skipping the comparisons. We can do this by replacing the std::exit(0) call in each catch block with a continue statement, which skips to the next iteration. Finally, be sure to move the prompt output and player input logic into the loop so they run on each iteration.
#include <compare>
#include <exception>
#include <iomanip>
#include <iostream>
#include <random>
#include <stdexcept>
#include <string>
// --snip--
auto main() -> int
{
std::cout << "Guessing Game!\n";
auto rd = std::random_device {};
auto gen = std::mt19937 { rd() };
auto distrib = std::uniform_int_distribution { 1, 100 };
auto const secret_number = distrib(gen);
std::cout << "The secret number is: " << secret_number << '\n';
auto input = std::string {};
auto guess = int {};
// --snip--
while (true) {
std::cout << "Please input your guess: ";
std::getline(std::cin, input);
// --snip--
try {
guess = std::stoi(input);
} catch (std::invalid_argument const&) {
std::cout << "Invalid input " << std::quoted(input) << "!\n";
continue;
} catch (std::out_of_range const&) {
std::cout << "Input " << std::quoted(input) << " is too large!" << '\n';
continue;
}
if (auto const cmp = guess <=> secret_number; std::is_eq(cmp)) {
std::cout << "You guessed correctly!\n";
break;
} else if (std::is_lt(cmp)) {
std::cout << "Too small!\n";
} else if (std::is_gt(cmp)) {
std::cout << "Too big!\n";
}
}
return 0;
}
Fantastic! With a final tweak we have finished the guessing game. Our game is still printing the secret number! We can fix this by deleting the line. The final code is available in Listing 2-8.
#include <compare>
#include <exception>
#include <iomanip>
#include <iostream>
#include <random>
#include <stdexcept>
#include <string>
auto main() -> int
{
std::cout << "Guessing Game!\n";
auto rd = std::random_device {};
auto gen = std::mt19937 { rd() };
auto distrib = std::uniform_int_distribution { 1, 100 };
auto const secret_number = distrib(gen);
auto input = std::string {};
auto guess = int {};
while (true) {
std::cout << "Please input your guess: ";
std::getline(std::cin, input);
try {
guess = std::stoi(input);
} catch (std::invalid_argument const&) {
std::cout << "Invalid input " << std::quoted(input) << "!\n";
continue;
} catch (std::out_of_range const&) {
std::cout << "Input " << std::quoted(input) << " is too large!" << '\n';
continue;
}
if (auto const cmp = guess <=> secret_number; std::is_eq(cmp)) {
std::cout << "You guessed correctly!\n";
break;
} else if (std::is_lt(cmp)) {
std::cout << "Too small!\n";
} else if (std::is_gt(cmp)) {
std::cout << "Too big!\n";
}
}
return 0;
}
Summary
This project offered a hands-on way to learn many of C++'s features: auto, variables, functions, if statements, exception handling and loops! In the upcoming chapters you will delve deeper into these concepts as well as explore many new ones. See you there!
Common Programming Concepts
Throughout this chapter we will cover some of the most common concepts that appear in many different programming languages and how they work in C++. None of these concepts are unique to C++ but they may work slightly differently to how you are used to.
Keywords are words reserved for use by the language, meaning they cannot be used as identifier names for variables or functions. See Appendix A for the full list of keywords.
Variables and Mutability
We first saw variables in our mini guessing game project, where we used them to store the user's guess, create our PRNG and so on. Let's explore what happens when we try to modify constant data and when we would want to allow mutation.
By default, variables are mutable, allowing you to modify them freely. While this offers great flexibility and ease of programming, it is beneficial to opt in to immutability using the const keyword so that data which does not need to change cannot change, only removing the const keyword when the data needs to be mutable.
Create a new project (or use an existing one) with a main.cxx, CMakeLists.txt etc. like we did for our previous programs, and we'll explore mutability. Change the name of the target to main in the CMakeLists.txt, as I'll be using this as the target name for (nearly) all examples from now on in the book.
cmake_minimum_required(VERSION 3.22)
project(main
VERSION 0.1.0
DESCRIPTION "C++ Book Example"
LANGUAGES CXX)
add_executable(main main.cxx)
target_compile_features(main PRIVATE cxx_std_20)
In your main.cxx, write the following program.
#include <iostream>
auto main() -> int {
auto const x = 42;
std::cout << x << std::endl;
x = 43;
std::cout << x << std::endl;
return 0;
}
When we try to compile this we should get an error like so:
$ cmake -S . -B build --preset=<platform>
$ cmake --build build
[ 50%] Building CXX object CMakeFiles/main.dir/main.cxx.o
/home/user/projects/common/main.cxx: In function ‘int main()’:
/home/user/projects/common/main.cxx:7:7: error: assignment of read-only variable ‘x’
7 | x = 43;
| ~~^~~~
gmake[2]: *** [CMakeFiles/main.dir/build.make:76: CMakeFiles/main.dir/main.cxx.o] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:83: CMakeFiles/main.dir/all] Error 2
gmake: *** [Makefile:91: all] Error 2
It is vital that we catch errors like this at compile time, as it prevents us from writing incorrect and potentially vulnerable code. Constant data is also easier to reason about, as we can assume that no part of the program will modify it. The benefits of this do not emerge properly until we introduce functions and have to share data across function boundaries, where we expect a function not to mutate data passed to it even though the surrounding scope might. More on this later.
Even though immutable data is easier to reason about, mutating data is where the fun
parts of computation occur. We can see that by dropping the const
we can mutate the
variable freely.
#include <iostream>
auto main() -> int {
auto x = 42;
std::cout << x << std::endl;
x = 43;
std::cout << x << std::endl;
return 0;
}
This now compiles and runs, producing:
$ cmake -S . -B build --preset=<platform>
$ cmake --build build
$ ./build/main
42
43
Constant Expressions
C++ allows us to define constants whose value is computed at compile time using the constexpr keyword. This allows you to define variables that are the result of some computation but have the value ready at runtime instead of performing the computation during runtime. constexpr variables are naturally immutable.
To actually see this feature in action, we need to look at the assembly generated for code using constexpr and code without it. In the example below we have two numbers: one is constexpr and is initialized with an expression that even contains a function call, while the other is initialized with a simple literal and then immediately assigned the value of the same expression.
#include <iostream>
auto constexpr sum(auto const n) {
auto acc = 0;
for (auto i = 0; i < n; ++i) {
acc += 1;
}
return acc;
}
auto main() -> int {
auto constexpr x = (42 + 7) / sum(23);
auto y = 6;
y = (42 + 7) / sum(23);
std::cout << x << std::endl;
std::cout << y << std::endl;
return 0;
}
This generates the following assembly (at least for GCC-14):
main:
push rbp
mov rbp, rsp
sub rsp, 16
mov DWORD PTR [rbp-4], 2
mov DWORD PTR [rbp-8], 6
mov edi, 23
call auto sum<int>(int)
mov ecx, eax
mov eax, 49
cdq
idiv ecx
mov DWORD PTR [rbp-8], eax
mov esi, 2
The places of interest are lines 5 and 6, and then lines 7 to 13. The first pair are the variables x and y being initialized. Line 6 makes sense because we initialized y with the literal 6, but line 5 shows x being initialized with 2, the already computed result of the expression. Compare this to lines 7 to 13, which show the work needed at runtime for y: passing the argument and calling the sum() function, moving the results into registers, a division instruction (idiv) and finally storing the result in the variable on the stack frame. That's not even to mention the instructions needed to run sum() (take a look at the link below for the full assembly). The difference is quite noticeable.
While the example above is simple (and a little contrived*), constexpr has become a very powerful feature of C++ and is capable of computing very complex expressions at compile time, even expressions involving objects that typically interact with runtime-only entities like the heap. We'll learn more about this in future chapters.
*This initialization and immediate change is necessary to force the compiler to generate the unoptimized assembly I wanted to show off. Compilers have gotten so good that, constexpr or no constexpr, a variable directly initialized with this expression will cause the compiler to optimize the whole thing away into the result of the expression and directly initialize the variable with that value. In fact, it completely removes the definition of sum(), as it is only used in expressions which run at compile time, so there is no need to store the function's code in the resulting binary if it is never used again. Setting the second variable to a temporary value first prevents the compiler from making these optimizations. It's amazing how much heavy lifting compilers are able to do for us.
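It should be noted that constexpr on a function only tells the compiler that the function can be evaluated at compile time; a call whose result is not needed in a constant expression may still be evaluated at runtime. For functions that must always be evaluated at compile time, consteval was introduced in C++20.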
Type Deduction
You may be wondering why I am using auto to declare variables instead of writing the type like below. C++ is a statically typed language after all... right?
int x = 5;
auto y = 6;
auto
is a keyword that allows the compiler to perform type deduction, which means we
tell the compiler to figure out the type of the variable or function return signature
from the context it is given.
Storage Duration
Data in C++ falls into different storage duration categories which dictate the lifetime of the data. So far we have seen data with automatic storage duration: data that is automatically freed when it goes out of scope. These are variables that do not allocate heap memory and instead live entirely on the stack, and thus are freed when stack frames are popped, which occurs naturally as functions return.
Data with dynamic storage duration is data that is created at runtime and must be deallocated manually before the program finishes. This is data that is usually stored on the heap or what C++ formally calls the free store.
One we haven't looked at yet is static storage duration. This is data that lives for the entire duration of the program, typically because it is encoded directly in the program's binary. To give a local variable this storage duration we declare it with the static keyword. Variables declared outside of any function (global variables) implicitly have static storage duration.
Data Types
As we mentioned on the last page, C++ is a statically typed language which means the type of data must be known (or deducible) to the compiler. C++ has a large selection of types available to use, some are language primitives and others are defined in the standard library. In this page we will look at four categories of types: scalar integrals, floating point, compound and special types.
Scalar Types
Scalar integrals are types encoded as whole numbers. This not only includes integer types but also C++'s character and Boolean types.
Integer Types
An integer is a whole number. C++ has a few different integer types which have different bit widths. The default int is 32 bits wide on most platforms. By default integer types are signed ie. they can represent both positive and negative numbers. If you need unsigned numbers you can use the unsigned qualifier.
int const x = -5;
unsigned int const y = 5;
If you need integers of a different size you can use size qualifiers with the int type to dictate the minimum size the integer can be. All of these can be used in combination with the unsigned qualifier.
Type | Full Type | Minimum Size (bits) | Signed Value Range | Unsigned Value Range |
---|---|---|---|---|
char | char | at least 8 | -128 to 127 | 0 to 255 |
short | short int | at least 16 | -32,768 to 32,767 | 0 to 65,535 |
int | int | at least 16 | -32,768 to 32,767 | 0 to 65,535 |
long | long int | at least 32 | -2,147,483,648 to 2,147,483,647 | 0 to 4,294,967,295 |
long long | long long int | at least 64 | -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 | 0 to 18,446,744,073,709,551,615 |
You can also use fixed width integer types (FWIT). FWIT have the form std::intN_t or std::uintN_t where N is the exact number of bits wide. The standard library defines FWIT (signed and unsigned) for widths of 8, 16, 32 and 64 bits in the <cstdint> header.
The bit width of an integer dictates how many values the integer can represent. As of C++20, all signed integers must be represented using 2s-complement, which means that for signed numbers the range of values is \(-2^{N-1}\) to \(+2^{N-1}-1\), eg. -128 to 127 for an 8-bit number, and for unsigned numbers the range is \(0\) to \(2^N-1\), eg. 0 to 255 for an 8-bit number.
In addition to these integer types there are std::size_t and std::ptrdiff_t, which are unsigned and signed types respectively that are typically as wide as a pointer on a given architecture, eg. 64 bits on a 64-bit architecture. std::size_t is the type used when indexing arrays or getting the size of objects. The odd name for std::ptrdiff_t is because this is the type returned when subtracting two pointers; however, in practice it serves as the signed counterpart to std::size_t.
Literals
You can specify the type/width of an integer literal using a suffix from the table below, with the u suffix able to be combined with either of the other two.
Suffix | Description |
---|---|
u or U | unsigned |
l or L | long |
ll or LL | long long |
Additionally you can write integer literals in a different base form by changing the prefix of the literal.
auto const decimal = 42;
auto const octal = 052;
auto const hex = 0x2a;
auto const Hex = 0X2A; // capital prefix and hex digits
auto const binary = 0b101010;
A single quote (') can also be used as a digit separator to make large numbers easier to read.
auto const x = 1'234'567'890;
Character Types
You'll notice that we have included the char
type in the integer list above. This is
because character types in C++ are represented using numbers, specifically char
represents ASCII code points. Character literals are specified with single quotes like
the example below.
char const x = 'a';
auto const y = 'b';
Boolean Type
C++'s Boolean type is called bool
and can either hold the value true
or false
.
Booleans are used mostly in conditional and loop statements eg. if
and while
.
bool x = false;
auto y = true;
The C language, C++'s mother language, originally did not have a native Boolean type, with Boolean expressions instead returning 1 for true and 0 for false. Later, in the 1999 standard of C (C99), the _Bool type was introduced to support Booleans.
Floating Point Types
C++ has three floating point types which, on most platforms, are based on the IEEE-754 standard. Floating point numbers are used to represent numbers with fractional components. These types are float, double and long double, with float typically being a single precision (32-bit) number, double a double precision (64-bit) number and long double an extended or quadruple precision number (commonly 80 or 128 bits, depending on the platform).
With auto, a floating point literal is deduced as a double by default, while float and long double literals are specified using the f and l literal suffixes.
auto const f = -0.06f;
auto const d = 47.5768;
auto const l = -655456.457567l;
We can also initialize floating points using exponential form:
auto const f = -6e-2f;
auto const d = 475768e-4;
auto const l = -655456457567e-6l;
Arithmetic Operations
Integral and floating point types are categorized as arithmetic types, which means they support the common arithmetic operations like addition, subtraction etc.
auto main() -> int {
// addition
auto const sum = 4 + 6;
// subtraction
auto const diff = 10 - 5.5;
// multiplication
auto const mul = 5 * 3.2;
// division
auto const idiv = 10 / 3;
auto const fdiv = 13.5 / 2.4;
// remainder
auto const rem = 23 % 4;
return 0;
}
- Division between two integrals performs integer division and truncates towards 0 while if one argument is a floating point then floating point division is performed.
- Remainder is only valid between integral types.
Compound Data Types
Compound data types store multiple pieces of data or are data that can take multiple values.
Enumerations
Enumerations or enums are a construct that allows you to define a type whose value is restricted to a set of named variants or enumerators. These named constants have an underlying integral type. Specifying the underlying type is optional ie. you can omit the : type part of the enum declaration.
enum class colour : char {
red,
green,
blue
};
auto const c = colour::red;
Tuple
Tuples allow you to pack multiple pieces of data of different types into a single structure. Tuples have a fixed number of elements that cannot grow or shrink once declared. Tuples in C++ are not language types but are provided by the standard library as std::tuple in the <tuple> header. We create a tuple using brace initialization (top) or using the helper function std::make_tuple() (bottom).
auto const t = std::tuple { 5u, 5.34f, -345, "abc", false };
auto const u = std::make_tuple(5u, 5.f, -345, "abc", false);
Tuples can be accessed using std::get<I>(t)
with I
being the index of the value we
want to access and t
is the tuple object.
auto const e = std::get<2>(t); // e := -345
You can also destructure a tuple into its constituent values using structured bindings (since C++17), like so.
auto const [v, w, x, y, z] = t;
There is a closely related type called std::pair, found in the <utility> header, which holds exactly two values. The values of a pair can be extracted using the same methods as tuples, but pairs also have the public members first and second, which allow you to access the data directly.
auto const p = std::pair {5, 'a'};
auto const [x, y] = p;
auto const z = p.second;
Special Types
C++ has a handful of special types that you won't often use directly but are fundamental to the language.
The first is the void type, an incomplete type that is used to indicate that a function does not return a value.
auto foo(auto const i) -> void {
i + 5;
}
The other type is std::nullptr_t, which is the type of nullptr, the value of a pointer pointing to nothing.
Array Types
C++'s array type is a fixed size container whose elements are all of the same type. The array type is called std::array and is found in the <array> header. Array elements can be accessed using the subscript operator [] or the array::at() method, with indices starting at 0. The subscript operator does not perform bounds checking while array::at() does, meaning the latter will throw an exception if an out of bounds index is used while the former results in undefined behaviour, which will crash the program... sometimes.
auto const a = std::array { 1, 2, 3, 4, 5 };
auto const e1 = a[0]; // valid
auto const e2 = a.at(5); // exception std::out_of_range
Functions
Functions are fundamental to programming as they allow us to write reusable pieces of code. We have already been using a function in the examples we have shown so far, that is the main() function, which is called by our OS to start the program. We have also seen a function in the constexpr example.
Functions are defined by introducing a type (or auto) followed by the function's name, a(n optional) comma-separated list of parameters surrounded by parentheses, followed by the body of the function in (curly) braces. We call a function by writing its name followed by parentheses.
#include <iostream>
// --snip--
auto another_one() {
std::cout << "Another one!\n";
}
auto main() -> int {
std::cout << "Main function!\n";
another_one();
return 0;
// --snip--
}
A function must be declared before it can be used, as the compiler has to know the function's symbol (name + parameter and return types) exists; however, it does not have to be defined at that point. Note that a declaration without a body must state the return type explicitly, as there is no return statement from which it could be deduced.
#include <iostream>
// --snip--
// declaration
auto another_one() -> void;
auto main() -> int {
std::cout << "Main function!\n";
another_one();
return 0;
// --snip--
}
// definition
auto another_one() -> void {
std::cout << "Another one!\n";
}
This mechanism is a result of how C, and thus C++, code was and still is compiled and linked together. It allows you to state that a symbol exists in a public header (declare it) but define it later in a source file, which was usually built into a binary library, with the linker then connecting calls to the function to its location in the library.
Parameters
Parameters are a way to pass information into functions. The type of each parameter must be specified, using the same syntax we saw to declare a variable (without an initializer).
#include <iostream>
// --snip--
auto another_one(int const x, int const y) {
std::cout << "x: " << x << ", y: " << y << "\n";
}
auto main() -> int {
std::cout << "Main function!\n";
another_one(7, 6);
return 0;
// --snip--
}
As we saw in the constexpr example from the previous page, function parameters may also be declared with auto, but this can sometimes make it hard to know what the type of the parameter is supposed to be.
Return Values
Functions can also return values using the return keyword. The type of the return value is indicated either before the function's name (C-style) or using a trailing return type, like we've been using for main(). When a function doesn't return a value, its return type is void.
#include <iostream>
#include <sstream>
#include <string>
// --snip--
auto another_one(int const x, int const y) -> std::string {
auto ss = std::stringstream {};
ss << "x: " << x << ", y: " << y << "\n";
return ss.str();
}
auto main() -> int {
std::cout << "Main function!\n";
std::cout << another_one(7, 6);
return 0;
// --snip--
}
Overloading
In C++ you can overload functions of the same name to have different implementations as long as the parameter types of each function differ. This is because the parameter types are part of the function's symbol, and thus functions with the same name but different parameters (and possibly a different return type) are entirely different functions. Note that overloads cannot differ in their return type alone.
#include <iomanip>
#include <iostream>
#include <sstream>
#include <string>
// --snip--
auto another_one(int const x, int const y) -> std::string {
auto ss = std::stringstream {};
ss << "x: " << x << ", y: " << y << "\n";
return ss.str();
// --snip--
}
auto another_one(float const x, float const y) -> std::string {
auto ss = std::stringstream {};
ss << std::setprecision(4)
<< "x: "
<< x
<< ", y: "
<< y
<< "\n";
return ss.str();
}
auto main() -> int {
std::cout << "Main function!\n";
std::cout << another_one(7, 6);
std::cout << another_one(7.456575654f, 6.0f);
return 0;
// --snip--
}
This concept also extends to C++ operators, which can also be overloaded to provide custom functionality for custom types. Operators are overloaded using the operator keyword as the function name, suffixed with the operator we wish to overload. When written as free functions like below, overloads of binary operators take exactly two parameters while overloads of unary operators take exactly one.
#include <iostream>
#include <ostream>
#include <utility>
// --snip--
auto operator<<(std::ostream& os, std::pair<int, int> p) -> std::ostream& {
auto const [x, y] = p;
os << "x: " << x << ", y: " << y << "\n";
return os;
}
auto main() -> int {
auto const p = std::pair {7, 6};
std::cout << p << "\n";
return 0;
// --snip--
}
There are a few operators that cannot be overloaded, such as scope resolution (::), member access (.), member access through a pointer to member (.*) and the ternary conditional operator (?:).
Comments
Comments are a way to document code for other people, and yourself. In C++ there are two
types of comments, single line and multi-line. We've seen single line comments in many of
the previous examples but to reiterate, a single line comment is started with //
and
any text written after it until a newline is ignored by the compiler.
// Comment on its own line
auto const x = 5; // Comment
Multi-line comments are specified using /* */
quoting ie. the comment extends from
/*
comment opener and continues until */
. This allows comments to extend multiple
lines or be nested amongst code (if you really want).
/*
multi-line comment
another line
*/
auto const /* int */ x = 5;
Control Flow
Control flow is how we get our programs to do interesting things, it allows us to write programs that do different things depending on conditions (branch) or easily repeat code (loops). C++ also has various relational and logical operators used to construct conditional expressions used by the control flow statements. You can read about them in Appendix B.
Branches
if statements
An if statement is the simplest control flow structure. It allows us to execute a piece of code only if a condition is true. if statements are written using the if keyword followed by the conditional expression in parentheses. The code to execute is contained in braces, like function definitions.
#include <iostream>
// --snip--
auto main() -> int {
auto const x = 6;
if (x % 2 == 0) {
std::cout << "Even\n";
}
return 0;
// --snip--
}
We can add an alternative branch using the else keyword after the closing brace of the if block. This branch will run if the condition in the if statement is false.
#include <iostream>
// --snip--
auto main() -> int {
auto const x = 5;
if (x % 2 == 0) {
std::cout << "Even\n";
} else {
std::cout << "Odd\n";
}
return 0;
// --snip--
}
We can create multiple branches based on various conditions using else if statements. These are declared after the initial if statement.
#include <iostream>
// --snip--
auto main() -> int {
auto const x = 5;
if (x % 2 == 0) {
std::cout << "Even\n";
} else if (x % 5 == 0) {
std::cout << "5 multiple\n";
} else {
std::cout << "Odd\n";
}
return 0;
// --snip--
}
switch statements
switch statements are a way to mix control flow with enums. A switch statement is given an enum object which is then matched against different cases ie. the enum's variants. There is also a default case that is used if no other case matches, the equivalent of else from if statements.
The cases of a switch statement automatically fall through to the next case if you do not use a break statement to escape from the switch.
#include <iostream>
// --snip--
enum class colour : char {
red,
green,
blue
};
auto main() -> int {
auto const c = colour::red;
switch (c) {
case colour::red:
std::cout << "red\n";
break;
case colour::green:
std::cout << "green\n";
break;
case colour::blue:
std::cout << "blue\n";
break;
default:
std::cout << "unknown\n";
break;
}
return 0;
// --snip--
}
Because enums are fundamentally based on an underlying integral type, switch statements can work on any integral type like char or int; however, you have to be sure to cover all the cases you care about (or provide a default case), as there is no formal notion of pattern matching over integral ranges.
Loops
while loop
while loops are the fundamental looping construct in C++. A while loop will repeat as long as its condition remains true.
#include <iostream>
// --snip--
auto main() -> int {
auto i = 0uLL;
auto acc = 0uLL;
while (i < 10) {
acc += i;
i += 1;
}
std::cout << "Sum: " << acc << "\n";
return 0;
// --snip--
}
There is another while
loop called a do-while
loop. This has the same semantics as
a while
loop but the loop condition is checked at the end of the loop instead of at the
start. This has the effect of running the loop at least once.
#include <iostream>
// --snip--
auto main() -> int {
auto i = 0uLL;
auto acc = 0uLL;
do {
acc += i;
i += 1;
} while (i < 1);
std::cout << "Sum: " << acc << "\n";
return 0;
// --snip--
}
for loop
for loops further abstract the concept of loops by providing dedicated syntax for initializing the loop counter and incrementing it, unlike a while loop which only has syntax for checking the loop condition. We saw a for loop in our constexpr example.
#include <iostream>
// --snip--
auto main() -> int {
auto acc = 0uLL;
for (auto i = 0; i < 10; i++) {
acc += i;
}
std::cout << "Sum: " << acc << "\n";
return 0;
// --snip--
}
range-for loop
In C++11, we got another for loop called a range-for loop. This loop is able to automatically traverse C++ standard container types like std::array. This is beneficial as it prevents us from incorrectly accessing/traversing the container ie. indexing outside of the container's bounds.
#include <iostream>
#include <array>
// --snip--
auto main() -> int {
auto const a = std::array {1, 2, 3, 4, 5};
auto acc = 0uLL;
for (auto const x : a) {
acc += x;
}
std::cout << "Sum: " << acc << "\n";
return 0;
// --snip--
}
Ownership
Ownership of data and resources is vital to consider when writing complex and sophisticated programs in C++ (or any other systems level programming language) due to needing to manage resources like memory manually. Having a clear picture of who owns what data and who has access to it helps us write safer programs.
What is Ownership?
Ownership is the notion that some data is managed or owned by a particular variable, which is thus responsible for ensuring that its data lives long enough for all parts of the program that reference the data to access it correctly.
We first had a look at lifetimes in Common Concepts - Variables and Mutability when discussing storage duration of data but we are now going to discuss how this comes into effect in our programs.
You'll hear a lot about the stack and the heap when discussing C++ but what are they?
These are two regions of memory that your program can access during its execution. The stack is a fixed size region that is utilised automatically by your program. When variables are created, the data is pushed onto the top of the stack and the stack pointer is moved by the size in bytes of this newly pushed variable. When that variable is no longer referenceable ie. it goes out of scope, the value is popped off the stack, thus deleting the data. Data with automatic storage duration lives on the stack, and it is where all the variables we have shown in the previous examples have been allocated.
Function calls also interact with the stack in an interesting way. When a function is called the stack creates a new stack frame which encapsulates all the data created during the function call, as well as information about parameters and how to get back to the function's call site ie. parameter data and return address storage. This is done so that when a function does return, the entire stack frame can be popped off, deallocating all data created during the functions execution.
All in all, the stack is super fast and automatically allocates and deallocates memory for us, thus allowing the lifetime of variables to be computed by the compiler, not us! So why don't we always use the stack? We can't, because the stack is a fixed size and cannot grow beyond its original capacity, which usually isn't very large because our OS wants to allow lots of programs to be able to run at once.
This is where the heap comes in. The heap is slower but dynamic memory that our program requests at runtime. This allows us to create variable sized memory regions that we can grow and shrink as need be; however, this comes at the cost of having to manually return this memory back to the OS, otherwise it is leaked! This means we have to track the lifetime of the data we create and ensure it is freed correctly. Data of this kind is categorized as having dynamic storage duration.
Scope
Scopes define what set of symbols and objects are valid to reference in our program. We've encountered quite a few different uses of scope in our travels thus far, the obvious one being functions. Functions create an entirely new scope that isn't just semantic (i.e. only enforced by the compiler for correctness' sake) but has an effect on the execution of a program. When a function is called it allocates a new stack frame, meaning the lifetime of all data created in that function is bound to that function's lifetime.
We can also see scope with loop statements like for and range-for loops, as the initializer and iterator of each statement type, respectively, are only bound to the scope of the statement body. In fact, you can introduce an unnamed scope using a brace block.
{
auto const x = 5;
// do stuff with x
}
// x out of scope
So how do we share data? In C++, variables have copy semantics, which means that the data of an object is copied when we bind a new variable to an existing variable. We can see this in the example below, with y being assigned the value of x, not x itself, and thus the address of each object is unique.
#include <iostream>
// --snip--
auto main() -> int {
auto const x = 5;
auto const y = x;
std::cout << &x << "\n";
std::cout << &y << "\n";
return 0;
}
// --snip--
Notice the addresses of x and y are close together (4 bytes apart on a typical platform where int is 4 bytes). This is because they sit right next to each other on the stack, as we discussed above.
The std::string Type
So what happens when data on the heap goes out of scope? To demonstrate, we need to introduce the std::string type. string is more complex than the types introduced in Common Concepts - Data Types, as it allocates its data on the heap and can change its size during runtime, as opposed to string literals, which are encoded directly into the compiled binary. We even saw string in our guessing game!
String literals are declared using a pair of double quotes ("") to surround the text. They have an array-of-characters type (const char[N]) that readily converts to a pointer to the first character (const char*).
So how can we ensure that the memory allocated on the heap is automatically freed when the variable goes out of scope? Some languages use a Garbage Collector (GC) to clean up memory that is no longer in use. C++ does not have a GC, so it is our responsibility to identify when memory is no longer needed. Or is it? C++ uses a concept known as Resource Acquisition Is Initialization, or RAII. In essence, it is the idiom of binding the lifetime of a resource, like memory, to the variable or object that owns it, thus allowing the resource to be freed when the owning variable goes out of scope. This is how string, and every other standard library container, works.
#include <iostream>
#include <string>
// --snip--
auto main() -> int {
{
auto const s = std::string {"hello"};
// s is in scope
}
// s out of scope and data freed
return 0;
}
// --snip--
References and Moves
Reference Semantics
So how do dynamic objects like string interact with C++ copy semantics? Well, they obey the same rules: the data is copied into a new heap location, creating two distinct objects.
#include <iostream>
#include <string>
// --snip--
auto foo(std::string const s) {
std::cout << "Address of s: " << static_cast<const void*>(s.data()) << "\n";
}
auto main() -> int {
auto const s = std::string {"hello"};
std::cout << "Address of s: " << static_cast<const void*>(s.data()) << "\n";
foo(s);
return 0;
// --snip--
}
This is fine for primitive values that are small in size, e.g. int, bool etc., but a string can get really big, and copying its data every time, say when passing it to a function, takes \(O(n)\) time. What if we could refer to the same data without copying it? This is where references come into effect. As their name suggests, references allow us to refer to another object and use the reference as if it were that object. References are declared by suffixing an ampersand (&) to a type declaration on a variable or parameter.
#include <iostream>
#include <string>
// --snip--
auto foo(std::string const& s) {
std::cout << "Address of s: " << static_cast<const void*>(s.data()) << "\n";
}
auto main() -> int {
auto const s1 = std::string {"hello"};
auto const& s2 = s1; // s2 refers to s1, no copy is made
std::cout << "Address of s1: " << static_cast<const void*>(s1.data()) << "\n";
std::cout << "Address of s2: " << static_cast<const void*>(s2.data()) << "\n";
foo(s2);
return 0;
// --snip--
}
Binding a reference to another reference doesn't create a reference to a reference. References pass through to the object they refer to, so the new reference refers to the original object.
References have a few special semantics. For one, references, once bound, cannot be rebound and thus refer to the same object for the reference's lifetime. References also cannot refer to nothing; they must be bound at construction. This makes references very effective at sharing data safely; however, you do have to be careful, as C++ does not guarantee that a reference does not outlive the object it refers to. You can therefore end up with a dangling reference, which refers to a non-existent object and is invalid to use.
This is particularly important to consider when returning references from functions, as we, as programmers, must ensure the object being referred to is not cleaned up when the function returns.
#include <iostream>
#include <sstream>
#include <string>
// --snip--
auto foo(std::string const& s) -> std::string const& {
auto ss = std::stringstream {};
ss << "Address of s: " << static_cast<const void*>(s.data()) << "\n";
return ss.str(); // error: returning reference to temporary
}
auto main() -> int {
auto const s = std::string {"hello"};
std::cout << "Address of s: " << static_cast<const void*>(s.data()) << "\n";
std::cout << foo(s);
return 0;
// --snip--
}
cmake -S . -B build --preset=<platform>
cmake --build build
[ 50%] Building CXX object CMakeFiles/main.dir/main.cxx.o
/home/user/projects/ownership/main.cxx: In function ‘const std::string& foo(const std::string&)’:
/home/user/projects/ownership/main.cxx:9:18: error: returning reference to temporary [-Werror=return-local-addr]
9 | return ss.str(); // error: returning reference to temporary
| ~~~~~~^~
cc1plus: all warnings being treated as errors
gmake[2]: *** [CMakeFiles/main.dir/build.make:76: CMakeFiles/main.dir/main.cxx.o] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:83: CMakeFiles/main.dir/all] Error 2
gmake: *** [Makefile:91: all] Error 2
If you need to return something from a function that was created during the function call and won't exist beyond it, the return type should be a plain value, not a reference.
Move Semantics
C++ has another method for controlling data ownership called move semantics, which allows you to transfer ownership of data to another object. This leaves the previously owning object in a valid but unspecified state, which in practice is often its empty state. Contrary to the name, moves don't move data but rather transfer ownership of it. To make an object movable we need to turn it into what is called an xvalue expression, i.e. an object that is treated like a temporary value, such that the compiler can correctly resolve the move. This is done with the std::move() function found in the <utility> header.
#include <iostream>
#include <string>
#include <utility>
// --snip--
auto constexpr str_addr(std::string const& s) -> const void* {
return static_cast<const void*>(s.data());
}
auto main() -> int {
auto s1 = std::string {"hello this is a really long string"};
std::cout << sizeof(s1) << "\n";
std::cout << "String: " << s1 << " | addr: " << str_addr(s1) << "\n";
auto const s2 = std::move(s1);
std::cout << "String: " << s1 << " | addr: " << str_addr(s1) << "\n";
std::cout << "String: " << s2 << " | addr: " << str_addr(s2) << "\n";
return 0;
// --snip--
}
We have to make s1 non-const to see the behaviour described above. If s1 were const, emptying its stored data would mutate it and violate the invariant that s1 is const; thus moving from const data invokes a copy, not a move. This restriction exists because moves are not destructive in C++: a moved-from object such as s1 remains a valid object that we can still access. If moves were destructive, s1 would become an invalid object and the compiler would have to generate a diagnostic if we accessed it after moving from it.
The span and string_view types
string_view
Often we want to reference only part of a string. In the past we would use string::substr(); however, this returns a newly allocated string. In C++17 we got std::string_view, which is a reference to a series of characters that it does not own. string_view has almost all the same read-only operations as string, which makes it very versatile as a string substitute when needing to reference part of a string.
#include <iostream>
#include <string>
#include <string_view>
// --snip--
auto main() -> int {
auto s = std::string { "hello" };
auto sv = std::string_view { s.data() + 1, 3 };
std::cout << s << "\n";
std::cout << sv << "\n";
return 0;
// --snip--
}
The string::data()
method is used to get the address of the first element in a string
thus we can use it to get the starting address of our substring by offsetting it by the
correct number of characters as seen above.
We can also use string_view to handle string literals, the strings we create using double quotes (""). This makes string literals, which previously were just an address to the character data, much easier to use and much closer to strings, with the constraint that you cannot modify the text.
#include <iostream>
#include <string>
#include <string_view>
using namespace std::literals;
auto main() -> int {
auto sv1 = std::string_view { "hello" };
auto sv2 = "bye"sv;
std::cout << sv1 << "\n";
std::cout << sv2 << "\n";
return 0;
// --snip--
}
We can create strings and string_views from string literals using literal operators. Literal operators are suffixes you attach to a literal, like the u suffix that makes an integer literal unsigned, which can be used to construct a custom type from the literal. In this case, we can make a string or string_view using the s or sv literal operators respectively. These are found in the namespace std::literals, which we expose globally in the line above main().
Spans
We can generalise this notion of a view using the std::span type (C++20). Because spans are more general than a string_view, there are far fewer methods available; however, they still cover all you need when working with a generalised view (or span) of a contiguous data structure.
spans are used for similar reasons to string_view: to easily access subslices of a contiguous data structure (i.e. a subarray) or to adapt C-arrays into a safer type.
#include <iostream>
#include <array>
#include <span>
auto main() -> int {
// --snip--
auto a1 = std::array { 1, 2, 3, 4, 5 };
auto s1 = std::span { a1.data() + 1, 3 };
int a2[] = { 1, 2, 3, 4, 5 }; // C-array
auto s2 = std::span { a2 };
return 0;
}
// --snip--
You don't need to worry about why C-arrays are unsafe for the purposes of this book. In a nutshell C-arrays (and string literals for that matter) are very primitive structures that do not provide any guards from misuse.
This has the benefit of allowing clever uses of these structures for the sake of performance and optimization, which can be a good thing for systems languages, especially for the time period in which C came onto the scene. However, when learning a systems language, guards help ensure correct practices are followed and ingrained early in your journey so they do not become footguns in the future. This is why this book does not cover content from C, as C++ has offered many safer alternatives for decades.
Structures
A structure or struct is a way to aggregate or group related data together while giving each piece of data a distinct name, unlike tuples. In this chapter we'll explore how to define and instantiate structs, access data and call special functions called methods on instances of structs.
Variables of a struct are called member variables while functions are called methods. Together, these form the members of the struct.
Creating Structures
To declare a struct we use the struct keyword followed by the name of the new type. Members are defined inside curly braces using the same variable and function declaration syntax we have seen previously, although member variables do not need an initializer, and auto cannot be used to declare non-static member variables. The entire struct is capped by a semicolon.
#include <string>
struct Person {
bool alive;
std::size_t age;
std::string name;
std::string email;
};
auto main() -> int {
return 0;
}
We can then create an instance of the struct using an aggregate initializer. This is the process of giving concrete values to the member variables using a brace-initializer list. The values are given in the same order as the member variables are declared.
#include <string>
struct Person {
bool alive;
std::size_t age;
std::string name;
std::string email;
};
auto main() -> int {
auto const p = Person {
true,
23,
"John Doe",
"johnd@example.com"
};
return 0;
}
To access member variables we use the member access operator (.
). If your object is
not constant you can also assign new values to members through the dot operator.
#include <string>
struct Person {
bool alive;
std::size_t age;
std::string name;
std::string email;
};
auto main() -> int {
auto p = Person {
true,
23,
"John Doe",
"johnd@example.com"
};
p.email = "jdoe@sample.com";
return 0;
}
Functions can return structs just like builtin types. Here we have a function that
creates a Person
.
#include <string>
#include <string_view>
struct Person {
bool alive;
std::size_t age;
std::string name;
std::string email;
};
auto make_person(std::string_view const name, std::string_view const email) -> Person {
return Person {
true,
0,
std::string{ name },
std::string{ email }
};
}
auto main() -> int {
auto const p = make_person(
"John Doe",
"johnd@example.com"
);
return 0;
}
For simple structs like this, the compiler will generate a few constructors for us such as a default constructor and a copy constructor. These allow these simple types to be copied or constructed in a default state without having to specify this process ourselves.
#include <string>
#include <string_view>
struct Person {
bool alive;
std::size_t age;
std::string name;
std::string email;
};
auto make_person(std::string_view const name, std::string_view const email) -> Person {
return Person {
true,
0,
std::string{ name },
std::string{ email }
};
}
auto main() -> int {
auto const p1 = make_person(
"John Doe",
"johnd@example.com"
);
// Default construct
auto p2 = Person {};
// Copy
auto p3 = p1;
return 0;
}
We will explore constructors in Chapter 8 - Custom Types and how we can use them to control the initialization of our own types. We will also explore how to disable certain constructors to disallow certain behaviours from our types.
Using Structures
Let us explore how structs can be used in everyday programs. We are going to create a simple program to calculate operations on a 3D vector type.
#include <cmath>
#include <iostream>
auto magnitude(auto const x, auto const y, auto const z) -> double {
return std::sqrt(x * x + y * y + z * z);
}
auto main() -> int {
auto const x = 2.;
auto const y = 3.;
auto const z = 5.;
std::cout << "The magnitude of the vector is "
<< magnitude(x, y, z)
<< "units.\n";
return 0;
}
Refactoring with Tuples
We can make this code more concise by packing the data into a tuple. This allows the
type signature of magnitude()
to be much simpler; taking a single parameter, and
ensures all our data is collected together. However, using a tuple leaves room for
ambiguity in which piece of data has which meaning as none of the elements have names.
#include <cmath>
#include <iostream>
#include <tuple>
using vec3 = std::tuple<double, double, double>;
auto magnitude(vec3 const vec) -> double {
auto const& [x, y, z] = vec;
return std::sqrt(x * x + y * y + z * z);
}
auto main() -> int {
auto const v = vec3 { 2., 3., 5. };
std::cout << "The magnitude of the vector is "
<< magnitude(v)
<< "units.\n";
return 0;
}
- The line starting with the using keyword is used to introduce a type alias. This allows us to define a shorter name for a type we are using frequently. This is particularly useful for tuples, as it lets readers distinguish two tuples of the same underlying types but with different purposes.
- We could also have used std::make_tuple() to create our tuple object in main(); however, I used the brace-initialized form with the type alias to make it clearer what type v is supposed to be.
Refactoring with structs
We can add more meaning by creating a vec3 struct with named x, y and z data members. Now our magnitude() function is able to access the member variables by name.
#include <cmath>
#include <iostream>
struct vec3 {
double x;
double y;
double z;
};
auto magnitude(vec3 const vec) -> double {
return std::sqrt(vec.x * vec.x
+ vec.y * vec.y
+ vec.z * vec.z);
}
auto main() -> int {
auto const v = vec3 { 2., 3., 5. };
std::cout << "The magnitude of the vector is "
<< magnitude(v)
<< "units.\n";
return 0;
}
Methods
As discussed before, methods are functions that are called on instances of a struct. This allows the method to access the member variables of the struct and just like regular functions we can pass parameters and return values from methods.
Defining Methods
Let's change our example program from before to use methods instead of a free function. We define methods within the struct's curly braces just like regular functions and call them using the dot syntax on an instance of the struct.
#include <cmath>
#include <iostream>
struct vec3 {
double x;
double y;
double z;
auto magnitude() const -> double {
return std::sqrt(x * x + y * y + z * z);
}
};
auto main() -> int {
auto const v = vec3 { 2., 3., 5. };
std::cout << "The magnitude of the vector is "
<< v.magnitude()
<< "\n";
return 0;
}
The const after the parameter list and before the trailing return arrow does not mean the return type is constant, but rather indicates that this method does not modify the member variables of this vec3 instance and thus can be used on const instances.
this keyword
Implicitly, all methods are passed an argument called this, which is a pointer to the instance of the struct the method was called on. this can be omitted in most cases, with variables automatically being looked up in the struct instance; however, if the name lookup is ambiguous, i.e. there is a parameter of the same name, then you will need to access the member variable through this. Because this is a pointer, you cannot use the dot operator but must use the -> operator to dereference the pointer.
#include <cmath>
#include <iostream>
#include <sstream>
#include <string>
struct vec3 {
double x;
double y;
double z;
auto magnitude() const -> double {
return std::sqrt(x * x + y * y + z * z);
}
auto normalized() const -> vec3 {
auto const sz = this->magnitude();
return vec3 { x / sz, y / sz, z / sz };
}
// Helper method for stringifying vec3
auto to_string() const -> std::string {
auto ss = std::stringstream {};
ss << "{ "
<< x
<< ", "
<< y
<< ", "
<< z
<< " }";
return ss.str();
}
};
auto main() -> int {
auto const v = vec3 { 2., 3., 5. };
std::cout << "The magnitude of the vector is "
<< v.magnitude()
<< "\n";
auto const n = v.normalized();
std::cout << "Vector v: "
<< v.to_string()
<< " is: "
<< n.to_string()
<< "\n";
return 0;
}
We will discuss pointers properly and in detail in Chapter 13 - Memory but for now, think of pointers as like references but closer to a hardware concept.
Taking Parameters
As stated before, we can declare parameters for methods so that they can take arguments. Parameters are declared the same way as with free functions.
#include <cmath>
#include <iostream>
#include <sstream>
#include <string>
struct vec3 {
double x;
double y;
double z;
auto magnitude() const -> double {
return std::sqrt(x * x + y * y + z * z);
}
auto normalized() const -> vec3 {
auto const sz = this->magnitude();
return vec3 { x / sz, y / sz, z / sz };
}
auto dot(vec3 const& u) const -> double {
return x * u.x + y * u.y + z * u.z;
}
// Helper method for stringifying vec3
auto to_string() const -> std::string {
auto ss = std::stringstream {};
ss << "{ "
<< x
<< ", "
<< y
<< ", "
<< z
<< " }";
return ss.str();
}
};
auto main() -> int {
auto const v = vec3 { 2., 3., 5. };
std::cout << "The magnitude of the vector is "
<< v.magnitude()
<< "\n";
auto const n = v.normalized();
std::cout << "Vector v: "
<< v.to_string()
<< " is: "
<< n.to_string()
<< "\n";
auto const u = vec3 { 2., -3., 5. };
std::cout << "Dot product of v: "
<< v.to_string()
<< " and u: "
<< u.to_string()
<< " is: "
<< v.dot(u)
<< " units \n";
return 0;
}
Operator Overloading
Just like we can define overloaded operators as free functions, we can define overloaded operators within a struct; however, the left-hand argument is always the struct instance the operator is defined on.
#include <cmath>
#include <iostream>
#include <sstream>
#include <string>
struct vec3 {
double x;
double y;
double z;
auto magnitude() const -> double {
return std::sqrt(x * x + y * y + z * z);
}
auto normalized() const -> vec3 {
auto const sz = this->magnitude();
return vec3 { x / sz, y / sz, z / sz };
}
auto dot(vec3 const& u) const -> double {
return x * u.x + y * u.y + z * u.z;
}
auto operator*(vec3 const& u) const -> double {
return this->dot(u);
}
// Helper method for stringifying vec3
auto to_string() const -> std::string {
auto ss = std::stringstream {};
ss << "{ "
<< x
<< ", "
<< y
<< ", "
<< z
<< " }";
return ss.str();
}
};
auto main() -> int {
auto const v = vec3 { 2., 3., 5. };
std::cout << "The magnitude of the vector is "
<< v.magnitude()
<< "\n";
auto const n = v.normalized();
std::cout << "Vector v: "
<< v.to_string()
<< " is: "
<< n.to_string()
<< "\n";
auto const u = vec3 { 2., -3., 5. };
std::cout << "Dot product of v: "
<< v.to_string()
<< " and u: "
<< u.to_string()
<< " is: "
<< v * u
<< " units \n";
return 0;
}
If we want to reorder the parameters of an operator on our struct but keep the definition all together, we can use the friend keyword to create a free function in a struct's definition. This also allows the friend function to access the members of the struct instance. The friend keyword becomes more relevant when discussing Access Modifiers in Chapter 8.
#include <cmath>
#include <iostream>
#include <ostream>
struct vec3 {
double x;
double y;
double z;
auto magnitude() const -> double {
return std::sqrt(x * x + y * y + z * z);
}
auto normalized() const -> vec3 {
auto const sz = this->magnitude();
return vec3 { x / sz, y / sz, z / sz };
}
auto dot(vec3 const& u) const -> double {
return x * u.x + y * u.y + z * u.z;
}
auto operator*(vec3 const& u) const -> double {
return this->dot(u);
}
friend auto operator<<(std::ostream& os, vec3 const& v) -> std::ostream& {
// A friend is a free function, so members are accessed through the parameter v
os << "{ "
<< v.x
<< ", "
<< v.y
<< ", "
<< v.z
<< " }";
return os;
}
};
auto main() -> int {
auto const v = vec3 { 2., 3., 5. };
std::cout << "The magnitude of the vector is "
<< v.magnitude()
<< "\n";
auto const n = v.normalized();
std::cout << "Vector v: "
<< v
<< " is: "
<< n
<< "\n";
auto const u = vec3 { 2., -3., 5. };
std::cout << "Dot product of v: "
<< v
<< " and u: "
<< u
<< " is: "
<< v * u
<< " units \n";
return 0;
}
Static Functions
We can also declare static methods on a struct, which do not operate on an instance but are simply bound to the struct itself. We declare static methods with the static keyword and call them through the struct's name, e.g. vec3::unit_x().
#include <cmath>
#include <iostream>
#include <ostream>
struct vec3 {
double x;
double y;
double z;
static auto unit_x() -> vec3 {
return vec3 { 1., 0., 0. };
}
auto magnitude() const -> double {
return std::sqrt(x * x + y * y + z * z);
}
auto normalized() const -> vec3 {
auto const sz = this->magnitude();
return vec3 { x / sz, y / sz, z / sz };
}
auto dot(vec3 const& u) const -> double {
return x * u.x + y * u.y + z * u.z;
}
auto operator*(vec3 const& u) const -> double {
return this->dot(u);
}
// Stream insertion operator for printing vec3
friend auto operator<<(std::ostream& os, vec3 const& v) -> std::ostream& {
os << "{ "
<< v.x
<< ", "
<< v.y
<< ", "
<< v.z
<< " }";
return os;
}
};
auto main() -> int {
auto const v = vec3 { 2., 3., 5. };
std::cout << "The magnitude of the vector is "
<< v.magnitude()
<< "\n";
auto const n = v.normalized();
std::cout << "Vector v: "
<< v
<< " is: "
<< n
<< "\n";
auto const u = vec3::unit_x(); // static method called on the struct itself
std::cout << "Dot product of v: "
<< v
<< " and u: "
<< u
<< " is: "
<< v * u
<< " units \n";
return 0;
}
Summary
While this chapter has only a handful of pages, we covered a lot of new features and syntax, from defining and creating structs to attaching methods to structures, and even static methods!
Appendix
Useful info about C++ that doesn't fit into the model of the book.
A - Keywords
This is the list of keywords reserved by C++. This means these words cannot be used as an identifier for variables, functions, class/struct member names etc. Some are reserved with no current use case or only a deprecated one.
Currently in Use
Keyword | Description |
---|---|
alignas (C++11) | |
and | |
and_eq | |
asm | |
auto | |
bitand | |
bitor | |
break | |
case | |
catch | |
class | |
compl | |
concept (C++20) | |
const | |
consteval (C++20) | |
constexpr (C++11) | |
constinit (C++20) | |
continue | |
co_await (C++20) | |
co_return (C++20) | |
co_yield (C++20) | |
decltype (C++11) | |
default | |
do | |
double | |
else | |
enum | |
explicit | |
export | |
extern | |
false | |
float | |
for | |
friend | |
goto | |
if | |
inline | |
mutable | |
namespace | |
noexcept (C++11) | |
not | |
not_eq | |
nullptr (C++11) | |
operator | |
or | |
or_eq | |
private | |
protected | |
public | |
register | |
requires (C++20) | |
return | |
signed | |
sizeof | |
static | |
static_assert (C++11) | |
struct | |
switch | |
template | |
this | |
thread_local (C++11) | |
throw | |
true | |
try | |
typedef | |
typename | |
union | |
unsigned | |
using | |
virtual | |
void | |
volatile | |
while | |
xor | |
xor_eq |
Reserved In Specific Contexts
These keywords are reserved in specific circumstances, like in a class declaration etc.
Keyword | Description |
---|---|
final (C++11) | Specifies virtual member function cannot be overridden in child class. |
override (C++11) | Specifies virtual member function definition overrides parent definition. |
import (C++20) | Module import declaration. |
module (C++20) | Module and module fragment declaration. |
Reserved for Future Use
These keywords are reserved for experimental features being tested in a Technical Specification.
Keyword | Technical Specification | Description |
---|---|---|
atomic_cancel | Transactional Memory (TM) TS | Starts atomic block that will restore data modified during atomic block for some exception types, otherwise it will call std::abort . |
atomic_commit | Transactional Memory (TM) TS | Starts atomic block that commits data changes regardless of exceptions being thrown. |
atomic_noexcept | Transactional Memory (TM) TS | Starts atomic block that will call std::abort if exception is thrown within the block. |
reflexpr | Reflection TS | Provides meta info about an object by returning a meta-object. |
synchronized | Transactional Memory (TM) TS | Starts a synchronized block |
transaction_safe | Transactional Memory (TM) TS | Indicates that a function is transaction-safe. |
transaction_safe_dynamic | Transactional Memory (TM) TS | Indicates that a virtual function is transaction-safe. |
B - Operators
This page is a high level overview of C++ operators and other symbols and what they do.
- ✅ - Fully overloadable
- ☑️ - Overloadable with constraints
- ⚠️ - Overloadable but not recommended
- ❌ - Not overloadable
Basic Operators
Operator | Example | Description | Overloadable |
---|---|---|---|
+ | +expr | Arithmetic unary plus | ✅ |
+ | expr + expr | Arithmetic addition | ✅ |
++ | ++expr | Prefix increment | ✅ |
++ | expr++ | Postfix increment | ✅ |
+= | var += expr | Arithmetic addition and assignment | ✅ |
- | -expr | Arithmetic negation | ✅ |
- | expr - expr | Arithmetic subtraction | ✅ |
-- | --expr | Prefix decrement | ✅ |
-- | expr-- | Postfix decrement | ✅ |
-= | var -= expr | Arithmetic subtraction and assignment | ✅ |
* | *expr | Pointer dereference | ☑️ |
* | expr * expr | Arithmetic multiplication | ✅ |
*= | var *= expr | Arithmetic multiplication and assignment | ✅ |
/ | expr / expr | Arithmetic division | ✅ |
/= | var /= expr | Arithmetic division and assignment | ✅ |
% | expr % expr | Arithmetic remainder | ✅ |
%= | var %= expr | Arithmetic remainder and assignment | ✅ |
~ | ~expr | Bitwise Complement | ✅ |
& | &expr | Address of | ✅ |
& | type ident& , type ident const& | Reference type | ❌ |
& | expr & expr | Bitwise AND | ✅ |
&= | var &= expr | Bitwise AND and assignment | ✅ |
&& | expr && expr | Logical AND | ☑️ |
\| | expr \| expr | Bitwise OR | ✅ |
\|= | var \|= expr | Bitwise OR and assignment | ✅ |
\|\| | expr \|\| expr | Logical OR | ☑️ |
^ | expr ^ expr | Bitwise XOR | ✅ |
^= | var ^= expr | Bitwise XOR and assignment | ✅ |
<< | expr << expr | Bitwise left shift | ✅ |
<<= | var <<= expr | Bitwise left shift and assignment | ✅ |
>> | expr >> expr | Bitwise right shift | ✅ |
>>= | var >>= expr | Bitwise right shift and assignment | ✅ |
! | !expr | Logical NOT | ✅ |
== | expr == expr | Equality comparison | ✅ |
!= | expr != expr | Inequality comparison | ✅ |
< | expr < expr | Less than | ✅ |
<= | expr <= expr | Less than or equal | ✅ |
> | expr > expr | Greater than | ✅ |
>= | expr >= expr | Greater than or equal | ✅ |
<=> | expr <=> expr | Three way comparison | ✅ |
[] | expr[expr, expr, ..] | Subscript / array indexing (multi-argument since C++23) | ✅ |
() | expr(expr, expr, ..) | Function object invocation | ✅ |
, | expr, expr | Comma sequencing | ⚠️ |
= | var = expr , ident = expr | Assignment / Binding | ☑️ |
?: | expr ? expr : expr | Ternary expression | ❌ |
:: | ident::ident, ident::var | Namespace lookup | ❌ |
... | typename types... , type T... , T... args | Parameter type and value packs | ❌ |
. | expr.ident | Member access | ❌ |
.* | expr.*ident | Member access to pointer members | ❌ |
-> | expr->ident | Member access through a pointer | ☑️ |
->* | expr->*ident | Member access through a pointer to pointer members | ☑️ |
"" | literal_suffix-ident | User defined literal | ☑️ |
Memory Operators
Operator | Example | Description |
---|---|---|
new | new type (init-list) | Allocate a heap memory object constructed with parameters in init-list |
new [] | new type[size] {init-list} | Allocate a heap memory block initialized with elements in init-list |
delete | delete expr | Delete heap memory object |
delete [] | delete [] expr | Deletes heap memory block |
Type Casting Operators
Operator | Example | Description |
---|---|---|
static_cast | static_cast<T>(expr) | Casts expr to type T |
dynamic_cast | dynamic_cast<T>(expr) | Casts pointers and references to classes up, down and sideways through inheritance hierarchy |
reinterpret_cast | reinterpret_cast<T>(expr) | Casts expr to type T by reinterpreting underlying bits of expr |
const_cast | const_cast<T>(expr) | Can cast to or away const when type of expr and T are similar types |
C-cast | (type)expr | Legacy type cast from C, uses a combination of above casts |
Other Operators
Operator | Example | Description |
---|---|---|
sizeof | sizeof(expr) , sizeof(type) | Obtains the size in bytes of a type or expression |
sizeof... | sizeof...(pack-expr) , sizeof...(pack-type) | Obtains the number of elements of a parameter pack |
typeid | typeid(expr) , typeid(type) | Obtains compiler representation of a type |
noexcept | noexcept(expr) | Checks at compile time whether an expression is declared to not throw exceptions |
alignof | alignof(type) | Obtains the alignment required by a type |