In this tutorial, you will:
- learn the basics of compiling C++
- learn how to write a simple makefile
A short aside on compilers
In this class, we’re hoping to use (at least mainly) the clang C++ compiler. Clang is a compiler frontend based on LLVM, a project that began here at UIUC, and is generally considered more modern (with more informative error messages), while being a mostly drop-in replacement for gcc. It is the default C/C++ compiler on Apple systems, and is becoming increasingly popular in both industrial and everyday use. In previous semesters, we taught this course exclusively with the gcc C++ compiler. This tutorial uses clang throughout, and you are encouraged to follow suit. However, the equivalent gcc command will be provided as well, for historical reasons and your interest.
Take note of the difference between a language and a compiler: a language is defined by a standard, and a compiler is one implementation of that standard. (Fun fact: neither the gcc C++ compiler nor the clang C++ compiler is actually fully C++ standard compliant.) In practice, in this class, the differences should not overly concern you. However, if you run both of the paired clang/gcc commands below, such as the ones that invoke their respective preprocessors, you may find that they do in fact behave differently internally.
Introduction to compilation
From your CS 225 git directory, run the following on EWS:
git fetch release
git merge release/maketutorial -m "Merging initial maketutorial files"
cd maketutorial/hello
If you’re on your own machine, you may need to run:
git fetch release
git merge --allow-unrelated-histories release/maketutorial -m "Merging initial maketutorial files"
cd maketutorial/hello
(Make sure you’ve followed the directions on the Course Setup page to check out your GitHub repository first.)
Open up the file hello_world.cpp in your favourite text editor, and let’s walk through what it’s doing.
- The first line includes the library iostream, the standard input/output (i/o) streams library. It’s not important that you understand it intimately, but you’ll use it a lot in the near future. More relevantly, it’s useful for the upcoming educational example on running the macro preprocessor.
- Next, on line 3, we’re defining a function called main that returns an int (in main’s case, that’s a return code that usually indicates whether the run was successful; we didn’t write a return statement, but in this case, the return is implicit), and takes no parameters (the empty parens).
- On line 4, there’s a helpful and informative one-line comment.
- Line 5 is the line that actually does the work. std::cout is an object (an output stream, not a function) from the library iostream that allows us to print something to the standard out stream. << is the insertion operator (you’ll learn more about operators later; all you need to know now is that this is print statement syntax), and the string after that is what we’re printing to standard out.
All in all, a very bare-bones Hello, world implementation.
Let’s try compiling it manually:
clang++ hello_world.cpp -o hello
or (note that the syntax is the same):
g++ hello_world.cpp -o hello
The -o flag tells the compiler to give the executable an alternative name. Otherwise, the default name is a.out. Run the program with:
./hello
The ./ simply tells your shell to search the current directory for the executable, rather than its normal executable paths. If all goes well, you should see Hello, world! printed as output. But now let’s try to get a little more in-depth. You can get rid of the executable you made by typing:
rm hello
Listing the directory with ls should verify its disappearance. Run the following command:
clang++ -save-temps hello_world.cpp -o hello
g++ -save-temps hello_world.cpp -o hello
The -save-temps flag tells the compiler to retain the temporary files it makes when we compile our program… so we can look at them! Listing the contents of your current directory should yield four new files: naturally the executable hello, but also hello_world.ii, hello_world.s, and hello_world.o, the temporary files we asked the compiler to save, and our guides into the slightly more technical aspects of basic compilation.
Running the macro preprocessor: What is hello_world.ii?
Run the following line:
clang++ -E hello_world.cpp -o preprocessed.ii
g++ -E hello_world.cpp -o preprocessed.ii
If all goes well, preprocessed.ii will contain a large amount of somewhat unintelligible code (leave off the -o option and it is printed to your terminal instead), but at the bottom, there’s the code for our Hello, world program (with the comment stripped out). So what did the preprocessor do?
All it really did for this program was replace our “include” directive (#include <iostream>) with the actual text of the library we included (and, of course, strip the comment out).
What does that actually mean? Well, if you were capable of compiling this
program at all, somewhere on the machine (be it virtual, remote, or physically
present) that compiled it, there exists a file called iostream that
contains the C++ code that implements the i/o streams library. If you were
using clang, it will be located in the directory where the library libc++
(libcxx) is installed. If you were using gcc, it’s in the directory where
libstdc++ (libstdcxx) is installed. Don’t worry about the specific libraries,
it doesn’t really matter, but if you were so inclined, you would be able to
find the code on your own machine. There is no magic involved here.
Back to the preprocessed code. In this case, the only included library was
iostream, but it would do exactly the same thing for any other included
library. If you had a million include directives, it would go through those
millions of lines, find each file you referenced, and tack it to your program,
so that when you referenced a function or class defined in one of those
standard library files, it would make sense to the compiler, like std::cout
in this case, which is an object defined in iostream that you wouldn’t have
been able to use without including the code. Of course the preprocessor has
plenty of other jobs as well, but we won’t cover them now.
Question: Why did we enclose the library name, iostream, in angle brackets?
It’s not just so our code looks cooler: we could have said
#include "iostream" too (feel free to try it out), so what’s the difference? The
difference (in clang and gcc) is that using angle brackets specifies that the
preprocessor should look in the standard compiler include paths, and quotes
tell it to search the current directory first, and via the standard paths only
if that fails. Note that the true standard definition is a little more
complicated than this: technically, both behave in an “implementation-defined
manner” (any implementation could treat that differently if it so wished) but
that’s not very important for us.
Now take a look at hello_world.ii, one of the temporary files we saved earlier.
Look familiar? That’s the output file the preprocessor dumped, and it is identical to the output you saw when you ran the preprocessor yourself. This is the file that the compiler really compiles, not your plain, unpreprocessed source file.
If you want to be sure, try running:
diff hello_world.ii preprocessed.ii
diff returns no output if the files it’s comparing are identical. Make sure
hello_world.ii and preprocessed.ii were produced by the same compiler.
The actual compilation step: What is hello_world.s?
Now let’s take a look at the next temporary file. Print the contents of hello_world.s.
For those of you who have seen assembly code before, the output should be
recognisable. If you haven’t, assembly is the low-level intermediate between
normal, higher-level programming languages like C++, and the machine code that
your computer actually executes. In this case, the compiler (this is the step
of compilation that’s actually called compilation) has translated the
preprocessed source code from C++ to assembly, and dumped the output as
hello_world.s. Let’s ask our compiler to directly compile the code that we
preprocessed into assembly code:
clang++ -S preprocessed.ii -o compiled.s
g++ -S preprocessed.ii -o compiled.s
Use diff to verify that the files are the same (again, remember to make sure
hello_world.s and compiled.s were produced by the same compiler):
diff hello_world.s compiled.s
If you used gcc, there shouldn’t be any differences. With clang, the only line that should be different is a line stating what preprocessed file the assembly was generated from.
Question: Why don’t we just write everything in assembly language? Well, for one, it’s kind of annoying to write all the time, and higher level ideas are harder to keep abstract without our human-friendly programming languages. Perhaps more importantly, assembly isn’t portable in the slightest. Assembly languages are specific to a specific architecture, so what assembles and runs on my machine may not run without alteration on yours. That’s pretty annoying, and compilers work pretty well, so most people normally leave the assembly to them.
Assembly: What is hello_world.o?
The next step is assembling the code. That just means translating the assembly in hello_world.s into machine-readable code. That’s known as object code, and the standard suffix for object code is .o. You’re likely to see quite a few .o files as you continue in this course. That doesn’t mean you have to read them, though. If you try printing the contents of hello_world.o, you’ll fast realise it would be a somewhat unrealistic expectation anyway.
If you want to ask your compiler to assemble your assembly code, you can do this:
clang++ -c compiled.s -o assembled.o
g++ -c compiled.s -o assembled.o
Linking: Generating the final executable.
Linking is the final step, and arguably the most important and relevant to you. It’s the part you’ll interact with most, and besides perhaps flat out failure to compile at all, it’s the part of compiling you’ll be most confused by, particularly at the beginning of this class, when you’re responsible for all of your own compilation. Linking problems are some of the most notorious issues people have early on in this class… so pay attention to it, and perhaps you will be spared the “undefined reference” trauma.
Hint for the future: “Undefined reference” errors are pretty much always linking errors, and you will probably have them. Remember this.
All a linker does is take all the object files tossed out by the assembling
step, and join them together into a single executable—in this case, the file
hello which you ran earlier. We only have one object file in our Hello, world
program, so this linking process is very uninteresting, but very soon (like,
later in this tutorial), you’ll be dealing with multiple object files.
Run the following, to have our compiler link our object file and output our executable:
clang++ assembled.o -o hello_manual
g++ assembled.o -o hello_manual
Feel free to verify that it does exactly the same thing as our original hello.
Congratulations, you’ve just compiled your own miniature program!
Dealing with multiple object files
Let’s visit the example directory animals:
cd ../animals/
ls
The files you’ll see listed are dog.cpp, dog.hpp, and main.cpp. Feel free to check out the source code. dog.hpp is a C++ header file, what we’d call the definition of the Dog class, and dog.cpp is a source file, the implementation for said class. You’ll become more familiar with the details of that relationship as the class moves on, but right now, just know that together, they make the Dog class. main.cpp might look more familiar to you. It’s a lot like hello_world.cpp from the last exercise, in that it has some includes and it has an executable main function. In that function, it calls a constructor for the class Dog, and asks the object it creates to do a number of things. But including the Dog header file doesn’t actually make the implementation in dog.cpp available. First, compile the main object file:
clang++ -c main.cpp -o main.o
g++ -c main.cpp -o main.o
Then, try compiling dog_program from it:
clang++ main.o -o dog_program
g++ main.o -o dog_program
That’s what we did before for our Hello, world program, so what happened this time? You got a bunch of “undefined reference” errors, and if you remember what we said a few paragraphs up, “undefined reference” errors are pretty much always linking errors. The linker (which the compiler runs for us) is telling us that it doesn’t know what the function Dog::bark() (or any Dog function) does, because that information isn’t in main.cpp. The solution is to compile a separate object file for the Dog class. In general, you’ll have one object file per source file, compiled together with its header file (.hpp) and other necessary dependencies. So let’s compile an object file for the Dog class:
clang++ -c dog.cpp
g++ -c dog.cpp
You’ll see that it added a new file called dog.o, the object file for the Dog class. (If you include the header in the compilation, you’ll also see a dog.hpp.gch file. The .gch file is a precompiled header; all that means is that in the future, when fulfilling an #include directive, the precompiled header is preferentially used.) So now if we wanted to compile these together, we would do this:
clang++ dog.o main.o -o dog_program
g++ dog.o main.o -o dog_program
And that should complete just fine. Try running it like so:
./dog_program
But what happens if we change something? If we just change something in main.cpp, like the Dog’s name, we have to recompile main.o and run that final linking command again, and that’s easy enough. But if we change something in the Dog class itself, like adding a new function, or changing an implementation, we have to recompile the Dog object file (and main.o too, if dog.hpp changed), and then link it back to the main object file.
That may not seem like a big deal now, but it gets annoying extremely fast when
you have more than a single tiny class.
Introducing the program make
Those of you with some experience in compilation are probably aware of a common Unix utility called make. It’s a program extremely widely used on Unix based systems (Microsoft also has a Visual Studio spinoff called nmake) to build executable program files from source files. (Don’t let the “expected use” case fool you, though: make is not a program limited to the narrow realm of compilation, as you’ll see before this tutorial is over.)
The best instruction is by example, so let’s build a basic Makefile for our dog_program. Open a file called Makefile (make sure it’s titlecase; make will recognise the lowercase makefile as well, but our autograder won’t, so it’s good to get into the habit now) with your preferred text editor (mine is emacs, yours may not be, so replace “emacs” with your editor of choice if you prefer):
emacs Makefile
Note that you won’t see the new file in your directory until you save it.
Makefile rules are written in the format:
target : tgt_dependency1 tgt_dependency2 ...
	command
So if our target is dog.o, what are the dependencies (the files needed to make the target)? They’re dog.cpp and dog.hpp, of course. And the command is the same as the one we used to compile the object file to begin with. So our rule for dog.o, the dog object file, will look like this:
dog.o : dog.cpp dog.hpp
	clang++ -c dog.cpp
Copy that into your new Makefile, and save it (for the makefile examples, I won’t explicitly give you the gcc equivalents, but if you want to use gcc instead, just replace all references to clang++ with g++). Now let’s write a rule for main.o:
main.o : main.cpp
	clang++ -c main.cpp
Tabbing in makefiles
Remember: the tab is very important. If you don’t tab the second line of a rule, you’ll get the error “*** missing separator. Stop.” Don’t forget your tabs!
You can remove everything in the directory besides the source files and the Makefile for the demonstration to have any real effect, and then run make:
rm dog.o dog_program
make
If you run ls now, you’ll see that it’s built the target dog.o (and left the precompiled header as well). But what is make actually doing?
An aside about the order in which make interprets makefiles
make will search the current directory for a file called Makefile or
makefile (again, for your sanity and grades, please only use
titlecase, with a capitalised
M). If it finds one, it will execute the first
rule in the file, and if one of the dependencies of the first target does not
yet exist, it will search for a rule that creates it. So for example, if I have
a makefile like so:
animal_assembly : moose goose cat
	command

moose : antlers hooves fur
	command

goose : beak wings webbed_feet interest_in_bread
	command

cat : whiskers evil_personality
	command
make, when called with no arguments, will attempt to build the target animal_assembly. Assuming the dependencies moose, goose, and cat are already available in the directory (and up to date), it will not run the rules for them, and will build animal_assembly from what’s present. If moose and cat are present but goose is not, it will note that moose is present, see that goose is not present, look for a rule to build goose, find the rule, build goose, and then note that cat is present and build animal_assembly. If none of moose, goose, and cat are present, it will have to build all of them using the rules available.
But what if you put the target for moose first?
moose : antlers hooves fur
	command

animal_assembly : moose goose cat
	command

goose : beak wings webbed_feet interest_in_bread
	command

cat : whiskers evil_personality
	command
Well, then if make is called with no arguments, it will make the target moose and stop. If you wanted it to make animal_assembly, you would then have to call it like so:
make animal_assembly
So a good rule of thumb is to put the final and most important rule (for our purposes, the one that finally links the object files together into an executable) at the top.
Now back to our dog example. For our dog program, what the above means is that we should put the rule for the whole program at the top. How should we write it? Well, perhaps as you’d expect at this point:
dog_program : dog.o main.o
	clang++ dog.o main.o -o dog_program
Put that at the top of your makefile, save it, and run make again.
Now you should see the executable
dog_program, which should behave as it has
in all previous post-compilation incarnations.
Now let’s do one final thing. In general, you should do this when writing your Makefiles, but it’s especially useful for instructive purposes: we’ll add a clean rule:
clean :
	rm dog_program *.o
Add that to the bottom of your Makefile (as long as it’s not the top, it doesn’t really matter, but in long Makefiles, you want to separate the clean targets from real compilation-relevant targets for clarity), save it, and run make again, passing clean as an argument to invoke the clean rule:
make clean
ls
What happened? We’ve deleted all of the executables and compilation byproducts that we created, to clean up the directory. But the most notable thing about this rule compared to the others we’ve seen is that it a.) lacks dependencies and b.) doesn’t perform anything compilation-related in its command. Let’s talk about those two things a bit.
The dependency list
The dependency list you write for a target exists so that make knows what other targets to ensure you have before you run the command, but if the targets are guaranteed to be present and make isn’t responsible for updating them, make technically doesn’t need to check for anything. (It does not parse the actual command you give it, so it will not know what files to look for based on that.) Try deleting the dependency list of the target dog.o, and then running:
make clean
make dog.o
As long as dog.cpp and dog.hpp are present in the directory, and make doesn’t have to rebuild them individually when they change (as it does for dog.o), you will never have errors when compiling that line. But if you deleted the dependency list for the target dog_program and ran:
make clean
make
make will output an error that the recipe for target 'dog_program' failed, because dog.o was not in the dependency list, and make therefore did not check to make sure it existed. As such, it didn’t bother to build it. As for including dependencies that make will never have to build (such as .hpp and .cpp files), well, it’s simply good practice to document the dependencies of each target thoroughly. More practically, make compares each listed dependency’s modification time against the target’s, so listing dog.cpp is what lets it know to rebuild dog.o after you edit dog.cpp. It’s cleaner for other people to read, and it’s a good way for you to confirm that you’re doing what you wanted to do, particularly late at night when the lines start to blur together. And now onto the second point: the command.
make will run anything you ask it to, because it’s not as smart as you think it is
This is what we were referring to earlier, when we said
make was not limited
to compilation-related commands. Let’s move over to a different directory, for
make-related messing about.
cd ../file_meddling/
ls
As you can see, the Makefile is currently the only thing in this directory. It’s a very small and simple one, so open it up with your favourite text editor, and try guessing what it will do. It’s not compilation; it’s something altogether much sillier. When you have your prediction, execute make.
And now there’s a new file in the directory. Printing the contents of silly_file will yield the somewhat accurate phrase “Hello, there is nothing important here”. I say somewhat because while the file and indeed the phrase itself are completely unimportant, the concept is, in fact, important. make is not a magical program that intuits the mysterious delicacies of compilation by parsing incomprehensible syntax and making anything more of it than what you yourself put there. make is simply executing the command you gave it, and it does so blindly, and without any particular personal interest in the results.
Feel free to execute the following now:
make move_file
ls
Now, when make executes the rule for the target move_file, it simply renames silly_file to something even more ungainly. And finally:
make delete_file
ls
removes the file altogether. Usually a rule like this will be named clean, and it’s very acceptable to stick to that convention for the rest of your life. However, to illustrate that there is nothing magical about the target name clean (or indeed, any target name at all), in this Makefile, we are using the clean target to populate our directory with junk. Try it:
make clean
ls
Note that there are now five empty junk files (the directory is not cleaner), and feel free to remove them:
(For the future, it is recommended that this educational example not be taken too deeply to heart. Conventions exist for a reason, and that reason is usually to make everybody’s lives easier. It is always worth knowing, though, that conventions are ultimately just that—conventions.)
Another important concept is understanding the control flow. In what order would the commands have to have gone in order to create a new file and fill it with text? Cheerfully, make will tell you what command it’s executing as it executes them, but don’t take that for granted. Walk through the Makefile yourself. In fact, let’s do it together.
The first rule you hit is the rule for the target all. all is a phony target, commonly used both in the real world and in CS225, placed at the top of a Makefile, which, in its typical use case, will list all relevant targets which produce executables as dependencies. This ensures that make will compile all of the executables for which there are rules listed. In this case, we’ve just put it at the top because we can. It, of course, is not currently responsible for any executables.
When you read the rule for all, you see the dependency listed is fill_file_with_nonsense. fill_file_with_nonsense doesn’t actually exist in the directory, so we skip down to the rule for fill_file_with_nonsense. The dependency listed is create_file, which also isn’t a real file, so we skip to the rule for create_file, which tells us it has no dependencies, and to touch silly_file. touch is a standard Unix program that can create, as we have done here, an empty file.
Once that’s done, we can finish up the rule to “build” fill_file_with_nonsense, which writes the string “Hello, there is nothing important here” into the newly created file.
Then we can finish up “building” the target all, for which the command is to print the string “I have mostly created a lot of junk today!” to standard out. And so it does. Take note that, of course, it “builds” none of the targets that are not present in its direct control flow, so the unmentioned targets have to be explicitly passed as arguments to make in order for it to build them.
Just to be really clear, let’s add another rule to our Makefile. Open the Makefile in your text editor of choice, and write the rule
open_file :
	gedit another_silly_file
(If you do not have gedit installed, use another text editor.) Now run:
make open_file
and the gedit text editor will open another_silly_file. Feel free to make a little change and run make open_file again. It will open the same file. And
because of our cleverly repetitive naming scheme, we can even delete it with
So hopefully now the basics are painfully clear. Let’s move on.
Now let’s gloss over a basic component of makefile syntax that we’ve so far neglected to mention. Makefile syntax allows for a certain kind of variable called a macro. Macros are useful in a standard makefile for essentially the same reason that variables are useful in a normal program: they allow you to define once the parts of your makefile which will appear repeatedly, and if you later decide to change one of those parts, well, it’s a single change, rather than the many that would be needed throughout a large makefile. In this class, you will never actually need macros to write an effective and mostly unrepetitive makefile, but it’s not a bad habit to get into, so let’s see an example.
You may notice that our Hello, world example from ages ago has returned, and now we have a makefile for it. Open up the Makefile. There’s some rather strange syntax in here, so let’s try to break it down.
First, we’ve defined a macro called CXX. Unfortunately, this is a special macro, so we’re going to ignore it briefly and jump to FLAGS. FLAGS is a macro we defined to refer to the flags we’re passing our compiler; in this case, the flag is -O, an optimisation option that turns on a series of other flags which it’s not important for you to know right now (see the clang/gcc documentation for that information). FLAGS of course isn’t restricted in value to valid flags: we could have said FLAGS = some moose have large antlers and make would have been perfectly happy with that, until the call to clang++ failed later (you can try it out; make will actually try to run clang++ some moose have large antlers hello_world.cpp -o hello).
Now let’s talk about CXX. Not all macro names in the Makefile language are completely without meaning; there is a certain set of names which do have a default meaning. In this case, we’ve defined CXX = clang++. The default value is usually g++ on Linux systems, so if we never defined the macro CXX, when we used it in the command to compile the executable, it would have probably used g++ instead. Try running make right now, and you should see the following output:
make
clang++ -O hello_world.cpp -o hello
But if you delete the line that says CXX = clang++, what happens?
make
g++ -O hello_world.cpp -o hello
Feel free to replace the line now.
When you call a macro, enclose it like so:
$(MACRO). That’s simply makefile
language syntax. (You may have noticed that my example macro’s name was all
uppercase—as in fact, all of my macros thus far have been. This is not
syntactically required, but it is conventional.)
So that explains most of what’s going on in this file, but the strange symbols $? and $@ remain, perhaps, mysteries. As you might guess, those are also macros. They’re special predefined macros in the makefile language, with the respective meanings “names of the dependencies (newer than the target)” and “name of the target”, so in this case, $? refers to hello_world.cpp (provided that you make clean before you make), and $@ refers to hello, incidentally (purposefully) the name of the executable created as well. Using shorthand like this is a good motivation to name targets after the file the rule creates (this is, of course, also conventional, and increases the readability of your Makefiles drastically). Special predefined macros aren’t important for you to know (there are others we haven’t yet mentioned), but as you go about life in CS225 and the real world, you are bound to come across them.
Compiler and linker flags in CS225
For this class we are going to have a very standard set of flags to pass during
compilation and linking. We are going to define these as macros in each
assignment’s Makefile. Here is an example of what those look like (taken from
# This defines our compiler and linker, as we've seen before.
CXX = clang++
LD = clang++

# These are the options we pass to the compiler.
# -std=c++1y means we want to use the C++14 standard (called 1y in this version of Clang).
# -stdlib=libc++ specifies that we want to use the standard library implementation called libc++
# -c specifies making an object file, as you saw before
# -g specifies that we want to include "debugging symbols" which allows us to use a debugging program.
# -O0 specifies to do no optimizations on our code.
# -Wall, -Wextra, and -pedantic tells the compiler to look out for common problems with our code. -Werror makes it so that these warnings stop compilation.
CXXFLAGS = -std=c++1y -stdlib=libc++ -c -g -O0 -Wall -Wextra -Werror -pedantic

# These are the options we pass to the linker.
# The first two are the same as the compiler flags.
# -l<something> tells the linker to go look in the system for pre-installed object files to link with.
# Here we want to link with the object files from libpng (since we use it in our code) and libc++. Remember libc++ is the standard library implementation.
LDFLAGS = -std=c++1y -stdlib=libc++ -lc++abi
A final diversion: The makefile language is Turing complete?
The uses for such information may be limited, but particularly thanks to its support for lambda abstractions and combinators, the makefile language is actually a complete functional programming language. Will you ever need to write a Fibonacci number generator in the makefile language? Probably not, but you certainly can.
cd ../functional_fun/
make
This will, of course, get quite slow as the numbers get large (the naive solution takes exponential time), so I suggest you stop the process with a well timed Ctrl-C as it begins to lag.
That concludes the tutorial on compilation and
Makefiles. If you have any
questions, please feel free to look up the concepts yourself, ask about them
on Piazza, or ask your TAs or classmates for help.