This is the third post in my SCons series. The topic of this post is setting up a multi-module C++ project using SCons, with a separate build directory.
In previous posts in the series I introduced the SCons open source build tool, and described a simple C++ example project that uses SCons.
In this post I use the exact same C++ project from the basic example. I rewrite the build scripts, using SCons, to achieve the following properties:
- Divide the project into distinct modules. A module maps to a directory in the project tree. Each module contains a SConscript file, describing the targets included in that module.
- Separate the build output directory from the module source code directories. I like my project tree clean and tidy, without object files scattered between source files.
- Allow build targets in modules to refer to other targets in other modules easily. This is required, for example, when a program in one module uses functions from a static library in another module.
The final result is available on my GitHub scons-series repository. In the rest of this post I explain the details of what I came up with.
As a reminder, the (seemingly silly) C++ project is a simple address book program. Refer to the previous post if you’re interested in more details.
The main SConstruct
The main SConstruct file is in the project root directory. This is where SCons starts processing when it is executed in the project.
You can see the complete file in the GitHub repository. I will paste selected parts in arbitrary order for didactic purposes. The pasted code might be modified, so please don’t copy & paste and use it as is – prefer using the GitHub version!
The heart of the SConstruct file is the module loop:

    # Go over modules to build, and include their SConscript files
    for module in modules():
        sconscript_path = os.path.join(module, 'SConscript')
        # Execute the SConscript file, with variant_dir set to the
        # module dir under the project build dir.
        targets = env.SConscript(sconscript_path,
                                 variant_dir=os.path.join(build_dir, module),
                                 exports={'env': env})
        # Add the targets built by this module to the shared cross-module targets
        # dictionary, to allow the next modules to refer to these targets easily.
        for target_name in targets:
            # Target key built from module name and target name
            # It is expected to be unique
            target_key = '%s::%s' % (module, target_name)
            assert target_key not in env['targets']
            env['targets'][target_key] = targets[target_name]
The loop iterates over a list of modules (generated by the call to modules()), assuming every element yielded is a module directory. The module SConscript file is processed by the env.SConscript(...) call. The variant_dir argument instructs SCons to place build artifacts for that module in the specified directory (as explained in the SCons user guide). The module SConscript is expected to return a dictionary of the targets it contains (see the content of the SConscripts below). That dictionary is then added to the shared cross-module targets dictionary, storing all targets from all modules declared so far.
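To see the merging step in isolation, here is a minimal sketch in plain Python that runs without SCons. The helper name register_module_targets is made up for illustration and is not in the repository, and plain strings stand in for the node lists that SCons builders return. It raises an exception where the loop above uses an assert.

```python
def register_module_targets(shared_targets, module, module_targets):
    """Merge one module's targets into the shared dictionary.

    Hypothetical helper: keys are built as 'Module::target', and a repeated
    key is reported as an error instead of being silently overwritten.
    """
    for target_name, nodes in module_targets.items():
        target_key = '%s::%s' % (module, target_name)
        if target_key in shared_targets:
            raise KeyError('duplicate target key: %s' % target_key)
        shared_targets[target_key] = nodes


# Plain strings stand in for the node lists returned by SCons builders.
shared = {}
register_module_targets(shared, 'AddressBook',
                        {'addressbook': ['libaddressbook.a']})
register_module_targets(shared, 'Writer', {'writer': ['writer']})
print(sorted(shared))
```

Because the key always carries the module name, two modules can declare a target with the same short name without clobbering each other.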
The modules() function can be anything that returns or yields the names of the modules to be built. A plain list would suffice:

    def modules():
        return ['AddressBook', 'Writer']

I used a generator instead of a list, mostly because I’m a smart-ass:

    def modules():
        yield 'AddressBook'
        yield 'Writer'
Module SConscripts
The module-level SConscript files are simple enough:

    # AddressBook/SConscript
    Import('*')
    module_targets = dict()
    module_targets['addressbook'] = env.Library('addressbook', ['addressbook.cc'])
    Return('module_targets')

    # Writer/SConscript
    Import('*')
    module_targets = dict()
    module_targets['writer'] = env.Program(
        'writer', ['writer.cc'] + env.get_targets('addressbook'))
    Return('module_targets')
Both SConscript files follow a similar pattern. They create a dictionary, declare build targets and save them in the dictionary, and return the dictionary.
The AddressBook module SConscript simply builds an addressbook library. Nothing special about it.
The Writer module SConscript builds the writer program, based on the writer.cc source file, and a weird env.get_targets('addressbook') thingie. This is the interesting part!
Before going into details about env.get_targets, which is an extension I added, let’s first understand what we’re trying to do.
Linking with Libraries
The SCons user guide explains how to build a program that links with libraries. When declaring the Program target, the LIBS and LIBPATH construction variables should be specified. These values, in turn, are passed to the linker (using -l<libname> and -L<libpath> in case of gcc / clang).
So in the address book example, it would look like this:

    Program('writer', ['writer.cc'], LIBS=['addressbook'], LIBPATH=['#build/AddressBook'])
I don’t like this… The module-level SConscript should not need to know about the “build directory” explicitly. If it is changed in the main SConstruct, the build will break unless the developer also remembers to change the relevant SConscript files.
One possible solution may look like this: Program('writer', ['writer.cc'], LIBS=['addressbook'], LIBPATH=['../AddressBook']). This is better – no explicit reference to the build dir. Another option would be to do something like this: LIBPATH=['$BUILDDIR/AddressBook'], assuming that the SConstruct file did something like this: env['BUILDDIR'] = '#build'. This is also a valid solution.
But I still don’t like this… 🙂
In all 3 variations, the Program declaration still referred to the address book module twice. I named the library I want (in LIBS), and the search path for that library (in LIBPATH). Granted, these are two different things – library name, and the directory that has it. But it feels like it can be cleaner, more elegant.
Striving for elegance and SConscript simplicity, I wrote the get_targets extension to make it easier to refer to targets from other modules.
The get_targets SCons Extension
First, note that the module-level SConscripts can use env.get_targets thanks to the last line of this excerpt from the main SConstruct:

    env = Environment()
    # Allow including from project build base dir
    env.Append(CPPPATH=['#%s' % (build_dir)])
    # Prepare shared targets dictionary
    env['targets'] = dict()
    # Allow modules to use `env.get_targets('libname1', 'libname2', ...)` as
    # a shortcut for adding targets from other modules to sources lists.
    env.get_targets = lambda *args, **kwargs: get_targets(env, *args, **kwargs)
This way, when a module-level SConscript calls env.get_targets(a1, a2, kw1=v1, kw2=v2), it will pass the call to the get_targets function with (env, a1, a2, kw1=v1, kw2=v2). Not completely different from a regular Python instance method.
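The binding trick can be tried outside of SCons. The sketch below uses a stand-in class (FakeEnv, a name invented for this example) instead of a real Environment, and functools.partial instead of the lambda. Both pre-bind env as the first argument in the same way.

```python
import functools


class FakeEnv(object):
    """Stand-in for an SCons Environment (hypothetical, for illustration only)."""


def get_targets(env, *args, **kwargs):
    # Simplified body: just report what the function received.
    return env, args, kwargs


env = FakeEnv()
# Same idea as the lambda in the SConstruct: pre-bind `env` as first argument.
env.get_targets = functools.partial(get_targets, env)

received_env, received_args, received_kwargs = env.get_targets(
    'a1', 'a2', kw1='v1')
print(received_args, received_kwargs)
```

SCons also documents an env.AddMethod(function, name) call for attaching functions to an environment. I haven’t tried it in this project, but it may be worth a look as an alternative to assigning the attribute directly.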
Now, what do I want this function to do? Let’s characterize the function, before going into its implementation.
The end goal is to simplify the way SConscripts refer to targets from other modules. In most cases, this applies to Program targets that need to use Library targets from other modules. Without loss of generality, I’ll use the example to characterize the desired behavior.
The writer program needs to be linked with the addressbook library, in the AddressBook module. We can refer to a target named lib in a module named mod in a globally unique way as mod::lib. This works as a separator because :: is very unlikely to appear in a module directory or target name (a colon is not allowed in Windows file names, and although POSIX permits it, it is rare in practice). So, assume that for every library that is built in any module in the project, the list of targets returned from the Library builder is stored in a global dictionary under the unique identifier mod::lib. If this dictionary is globally accessible, then any program that needs to link with this library can simply extend its list of sources with the targets from the dictionary at mod::lib.
We already saw that this dictionary exists in env['targets'], with the main SConstruct updating it after including every module-level SConscript. So all the get_targets(...) function needs to do is look up the dictionary entry for every target identifier passed to it, and return an aggregate list of all targets it saw.
But wait. In most common scenarios, you’re not going to have different modules reusing the same target names, right? In the current example, for instance, the name addressbook is specific enough to refer to the AddressBook::addressbook library uniquely!
So I want get_targets to support short-form queries as well. But I don’t want to force the project to avoid reusing target names in different modules. So a naive approach that creates two entries in the targets dictionary for every target (one for mod::lib and one for lib) will not suffice, because lib might be overwritten.
My solution – do a “smart” lookup in get_targets:
- If a query contains ::, look up a full mod::lib match in the targets dictionary.
- If it’s just a name (no ::), look up any mod::<name> match (and maybe print a warning if more than one matched).
- Added bonus – if the query contains * – do a wildcard lookup (allowing, for example, getting all targets from a module with module::*).
The get_targets function with this behavior is implemented in site_scons/site_init.py. This file is automatically read by SCons before SConstruct/SConscript files.
You can see the complete function on GitHub. Let’s take a closer look at selected parts.
The query_to_regex helper function decides for every query what kind of query it is. It returns a RegEx matcher object that can be applied against the targets dictionary entries. It also returns a boolean flag, used to decide whether multiple matches are a problem or not.
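    import re

    def query_to_regex(query):
        """Return (RegEx, warn-on-multiple-matches flag) for specified query `query`."""
        # Classify the raw query before escaping it: since Python 3.7,
        # re.escape() no longer escapes ':', so testing the escaped string
        # for '\:\:' would miss fully-qualified queries.
        is_wildcard = '*' in query
        is_qualified = '::' in query
        # Escape query string
        query = re.escape(query)
        if is_wildcard:
            # It's a wildcard query ('*' is escaped to '\*' by re.escape)
            return re.compile('^%s$' % (query.replace('\\*', '.*'))), False
        if is_qualified:
            # It's a fully-qualified "Module::Target" query
            return re.compile('^%s$' % (query)), True
        # else - it's a target-name-only query
        return re.compile(r'^[^:]*:{2}%s$' % (query)), True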
The main loop simply goes over the queries in the args list. Every query is converted into a RegEx matcher, and checked against all target names in the dictionary. Matching entries are added to the matched list.

    for query in args:
        qre, warn = query_to_regex(query)
        for target_name in target_names:
            if qre.match(target_name):
                matching_target_names.append(target_name)
The main loop also counts matches (per-query), and prints warnings if no matches were found, or unexpected multiple matches were found.
Finally, an aggregate list of targets is constructed and returned. This is done using the one-liner:

    return reduce(lambda acculist, tname: acculist + env['targets'][tname], matching_target_names, [])

If it confuses you (like it confused me to write it), here’s an equivalent spread-out version:

    acculist = []
    for tname in matching_target_names:
        acculist.extend(env['targets'][tname])
    return acculist

The Python docs on the reduce function can help.
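One portability note on that one-liner: reduce is a builtin in Python 2, but in Python 3 it has to be imported from functools. The sketch below uses made-up data (strings instead of SCons nodes, and a plain targets dict instead of env['targets']) to show the reduce form next to an itertools.chain alternative that produces the same flattened list.

```python
from functools import reduce  # builtin in Python 2, functools-only in Python 3
from itertools import chain

# Stand-in data: strings instead of SCons nodes.
targets = {
    'AddressBook::addressbook': ['libaddressbook.a'],
    'Writer::writer': ['writer'],
}
matching_target_names = ['AddressBook::addressbook', 'Writer::writer']

# The reduce form: repeatedly concatenate each target list onto the accumulator.
with_reduce = reduce(lambda acculist, tname: acculist + targets[tname],
                     matching_target_names, [])

# The chain form: lazily walk every list, then materialize once.
with_chain = list(chain.from_iterable(targets[tname]
                                      for tname in matching_target_names))

print(with_reduce == with_chain)
```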
Side notes:
- The matched list is a list and not a set, because order may be important, and set is not ordered.
- Duplicate matches are avoided by removing the matched entries from the list of available target names in every iteration. It also reduces the number of iterations for every query.
- It is possible that after a couple of queries, all available targets are matched. If this happens, it doesn’t make sense to keep iterating over the queries list. I chose to complete the iteration instead of breaking out of the main loop, in order to print all the warnings for all the remaining queries.
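Putting the pieces together, here is a rough, self-contained sketch of the lookup that runs without SCons. It is not the repository code. It takes a plain dictionary instead of env, sorts the names to get a deterministic order, and returns the warnings alongside the result instead of printing them. All of these are simplifications made for this example.

```python
import re


def query_to_regex(query):
    """Return (compiled regex, warn_on_multiple) for a query string."""
    escaped = re.escape(query)
    if '*' in query:
        # Wildcard query: each escaped '*' becomes '.*'
        return re.compile('^%s$' % escaped.replace('\\*', '.*')), False
    if '::' in query:
        # Fully-qualified "Module::Target" query
        return re.compile('^%s$' % escaped), True
    # Name-only query: match the name in any module
    return re.compile('^[^:]*::%s$' % escaped), True


def get_targets(targets, *queries):
    """Return (aggregate target list, warnings) for the given queries.

    `targets` maps 'Module::name' keys to lists, standing in for
    the env['targets'] dictionary.
    """
    available = sorted(targets)
    matched, warnings = [], []
    for query in queries:
        qre, warn_on_multiple = query_to_regex(query)
        hits = [name for name in available if qre.match(name)]
        if not hits:
            warnings.append('query "%s" had no matches' % query)
        elif warn_on_multiple and len(hits) > 1:
            warnings.append('query "%s" had multiple matches' % query)
        matched.extend(hits)
        # Drop matched names so later queries cannot add duplicates.
        available = [name for name in available if name not in hits]
    result = []
    for name in matched:
        result.extend(targets[name])
    return result, warnings


demo = {
    'AddressBook::addressbook': ['libaddressbook.a'],
    'Writer::writer': ['writer'],
}
libs, notes = get_targets(demo, 'addressbook', 'nope')
everything, no_notes = get_targets(demo, '*')
print(libs, notes)
```

Note that the loop keeps going even when nothing is left to match, so that every remaining query still gets its warning.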
It Works?
It does!
    itamar@legolas sconseries (episodes/02-modules) $ scons
    scons: Reading SConscript files ...
    scons: |- Reading module AddressBook ...
    scons: |- Reading module Writer ...
    scons: done reading SConscript files.
    scons: Building targets ...
    g++ -o build/AddressBook/addressbook.o -c -Ibuild build/AddressBook/addressbook.cc
    ar rc build/AddressBook/libaddressbook.a build/AddressBook/addressbook.o
    ranlib build/AddressBook/libaddressbook.a
    g++ -o build/Writer/writer.o -c -Ibuild build/Writer/writer.cc
    g++ -o build/Writer/writer build/Writer/writer.o build/AddressBook/libaddressbook.a
    scons: done building targets.
What’s the Catch..?
This all seems nice and fun, doesn’t it? Well, nothing comes for free, does it?
You might have already noticed the caveats and potential problems. Let me list what I’m aware of: (and let me know if you spot more!)
- The module-level SConscript files are cluttered with the local module_targets dictionary that the main SConstruct expects.
- The modules() in the main SConstruct must be given in the correct order!
The first item is no deal breaker (maybe). It does upset my OCD. So in another post in the series I take the task of SConscript simplification to an extreme.
That second item is more significant. Let me show you. If I switch the order of Writer and AddressBook, and try to build, here’s what happens:

    itamar@legolas sconseries (episodes/02-modules) $ scons
    scons: Reading SConscript files ...
    scons: |- Reading module Writer ...
    scons: warning: get_targets query "addressbook" had no matches
    scons: |- Reading module AddressBook ...
    scons: done reading SConscript files.
    scons: Building targets ...
    g++ -o build/AddressBook/addressbook.o -c -Ibuild build/AddressBook/addressbook.cc
    ar rc build/AddressBook/libaddressbook.a build/AddressBook/addressbook.o
    ranlib build/AddressBook/libaddressbook.a
    g++ -o build/Writer/writer.o -c -Ibuild build/Writer/writer.cc
    g++ -o build/Writer/writer build/Writer/writer.o
    Undefined symbols for architecture x86_64:
      "PhoneNumber::set_number(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&)", referenced from:
          PromptForAddress(Person*) in writer.o
      "PhoneNumber::set_type(PhoneNumber::PhoneType)", referenced from:
          PromptForAddress(Person*) in writer.o
      "Person::set_id(int)", referenced from:
          PromptForAddress(Person*) in writer.o
      "Person::set_name(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&)", referenced from:
          PromptForAddress(Person*) in writer.o
      "Person::set_email(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&)", referenced from:
          PromptForAddress(Person*) in writer.o
      "Person::name() const", referenced from:
          _main in writer.o
    ld: symbol(s) not found for architecture x86_64
    clang: error: linker command failed with exit code 1 (use -v to see invocation)
    scons: *** [build/Writer/writer] Error 1
    scons: building terminated because of errors.
Line #4 of the output (the scons: warning line) shows that get_targets failed to find matches for the addressbook query. Line #12 (the final g++ link command) shows that the writer program isn’t linked with the addressbook library. The result is that the build breaks.
What happened..?
That should be apparent by now. Since Writer/SConscript got processed first, when it reached the call to get_targets, the global targets dictionary still didn’t contain the AddressBook::addressbook library.
Solutions?
- Specify the modules in order of dependency. A module may use targets only from modules that appeared before.
- An important implication – you can’t have a target in module A using targets from module B, along with a target from module B using targets from module A. This creates a cyclic dependency graph. You cannot list A and B in an order that satisfies the first point. You will have to refactor – maybe create a common module that contains the targets from A and B that are used by both.
Better solutions?
Technically, if SConstruct has the full modules list in advance, it can pre-process it. It can determine what targets belong to what modules, and what modules depend on what targets. Using this information, it can construct a module dependency graph, and try to resolve it. The result would be an ordered modules list, that satisfies the requirements (assuming there are no cyclic dependencies).
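Just to illustrate the ordering step, here is a small sketch of a topological sort in plain Python. It assumes the module-to-module dependencies are already known and handed in as a dictionary, which is the hard part and is not solved here. The function name order_modules is invented for this example.

```python
def order_modules(dependencies):
    """Return modules ordered so each comes after the modules it depends on.

    `dependencies` maps a module name to the set of modules it uses.
    Raises ValueError if the dependencies contain a cycle.
    """
    remaining = {mod: set(deps) for mod, deps in dependencies.items()}
    # Include modules that only appear as someone else's dependency.
    for deps in list(remaining.values()):
        for dep in deps:
            remaining.setdefault(dep, set())
    ordered = []
    while remaining:
        # Modules with no unresolved dependencies can be processed now.
        ready = sorted(mod for mod, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError('cyclic dependency among: %s'
                             % ', '.join(sorted(remaining)))
        for mod in ready:
            ordered.append(mod)
            del remaining[mod]
        for deps in remaining.values():
            deps.difference_update(ready)
    return ordered


order = order_modules({'Writer': {'AddressBook'}, 'AddressBook': set()})
print(order)
```

With an ordering like this, the modules list could be given in any order, and a cycle between two modules would surface as an explicit error instead of a confusing linker failure.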
This indeed would be a better solution! It’s also interesting and non-trivial enough to deserve its own post in this series. 🙂
Summary
That was my first SCons extension, supporting the use-case of a multi-module project with a separate build directory.
I showed how to use the SConscript function with the variant_dir argument to delegate target declarations to module-level SConscript files, and have the build artifacts created under a separate build directory.
An easy way to declare that a target in one module uses targets from other modules was characterized and implemented.
The final result is available on my GitHub scons-series repository. Feel free to use / fork / modify. If you do, I’d appreciate it if you share back improvements.
See the scons tag for more in my SCons series. Specific posts of interest may include:
- Extreme SConscript simplification.
- Supporting Arbitrary Modules Order In SCons – relaxing the requirement for specifying modules in order of dependence.
- Multi-flavored project build.