Controlling #include in C++

DRAFT – not quite finished, I’ll remove the DRAFT mark when I’m finished.

The C++ #include mechanism is how we access functionality that is split up into multiple pieces – whether you call these libraries, packages, modules, classes, or some other unit of structure. It’s a weak mechanism, but it’s what we have. We typically call the files that are included “headers” and the files that do the including “source”, but this is arbitrary; any file can include any other file.

C++ compilers have the idea of include search paths. Typically, each compiler is part of a development environment, and that environment has some baked-in search paths (not just for includes, but also for libraries and even executables). This is mostly marked as “implementation defined” in the C++ standard, but fortunately the major compilers all do it in similar fashions.

Recommendations

Here are my conclusions up front.

  1. Use <path> in combination with based paths only, and use based paths sparingly
  2. Use “path” in combination with relative paths only
  3. Only use based paths when crossing package bounds (a package is a piece of code that you would publish separately from anything else)

C++ Standard

The C++11 standard talks about #include in section 16.2, Source file inclusion. It takes several paragraphs to say this:

  • #include <path> searches a sequence of implementation-defined places for path.
  • #include “path” searches a sequence of implementation-defined places for path, and if that fails, then acts as if #include <path> were supplied.
  • #include MACRO evaluates MACRO and if it expands to one of the two previous forms, then said #include operation takes place, otherwise it is undefined behavior.

What this means is that there is an implementation-defined set of paths for <> headers, and there may or may not be another implementation-defined set of paths for “” headers. It is entirely possible for <> and “” to have identical behavior. Certainly the implication in the standard is that there is a base set used for <> and “”, and then a set only used by “”.

Sometimes you’ll see people say <> are system headers, and “” are user headers. This loosely follows from a comment in 16.2.7 that says ‘in general, programmers should use the <> form for headers provided with the implementation, and the “” form for sources outside the control of the implementation’. In this case, ‘the implementation’ is referring to the implementation of the C++ compiler. I think this is too strong a recommendation, and we’ll get to that in a bit.

Examples of the three styles of including files:

#include <stdio.h>

#include "main.h"

#define INCFILE "version_1.h"
#include INCFILE
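
The macro form works with the angle-bracket spelling too. Here is a minimal sketch that uses a standard header so it compiles on its own; CONTAINER_HEADER and make_numbers are names I made up for the illustration.

```cpp
// The macro expands to the <path> form, so after expansion the
// preprocessor performs an ordinary angle-bracket include.
#define CONTAINER_HEADER <vector>
#include CONTAINER_HEADER

// Hypothetical helper, only here to show that <vector> really was included.
inline std::vector<int> make_numbers() {
    return {1, 2, 3};
}
```

If the macro expanded to something that is neither a <path> nor a "path" form, the behavior would be undefined, as the standard says.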

POSIX Standard

The Open Group Base Specifications Issue 7 document (technically identical to IEEE Std 1003.1, 2013 edition) touches on the behavior of the c99 compiler, and this is relevant to us.

The search path for #include “path” files is: first in the directory of the file with the #include line, then in directories specified by the user, and then in places defined by the system.

The search path for #include <path> files is: in directories specified by the user, and then in places defined by the system.

In other words, the two methods are the same, except that #include <path> does not look in the parent file’s directory, only in search paths.
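
The two POSIX rules can be written down as a tiny model. This is only a sketch of the ordering described above, not anything a real compiler exposes; IncludeForm, search_order and the parameter names are all invented for the example.

```cpp
#include <string>
#include <vector>

// Which spelling the #include directive used.
enum class IncludeForm { Quote, Angle };

// Returns the directories to try, in order, following the POSIX rules
// described above. All names here are illustrative.
inline std::vector<std::string> search_order(
        IncludeForm form,
        const std::string& including_file_dir,
        const std::vector<std::string>& user_dirs,
        const std::vector<std::string>& system_dirs) {
    std::vector<std::string> order;
    if (form == IncludeForm::Quote) {
        order.push_back(including_file_dir);  // only "path" looks here
    }
    // From here on, both forms behave the same.
    order.insert(order.end(), user_dirs.begin(), user_dirs.end());
    order.insert(order.end(), system_dirs.begin(), system_dirs.end());
    return order;
}
```

For the same inputs, the Quote result is always the Angle result with the including file's directory put in front.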

POSIX is where the technique of “search in your parent’s folder” was introduced, and it also separated the idea of user-defined paths versus system-defined paths. This is because POSIX was the definition for a system, not a language, but C is such a part of POSIX that they integrate its behavior with the system behavior.

This is important, because most of the real-world C++ compilers we deal with have also decided to be as POSIX-compliant as possible, as well as following the C++ standard.

Philosophy

There are three basic ways for source code to reference included files.

Absolute paths

#include "/projects/git/starcraft/src/lib/base/include.h"

Of course, you wouldn’t do this in reality, because it’s not portable even between computers running the same operating system. I have it here only for completeness’ sake, and because people are tempted to do this from time to time, or because a system that auto-generates source might create absolute paths.

Never ever do this!

Based paths

#include <base/include.h>

where there is a header search path pointing to the root of the hierarchy where base/include.h can be found. In the example above, this would expect /projects/git/starcraft/src/lib to be in the header search paths. This pattern should look familiar to you, because of system or vendor libraries that come with your development environment; e.g. #include <functional> for C++, #include <Cocoa/Cocoa.h> for Mac developers, or #include <Windows.h> for Win32 developers. These all work because somewhere in your compiler environment are prebaked paths pointing to directories that contain files named functional or Windows.h.

The based paths approach leads you to put everything in one massive hierarchy, so you can minimize the number of include search paths that need to be added. This means each project turns into its own little world, and it becomes very hard to share pieces between projects, without an import process.

Most real-world projects follow the “based paths” model. But it has scaling issues, and relative paths are one way to bypass some of them.

Relative paths

Assume we have this hierarchy

src/
  client/
    test/
      main.cpp
  lib/
    base/
      allocator.h
      include.h
      types.h
      internal/
        alloc.h

Then if main.cpp wanted to access include.h, it would do

#include "../../lib/base/include.h"

and if include.h wanted to access types.h, it would do

#include "types.h"

and if alloc.h wanted to access allocator.h, it would do

#include "../allocator.h"

The advantage is that all you need is a path to the first file, and then all other files are relative to each other. As long as the hierarchy is preserved, you need very few header search paths.
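
To check where a relative include lands, you can do the path arithmetic yourself. This is a sketch assuming C++17's std::filesystem; resolve_relative is a name I invented, and it works purely on the text of the paths without touching the disk.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Resolves a relative "path" include against the directory of the file that
// contains the #include line. Purely lexical: nothing is read from disk.
inline std::string resolve_relative(const fs::path& including_file,
                                    const fs::path& included) {
    return (including_file.parent_path() / included)
        .lexically_normal()   // collapses "dir/.." pairs
        .generic_string();    // always uses '/' as the separator
}
```

With the hierarchy above, resolving "../../lib/base/include.h" from src/client/test/main.cpp gives src/lib/base/include.h, and resolving "../allocator.h" from src/lib/base/internal/alloc.h gives src/lib/base/allocator.h.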

Header search paths turn out to be one of those small but annoying roadblocks in terms of making source code portable. If you give someone a library and a hierarchy of headers, then if you use based paths, they must add one or more search paths to their projects. If you use relative paths, then they could do whatever they want in terms of placing your headers relative to their project – it’s under their control.

It’s less readable, for sure, and less writable. Based paths are, in the short run, easier to work with. However, it gets hard in very large programs to keep sub-paths unique. If you follow the pattern of

#include <base/header1>
#include <base/header2>

then you’re hoping that no other header search path except yours has a base root directory.
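
Here is a small sketch of why that hope matters. It models the search as "first search path that contains the file wins"; first_match and existing_files are invented for the example, and the set of strings stands in for the real filesystem.

```cpp
#include <optional>
#include <set>
#include <string>
#include <vector>

// Returns the first "<search path>/<include name>" that exists, or nothing.
// `existing_files` stands in for the real filesystem in this sketch.
inline std::optional<std::string> first_match(
        const std::vector<std::string>& search_paths,
        const std::string& include_name,
        const std::set<std::string>& existing_files) {
    for (const std::string& dir : search_paths) {
        const std::string candidate = dir + "/" + include_name;
        if (existing_files.count(candidate) != 0) {
            return candidate;  // later search paths are never consulted
        }
    }
    return std::nullopt;
}
```

If both vendor/include and src/lib contain base/header1, and vendor/include comes first in the search paths, then #include <base/header1> silently picks up the vendor's file, not yours.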

Implementation

There are now three major players in the C++ compiler infrastructure world: GNU with GCC, Microsoft with Visual C++, and Apple with Xcode (which uses both GCC and Clang under the hood). And since Clang is trying to be GCC-compatible so that it can be a drop-in replacement, there are really only two systems to worry about.

GCC/Clang

Default include search paths

GCC bakes some paths into the binary, and this baking is done at install time, either when you build GCC yourself, or when your particular operating system vendor/packager installed GCC for you.

You can see these paths with a little trick

echo "//" | gcc -xc++ -E -v -

This forces the language to C++, turns on verbose output, and preprocesses (-E) the program supplied on stdin, which I thoughtfully supply as a comment. As a side effect, I’ll get a bunch of GCC’s baked-in config written to stderr. On a Mac OS X 10.8 machine with Xcode 4.6.3, which has GCC 4.2.1 (Apple’s LLVM 2336.11.00 spin) and Clang 4.2 (clang-425.0.28), I get this (all but include search paths elided)

...
ignoring nonexistent directory "/usr/llvm-gcc-4.2/bin/../lib/gcc/i686-apple-darwin11/4.2.1/../../../../i686-apple-darwin11/include"
ignoring nonexistent directory "/usr/include/c++/4.2.1/i686-apple-darwin11/x86_64"
ignoring nonexistent directory "/usr/local/include"
ignoring nonexistent directory "/Applications/Xcode.app/Contents/Developer/usr/llvm-gcc-4.2/lib/gcc/i686-apple-darwin11/4.2.1/../../../../i686-apple-darwin11/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/llvm-gcc-4.2/bin/../lib/gcc/i686-apple-darwin11/4.2.1/include
 /usr/include/c++/4.2.1
 /usr/include/c++/4.2.1/backward
 /Applications/Xcode.app/Contents/Developer/usr/llvm-gcc-4.2/lib/gcc/i686-apple-darwin11/4.2.1/include
 /usr/include
 /System/Library/Frameworks (framework directory)
 /Library/Frameworks (framework directory)
End of search list.

So, if you compile C++ code with GCC, you’ll be looking for header files in these directories by default. There’s some magic to do with the Frameworks folder: Apple’s version of GCC knows how to turn <Cocoa/Cocoa.h> into <Cocoa.framework/Versions/Current/Headers/Cocoa.h>, or perhaps just <Cocoa.framework/Headers/Cocoa.h>. I actually don’t know how the resolution is done; I should figure that out someday.

You can do the same for clang, and you get a simpler set of directories.

ignoring nonexistent directory "/usr/include/c++/4.2.1/i686-apple-darwin10/x86_64"
ignoring nonexistent directory "/usr/include/c++/4.0.0"
ignoring nonexistent directory "/usr/include/c++/4.0.0/i686-apple-darwin8/"
ignoring nonexistent directory "/usr/include/c++/4.0.0/backward"
ignoring nonexistent directory "/usr/local/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/include/c++/4.2.1
 /usr/include/c++/4.2.1/backward
 /usr/bin/../lib/clang/4.2/include
 /usr/include
 /System/Library/Frameworks (framework directory)
 /Library/Frameworks (framework directory)

In both cases, you’ll see that the compiler is configured for a larger set of directories. For example, clang is told about /usr/local/include, and I happen to not have such a folder. But it’s a common convention to put non-vendor things in /usr/local/include, hence clang is taught this as a default.

User-specified include search paths

Use -I to add to <path> search paths

You can add to the include search paths with the -I directive on the gcc command line.

echo "//" | gcc -xc++ -Isrc/lib/ -Isrc/api/ -E -v -

which gives us

#include "..." search starts here:
#include <...> search starts here:
 src/lib/
 src/api/
 /usr/llvm-gcc-4.2/bin/../lib/gcc/i686-apple-darwin11/4.2.1/include
 /usr/include/c++/4.2.1
 /usr/include/c++/4.2.1/backward
 ...

Our search paths come before the baked-in search paths, which is important – otherwise, we couldn’t override anything in the default paths. GCC checks to make sure your paths exist, and if they don’t, you’ll see an error “Not a directory”.

Use -iquote to add to “path” search paths

GCC searches in the same directory as the file containing the #include “path” statement. This is not called out in the standard, but most compilers do this.

Note that by default, GCC puts your search paths in the <> section. You can cause some include search paths to go into the “” section instead by prefacing them with -iquote. The -I directories go into <>, and the -iquote directories go into “”. If we run

echo "//" | gcc -xc++ -Isrc/lib/ -iquote src/api/ -E -v -

we would get this

#include "..." search starts here:
 src/api/
#include <...> search starts here:
 src/lib/
 /usr/llvm-gcc-4.2/bin/../lib/gcc/i686-apple-darwin11/4.2.1/include
 /usr/include/c++/4.2.1
 /usr/include/c++/4.2.1/backward
 ...

GCC has an older mechanism -I-, but it’s deprecated, and clang doesn’t support it, so we won’t talk about it. GCC also has a large number of other options for messing with header search paths, of which one more is worth talking about.

Use -nostdinc to avoid using built-in search paths

If you put -nostdinc on your GCC command line, then GCC will only search header paths you supply; it will not search any of its own. This can be useful when you want to completely control all aspects of compiling and linking. It can also make builds a bit more repeatable, because all the headers and libraries can be supplied as part of the project, instead of depending on them being installed on the machine.

If you have this

echo "//" | gcc -xc++ -nostdinc -Isrc/lib/ -iquote src/api/ -E -v -

you will get this

#include "..." search starts here:
 src/api/
#include <...> search starts here:
 src/lib/

and only your specified search paths will be used when looking for #include files.
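
The three flags fit together as in the sketch below. It is a deliberately simplified model of what the -v output shows, not GCC's actual implementation: it only understands the spellings used in this section ("-Idir", "-iquote dir", "-nostdinc"), and SearchLists, build_search_lists and builtin_dirs are names I invented.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct SearchLists {
    std::vector<std::string> quote;  // tried first, for #include "path" only
    std::vector<std::string> angle;  // tried for <path>, and for "path" after `quote`
};

// Builds both lists from a simplified command line. `builtin_dirs` stands in
// for the paths baked into GCC. The directory of the including file, which
// "path" includes try before either list, is not modelled here.
inline SearchLists build_search_lists(const std::vector<std::string>& args,
                                      const std::vector<std::string>& builtin_dirs) {
    SearchLists lists;
    bool use_builtin = true;
    for (std::size_t i = 0; i < args.size(); ++i) {
        const std::string& arg = args[i];
        if (arg == "-nostdinc") {
            use_builtin = false;
        } else if (arg == "-iquote" && i + 1 < args.size()) {
            lists.quote.push_back(args[++i]);      // "-iquote dir"
        } else if (arg.rfind("-I", 0) == 0 && arg.size() > 2) {
            lists.angle.push_back(arg.substr(2));  // "-Idir"
        }
    }
    if (use_builtin) {
        // User-supplied -I directories stay ahead of the baked-in ones.
        lists.angle.insert(lists.angle.end(), builtin_dirs.begin(), builtin_dirs.end());
    }
    return lists;
}
```

Feeding it the arguments from the last example gives a quote list of src/api/ and an angle list of just src/lib/, matching the output above.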

Frameworks (NeXT/Mac OS X)

NeXT introduced frameworks as a way to package libraries and headers together, and Mac OS X inherited this. This throws a wrinkle into C/C++ compilation, because there is no longer a direct path to the header files.

GCC and Clang on Mac OS X have a separate option to specify the root of the frameworks directory; -F<directory> adds the specified directory to the search path for frameworks, but then additional processing is done to determine the actual path inside the framework. For example:

gcc -F/System/Library/Frameworks
...
#include <CoreFoundation/CoreFoundation.h>

will actually look in

/System/Library/Frameworks/CoreFoundation.framework/Versions/A/Headers/CoreFoundation.h

In the case of CoreFoundation, we have this

Frameworks/
  CoreFoundation.framework/
    CoreFoundation -> Versions/Current/CoreFoundation
    Headers -> Versions/Current/Headers
    Versions/
      Current -> A
      A/
        CoreFoundation
        Headers/
        Resources/

which all seems very complicated, but allows for multiple versions of libraries to exist together in one Frameworks folder. Unix/Linux introduced versioning at the shared library level, but this isn’t good enough; there are important bits in the header files as well.

So an include like <${FRAMEWORK}/${HEADER}.h> will look in ${FRAMEWORKROOT}/${FRAMEWORK}.framework/Headers/${HEADER}.h, and the Headers directory entry will point to the current version to use.
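
That mapping is simple enough to sketch as a function. This only models the rewrite described here (framework root, then Framework.framework/Headers, then the header name) and ignores anything else the compiler may do; framework_header_path is an invented name.

```cpp
#include <optional>
#include <string>

// Maps "Framework/Header.h" to "<root>/Framework.framework/Headers/Header.h".
// Returns nothing when the include name has no framework component.
inline std::optional<std::string> framework_header_path(
        const std::string& framework_root,
        const std::string& include_name) {
    const std::string::size_type slash = include_name.find('/');
    if (slash == std::string::npos || slash == 0 ||
        slash + 1 == include_name.size()) {
        return std::nullopt;  // not of the form "Framework/Header"
    }
    const std::string framework = include_name.substr(0, slash);
    const std::string header = include_name.substr(slash + 1);
    return framework_root + "/" + framework + ".framework/Headers/" + header;
}
```

The Headers symlink then takes care of picking the current version, so the function never needs to know about Versions/A.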

Visual C++

Default include search paths

Up through Visual Studio 2008, Visual C++’s default header paths were stored in the (not sure, somewhere in filesystem, can’t find anything in registry), and separately for each version of Visual Studio. They can be edited from within the Visual Studio editor, but these are global settings.

With Visual Studio 2010 and up, the default header paths are actually stored in property sheets. Again, they are global and affect all compiles. These are properly part of MSBuild, and are found at

%USERPROFILE%/appdata/local/microsoft/msbuild/v4.0

with one settings file for each combination of language and platform that is supported. For example, an older version of my Microsoft.Cpp.Win32.user.props looked like this

<?xml version="1.0" encoding="utf-8"?>
<Project DefaultTargets="Build" ToolsVersion="4.0"
         xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup>
    <ExecutablePath>C:\dev\Win7SDK71\Bin;
                    C:\dev\mt\6.1.7716.0;
                    $(VSInstallDir)\SDK\v2.0\bin;
                    $(ExecutablePath)
    </ExecutablePath>
    <IncludePath>C:\dev\Win7SDK71\Include;
                 C:\oracle\product\10.2.0\client_1\oci\include;
                 c:\dev\DXSDK_2009_March\Include;
                 $(IncludePath)
    </IncludePath>
    <ReferencePath>$(ReferencePath)</ReferencePath>
    <LibraryPath>C:\dev\Win7SDK71\Lib;
                 C:\oracle\product\10.2.0\client_1\oci\lib\msvc;
                 c:\dev\DXSDK_2009_March\Lib\x86;
                 $(LibraryPath)
    </LibraryPath>
    <SourcePath>$(SourcePath)</SourcePath>
    <ExcludePath>$(ExcludePath)</ExcludePath>
  </PropertyGroup>
  <ItemDefinitionGroup>
    <ClCompile>
      <AdditionalIncludeDirectories>
        E:\projects\svn\core-repository\trunk\Contrib\Contrib\Boost;
        %(AdditionalIncludeDirectories)
      </AdditionalIncludeDirectories>
    </ClCompile>
  </ItemDefinitionGroup>
</Project>

Well, almost – I did a little creative editing for linebreaks to make it readable, linebreaks that would actually be illegal in the actual source file, I’m sure.

Unlike with GCC, there really aren’t baked search paths, but these are global, so you generally don’t want to put anything other than the C++ and Windows header paths in here. You can even simply remove everything. My actual Microsoft.Cpp.Win32.user.props looks like this:

<?xml version="1.0" encoding="utf-8"?> 
<Project DefaultTargets="Build" ToolsVersion="4.0" 
       xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

</Project>

because I compile many different projects, and I don’t want any defaults.

User-specified include search paths

Use /I to add to <path> search paths

There are multiple ways to add extra include paths to a Visual C++ build. If you are exercising the compiler directly, then there is a /I switch, and it works exactly as the GCC -I switch.

However, most people work with Visual C++ through project files. In the IDE, you would add extra entries to the “Additional Include Directories” entry in the General sub-section of the C/C++ section of Configuration Properties, which maps to an AdditionalIncludeDirectories entry in the ClCompile section of the project file, which is in XML format (since I think VS2008).

    <ClCompile>
      <PreprocessorDefinitions>
        WIN32;_DEBUG;_CONSOLE;%(PreprocessorDefinitions)
      </PreprocessorDefinitions>
      <AdditionalIncludeDirectories>
        C:\package_cache\boost\1.44-bnet-0\noarch\include
      </AdditionalIncludeDirectories>
    </ClCompile>

There is no way to have Visual C++ show you the include paths, but since they aren’t as hidden as with GCC, there’s also not as much need.

You can also add search paths for specific files in the IDE, which translates to an AdditionalIncludeDirectories entry on the specific file, instead of at the ClCompile task level.

After any /I directories, the compiler also searches the directories listed in the INCLUDE environment variable, if it is set. This is overridden by /X.

Use /X to avoid using built-in search paths

If you don’t want to edit the built-in search paths out of existence for all projects, you can still suppress them for specific projects or specific files. From the IDE, this is the “Ignore Standard Include Paths” in the Preprocessor sub-section of the C/C++ section in Configuration Properties. At the compiler level, this is the /X switch. The comment in the help claims that this means “ignore contents of INCLUDE and PATH environment variables”, but I’m not sure that this is correct, unless the IDE stuffs the global paths into the INCLUDE environment variable before issuing the CL.EXE command (through msbuild). Todo – try this out.

Hard-wired “path” search paths behavior

Visual Studio has some non-standard behavior. It’s hard-wired; you can’t alter it.

First, Visual Studio searches in the same directory as the file containing the #include statement. This is not in the standard, but GCC and Clang also do this, so it might as well be considered standard.

Then it searches up the #include chain – if there was a file that included the file that has the #include statement, then Visual Studio looks in the directory containing that file. It repeats this behavior until it finds a matching filename or it reaches the top file in the compile chain. Unfortunately, I don’t know that there is a way to disable this behavior, and this behavior is not compatible with other compilers, nor is it particularly good behavior.
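
Here is a sketch of that walk up the include chain, as I understand the behavior described above. It is a model only, and msvc_quote_search_order and its parameters are names I made up.

```cpp
#include <string>
#include <vector>

// `include_chain_dirs` lists the directory of each file in the include chain,
// outermost (the .cpp being compiled) first, innermost (the file containing
// the #include "path" line) last. `slash_i_dirs` are the /I directories.
inline std::vector<std::string> msvc_quote_search_order(
        const std::vector<std::string>& include_chain_dirs,
        const std::vector<std::string>& slash_i_dirs) {
    std::vector<std::string> order;
    // Walk the chain backwards: the current file's directory first,
    // then its includer's directory, and so on up to the top file.
    for (auto it = include_chain_dirs.rbegin(); it != include_chain_dirs.rend(); ++it) {
        order.push_back(*it);
    }
    // Only after the whole chain has been tried do the /I paths come into play.
    order.insert(order.end(), slash_i_dirs.begin(), slash_i_dirs.end());
    return order;
}
```

The practical consequence is that the same header can resolve differently depending on who included it, which is exactly why this doesn't port to GCC or Clang.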

Xcode

TODO – not much different from GCC/Clang listed above, but bits about the IDE.

Conclusions

TODO

Reference

http://gcc.gnu.org/onlinedocs/cpp/Search-Path.html

http://msdn.microsoft.com/en-us/library/vstudio/ee855621(v=vs.120).aspx

http://blogs.msdn.com/b/vcblog/archive/2010/03/02/visual-studio-2010-c-project-upgrade-guide.aspx

http://stackoverflow.com/questions/14969365/creating-a-vs-property-sheet-for-a-c-library

http://stackoverflow.com/questions/21593/what-is-the-difference-between-include-filename-and-include-filename

https://www2.opengroup.org/ogsys/catalog/C138
