IncludeTool

A reference guide for IncludeTool, which is used to convert existing C++ projects into an Include-What-You-Use (IWYU) style.

Warning: IncludeTool has been used to convert a number of internal Epic projects, including Unreal Engine 4 (UE4). While we provide this tool as a courtesy, the level of support we can provide is limited.

The IncludeTool utility can be used to convert existing C++ projects into an Include-What-You-Use (IWYU) style. It does so by attempting to form self-contained translation units out of each header file. While it can save a lot of time, it is based on a heuristic and is highly unlikely to produce perfect output. Rather, it produces a starting point from which you can make manual edits to complete the transition. It also operates by brute force, which is fairly slow (large projects can take several days to complete), is painful to iterate on, may require project-specific modifications, and can be difficult to debug.

Operating the Tool

IncludeTool operates in several phases, which are described below:

  • UnrealBuildTool is invoked with a target to generate a list of build steps. The project's editor target is recommended, since it typically includes more code paths than any other target type. We also recommend compiling for Linux using the cross-compile toolchain from Windows, so as to use the Clang toolchain. Using Clang is important because Visual C++ doesn't do two-phase name lookup: it fails to find the dependencies of template classes on the classes they reference until those templates are instantiated, which makes the output very hard to read.

  • The source files are partially preprocessed using an internal preprocessor. This does not transform the tokenized source file itself, but determines which blocks of code are active and inactive for the current target, and splits each file into a series of fragments. Each fragment defines a range of lines in the source file between #include directives, meaning that a single-file version of any source file can be assembled at any point by recursively following #include directives and concatenating the list of encountered fragments.

  • This step is important, because it is the list of fragments that IncludeTool is going to optimize. There are several restrictions that IncludeTool validates during this phase and issues warnings about. Some warnings appear pedantic and stylistic, but they are required to ensure that the output is valid. The following are particularly notable:

    • There should be no circular includes between header files.

    • Each included source file must be preprocessed in the same way by every translation unit (for example, any macro must be defined the same way in every context in which the file is included).

  • Each fragment is written to the working folder. Each source file is output in its original layout to preserve line numbers, with lines belonging to other fragments being commented out.

  • Source files are tokenized and searched for patterns that look like symbols that can be forward declared (for example, "class Foo { ...").

  • Each source file is brute-force compiled to determine which fragments it depends on to compile successfully. This is the most time-consuming part of the transformation, but the results of this analysis are stored in the working directory and should be reusable if the source files do not change. The tool also supports a "sharded" mode for using multiple PCs to compute the dependency data (see below).

  • The search is structured as follows:

    • An input translation unit is expanded to be represented by a sequence of fragments (1...n).

    • A set of the required fragments (r) is initialized to only those fragments that were in the input file (rather than being included by the input file).

    • A binary search is performed to find the shortest leading sequence of fragments (1...m) that, together with r, still allows the source file to compile successfully.

    • The last fragment in the sequence (m) is added to the required set (r). If m has already been optimized, its dependencies are also added to r.

    • The binary search is repeated for the sequence of fragments (1...m - 1).

  • Each output file is written with a minimal set of includes for it to compile in isolation. The heuristic attempts to directly include the header for any dependency containing a symbol that is explicitly referenced by name, and only include other dependencies if they are not included recursively.
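The dependency search described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: it assumes the search looks for the shortest leading sequence of fragments that still compiles together with r, that adding fragments never breaks a build, and that the full sequence compiles. The names find_required_fragments, compiles, and known_deps are hypothetical, and fragments are numbered from 0 rather than 1.

```python
def find_required_fragments(num_fragments, own, compiles, known_deps=None):
    """Return the set of fragment indices needed to compile one source file.

    num_fragments: length of the expanded fragment sequence (0 .. n-1).
    own:           indices of fragments that belong to the input file itself.
    compiles:      hypothetical callback; takes a sorted list of fragment
                   indices and returns True if that subset compiles.
    known_deps:    optional map from an already-optimized fragment to the
                   set of fragments it is known to depend on.
    """
    known_deps = known_deps or {}
    required = set(own)      # r: starts with the file's own fragments
    upper = num_fragments    # candidates are the fragments in [0, upper)

    while upper > 0:
        # Binary search for the smallest prefix length m such that
        # fragments [0, m) plus the required set still compile.
        lo, hi = 0, upper
        while lo < hi:
            mid = (lo + hi) // 2
            if compiles(sorted(set(range(mid)) | required)):
                hi = mid
            else:
                lo = mid + 1

        if lo == 0:
            break            # the required set compiles on its own

        last = lo - 1        # dropping this fragment breaks the build
        required.add(last)
        required |= known_deps.get(last, set())
        upper = last         # repeat the search on the shorter prefix

    return required
```

Each pass of the outer loop identifies one required fragment using a logarithmic number of trial compiles, which is why the cost is driven by the number of real dependencies rather than by the total number of fragments. In practice the compiles callback would invoke the compiler on the fragments written to the working folder.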

Sample Usage

To perform fragment analysis and output diagnostic information:

-Mode=Scan -Target=FooEditor -Platform=Linux -Configuration=Development -WorkingDir=D:\Working

To optimize a codebase:

-Mode=Optimize -Target=FooEditor -Platform=Linux -Configuration=Development -WorkingDir=D:\Working -OutputDir=D:\Output -SourceFiles=-/Engine/... -OptimizeFiles=-/Engine/... -OutputFiles=-/Engine/...

Several filter arguments ( -SourceFiles , -OptimizeFiles , -OutputFiles ) may be passed to limit which source files are processed. Each can take a semicolon-separated list of P4-style wildcards to include ( /Engine/... ) or exclude ( -/Engine/Foo.cpp ) files, or may specify a response file containing one rule per line ( D:\Filter.txt ). In general, you probably don't want to re-optimize engine code, so it makes sense to exclude all engine files from analysis.
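To illustrate how such a filter list might be evaluated, here is a small sketch. It is not taken from the tool: it assumes that "..." matches any run of characters including directory separators, that "*" matches within a single path segment, that rules are applied in order with the last matching rule winning, and that files matched by no rule are included. The function names are hypothetical.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a P4-style wildcard into a compiled regular expression."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("...", i):
            parts.append(".*")        # "..." spans directory separators
            i += 3
        elif pattern[i] == "*":
            parts.append("[^/]*")     # "*" stays inside one path segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def parse_rules(spec):
    """Split a semicolon-separated rule list into (include, regex) pairs."""
    rules = []
    for item in spec.split(";"):
        item = item.strip()
        if not item:
            continue
        include = not item.startswith("-")   # a leading "-" means exclude
        pattern = item if include else item[1:]
        rules.append((include, wildcard_to_regex(pattern)))
    return rules


def is_selected(path, rules, default=True):
    """Apply the rules in order; the last rule that matches decides."""
    selected = default
    for include, regex in rules:
        if regex.fullmatch(path):
            selected = include
    return selected
```

For example, under these assumptions the rule list "-/Engine/...;/Engine/Source/Foo.cpp" would skip everything under /Engine except /Engine/Source/Foo.cpp, while leaving files outside /Engine selected.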

Since the program can take so long to run, it has a facility for running in "sharded" mode. If you have several PCs all synced to the same source tree, you can use the -Shard=N and -NumShards=M arguments to configure each machine to consider only a portion of the input set. The working directories should be in the same location on each machine. The resulting working directories can then be copied together and used for a final run on a single machine to generate the output files.
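The exact way the tool divides work between shards is not described here, so the following is only a sketch of one plausible scheme. It assumes shards are numbered from 1 and that files are assigned round-robin after sorting, so that every machine computes the same partition from the same source tree. The function name files_for_shard is hypothetical.

```python
def files_for_shard(files, shard, num_shards):
    """Return the files that one machine would process.

    files:      all input source file paths (any order).
    shard:      this machine's shard number, assumed to start at 1.
    num_shards: total number of machines taking part.
    """
    if not 1 <= shard <= num_shards:
        raise ValueError("shard must be between 1 and num_shards")
    # Sorting gives every machine the same ordering, so the slices
    # are disjoint and together cover the whole input set.
    ordered = sorted(files)
    return ordered[shard - 1::num_shards]
```

Whatever the real scheme is, the important properties are the same: each file lands in exactly one shard, and the assignment depends only on the shared source tree and the two arguments, not on the machine.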

Note: See the comments in Program.cs for other modes (look for the ToolMode enum) and command-line options (the CommandLineOptions class).
