In-depth: Incremental linking and the search for the Holy Grail
[In this reprinted #altdevblogaday in-depth piece, Bungie's engineer architect Andy Firth]
So you've done your full rebuild, waited out your now much shorter build time while talking with colleagues over coffee… but wait… you forgot to make that one edit to test your function… there, done… you see your file building… it's done and BAM: "Linking…"
Time to go get more coffee… this baby is going to take a while.
Typically on small programs linking is measured in mere seconds (often less) and is largely unnoticeable. However, on larger projects with millions of lines of code, heavy template usage and optimizations such as inline functions and weak references (__declspec(selectany)), your symbol count goes off the charts.
Combine this with the probable use of libraries to compartmentalize your code and the likelihood that you're compiling more than one target at a time (DLLs, tools, runtime), and the linker is likely doing a significant amount of work for you AFTER compiling your single file.
As an example, for me within our current code at Bungie, a single file change in the code I usually work on can cost as much as 10 minutes in linking (many targets). If I opt to build a single target (say the runtime), this is reduced to ~2 minutes. This is without any linker optimization such as Link Time Code Generation, which is known to heavily increase link time.
In my experience, I'd say most of my changes (and most of the changes my colleagues make) before doing a compile involve a single file. More often than not, it's simpler than that: a single line. So the iteration is something like this:
- make change
- hit compile
- 1-2 seconds of compile
- 1-2 mins to link single runtime target.
- load game
Now loading the game is always going to be a problem*, but the build itself is completely dominated by linking. If only that step could be faster.
Incremental linking sets up an .ilk file. The first time you link with /INCREMENTAL enabled, it will generate a database of the symbols within the target executable and the location that supplied them within the object files of said executable.
This file is used in subsequent links to cross-reference a changed object file (the single file you changed) with the final executable, allowing the linker to "patch" the symbols that are different with the new version. The patch is achieved using a code thunk, so the resultant code can be slower. The recommendation is to do a full rebuild before doing any performance testing.
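The thunk idea can be pictured with a toy model. This is only a sketch of the principle, not how link.exe actually lays out an image: the names `thunks`, `call`, `damage_v1`, `damage_v2` and `update_player` are all made up for illustration. Callers are bound to a slot rather than to the function body, so a patch only has to repoint the slot, at the price of one extra hop per call.

```python
# Toy model of thunk-based patching (illustrative only, all names hypothetical).

def damage_v1(base):
    return base * 2

def damage_v2(base):
    return base * 2 + 1  # the "single line" edit

# One thunk slot per symbol. Callers are bound to the slot, never to the body.
thunks = {"damage": damage_v1}

def call(symbol, *args):
    # The extra lookup stands in for the extra jump a thunk costs at run time.
    return thunks[symbol](*args)

def update_player(base):
    # This call site is never rewritten when "damage" is patched.
    return call("damage", base)

before = update_player(10)
thunks["damage"] = damage_v2  # the incremental "link": repoint one slot
after = update_player(10)
print(before, after)  # 20 21
```

The extra hop on every call is the reason a full rebuild is recommended before any performance testing.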
Incremental linking is on for link.exe by default. New projects generated with MSVC will explicitly disable it for release however. I would assume that most if not all of your simple programs are using it in debug. Sadly the most common "working" case is the one that needs it least; small programs have short link times.
In my experience the more complex your project, the less likely incremental linking is working for you. I have spoken with many engineers over the years about incremental linking and almost all believe it to be a flawed feature of MSVC. Sometimes it works, sometimes it doesn't. When it doesn't work, it actually increases link times and therefore, for many engineers, it isn't worth wrangling with.
I experienced the same: sometimes it works and it's fantastic. Sometimes it simply doesn't… most of the time it tells you a reason.
If your project uses any linker options that alter the symbols the linker uses or explicitly lists the order in which those symbols are used, then internally the linker will disable incremental linking (the MSDN docs explicitly mention this).
Using /sourcemap (an undocumented method of redirecting pdb source lookup) will automatically disable incremental linking. I don't yet understand the reasons for this; however, I know it disables it. It's undocumented, but several teams I know are using it, so I figured I would mention it here.
Even after avoiding the above, which is relatively easy to do explicitly within your linker options, I still had trouble getting incremental linking working. MSDN explains that a full link occurs when:
- The incremental status (.ilk) file is missing (LINK creates a new .ilk file in preparation for subsequent incremental linking).
- There is no write permission for the .ilk file (LINK ignores the .ilk file and links non-incrementally.)
- The .exe or .dll output file is missing.
- The timestamp of the .ilk, .exe, or .dll is changed.
- A LINK option is changed. Most LINK options, when changed between builds, cause a full link.
- An object (.obj) file is added or omitted.
- An object that was compiled with the /Yu /Z7 option is changed.
However, none of these were occurring (that I could tell).
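Three of the MSDN conditions above can be checked directly on disk, which makes for a quick sanity check before digging deeper. The sketch below is a hypothetical helper, not part of any Microsoft tooling; the function name and the example paths are assumptions. It deliberately ignores the conditions that cannot be seen from the file system alone (changed options, added or omitted objects, timestamps relative to the previous link).

```python
import os

def full_link_reasons(ilk_path, output_path):
    """Return the file-based reasons (from the MSDN list) that would force a full link."""
    reasons = []
    if not os.path.exists(ilk_path):
        reasons.append("missing .ilk")
    elif not os.access(ilk_path, os.W_OK):
        reasons.append(".ilk not writable")
    if not os.path.exists(output_path):
        reasons.append("missing output")
    return reasons

# Hypothetical paths; substitute your own target's output location.
print(full_link_reasons("build/game.ilk", "build/game.exe"))
```

An empty list only rules out these three causes; it does not prove the link will be incremental.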
After a few experiments and a lot of compiling, a question came to mind that I hadn't considered before.
What Is A (.lib) Library File
The MSDN definition is somewhat verbose but it boils down to: a library is a container for a set of objects. Link resolves external references by searching first in libraries specified then default libs.
My own understanding of libraries was woefully lacking. I believed a library to be a container for objects only: when you linked against a library, it linked in all the object files. The reality is very different, however. If you link with the command line option /VERBOSE, you can see for yourself what actually occurs. The rough algorithm is:
- Compile Target Source files (main.cpp etc)
- When linking:
  - For each symbol not found in Target Source Files
    - Search objects in supplied libs in the order supplied
    - When symbol found stop searching
  - end For
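The rough algorithm above can be sketched as a small simulation. This is a simplified model and not link.exe's real implementation: object files are reduced to a pair of sets (symbols defined, symbols needed), and every object and symbol name is hypothetical. One assumption goes slightly beyond the list: when a symbol is found, the whole object that defines it is pulled in and its own needs are queued.

```python
def link(target_objects, libs):
    """target_objects: list of (defines, needs) set pairs, always linked.
    libs: ordered list of libraries, each a dict of name -> (defines, needs)."""
    defined, undefined, pulled = set(), [], []
    for defines, needs in target_objects:
        defined |= defines
        undefined.extend(needs)
    while undefined:
        sym = undefined.pop()
        if sym in defined:
            continue
        for lib in libs:  # libs are searched in the order supplied
            hit = next((n for n, (d, _) in lib.items() if sym in d), None)
            if hit is not None:
                d, needs = lib[hit]
                defined |= d             # the whole object comes along
                undefined.extend(needs)  # and may need further symbols
                pulled.append(hit)
                break                    # stop searching once found
        else:
            raise LookupError(f"unresolved external symbol: {sym}")
    return pulled

engine_lib = {
    "render.obj": ({"draw"}, {"d3d_create"}),
    "audio.obj": ({"play_sound"}, set()),
    "math.obj": ({"dot"}, set()),
}
d3d_lib = {"d3d.obj": ({"d3d_create"}, set())}
main_obj = ({"main"}, {"dot"})

print(link([main_obj], [engine_lib, d3d_lib]))  # ['math.obj']
```

Only `math.obj` is pulled in; `render.obj`, `audio.obj` and the whole of the second library never reach the executable because nothing in the target asked for them.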
This essentially means that when linking an executable, Link.exe cherry picks the symbols it needs from a library and discards the rest. The upshot of this is effectively object file level dead-stripping as a first level feature of using a lib file. Now let's get back to incremental linking.
What Does That Mean For Incremental Linking With Lib Files
If you're simply linking against stable libs, incremental linking works fine. However, if the code you're editing is within a lib itself, it doesn't work. There seems to be no output from link.exe saying it's not going to link incrementally; it simply doesn't do it.
You can see this for yourself by using the /time+ option (another undocumented option) on the linker command line, which describes the various passes the linker does. Experiment yourself with changing a file within a lib and a file within the target exe.
I believe this all stems from a simple fact: libs are supposed to be stable from your build's perspective.
I have always used them for compartmentalization; a simple method to contain functionality within a single file (and therefore reference) rather than having each project have to contain a multitude of files it doesn't really need to know about.
I believe this is a very common usage pattern. Common enough that every team I've worked with used them in this way. I believe this is the main reason many engineers believe incremental linking is simply a broken feature.
But I Want My Cake
Given the information above, the solution "seems" simple. Simple enough that VS now provides an option for it based on work by Joshua Jensen who discusses it at length here
Within a standard VS2010 project browse to:
/Properties/Configuration Properties/Linker/General/Use Library Dependency Inputs
This option switches a "library" included from within your solution, to use the object files directly rather than the library file. This switches the library to my own previous understanding of lib files; all objects will be linked into the target explicitly as if they were Target source files. The lib file itself isn't used at all.
One issue, however: as I mentioned earlier, using a lib provides a feature I hadn't previously understood, "effectively object file level dead-stripping".
If your lib file contains symbols that have a dependency themselves, for instance D3D (very common), then using the above option will require you to explicitly list D3D as a dependency in your target project setup regardless of whether the target required the functionality itself. This could mean relatively few changes for you, or it could mean you have to split up your code to remove dependencies that only exist for specific target executables but not for all.
Dead-stripping (/OPT:REF) has to be off, so this switch will in turn enlarge your target executables.
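The dependency problem can be illustrated with another small model, again a sketch under assumptions rather than a description of the real linker. The lib contents, the `D3DCreate` symbol and the helper names are all hypothetical. It compares what a tool that only needs the math code is left owing when it links through the lib versus when every object is forced in.

```python
# Hypothetical objects in one static lib: name -> (symbols defined, symbols needed)
ENGINE = {
    "render.obj": ({"draw"}, {"D3DCreate"}),
    "math.obj": ({"dot"}, set()),
}

def unresolved(objects, roots=frozenset()):
    """Symbols that the target or the given objects need but none of them defines."""
    defined = set().union(*(d for d, _ in objects))
    needed = set(roots).union(*(n for _, n in objects))
    return needed - defined

def via_lib(lib, roots):
    """Lib behaviour: keep only the objects reachable from the target's needs."""
    kept, work = {}, list(roots)
    while work:
        sym = work.pop()
        for name, (d, n) in lib.items():
            if sym in d and name not in kept:
                kept[name] = (d, n)
                work.extend(n)
    return list(kept.values())

tool_needs = {"dot"}  # a tool that only uses the math code
with_lib = unresolved(via_lib(ENGINE, tool_needs), tool_needs)
with_objects = unresolved(list(ENGINE.values()), tool_needs)
print(with_lib, with_objects)  # set() {'D3DCreate'}
```

In this model the tool links cleanly against the lib, but once all objects are inputs it has to supply `D3DCreate` even though it never draws anything.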
Once incremental linking is active and working, you've removed the direct use of lib files for the code you're editing, and you've stopped dead-stripping for all local builds, you'll see a massive improvement in overall iteration times. For one of our tools projects, the link time reduced from ~5 minutes to ~10 seconds, and now that we're aware of the pitfalls and reasons for linker performance (using /VERBOSE /time+), we're reducing that even further.
For our main runtime, the link reduced from 2 minutes to 2-3 seconds, and a single file change including deployment was reduced from 145 seconds to ~10 seconds. This seems inconsequential when considered standalone. However, for engineers iterating on functionality within said runtime, this is huge and equates to an estimated 30-45 minutes per day per engineer.
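As a quick back-of-the-envelope check on those figures, the snippet below works out how many build iterations per day the 30-45 minute estimate implies. It uses only the numbers quoted above; the function name is made up for illustration.

```python
before_s, after_s = 145, 10                 # seconds per single-file iteration
saved_per_iteration_s = before_s - after_s  # 135 seconds saved each time

def iterations_for(minutes_saved_per_day):
    """Iterations per day needed to add up to the given daily saving."""
    return minutes_saved_per_day * 60 / saved_per_iteration_s

for minutes in (30, 45):
    print(f"{minutes} min/day is about {iterations_for(minutes):.1f} iterations/day")
```

That works out to roughly 13 to 20 single-file iterations per engineer per day.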
*Only The Penitent Man Shall Cross
Once you have resolved all the issues that can stop incremental linking from working, you are a short step from the Holy Grail. Something I personally have never had working before but will be actively fighting for in the near future. Consider the original process reformatted based on our above work:
- make change
- hit compile
- 1-2 seconds of compile
- 1-2 seconds to link runtime target. <= incremental linking is great
- load game
In this new iteration process, the initialization and loading of the data the target uses is very likely the main time sink. For a game, this usually means loading a level, instancing 1000s of game entities and starting all manner of systems. I've seen this be as fast as 30 seconds or as slow as 5-10 minutes depending on Build Target (Debug vs slightly optimized vs Release) and the data being tested.
Edit and Continue is a time-saving feature that allows the code to compile while in break mode; when the programmer steps or continues, the "new" code is compiled using incremental linking in-situ. It is non-trivial to set up and has all manner of caveats; however, depending on your specific circumstances, the win can be huge. Were Edit and Continue working, the above example becomes:
- hit compile and run
- 1-2 seconds of compile
- 1-2 seconds to link runtime target.
- load game (30 seconds to 5mins)
- run test within game
- break into program
- make small change
- continue running newly compiled program
For small changes, this can save a programmer the entire cost of loading the data and initializing the systems being tested. For my current situation, were I to get Edit and Continue working, I'd save around 3 minutes per iteration.
For me, being able to iterate on a test situation as fast as possible is paramount. Hopefully the above helps you reach your development goals faster and with a lot less coffee time.
Notes And References
Managed C++ cannot link incrementally. If you're combining managed and native code, consider splitting the native code into a DLL to achieve incremental linking. Obviously this is a lot more work and would likely deserve a cost-benefit analysis beforehand.
MSDN Incremental Linking Help: http://msdn.microsoft.com/en-us/library/4khtbfyf(v=VS.100).aspx
Edit and Continue: http://msdn.microsoft.com/en-us/library/esaeyddf.aspx
Forum post re: incremental Linking: http://bytes.com/topic/net/answers/281196-incremental-linking-multiple-projects
Link Time Code Generation: http://msdn.microsoft.com/en-us/magazine/cc301698.aspx
[This piece was reprinted from #AltDevBlogADay, a shared blog initiative started by @mike_acton devoted to giving game developers of all disciplines a place to motivate each other to write regularly about their personal game development passions.]