This repository has been archived by the owner on Mar 30, 2021. It is now read-only.
Whisperity edited this page Aug 14, 2017 · 76 revisions

Clang Cross Translation Unit (CTU) Static Analysis

The goal of this project is to improve the Clang Static Analyzer so that it can detect bugs that span multiple translation units (TUs). CTU analysis was presented at EuroLLVM '17; see the submitted Extended abstract for a more in-depth overview.

Usage

To use CTU static analysis, you need to build a version of Clang that supports this feature (see Compilation). Invoking the analyzer requires some special arguments (for an in-depth explanation, see Approach), so we suggest using CodeChecker to invoke it (see Cross Translation Unit analysis with CodeChecker).

Compilation

You can build a CTU-capable version of Clang by checking out our repository. Build Clang with the usual procedure, but use the LLVM and clang-tools-extra Git commits listed below for the branch you choose.

Branches

The ctu-os branch collects commits and changes that are currently undergoing review by the community.

ctu-master and ctu-clang5 contain extra functionality that aims to make CTU more viable, especially for C++ projects. ctu-master follows the master version of Clang, while ctu-clang5 is branched from Clang 5.0 (currently at the release-candidate stage). We suggest building your Clang binaries from ctu-clang5.

Which LLVM commit to use?

Branch ctu-clang5 -> LLVM commit b20d324de517c95e5cb01e88f78855b3d0e10d51

Branch ctu-master -> LLVM commit 00708415fb45c18f9871def78647dd555c253e0b

Branch ctu-os -> LLVM commit 7dab9bfe3016988a518ea5868cbf0457d335a356

If you want to use clang-tools-extra (e.g. clang-tidy):

Branch ctu-clang5 -> CTE commit 619d067acc7165aed1bb8ff86f9579ec666777fa

Branch ctu-master -> CTE commit ea1b4cd563843284e8d20f132d63b6e85deadf70

Branch ctu-os -> CTE commit cdfb024e2f69e1466479278579623167799bca5f
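For scripted builds, the pins above can be kept in one table. The following is a minimal sketch that only builds the `git checkout` argument lists; the helper name `checkout_commands` and the repository keys are illustrative assumptions, and where each repository is cloned is left to your usual Clang build layout.

```python
# Commit pins copied from the lists above, keyed by CTU branch.
PINNED_COMMITS = {
    "ctu-clang5": {
        "llvm": "b20d324de517c95e5cb01e88f78855b3d0e10d51",
        "clang-tools-extra": "619d067acc7165aed1bb8ff86f9579ec666777fa",
    },
    "ctu-master": {
        "llvm": "00708415fb45c18f9871def78647dd555c253e0b",
        "clang-tools-extra": "ea1b4cd563843284e8d20f132d63b6e85deadf70",
    },
    "ctu-os": {
        "llvm": "7dab9bfe3016988a518ea5868cbf0457d335a356",
        "clang-tools-extra": "cdfb024e2f69e1466479278579623167799bca5f",
    },
}


def checkout_commands(branch):
    """Return {repository: argv} with the `git checkout` command for each pin.

    Hypothetical helper: it does not run anything, it only builds the
    argument lists so they can be passed to subprocess.run(..., cwd=<repo>).
    """
    pins = PINNED_COMMITS[branch]
    return {repo: ["git", "checkout", sha] for repo, sha in pins.items()}
```

Each returned command is meant to be run inside the matching repository checkout.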


Approach

Today, Clang SA can perform (context-sensitive) inter-procedural analysis by "inlining" the called function into the caller's context. This means that function parameters (including all constraints) are passed to the called function, and the return value of the function is passed back to the caller. This works well for function calls within a translation unit, but when the symbolic execution reaches a function that is implemented in another TU, the analyzer engine treats the call as "unknown".

In this project we are working on a method which enables CTU analysis by inlining external function definitions using Clang's existing ASTImporter functionality.
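To make the difference concrete, here is a toy model in Python. It is not how the analyzer is implemented. It only mimics the behaviour described above: a call is evaluated when the callee's body is visible, and yields an "unknown" value otherwise. All names (`UNKNOWN`, `call`, `may_divide_by_zero`, `get_divisor`) are hypothetical.

```python
UNKNOWN = object()  # stands in for the analyzer's "unknown" value


def call(visible_definitions, name, arg):
    """Evaluate the callee if its body is visible, otherwise return UNKNOWN."""
    body = visible_definitions.get(name)
    if body is None:
        return UNKNOWN
    return body(arg)


def may_divide_by_zero(visible_definitions):
    """Model the analysis of `100 / get_divisor(0)` in the caller.

    A bug is reported only when the divisor is known to be zero.
    """
    divisor = call(visible_definitions, "get_divisor", 0)
    return divisor is not UNKNOWN and divisor == 0


# The caller's TU only declares get_divisor; the body lives in another TU.
caller_tu = {}
other_tu = {"get_divisor": lambda x: x}

# Single-TU analysis: the call is unknown, so no bug is reported.
single_tu_report = may_divide_by_zero(caller_tu)

# CTU analysis: the external definition is made visible to the caller.
ctu_report = may_divide_by_zero({**caller_tu, **other_tu})
```

In this model `single_tu_report` is `False` and `ctu_report` is `True`: the division by zero only becomes detectable once the external definition is available for inlining.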

The EuroLLVM '17 Extended abstract contains a more in-depth description in white paper style.

Two-pass analysis

To perform the analysis, we need to run Clang on the whole source code twice.

1st pass

We generate a binary AST dump (using Clang's -cc1 -emit-pch feature) of each TU into a temporary directory called preanalyze-dir. We collect the Unified Symbol Resolution (USR) of all externally linkable functions into a text file (externalFnMap.txt).
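As a rough sketch of the first pass, the helper below builds one AST-dump command per source file. It assumes a `clang` binary on the PATH and an output naming scheme of `<file name>.ast` inside preanalyze-dir; both are illustrative assumptions. A real `-cc1` invocation also needs the TU's include paths, defines and language options, which are omitted here, and the USR collection step is not shown.

```python
from pathlib import PurePosixPath


def emit_ast_command(source, out_dir="preanalyze-dir"):
    """Build the argv that dumps the AST of one TU into out_dir.

    Hypothetical helper: the output name is derived from the source file
    name only, so two sources with the same base name would collide.
    """
    ast_path = PurePosixPath(out_dir) / (PurePosixPath(source).name + ".ast")
    return ["clang", "-cc1", "-emit-pch", "-o", str(ast_path), source]


def first_pass_commands(sources, out_dir="preanalyze-dir"):
    """Return one AST-dump command per translation unit."""
    return [emit_ast_command(src, out_dir) for src in sources]
```

The commands are only constructed, not executed, so they can be inspected or adapted to the project's real compile flags before being run.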

2nd pass

We run the Clang Static Analyzer on all translation units. If an externally defined function is reached during inlining, we look up the definition of that function in the corresponding AST file (based on the information in externalFnMap.txt) and import the function definition into the caller's context using the ASTImporter library.
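The lookup step can be pictured as a simple map from USR to AST file. The sketch below assumes that each line of externalFnMap.txt holds a USR and an AST file path separated by a single space; this layout, the sample USR strings and the helper names are assumptions made for illustration, not a description of the actual file format.

```python
def parse_fn_map(text):
    """Parse the map file contents into {usr: ast_file_path}.

    Assumed layout: one "<USR> <path>" pair per line, where the USR itself
    contains no spaces.
    """
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        usr, ast_path = line.split(" ", 1)
        mapping[usr] = ast_path
    return mapping


def find_definition(mapping, usr):
    """Return the AST file that defines `usr`, or None if it is not mapped."""
    return mapping.get(usr)


# Illustrative content only; real USRs and paths come from the first pass.
sample = """\
c:@F@get_divisor#I# preanalyze-dir/other.cpp.ast
c:@F@helper# preanalyze-dir/util.cpp.ast
"""
fn_map = parse_fn_map(sample)
```

When the lookup returns `None`, the call stays "unknown", exactly as in single-TU analysis.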

Results

We ran a comparative analysis on several open-source projects, including OpenSSL, FFmpeg, Git, Xerces, and tmux. CTU analysis found several additional bugs compared to the normal (non-cross-translation-unit-capable) analysis.

See the results, including memory usage and a comparison of the findings, at cc.elte.hu/.

Credits

This work is based on earlier work by Aleksei Sidorin, Artem Dergachev, et al. See http://lists.llvm.org/pipermail/cfe-dev/2015-October/045730.html.