Initial manual commit of documentation
Signed-off-by: John Pennycook <[email protected]>
0 parents
commit 7deaef0
Showing 47 changed files with 7,630 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,110 @@ | ||
Performing Analysis | ||
=================== | ||
|
||
The main interface of CBI is the ``codebasin`` script, which can be invoked to | ||
analyze a code base and produce various reports. Although CBI ships with other | ||
interfaces specialized for certain use-cases, ``codebasin`` supports an | ||
end-to-end workflow that should be preferred for general usage. | ||
|
||
The simplest way to invoke ``codebasin`` is as shown below:: | ||
|
||
$ codebasin analysis.toml | ||
|
||
...but what is ``analysis.toml``? We need to use this file to tell CBI which | ||
files are part of the code base, and where it should look to find the | ||
compilation databases defining our platforms. | ||
|
||
.. note:: | ||
|
||
The TOML file can have any name, but we'll use "analysis.toml" throughout | ||
this tutorial. | ||
|
||
|
||
Defining Platforms | ||
################## | ||
|
||
Each platform definition is a TOML `table`_, of the form shown below: | ||
|
||
.. _`table`: https://toml.io/en/v1.0.0#table | ||
|
||
.. code-block:: toml | ||
[platform.name] | ||
commands = "/path/to/compile_commands.json" | ||
The table's name is the name of the platform, and we can use any meaningful | ||
string. The ``commands`` key tells CBI where to find the compilation database | ||
for this platform. | ||
|
||
In our example, we have two platforms that we're calling "cpu" and "gpu", | ||
and our build directories are called ``build-cpu`` and ``build-gpu``, so | ||
our platform definitions should look like this: | ||
|
||
.. code-block:: toml | ||
[platform.cpu] | ||
commands = "build-cpu/compile_commands.json" | ||
[platform.gpu] | ||
commands = "build-gpu/compile_commands.json" | ||
.. warning:: | ||
Platform names are case sensitive! The names "cpu" and "CPU" would refer to | ||
two different platforms. | ||
|
||
|
||
Running ``codebasin`` | ||
##################### | ||
|
||
Running ``codebasin`` with this analysis file gives the following output: | ||
|
||
.. code-block:: text | ||
:emphasize-lines: 4,5,6,7,9 | ||
----------------------- | ||
Platform Set LOC % LOC | ||
----------------------- | ||
{} 2 6.06 | ||
{cpu} 7 21.21 | ||
{gpu} 7 21.21 | ||
{cpu, gpu} 17 51.52 | ||
----------------------- | ||
Code Divergence: 0.45 | ||
Unused Code (%): 6.06 | ||
Total SLOC: 33 | ||
Distance Matrix | ||
-------------- | ||
cpu gpu | ||
-------------- | ||
cpu 0.00 0.45 | ||
gpu 0.45 0.00 | ||
The results show that there are 2 lines of code that are unused by any | ||
platform, 7 lines of code used only by the CPU compilation, 7 lines of code | ||
used only by the GPU compilation, and 17 lines of code shared by both | ||
platforms. Plugging these numbers into the equation for code divergence gives | ||
0.45. | ||
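
The arithmetic behind these numbers can be reproduced with a short script. This is a sketch under one assumption that this page does not state: that code divergence is the average pairwise Jaccard distance between the sets of lines used by each platform. The function names are hypothetical and are not part of CBI.

```python
from itertools import combinations

# Lines of code per platform set, copied from the report above.
loc_by_platform_set = {
    frozenset(): 2,
    frozenset({"cpu"}): 7,
    frozenset({"gpu"}): 7,
    frozenset({"cpu", "gpu"}): 17,
}


def jaccard_distance(a, b, loc):
    """Return 1 - |lines used by both| / |lines used by either|."""
    used_by_either = sum(n for s, n in loc.items() if a in s or b in s)
    used_by_both = sum(n for s, n in loc.items() if a in s and b in s)
    if used_by_either == 0:
        return 0.0
    return 1 - used_by_both / used_by_either


def code_divergence(loc):
    """Average the Jaccard distance over all pairs of platforms (assumed definition)."""
    platforms = sorted(set().union(*loc))
    pairs = list(combinations(platforms, 2))
    if not pairs:
        return 0.0
    return sum(jaccard_distance(a, b, loc) for a, b in pairs) / len(pairs)


def unused_percent(loc):
    """Percentage of lines that are not used by any platform."""
    total = sum(loc.values())
    return 100 * loc.get(frozenset(), 0) / total


if __name__ == "__main__":
    print(f"Code Divergence: {code_divergence(loc_by_platform_set):.2f}")
    print(f"Unused Code (%): {unused_percent(loc_by_platform_set):.2f}")
    print(f"Total SLOC: {sum(loc_by_platform_set.values())}")
```

With two platforms there is only one pair, so the divergence is 1 - 17 / (7 + 7 + 17), which rounds to 0.45. The 2 unused lines are excluded from the distance because neither platform uses them.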
|
||
|
||
Filtering Platforms | ||
################### | ||
|
||
When working with an application that supports lots of platforms, we may want | ||
to limit the analysis to a subset of the platforms defined in the analysis | ||
file. | ||
|
||
Rather than require a separate analysis file for each possible subset, we can | ||
use the :code:`--platform` flag (or :code:`-p` flag) to specify the subset of | ||
interest on the command line: | ||
|
||
.. code:: sh | ||
$ codebasin -p [PLATFORM 1] -p [PLATFORM 2] analysis.toml | ||
For example, we can limit the analysis of our sample code base to the cpu | ||
platform as follows: | ||
|
||
.. code:: sh | ||
$ codebasin -p cpu analysis.toml |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
Command Line Interface | ||
====================== | ||
|
||
.. code-block:: text | ||
codebasin [-h] [--version] [-v] [-q] [-R <report>] [-x <pattern>] [-p <platform>] [<analysis-file>] | ||
**positional arguments:** | ||
|
||
``analysis-file`` | ||
TOML file describing the analysis to be performed, | ||
including the code base and platform descriptions. | ||
|
||
**options:** | ||
|
||
``-h, --help`` | ||
Show help message and exit. | ||
|
||
``--version`` | ||
Display version information and exit. | ||
|
||
``-v, --verbose`` | ||
Increase verbosity level. | ||
|
||
``-q, --quiet`` | ||
Decrease verbosity level. | ||
|
||
``-R <report>`` | ||
Generate a report of the specified type. | ||
|
||
- ``summary``: output only code divergence information. | ||
- ``clustering``: output only distance matrix and dendrogram. | ||
- ``all``: generate both summary and clustering reports. | ||
|
||
``-x <pattern>, --exclude <pattern>`` | ||
Exclude files matching this pattern from the code base. | ||
May be specified multiple times. | ||
|
||
``-p <platform>, --platform <platform>`` | ||
Include the specified platform in the analysis. | ||
May be specified multiple times. | ||
If not specified, all platforms will be included. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,130 @@ | ||
Compilation Databases | ||
===================== | ||
|
||
Before it can analyze a code base, CBI needs to know how each source file is | ||
compiled. Just like a compiler, CBI requires a full list of include paths, | ||
macro definitions and other options in order to identify which code is used | ||
by each platform. Rather than require all of this information to be specified | ||
manually, CBI reads it from a `compilation database`_. | ||
|
||
|
||
Generating a Compilation Database | ||
################################# | ||
|
||
Since our sample code base is already set up with a ``CMakeLists.txt`` file, we | ||
can ask CMake to generate the compilation database for us with the | ||
:code:`CMAKE_EXPORT_COMPILE_COMMANDS` option: | ||
|
||
.. code-block:: cmake | ||
:emphasize-lines: 4 | ||
cmake_minimum_required(VERSION 3.5) | ||
project(tutorial) | ||
set(CMAKE_EXPORT_COMPILE_COMMANDS ON) | ||
set(SOURCES main.cpp third-party/library.cpp) | ||
option(GPU_OFFLOAD "Enable GPU offload." OFF) | ||
if (GPU_OFFLOAD) | ||
add_definitions("-D GPU_OFFLOAD=1") | ||
list(APPEND SOURCES gpu/foo.cpp) | ||
else() | ||
list(APPEND SOURCES cpu/foo.cpp) | ||
endif() | ||
add_executable(tutorial ${SOURCES}) | ||
.. important:: | ||
For projects that don't use CMake, we can use `Bear`_ to intercept the | ||
commands generated by other build systems (such as GNU makefiles). Other | ||
build systems and tools that produce compilation databases should also be | ||
compatible. | ||
|
||
.. _`compilation database`: https://clang.llvm.org/docs/JSONCompilationDatabase.html | ||
.. _`Bear`: https://github.com/rizsotto/Bear | ||
|
||
|
||
CPU Compilation Commands | ||
------------------------ | ||
|
||
Let's start by running CMake without the :code:`GPU_OFFLOAD` option enabled, to | ||
obtain a compilation database for the CPU: | ||
|
||
.. code :: sh | ||
$ mkdir build-cpu | ||
$ cd build-cpu | ||
$ cmake ../ | ||
$ ls | ||
CMakeCache.txt CMakeFiles Makefile cmake_install.cmake compile_commands.json | ||
This :code:`compile_commands.json` file includes all the commands required to | ||
build the code, corresponding to the commands that would be executed if we were | ||
to actually run :code:`make`. | ||
|
||
.. attention:: | ||
CMake generates compilation databases when the ``cmake`` command is | ||
executed, allowing us to generate compilation databases without also | ||
building the application. Other tools (like Bear) may require a build. | ||
|
||
In this case, the compilation database contains: | ||
|
||
.. code :: json | ||
[ | ||
{ | ||
"directory": "/home/username/src/build-cpu", | ||
"command": "/usr/bin/c++ -o CMakeFiles/tutorial.dir/main.cpp.o -c /home/username/src/main.cpp", | ||
"file": "/home/username/src/main.cpp" | ||
}, | ||
{ | ||
"directory": "/home/username/src/build-cpu", | ||
"command": "/usr/bin/c++ -o CMakeFiles/tutorial.dir/third-party/library.cpp.o -c /home/username/src/third-party/library.cpp", | ||
"file": "/home/username/src/third-party/library.cpp" | ||
}, | ||
{ | ||
"directory": "/home/username/src/build-cpu", | ||
"command": "/usr/bin/c++ -o CMakeFiles/tutorial.dir/cpu/foo.cpp.o -c /home/username/src/cpu/foo.cpp", | ||
"file": "/home/username/src/cpu/foo.cpp" | ||
} | ||
] | ||
GPU Compilation Commands | ||
------------------------ | ||
|
||
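Repeating the exercise with :code:`GPU_OFFLOAD` enabled (for example, by | ||
passing :code:`-DGPU_OFFLOAD=ON` to ``cmake`` from a ``build-gpu`` directory) | ||
gives us a different compilation database for the GPU. | ||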
|
||
.. warning:: | ||
The ``GPU_OFFLOAD`` option is specific to this ``CMakeLists.txt`` file, and | ||
isn't something provided by CMake. Understanding how to build an application | ||
for a specific target platform is beyond the scope of this tutorial. | ||
|
||
As expected, we can see that the compilation database refers to ``gpu/foo.cpp`` | ||
instead of ``cpu/foo.cpp``, and that the ``GPU_OFFLOAD`` macro is defined as part | ||
of each compilation command: | ||
|
||
.. code :: json | ||
[ | ||
{ | ||
"directory": "/home/username/src/build-gpu", | ||
"command": "/usr/bin/c++ -D GPU_OFFLOAD=1 -o CMakeFiles/tutorial.dir/main.cpp.o -c /home/username/src/main.cpp", | ||
"file": "/home/username/src/main.cpp" | ||
}, | ||
{ | ||
"directory": "/home/username/src/build-gpu", | ||
"command": "/usr/bin/c++ -D GPU_OFFLOAD=1 -o CMakeFiles/tutorial.dir/third-party/library.cpp.o -c /home/username/src/third-party/library.cpp", | ||
"file": "/home/username/src/third-party/library.cpp" | ||
}, | ||
{ | ||
"directory": "/home/username/src/build-gpu", | ||
"command": "/usr/bin/c++ -D GPU_OFFLOAD=1 -o CMakeFiles/tutorial.dir/gpu/foo.cpp.o -c /home/username/src/gpu/foo.cpp", | ||
"file": "/home/username/src/gpu/foo.cpp" | ||
} | ||
] | ||
These differences are the result of code divergence. We'll explore how to use | ||
``codebasin`` to measure the *amount* of code divergence in a later tutorial. |
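
To see these differences without reading the JSON by hand, we can compare the two databases with a few lines of Python. This is only an illustrative sketch: the ``summarize`` helper is a hypothetical name, the sample data is abbreviated to one entry per database, and the code assumes each entry has a ``command`` string (some tools write an ``arguments`` list instead).

```python
import json
import shlex

# Abbreviated sample databases, based on the entries shown above.
CPU_DATABASE = """[
  {"directory": "/home/username/src/build-cpu",
   "command": "/usr/bin/c++ -o CMakeFiles/tutorial.dir/cpu/foo.cpp.o -c /home/username/src/cpu/foo.cpp",
   "file": "/home/username/src/cpu/foo.cpp"}
]"""

GPU_DATABASE = """[
  {"directory": "/home/username/src/build-gpu",
   "command": "/usr/bin/c++ -D GPU_OFFLOAD=1 -o CMakeFiles/tutorial.dir/gpu/foo.cpp.o -c /home/username/src/gpu/foo.cpp",
   "file": "/home/username/src/gpu/foo.cpp"}
]"""


def summarize(database_text):
    """Return the set of source files and the set of -D macro definitions."""
    files, defines = set(), set()
    for entry in json.loads(database_text):
        files.add(entry["file"])
        args = shlex.split(entry["command"])
        for i, arg in enumerate(args):
            if arg == "-D" and i + 1 < len(args):
                defines.add(args[i + 1])  # "-D NAME=VALUE" as two arguments
            elif arg.startswith("-D") and len(arg) > 2:
                defines.add(arg[2:])  # "-DNAME=VALUE" as one argument
    return files, defines


if __name__ == "__main__":
    cpu_files, cpu_defines = summarize(CPU_DATABASE)
    gpu_files, gpu_defines = summarize(GPU_DATABASE)
    print("Only compiled for cpu:", sorted(cpu_files - gpu_files))
    print("Only compiled for gpu:", sorted(gpu_files - cpu_files))
    print("Only defined for gpu:", sorted(gpu_defines - cpu_defines))
```

To compare real build directories, the sample strings could be replaced with the contents of ``build-cpu/compile_commands.json`` and ``build-gpu/compile_commands.json``.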