Merge pull request #85 from Pennycook/docs
Rewrite the documentation
Pennycook authored Mar 25, 2024
2 parents 5d0a6a0 + ae283fe commit 2d2c308
Showing 48 changed files with 1,267 additions and 563 deletions.
131 changes: 131 additions & 0 deletions CODE_OF_CONDUCT.md
@@ -0,0 +1,131 @@
# Contributor Covenant Code of Conduct

## Our Pledge

We as members, contributors, and leaders pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, caste, color, religion, or sexual
identity and orientation.

We pledge to act and interact in ways that contribute to an open, welcoming,
diverse, inclusive, and healthy community.

## Our Standards

Examples of behavior that contributes to a positive environment for our
community include:

* Demonstrating empathy and kindness toward other people
* Being respectful of differing opinions, viewpoints, and experiences
* Giving and gracefully accepting constructive feedback
* Accepting responsibility and apologizing to those affected by our mistakes,
and learning from the experience
* Focusing on what is best not just for us as individuals, but for the overall
community

Examples of unacceptable behavior include:

* The use of sexualized language or imagery, and sexual attention or advances of
any kind
* Trolling, insulting or derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or email address,
without their explicit permission
* Other conduct which could reasonably be considered inappropriate in a
professional setting

## Enforcement Responsibilities

Community leaders are responsible for clarifying and enforcing our standards of
acceptable behavior and will take appropriate and fair corrective action in
response to any behavior that they deem inappropriate, threatening, offensive,
or harmful.

Community leaders have the right and responsibility to remove, edit, or reject
comments, commits, code, wiki edits, issues, and other contributions that are
not aligned to this Code of Conduct, and will communicate reasons for moderation
decisions when appropriate.

## Scope

This Code of Conduct applies within all community spaces, and also applies when
an individual is officially representing the community in public spaces.
Examples of representing our community include using an official e-mail address,
posting via an official social media account, or acting as an appointed
representative at an online or offline event.

## Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported to the community leaders responsible for enforcement at
CommunityCodeOfConduct AT intel DOT com.
All complaints will be reviewed and investigated promptly and fairly.

All community leaders are obligated to respect the privacy and security of the
reporter of any incident.

## Enforcement Guidelines

Community leaders will follow these Community Impact Guidelines in determining
the consequences for any action they deem in violation of this Code of Conduct:

### 1. Correction

**Community Impact**: Use of inappropriate language or other behavior deemed
unprofessional or unwelcome in the community.

**Consequence**: A private, written warning from community leaders, providing
clarity around the nature of the violation and an explanation of why the
behavior was inappropriate. A public apology may be requested.

### 2. Warning

**Community Impact**: A violation through a single incident or series of
actions.

**Consequence**: A warning with consequences for continued behavior. No
interaction with the people involved, including unsolicited interaction with
those enforcing the Code of Conduct, for a specified period of time. This
includes avoiding interactions in community spaces as well as external channels
like social media. Violating these terms may lead to a temporary or permanent
ban.

### 3. Temporary Ban

**Community Impact**: A serious violation of community standards, including
sustained inappropriate behavior.

**Consequence**: A temporary ban from any sort of interaction or public
communication with the community for a specified period of time. No public or
private interaction with the people involved, including unsolicited interaction
with those enforcing the Code of Conduct, is allowed during this period.
Violating these terms may lead to a permanent ban.

### 4. Permanent Ban

**Community Impact**: Demonstrating a pattern of violation of community
standards, including sustained inappropriate behavior, harassment of an
individual, or aggression toward or disparagement of classes of individuals.

**Consequence**: A permanent ban from any sort of public interaction within the
community.

## Attribution

This Code of Conduct is adapted from the [Contributor Covenant][homepage],
version 2.1, available at
[https://www.contributor-covenant.org/version/2/1/code_of_conduct.html][v2.1].

Community Impact Guidelines were inspired by
[Mozilla's code of conduct enforcement ladder][Mozilla CoC].

For answers to common questions about this code of conduct, see the FAQ at
[https://www.contributor-covenant.org/faq][FAQ]. Translations are available at
[https://www.contributor-covenant.org/translations][translations].

[homepage]: https://www.contributor-covenant.org
[v2.1]: https://www.contributor-covenant.org/version/2/1/code_of_conduct.html
[Mozilla CoC]: https://github.com/mozilla/diversity
[FAQ]: https://www.contributor-covenant.org/faq
[translations]: https://www.contributor-covenant.org/translations
132 changes: 71 additions & 61 deletions README.md
@@ -1,72 +1,38 @@
# Code Base Investigator
Code Base Investigator (CBI) is a tool designed to help developers reason about the use of _specialization_ (i.e. code written specifically to provide support for or improve performance on some set of platforms) in a code base. Specialization is often necessary, but how a developer chooses to express it may impact code portability and future maintenance costs.

The [definition of platform](https://doi.org/10.1016/j.future.2017.08.007) used by CBI is deliberately very flexible and completely user-defined; a platform can represent any execution environment for which code may be specialized. A platform could be a compiler, an operating system, a micro-architecture or some combination of these options.
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.5018974.svg)](https://doi.org/10.5281/zenodo.5018974)
[![OpenSSF Best Practices](https://www.bestpractices.dev/projects/8679/badge)](https://www.bestpractices.dev/projects/8679)

## Code Divergence
CBI measures the amount of specialization in a code base using [code divergence](http://doi.org/10.1109/P3HPC.2018.00006), which is defined as the arithmetic mean pair-wise distance between the code-paths used by each platform.
Code Base Investigator (CBI) is an analysis tool that provides insight into the
portability and maintainability of an application's source code.

At the two extremes, a code divergence of 0 means that all of the platforms use exactly the same code, while a code divergence of 1 means that there is no code shared between any of the platforms. The code divergence of real codes will fall somewhere in between.
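
The definition above can be illustrated with a minimal sketch. It assumes that the pair-wise distance is the Jaccard distance between the sets of source lines used by two platforms; the text above only says "pair-wise distance", so treat that choice as an assumption. The platform names and line sets are hypothetical, and this is not CBI's own implementation.

```python
from itertools import combinations


def jaccard_distance(a: set, b: set) -> float:
    """Return 1 - |a & b| / |a | b|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def code_divergence(lines_by_platform: dict) -> float:
    """Arithmetic mean of the pair-wise distances between platforms."""
    pairs = list(combinations(lines_by_platform.values(), 2))
    if not pairs:
        return 0.0
    return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical platforms, each mapped to the set of source lines it compiles.
identical = {"cpu": {1, 2, 3}, "gpu": {1, 2, 3}}
disjoint = {"cpu": {1, 2}, "gpu": {3, 4}}
mixed = {"cpu": {1, 2, 3}, "gpu": {1, 2, 4}}

print(code_divergence(identical))  # 0.0: all platforms use the same code
print(code_divergence(disjoint))   # 1.0: no code is shared
print(code_divergence(mixed))      # 0.5: two of four distinct lines are shared
```
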
- Measure [code divergence](http://doi.org/10.1109/P3HPC.2018.00006) to
understand how much code is specialized for different compilers, operating
systems, hardware micro-architectures and more.

## How it Works
![Abstract Syntax Tree](./docs/example-ast.png)
- Visualize the distance between the code paths used to support different
compilation targets.

CBI tracks specialization in two forms: source files that are not compiled for all platforms; and regions of source files that are guarded by C preprocessor directives (e.g. `#ifdef`). A typical run of CBI consists of a three step process:
1) Extract source files and compilation commands from a configuration file or compilation database.
2) Build an AST representing which source lines of code (LOC) are associated with each specialization.
3) Record which specializations are used by each platform.
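
Steps 2 and 3 can be sketched on a toy scale. The example below is only an illustration: it handles `#ifdef`, `#ifndef`, `#else` and `#endif` with a simple stack rather than building an AST, and the function name `lines_used`, the sample source and the platform definitions are all hypothetical. It is not how CBI itself parses code.

```python
def lines_used(source: str, defined: set) -> set:
    """Return the 1-based numbers of the code lines kept for one platform."""
    used = set()
    stack = []  # one bool per open #ifdef/#ifndef: is that branch active?
    for number, line in enumerate(source.splitlines(), start=1):
        text = line.strip()
        if text.startswith("#ifdef"):
            stack.append(text.split()[1] in defined)
        elif text.startswith("#ifndef"):
            stack.append(text.split()[1] not in defined)
        elif text.startswith("#else"):
            stack[-1] = not stack[-1]
        elif text.startswith("#endif"):
            stack.pop()
        elif text and all(stack):
            # A non-empty code line is used only if every enclosing
            # conditional branch is active for this platform.
            used.add(number)
    return used


SOURCE = """\
void scale(float *x, int n) {
#ifdef USE_GPU
    launch_kernel(x, n);
#else
    for (int i = 0; i < n; ++i) x[i] *= 2;
#endif
}
"""

# Hypothetical platforms, each described by the macros it defines.
platforms = {"cpu": set(), "gpu": {"USE_GPU"}}
usage = {name: lines_used(SOURCE, macros) for name, macros in platforms.items()}

print(sorted(usage["cpu"]))                 # [1, 5, 7]
print(sorted(usage["gpu"]))                 # [1, 3, 7]
print(sorted(usage["cpu"] & usage["gpu"]))  # [1, 7] are shared by both
```
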
- Identify stale, legacy code paths that are unused by any compilation target.

## Usage
- Export metrics and code path information required for P3 analysis using [other
tools](https://intel.github.io/p3-analysis-library/).

The `codebasin` script analyzes a code base described in a YAML configuration file and produces one or more output reports. Example configuration files can be found in the [examples](./examples) directory; see the [configuration file documentation](docs/configuration.md) for a detailed description of the configuration file format.

To see a complete list of `codebasin` options, run `codebasin -h`.
## Table of Contents

> [!IMPORTANT]
> In previous releases of Code Base Investigator, the main script was called `codebasin.py`. The old naming was a bug that needed to be fixed, and we made the difficult decision to rename the script ahead of the next major release.
- [Dependencies](#dependencies)
- [Installation](#installation)
- [Getting Started](#getting-started)
- [Contribute](#contribute)
- [License](#license)
- [Security](#security)
- [Code of Conduct](#code-of-conduct)
- [Citations](#citations)

### Summary Report
The summary report (`-R summary`) gives a high-level summary of a code base, as shown below:
```
---------------------------------------------
Platform Set LOC % LOC
---------------------------------------------
{} 2 4.88
{GPU 1} 1 2.44
{GPU 2} 1 2.44
{CPU 2} 1 2.44
{CPU 1} 1 2.44
{FPGA} 14 34.15
{GPU 2, GPU 1} 6 14.63
{CPU 1, CPU 2} 6 14.63
{FPGA, CPU 1, GPU 2, GPU 1, CPU 2} 9 21.95
---------------------------------------------
Code Divergence: 0.55
Unused Code (%): 4.88
Total SLOC: 41
```
Each row in the table shows the amount of code that is unique to a given set of platforms. Listed below the table are the computed code divergence, the amount of code in the code base that was not compiled for any platform, and the total size of the code base.

### Clustering Report
The clustering report (`-R clustering`) consists of a pair-wise distance matrix, showing the ratio of platform-specific code to code used by both platforms. These distances are the same as those used to compute code divergence.
```
Distance Matrix
-----------------------------------
FPGA CPU 1 GPU 2 GPU 1 CPU 2
-----------------------------------
FPGA 0.00 0.70 0.70 0.70 0.70
CPU 1 0.70 0.00 0.61 0.61 0.12
GPU 2 0.70 0.61 0.00 0.12 0.61
GPU 1 0.70 0.61 0.12 0.00 0.61
CPU 2 0.70 0.12 0.61 0.61 0.00
-----------------------------------
```
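
The numbers in both sample reports can be cross-checked with a short sketch. It assumes the distance between two platforms is the Jaccard distance over lines of code (an assumption, although it does reproduce the values shown above), and it copies the LOC counts from the sample summary report. The variable and function names are illustrative only.

```python
from itertools import combinations

# LOC per platform set, copied from the sample summary report.
LOC_BY_SET = {
    frozenset(): 2,
    frozenset({"GPU 1"}): 1,
    frozenset({"GPU 2"}): 1,
    frozenset({"CPU 2"}): 1,
    frozenset({"CPU 1"}): 1,
    frozenset({"FPGA"}): 14,
    frozenset({"GPU 2", "GPU 1"}): 6,
    frozenset({"CPU 1", "CPU 2"}): 6,
    frozenset({"FPGA", "CPU 1", "GPU 2", "GPU 1", "CPU 2"}): 9,
}
PLATFORMS = ["FPGA", "CPU 1", "GPU 2", "GPU 1", "CPU 2"]


def distance(a: str, b: str) -> float:
    """Jaccard distance between the lines used by platforms a and b."""
    both = sum(loc for s, loc in LOC_BY_SET.items() if a in s and b in s)
    either = sum(loc for s, loc in LOC_BY_SET.items() if a in s or b in s)
    return 1.0 - both / either


total = sum(LOC_BY_SET.values())
unused_pct = 100.0 * LOC_BY_SET[frozenset()] / total

matrix = {(a, b): distance(a, b) for a in PLATFORMS for b in PLATFORMS}
pairs = list(combinations(PLATFORMS, 2))
divergence = sum(matrix[p] for p in pairs) / len(pairs)

print(total)                              # 41
print(f"{unused_pct:.2f}")                # 4.88
print(f"{matrix['FPGA', 'CPU 1']:.2f}")   # 0.70
print(f"{matrix['CPU 1', 'CPU 2']:.2f}")  # 0.12
print(f"{matrix['GPU 2', 'CPU 1']:.2f}")  # 0.61
print(f"{divergence:.2f}")                # 0.55
```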

The distances can also be used to produce a dendrogram, showing the result of hierarchical clustering by platform similarity:

![Dendrogram](./docs/example-dendrogram.png)

## Dependencies

- jsonschema
- Matplotlib
- NumPy
@@ -75,15 +41,59 @@ The distances can also be used to produce a dendrogram, showing the result of hi
- PyYAML
- SciPy

CBI and its dependencies can be installed using `setup.py`:

## Installation

The latest release of CBI is version 1.2.0. To download and install this
release, run the following:

```
python3 setup.py install
git clone --branch 1.2.0 https://github.com/intel/code-base-investigator.git
cd code-base-investigator
pip install .
```

The master branch of CBI is the development branch, and should not be used in production. Tagged releases are available [here](https://github.com/intel/code-base-investigator/releases).
We strongly recommend installing CBI within a [virtual
environment](https://docs.python.org/3/library/venv.html).

## Getting Started

After installation, run `codebasin -h` to see a complete list of options.

A full tutorial can be found in the [online
documentation](https://intel.github.io/code-base-investigator/).


## Contribute

Contributions to CBI are welcome in the form of issues and pull requests.

See [CONTRIBUTING](CONTRIBUTING.md) for more information.


## License

[BSD 3-Clause](./LICENSE)

## Contributing
See the [contribution guidelines](./CONTRIBUTING.md) for details.

## Security

See [SECURITY](SECURITY.md) for more information.

The main branch of CBI is the development branch, and should not be used in
production. Tagged releases are available
[here](https://github.com/intel/code-base-investigator/releases).


## Code of Conduct

Intel has adopted the Contributor Covenant as the Code of Conduct for all of
its open source projects. See [CODE OF CONDUCT](CODE_OF_CONDUCT.md) for more
information.


## Citations

If your use of CBI results in a research publication, please consider citing
the software and/or the papers that inspired its functionality (as
appropriate). See [CITATION](CITATION.cff) for more information.
28 changes: 18 additions & 10 deletions bin/codebasin
@@ -42,6 +42,13 @@ def main():
     # Read command-line arguments
     parser = argparse.ArgumentParser(
         description="Code Base Investigator " + str(version),
+        add_help=False,
+    )
+    parser.add_argument(
+        "-h",
+        "--help",
+        action="help",
+        help="Display help message and exit.",
     )
     parser.add_argument(
         "--version",
@@ -54,58 +61,59 @@ def main():
         "-r",
         "--rootdir",
         dest="rootdir",
-        metavar="DIR",
+        metavar="<dir>",
         default=None,
-        help="Set working root directory (default .)",
+        help="Set working root directory. "
+        + "Defaults to current working directory.",
     )
     deprecated_args.add_argument(
         "-c",
         "--config",
         dest="config_file",
         metavar="<config-file>",
         action="store",
-        help="Configuration YAML file. " + "Defaults to config.yaml",
+        help="Configuration YAML file. " + "Defaults to config.yaml.",
     )
     parser.add_argument(
         "-v",
         "--verbose",
         dest="verbose",
         action="count",
         default=0,
-        help="increase verbosity level",
+        help="Increase verbosity level.",
     )
     parser.add_argument(
         "-q",
         "--quiet",
         dest="quiet",
         action="count",
         default=0,
-        help="decrease verbosity level",
+        help="Decrease verbosity level.",
     )
     parser.add_argument(
         "-R",
         "--report",
         dest="reports",
-        metavar="REPORT",
+        metavar="<report>",
         default=["all"],
         choices=["all", "summary", "clustering"],
         nargs="+",
-        help="desired output reports (default: all)",
+        help="Generate a report of the specified type.",
     )
     deprecated_args.add_argument(
         "-d",
         "--dump",
         dest="dump",
         metavar="<file.json>",
         action="store",
-        help="dump out annotated platform/parsing tree to <file.json>",
+        help="Dump out annotated platform/parsing tree to <file.json>.",
     )
     deprecated_args.add_argument(
         "--batchmode",
         dest="batchmode",
         action="store_true",
         default=False,
-        help="Set batch mode (additional output for bulk operation.)",
+        help="Enable additional output for bulk operation.",
     )
     parser.add_argument(
         "-x",
@@ -133,7 +141,7 @@ def main():
         "analysis_file",
         metavar="<analysis-file>",
         nargs="?",
-        help="TOML file describing the analysis to be performed,"
+        help="TOML file describing the analysis to be performed, "
         + "including the codebase and platform descriptions.",
     )
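
The help-related part of this change relies on standard `argparse` behavior: passing `add_help=False` suppresses the automatic `-h/--help` option so that it can be re-added with custom help text. The sketch below shows that pattern in isolation, together with the `action="count"` verbosity flags. The program name `demo`, the description and the `build_parser` helper are hypothetical and are not taken from `bin/codebasin`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # add_help=False removes the default -h/--help option so that it can be
    # declared explicitly below with its own help text.
    parser = argparse.ArgumentParser(
        prog="demo",  # hypothetical program name
        description="Demo parser",
        add_help=False,
    )
    parser.add_argument(
        "-h",
        "--help",
        action="help",
        help="Display help message and exit.",
    )
    # action="count" counts how many times a flag is repeated (-vv gives 2).
    parser.add_argument(
        "-v",
        "--verbose",
        action="count",
        default=0,
        help="Increase verbosity level.",
    )
    parser.add_argument(
        "-q",
        "--quiet",
        action="count",
        default=0,
        help="Decrease verbosity level.",
    )
    return parser


args = build_parser().parse_args(["-vv", "-q"])
print(args.verbose - args.quiet)  # 1
```

Declaring the help option by hand keeps its wording and punctuation consistent with the other options, which appears to be the motivation for the change above.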
