cardinality.py

cardinality.py is a small tool written in Python for examining the cardinality of a relation between two datasets. All the code is contained within a single file that can be imported using Python's import mechanism or used as a command-line tool.

The code has been tested with Python 3.7.

Source repository: https://github.com/naturhistoriska/cardinality.py


Prerequisites

  • Python 3
  • The Python library pandas

An easy way to get Python working on your computer is to install the free Anaconda distribution.

Installation

The project is hosted at <https://github.com/naturhistoriska/cardinality.py> and can be downloaded using git:

$ git clone https://github.com/naturhistoriska/cardinality.py

Usage

$ ./cardinality.py --help
usage: cardinality.py [-h] [-V] [-v] [-p column [column ...]]
                      [-f column [column ...]]
                      pk-file fk-file

Command-line utility for examining the cardinality of the relation between two
TSV-files.

positional arguments:
  pk-file               TSV-file with primary keys
  fk-file               TSV-file with foreign keys

optional arguments:
  -h, --help            show this help message and exit
  -V, --version         show program's version number and exit
  -v, --verbose         show verbose output
  -p column [column ...]
                        primary key columns
  -f column [column ...]
                        foreign key columns
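
For readers who want to see how an interface like the one above fits together, here is a minimal argparse sketch reconstructed only from the help text. It is not taken from the project's source: the function name `build_parser`, the attribute names (`pk_file`, `fk_file`, `pk_columns`, `fk_columns`), and the version string are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser resembling the help text (hypothetical reconstruction)."""
    parser = argparse.ArgumentParser(
        prog="cardinality.py",
        description=(
            "Command-line utility for examining the cardinality of the "
            "relation between two TSV-files."
        ),
    )
    # The real version number is not stated in this README; placeholder only.
    parser.add_argument(
        "-V", "--version", action="version",
        version="%(prog)s (version not stated here)",
    )
    parser.add_argument(
        "-v", "--verbose", action="store_true", help="show verbose output"
    )
    # nargs="+" accepts one or more column names, which allows composite keys.
    parser.add_argument(
        "-p", dest="pk_columns", metavar="column", nargs="+",
        help="primary key columns",
    )
    parser.add_argument(
        "-f", dest="fk_columns", metavar="column", nargs="+",
        help="foreign key columns",
    )
    parser.add_argument(
        "pk_file", metavar="pk-file", help="TSV-file with primary keys"
    )
    parser.add_argument(
        "fk_file", metavar="fk-file", help="TSV-file with foreign keys"
    )
    return parser


# Parse the same arguments as in the example invocation of this README.
args = build_parser().parse_args(
    ["test_files/pk-data.tsv", "test_files/fk-data.tsv", "-p", "pk", "-f", "fk"]
)
print(args.pk_file, args.fk_file, args.pk_columns, args.fk_columns)
```

Because `-p` and `-f` each take one or more values, a composite key would presumably be given as several column names after the same flag.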

Example usage

Examine the relation between two example datasets included in this repository.

$ ./cardinality.py test_files/pk-data.tsv test_files/fk-data.tsv -p pk -f fk
0,1 to 0,3
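
The idea of examining a cardinality can be sketched with pandas as follows. This is not the tool's implementation, and it rests on an assumed reading of the output: the first pair is the minimum and maximum number of primary-key rows matched by a foreign-key row, and the second pair is the minimum and maximum number of foreign-key rows that refer to a primary-key row. The function name `relation_cardinality`, the helper column names `n_pk` and `n_fk`, and the inline sample data are all made up for illustration.

```python
import io

import pandas as pd


def relation_cardinality(pk_df, fk_df, pk_cols, fk_cols):
    """Return ((pk_min, pk_max), (fk_min, fk_max)) for two non-empty tables.

    Hypothetical helper; rows with missing key values are ignored because
    groupby drops NaN keys by default.
    """
    # Count how often each distinct key occurs on either side.
    pk_counts = pk_df.groupby(pk_cols).size().rename("n_pk").reset_index()
    fk_counts = fk_df.groupby(fk_cols).size().rename("n_fk").reset_index()
    # Align the foreign-key column names with the primary-key column names.
    fk_counts = fk_counts.rename(columns=dict(zip(fk_cols, pk_cols)))

    # An outer join keeps keys that occur on only one side; their missing
    # count on the other side becomes 0.
    merged = pk_counts.merge(fk_counts, on=pk_cols, how="outer").fillna(0)

    # Seen from the foreign-key rows: how many primary-key rows match?
    pk_side = merged.loc[merged["n_fk"] > 0, "n_pk"]
    # Seen from the primary-key rows: how many foreign-key rows refer to them?
    fk_side = merged.loc[merged["n_pk"] > 0, "n_fk"]
    return (
        (int(pk_side.min()), int(pk_side.max())),
        (int(fk_side.min()), int(fk_side.max())),
    )


# Made-up sample data (not the files shipped in test_files/).
pk = pd.read_csv(io.StringIO("pk\n1\n2\n3\n"), sep="\t", dtype=str)
fk = pd.read_csv(io.StringIO("fk\n1\n1\n1\n2\n4\n"), sep="\t", dtype=str)

result = relation_cardinality(pk, fk, ["pk"], ["fk"])
print(result)  # ((0, 1), (0, 3))
```

In this sample, key `4` has no matching primary key (minimum 0 on the primary-key side), key `3` is never referenced (minimum 0 on the foreign-key side), and key `1` is referenced three times (maximum 3).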

Running the tests

Testing is carried out with pytest:

$ pytest -v test_cardinality.py

Test coverage can be calculated with Coverage.py using the following commands:

$ coverage run -m pytest
$ coverage report -m cardinality.py

The code follows the style conventions in PEP 8, which can be checked with pycodestyle:

$ pycodestyle cardinality.py test_cardinality.py

License

cardinality.py is distributed under the MIT license.

Author and maintainer

Markus Englund, [email protected]