MHS generation algorithms

C++ implementations of several algorithms for minimal hitting set generation. These implementations can be run from a standalone binary and are suitable for inclusion in other projects as a library (see HOWTO.library.md).

Benchmarks of these and many other algorithms for MHS generation are available from our benchmark repository and paper.

This implementation is available in an AlgoRun container, which can be downloaded to your computer with docker pull compsysmed/agdmhs or run online

The minimal hitting set generation problem

Consider a family H of sets E₁, E₂, … Eₙ. A hitting set of H is a set S with the property that S intersects every one of the Es. A minimal hitting set is a hitting set which cannot be made smaller without losing this property. The minimal hitting set generation problem is to compute all the minimal hitting sets of the given family H.

For example, suppose that H is [[1, 2, 5], [2, 3, 4], [1, 3]]. Then [1, 2, 3] is a hitting set, because every set in H has at least one of these elements. However, it is not a minimal hitting set, because we could remove either 2 or 3! [1, 2] and [1, 3] are both minimal hitting sets of H.

Minimal hitting sets are important in a wide array of applications, ranging from very abstract mathematics (Boolean algebra and hypergraph theory) to high-tech applications (computational biology and data mining). Hitting sets have even been used to prove that there is no 16-clue sudoku! As a result, many algorithms have been developed to generate minimal hitting sets. We provide implementations of five of these algorithms, including two with state-of-the-art performance.

The algorithms

We provide implementations of five algorithms for MHS generation. We describe each below. Some of the algorithms have the option to generate only MHSes up to a certain size; we refer to this as "cutoff" enumeration. Some of the algorithms also support multiple processors.

pMMCS

(Supports cutoff and multithreading) Highly performant algorithm published in Efficient algorithms for dualizing large-scale hypergraphs in 2014 by Murakami and Uno. We provide a parallelized implementation.

pRS

(Supports cutoff and multithreading) Highly performant algorithm published in Efficient algorithms for dualizing large-scale hypergraphs in 2014 by Murakami and Uno. We provide a parallelized implementation.

Berge

(Supports cutoff) A sequential algorithm based on an algebraic decomposition of the input hypergraph. First published in 1984 by Berge in Hypergraphs: combinatorics of finite sets. (We find that this algorithm is not competitive with pMMCS and pRS on practical inputs. This algorithm is provided for testing purposes only.)

FK-A

The A algorithm from On the complexity of dualization of monotone disjunctive normal forms, Fredman, M. and Khachiyan, L, as improved in An efficient implementation of a quasi-polynomial algorithm for generating hypergraph transversals and its application in joint generation, E. Boros et al. Has the best known asymptotic bounds on runtime of any serial algorithm with a public implementation. (However, we find that it is not very efficient on practical inputs. This algorithm is provided for testing purposes only.)

BM

(Supports multithreading) A "global parallel" algorithm based on a full cover decomposition of the input hypergraph. Published in A fast and simple parallel algorithm for the monotone duality problem by E. Boros and K. Makino. Has the best known asymptotic bounds on runtime of any parallel algorithm. (However, we find that it is not very efficient on practical inputs. This algorithm is provided for testing purposes only.)

Compiling the code

This project depends on several libraries from the Boost project. To install the required dependencies on a Debian system, run

sudo apt-get install libboost-program-options-dev libboost-log-develop

It also uses the moodycamel::ConcurrentQueue library, which is included here under the terms of the author's "Simplified BSD license".

To download the code to your own computer, you should clone it using Git. To do this, run the following in your terminal:

git clone https://github.com/VeraLiconaResearchGroup/Minimal-Hitting-Set-Algorithms.git

This will download the code and extract it into a directory called Minimal-Hitting-Set-Algorithms. You can then compile the program by running make in that directory. (If you have multiple cores or processors, you can build in parallel by running make -j instead.)

To update the software, run git pull from the Minimal-Hitting-Set-Algorithms directory, then run make again.

Running the code

Once you have compiled the software, you're ready to generate minimal hitting sets. First, you need an input file representing your hypergraph. The format is simple and matches that used by the Hypergraph Dualization Repository. For each edge of your hypergraph, your input file should contain a single line which gives the vertices as positive integers separated by spaces. For example, the hypergraph [[1, 2, 5], [2, 3, 4], [1, 3]] is represented by the following input file:

1 2 5
2 3 4
1 3

We provide this example in [example-input.dat][example-input.dat]. To generate the MHSes of this hypergraph and store them in out.dat, run the following:

./agdmhs example-input.dat out.dat -a pmmcs

You can replace pmmcs with prs, berge, fka, or bm to run a different algorithm.

Cutoff enumeration

If you are only interested in hitting sets up to a certain size k, you can obtain them more efficiently. Add the option -c *k* to the command above to run in this mode. (This applies to pmmcs, prs, and berge only.)

Multithreaded enumeration

If your computer has multiple cores or processors, you can run in multithreaded mode to speed things up. To use t threads, add the option -t *t* to the command above. (This applies to pmmcs, prs, and bm only.)

License

This project uses the moodycamel::ConcurrentQueue implementation by Cameron Desrochers. It is included here (as include/concurrentqueue.h) under the terms of the author's license.

This project uses the Catch unit testing framework. It is included here (as include/catch.hpp) under the terms of the author's license.

All other code is provided by the author under the terms of the GPLv3 license.

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
include		include
src		src
test		test
.gitignore		.gitignore
HOWTO.library.md		HOWTO.library.md
LICENSE.md		LICENSE.md
Makefile		Makefile
README.md		README.md
example-input.dat		example-input.dat

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

MHS generation algorithms

The minimal hitting set generation problem

The algorithms

pMMCS

pRS

Berge

FK-A

BM

Compiling the code

Running the code

Cutoff enumeration

Multithreaded enumeration

License

About

Releases 1

Packages

Contributors 2

Languages

License

VeraLiconaResearchGroup/Minimal-Hitting-Set-Algorithms

Folders and files

Latest commit

History

Repository files navigation

MHS generation algorithms

The minimal hitting set generation problem

The algorithms

pMMCS

pRS

Berge

FK-A

BM

Compiling the code

Running the code

Cutoff enumeration

Multithreaded enumeration

License

About

Resources

License

Stars

Watchers

Forks

Releases 1

Packages 0

Contributors 2

Languages

Packages