Implements the Flajolet-Martin algorithm for approximately counting unique items in a data stream, as described in:
• Flajolet and Martin, Probabilistic Counting Algorithms for Data Base Applications,
Journal of Computer and System Sciences 31, 182-209 (1985).
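For orientation, the core idea of the algorithm can be sketched in a few lines of C++. This is a minimal illustration and not necessarily how this repository implements it: the class name FlajoletMartinSketch, the helper hash64 (a simple FNV-1a hash with a splitmix64-style finalizer, standing in for FarmHash), and the choice of 32 seeded hash functions are all assumptions made for the example. Each item is hashed, the position of the lowest set bit of the hash is recorded in a bitmap, and the position R of the lowest bit never set gives an estimate of about 2^R / 0.77351. Averaging R over several hash functions, as done here, is one simple way to reduce variance; the paper itself uses stochastic averaging.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in hash (NOT FarmHash): FNV-1a over the bytes,
// followed by a splitmix64-style finalizer that also mixes in a seed.
inline std::uint64_t hash64(const std::string& s, std::uint64_t seed) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV-1a offset basis
    for (unsigned char c : s) {
        h ^= c;
        h *= 1099511628211ULL;                  // FNV-1a prime
    }
    h += seed * 0x9E3779B97F4A7C15ULL;
    h ^= h >> 30; h *= 0xBF58476D1CE4E5B9ULL;
    h ^= h >> 27; h *= 0x94D049BB133111EBULL;
    h ^= h >> 31;
    return h;
}

// Illustrative sketch: one 64-bit bitmap per seeded hash function.
class FlajoletMartinSketch {
public:
    explicit FlajoletMartinSketch(std::size_t num_hashes = 32)
        : bitmaps_(num_hashes, 0) {
        assert(num_hashes > 0);
    }

    // Record the position of the lowest set bit of each hash value.
    void add(const std::string& item) {
        for (std::size_t i = 0; i < bitmaps_.size(); ++i) {
            const std::uint64_t h = hash64(item, i + 1);
            bitmaps_[i] |= (std::uint64_t{1} << trailing_zeros(h));
        }
    }

    // Average R (lowest unset bit position) over all bitmaps, then
    // apply the correction constant 0.77351 from the 1985 paper.
    double estimate() const {
        double sum = 0.0;
        for (std::uint64_t b : bitmaps_) sum += lowest_unset_bit(b);
        const double mean_r = sum / static_cast<double>(bitmaps_.size());
        return std::pow(2.0, mean_r) / 0.77351;
    }

private:
    static int trailing_zeros(std::uint64_t h) {
        if (h == 0) return 63;  // avoid an endless loop on a zero hash
        int n = 0;
        while ((h & 1) == 0) { h >>= 1; ++n; }
        return n;
    }

    static int lowest_unset_bit(std::uint64_t b) {
        int r = 0;
        while (r < 64 && ((b >> r) & 1)) ++r;
        return r;
    }

    std::vector<std::uint64_t> bitmaps_;
};
```

Adding the same item twice leaves the bitmaps unchanged, which is why the estimate depends only on the set of distinct items seen and the memory used stays fixed regardless of stream length.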
Boost
http://www.boost.org
FarmHash
https://github.com/google/farmhash
main.cpp
Test driver that approximately counts the number of unique words in a specified text file.
constitution-words-only.txt
Text of the United States Constitution, with punctuation removed. This contains 925 unique words.
declaration-words-only.txt
Text of the Declaration of Independence, with punctuation removed. This contains 585 unique words.
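To judge the accuracy of an approximate count on these files, an exact baseline is useful. The following is a small sketch of one, not part of this repository: the function names count_unique_words and count_unique_words_in_text are hypothetical, words are taken to be whitespace-separated tokens, and comparison is case-sensitive. Whether the unique-word counts quoted above were computed case-sensitively is not stated, so this baseline may not reproduce them exactly.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <unordered_set>

// Exact count of distinct whitespace-separated tokens in a stream.
// Memory grows with the number of distinct words, unlike a sketch.
std::size_t count_unique_words(std::istream& in) {
    std::unordered_set<std::string> seen;
    std::string word;
    while (in >> word) {
        seen.insert(word);  // duplicates are ignored by the set
    }
    return seen.size();
}

// Convenience wrapper for in-memory text.
std::size_t count_unique_words_in_text(const std::string& text) {
    std::istringstream in(text);
    return count_unique_words(in);
}
```

For a file on disk, the same function can be called with a std::ifstream opened on one of the text files listed above.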