Sampling without replacement

The problem of sampling without replacement: given integers n, k such that 0 ≤ k ≤ n, return k distinct nonnegative integers < n, choosing at random such that every possible result is equally likely. One way to get a precise understanding of the problem is to look at the simplest algorithm for solving it, via these Python or C++ implementations.

Here, I propose an algorithm I call "Cardchoose" for this problem which as far as I know is novel, and which in C++ outperforms all other solutions to either problem for most (n, k) inputs. The only data structure used is the output vector, so no extra memory is needed; it runs in O(k log k) time. I've implemented this and other algorithms in C++ and Python to compare their performance.

In my C++ tests, cardchoose outperforms rejectionsample, floydf2, and hsel for all values of n and k. quadraticreject is best where k < 100 or so, and select and iterativechoose perform well when k/n is large. cardchoose always performs within an acceptable factor of other algorithms.

def sorted_choose(n, k):
    "k distinct integers 0 <= x < n, sorted"
    t = n - k + 1
    d = [None] * k
    for i in range(k):
        r = random.randrange(t + i)
        if r < t:
            d[i] = r
        else:
            d[i] = d[r - t]
    d.sort()
    for i in range(k):
        d[i] += i
    return d

This is not an officially supported Google product.

Name		Name	Last commit message	Last commit date
Latest commit History 193 Commits
cpp		cpp
latex		latex
python		python
results/latest		results/latest
rust		rust
third_party		third_party
.gitattributes		.gitattributes
.gitignore		.gitignore
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
algorithms.md		algorithms.md
cardchoose.md		cardchoose.md
results.md		results.md
running.md		running.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Sampling without replacement

About

Releases

Packages

Languages

License

ciphergoth/sansreplace

Folders and files

Latest commit

History

Repository files navigation

Sampling without replacement

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages