Core: Fix the distribution of Options.Range.triangular()
The original implementation was converting a continuous distribution of
[a, b] to integers by rounding, but this results in the smallest and
largest integers having half as many float values that round to them
compared to other integers in the range.

For example, given the continuous range [0, 3]:
0.0 <= x < 0.5 rounds to 0 -> width of 0.5
0.5 <= x < 1.5 rounds to 1 -> width of 1.0
1.5 <= x < 2.5 rounds to 2 -> width of 1.0
2.5 <= x <= 3.0 rounds to 3 -> width of 0.5 (kind of plus an
infinitesimal bit extra)

To convert to 4 integers uniformly given a uniform continuous
distribution, the width of the continuous distribution would have to be
4, e.g. [-0.5, 3.5) or [0, 4).
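
The effect can be checked without any randomness. The following is a minimal sketch, assuming an evenly spaced grid of sample points is an acceptable stand-in for a uniform continuous distribution; the helper name `bin_counts` is hypothetical and not part of the patch.

```python
import math
from collections import Counter


def bin_counts(low, high, to_int, samples=6000):
    """Count which integer each point of an evenly spaced grid over (low, high) maps to."""
    width = high - low
    counts = Counter()
    for i in range(samples):
        # Use the midpoints of equal-width cells so that no point lands exactly on a boundary.
        x = low + (i + 0.5) * width / samples
        counts[to_int(x)] += 1
    return dict(sorted(counts.items()))


# Rounding over [0, 3]: the end values 0 and 3 receive half as many points as 1 and 2.
rounded = bin_counts(0, 3, round)

# Flooring over [0, 4): every value from 0 to 3 receives the same number of points.
floored = bin_counts(0, 4, math.floor)
```

With a uniform source, rounding gives the two end values half the weight of the interior values, while flooring the widened range gives all four values equal weight.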

This patch fixes the distribution of Options.Range.triangular() by
increasing the width of the continuous distribution by 1. This requires
adjusting the mode (`tri`) of the distribution to the new width of the
distribution and accounting for the near zero chance of
`random.triangular(a, b+1, adjusted_tri)` returning exactly `b+1`.
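
The mode adjustment described above can be sketched in isolation. This is an illustrative helper that follows the same normalize-then-rescale arithmetic as the patch; the name `rescale_mode` is hypothetical, and the sketch assumes `lower < end` (the patch handles `lower == end` and swapped arguments before reaching this step).

```python
def rescale_mode(lower, end, tri):
    """Map a mode from the original range [lower, end] onto the widened range [lower, end + 1]."""
    new_end = end + 1
    # Position of the mode as a fraction of the original width (0.0 at lower, 1.0 at end).
    fraction = (tri - lower) / (end - lower)
    # Apply the same fraction to the widened range.
    return lower + fraction * (new_end - lower)
```

For example, a mode at the midpoint of [0, 2] moves from 1 to 1.5, the midpoint of [0, 3], and a mode at `end` moves to `end + 1`, so the peak stays at the same relative position.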
Mysteryem committed Nov 29, 2024
1 parent ce210cd commit dd55eb5
Showing 1 changed file with 45 additions and 1 deletion.
46 changes: 45 additions & 1 deletion Options.py
@@ -740,7 +740,51 @@ def __str__(self) -> str:

     @staticmethod
     def triangular(lower: int, end: int, tri: typing.Optional[int] = None) -> int:
-        return int(round(random.triangular(lower, end, tri), 0))
+        if lower == end:
+            return lower
+
+        if lower > end:
+            # Swap the two so that `lower` is always smaller. This simplifies later code.
+            lower, end = end, lower
+
+        if tri is not None and (tri < lower or tri > end):
+            # random.triangular allows this for performance reasons, but it is not well-defined/documented behaviour, so
+            # we'll reject this scenario for simplicity.
+            raise Exception(f"Triangular distribution mode {tri} is outside the allowed range {lower}-{end}")
+
+        # To produce integers from [a, b] from a continuous distribution, it is easier to start with a continuous
+        # distribution that is [a, b+1). For example, for lower=0 and end=2, the continuous distribution of [0, 3) can
+        # be split into 3 groups: 0 <= x < 1, 1 <= x < 2 and 2 <= x < 3.
+        new_end = end + 1
+        if tri is not None:
+            # `tri` needs to be remapped from the original [lower, end) range to the new [lower, new_end) range.
+            # Normalize to the range [0, 1).
+            # '[lower, end)' - lower = '[0, end - lower)'
+            # '[0, end - lower)' / (end - lower) = '[0, 1)'
+            tri_normalized = (tri - lower) / (end - lower)
+            # Scale up to fit the new range and then offset back by lower.
+            # '[0, 1)' * (new_end - lower) = '[0, new_end - lower)'
+            # '[0, new_end - lower)' + lower = '[lower, new_end)'
+            tri_rescaled = tri_normalized * (new_end - lower) + lower
+        else:
+            tri_rescaled = None
+
+        # To produce integers from these floats, truncate towards `lower` by using `math.floor`.
+        # Truncating with `int(my_float)` truncates towards `0`, so would not work correctly with `lower < 0`.
+        # Given the previous example for a continuous distribution of [0, 3):
+        # 0 <= x < 1 -> 0
+        # 1 <= x < 2 -> 1
+        # 2 <= x < 3 -> 2
+        r_int = math.floor(random.triangular(lower, new_end, tri_rescaled))
+
+        # Unlike `random.random()`, which is a-inclusive to b-exclusive, [a, b), `random.triangular()` is a-inclusive to
+        # b-inclusive, [a, b], so there is a chance of getting exactly b.
+        # 1 million calls of `random.triangular(0.9999999999, 1, 1)` might return `1` a single time. `lower` and
+        # `end` are integers so are always at least 1 apart, so the chance of getting exactly `end` should be very
+        # small.
+        # Because `tri` has been limited to `lower <= tri <= end` and `lower` and `end` have been swapped if `lower`
+        # was greater than `end`, `r_int` only needs to be checked for being larger than `end`.
+        return min(end, r_int)


 class NamedRange(Range):
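
To compare the old and new behaviour exactly rather than by sampling, the probability of each integer can be computed from the triangular CDF. This is a rough sketch using the standard closed-form CDF of a triangular distribution; the function names are hypothetical, only the default mode (the midpoint, which `random.triangular` uses when no mode is given) is covered, and the negligible chance of the continuous sample landing exactly on the upper bound is ignored.

```python
def triangular_cdf(x, a, b, c):
    """CDF of a continuous triangular distribution on [a, b] with mode c (requires a < b)."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    if x <= c:
        return (x - a) ** 2 / ((b - a) * (c - a))
    return 1.0 - (b - x) ** 2 / ((b - a) * (b - c))


def rounding_probabilities(lower, end):
    """Old approach: round a triangular sample drawn from [lower, end]."""
    mode = (lower + end) / 2
    return {
        k: triangular_cdf(k + 0.5, lower, end, mode) - triangular_cdf(k - 0.5, lower, end, mode)
        for k in range(lower, end + 1)
    }


def flooring_probabilities(lower, end):
    """New approach: floor a triangular sample drawn from [lower, end + 1]."""
    new_end = end + 1
    mode = (lower + new_end) / 2
    return {
        k: triangular_cdf(k + 1, lower, new_end, mode) - triangular_cdf(k, lower, new_end, mode)
        for k in range(lower, end + 1)
    }


old = rounding_probabilities(0, 3)
new = flooring_probabilities(0, 3)
```

For the range 0 to 3, rounding assigns the end values a probability of 1/18 each, while flooring the widened range assigns them 1/8 each, so the end values are no longer under-represented by the half-width bins.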
