Core: Fix the distribution of Options.Range.triangular() #4283

Mysteryem · 2024-11-29T02:46:16Z

What is this fixing or adding?

The original implementation was converting a continuous distribution of [a, b] to integers by rounding, but this results in the smallest and largest integers having half as many float values that round to them compared to other integers in the range.

For example, given the continuous range [0, 3]:
0.0 <= x < 0.5 rounds to 0 -> width of 0.5
0.5 <= x < 1.5 rounds to 1 -> width of 1.0
1.5 <= x < 2.5 rounds to 2 -> width of 1.0
2.5 <= x <= 3.0 rounds to 3 -> width of 0.5 (kind of plus an infinitesimal bit extra)

To convert to 4 integers uniformly, given a uniform continuous distribution, the width of the continuous distribution would have to be 4, e.g [-0.5, 3.5] or [0, 4].

This patch fixes the distribution of Options.Range.triangular() by increasing the width of the continuous distribution by 1. For simplicity, this changes the mode (tri) of the distribution to a float in [0.0, 1.0] where 0.0 corresponds to lower and 1.0 corresponds to end.

There is a near zero chance of random.triangular(a, b) returning exactly b when a != b, but this is accounted for by ensuring the returned value is never more than end.

How was this tested?

from collections import Counter
from Options import Range
# Change these values to your liking.
low = 0
high = 5
# For random-range-low, `tri=low`.
# For random-range-middle, `tri=None`,
# For random-range-high, `tri=high`.
tri = low
counts = Counter(Range.triangular(low, high, tri) for _ in range(1_000_000))
min_count = min(counts.values())
# Scale everything down such that the most infrequent value has a count of 1.0.
scaled_sorted_counts = {k: v/min_count for k, v in sorted(counts.items(), reverse=True, key=lambda t: t[1])}
print(*scaled_sorted_counts.items(), sep="\n")

Before

(low=0, high=5, tri=low).
1 and 2 end up more common than 0.

(1, 32.88061538461538)
(2, 24.562358974358975)
(0, 19.511794871794873)
(3, 16.38871794871795)
(4, 8.220615384615385)
(5, 1.0)

After

(low=0, high=5, tri=low).
0 is correctly the most common value and if you were to plot the points, you would get a straight line.

I used math.floor over the range [0,6] in the end so the diagram doesn't match with the code, but it should still illustrate the point.

(0, 11.004496888153398)
(1, 9.00525236536317)
(2, 6.9815087959132285)
(3, 4.989711119905026)
(4, 2.994136057847969)
(5, 1.0)

The original implementation was converting a continuous distribution of [a, b] to integers by rounding, but this results in the smallest and largest integers having half as many float values that round to them compared to other integers in the range. For example, given the continuous range [0, 3]: 0.0 <= x < 0.5 rounds to 0 -> width of 0.5 0.5 <= x < 1.5 rounds to 1 -> width of 1.0 1.5 <= x < 2.5 rounds to 2 -> width of 1.0 2.5 <= x <= 3.0 rounds to 3 -> width of 0.5 (kind of plus an infinitesimal bit extra) To convert to 4 integers uniformly given a uniform continuous distribution, the width of the continuous distribution would have to be 4, e.g [-0.5, 3.5] or [0, 4]. This patch fixes the distribution of Options.Range.triangular() by increasing the width of the continuous distribution by 1. This requires adjusting the mode (`tri`) of the distribution to the new width of the distribution and accounting for the near zero chance of `random.triangular(a, b+1, adjusted_tri)` returning exactly `b+1`.

Bicoloursnake

Ooh, it's not every day I see a math heavy PR. I did some number crunching and can confirm that the previous version was biased against the endpoints, and that this PR would remove that bias.

Mysteryem · 2024-11-29T04:06:58Z

Options.py

@@ -740,7 +740,51 @@ def __str__(self) -> str:

    @staticmethod
    def triangular(lower: int, end: int, tri: typing.Optional[int] = None) -> int:
-        return int(round(random.triangular(lower, end, tri), 0))
+        if lower == end:
+            return lower


This early return should go below the exception so that tri < lower or tri > end always raises an Exception, even when lower == end. I'll move it tomorrow.

NewSoupVi · 2024-11-29T08:34:00Z

So, this PR is correct mathematically from what I can tell

Now I like me some verbose code for some important functions, but this feels a bit overkill to me for what it is.

I was expecting to open the diff and see like, a + 1 and a - 0.5, or a + 1 and replacing round with floor. And then maybe a 1-2 line comment. :D

Are all of these additions necessary? Is end < lower an actual case we have?

beauxq · 2024-11-29T12:12:56Z

I think it should be questioned whether this change is good or not.
We have to remember that what users want, and what makes users happy, is more important than what is "correct".
If players have been using a different distribution for a long time, and they like that distribution, and we change that distribution to something they don't enjoy as much, then we're making it worse. It doesn't matter what's more "correct".

I've used random-low and random-high before a lot, and I wasn't working under the assumption that it's perfect triangular distribution.
I think it might make better gameplay experiences to lessen the extreme endpoints a bit.

Exempt-Medic · 2024-11-29T13:39:37Z

So, this PR is correct mathematically from what I can tell

Now I like me some verbose code for some important functions, but this feels a bit overkill to me for what it is.

I was expecting to open the diff and see like, a + 1 and a - 0.5, or a + 1 and replacing round with floor. And then maybe a 1-2 line comment. :D

Are all of these additions necessary? Is end < lower an actual case we have?

Currently random-range-low-10-1 skews towards 10 which... I feel like we should just error and not "fix" it, since it's unclear what the user actually wanted. But I'm also very "error-happy"

Exempt-Medic · 2024-11-29T13:41:25Z

I haven't really stepped through anything, but do you happen to know how well this considers negative numbers ?

Mysteryem · 2024-11-29T13:45:10Z

So, this PR is correct mathematically from what I can tell

Now I like me some verbose code for some important functions, but this feels a bit overkill to me for what it is.

I was expecting to open the diff and see like, a + 1 and a - 0.5, or a + 1 and replacing round with floor. And then maybe a 1-2 line comment. :D

Are all of these additions necessary? Is end < lower an actual case we have?

I prefer to add more comments than I think are necessary for maths stuff to help those that are less mathematically inclined (or me when half asleep). Partially why I provided test code output and some graphs as a visual aid.

Core does not have end < lower, but none of these functions are marked as private, so removing that functionality results in a breaking API change (more so than removing tri > end or tri < lower which is already a breaking change).

Personally, I would do more breaking changes and rewrite Range.triangular and its usages so that it is expecting tri to always be in the range [0,1] and then the function could be squashed down to

    @staticmethod
    def triangular(lower: int, end: int, tri: float = 0.5) -> int:
        """
        Integer triangular distribution for `lower` inclusive to `end` inclusive.
        
        Expects `lower <= end` and `0 <= tri <= 1`. The result of other inputs is undefined.
        """
        # Use the continuous range [lower, end + 1) to produce an integer result in [lower, end].
        # random.triangular is actually [a, b] and not [a, b), so there is a very small chance of getting exactly b, so
        # ensure the result is never more than `end`.
        return min(end, math.floor(random.triangular(0, 1, tri) * (end - lower + 1) + lower))

Adding a

        if end < lower:
            end, lower = lower, end

at the start would additionally add support for end < lower.

I think it should be questioned whether this change is good or not. We have to remember that what users want, and what makes users happy, is more important than what is "correct". If players have been using a different distribution for a long time, and they like that distribution, and we change that distribution to something they don't enjoy as much, then we're making it worse. It doesn't matter what's more "correct".

I've used random-low and random-high before a lot, and I wasn't working under the assumption that it's perfect triangular distribution. I think it might make better gameplay experiences to lessen the extreme endpoints a bit.

The current distribution is pretty close to triangular for larger ranges, but more inaccurate for smaller ranges. If a random-low distribution that is more heavily weighted towards lower numbers, but doesn't weight the lowest number the highest could be preferred, then that could be discussed, but I think it is outside the scope of this PR.

Mysteryem · 2024-11-29T13:51:58Z

Currently random-range-low-10-1 skews towards 10 which... I feel like we should just error and not "fix" it, since it's unclear what the user actually wanted. But I'm also very "error-happy"

Range.custom_range() sorts the two values before passing them to Range.triangular, so I think it would actually skew towards 1.

Archipelago/Options.py

Line 714 in 6f2464d

random_range.sort()

I haven't really stepped through anything, but do you happen to know how well this considers negative numbers ?

Yes, either lower or end being negative, or both is accounted for. This is why the code uses math.floor() to always truncate towards lower rather than int() which truncates towards 0.

Exempt-Medic · 2024-11-29T13:53:55Z

Range.custom_range() sorts the two values before passing them to Range.triangular, so I think it would actually skew towards 1.

Archipelago/Options.py

Line 714 in 6f2464d

random_range.sort()

Then how can lower > end ?

Mysteryem · 2024-11-29T14:04:19Z

Range.custom_range() sorts the two values before passing them to Range.triangular, so I think it would actually skew towards 1.

Archipelago/Options.py

Line 714 in 6f2464d

random_range.sort()

Then how can lower > end ?

A world can define their own options or use other code that calls Range.triangular() directly.

Exempt-Medic · 2024-11-29T14:30:44Z

Options.py

+        if tri is not None and (tri < lower or tri > end):
+            # random.triangular allows this for performance reasons, but it is not well-defined/documented behaviour, so
+            # we'll reject this scenario for simplicity.
+            raise Exception(f"Triangular distribution mode {tri} is outside the allowed range {lower}-{end}")


Should this be an OptionError?

OptionError doesn't seem to be used much in this file. Range.__init__, Range.weighted_range and Range.custom_range can all raise Exception, so that's why I went with Exception here.

I think that's just a lack of updating to use the new error

Exempt-Medic · 2024-11-29T14:56:47Z

Then how can lower > end ?

A world can define their own options or use other code that calls Range.triangular() directly.

My personal two cents: this should not be a concern or something to try to handle. This would cut down a lot on these changes too. I get if you don't want to though

Now expects `lower <= end` and `tri` as a float: `0.0 <= tri <= 1.0`. No warnings or exceptions will be raised if these expectations are not met.

It's expected that we can get `end`, it's the b in [a, b] that we don't want.

Exempt-Medic · 2024-12-01T18:55:56Z

Options.py

+    def triangular(lower: int, end: int, tri: float = 0.5) -> int:
+        """
+        Integer triangular distribution for `lower` inclusive to `end` inclusive.
+
+        Expects `lower <= end` and `0.0 <= tri <= 1.0`. The result of other inputs is undefined.
+        """
+        # Use the continuous range [lower, end + 1) to produce an integer result in [lower, end].
+        # random.triangular is actually [a, b] and not [a, b), so there is a very small chance of getting exactly b even
+        # when a != b, so ensure the result is never more than `end`.
+        return min(end, math.floor(random.triangular(0.0, 1.0, tri) * (end - lower + 1) + lower))


"lower" was getting confusing for me, you could use "start" instead, maybe? I think "mode" is also more descriptive than "tri" for this distribution.

Suggested change

def triangular(lower: int, end: int, tri: float = 0.5) -> int:

"""

Integer triangular distribution for `lower` inclusive to `end` inclusive.

Expects `lower <= end` and `0.0 <= tri <= 1.0`. The result of other inputs is undefined.

"""

# Use the continuous range [lower, end + 1) to produce an integer result in [lower, end].

# random.triangular is actually [a, b] and not [a, b), so there is a very small chance of getting exactly b even

# when a != b, so ensure the result is never more than `end`.

return min(end, math.floor(random.triangular(0.0, 1.0, tri) * (end - lower + 1) + lower))

def triangular(start: int, end: int, mode: float = 0.5) -> int:

"""

Integer triangular distribution from `start` inclusive to `end` inclusive.

Expects `start <= end` and `0.0 <= mode <= 1.0`. The result of other inputs is undefined.

"""

# Use the continuous range [start, end + 1) to produce an integer result in [start, end].

# random.triangular is actually [a, b] and not [a, b), so there is a very small chance of getting exactly b even

# when a != b, so ensure the result is never more than `end`.

return min(end, math.floor(random.triangular(0.0, 1.0, mode) * (end - start + 1) + start))

Would be on board with this personally, however since tri is a kwarg, we'd have to make sure it's not passed as e.g. tri=0.8 anywhere

Why? That seems like an intended thing to do. The default is 0,5, and then other values, like 0.8, a world could use itself when calling this

Exempt-Medic

Did some test trials and saw the desired results/distribution come out. Also analyzed the PDF directly.

github-actions bot added affects: core Issues/PRs that touch core and may need additional validation. waiting-on: peer-review Issue/PR has not been reviewed by enough people yet. labels Nov 29, 2024

Bicoloursnake approved these changes Nov 29, 2024

View reviewed changes

Mysteryem commented Nov 29, 2024

View reviewed changes

Exempt-Medic added the is: bug/fix Issues that are reporting bugs or pull requests that are fixing bugs. label Nov 29, 2024

Exempt-Medic reviewed Nov 29, 2024

View reviewed changes

Mysteryem added 2 commits December 1, 2024 14:50

Simplify Range.triangular

efefa2e

Now expects `lower <= end` and `tri` as a float: `0.0 <= tri <= 1.0`. No warnings or exceptions will be raised if these expectations are not met.

Fix using the wrong variables in the comments

94a91a7

It's expected that we can get `end`, it's the b in [a, b] that we don't want.

Exempt-Medic reviewed Dec 1, 2024

View reviewed changes

Exempt-Medic approved these changes Dec 1, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Core: Fix the distribution of Options.Range.triangular() #4283

Core: Fix the distribution of Options.Range.triangular() #4283

Mysteryem commented Nov 29, 2024 •

edited

Loading

Bicoloursnake left a comment

Mysteryem Nov 29, 2024

NewSoupVi commented Nov 29, 2024 •

edited

Loading

beauxq commented Nov 29, 2024

Exempt-Medic commented Nov 29, 2024 •

edited

Loading

Exempt-Medic commented Nov 29, 2024 •

edited

Loading

Mysteryem commented Nov 29, 2024

Mysteryem commented Nov 29, 2024

Exempt-Medic commented Nov 29, 2024

Mysteryem commented Nov 29, 2024

Exempt-Medic Nov 29, 2024

Mysteryem Nov 29, 2024

Exempt-Medic Nov 29, 2024

Exempt-Medic commented Nov 29, 2024

Exempt-Medic Dec 1, 2024

NewSoupVi Dec 2, 2024 •

edited

Loading

Exempt-Medic Dec 2, 2024 •

edited

Loading

Exempt-Medic left a comment

Core: Fix the distribution of Options.Range.triangular() #4283

Are you sure you want to change the base?

Core: Fix the distribution of Options.Range.triangular() #4283

Conversation

Mysteryem commented Nov 29, 2024 • edited Loading

What is this fixing or adding?

How was this tested?

Before

After

Bicoloursnake left a comment

Choose a reason for hiding this comment

Mysteryem Nov 29, 2024

Choose a reason for hiding this comment

NewSoupVi commented Nov 29, 2024 • edited Loading

beauxq commented Nov 29, 2024

Exempt-Medic commented Nov 29, 2024 • edited Loading

Exempt-Medic commented Nov 29, 2024 • edited Loading

Mysteryem commented Nov 29, 2024

Mysteryem commented Nov 29, 2024

Exempt-Medic commented Nov 29, 2024

Mysteryem commented Nov 29, 2024

Exempt-Medic Nov 29, 2024

Choose a reason for hiding this comment

Mysteryem Nov 29, 2024

Choose a reason for hiding this comment

Exempt-Medic Nov 29, 2024

Choose a reason for hiding this comment

Exempt-Medic commented Nov 29, 2024

Exempt-Medic Dec 1, 2024

Choose a reason for hiding this comment

NewSoupVi Dec 2, 2024 • edited Loading

Choose a reason for hiding this comment

Exempt-Medic Dec 2, 2024 • edited Loading

Choose a reason for hiding this comment

Exempt-Medic left a comment

Choose a reason for hiding this comment

Mysteryem commented Nov 29, 2024 •

edited

Loading

NewSoupVi commented Nov 29, 2024 •

edited

Loading

Exempt-Medic commented Nov 29, 2024 •

edited

Loading

Exempt-Medic commented Nov 29, 2024 •

edited

Loading

NewSoupVi Dec 2, 2024 •

edited

Loading

Exempt-Medic Dec 2, 2024 •

edited

Loading