Proposal
Currently, in the gymnasium/spaces/discrete module, the sample() method of the Discrete class can only draw uniformly at random among the valid integers. It would be nice to extend it so that it also accepts a probability mask that gives, for each integer, the probability that it is returned. This could also serve as an alternative to ordinary masking: if a move is invalid, we can assign it a probability of zero.
Motivation
This would be helpful for me, since I'm implementing an Ant Colony Optimization algorithm, and each action needs to be assigned a probability of being chosen.
Pitch
Add a probability_mask parameter to the sample function, so that the user can provide a regular mask, a probability mask, or neither. A check should be added to make sure that they provide only one of the mask options, not both. We also need to verify that the probabilities add up to 1.
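As a rough sketch of how the pitched behaviour could look, here is a standalone function rather than a patch to Discrete. The name sample_discrete, the rng argument, the tolerance of 1e-8, and the choice to raise on an all-zero mask are assumptions made for illustration. It uses only the standard library so it runs without Gymnasium installed; a real implementation would presumably draw from the space's own random generator instead.

```python
import math
import random
from typing import Optional, Sequence


def sample_discrete(
    n: int,
    start: int = 0,
    mask: Optional[Sequence[int]] = None,
    probability_mask: Optional[Sequence[float]] = None,
    rng: Optional[random.Random] = None,
) -> int:
    """Hypothetical stand-in for an extended Discrete.sample()."""
    rng = rng or random.Random()

    # The two mask kinds are mutually exclusive.
    if mask is not None and probability_mask is not None:
        raise ValueError("Provide either mask or probability_mask, not both.")

    if probability_mask is not None:
        if len(probability_mask) != n:
            raise ValueError(f"probability_mask must have length {n}.")
        if any(p < 0 for p in probability_mask):
            raise ValueError("Probabilities must be non-negative.")
        # Tolerance is an assumption; floating-point sums are rarely exact.
        if not math.isclose(sum(probability_mask), 1.0, abs_tol=1e-8):
            raise ValueError("Probabilities must add up to 1.")
        # Index i is drawn with probability probability_mask[i];
        # entries with probability zero are never returned.
        return start + rng.choices(range(n), weights=probability_mask, k=1)[0]

    if mask is not None:
        if len(mask) != n:
            raise ValueError(f"mask must have length {n}.")
        valid = [i for i, allowed in enumerate(mask) if allowed]
        if not valid:
            # Design choice of this sketch, not necessarily Gymnasium's behaviour.
            raise ValueError("mask excludes every value.")
        return start + rng.choice(valid)

    # No mask at all: uniform over start, ..., start + n - 1.
    return start + rng.randrange(n)
```

Calling sample_discrete(3, probability_mask=[0.0, 0.25, 0.75]) can only return 1 or 2, and passing both mask and probability_mask raises a ValueError.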
Alternatives
Rather than adding a new argument to the sample function, we could create a new function called probability_sample that takes the arguments n and mask. Both arguments could be required, since if you're not going to supply a mask, you might as well use the original sample method.
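For comparison, the alternative could be sketched as a separate function with both arguments required. The exact signature, the optional rng argument, and the validation shown here are assumptions, not an agreed design, and the standard library is used only to keep the example self-contained.

```python
import math
import random
from typing import Optional, Sequence


def probability_sample(
    n: int,
    mask: Sequence[float],
    rng: Optional[random.Random] = None,
) -> int:
    """Draw one integer from 0..n-1, where mask[i] is the probability of i."""
    if len(mask) != n:
        raise ValueError(f"mask must have length {n}.")
    if any(p < 0 for p in mask):
        raise ValueError("Probabilities must be non-negative.")
    if not math.isclose(sum(mask), 1.0, abs_tol=1e-8):
        raise ValueError("Probabilities must add up to 1.")
    rng = rng or random.Random()
    return rng.choices(range(n), weights=mask, k=1)[0]


# Invalid moves (indices 1 and 3) get probability zero, so they are never drawn.
action = probability_sample(4, [0.5, 0.0, 0.5, 0.0], rng=random.Random(42))
```

Keeping this separate leaves the existing sample signature untouched, at the cost of a second entry point that users have to discover.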
Additional context
If the maintainers of Gymnasium think that this is a good idea, I would love to implement it myself and make a pull request.
Checklist
I have checked that there is no similar issue in the repo