Description
Hey! A super useful feature would be allowing users to set the values used to normalise task scores. That way one could easily do PPO-normalisation, human-performance-normalisation, etc., or simply supply the true min and max values of a task.

This is a much needed feature because the way normalisation is currently done can have serious undesired effects. Since the min and max are derived from the data itself, adding new algorithms or more data changes the normalisation and can substantially alter the outcome of previous experiments. For example, if you ran PPO and SAC and compared performance, one might show a statistically significant improvement, but as soon as you add a new algorithm the normalisation changes and can completely change that result. Additionally, if you make two separate plots over different subsets of algorithms, the scores in each plot can be normalised completely differently, which is misleading when comparing across the graphs.
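To make the issue concrete, here is a minimal sketch in plain Python (not using the library's internals, and with made-up numbers) of how data-dependent min-max normalisation rescales previously reported scores as soon as a third algorithm enters the pool:

```python
# Toy illustration: normalised scores shift when the pool of algorithms changes.
# The raw scores below are invented purely to demonstrate the effect.

def min_max_normalise(score, all_scores):
    lo, hi = min(all_scores), max(all_scores)
    return (score - lo) / (hi - lo)

ppo, sac = 80.0, 90.0

# Normalising against only PPO and SAC.
pool = [ppo, sac]
print(min_max_normalise(ppo, pool), min_max_normalise(sac, pool))  # 0.0, 1.0

# Adding a third algorithm changes the min/max, and therefore every
# previously reported normalised score, even though PPO and SAC are unchanged.
new_algo = 200.0
pool = [ppo, sac, new_algo]
print(min_max_normalise(ppo, pool), min_max_normalise(sac, pool))  # 0.0, ~0.083
```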
Desired Outcome
I imagine the best approach would simply be to allow the user to pass in a dictionary mapping environment suite and task to min and max scores.
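Something along these lines, where the nested dictionary shape, the suite/task names, and the `normalise` helper are only illustrative suggestions rather than an existing API:

```python
# Hypothetical user-supplied normalisation bounds; the structure, the suite/task
# names, and the helper below are only suggestions, not part of the current API.
custom_min_max = {
    "smac": {
        "3m": {"min": 0.0, "max": 20.0},
        "8m": {"min": 0.0, "max": 20.0},
    },
    "flatland": {
        "5_trains": {"min": -15.0, "max": 0.0},
    },
}

def normalise(score, env_suite, task, min_max_scores):
    """Min-max normalise a raw score using fixed, user-provided bounds."""
    bounds = min_max_scores[env_suite][task]
    return (score - bounds["min"]) / (bounds["max"] - bounds["min"])

# e.g. normalise(12.0, "smac", "3m", custom_min_max) -> 0.6
```

With fixed bounds like these, the normalised scores stay stable no matter which algorithms or subsets of results are included in a given plot.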