Classification currently gives no warning when the dataset is unbalanced. A warning should be raised and, if possible, the user should be given the option to balance the dataset by simply subsampling the largest group down to the size of the smallest group. A heuristic for "unbalanced" could be one group having 25% more data than the other, or something along those lines.
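As a rough illustration of the idea, the sketch below checks the 25% heuristic and subsamples every group down to the size of the smallest one. The function names (`is_unbalanced`, `warn_if_unbalanced`, `balance_by_subsampling`) and the `tolerance` and `seed` parameters are hypothetical and not part of the existing code; it assumes samples and labels are plain Python sequences.

```python
import random
import warnings
from collections import Counter


def is_unbalanced(labels, tolerance=0.25):
    """Return True if the largest group exceeds the smallest by more than `tolerance`."""
    counts = Counter(labels)
    largest, smallest = max(counts.values()), min(counts.values())
    return largest > (1 + tolerance) * smallest


def warn_if_unbalanced(labels, tolerance=0.25):
    """Emit a UserWarning when the heuristic flags the labels; return the flag."""
    unbalanced = is_unbalanced(labels, tolerance)
    if unbalanced:
        warnings.warn(
            f"Dataset is unbalanced, group sizes: {dict(Counter(labels))}",
            UserWarning,
        )
    return unbalanced


def balance_by_subsampling(samples, labels, seed=0):
    """Randomly subsample every group down to the size of the smallest group."""
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    indices_by_label = {}
    for index, label in enumerate(labels):
        indices_by_label.setdefault(label, []).append(index)
    target = min(len(indices) for indices in indices_by_label.values())
    keep = []
    for indices in indices_by_label.values():
        keep.extend(rng.sample(indices, target))
    keep.sort()  # preserve the original ordering of the retained samples
    return [samples[i] for i in keep], [labels[i] for i in keep]


labels = ["a"] * 10 + ["b"] * 4
samples = list(range(len(labels)))
if is_unbalanced(labels):
    samples, labels = balance_by_subsampling(samples, labels)
```

Subsampling discards data from the larger group, so it is probably best kept as an opt-in choice rather than applied automatically.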
This option to balance the dataset will be useful for the newly added modules that look at error prediction vs classification accuracy.
Failing that, an alternative to balancing the data is to use metrics that take the imbalance into account, or learning algorithms that can deal with unbalanced datasets.
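One example of a metric that takes imbalance into account is balanced accuracy, the mean of the per-class recalls. The sketch below is a minimal pure-Python version for illustration only; the function names are hypothetical, and a library implementation (for instance scikit-learn's `balanced_accuracy_score`, if that library is available in this project) would normally be preferable.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    hits = defaultdict(int)    # correct predictions per true class
    totals = defaultdict(int)  # number of samples per true class
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    recalls = [hits[c] / totals[c] for c in totals]
    return sum(recalls) / len(recalls)


# A classifier that always predicts the majority class looks good on plain
# accuracy but not on balanced accuracy.
y_true = [0] * 9 + [1]
y_pred = [0] * 10
plain = accuracy(y_true, y_pred)
balanced = balanced_accuracy(y_true, y_pred)
```

Here the always-majority classifier gets every majority sample right and every minority sample wrong, so balanced accuracy drops to chance level while plain accuracy stays high.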