The discussion on whether we want to distinguish between ordinal and nominal categorical features in ehrapy was raised while calculating feature correlations as part of the new bias detection method (PR #690).
As of now, feature correlations would be the only application in ehrapy that needs to differentiate between nominal and ordinal features. Because detecting this difference automatically is nearly impossible once the data are encoded, users would have to manually declare which features are ordinal and which are nominal, just so that the feature correlations can be computed with the most suitable method (Spearman CC vs. Cramér's V, for instance). That is quite some extra effort for the user.
Additionally, computing the Spearman/Pearson CC for all features won't show any correlations that aren't there; it will only fail to reveal some correlations between categorical features. However, those should then be detected by the feature importances calculation.
Hence, we decided to stick to the Pearson/Spearman CC for all features for now. If the differentiation between ordinal and nominal categorical features becomes important elsewhere in ehrapy in the future, it would be easy to adapt the bias detection method accordingly.
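To make the trade-off concrete, here is a minimal, standard-library-only sketch of a case where an integer-encoded nominal feature is perfectly associated with an outcome, yet both correlation coefficients miss it. The feature names (`ward_code`, `outcome`) and the toy values are made up for illustration, and the helper functions are hand-written for this example; they are not part of ehrapy's API.

```python
from collections import Counter
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    counts = Counter(values)
    rank_of, position = {}, 0
    for value in sorted(counts):
        rank_of[value] = position + (counts[value] + 1) / 2
        position += counts[value]
    return [rank_of[v] for v in values]


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def cramers_v(x, y):
    """Cramér's V (without bias correction) from the chi-squared statistic."""
    n = len(x)
    joint, rows, cols = Counter(zip(x, y)), Counter(x), Counter(y)
    chi2 = 0.0
    for r in rows:
        for c in cols:
            expected = rows[r] * cols[c] / n
            chi2 += (joint[(r, c)] - expected) ** 2 / expected
    return sqrt(chi2 / (n * (min(len(rows), len(cols)) - 1)))


# Hypothetical toy data: a nominal feature encoded as 0/1/2 (the order of the
# codes carries no meaning) and a binary outcome that is 1 only for code 1.
ward_code = [0, 0, 1, 1, 2, 2]
outcome = [0, 0, 1, 1, 0, 0]

pearson_cc = pearson(ward_code, outcome)
spearman_cc = spearman(ward_code, outcome)
v = cramers_v(ward_code, outcome)

print("Pearson CC: ", round(pearson_cc, 3))
print("Spearman CC:", round(spearman_cc, 3))
print("Cramér's V: ", round(v, 3))
```

On this toy data both correlation coefficients are zero, while Cramér's V is 1, because the outcome is fully determined by the category. This is the kind of association that would stay hidden in the correlation step and would have to be picked up by the feature importances instead.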