probability classification report - does not show the entire size of data #240

Guidosalimbeni · 2022-05-29T11:46:34Z

Hello,
I have been investigating your great library. However, after updating the tool and running a classification report, I noticed that the row count in the report is inconsistent with the data used for the calculation.
I am not in a position to share the error, but I hope it is a quick thing to check on your side.
I have tried every change and debugging step I could think of. I know for sure that the reference data has 2500 rows, but the report only shows 480 records. I am really not sure what else to check, and any help would be much appreciated.

emeli-dral · 2022-05-30T17:37:01Z

Hi @Guidosalimbeni ,
thanks for sharing, we will try to figure it out.

I have two quick questions:

Did I get it right that this bug appeared in the latest version, and that everything worked correctly in the older one? Or have you built the dashboard only in the latest version? Knowing this will help us narrow down where the problem comes from a little faster.
For some reports we filter out rows with nan values, which might be the cause of the problem here. Could you please check the number of rows with at least one nan value: df.isna().any(axis=1).sum()? Could it be that there are 2020 rows with nan values?
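A minimal sketch of that check, assuming the reference data is loaded into a pandas DataFrame. The tiny `reference` frame and the `count_rows_with_nan` helper below are hypothetical stand-ins for illustration, not part of the library. If the number of rows without nans matches the count shown in the report (480 out of 2500 in this issue), nan filtering is the likely explanation.

```python
import numpy as np
import pandas as pd


def count_rows_with_nan(df: pd.DataFrame) -> int:
    """Return the number of rows that contain at least one nan value."""
    return int(df.isna().any(axis=1).sum())


# Hypothetical stand-in for the real reference data.
reference = pd.DataFrame(
    {
        "feature": [1.0, np.nan, 3.0, 4.0, np.nan],
        "target": [0, 1, 0, 1, 1],
        "prediction": [0.1, 0.8, np.nan, 0.7, 0.9],
    }
)

rows_with_nan = count_rows_with_nan(reference)
rows_remaining = len(reference) - rows_with_nan

print(f"total rows:       {len(reference)}")
print(f"rows with nan:    {rows_with_nan}")
print(f"rows without nan: {rows_remaining}")
```

On the real data, `rows_remaining` is the number to compare against the record count in the report.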

danieljmv01 · 2022-06-03T13:44:41Z

Hi @Guidosalimbeni,

In case it helps, these might be related: #241 and #242

Does the data include columns that have nans or np.inf values (even if they are not the target/prediction columns)?
Does the count change if you use a dataset with only the target and/or prediction columns?
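Both questions can be checked with a short sketch like the one below. It assumes a pandas DataFrame with columns named `target` and `prediction`; those column names, the sample data, and the `count_rows_with_nan_or_inf` helper are hypothetical and only illustrate the idea. Note that `isna()` does not flag infinite values by default, so they are counted separately here.

```python
import numpy as np
import pandas as pd


def count_rows_with_nan_or_inf(df: pd.DataFrame) -> int:
    """Return the number of rows with at least one nan or infinite value."""
    has_nan = df.isna().any(axis=1)
    # Infinity can only occur in numeric columns, so restrict the check to those.
    numeric = df.select_dtypes(include="number")
    has_inf = np.isinf(numeric).any(axis=1)
    return int((has_nan | has_inf).sum())


# Hypothetical stand-in for the real reference data.
reference = pd.DataFrame(
    {
        "feature": [1.0, np.inf, 3.0, np.nan],
        "target": [0, 1, 0, 1],
        "prediction": [0.2, 0.9, 0.4, 0.6],
    }
)

# Count affected rows across all columns, then across target/prediction only.
affected_all = count_rows_with_nan_or_inf(reference)
affected_subset = count_rows_with_nan_or_inf(reference[["target", "prediction"]])

print(f"affected rows (all columns):        {affected_all}")
print(f"affected rows (target/prediction):  {affected_subset}")
```

If the two counts differ on the real data, the missing rows are probably being dropped because of values in columns other than the target and prediction.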

Guidosalimbeni · 2022-06-03T14:30:45Z

Thanks, @emeli-dral and @danieljmv01

I have noticed it only in the new version, but it may be that I simply did not notice the error in the previous version. Apologies, I am not much help here.
I suspect the point about null values is likely the issue. Let me test on Monday and I will let you know.

emeli-dral added the bug Something isn't working label May 30, 2022

Tapot assigned emeli-dral Jul 19, 2022
