TypeError: '<' not supported between instances of 'str' and 'int' when setting pval=True #160

Open
erdie721 opened this issue Feb 29, 2024 · 5 comments

@erdie721

I get the following error whenever I set pval=True on a dataset. I'm using Jupyter Notebook via Anaconda, running Python 3.9.


TypeError Traceback (most recent call last)
/tmp/ipykernel_2037601/677072381.py in <module>
----> 1 results = TableOne(data = df, columns = columns, groupby = groupby, nonnormal = nonnormal, categorical = categorical,pval=True)

~/anaconda3/lib/python3.9/site-packages/tableone/tableone.py in __init__(self, data, columns, categorical, groupby, nonnormal, min_max, pval, pval_adjust, htest_name, pval_test_name, htest, isnull, missing, ddof, labels, rename, sort, limit, order, remarks, label_suffix, decimals, smd, overall, row_percent, display_all, dip_test, normal_test, tukey_test)
386 # forgive me jraffa
387 if self._pval:
--> 388 self._htest_table = self._create_htest_table(data)
389
390 # correct for multiple testing

~/anaconda3/lib/python3.9/site-packages/tableone/tableone.py in _create_htest_table(self, data)
1113 # if categorical, create contingency table
1114 elif is_categorical:
-> 1115 catlevels = sorted(data[v].astype('category').cat.categories)
1116 cross_tab = pd.crosstab(data[self._groupby].
1117 rename('groupby_var'), data[v])

TypeError: '<' not supported between instances of 'str' and 'int'
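
The failing line (1115 in the traceback) can be reproduced without tableone. This is a minimal sketch, assuming a categorical column that holds both integers and a string such as "<5". The helper name `category_levels` and the sample values are made up for illustration and are not part of the library.

```python
import pandas as pd


def category_levels(series):
    """Mimic the failing line: sort the distinct category levels of a column."""
    return sorted(series.astype("category").cat.categories)


# Hypothetical sample data: one clean column, one with a stray string entry.
clean = pd.Series([3, 1, 2, 1])
mixed = pd.Series([3, 1, "<5", 2])

# All levels are integers, so sorting works.
print(category_levels(clean))

# The levels are a mix of int and str, so sorted() cannot compare them.
try:
    category_levels(mixed)
except TypeError as err:
    print(f"TypeError: {err}")
```

If this sketch matches the real data, the error comes from Python refusing to order a string against an integer, not from the p-value calculation itself.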

@ExtremeCoolDude

Can anyone help me understand this issue, please? I'm having the same problem.

The values are stratified and numeric, so I don't understand where the error is coming from.

@tompollard
Owner

Sorry for the delay, will work on bug fixes this week!

@tompollard
Owner

I'm not quite sure what's going on here, but I think it's a data type issue (one of your columns appears to contain a mix of strings and numbers). Are you able to share a dataset that can be used to reproduce the error?
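
One way to check for that kind of column before building the table is to list the Python types found in each one. This is a rough sketch using pandas only. The function name `find_mixed_type_columns` and the example frame are hypothetical, and it assumes missing values should be ignored.

```python
import pandas as pd


def find_mixed_type_columns(df):
    """Return {column: [type names]} for columns holding more than one Python type."""
    mixed = {}
    for col in df.columns:
        # Collect the type name of every non-missing value in the column.
        type_names = sorted({type(value).__name__ for value in df[col].dropna()})
        if len(type_names) > 1:
            mixed[col] = type_names
    return mixed


# Hypothetical example: "stage" contains integers plus the string "<2".
df = pd.DataFrame(
    {
        "age": [34, 51, 47],
        "stage": [1, "<2", 3],
        "sex": ["F", "M", "F"],
    }
)

print(find_mixed_type_columns(df))
```

Any column reported here is a candidate for the str/int comparison failure when it is treated as categorical.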

@erdie721
Author

Yeah, I think some of the columns had an entry with a < or > on some of the numbers. Is there any way to just exempt those columns from having a p-value calculated? Or having the error be slightly more descriptive of the issue?

@tompollard
Owner

> Yeah, I think some of the columns had an entry with a < or > on some of the numbers. Is there any way to just exempt those columns from having a p-value calculated? Or having the error be slightly more descriptive of the issue?

Yes definitely, I'll have a think about how best to handle this.
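
In the meantime, a possible workaround is to clean the affected columns before passing the data to TableOne. The sketch below uses pandas only and shows two options for the case described above, where some entries carry a < or > mark. The function names and sample values are made up for illustration. Note that stripping the mark turns "<2" into 2, which discards the information that the value was censored, so whether that is acceptable depends on the data.

```python
import pandas as pd


def strip_comparison_marks(series):
    """Option 1: drop leading '<' or '>' and convert the column to numbers.

    Values that still cannot be parsed become NaN.
    """
    def clean(value):
        if pd.isna(value):
            return value
        return str(value).strip().lstrip("<>").strip()

    return pd.to_numeric(series.map(clean), errors="coerce")


def levels_as_text(series):
    """Option 2: keep the original entries but make every level a string.

    Strings can be sorted against each other, so the levels no longer mix types.
    """
    return series.map(lambda value: value if pd.isna(value) else str(value))


# Hypothetical column with stray marks on some entries.
raw = pd.Series([1, "<2", " >3 ", 4])

print(strip_comparison_marks(raw).tolist())
print(sorted(levels_as_text(raw).astype("category").cat.categories))
```

Option 1 suits columns that are meant to be numeric. Option 2 keeps the marked entries as their own levels, which may be preferable for categorical columns, although the levels then sort as text rather than as numbers.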
