Fix: Ensure Cache Key pk is Converted to INT to Prevent Dataframe Series Null Issues #161

Open
wants to merge 3 commits into
base: master

@arthur-verta arthur-verta commented Dec 23, 2024

-> Ensure that the cache key pk (if used) is always converted to an INT format.

This addresses a bug that occurs when a queryset is loaded into a dataframe. Specifically, if the queryset includes a nullable foreign key and a mix of instances with null and non-null related values, pandas no longer stores the primary key (pk) column with an integer dtype (it is reported as object). The pk values are converted to floats, for example 1 becomes 1.0, because a default pandas integer Series cannot contain None.
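The dtype change can be reproduced with pandas alone. The snippet below is a minimal sketch with made-up values (the Series names are hypothetical and not taken from this PR); it only illustrates why a string built from such a pk may no longer match a key built from the integer pk.

```python
import pandas as pd

# Hypothetical foreign key values: one row has no related object.
fk_without_nulls = pd.Series([1, 2, 3])
fk_with_nulls = pd.Series([1, None, 3])

# Without nulls the column keeps an integer dtype.
print(fk_without_nulls.dtype)

# With a null, the default NumPy-backed Series is upcast to float64.
print(fk_with_nulls.dtype)

# Casting to object afterwards keeps the float values,
# so the string form of the pk is "1.0" rather than "1".
as_object = fk_with_nulls.astype(object)
key_part = str(as_object[0])
print(key_part)
```

Any cache key that embeds `str(pk)` would therefore differ between the integer and the float form of the same pk.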

To avoid this, we must explicitly reconvert the pk column to INT before using it as a cache key.

Without this step, the dataframe currently ends up with None in every row in such cases.

[Using pandas 2.2.2]

Make sure that the cache key pk, if available, is converted to INT format.
Add a try / except in case the original pk is not an integer
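The two commits above could be sketched roughly as follows. This is not the code from the PR: `normalize_pk`, `make_cache_key`, and the `prefix:pk` key format are hypothetical names chosen for illustration, assuming the cache key is built from the string form of the pk.

```python
import math


def normalize_pk(pk):
    """Return pk as an int when it is integer-valued, otherwise unchanged."""
    # Missing values (None or NaN) have no cache key.
    if pk is None or (isinstance(pk, float) and math.isnan(pk)):
        return None
    try:
        as_int = int(pk)
    except (TypeError, ValueError, OverflowError):
        # Non-integer pk (for example a non-numeric string): leave as-is.
        return pk
    # Only swap in the int when no information is lost (1.0 -> 1, not 1.5 -> 1).
    return as_int if as_int == pk else pk


def make_cache_key(prefix, pk):
    """Build a hypothetical 'prefix:pk' cache key, or None for a missing pk."""
    pk = normalize_pk(pk)
    return None if pk is None else f"{prefix}:{pk}"


print(make_cache_key("author", 1.0))
print(make_cache_key("author", "abc"))
print(make_cache_key("author", None))
```

The equality check after `int(pk)` is a design choice of this sketch: it keeps string or fractional pks untouched instead of silently truncating them.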