Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GenericProperty (and therefore Expando) changes str values to bytes (sometimes) #99

Open
pnico opened this issue Aug 18, 2023 · 3 comments · May be fixed by #100
Open

GenericProperty (and therefore Expando) changes str values to bytes (sometimes) #99

pnico opened this issue Aug 18, 2023 · 3 comments · May be fixed by #100

Comments

@pnico
Copy link

pnico commented Aug 18, 2023

Expected Behavior

If I set the value of a GenericProperty to a str, then read it back from datastore, the value should still always be a str. The value I read back should equal the value I put in.

Actual Behavior

If I set a GenericProperty to the string 'oops', then read it back from datastore, the value is now b'oops', so it is not equal to the value I put in the property.
If I set a GenericProperty to the string 'oøps', then read it back from datastore, the value is still 'oøps'. Apparently we are deciding to do different behavior depending on the contents of the string.

The same thing happens in appengine Python 2 incidentally, but it was not a big problem in practice because in Python 2, 'oops' == u'oops' was True. In Python 3 it is a problem :)

Steps to Reproduce the Problem

class Oops(ndb.Model):
    generic_prop = ndb.GenericProperty()
    string_prop = ndb.StringProperty()

oops = Oops(generic_prop="oøps", string_prop="oøps")
assert "oøps" == oops.generic_prop == oops.string_prop
oops.put()
assert "oøps" == oops.generic_prop == oops.string_prop
oops = oops.key.get()
assert "oøps" == oops.generic_prop == oops.string_prop
oops = oops.key.get(use_cache=False)
assert oops.generic_prop == oops.string_prop  # <- Succeeds: both properties are still "oøps"

oops = Oops(generic_prop="oops", string_prop="oops")
assert "oops" == oops.generic_prop == oops.string_prop
oops.put()
assert "oops" == oops.generic_prop == oops.string_prop
oops = oops.key.get()
assert "oops" == oops.generic_prop == oops.string_prop
oops = oops.key.get(use_cache=False)
assert oops.generic_prop == oops.string_prop  # <- Fails, because now generic_prop is b'oops'

Specifications

  • Version: appengine-python-standard 1.1.3, Python 3.11.4
  • Platform: MacOS
@pnico
Copy link
Author

pnico commented Aug 18, 2023

I think changing one line here, from sval.decode('ascii') to sval = str(sval.decode('ascii')), should fix the issue without causing problems for anyone who isn't doing something really strange? edit: this would break indexed GenericProperties, see linked PR for a fix for this at least with unindexed ones.

try:
sval.decode('ascii')
except UnicodeDecodeError:
try:
sval = six.text_type(sval.decode('utf-8'))
except UnicodeDecodeError:
pass

Edit: I guess this would break if you are putting bytes in and expecting bytes out, but for that case, GenericProperty will still fail for any bytes object that isn't an ascii-encoded string anyway, for example if you are putting binary data into it (but this would also fail in Python 2). Since binary data was never (edit: "properly") supported, it shouldn't be a compatibility issue to not support bytes here in Python 3 either and just assume the user wanted a str encoded as ascii or utf-8, which was assumed before also. (edit: the linked PR only assumes you wanted a str if you passed in a str, and doesn't change/fix any behavior if you explicitly passed bytes)

GenericProperty is also used internally for Expando though, so I don't know if a simple change like this would mess anything up with that. It's surprising to me that this issue hasn't been reported here or in the Python NDB library where it's also like this, maybe people don't use these two features very much.

@pnico
Copy link
Author

pnico commented Aug 21, 2023

Same issue exists with Expando:

class OopsExpando(ndb.Expando):
    orig_prop = ndb.StringProperty()

value_to_put = 'oops'
oops = OopsExpando(orig_prop=value_to_put, added_prop=value_to_put)
oops.put()
assert oops.orig_prop == oops.added_prop  # <- succeeds
oops = oops.key.get(use_cache=False)
assert oops.orig_prop == oops.added_prop  # <- fails, but works with "oøps"

@pnico pnico changed the title GenericProperty changes str values to bytes (sometimes) GenericProperty (and therefore Expando) changes str values to bytes (sometimes) Aug 21, 2023
@pnico pnico linked a pull request Aug 21, 2023 that will close this issue
1 task
@pnico
Copy link
Author

pnico commented Aug 21, 2023

Just a note: I noticed the existing tests for this only test with bytes objects containing encoded ascii strings. I doubt this is the most common use of string properties in Expando, and I almost wonder if the tests were simply modified in Python 3 this way to make them pass; please let me know if there's a better reason I'm not thinking of. Also, it will break if you happen to pass bytes objects containing encoded utf-8 strings - I didn't attempt to fix this, since it was most likely also broken in Python 2, and also probably not a common use case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
1 participant