General jsonlite compatibility issues #3
I think the above option should actually be the default. Stability of return types is important for programmatic use. Granted, the type of the vector cannot be guaranteed, but it seems easier for code to handle this than to deal with different 'shapes' (vector vs list).
@shikokuchuo Just to clarify: you think that (by default) the return type should always be an atomic vector? I'm a bit conflicted on how this might go, as I like the idea of defaulting to easy-to-handle atomic vectors, but I'm "offended" by the idea of just discarding type information as soon as types are mixed.
Yes, making it always return an atomic vector seems to be better behaviour. If you think about it from the perspective of wanting to do something with the returned data, having a list that retains the data types does not really give you any advantage. If you unlist (or coerce), that also loses the type information. That means handling code would need two branches: one for the case where it is an atomic vector, and one using lapply or similar when it is a list. In reality I think a lot of JSON is retrieved from web APIs, in which case you probably know the expected return type and can easily test in case the returned vector is not the expected type (perhaps the API returns a 'missing value' occasionally or something). This handling may be included within vectorised code. Otherwise we would have to write code that handles the fact that we may get a list from time to time (or flatten by default, which gets us back to the same place). However, I don't have a real use case where this (may) happen (thankfully!), so this is purely theoretical and I defer to those who are actually dealing with such data.
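To illustrate the two-branch handling described above, here is a minimal base-R sketch. The parsed values and the helper `as_numeric_safe` are hypothetical stand-ins, not part of any parser's API; only base R is used:

```r
# A type-preserving parser would return a list for a mixed array,
# but an atomic vector when all elements share one type.
parsed_mixed <- list(1L, "a", TRUE)  # stands in for parsing '[1, "a", true]'
parsed_plain <- c(1L, 2L, 3L)        # stands in for parsing '[1, 2, 3]'

# Handling code then needs two branches:
as_numeric_safe <- function(x) {
  if (is.list(x)) {
    # list branch: element-wise coercion via vapply
    vapply(x, function(el) suppressWarnings(as.numeric(el)), numeric(1))
  } else {
    # atomic branch: a single coercion suffices
    suppressWarnings(as.numeric(x))
  }
}

as_numeric_safe(parsed_mixed)  # 1 NA 1  ("a" is not a number; TRUE coerces to 1)
as_numeric_safe(parsed_plain)  # 1 2 3
```

If the parser instead always returned an atomic vector, only the second branch would be needed, at the cost of the mixed array arriving pre-coerced (here, to character).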
Thanks, I appreciate your thoughts. Mapping JSON-to-R is tricky and there are so many tradeoffs! |
Totally! That's why I'm so glad someone like you has actually attempted this :) |
(Can be moved to a new issue if that helps.) Would it be useful to add a class to the returned JSON string?

```r
jsonlite <- jsonlite::toJSON(as.list(euro), auto_unbox = TRUE)
jsonlite
#> {"ATS":13.7603,"BEF":40.3399,"DEM":1.9558,"ESP":166.386,"FIM":5.9457,"FRF":6.5596,"IEP":0.7876,"ITL":1936.27,"LUF":40.3399,"NLG":2.2037,"PTE":200.482}
str(jsonlite)
#> 'json' chr "{\"ATS\":13.7603,\"BEF\":40.3399,\"DEM\":1.9558,\"ESP\":166.386,\"FIM\":5.9457,\"FRF\":6.5596,\"IEP\":0.7876,\""| __truncated__

yyjsonr <- yyjsonr::to_json_str(as.list(euro), opts = yyjsonr::to_opts(auto_unbox = TRUE))
yyjsonr
#> [1] "{\"ATS\":13.7603,\"BEF\":40.3399,\"DEM\":1.95583,\"ESP\":166.386,\"FIM\":5.94573,\"FRF\":6.55957,\"IEP\":0.787564,\"ITL\":1936.27,\"LUF\":40.3399,\"NLG\":2.20371,\"PTE\":200.482}"
str(yyjsonr)
#> chr "{\"ATS\":13.7603,\"BEF\":40.3399,\"DEM\":1.95583,\"ESP\":166.386,\"FIM\":5.94573,\"FRF\":6.55957,\"IEP\":0.7875"| __truncated__

jsonlite:::print.json(yyjsonr)
#> {"ATS":13.7603,"BEF":40.3399,"DEM":1.95583,"ESP":166.386,"FIM":5.94573,"FRF":6.55957,"IEP":0.787564,"ITL":1936.27,"LUF":40.3399,"NLG":2.20371,"PTE":200.482}

class(yyjsonr) <- "json"
yyjsonr # since jsonlite is loaded, the print method is available, but this isn't necessarily the case
#> {"ATS":13.7603,"BEF":40.3399,"DEM":1.95583,"ESP":166.386,"FIM":5.94573,"FRF":6.55957,"IEP":0.787564,"ITL":1936.27,"LUF":40.3399,"NLG":2.20371,"PTE":200.482}
str(yyjsonr)
#> 'json' chr "{\"ATS\":13.7603,\"BEF\":40.3399,\"DEM\":1.95583,\"ESP\":166.386,\"FIM\":5.94573,\"FRF\":6.55957,\"IEP\":0.7875"| __truncated__
```

Created on 2023-09-06 with reprex v2.0.2

This would allow for …
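On the point that the print method is only available when jsonlite is loaded: a classed string could carry its own method instead. A minimal sketch, assuming a `"json"` class (this `print.json` is hypothetical, not an existing yyjsonr function; only base R is used):

```r
# Hypothetical print method so a classed JSON string prints without
# surrounding quotes or attributes, even when jsonlite is not loaded.
print.json <- function(x, ...) {
  cat(unclass(x), "\n", sep = "")
  invisible(x)
}

s <- "{\"a\":1}"
class(s) <- "json"
print(s)
#> {"a":1}
```

Registering such a method in the package namespace would make the bare `yyjsonr` object above print the same way regardless of which other packages are attached.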
If this is necessary for some use cases, it could perhaps live in an option (non-default). If a key raison d'être of yyjsonr is performance, adding a class is probably the single worst thing you can do to kill it. This is because every generic function, including the standard subsetting operators, then needs to perform S3 dispatch. It is of course possible to use .subset and .subset2, but littering code with those is not fun and hurts readability. Just my perspective of looking to push the boundaries of what's possible, not just "oh well, it's already 2x faster than 'jsonlite'".
I'm not sure I follow - I'm suggesting adding a class to the resulting JSON string. If you're calling a generic on that, it will go through dispatch regardless, but will hit the default method if the object lacks a formal class. If the suggestion was to make parsing a generic then yes, that would impact performance, but I'm not suggesting that at all.
Right - when I typed my response I was thinking the other way round (my main use cases). Assuming it's less likely you'd want to manipulate the resulting json string then I guess it's less of a concern.
Just to clarify what I meant: if the object is not classed, its OBJECT bit is not set on the SEXP and S3 dispatch is not performed. Example with the generic `[[` versus the non-generic `.subset2`:

```r
cstr <- str <- yyjsonr::to_json_str(list(a = 1, b = 2))
str
#> [1] "{\"a\":[1.0],\"b\":[2.0]}"
class(cstr) <- "json"
cstr
#> [1] "{\"a\":[1.0],\"b\":[2.0]}"
#> attr(,"class")
#> [1] "json"

# generic `[[`:
microbenchmark::microbenchmark(str[[1L]], cstr[[1L]])
#> Unit: nanoseconds
#>        expr min    lq   mean median    uq  max neval
#>   str[[1L]]  77  89.5 137.26   97.0 104.0 3557   100
#>  cstr[[1L]] 302 314.0 410.14  327.5 347.5 5027   100

# non-generic .subset2:
microbenchmark::microbenchmark(.subset2(str, 1L), .subset2(cstr, 1L))
#> Unit: nanoseconds
#>                expr min lq   mean median uq  max neval
#>   .subset2(str, 1L)  76 80 127.40   84.5 89 3386   100
#>  .subset2(cstr, 1L)  85 90 105.08   97.0 99  812   100
```

Created on 2023-09-06 with reprex v2.0.2
Really cool work, looks like there is lots of potential for wide benefits if this could be leveraged in some commonly used packages! It seems like if compatibility were designed such that …

Some examples that come to mind: …

Is this something you have in mind, or are you thinking of this as more of a bulk data ingest focused library that might not share the same design objectives of …

I did some poking around with …
@msummersgill I did think about aiming for strict jsonlite drop-in compatibility but this was not a priority for the initial release. |
+1 against a drop-in replacement. If there is something to improve, this is the place to do it. If jsonlite made some imperfect design decisions, they are stuck with them for good. A new package like this is exactly where corrections can and should be made. I would like to point out the potential of …
`{yyjsonr}` has a "soft" goal to try and be compatible with `{jsonlite}` in the most common use cases. Full compatibility with `{jsonlite}` is a non-goal.

This is a place to voice any compatibility issue you have, or express support/disagreement for things on the list below. This will help prioritise programming effort.

- `digits` argument
- `promote_num_to_string` - option to promote numerics to strings if an array has a mixture of numbers and strings. Set to `TRUE` for `jsonlite` compatibility.
- … defaults to `FALSE`. Set to `TRUE` for `jsonlite` compatibility.
- `POSIXt = c('string', 'epoch', 'ISO8601')` option like `jsonlite::toJSON()`
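As a rough illustration of the coercion a `promote_num_to_string` option would perform, R's own combine rules already show the behaviour in question. This is a base-R sketch only; the option itself is a proposal, not an existing yyjsonr argument:

```r
# Mixing numbers and strings in one atomic vector forces a common type:
mixed <- c(1.5, "a", 2L)
mixed  # all elements promoted to character
#> [1] "1.5" "a"   "2"

# jsonlite-compatible serialisation of such data would emit
# ["1.5","a","2"], whereas preserving types would require a
# JSON array of mixed scalars like [1.5,"a",2].
```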