from_ndjson #58
on branch |
@dcooley I just installed this per your comment in #29, but something is erroring out quickly. Pretty minimal reprex:
# remotes::install_github("SymbolixAU/jsonify", "issue58")
jsonify::from_ndjson('{"abc":123}')
fails with error:
I think this one's pretty straightforward. There are some switches in argument names between
Edit: just forked to make this tweak and ran into more issues. Specifically, calling
To reproduce from my fork,
jsonify::from_ndjson(ndjson = '{"abc":123}')
jsonify::from_ndjson(paste0(c('{"abc":123}', '{"def":456}', collapse = "\n")))
with results:
|
I've also messed up my branches, so I'm now going through and fixing a load of branch conflicts. However, your
paste0(c('{"abc":123}', '{"def":456}', collapse = "\n"))
[1] "{\"abc\":123}" "{\"def\":456}" "\n"
Is this what you intended? It looks like you may have the closing bracket in the wrong place?
paste0(c('{"abc":123}', '{"def":456}'), collapse = "\n")
[1] "{\"abc\":123}\n{\"def\":456}" |
Yikes, not my finest moment on GitHub. You're correct, closing paren in the wrong spot, so this statement now works with the argname fix (json to ndjson) described above:
On larger test files that I've written with to_ndjson in the past few days, I'm still seeing errors in from_ndjson, but I can't easily reproduce these on test cases at the moment. The crux of the issue is that this works fine:
lapply(readLines("somefile.ndjson"),
       function(x) from_ndjson(ndjson = x))
but this fails:
from_ndjson("somefile.ndjson")
with
I can't see why each individual line succeeds but parsing the entire file fails. When I try to reproduce the error using a simple case,
test_list <- from_ndjson(paste0(c('{"abc":123}', '{"def":456}'), collapse = "\n"))
writeLines(to_ndjson(test_list), "testlist.ndjson")
from_ndjson("testlist.ndjson")
this also works fine. So I need to figure out the structural difference between these files to reproduce. Will write back when I do. (This time without dumb errors!) |
Can you check how the ndjson is separated, is it by |
edit: clarified this and not a problem with 58, working as intended
Ha, I can finally reproduce. This was nasty. Still don't have a good idea why it's happening. When I wrote the files throwing
The two files -- 1 written with
I discovered this on a production file, which is a deeply nested data frame with list-cols. I don't see why the deeply nested part would cause this, but until I can get a more minimal reprex together I've just included an anonymized variant of the data. (GitHub won't let me upload .rds directly, so I guess unzip this first.)
remotes::install_github("SymbolixAU/jsonify", "issue29", force = TRUE)
testcase <- readRDS("testcase.rds")
writeLines(jsonify::to_ndjson(testcase), "issue29.ndjson")
remotes::install_github("SymbolixAU/jsonify", "issue58_fix", force = TRUE)
testcase <- readRDS("testcase.rds")
writeLines(jsonify::to_ndjson(testcase), "issue58.ndjson")
openssl::md5(file("issue29.ndjson"))
openssl::md5(file("issue58.ndjson"))
length(readLines("issue29.ndjson"))
length(readLines("issue58.ndjson"))
library(jsonify)
foo <- from_ndjson("issue58.ndjson")
bar <- from_ndjson("issue29.ndjson")
bar <- from_ndjson(paste0(readLines("issue58.ndjson", nrow(foo)), collapse = "\n"))
all.equal(foo, bar)
Seriously thought I was hallucinating halfway through this. |
lst <- list(
x = 1:5
, y = list(
a = letters[1:5]
, b = data.frame(i = 10:15, j = 20:25)
)
)
from_ndjson( "{\"x\":[1,2,3,4,5]}\n{\"y\":{\"a\":[\"a\",\"b\",\"c\",\"d\",\"e\"],\"b\":[{\"i\":10,\"j\":20},{\"i\":11,\"j\":21},{\"i\":12,\"j\":22},{\"i\":13,\"j\":23},{\"i\":14,\"j\":24},{\"i\":15,\"j\":25}]}}" )
|
^^ Actually, this is probably correct, because we are splitting the list up during the
In the future, if anyone specifically asks or this is an issue, we may have to come up with rules for handling
{"x":[1,2,3,4,5]}
{"y":{"a":["a","b","c","d","e"],"b":[{"i":10,"j":20},{"i":12,"j":22}]}}
where we would need to remove the trailing |
Going to CRAN because I need to update the package to set |
&Document
rapidjson::Dom
rather than streaming the string with << '[' << ndjson << ']', then std::replace( json.begin(), json.end(), '\n', ',')
update reverse imports (mapdeck, geojsonsf, spatialwidget) to use R_xlen_t values before release to CRAN.
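For anyone following along, here is a minimal sketch of the string-based approach mentioned above: wrap the NDJSON in [ and ], then swap each newline for a comma so the result parses as one JSON array. This is not jsonify's actual source. The function name ndjson_to_json_array is hypothetical, and stripping trailing line breaks before the replacement is an assumption about how a trailing newline might be handled. It also assumes there are no blank lines between records.

```cpp
#include <algorithm>
#include <sstream>
#include <string>

// Hypothetical helper (not part of jsonify's API): turn NDJSON text into a
// single JSON array string that an ordinary JSON parser can consume.
std::string ndjson_to_json_array(std::string ndjson) {
    // Assumption: drop trailing line breaks first, otherwise the final
    // record would be followed by a dangling comma after the replacement.
    while (!ndjson.empty() &&
           (ndjson.back() == '\n' || ndjson.back() == '\r')) {
        ndjson.pop_back();
    }

    // Wrap the records in array brackets by streaming them.
    std::ostringstream os;
    os << '[' << ndjson << ']';
    std::string json = os.str();

    // Valid JSON strings cannot contain raw newlines, so every remaining
    // '\n' is a record separator and can become a comma.
    std::replace(json.begin(), json.end(), '\n', ',');
    return json;
}
```

With this sketch, a file written by writeLines (which ends with a newline) and a string built with paste0(..., collapse = "\n") (which does not) would produce the same array text. Blank lines in the middle of the input would still yield consecutive commas, so they would need separate handling.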