jtd-infer crashes with a stack overflow on large amounts of concatenated input #11

Open
bjorndm opened this issue Jun 11, 2021 · 1 comment

bjorndm commented Jun 11, 2021

I can use jtd-infer with a few concatenated files, but when I concatenate 1000+ JSON files, the program crashes with a stack overflow. Similar inference tools for JSON Schema, like Genson, work fine with these same 1000+ files.

$ ls -Aq testdata/metadata/  | wc -l 
2
$ cat testdata/metadata/*.json | jtd-infer 
# This works OK
$ ls -Aq ../test/testdata/metadata/  | wc -l 
1129
$ cat ../test/testdata/metadata/*.json | jtd-infer 
thread 'main' has overflowed its stack
fatal runtime error: stack overflow
Aborted (core dumped)

I would appreciate it if this could be fixed somehow. Perhaps allow several input files on the command line instead of one?

zekefast commented Apr 7, 2023

@bjorndm It seems that the schema inferred by jtd-infer is too large to fit in the default stack size on your system. jtd-infer could be rewritten to use less stack space and more heap instead (which would likely slow it down a bit), so that it fits within the default stack size, which on Linux systems is usually 8 MiB.

But you can also increase the stack size to work around the issue.

ulimit -a

shows the limits configured on your system. You are interested in the stack size line, or you can grep for it: ulimit -a | grep "stack size". The value is usually given in KiB (multiply by 1024 to get bytes).
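
As a minimal sketch of the check described above, assuming a POSIX-style shell such as bash where `ulimit -s` prints the soft stack limit in KiB (or the word `unlimited`), the current limit can be read and converted to bytes like this. The variable name `stack_kib` is only illustrative.

```shell
# Read the current soft stack limit; most shells report it in KiB.
stack_kib=$(ulimit -s)

if [ "$stack_kib" = "unlimited" ]; then
  echo "stack size: unlimited"
else
  # Multiply by 1024 to convert KiB to bytes.
  echo "stack size: ${stack_kib} KiB ($((stack_kib * 1024)) bytes)"
fi
```

This avoids grepping the full `ulimit -a` listing when only the stack size is of interest.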

To set new value use

ulimit -s <VALUE_IN_KiB>

You can try doubling (ulimit -s 16384) or quadrupling (ulimit -s 32768) the stack size and then run jtd-infer again.
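
One way to apply this workaround without changing the limit for the rest of the session is to raise it inside a subshell. The sketch below assumes a bash-like shell; `with_stack` is a hypothetical helper, not something jtd-infer provides. It uses `ulimit -S` so only the soft limit changes, because in bash a plain `ulimit -s <VALUE>` sets the soft and hard limits together.

```shell
# Hypothetical helper (not part of jtd-infer): run a command with a
# temporary soft stack limit, given in KiB.
# The function body is a subshell, so the changed limit is discarded
# as soon as the command finishes.
with_stack() (
  ulimit -S -s "$1" || exit 1
  shift
  "$@"
)

# Example, using the path from the original report:
#   cat ../test/testdata/metadata/*.json | with_stack 32768 jtd-infer
```

Raising the soft limit only succeeds up to the hard limit (`ulimit -H -s`), so very large values may be rejected on some systems.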

@ucarion It could be worth adding this workaround to the documentation or README, since people with large JSON files and schemas hit this issue quite frequently, and not everybody knows that the default stack size can be increased.
