
kakkoko/jwalk

 
 

jwalk

jwalk is a streaming JSON parser for Unix:

  • streaming, in that individual JSON tokens are parsed as soon as they are read from input;
  • for Unix, in that its line-based output is designed to be used and manipulated by the standard Unix toolset.

jwalk is written in standard awk, sed, and sh, and does not require a C compiler. It is intended to run from source on any contemporary POSIX system.

It can parse large documents slowly, but steadily, in memory space proportional to the key depth of the document.

Reading Records From JSON

The jwalk command is a filter which transforms a stream of JSON tokens from standard input into a stream of tab-delimited, line-separated records on standard output.

A token is an indivisible, non-whitespace span of JSON, such as a number, string, boolean, bracket, or brace.

Every line of jwalk output is a record, arranged as follows, with each field separated by a tab character:

  • zero or more fields, collectively the path, containing the string keys used to access the value, followed by
  • one field specifying the value's type, followed by
  • one field representing the value itself.

The type is one of number, string, boolean, null, array, or object. String values are encoded as UTF-8, and are unescaped with the exception of \n, \t, and \\.
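Because \n, \t, and \\ stay escaped in string values, a consumer that needs the literal characters has to decode them itself. The following is a minimal sketch of one way to do that in awk, assuming those three sequences are the only escapes present in the value; unescape_value is a hypothetical helper name, not part of jwalk.

```shell
# Hypothetical helper (not part of jwalk): decode \n, \t, and \\ on stdin.
# The line is scanned one character at a time so that an escaped
# backslash followed by "n" is not mistaken for a newline escape.
unescape_value() {
  awk '{
    out = ""
    n = length($0)
    for (i = 1; i <= n; i++) {
      c = substr($0, i, 1)
      if (c == "\\" && i < n) {
        d = substr($0, ++i, 1)
        if (d == "n")       out = out "\n"
        else if (d == "t")  out = out "\t"
        else if (d == "\\") out = out "\\"
        else                out = out c d   # unknown escape: keep as-is
      } else {
        out = out c
      }
    }
    print out
  }'
}

# Sample value typed by hand, standing in for a jwalk value field.
printf '%s\n' 'first line\nsecond line' | unescape_value
```

A chain of global substitutions would be shorter, but it can misread a sequence such as \\n, which is why the sketch scans character by character.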

Examples

$ echo 123.45 | jwalk
number ▷ 123.45

$ echo true | jwalk
boolean ▷ true

$ echo '"acab"' | jwalk
string ▷ acab

$ echo null | jwalk
null ▷

$ echo '[123,"acab"]' | jwalk
array ▷
0 ▷ number ▷ 123
1 ▷ string ▷ acab

$ echo '{"version":"1.0.0"}' | jwalk
object ▷
version ▷ string ▷ 1.0.0

In general, records of type array and object provide structural information. Use the -l (or --leaf-only) flag to skip these records.

$ echo '[123,"acab"]' | jwalk -l
0 ▷ number ▷ 123
1 ▷ string ▷ acab

$ echo '{"version":"1.0.0"}' | jwalk -l
version ▷ string ▷ 1.0.0
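If the full output has already been captured without -l, the structural records can still be dropped afterwards, since the type is always the second-to-last field. This is a sketch of that idea in awk, not a description of how -l works internally; leaf_only is a hypothetical helper name, and the input is typed by hand to mirror the examples above.

```shell
# Hypothetical helper (not part of jwalk): keep only leaf records.
# The type is the second-to-last tab-separated field of every record.
leaf_only() {
  awk -F '\t' 'NF >= 2 && $(NF - 1) != "array" && $(NF - 1) != "object"'
}

# Records shaped like the output of: echo '[123,"acab"]' | jwalk
printf 'array\t\n0\tnumber\t123\n1\tstring\tacab\n' | leaf_only
```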

Processing Records As Text

For simple array documents, pipe jwalk's output to cut -f 3 to see the array's values:

$ echo '[1,2,3]' | jwalk -l | cut -f 3
1
2
3

Or wc -l to count the number of elements in the array:

$ echo '[1,2,3]' | jwalk -l | wc -l
3

For simple object documents, pipe jwalk's output to cut -f 1 to see the object's keys:

$ echo '{"first":"Sam","last":"Stephenson"}' | jwalk -l | cut -f 1
first
last

Or cut -f 1,3 to see the key-value pairs:

$ echo '{"first":"Sam","last":"Stephenson"}' | jwalk -l | cut -f 1,3
first ▷ Sam
last ▷ Stephenson

The jwalk command also accepts a filename from the command line.

Use grep to filter records of interest by path:

$ curl -sLO https://unpkg.com/turbolinks@beta/package.json
$ jwalk -l package.json | grep "scripts$(printf '\t')" | cut -f 2
clean
build
watch
start
test
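A grep pattern matches anywhere in the line, so it can also hit a record whose value happens to contain the same text. Comparing fields in awk is stricter. The sketch below uses a hypothetical helper, keys_under, and hand-written sample records rather than the real package.json.

```shell
# Hypothetical helper (not part of jwalk): print the second path key of
# every leaf record whose first path key equals the given name.
# NF >= 4 means: two path keys, then the type, then the value.
keys_under() {
  awk -F '\t' -v top="$1" '$1 == top && NF >= 4 { print $2 }'
}

# Sample records shaped like `jwalk -l` output (illustrative data only).
printf 'name\tstring\tturbolinks\nscripts\tbuild\tstring\trollup -c\nscripts\ttest\tstring\tkarma start\n' |
  keys_under scripts
```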

Examining Records With awk

When a situation calls for more control over record output than grep and cut can provide, consider writing a jwalk examiner. An examiner is an awk script pre-configured with variables for accessing record data.

Variable name          Description
keys                   an array of zero or more strings, representing the key path, indexed forward starting at 1 and backward at -1
path                   the key path as a string, with each key separated by a tab (or FS)
key                    the rightmost or last key of the key path; equivalent to keys[-1]
type                   the type of the JSON value
leaf                   false when the type is array or object; true otherwise
value (aliased as _)   the string representation of the JSON value
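To make the table concrete, here is a sketch of how similar values could be derived from a raw record in plain awk. It only illustrates the record layout and is not how jwalk sets up its examiners; record_vars is a hypothetical name, and the path is joined with "/" purely for readable output, whereas jwalk's path variable is tab-separated.

```shell
# Hypothetical helper (not part of jwalk): derive examiner-like values
# from tab-separated records on stdin.
record_vars() {
  awk -F '\t' 'NF >= 2 {
    n     = NF - 2                      # number of keys in the path
    type  = $(NF - 1)
    value = $NF
    key   = (n > 0) ? $n : ""           # last key, like keys[-1]
    leaf  = (type != "array" && type != "object")
    path  = ""
    for (i = 1; i <= n; i++) path = path (i > 1 ? "/" : "") $i
    printf "path=%s key=%s type=%s leaf=%d value=%s\n", path, key, type, leaf, value
  }'
}

# Sample records shaped like jwalk output (illustrative data only).
printf 'scripts\tobject\t\nscripts\ttest\tstring\tkarma start\n' | record_vars
```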

Pass one or more -e <script> options on the command line to specify examiners inline:

$ jwalk -l -e '$1 == "scripts" {print key}' package.json
clean
build
watch
start
test

Store more complex examiners in files and load them with the -f <scriptfile> command-line option.
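As an illustration, an examiner file might look like the one written below. The file name, its location, and the rule itself are hypothetical; the sketch relies only on the key, type, and value variables described above, and it runs jwalk only when the command and a package.json are actually available.

```shell
# Hypothetical examiner file; the name and location are illustrative.
examiner="${TMPDIR:-/tmp}/string-pairs.awk"

cat > "$examiner" <<'EOF'
# Print key=value for every record whose type is string.
type == "string" { print key "=" value }
EOF

# Load the examiner with -f, assuming jwalk is on the PATH.
if command -v jwalk >/dev/null 2>&1 && [ -f package.json ]; then
  jwalk -l -f "$examiner" package.json
fi
```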

Installing and Embedding jwalk

To install jwalk, run sh lib/jwalk.sh --install with the path to the directory where jwalk should be installed. For example:

$ sh lib/jwalk.sh --install /usr/local

Once you have a jwalk command in your path, you can run jwalk --install to embed jwalk into another project:

$ mkdir -p vendor/jwalk
$ jwalk --install vendor/jwalk
$ vendor/jwalk/bin/jwalk -l ...

© Sam Stephenson
