User guide documentation update #5
base: master
Changes from 5 commits: 20a399b, e8e3fe2, e5a6375, 78f6e58, 5f258c8, 25e86c7
@@ -380,6 +380,21 @@ enums:
    17: udp
----

Alternatively, hexadecimal notation can also be used to define an enumeration:

[source,yaml]
----
seq:
  - id: key
    type: u4
    enum: keys
enums:
  keys:
    0x77696474: width  # "widt"
    0x68656967: height # "heig"
    0x64657074: depth  # "dept"
----

There are two things that should be done to declare an enum:

1. We add an `enums` key on the type level (i.e. on the same level as
@@ -472,7 +487,25 @@ structure:

[source,yaml]
----
seq:
  - id: header
    type: file_header
  - id: metadata
    type: metadata_section
types:
  file_header:
    seq:
      - id: version
        type: u2
  metadata_section:
    seq:
      - id: author
        type: strz
        encoding: UTF-8
      - id: publisher
        type: strz
        encoding: UTF-8
        if: _parent.header.version >= 2
----

==== `_root`
@@ -799,6 +832,39 @@ other value which was not listed explicitly.
    _: rec_type_unknown
----

If an enumeration has already been defined, you can use references to
items in the enumeration instead of specifying integers a second time:

Review comment: Actually, if you defined …

Reply: Hmm, good point, I'll update the text accordingly.

[source,yaml]
----
seq:
  - id: key
    type: u4
    enum: keys
  - id: data
    type:
      switch-on: key
      cases:
        keys::width: data_field_width
        keys::height: data_field_height
        keys::depth: data_field_depth
types:
  data_field_width:
    seq:
      # ...
  data_field_height:
    seq:
      # ...
  data_field_depth:
    seq:
      # ...
enums:
  keys:
    0x77696474: width  # "widt"
    0x68656967: height # "heig"
    0x64657074: depth  # "dept"
----

Review comment: Pedantic person in me cries for that misaligned `types`. I'd collapse the placeholder types to

  data_field_width: # ...
  data_field_height: # ...
  data_field_depth: # ...

for brevity.

Reply: Thanks, agreed.

=== Instances: data beyond the sequence

So far we've done all the data specifications in `seq` - thus they'll
@@ -1024,7 +1090,117 @@ bytes sparsely.

=== Streams and substreams

==== Introduction and simple example

A stream is a flow of data from an input file (or an in-memory byte
array) into a parser which is generated by a KS script. The parser
normally requests data from the stream sequentially, one or more bytes
at a time, but it can also seek to an arbitrary position and re-read
the same data as many times as needed; that is exactly how positional
parse instances work. A stream knows the total amount of data available
to be requested by the parser and the current read position within that
data.

Review comment: This explanation is pretty abstract and somewhat misleading. "Stream" can be re-read as many times as needed, and it can be seeked: that's exactly how positional parse instances work, they use …

Reply: I'll think of another way to explain streams then, especially with reference to how …
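
For example, a positional parse instance can seek back and re-read bytes that `seq` parsing has already consumed. A minimal sketch (the field names are hypothetical):

[source,yaml]
----
seq:
  - id: magic
    size: 4
instances:
  magic_again:
    pos: 0    # seek back to the start of the current stream
    size: 4   # re-read the same 4 bytes that `magic` consumed
----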

When an input is first opened for parsing by a KS-generated parser, a
root stream is created. This root stream can be accessed via
`_root._io` at any time and in any place: `_root` returns the top-level
object defined in a script, and `_io` returns the stream associated
with an object. The total amount of data available to be requested from
the root stream is the size of the input being parsed, whether that
input is a file or an in-memory byte array (the size is queried on
demand rather than recorded once at open time). Initially, the read
position of the root stream is 0: no data has been requested by the
parser yet.

Review comment: Streams can be used on in-memory byte arrays too, not necessarily files (which have file sizes). And, actually, stream does not "know" full file size, but it can query it on demand. File size can change if file is modified when KS parsing is in progress, so it's actually ok to have …

Reply: That's a great point, probably one worth adding to the pitfalls section (or troubleshooting or similar) for the few people who may encounter the issue and not understand what is going on.
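
For instance, a parse instance can be pointed at the root stream explicitly through its `io` key; the sketch below (with a hypothetical instance name) reads 4 bytes at an absolute position in the whole input, regardless of which substream the enclosing type is parsed from:

[source,yaml]
----
instances:
  first_four_bytes:
    io: _root._io   # use the root stream, not the current one
    pos: 0          # absolute position within the whole input
    size: 4
----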

Below is an example script which is used to generate a parser, which in
turn parses an input file. Assume that this example input file simply
consists of a 32-bit unsigned integer with the value 1000, followed by
1000 bytes of payload data; the input file thus has a total size of
1004 bytes.

[source,yaml]
----
meta:
  id: example_file
seq:
  - id: header
    type: file_header
  - id: body
    type: file_body
    size: header.body_size
types:
  file_header:
    seq:
      - id: body_size
        type: u4
  file_body:
    seq:
      - id: payload
        size-eos: true
----

The parser generated by the script will first request 4 bytes of data
from the root stream to copy into the object `header.body_size`. After
the stream has returned those 4 bytes to the parser, the stream has
provided 4 out of the 1004 bytes available, so the parser can now
request at most 1000 bytes of additional data from the stream.

The definition of the `body` object in the example script specifies the
size of the `body` object to be the already-parsed value of
`header.body_size`. Specifying a size has an interesting effect on the
KS-generated parser: a new substream is created specifically to parse
the `body` object.

Similar to how the root stream operates, the new substream initially
knows the maximum amount of data available to be requested and the
amount of data already returned. In this example, the substream upon
creation has a maximum of 1000 bytes of data which can be requested by
the parser, and a read position of 0 bytes.

The parser will then repeatedly request data from the new substream to
copy into the object `file_body.payload`. As the substream receives
requests for more data, it passes those requests on to the root stream.
Unlike the root stream, a substream can only request data from the root
stream or from another substream; substreams never read from an input
file directly.

Because `size-eos: true` is specified for the `file_body.payload`
object, the parser will keep requesting data from the substream until
the amount of data provided by the substream reaches 1000 bytes (the
maximum the substream is able to provide). Once all 1000 bytes have
been copied from the input file, via the root stream and then via the
substream, into the `file_body.payload` object, the internal state of
the two streams would be:

* root stream: maximum bytes of data available remains 1004, read
position is 1004 bytes
* substream: maximum bytes of data available remains 1000, read
position is 1000 bytes

Alternatively, if `header.body_size` happens to be a value larger than
the amount of data remaining in the input, the root stream would be
unable to fulfill the request, and the KS-generated parser would
abruptly raise an exception for trying to read non-existent data beyond
the end of the input.

`_io` can be used to access the stream associated with an object. The
object can be referred to by identifier, or alternatively via `_root`
and `_parent`. Once a stream has been obtained through `_io`, several
properties expose the internal state of the stream (see the sketch
after this list):

* `size` returns the maximum amount of data which is available to be
requested from the stream
* `pos` returns the amount of data which has already been requested
from the stream
* `eof` returns a boolean value: `false` when `pos != size` and `true`
when `pos == size` (i.e. has the maximum amount of data available via
the stream already been requested?)
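
A minimal sketch of these properties in use (the field and instance names are hypothetical):

[source,yaml]
----
seq:
  - id: trailer
    size-eos: true
    if: not _io.eof            # parse a trailer only if data remains
instances:
  bytes_remaining:
    value: _io.size - _io.pos  # data not yet requested from this stream
  is_exhausted:
    value: _io.eof             # equivalent to _io.pos == _io.size
----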

Substreams can be nested many layers deep by specifying a `size` for
each object in the nested tree, as the sketch below illustrates.
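
A minimal sketch of two nested substreams (the type and field names are hypothetical):

[source,yaml]
----
seq:
  - id: outer
    type: outer_type
    size: 100            # creates a 100-byte substream
types:
  outer_type:
    seq:
      - id: inner
        type: inner_type
        size: 50         # creates a 50-byte substream inside the outer one
  inner_type:
    seq:
      - id: payload
        size-eos: true   # reads to the end of the innermost substream
----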

Related expressions which are useful when working with streams include
(see the sketch below):

* `repeat: eos`
* `size-eos: true`
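
For example, `repeat: eos` parses one element after another until the current stream (or substream) is exhausted; a minimal sketch with a hypothetical `record` type:

[source,yaml]
----
seq:
  - id: records
    type: record
    repeat: eos    # parse records until the end of the current stream
types:
  record:
    seq:
      - id: len
        type: u1
      - id: body
        size: len
----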

=== Processing: dealing with compressed, obfuscated and encrypted data
@@ -1903,7 +2079,38 @@ beginner Kaitai Struct users.

=== Specifying size creates a substream

In the following example script, an erroneous attempt is made to parse
an input file with a size of 2000 bytes:

[source,yaml]
----
seq:
  - id: body
    type: some_body_type
    size: 1000
types:
  some_body_type:
    seq:
      - id: payload
        size: 999
      - id: overflow
        size: 2
----

The parser can successfully copy the required 999 bytes into
`body.payload`, as the `body` substream has 1000 bytes available to be
requested and the root stream has 2000 bytes available.

An exception occurs upon attempting to copy data from the `body`
substream into the `overflow` object. After data has been copied from
the `body` substream into the `payload` object, the `body` substream
only has 1 byte of data still available for the parser to request.
When 2 bytes of data are requested, the `body` substream runs out of
available data and an exception is raised. The fact that the root
stream still has 1001 bytes available from the input file does not
matter, as the `body` substream never has the opportunity to request
more than the first 1000 bytes of the input file.

Review comment: This is actually not a pitfall, but a legitimate behavior, and well explained in the previous section. The "pitfall" I was thinking about in this section is the following: when a new substream is created, all parse instances with positions act within that substream by default. So, this one works as expected:

  seq:
    - id: skipped
      size: 1000
    - id: indexing
      type: file_index_entry
      # but adding "size: 24" here will ruin the "file_body" instance,
      # although it looks legitimate at first glance
  types:
    file_index_entry:
      seq:
        - id: file_name
          type: str
          size: 16
        - id: file_pos
          type: u4
        - id: file_len
          type: u4
      instances:
        file_body:
          pos: file_pos
          size: file_len

To overcome that, one needs to use something like …

Reply: Excellent. I didn't know about …
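
One way to overcome it is the `io` key of a parse instance, which redirects the instance to another stream; a sketch of that fix (the truncated suggestion above may have named a different mechanism):

[source,yaml]
----
instances:
  file_body:
    io: _root._io   # positions are resolved in the root stream,
    pos: file_pos   # not in the enclosing substream
    size: file_len
----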

=== Applying `process` without a size

Review comment: Totally OK, but I'd also note that this is a service provided by YAML, not something specific to KS.

Reply: I was thinking that a new section of the document could be created for general syntax and a very brief overview of YAML and what it provides. This example I provided may be better suited there.

Review comment: Some Construct features are Python features, but I would advertise them just the same. Purpose of documentation is to show capabilities, not attribution. =) Just saying.