Rename within docs/
kddnewton committed Sep 27, 2023
1 parent 3bfefc4 commit ea9a3fa
Showing 12 changed files with 207 additions and 207 deletions.
42 changes: 21 additions & 21 deletions docs/build_system.md
@@ -1,17 +1,17 @@
# Build System

There are many ways to build YARP, which means the build system is a bit more complicated than usual.
There are many ways to build prism, which means the build system is a bit more complicated than usual.

## Requirements

* It must work to build YARP for all 6 uses-cases below.
* It must be possible to build YARP without needing ruby/rake/etc.
Because once YARP is the single parser in TruffleRuby, JRuby or CRuby there won't be another Ruby parser around to parse such Ruby code.
* It must work to build prism for all 6 uses-cases below.
* It must be possible to build prism without needing ruby/rake/etc.
Because once prism is the single parser in TruffleRuby, JRuby or CRuby there won't be another Ruby parser around to parse such Ruby code.
Most/every Ruby implementations want to avoid depending on another Ruby during the build process as that is very brittle.
* It is desirable to compile YARP with the same or very similar compiler flags for all use-cases (e.g. optimization level, warning flags, etc).
Otherwise, there is the risk YARP does not work correctly with those different compiler flags.
* It is desirable to compile prism with the same or very similar compiler flags for all use-cases (e.g. optimization level, warning flags, etc).
Otherwise, there is the risk prism does not work correctly with those different compiler flags.

The main solution for the second point seems a Makefile, otherwise many of the usages would have to duplicate the logic to build YARP.
The main solution for the second point seems a Makefile, otherwise many of the usages would have to duplicate the logic to build prism.

## General Design

@@ -24,15 +24,15 @@ This way there is minimal duplication, and each layer builds on the previous one

The static library exports no symbols, to avoid any conflict.
The shared library exports some symbols, and this is fine since there should only be one librubyparser shared library
loaded per process (i.e., at most one version of the yarp *gem* loaded in a process, only the gem uses the shared library).
loaded per process (i.e., at most one version of the prism *gem* loaded in a process, only the gem uses the shared library).

## The various ways to build YARP
## The various ways to build prism

### Building from ruby/yarp repository with `bundle exec rake`
### Building from ruby/prism repository with `bundle exec rake`

`rake` calls `make` and then uses `Rake::ExtensionTask` to compile the C extension (see above).

### Building the yarp gem by `gem install/bundle install`
### Building the prism gem by `gem install/bundle install`

The gem contains the pre-generated templates.
When installing the gem, `extconf.rb` is used and that:
@@ -44,31 +44,31 @@ there is Ruby code using FFI which uses `librubyparser.{so,dylib,dll}`
to implement the same methods as the C extension, but using serialization instead of many native calls/accesses
(JRuby does not support C extensions, serialization is faster on TruffleRuby than the C extension).

### Building the yarp gem from git, e.g. `gem 'yarp', github: 'ruby/yarp'`
### Building the prism gem from git, e.g. `gem "prism", github: "ruby/prism"`

The same as above, except the `extconf.rb` additionally runs first:
* `templates/template.rb` to generate the templates

Because of course those files are not part of the git repository.

### Building YARP as part of CRuby
### Building prism as part of CRuby

[This script](https://github.com/ruby/ruby/blob/32e828bb4a6c65a392b2300f3bdf93008c7b6f25/tool/sync_default_gems.rb#L399-L426) imports YARP sources in CRuby.
[This script](https://github.com/ruby/ruby/blob/32e828bb4a6c65a392b2300f3bdf93008c7b6f25/tool/sync_default_gems.rb#L399-L426) imports prism sources in CRuby.

The script generates the templates when importing.

YARP's `Makefile` is not used at all in CRuby. Instead, CRuby's `Makefile` is used.
prism's `Makefile` is not used at all in CRuby. Instead, CRuby's `Makefile` is used.

### Building YARP as part of TruffleRuby
### Building prism as part of TruffleRuby

[This script](https://github.com/oracle/truffleruby/blob/master/tool/import-yarp.sh) imports YARP sources in TruffleRuby.
[This script](https://github.com/oracle/truffleruby/blob/master/tool/import-prism.sh) imports prism sources in TruffleRuby.
The script generates the templates when importing.

Then when `mx build` builds TruffleRuby and the `yarp` mx project inside, it runs `make`.
Then when `mx build` builds TruffleRuby and the `prism` mx project inside, it runs `make`.

Then the `yarp bindings` mx project is built, which contains the [bindings](https://github.com/oracle/truffleruby/blob/master/src/main/c/yarp_bindings/src/yarp_bindings.c)
and links to `librubyparser.a` (to avoid exporting symbols, so no conflict when installing the yarp gem).
Then the `prism bindings` mx project is built, which contains the [bindings](https://github.com/oracle/truffleruby/blob/master/src/main/c/prism_bindings/src/prism_bindings.c)
and links to `librubyparser.a` (to avoid exporting symbols, so no conflict when installing the prism gem).

### Building YARP as part of JRuby
### Building prism as part of JRuby

TODO, probably similar to TruffleRuby.
8 changes: 4 additions & 4 deletions docs/building.md
@@ -1,13 +1,13 @@
# Building

The following describes how to build YARP from source.
The following describes how to build prism from source.
This comes directly from the [Makefile](../Makefile).

## Common

All of the source files match `src/**/*.c` and all of the headers match `include/**/*.h`.

The following flags should be used to compile YARP:
The following flags should be used to compile prism:

* `-std=c99` - Use the C99 standard
* `-Wall -Wconversion -Wextra -Wpedantic -Wundef` - Enable the warnings we care about
@@ -16,7 +16,7 @@ The following flags should be used to compile YARP:

## Shared

If you want to build YARP as a shared library and link against it, you should compile with:
If you want to build prism as a shared library and link against it, you should compile with:

* `-fPIC -shared` - Compile as a shared library
* `-DYP_EXPORT_SYMBOLS` - Export the symbols (by default nothing is exported)
* `-DPM_EXPORT_SYMBOLS` - Export the symbols (by default nothing is exported)
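
As a rough illustration of how the renamed flag fits into a shared-library build, the sketch below assembles a compiler command line from the flags visible in this diff. It is not taken from the project's Makefile: the helper name `shared_library_command`, the `-Iinclude` include path, the default compiler `cc`, and the sample source path are assumptions, and the hunk above elides part of the flag list, so the common flags shown here may be incomplete.

```python
# Flags listed in the visible part of docs/building.md. The diff hunk skips
# a few lines of that list, so this set may be incomplete.
COMMON_FLAGS = ["-std=c99", "-Wall", "-Wconversion", "-Wextra", "-Wpedantic", "-Wundef"]

# Shared-library flags after the rename (the macro was -DYP_EXPORT_SYMBOLS before).
SHARED_FLAGS = ["-fPIC", "-shared", "-DPM_EXPORT_SYMBOLS"]


def shared_library_command(sources, output, cc="cc", include_dir="include"):
    """Return an argv list for building the shared library (hypothetical helper).

    The include directory and compiler name are assumptions for this sketch.
    """
    return [cc, *COMMON_FLAGS, *SHARED_FLAGS, f"-I{include_dir}", "-o", output, *sources]


# Placeholder source list; a real build would pass every file matching src/**/*.c.
command = shared_library_command(["src/prism.c"], "librubyparser.so")
print(" ".join(command))
```

The function only builds the argument list, so it can be inspected or passed to a process runner without side effects.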
50 changes: 25 additions & 25 deletions docs/configuration.md
@@ -1,19 +1,19 @@
# Configuration

A lot of code in YARP's repository is templated from a single configuration file, [config.yml](../config.yml). This file is used to generate the following files:

* `ext/yarp/api_node.c` - for defining how to build Ruby objects for the nodes out of C structs
* `include/yarp/ast.h` - for defining the C structs that represent the nodes
* `java/org/yarp/AbstractNodeVisitor.java` - for defining the visitor interface for the nodes in Java
* `java/org/yarp/Loader.java` - for defining how to deserialize the nodes in Java
* `java/org/yarp/Nodes.java` - for defining the nodes in Java
* `lib/yarp/compiler.rb` - for defining the compiler for the nodes in Ruby
* `lib/yarp/dispatcher.rb` - for defining the dispatch visitors for the nodes in Ruby
* `lib/yarp/dsl.rb` - for defining the DSL for the nodes in Ruby
* `lib/yarp/mutation_compiler.rb` - for defining the mutation compiler for the nodes in Ruby
* `lib/yarp/node.rb` - for defining the nodes in Ruby
* `lib/yarp/serialize.rb` - for defining how to deserialize the nodes in Ruby
* `lib/yarp/visitor.rb` - for defining the visitor interface for the nodes in Ruby
A lot of code in prism's repository is templated from a single configuration file, [config.yml](../config.yml). This file is used to generate the following files:

* `ext/prism/api_node.c` - for defining how to build Ruby objects for the nodes out of C structs
* `include/prism/ast.h` - for defining the C structs that represent the nodes
* `java/org/prism/AbstractNodeVisitor.java` - for defining the visitor interface for the nodes in Java
* `java/org/prism/Loader.java` - for defining how to deserialize the nodes in Java
* `java/org/prism/Nodes.java` - for defining the nodes in Java
* `lib/prism/compiler.rb` - for defining the compiler for the nodes in Ruby
* `lib/prism/dispatcher.rb` - for defining the dispatch visitors for the nodes in Ruby
* `lib/prism/dsl.rb` - for defining the DSL for the nodes in Ruby
* `lib/prism/mutation_compiler.rb` - for defining the mutation compiler for the nodes in Ruby
* `lib/prism/node.rb` - for defining the nodes in Ruby
* `lib/prism/serialize.rb` - for defining how to deserialize the nodes in Ruby
* `lib/prism/visitor.rb` - for defining the visitor interface for the nodes in Ruby
* `src/node.c` - for defining how to free the nodes in C and calculate the size in memory in C
* `src/prettyprint.c` - for defining how to prettyprint the nodes in C
* `src/serialize.c` - for defining how to serialize the nodes in C
@@ -29,15 +29,15 @@ This is a list of tokens to be used by the lexer. It is shared here so that it c

Each token is expected to have a `name` key and a `comment` key (both as strings). Optionally they can have a `value` key (an integer) which is used to represent the value in the enum.

In C these tokens will be templated out with the prefix `YP_TOKEN_`. For example, if you have a `name` key with the value `PERCENT`, you can access this in C through `YP_TOKEN_PERCENT`.
In C these tokens will be templated out with the prefix `PM_TOKEN_`. For example, if you have a `name` key with the value `PERCENT`, you can access this in C through `PM_TOKEN_PERCENT`.
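
The token naming rule can be sketched in a few lines of Python. This is only an illustration of the `PM_TOKEN_` prefix and the optional `value` key, not the project's actual template (per the docs, the real templating is done by `templates/template.rb`). The function name `render_token_enum`, the enum name, and the sample entries are assumptions made for the example.

```python
def render_token_enum(tokens):
    """Render a C enum from token entries shaped like those in config.yml."""
    lines = ["typedef enum pm_token_type {"]  # enum name is assumed for this sketch
    for token in tokens:
        entry = f"    PM_TOKEN_{token['name']}"
        # The optional `value` key pins the enum member to a specific integer.
        if "value" in token:
            entry += f" = {token['value']}"
        lines.append(f"{entry}, // {token['comment']}")
    lines.append("} pm_token_type_t;")
    return "\n".join(lines)


# Hypothetical entries; the real list lives in config.yml.
sample_tokens = [
    {"name": "EOF", "value": 1, "comment": "final token in the file"},
    {"name": "PERCENT", "comment": "%"},
]
print(render_token_enum(sample_tokens))
```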

## `flags`

Sometimes we need to communicate more information in the tree than can be represented by the types of the nodes themselves. For example, we need to represent the flags passed to a regular expression or the type of call that a call node is performing. In these circumstances, it's helpful to reference a bitset of flags. This field is a list of flags that can be used in the nodes.

Each flag is expected to have a `name` key (a string) and a `values` key (an array). Each value in the `values` key should be an object that contains both a `name` key (a string) that represents the name of the flag and a `comment` key (a string) that represents the comment for the flag.

In C these flags will get templated out with a `YP_` prefix, then a snake-case version of the flag name, then the flag itself. For example, if you have a flag with the name `RegularExpressionFlags` and a value with the name `IGNORE_CASE`, you can access this in C through `YP_REGULAR_EXPRESSION_FLAGS_IGNORE_CASE`.
In C these flags will get templated out with a `PM_` prefix, then a snake-case version of the flag name, then the flag itself. For example, if you have a flag with the name `RegularExpressionFlags` and a value with the name `IGNORE_CASE`, you can access this in C through `PM_REGULAR_EXPRESSION_FLAGS_IGNORE_CASE`.
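
The flag naming rule (prefix, then the upper-cased snake-case flag name, then the value name) can be checked with a small Python sketch. The helper names `upper_snake` and `flag_constant` are hypothetical, and the regular expression is one simple way to split CamelCase names; the project's own template may do this differently.

```python
import re


def upper_snake(camel_name):
    """Convert a CamelCase name to UPPER_SNAKE_CASE (hypothetical helper)."""
    # Insert an underscore wherever a lowercase letter or digit meets an uppercase letter.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", camel_name).upper()


def flag_constant(flag_name, value_name, prefix="PM"):
    """Build the C constant name for one value of a flag set."""
    return f"{prefix}_{upper_snake(flag_name)}_{value_name}"


print(flag_constant("RegularExpressionFlags", "IGNORE_CASE"))
```

Passing `prefix="YP"` reproduces the name used before the rename.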

## `nodes`

@@ -47,14 +47,14 @@ Optionally, every node can define a `child_nodes` key that is an array. This arr

The available values for `type` are:

* `node` - A child node that is a node itself. This is a `yp_node_t *` in C.
* `node?` - A child node that is optionally present. This is also a `yp_node_t *` in C, but can be `NULL`.
* `node[]` - A child node that is an array of nodes. This is a `yp_node_list_t` in C.
* `string` - A child node that is a string. For example, this is used as the name of the method in a call node, since it cannot directly reference the source string (as in `@-` or `foo=`). This is a `yp_string_t` in C.
* `constant` - A variable-length integer that represents an index in the constant pool. This is a `yp_constant_id_t` in C.
* `constant[]` - A child node that is an array of constants. This is a `yp_constant_id_list_t` in C.
* `location` - A child node that is a location. This is a `yp_location_t` in C.
* `location?` - A child node that is a location that is optionally present. This is a `yp_location_t` in C, but if the value is not present then the `start` and `end` fields will be `NULL`.
* `node` - A child node that is a node itself. This is a `pm_node_t *` in C.
* `node?` - A child node that is optionally present. This is also a `pm_node_t *` in C, but can be `NULL`.
* `node[]` - A child node that is an array of nodes. This is a `pm_node_list_t` in C.
* `string` - A child node that is a string. For example, this is used as the name of the method in a call node, since it cannot directly reference the source string (as in `@-` or `foo=`). This is a `pm_string_t` in C.
* `constant` - A variable-length integer that represents an index in the constant pool. This is a `pm_constant_id_t` in C.
* `constant[]` - A child node that is an array of constants. This is a `pm_constant_id_list_t` in C.
* `location` - A child node that is a location. This is a `pm_location_t` in C.
* `location?` - A child node that is a location that is optionally present. This is a `pm_location_t` in C, but if the value is not present then the `start` and `end` fields will be `NULL`.
* `uint32` - A child node that is a 32-bit unsigned integer. This is a `uint32_t` in C.

If the type is `node` or `node?` then the value also accepts an optional `kind` key (a string). This key is expected to match to the name of another node type within `config.yml`. This changes a couple of places where code is templated out to use the more specific struct name instead of the generic `yp_node_t`. For example, with `kind: StatementsNode` the `yp_node_t *` in C becomes a `yp_statements_node_t *`.
If the type is `node` or `node?` then the value also accepts an optional `kind` key (a string). This key is expected to match to the name of another node type within `config.yml`. This changes a couple of places where code is templated out to use the more specific struct name instead of the generic `pm_node_t`. For example, with `kind: StatementsNode` the `pm_node_t *` in C becomes a `pm_statements_node_t *`.
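
The mapping from a `type` (and optional `kind`) to a C type, as listed in the renamed lines, can be summarized in a short Python sketch. The table mirrors the list in this diff; the helper names `lower_snake` and `c_type_for` and the dictionary-shaped field entries are assumptions for illustration, not the project's template code.

```python
import re

# C types for each `type` value, as listed in docs/configuration.md after the rename.
C_TYPES = {
    "node": "pm_node_t *",
    "node?": "pm_node_t *",  # same type, but the pointer may be NULL
    "node[]": "pm_node_list_t",
    "string": "pm_string_t",
    "constant": "pm_constant_id_t",
    "constant[]": "pm_constant_id_list_t",
    "location": "pm_location_t",
    "location?": "pm_location_t",  # start/end are NULL when absent
    "uint32": "uint32_t",
}


def lower_snake(camel_name):
    """Convert a CamelCase node name to snake_case (hypothetical helper)."""
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", camel_name).lower()


def c_type_for(field):
    """Return the C type for a child-node entry, honoring an optional `kind`."""
    kind = field.get("kind")
    if kind and field["type"] in ("node", "node?"):
        # A `kind` narrows the generic node pointer to the specific struct.
        return f"pm_{lower_snake(kind)}_t *"
    return C_TYPES[field["type"]]


print(c_type_for({"type": "node", "kind": "StatementsNode"}))
print(c_type_for({"type": "node[]"}))
```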
4 changes: 2 additions & 2 deletions docs/design.md
@@ -12,7 +12,7 @@ The design of the parser is based around these main goals.

The first piece to understand about the parser is the design of its syntax tree. This is documented in `config.yml`. Every token and node is defined in that file, along with comments about where they are found in what kinds of syntax. This file is used to template out a lot of different files, all found in the `templates` directory. The `templates/template.rb` script performs the templating and outputs all files matching the directory structure found in the templates directory.

The templated files contain all of the code required to allocate and initialize nodes, pretty print nodes, and serialize nodes. This means for the most part, you will only need to then hook up the parser to call the templated functions to create the nodes in the correct position. That means editing the parser itself, which is housed in `yarp.c`.
The templated files contain all of the code required to allocate and initialize nodes, pretty print nodes, and serialize nodes. This means for the most part, you will only need to then hook up the parser to call the templated functions to create the nodes in the correct position. That means editing the parser itself, which is housed in `prism.c`.

## Pratt parsing

@@ -24,7 +24,7 @@ In order to provide the best possible error tolerance, the parser is hand-writte
* https://matklad.github.io/2020/04/13/simple-but-powerful-pratt-parsing.html
* https://chidiwilliams.com/post/on-recursive-descent-and-pratt-parsing/

You can find most of the functions that correspond to constructs in the Pratt parsing algorithm in `yarp.c`. As a couple of examples:
You can find most of the functions that correspond to constructs in the Pratt parsing algorithm in `prism.c`. As a couple of examples:

* `parse` corresponds to the `parse_expression` function
* `nud` (null denotation) corresponds to the `parse_expression_prefix` function