Home
A parser's job is to extract one or more pieces of related information from a case, for example the city in which the case was decided.
This page explains how you can create a new parser, which gets compiled into a package that is then used in our pipeline to feed our website and API.
It is easy to build a new parser.
You'll need to set up your local development environment with Node.js and clone this repository.
A parser is a pure function which takes defined inputs and returns the data extracted.
Parsers must be written in TypeScript and must have an associated unit test. See the repository for examples.
Parsers can take any of the following fields as arguments:
caseText: string -- the full plain text of a case (with line breaks)
caseNames: string[] -- an array of known case names. Will usually be only one.
caseDate: string -- the date of a case
caseCitations: string[] -- an array of known case citations (e.g., [2012] NZHC 1234). Will usually be only one.
caseFileKey: string -- unique id for a case
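As a rough illustration of a parser that combines more than one of these fields, the sketch below tries to read the decision year from caseCitations and falls back to caseDate. The name parseDecisionYear, the return type, and the assumption that caseDate contains a four-digit year are all hypothetical and not taken from the repository.

```typescript
// Hypothetical parser: the name and the fallback logic are illustrative only.
const parseDecisionYear = (
  caseCitations: string[],
  caseDate: string
): number | null => {
  // Citations such as "[2012] NZHC 1234" carry the year in square brackets.
  for (const citation of caseCitations) {
    const match = citation.match(/\[(\d{4})\]/);
    if (match) {
      return Number(match[1]);
    }
  }
  // Fall back to the first four-digit run in caseDate.
  // Assumption: the date string contains the year as four consecutive digits.
  const dateMatch = caseDate.match(/\d{4}/);
  return dateMatch ? Number(dateMatch[0]) : null;
};
```

Like the other parsers described here, this stays a pure function: it reads only its arguments and returns null when no year can be found, rather than throwing.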
The following parser takes caseText as an input and returns the first five characters of the case text.
export default (caseText: string): string => {
  // Take the first five characters of the case text.
  const firstFiveCharacters = caseText.substring(0, 5);
  return firstFiveCharacters;
};
Sample files that contain example caseText data are available in the src/testData/parseRepresentation/caseTexts directory.
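Since every parser needs an associated unit test, here is a minimal sketch of what a test for the first-five-characters parser might check. The repository's actual test runner is not shown on this page, so the sketch uses a small hand-rolled expectEqual helper (hypothetical), and the sample text is invented rather than taken from the test data directory.

```typescript
// The example parser, repeated under a name so the sketch is self-contained.
const firstFiveCharacters = (caseText: string): string =>
  caseText.substring(0, 5);

// Hypothetical helper standing in for a test runner's assertion.
const expectEqual = (actual: string, expected: string): void => {
  if (actual !== expected) {
    throw new Error(`Expected "${expected}" but received "${actual}"`);
  }
};

// Invented sample text; real fixtures would be loaded from the test data files.
const sampleCaseText = "IN THE HIGH COURT OF NEW ZEALAND\nAUCKLAND REGISTRY";

// Typical input: only the first five characters are returned.
expectEqual(firstFiveCharacters(sampleCaseText), "IN TH");
// Edge cases: text shorter than five characters, and empty text.
expectEqual(firstFiveCharacters("abc"), "abc");
expectEqual(firstFiveCharacters(""), "");
```

Because parsers are pure functions, a test only needs to supply inputs and compare the returned value; no setup or mocking should be required.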
Sometimes you will need to pass in more than already-known case information to a parser.
For example, our judge parser needs to receive a static list of possible judge titles ('Chief Justice', 'Associate Judge', etc.).
To do so, include that data as a JSON file in /src/dataDefinitions.
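The sketch below shows one way a parser might receive such static data as an extra argument. It is not the project's judge parser: the function name findJudgeTitles is hypothetical, and the title list is inlined here only to keep the example self-contained, whereas in practice it would come from a JSON file in /src/dataDefinitions.

```typescript
// Hypothetical stand-in for the contents of a JSON data definition file.
const judgeTitles: string[] = ["Chief Justice", "Associate Judge"];

// Hypothetical parser: returns every known title that appears in the case text.
const findJudgeTitles = (caseText: string, titles: string[]): string[] => {
  // Compare in lower case so "ASSOCIATE JUDGE" and "Associate Judge" both match.
  const lowerCaseText = caseText.toLowerCase();
  return titles.filter(
    (title) => lowerCaseText.indexOf(title.toLowerCase()) !== -1
  );
};

// Usage with an invented line of case text.
const foundTitles = findJudgeTitles("Before: Associate Judge Smith", judgeTitles);
```

Passing the list in as an argument, rather than reading a file inside the parser, keeps the function pure and makes it straightforward to test with a small custom list.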
NB: For a parser to be integrated into the pipeline, we may need to modify our database schema. You should state whether your parser will require one or more additional columns (or tables, relationships, foreign keys, etc.). We use a PostgreSQL database.