This parser converts UK judgments from .docx format to XML. It is written in C# and requires .NET 5.0.
To invoke the parser programmatically, clients should use the classes in the UK.Gov.NationalArchives.Judgments.Api namespace.
- Create a Request object, with the following properties:
    - Content (required), a byte array, the content of the judgment, in .docx format
    - Filename (optional), a string, the name of the .docx file containing the judgment
    - Attachments (optional), an array of Attachment objects, having the following properties:
        - Content (required), a byte array, the content of the attachment, in .docx format
        - Type (required), an enum, with the following possible values: Order
        - Filename (optional), a string, the name of the .docx file containing the attachment
    - Meta (optional), a Meta object, with the following properties:
        - Court (optional), a string, the identifier of the court
        - Cite (optional), a string, the neutral citation of the case
        - Date (optional), a date, the date of the judgment
        - Name (optional), a string, the case name
        - Uri (optional), a string, a URI for the judgment
        - Attachments (optional), an array of ExternalAttachment objects, having the following properties:
            - Name (required), a string, the name of the attachment for display
            - Link (optional), a string, a URL for the attachment
    - Hint (optional), an enum, with the following possible values: UKSC, UKCA, UKHC, UKUT, Judgment, PressSummary. If present, the parser will attempt to parse the document only as the specified type.
- Pass it to the Parse method in the Parser class.
- Receive a Response object, which will have the following properties:
A REST API, mimicking the above, is available at https://parse.judgments.tna.jurisdatum.com. Its specification can be found at /api.yaml.
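Since the request and response shapes of the REST API are defined in its specification rather than here, a reasonable first step is to download that specification. The following is a minimal sketch, assuming that /api.yaml resolves relative to the base URL above and that the service is reachable; the helper names spec_url and fetch_spec are hypothetical and not part of the parser.

```shell
#!/bin/sh
# Base URL as given above; the /api.yaml path is assumed to be relative to it.
BASE_URL="https://parse.judgments.tna.jurisdatum.com"

# Hypothetical helper: print the full URL of the API specification.
spec_url() {
    printf '%s/api.yaml\n' "$BASE_URL"
}

# Hypothetical helper: download the specification to a local file.
# The first argument is the destination path (default: api.yaml).
fetch_spec() {
    # -f fails on HTTP errors, -sS is quiet but still reports errors, -L follows redirects.
    curl -fsSL -o "${1:-api.yaml}" "$(spec_url)"
}
```

Calling fetch_spec with no arguments would save the specification as api.yaml in the current directory, where the available endpoints and payload formats can be read.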
The parser can also be invoked from the command line, as follows:
dotnet run --input path/to/file.docx
So, for example, the following command will parse the included test document and direct the output to the console:
dotnet run --input test/judgments/test1.docx
To direct the XML output to a file, use the --output option, like so:
dotnet run --input test/judgments/test1.docx --output something.xml
To save the XML and all of the embedded images to a .zip file, use the --output-zip option, like so:
dotnet run --input test/judgments/test1.docx --output-zip something.zip
If the --log option is used, the parser will log its progress to the specified file. For example:
dotnet run --input test/judgments/test1.docx --output something.xml --log log.txt
And if the --test option is used, the parser will perform a few tests and display the results either in the console or, if logging is enabled, in the log file.
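The command-line options above can be combined to convert a whole directory of judgments. The following is a minimal sketch, not part of the parser itself: the parse_all function and the PARSER_CMD variable are hypothetical names introduced for illustration, and the script assumes it is run from the project directory so that dotnet run can find the project. It uses only the --input, --output and --log options described above.

```shell
#!/bin/sh
# PARSER_CMD is an assumption of this sketch; override it (e.g. with "echo") for a dry run.
PARSER_CMD="${PARSER_CMD:-dotnet run}"

# Hypothetical helper: parse every .docx file in one directory,
# writing an .xml file and a .log file per judgment to another directory.
parse_all() {
    in_dir="$1"
    out_dir="$2"
    mkdir -p "$out_dir"
    for docx in "$in_dir"/*.docx; do
        # Skip the unexpanded pattern when the directory has no .docx files.
        [ -e "$docx" ] || continue
        name=$(basename "$docx" .docx)
        # PARSER_CMD is left unquoted on purpose so that "dotnet run" splits into two words.
        $PARSER_CMD --input "$docx" \
                    --output "$out_dir/$name.xml" \
                    --log "$out_dir/$name.log" \
            || echo "failed: $docx" >&2
    done
}
```

For example, parse_all test/judgments out would process each .docx file under test/judgments in turn. A failure on one file is reported on standard error and does not stop the loop.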