Merge pull request #1365 from opensafely/update-codelists-docs
Update codelists documentation
rebkwok authored Nov 3, 2023
2 parents f4776c2 + 950c18a commit 0d8d199
Showing 8 changed files with 98 additions and 56 deletions.
4 changes: 2 additions & 2 deletions docs/_redirects
@@ -27,8 +27,8 @@
/en/stable/onboarding_analysts/ / 301
/en/stable/open_data_manifesto /open-data-manifesto/ 301
/en/stable/open_data_manifesto/ /open-data-manifesto/ 301
/en/stable/snomed /codelist-snomed/ 301
/en/stable/snomed/ /codelist-snomed/ 301
/en/stable/snomed /codelist-intro/ 301
/en/stable/snomed/ /codelist-intro/ 301
/en/stable/study_definition /study-def/ 301
/en/stable/study_definition/ /study-def/ 301
/en/stable/* /:splat 301
56 changes: 28 additions & 28 deletions docs/codelist-creation.md
@@ -6,24 +6,23 @@ There are two ways to build a codelist using OpenCodelists:

The current coding systems available in the OpenCodelists builder are listed below.

| Coding system | CSV column name |
| ---- | ---- |
| [Pseudo BNF](https://www.bennett.ox.ac.uk/blog/2017/04/prescribing-data-bnf-codes/) | `BNFCode` |
| CTV3 (Read v3) | `CTV3Code` |
| CTV3 (Read v3) with TPP extensions | `CTV3Code` |
| [Dictionary of Medicines and Devices (dm+d)](https://www.bennett.ox.ac.uk/blog/2019/08/what-is-the-dm-d-the-nhs-dictionary-of-medicines-and-devices/) | `DMDCode` |
| International Classification of Diseases 10 (ICD-10) | `ICD10Code` |
| Read v2 | `Read2Code` |
| SNOMED CT | `SNOMEDCode` |
| Coding system |
| ---- |
| [Pseudo BNF](https://www.bennett.ox.ac.uk/blog/2017/04/prescribing-data-bnf-codes/) |
| CTV3 (Read v3) |
| CTV3 (Read v3) with TPP extensions |
| [Dictionary of Medicines and Devices (dm+d)](https://www.bennett.ox.ac.uk/blog/2019/08/what-is-the-dm-d-the-nhs-dictionary-of-medicines-and-devices/) |
| International Classification of Diseases 10 (ICD-10) |
| SNOMED CT |


Each codelist must use exactly one of these systems.

OPCS-4 codes are not currently supported by the OpenCodelists builder as we do not currently have the full list of available OPCS-4 codes. However, it is possible to [manually upload an existing OPCS-4 codelist](https://www.opencodelists.org/docs/#creating-a-codelist-from-a-csv-file).
OPCS-4 and dm+d codes are not currently supported by the OpenCodelists builder. However, it is possible to [manually upload an existing OPCS-4 or dm+d codelist](https://www.opencodelists.org/docs/#creating-a-codelist-from-a-csv-file).

## Workflow

The general workflow for creating codelists is as follows:
The general workflow for creating codelists from scratch with the builder is as follows:

1. Search [OpenCodelists](https://www.opencodelists.org) for codelists that meet or nearly meet your requirements and make sure that one doesn't already exist.
1. If you need to build a new codelist [sign up for an account on OpenCodelists](https://www.opencodelists.org/accounts/register/).
@@ -33,13 +32,15 @@ previous research papers.
1. When logged into [OpenCodelists](https://www.opencodelists.org/accounts/login/) click "my codelists" and then "create new codelist". There is a short video at the [bottom of this page](#medvid) on how to use the builder to develop a medication codelist.
1. Add/remove terms to your codelists to end up with a list.
1. Save the list as a draft.
1. Clicking "Save changes" makes the codelist available on <https://codelists.opensafely.org> as a draft. Share this link to the GitHub issue.
1. Clicking "Save changes" makes the codelist available on <https://www.opencodelists.org> as a draft. Share this link to the GitHub issue.
1. Discuss as a group in the issue your decisions, and the reason for including or excluding different codes. Finalise a list
as a group (i.e. at least 2). Detailed reasons are helpful in this issue for referencing in the future.
1. Once agreed, obtain sign-off.
1. Summarise your discussion and methodology briefly for the metadata, and reference the issue on the website for more details. This will initially be a draft. When ready, publish it.
1. Once agreed, click "Save for review" and obtain sign-off.
1. Summarise your discussion and methodology briefly, and add any references (including the
GitHub issue) and sign offs by [editing the codelist metadata](#editing-existing-codelists).
When ready, click "Publish version" to publish it.
1. Close the issue on the [codelist-development repo](https://github.com/opensafely/codelist-development).
1. Import the codelist for use in your study definition.
1. [Import the codelist](#import-the-codelist-for-use-in-your-study-definition) for use in your study definition.

## Create a new issue on the [codelist-development repo](https://github.com/opensafely/codelist-development)

@@ -69,15 +70,16 @@ Once a draft codelist has been agreed, we recommend it should be signed-off by a
"disease expert" (clinical sign-off).


## Add to [OpenCodelists](https://www.opencodelists.org)
## Creating a codelist from a CSV file

* Go to the OpenCodelists [new codelist page](https://www.opencodelists.org/codelist/opensafely/).
You will need an editor account. Ask one of the tech team for one if you do not have one.
* Fill in the fields. Include lots of detail (specific guidance to follow).
* **CSV data**: [Export your Spreadsheet to a CSV](#exporting-a-csv-from-a-spreadsheet) and choose that file.
* **References**: this should include a link to the issue on the [codelist-development repo](https://github.com/opensafely/codelist-development), and any other relevant materials.
* **Sign Off**: This should match the people signing off on the issue. You need at least 2 people and can have many more.
* Click Submit and check the new codelist has appeared on the main site.
* If your codelist is in an Excel spreadsheet, first [export your Spreadsheet to a CSV](#exporting-a-csv-from-a-spreadsheet).
* For codelists in PseudoBNF, CTv3, ICD-10 or SNOMED-CT coding systems, the CSV must have
no headers, and the first column must contain the codes. Follow the instructions in the OpenCodelists documentation to [create a codelist by uploading the CSV file](https://www.opencodelists.org/docs/#creating-a-codelist-from-a-csv-file).
* For OPCS-4 or dm+d codelists, use the alternative CSV upload process described in the
[OpenCodelists documentation](https://www.opencodelists.org/docs/#creating-a-codelist-from-a-csv-file). In this case,
the CSV file must include a column with the heading `code` (or `dmd_id` for dm+d uploads).
References and sign-offs can be entered during the upload process, or afterwards, by [editing
the codelist metadata](#editing-existing-codelists).
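The two CSV shapes described above can be checked locally before uploading. The following is a minimal sketch, not part of OpenCodelists: the function names `first_column_codes` and `has_code_heading` and the sample rows are hypothetical, and accepting either `code` or `dmd_id` for dm+d uploads is an assumed reading of the rule above.

```python
import csv
import io


def first_column_codes(csv_text):
    """Return the non-empty first-column values of a headerless CSV.

    This is the shape described for PseudoBNF, CTv3, ICD-10 and SNOMED-CT uploads.
    """
    reader = csv.reader(io.StringIO(csv_text))
    return [row[0].strip() for row in reader if row and row[0].strip()]


def has_code_heading(csv_text, dmd=False):
    """Check that the header row has a `code` column.

    For dm+d uploads a `dmd_id` column is accepted as well (assumption).
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = [name.strip() for name in next(reader, [])]
    allowed = {"code", "dmd_id"} if dmd else {"code"}
    return any(name in allowed for name in header)


# Placeholder rows, not real clinical codes.
headerless = "CODE1,Term one\nCODE2,Term two\n"
with_header = "code,term\nCODE1,Term one\n"

print(first_column_codes(headerless))
print(has_code_heading(with_header))
```

A check like this only catches layout problems; whether the codes are valid in the chosen coding system is still decided by OpenCodelists at upload time.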


## Exporting a CSV from a Spreadsheet
@@ -109,23 +111,21 @@ How contributions to codelists are acknowledged -- to be agreed.
* Go to an existing Codelist page.
* Click Edit metadata.
* Edit the relevant fields
* Note: Changing the CSV data requires you Update the current Version or Create a new Version, both can be done from the Codelist page.
* Add, remove, or edit the References and SignOffs as needed.
* Add, remove, or edit the Description, Methodology, References and SignOffs as needed.
* Click Submit

## Publishing a Codelist Version

* Go to an existing Codelist page.
* This will show you the latest version for a Codelist.
* If it's a draft version there will be a Publish version button on the left below Create new version.
* If not, it's already published, good job!
* If it's an Under Review version there will be a Publish Version button on the left below Create new version.
* If not, it's already published, and the Publish Version button will be disabled.


## Adding a Codelist Version

* Go to an existing Codelist page.
* Click Create new version.
* If you want to update the existing version, click Update version instead.

## <a name="medvid"></a>Build a simple medication codelist

19 changes: 11 additions & 8 deletions docs/codelist-project.md
@@ -2,25 +2,26 @@

Your example research template doesn't include any codelists but the folder structure and text files that are needed to include codelists already exist.
Take a look at the `codelists/codelists.txt` file in the repo. This file is currently empty, but you can add one (or more) lines to it that specify the codelists that you need for your research project.
The naming convention of the line that you need to add to the `codelists/codelists.txt` file follows this structure: a `<codelist-id> `is followed by `/` and a `<tag>`.
Note that the tag is usually a date (YYYY-MM-DD) but can also be a version number (e.g., v1.2).
The naming convention of the line that you need to add to the `codelists/codelists.txt` file follows this structure: a `<codelist-id>` is followed by `/` and a `<version-id>`.
Note that the version ID is a sequence of 8 characters. Some codelists may also have a version tag in the form of a date (YYYY-MM-DD) or a version number (e.g., v1.2) that can be
used in place of the version ID.

```bash
<codelist-id>/<tag>
<codelist-id>/<version-id>
```

If you want to add a codelist from [OpenCodelists](https://www.opencodelists.org) to your project you can find this information on the page for each of the codelists, see orange boxes in the screenshot below.

![Finding the codelist id and tag on OpenCodelists.](images/adding-codelist-id-tag.png)
![Finding the codelist ID and version ID on OpenCodelists.](images/adding-codelist-id-version.png)

You need to add each line into a new line of the `codelists.txt` file.
The next time you run the command `opensafely codelists update` in your terminal, the codelists you specified earlier will be added to the `codelists/` subfolder in your project automatically so you don't need to add these files manually to your project.

For example, a `codelists.txt` file of a project may consist of four different lines:

```bash
opensafely/aplastic-anaemia/2020-04-24
opensafely/asplenia/2020-06-02
opensafely/aplastic-anaemia/58ac196d
opensafely/asplenia/3ce9e642
opensafely/current-asthma/2020-05-06
primis-covid19-vacc-uptake/bmi_stage/v1.2
```
@@ -34,11 +35,13 @@ opensafely-current-asthma.csv
primis-covid19-vacc-uptake-bmi_stage.csv
```

A codelist may be owned by an individual user, rather than an organisation. In this case, the
entry in `codelists.txt` follows this structure: `user/<username>/<codelist-id>/<version-id>`.
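The entry formats above can be illustrated with a small parser. This is a sketch only, not the `opensafely` tool's own code: `parse_entry` and `csv_filename` are hypothetical helpers, and the filename rule (replace `/` with `-` and append `.csv`) is inferred from the example filenames shown above. Applying the same rule to `user/...` entries is an assumption.

```python
def parse_entry(line):
    """Split one codelists.txt line into (codelist_path, version).

    Organisation entries have three parts (owner/codelist/version);
    user-owned entries have four (user/username/codelist/version).
    """
    parts = line.strip().split("/")
    expected = 4 if parts[0] == "user" else 3
    if len(parts) != expected or not all(parts):
        raise ValueError(f"unexpected codelists.txt entry: {line!r}")
    return "/".join(parts[:-1]), parts[-1]


def csv_filename(codelist_path):
    """Derive the CSV filename from the part before the version (inferred rule)."""
    return codelist_path.replace("/", "-") + ".csv"


for entry in [
    "opensafely/aplastic-anaemia/58ac196d",
    "primis-covid19-vacc-uptake/bmi_stage/v1.2",
]:
    path, version = parse_entry(entry)
    print(path, version, csv_filename(path))
```

Because the version is dropped from the filename, changing the version of an entry in `codelists.txt` replaces the contents of the same CSV file rather than adding a second file.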

If necessary, during initial development you can even import codelists this way before they are published (provided they have been put "under review", not in "draft" state), but ensure they are finalised and updated in your study before running in the real data.
To use codelists that are not yet published you need to add a new line to the `codelists.txt` file using this structure `user/<your_username>/<your-codelist-id>/<tag>`).

## Adding/updating a codelist CSV file
Once you have listed the codelists you need from OpenCodelists in the `codelists.txt` file, you can download the specified files into the `codelist/` folder using the `opensafely` program by running
Once you have listed the codelists you need from OpenCodelists in the `codelists.txt` file, you can download the specified files into the `codelists/` folder using the `opensafely` program by running

```bash
opensafely codelists update
17 changes: 0 additions & 17 deletions docs/codelist-snomed.md

This file was deleted.

56 changes: 56 additions & 0 deletions docs/codelist-updating.md
@@ -0,0 +1,56 @@
Once your codelists are [imported into your study](codelist-project.md), they are ready to be
used for running jobs on the [jobs site](jobs-site.md).

When you try to run jobs, you may encounter a warning message that looks something like this:

![Out of date codelists warning on Jobs site.](images/codelists-jobs-warning.png)


To fix this, you will need to follow the steps to [add a codelist into your study](codelist-project.md#adding-updating-a-codelist-CSV-file) again.

Note that this warning is only relevant if the jobs you are running require access to the
backend database. Analysis jobs that use data that has already been extracted in a previous
run do not need to update codelists in order to run successfully.


!!! Info

Due to changes introduced to address [dm+d codes](#dmd-a-special-case), dm+d
codelists now download with standardised column headings (`code` and `term`) in the
CSV files. For backwards compatibility, they also include a column with the
original code column heading (typically `dmd_id`).
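Analysis code that reads a dm+d codelist CSV directly can prefer the standardised `code` column and fall back to the original heading for older downloads. The following is a minimal sketch under that assumption; `read_codes` is a hypothetical helper and the sample rows are placeholders, not real dm+d identifiers.

```python
import csv
import io


def read_codes(csv_text, legacy_column="dmd_id"):
    """Return the codes from a codelist CSV.

    Prefers the standardised `code` column; falls back to the original
    heading (typically `dmd_id`) if `code` is absent.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = reader.fieldnames or []
    column = "code" if "code" in fieldnames else legacy_column
    if column not in fieldnames:
        raise ValueError(f"no `code` or `{legacy_column}` column found")
    return [row[column] for row in reader]


# Placeholder data in the new layout (both headings present).
new_style = "code,term,dmd_id\n111,Example product,111\n"
# Placeholder data in the old layout (original heading only).
old_style = "dmd_id,term\n111,Example product\n"

print(read_codes(new_style))
print(read_codes(old_style))
```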


## What are "out-of-date" codelists?

Codelists may sometimes go "out-of-date". All coding systems change (with the exception of CTv3, which is no longer updated), and new releases are published which may add new codes or retire codes.

A codelist version on OpenCodelists is associated with a specific release of a coding system,
and once under review or published, it cannot change. This means that, for the most part, any
codelist that has been specified in `codelists.txt` with a `version-id` and downloaded into
a study repo will not need to be updated again.

!!! warning

This does not mean that the codelist is up-to-date with the most recent release of a coding
system. It only means that the version downloaded in the study has not changed on
OpenCodelists.

You may need to create new versions of codelists in order to update them to a more recent
coding system release. To do this, go to an existing Codelist page and click on Create new
version.

### dm+d: a special case

Codelists created with the Dictionary of Medicines and Devices (dm+d) coding system are special cases. The dm+d coding system is updated and released on a weekly basis. Codes for Virtual Medicinal Products (VMPs) can change, and are retrospectively updated in patients’
clinical records. This means that after a new release of dm+d, a VMP with a changed code will no longer match patients that it did previously.

In order to address this, OpenCodelists maintains a mapping of changed VMP codes. When you run
`opensafely codelists update` to download codelist CSV files into your study repo, dm+d
codelist CSV files will include the codes explicitly specified in the codelist *and* any
previous or subsequent changes to those codes.

If a new release of dm+d introduces new VMP mappings that affect codes in your codelists, you
may be prompted (by the opensafely command line tool, automated tests in GitHub, or the jobs site) to re-run
`opensafely codelists update`, commit the changes and push them to GitHub before you can run
jobs.
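The effect of the VMP mapping can be pictured as expanding a codelist with every earlier or later form of its codes. The sketch below is illustrative only and does not reproduce how OpenCodelists stores or applies its mapping: `expand_vmp_codes`, the pair-based mapping format, and the letter codes are all hypothetical, and following chains of changes transitively is an assumption.

```python
def expand_vmp_codes(codes, changes):
    """Return `codes` plus any previous or subsequent codes linked to them.

    `changes` is an iterable of (previous_code, subsequent_code) pairs.
    Links are followed in both directions and across chains of changes.
    """
    neighbours = {}
    for old, new in changes:
        neighbours.setdefault(old, set()).add(new)
        neighbours.setdefault(new, set()).add(old)

    expanded = set(codes)
    pending = list(codes)
    while pending:
        code = pending.pop()
        for other in neighbours.get(code, ()):
            if other not in expanded:
                expanded.add(other)
                pending.append(other)
    return expanded


# Hypothetical mapping: A was replaced by B, and B later by C.
changes = [("A", "B"), ("B", "C"), ("X", "Y")]
print(sorted(expand_vmp_codes({"B"}, changes)))
```

Under this picture, a new dm+d release that adds a pair touching one of your codes changes the expanded set, which is why re-running `opensafely codelists update` can alter a downloaded CSV even though the codelist version itself is unchanged.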
Binary file added docs/images/adding-codelist-id-version.png
Binary file added docs/images/codelists-jobs-warning.png
2 changes: 1 addition & 1 deletion mkdocs.yml
@@ -38,7 +38,7 @@ nav:
- Introduction to codelists: codelist-intro.md
- Building a codelist: codelist-creation.md
- Adding codelists to a project: codelist-project.md
- SNOMED CT codelists: codelist-snomed.md
- Keeping codelists up to date: codelist-updating.md
- Actions:
- Overview: actions-intro.md
- The project pipeline: actions-pipelines.md
