Merge pull request #1633 from opensafely/eli/getting-started-guide-im…
…provements

Improvements to the Getting Started guide
iaindillingham authored Oct 1, 2024
2 parents ed1e68d + b191be7 commit b4d8687
Showing 11 changed files with 46 additions and 31 deletions.
@@ -117,7 +117,7 @@ This code reads the CSV of patient data, and saves a histogram of ages to a new
- **Line 16** tells the system that this action depends on the outputs of the
`generate_dataset` being present.
- **Lines 17-19** describe the files that the action creates. Line 18 says that the
items indented below it are *moderately* sensitive, that is they may be released
items indented below it are *moderately* sensitive, which means they may be released
to the public after a careful review (and possible redaction). Line 19 says that
there's one output file, which will be found at `output/report.png`.
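
The script this hunk refers to is not included in the diff. As a rough illustration only, the sketch below shows one way the counting step behind an age histogram could look using just the Python standard library. It assumes the dataset is a gzip-compressed CSV with an `age` column; the function names `read_rows` and `age_histogram` are hypothetical, and unlike the tutorial's script it returns counts rather than saving an image to `output/report.png`.

```python
import csv
import gzip
from collections import Counter


def read_rows(path):
    """Yield each row of a gzip-compressed CSV as a dict keyed by column name."""
    with gzip.open(path, "rt", newline="") as f:
        yield from csv.DictReader(f)


def age_histogram(rows, bin_width=10):
    """Count rows per age band.

    Assumes each row is a dict with an 'age' value (an assumption about the
    dataset's columns). Returns {lower bound of band: count}, sorted by band.
    """
    counts = Counter()
    for row in rows:
        value = row.get("age")
        if value is None or value == "":
            continue  # skip rows with a missing age
        lower = (int(value) // bin_width) * bin_width
        counts[lower] += 1
    return dict(sorted(counts.items()))
```

For example, `age_histogram(read_rows("output/dataset.csv.gz"))` would return a dictionary mapping the lower bound of each ten-year age band to the number of patients in it, assuming that file exists and has an `age` column.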

@@ -1,5 +1,5 @@
Your repository is automatically configured with tests to verify the project is runnable,
each time you push.
Your repository is automatically configured with tests to verify the project is runnable.
These tests run each time you push.

Now that you have published the changes from your codespace to your GitHub repository,
we can see if these tests pass.
10 changes: 8 additions & 2 deletions docs/getting-started/tutorial/create-a-github-codespace/index.md
@@ -10,6 +10,12 @@ To open your repository with GitHub Codespaces:

![A screenshot showing the "Create codespace on main" button.](../../../images/getting-started-codespaces-button.png)

You may see the following screen requesting additional permissions for your codespace:

![A screenshot showing "This codespace is requesting additional permissions", with a green "Authorize and continue" button at the bottom right.](../../../images/getting-started-codespaces-repository-additional-permissions.png)

If so, click "Authorize and continue".

You should then see a "Setting up your codespace" screen:

![A screenshot showing "Setting up your codespace".](../../../images/getting-started-codespaces-setting-up.png)
@@ -25,10 +31,10 @@ Explorer.](../../../images/getting-started-codespaces-start.png)
The terminal at the bottom-right of the GitHub codespace runs
commands on a computer (virtual machine) provided by GitHub.

The large, upper-right area holds the **main editor** and where you will
The large, upper-right area holds the **main editor**, which is where you will
view and edit files that you are working on. The left **"side bar"**
holds the Explorer when you first start the codespace. There are
other useful menus in this area that can be switched with the icons
other useful menus in this area that can be accessed with the icons
to the far left side. Finally, the button at the top-left with three
horizontal lines (`☰`) is the **menu button**, which allows you to
access many more options.
@@ -1,5 +1,5 @@
Here, you'll copy our OpenSAFELY research template to your own GitHub
account, for developing your own study:
account and use it to develop your own study:

1. Click on the link below to create a new repository based on our template.
You may need to log in to GitHub if you are not already logged in:
@@ -13,10 +13,10 @@ account, for developing your own study:
1. Click **Create repository from template**
1. The new GitHub repository will take a moment to initialise, as it is running
some setup in the background. Wait about 1 minute, then reload the page, and you
should see the README displayed now reflects the name you gave to the new
should see that the README displayed now reflects the name you gave to the new
repository.

If you see `${GITHUB_REPOSITORY_NAME}` in your README, the repo is not yet initialised, wait a few seconds longer and reload.
If you see `${GITHUB_REPOSITORY_NAME}` in your README, the repo is not yet initialised; wait a few seconds longer and reload.

---

@@ -1,13 +1,15 @@
Now you're ready to run your first study.

In the terminal, run:
In the terminal, type the following:

```shell-session
$ opensafely exec ehrql:v1 generate-dataset analysis/dataset_definition.py
opensafely exec ehrql:v1 generate-dataset analysis/dataset_definition.py
```

pressing ++enter++ once you've typed the command.

This command makes use of files that already exist in the repository to generate a dummy dataset.

The first time you run this command, it may take a few seconds to download the
required software. Eventually, you should see output that contains lines like the following:

4 changes: 2 additions & 2 deletions docs/getting-started/tutorial/introduction/index.md
@@ -1,10 +1,10 @@
OpenSAFELY is designed to allow you to do your analytic work on your own
computer, without ever having to access the real, sensitive, patient-level data.

Because the study uses dummy patient data,
Because the tutorial study uses dummy patient data,
anyone can complete the tutorial.

We ask all potential collaborators to complete this tutorial,
We ask all potential collaborators to complete this tutorial
before applying to run their project against real data.

## Learning outcome
@@ -1,5 +1,5 @@
So far,
the changes you have made only exist in the codespace which you are working in.
the changes you have made only exist in the codespace in which you are working.

In this section, you will first add the study changes that you've made
to a new *commit* in your repository — a commit represents a stored
33 changes: 21 additions & 12 deletions docs/getting-started/tutorial/run-the-project-pipeline/index.md
@@ -3,23 +3,23 @@ we will look at the OpenSAFELY project pipeline.

So far,
we have run the single dataset definition step, or *scripted action*,
at the command line with the command:
using the command line with the command:

```sh
$ opensafely exec ehrql:v1 generate-dataset analysis/dataset_definition.py
opensafely exec ehrql:v1 generate-dataset analysis/dataset_definition.py
```

A complete OpenSAFELY study may include multiple actions.
For example, the first action might extract a dataset,
and a subsequent action might generate a table or chart from that data.

The `project.yaml` file in the study repository
defines the actions for an OpenSAFELY project pipeline
defines the actions for an OpenSAFELY project pipeline.

## The `project.yaml` file

In the Visual Studio Code file Explorer,
open the `project.yaml` file by clicking on it.
open the `project.yaml` file by clicking on it. This file will be near the end of the list of files and folders.

You should see a tab with the following content:

@@ -38,24 +38,29 @@ actions:
dataset: output/dataset.csv.gz
```

There is a single actions defined called `generate_dataset`
There is a single action defined, called `generate_dataset`,
in this project pipeline.

The highlighted line is the command that the action runs
The highlighted line is the command that the action runs,
and is very similar to the command we previously ran.

The difference is that `generate_dataset` defines an output
stored in the `output` folder.

## Running the action in the pipeline

1. In the Visual Studio Code file Explorer,
<ol>
<li>
In the Visual Studio Code file Explorer,
confirm that the `output` folder is empty.
2. In the Visual Studio Code Terminal,
</li>

<li>
In the Visual Studio Code Terminal,
type:

```sh
$ opensafely run generate_dataset
opensafely run generate_dataset
```

and press ++enter++ to run the pipeline action.
@@ -81,9 +86,13 @@ stored in the `output` folder.
`output/dataset.csv.gz`, and that it should be considered highly sensitive
data. What you see here is exactly the same process that would happen on a real, secure
server.
3. When the command completes,
recheck the `output` folder
</li>
<li>
When the command completes, recheck the `output` folder
and see that it contains a `dataset.csv.gz` file.
</li>
</ol>
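
If you prefer to check the result from code rather than the Explorer, a small sketch along these lines could confirm that the file is a readable gzip-compressed CSV. It uses only the Python standard library; `summarise_dataset` is a hypothetical helper name, not part of OpenSAFELY.

```python
import csv
import gzip


def summarise_dataset(path):
    """Return (column names, number of data rows) for a gzip-compressed CSV."""
    with gzip.open(path, "rt", newline="") as f:
        reader = csv.reader(f)
        header = next(reader, [])  # empty list if the file has no header row
        row_count = sum(1 for _ in reader)
    return header, row_count
```

Calling `summarise_dataset("output/dataset.csv.gz")` after the action completes should return the column names and the number of rows in the dummy dataset.
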
### Viewing the dataset output
@@ -103,7 +112,7 @@ The difference between them is that:
* `opensafely exec` runs actions *outside* of the project pipeline
and is useful for quick feedback during interactive development
* `opensafely run` runs actions *inside* the project pipeline,
* `opensafely run` runs actions *inside* the project pipeline -
that is, just as they would be in the secure OpenSAFELY environment
containing real patient data
@@ -1,4 +1,4 @@
You've successfully generated a dataset from the code in your study, but at the moment it only adds one data column.
You've successfully generated a dataset from the code in your study, but at the moment it only contains one data column.

Now we'll add some code to create an extra column.

@@ -32,15 +32,13 @@ Lines 8-12 mean "*I'm interested in all patients who were registered at a practi
on the index date*"; line 14 "*Give me a column of data corresponding
to the sex of each patient*"; and line 15 "*Give me a column of data corresponding
to the age of each patient on the given date*".
1. If you run:
1. If you type the following into your terminal:

```shell-session
$ opensafely exec ehrql:v1 generate-dataset analysis/dataset_definition.py
opensafely exec ehrql:v1 generate-dataset analysis/dataset_definition.py
```

you will see a new randomly generated dataset.

However, this time it contains the additional `age` column.
and press ++enter++, you will see a new randomly generated dataset which now contains the additional `age` column.

---
