Commit

scripts: misc updates for clarity
rkdarst committed Nov 8, 2023
1 parent c72700c commit b1929fa
Showing 5 changed files with 44 additions and 28 deletions.
46 changes: 31 additions & 15 deletions content/scripts.rst
@@ -45,7 +45,7 @@ Jupyter notebooks can be parameterized for instance using `papermill <https://pa

Within JupyterLab, you can export any Jupyter notebook to a Python script:

.. figure:: https://jupyterlab.readthedocs.io/en/stable/_images/exporting_menu.png
.. figure:: https://jupyterlab.readthedocs.io/en/stable/_images/exporting-menu.png

Select File (top menu bar) → Export Notebook as → **Export notebook to Executable Script**.

@@ -69,9 +69,13 @@ Exercises 1

1. Download the :download:`weather_observations.ipynb <../resources/code/scripts/weather_observations.ipynb>` and the weather_data file and upload them to your Jupyterlab. The script plots the temperature data for Tapiola in Espoo. The data is originally from `rp5.kz <https://rp5.kz>`_ and was slightly adjusted for this lecture.

**Note:** If you haven't downloaded the file directly to your Jupyterlab folder, it will be located in your **Downloads** folder or the folder you selected. In Jupyterlab click on the 'upload file' button, navigate to the folder containing the file and select it to load it into your Jupyterlab folder.
**Hint:** Copy the URL above (right-click) and in JupyterLab, use
File → Open from URL → Paste the URL. It will both download it to
the directory JupyterLab is in and open it for you.

2. Open a terminal in Jupyter (File → New → Terminal).
2. Open a terminal in Jupyter: File → New Launcher, then click
"Terminal" there. (if you do it this way, it will be in the right
directory. File → New → Terminal might not be.)

3. Convert the Jupyter script to a Python script by calling::

@@ -81,6 +85,8 @@ Exercises 1

$ python weather_observations.py



Command line arguments with :data:`sys.argv`
--------------------------------------------

@@ -100,29 +106,31 @@ and any further argument (separated by space) is appended to this list, like suc
$ # sys.argv[2] is 'B'
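
As a minimal sketch of the indexing shown above: ``sys.argv[0]`` holds the script name and the user-supplied arguments start at index 1. The helper name ``describe_arguments`` is hypothetical and exists only for illustration.

```python
import sys


def describe_arguments(argv):
    """Return one line per command line argument, labelled by its index.

    `describe_arguments` is a hypothetical helper used only for illustration.
    """
    return [f"sys.argv[{index}] is {value!r}" for index, value in enumerate(argv)]


if __name__ == "__main__":
    # sys.argv[0] is the script name; user-supplied arguments start at index 1.
    for line in describe_arguments(sys.argv):
        print(line)
```

Saved as ``myscript.py`` and run as ``python myscript.py A B``, this should list the script name followed by ``'A'`` and ``'B'``.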
Lets see how it works: We modify the **weather_observations.py** script such that we allow start
and end times as well as the output file to be passed in as arguments to the function:
and end times as well as the output file to be passed in as arguments
to the function. Open it (find the ``.py`` file from the JupyterLab
file browser) and make these edits:

.. code-block:: python
:emphasize-lines: 1,5-6,8,16
:emphasize-lines: 1,5-6,8,14-15
import sys
import pandas as pd
# set start and end time
start_date = pd.to_datetime(sys.argv[1],dayfirst=True)
end_date = pd.to_datetime(sys.argv[2],dayfirst=True)
output_file_name = sys.argv[3]
# define the start and end time for the plot
start_date = pd.to_datetime(sys.argv[1], dayfirst=True)
end_date = pd.to_datetime(sys.argv[2], dayfirst=True)
...
# select the data
weather = weather[weather['Local time'].between(start_date,end_date)]
...
# save the figure
output_file_name = sys.argv[3]
fig.savefig(output_file_name)
We can try it out:
We can try it out (see the file ``spring_in_tapiola.png`` made in the
file browser):

.. code-block:: console
@@ -185,6 +193,7 @@ would show the following message:

.. code-block:: console
$ python birthday.py --help
usage: birthday.py [-h] [-d DATE] N
positional arguments:
@@ -201,7 +210,7 @@ Exercises 2
.. challenge:: Scripts-2

1. Take the Python script (``weather_observations.py``) we have written in the preceding exercise and use
:py:mod:`argparse` to specify the input and output files and allow the start and end dates to be set.
:py:mod:`argparse` to specify the input (URL) and output files and allow the start and end dates to be set.

* Hint: try not to do it all at once, but add one or two arguments, test, then add more, and so on.
* Hint: The input and output filenames make sense as positional arguments, since they must always be given. Input is usually first, then output.
@@ -236,6 +245,7 @@ Exercises 2
- We can now process different input files without changing the script.
- We can select multiple time ranges without modifying the script.
- We can easily save these commands to know what we did.
- This way we can also loop over file patterns (using shell loops or similar) or use
the script in a workflow management system and process many files in parallel.
- By changing from :data:`sys.argv` to :mod:`argparse` we made the script more robust against
@@ -287,9 +297,9 @@ Exercises 3 (optional)
.. challenge:: Scripts-3

1. Download the :download:`optionsparser.py <https://raw.githubusercontent.com/AaltoSciComp/python-for-scicomp/master/resources/code/scripts/optionsparser.py>`
function and load it into your working folder in Jupyterlab.
function and load it into your working folder in Jupyterlab (Hint: in JupyterLab, File → Open from URL).
Modify the previous script to use a config file parser to read all arguments. The config file is passed in as a single argument on the command line
(using e.g. argparse or sys.argv) still needs to be read from the command line.
(using e.g. :mod:`argparse` or :data:`sys.argv`) still needs to be read from the command line.


2. Run your script with different config files.
@@ -303,6 +313,12 @@ Exercises 3 (optional)
:language: python
:emphasize-lines: 5,9-12,15-27,30,33,36-37,58

What did this config file parser get us? Now, we have separated the
code from the configuration. We could save all the configuration in
version control - separately and have one script that runs them. If
done right, our work could be much more reproducible and
understandable.
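
To make the separation of code and configuration concrete, here is a small sketch. It uses the standard library's ``configparser`` rather than the course's ``optionsparser.py``, whose interface is not shown in this diff. The section name ``weather``, the keys and the input file name are all assumptions made for illustration.

```python
import configparser

# Hypothetical config contents; in practice this would live in its own file
# under version control and be loaded with parser.read(path).
CONFIG_TEXT = """
[weather]
input = weather_data.csv
output = spring_in_tapiola.png
start = 01/03/2021
end = 31/05/2021
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)

# Each section behaves like a mapping from option names to string values.
settings = parser["weather"]
print(settings["input"], settings["output"], settings["start"], settings["end"])
```

The script itself stays unchanged between runs; only the config file differs, so each configuration can be committed alongside the results it produced.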


.. admonition:: Further reading

6 changes: 3 additions & 3 deletions resources/code/scripts/weather_observations.ipynb
@@ -12,12 +12,12 @@
"weather = pd.read_csv(url,comment='#')\n",
"\n",
"# define the start and end time for the plot \n",
"start_date=pd.to_datetime('01/06/2021',dayfirst=True)\n",
"end_date=pd.to_datetime('01/10/2021',dayfirst=True)\n",
"start_date=pd.to_datetime('01/06/2021', dayfirst=True)\n",
"end_date=pd.to_datetime('01/10/2021', dayfirst=True)\n",
"\n",
"# The date format in the file is in a day-first format, which matplotlib does nto understand.\n",
"# so we need to convert it.\n",
"weather['Local time'] = pd.to_datetime(weather['Local time'],dayfirst=True)\n",
"weather['Local time'] = pd.to_datetime(weather['Local time'], dayfirst=True)\n",
"# select the data\n",
"weather = weather[weather['Local time'].between(start_date,end_date)]\n"
]
6 changes: 3 additions & 3 deletions resources/code/scripts/weather_observations.py
@@ -8,10 +8,10 @@
weather = pd.read_csv(url,comment='#')

# define the start and end time for the plot
start_date=pd.to_datetime('01/06/2021',dayfirst=True)
end_date=pd.to_datetime('01/10/2021',dayfirst=True)
start_date=pd.to_datetime('01/06/2021', dayfirst=True)
end_date=pd.to_datetime('01/10/2021', dayfirst=True)
#Preprocess the data
weather['Local time'] = pd.to_datetime(weather['Local time'],dayfirst=True)
weather['Local time'] = pd.to_datetime(weather['Local time'], dayfirst=True)
# select the data
weather = weather[weather['Local time'].between(start_date,end_date)]
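
For readers unfamiliar with ``dayfirst=True``, the following sketch shows what the date handling in this hunk does. The three-row table is a hypothetical stand-in for the real weather data, which the script loads from a URL.

```python
import pandas as pd

# Day-first parsing: '01/06/2021' is read as 1 June 2021, not 6 January 2021.
start_date = pd.to_datetime('01/06/2021', dayfirst=True)
end_date = pd.to_datetime('01/10/2021', dayfirst=True)

# Hypothetical three-row stand-in for the real weather table.
weather = pd.DataFrame(
    {'Local time': ['15/05/2021 12:00', '15/07/2021 12:00', '15/11/2021 12:00']}
)
weather['Local time'] = pd.to_datetime(weather['Local time'], dayfirst=True)

# between() includes both end points by default.
selected = weather[weather['Local time'].between(start_date, end_date)]
print(selected)
```

Only the July row falls between the two dates, so it is the only one kept.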

8 changes: 4 additions & 4 deletions resources/code/scripts/weather_observations_argparse.py
@@ -5,19 +5,19 @@
parser.add_argument("input", type=str, help="Input data file")
parser.add_argument("output", type=str, help="Output plot file")
parser.add_argument("-s", "--start", default="01/01/2019", type=str, help="Start date in DD/MM/YYYY format")
parser.add_argument("-e", "--end", default="16/10/2021", type=str, help="End date in DD/MM/YYYY format")
parser.add_argument("-e", "--end", default="16/10/2021", type=str, help="End date in DD/MM/YYYY format")

args = parser.parse_args()

# load the data
weather = pd.read_csv(args.input,comment='#')

# define the start and end time for the plot
start_date=pd.to_datetime(args.start,dayfirst=True)
end_date=pd.to_datetime(args.end,dayfirst=True)
start_date=pd.to_datetime(args.start, dayfirst=True)
end_date=pd.to_datetime(args.end, dayfirst=True)

# preprocess the data
weather['Local time'] = pd.to_datetime(weather['Local time'],dayfirst=True)
weather['Local time'] = pd.to_datetime(weather['Local time'], dayfirst=True)
# select the data
weather = weather[weather['Local time'].between(start_date,end_date)]
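
The argument definitions in this hunk can be tried without a terminal by passing a list to ``parse_args()``. This is a sketch under assumptions: the ``build_parser`` wrapper and the file names ``data.csv`` and ``plot.png`` are hypothetical, and the parser is created without a description because that part of the file is not shown.

```python
import argparse


def build_parser():
    # Mirrors the four arguments visible in the hunk; the wrapper function
    # itself is a hypothetical convenience for this example.
    parser = argparse.ArgumentParser()
    parser.add_argument("input", type=str, help="Input data file")
    parser.add_argument("output", type=str, help="Output plot file")
    parser.add_argument("-s", "--start", default="01/01/2019", type=str,
                        help="Start date in DD/MM/YYYY format")
    parser.add_argument("-e", "--end", default="16/10/2021", type=str,
                        help="End date in DD/MM/YYYY format")
    return parser


# Passing a list to parse_args() stands in for the real command line.
args = build_parser().parse_args(["data.csv", "plot.png", "--start", "01/03/2021"])
print(args.input, args.output, args.start, args.end)
```

The positional arguments must always be given, while ``--end`` falls back to its default here because it was omitted.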

6 changes: 3 additions & 3 deletions resources/code/scripts/weather_observations_config.py
@@ -33,11 +33,11 @@
weather = pd.read_csv(parameters.input,comment='#')

# obtain start and end date
start_date=pd.to_datetime(parameters.start,dayfirst=True)
end_date=pd.to_datetime(parameters.end,dayfirst=True)
start_date=pd.to_datetime(parameters.start, dayfirst=True)
end_date=pd.to_datetime(parameters.end, dayfirst=True)

# Data preprocessing
weather['Local time'] = pd.to_datetime(weather['Local time'],dayfirst=True)
weather['Local time'] = pd.to_datetime(weather['Local time'], dayfirst=True)
# select the data
weather = weather[weather['Local time'].between(start_date,end_date)]

