improve assignment table functions (#38)

* update assign logging and force dtypes before merging
* new parallel assignment by propagation functions
* map propagate and resolve propagate functions
* cache prop tables, add docstrings and todos
* copy rows in prop for speed and lower memory req
* correct false checks for empty rows and naming in assign by clusters
* correct bugs in df filters, map assign ungauged
* revise gis generating functions for new column names, logging
* increment version number
1 parent: af0245d. Commit: e5c988c. Showing 22 changed files with 453 additions and 438 deletions.
@@ -0,0 +1,3 @@
# `saber.assign`

::: saber.assign
@@ -0,0 +1,3 @@
# `saber.cluster`

::: saber.cluster
@@ -0,0 +1,3 @@
# `saber.gis`

::: saber.gis
@@ -1 +1,7 @@
# API Documentation
# `saber-hbc` API

* [`saber.assign`](assign.md)
* [`saber.cluster`](cluster.md)
* [`saber.gis`](gis.md)
* [`saber.prep`](prep.md)
* [`saber.validate`](validate.md)
@@ -0,0 +1,3 @@
# `saber.prep`

::: saber.prep
@@ -0,0 +1,3 @@
# `saber.validate`

::: saber.validate
@@ -0,0 +1,41 @@
# Required Hydrological Datasets

1. Hindcast/Retrospective discharge for every stream segment (reporting point) in the model. This is a time series of
   discharge, e.g. a hydrograph, for each stream segment. The data should be saved in parquet format and named
   `hindcast_series_table.parquet`. The DataFrame should have:
    1. An index named `datetime` of type `datetime`, containing the datetime stamp for the simulated values (rows).
    2. One column per stream. The column name is the stream's model ID, of type string, and contains the discharge for
       each time step.
2. Observed discharge data for each gauge, one file per gauge, named `{gauge_id}.csv`. The DataFrame should have:
    1. `datetime`: The datetime stamp for the measurements.
    2. A column whose name is the unique `gauge_id`, containing the discharge for each time step.

The `hindcast_series_table.parquet` should look like this:

| datetime   | model_id_1 | model_id_2 | model_id_3 | ... |
|------------|------------|------------|------------|-----|
| 1985-01-01 | 50         | 50         | 50         | ... |
| 1985-01-02 | 60         | 60         | 60         | ... |
| 1985-01-03 | 70         | 70         | 70         | ... |
| ...        | ...        | ...        | ...        | ... |

Each gauge's csv file should look like this:

| datetime   | discharge |
|------------|-----------|
| 1985-01-01 | 50        |
| 1985-01-02 | 60        |
| 1985-01-03 | 70        |
| ...        | ...       |

## Things to check

Be sure that both datasets:

- Are in the same units (e.g. m3/s)
- Are in the same time zone (e.g. UTC)
- Use the same time step (e.g. daily average)
- Do not contain any non-numeric values (e.g. ICE, none, etc.)
- Do not contain rows with missing values (e.g. NaN or blank cells)
- Have been cleaned of any incorrect values (e.g. no negative values)
- Do not contain any duplicate rows
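A minimal sketch of how these checks could be scripted with pandas before running saber; the gauge id, file locations, and the specific assertions are illustrative assumptions, not part of the `saber-hbc` package:

```python
import pandas as pd

# Load the hindcast table: datetime index, one column of discharge per model_id
hindcast = pd.read_parquet('hindcast_series_table.parquet')
hindcast.index = pd.to_datetime(hindcast.index)

assert hindcast.index.name == 'datetime', 'index must be named "datetime"'
assert not hindcast.index.duplicated().any(), 'duplicate datetime rows found'
assert hindcast.notna().all().all(), 'missing (NaN/blank) values found'
assert (hindcast.select_dtypes('number') >= 0).all().all(), 'negative discharge values found'

# Load one gauge csv; its discharge column should be named with the gauge_id
gauge_id = '12345'  # hypothetical gauge id
obs = pd.read_csv(f'{gauge_id}.csv', index_col='datetime', parse_dates=True)

assert gauge_id in obs.columns, 'discharge column must be named with the gauge_id'
assert not obs[gauge_id].isna().any(), 'missing observed values found'
```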
@@ -0,0 +1,46 @@
# Required GIS Datasets

1. Drainage lines (usually delineated center lines) with at least the following attributes (columns)
   for each feature:
    - `model_id`: A unique identifier/ID; any alphanumeric utf-8 string will suffice
    - `downstream_model_id`: The ID of the next downstream reach
    - `strahler_order`: The Strahler stream order of each reach
    - `model_drain_area`: Cumulative upstream drainage area
    - `x`: The x coordinate of the centroid of each feature (precalculated for faster results later)
    - `y`: The y coordinate of the centroid of each feature (precalculated for faster results later)

2. Points representing the location of each river gauging station available, with at least the
   following attributes (columns) for each feature:
    - `gauge_id`: A unique identifier/ID; any alphanumeric utf-8 string will suffice.
    - `model_id`: The ID of the stream segment which corresponds to that gauge.

The `drain_table.parquet` should look like this:

| downstream_model_id | model_id        | model_area   | strahler_order | x   | y   |
|---------------------|-----------------|--------------|----------------|-----|-----|
| unique_stream_#     | unique_stream_# | area in km^2 | stream_order   | ##  | ##  |
| unique_stream_#     | unique_stream_# | area in km^2 | stream_order   | ##  | ##  |
| unique_stream_#     | unique_stream_# | area in km^2 | stream_order   | ##  | ##  |
| ...                 | ...             | ...          | ...            | ... | ... |

The `gauge_table.parquet` should look like this:

| model_id          | gauge_id         | gauge_area   |
|-------------------|------------------|--------------|
| unique_stream_num | unique_gauge_num | area in km^2 |
| unique_stream_num | unique_gauge_num | area in km^2 |
| unique_stream_num | unique_gauge_num | area in km^2 |
| ...               | ...              | ...          |

## Things to check

Be sure that both datasets:

- Are in the same projected coordinate system
- Only contain gauges and reaches within the area of interest. Clip/delete anything else.

Other things to consider:

- You may find it helpful to also have the catchments, adjoint catchments, and a watershed boundary polygon for
  visualization purposes.
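As an illustration of the precalculated `x`/`y` centroid columns, here is a hedged geopandas sketch for building `drain_table.parquet` from the drainage lines; the shapefile name is hypothetical and the sketch assumes the attributes already use the column names listed above:

```python
import geopandas as gpd
import pandas as pd

# Read the clipped drainage lines (file name is an assumption)
drain_lines = gpd.read_file('drainage_lines.shp')

# Precalculate the centroid coordinates of each feature as plain numeric columns
centroids = drain_lines.geometry.centroid
drain_lines['x'] = centroids.x
drain_lines['y'] = centroids.y

# Keep only the attribute columns and drop the geometry before writing the table
keep = ['model_id', 'downstream_model_id', 'strahler_order', 'model_drain_area', 'x', 'y']
pd.DataFrame(drain_lines[keep]).to_parquet('drain_table.parquet')
```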
@@ -1,2 +1,3 @@
mkdocs==1.3
mkdocs-material==8.4
mkdocs-material==8.4
mkdocstrings-python==0.7.1
@@ -1,80 +1,30 @@
# Prepare Spatial Data (scripts not provided)
This step instructs you to collect 3 gis files and use them to generate 2 tables. All 5 files (3 gis files and 2
tables) should go in the `gis_inputs` directory
# Processing Input Data

1. Clip model drainage lines and catchments shapefile to extents of the region of interest.
   For speed/efficiency, merge their attribute tables and save as a csv.
    - read drainage line shapefile and with GeoPandas
    - delete all columns ***except***: NextDownID, COMID, Tot_Drain_, order_
    - rename the columns:
        - NextDownID -> downstream_model_id
        - COMID -> model_id
        - Tot_Drain -> drainage_area
        - order_ -> stream_order
    - compute the x and y coordinates of the centroid of each feature (needs the geometry column)
    - delete geometry column
    - save as `drain_table.csv` in the `gis_inputs` directory
Before following these steps, you should have prepared the required datasets and organized them in a working directory.
Refer to the [Required Datasets](../data/index.md) page for more information.

Tip to compute the x and y coordinates using geopandas
***Prereqs:***

1. Create a working directory and subdirectories
2. Prepare the `drain_table` and `gauge_table` files.
3. Prepare the `hindcast_series_table` file.

Your table should look like this:
## Prepare Flow Duration Curve Data

| downstream_model_id | model_id        | model_drain_area | stream_order | x   | y   |
|---------------------|-----------------|------------------|--------------|-----|-----|
| unique_stream_#     | unique_stream_# | area in km^2     | stream_order | ##  | ##  |
| unique_stream_#     | unique_stream_# | area in km^2     | stream_order | ##  | ##  |
| unique_stream_#     | unique_stream_# | area in km^2     | stream_order | ##  | ##  |
| ...                 | ...             | ...              | ...          | ... | ... |

1. Prepare a csv of the attribute table of the gauge locations shapefile.
    - You need the columns:
        - model_id
        - gauge_id
        - drainage_area (if known)

Your table should look like this (column order is irrelevant):

| model_id          | gauge_drain_area | gauge_id         |
|-------------------|------------------|------------------|
| unique_stream_num | area in km^2     | unique_gauge_num |
| unique_stream_num | area in km^2     | unique_gauge_num |
| unique_stream_num | area in km^2     | unique_gauge_num |
| ...               | ...              | ...              |

# Prepare Discharge Data

This step instructs you to gather simulated data and observed data. The raw simulated data (netCDF) and raw observed
data (csvs) should be included in the `data_inputs` folder. You may keep them in another location and provide the path
as an argument in the functions that need it. These datasets are used to generate several additional csv files which
are stored in the `data_processed` directory and are used in later steps. The netCDF file may have any name and the
directory of observed data csvs should be called `obs_csvs`.

Use the dat

1. Create a single large csv of the historical simulation data with a datetime column and 1 column per stream segment labeled by the stream's ID number.

| datetime   | model_id_1 | model_id_2 | model_id_3 |
|------------|------------|------------|------------|
| 1979-01-01 | 50         | 50         | 50         |
| 1979-01-02 | 60         | 60         | 60         |
| 1979-01-03 | 70         | 70         | 70         |
| ...        | ...        | ...        | ...        |

2. Process the large simulated discharge csv to create a 2nd csv with the flow duration curve on each segment (script provided).
Process the `hindcast_series_table` to create a 2nd table with the flow duration curve on each segment.

| p_exceed | model_id_1 | model_id_2 | model_id_3 |
|----------|------------|------------|------------|
| 100      | 0          | 0          | 0          |
| 99       | 10         | 10         | 10         |
| 98       | 20         | 20         | 20         |
| 97.5     | 10         | 10         | 10         |
| 95       | 20         | 20         | 20         |
| ...      | ...        | ...        | ...        |

3. Process the large historical discharge csv to create a 3rd csv with the monthly averages on each segment (script provided).
Then process the FDC data to create a 3rd table with scaled/transformed FDC data for each segment.

| month | model_id_1 | model_id_2 | model_id_3 |
|-------|------------|------------|------------|
| 1     | 60         | 60         | 60         |
| 2     | 30         | 30         | 30         |
| 3     | 70         | 70         | 70         |
| ...   | ...        | ...        | ...        |
| model_id | Q100 | Q97.5 | Q95 |
|----------|------|-------|-----|
| 1        | 60   | 50    | 40  |
| 2        | 60   | 50    | 40  |
| 3        | 60   | 50    | 40  |
| ...      | ...  | ...   | ... |
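The flow duration curve step in the new documentation can be sketched with numpy/pandas. This is not the `saber-hbc` implementation; the set of exceedance probabilities and the output file name are assumptions based on the example table above:

```python
import numpy as np
import pandas as pd

# Hindcast discharge: datetime index, one column per model_id
hindcast = pd.read_parquet('hindcast_series_table.parquet')

# Exceedance probabilities to evaluate (assumed set, matching the example table)
p_exceed = np.array([100, 99, 98, 97.5, 95, 90, 75, 50, 25, 10, 5, 2.5, 1, 0])

# The flow exceeded p% of the time is the (100 - p)th percentile of the record
fdc = pd.DataFrame(
    {model_id: np.percentile(hindcast[model_id].dropna(), 100 - p_exceed)
     for model_id in hindcast.columns}
)
fdc.index = pd.Index(p_exceed, name='p_exceed')
fdc.to_parquet('fdc_table.parquet')
```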
@@ -14,5 +14,5 @@
]

__author__ = 'Riley C. Hales'
__version__ = '0.5.0'
__version__ = '0.6.0'
__license__ = 'BSD 3 Clause Clear'