Bug fix in valid GIS names used to create water network models (#452)
* bug fix, missing commas in _base_attributes

* Updates to remove the use of node_type and link_type in WaterNetworkGIS GeoDataFrames

* updated/added tests

* removed user tests, geodataframes no longer store node_type

* Added "name" back into the list of valid gis columns, updated tests

* updated documentation

* minor updates

---------

Co-authored-by: kbonney
kaklise and kbonney authored Nov 12, 2024
1 parent 9a2a18f commit 26b433a
Showing 7 changed files with 141 additions and 72 deletions.
42 changes: 21 additions & 21 deletions documentation/gis.rst
@@ -119,13 +119,13 @@ For example, the junctions GeoDataFrame contains the following information:
:skipif: gpd is None

>>> print(wn_gis.junctions.head())
node_type elevation initial_quality geometry
name
10 Junction 216.408 5.000e-04 POINT (20.00000 70.00000)
11 Junction 216.408 5.000e-04 POINT (30.00000 70.00000)
12 Junction 213.360 5.000e-04 POINT (50.00000 70.00000)
13 Junction 211.836 5.000e-04 POINT (70.00000 70.00000)
21 Junction 213.360 5.000e-04 POINT (30.00000 40.00000)
elevation initial_quality geometry
name
10 216.408 5.000e-04 POINT (20.00000 70.00000)
11 216.408 5.000e-04 POINT (30.00000 70.00000)
12 213.360 5.000e-04 POINT (50.00000 70.00000)
13 211.836 5.000e-04 POINT (70.00000 70.00000)
21 213.360 5.000e-04 POINT (30.00000 40.00000)

Each GeoDataFrame contains attributes and geometry:

@@ -341,23 +341,23 @@ and then translates the GeoDataFrames coordinates to EPSG:3857.

>>> wn_gis = wntr.network.to_gis(wn, crs='EPSG:4326')
>>> print(wn_gis.junctions.head())
node_type elevation initial_quality geometry
name
10 Junction 216.408 5.000e-04 POINT (20.00000 70.00000)
11 Junction 216.408 5.000e-04 POINT (30.00000 70.00000)
12 Junction 213.360 5.000e-04 POINT (50.00000 70.00000)
13 Junction 211.836 5.000e-04 POINT (70.00000 70.00000)
21 Junction 213.360 5.000e-04 POINT (30.00000 40.00000)
elevation initial_quality geometry
name
10 216.408 5.000e-04 POINT (20.00000 70.00000)
11 216.408 5.000e-04 POINT (30.00000 70.00000)
12 213.360 5.000e-04 POINT (50.00000 70.00000)
13 211.836 5.000e-04 POINT (70.00000 70.00000)
21 213.360 5.000e-04 POINT (30.00000 40.00000)

>>> wn_gis.to_crs('EPSG:3857')
>>> print(wn_gis.junctions.head())
node_type elevation initial_quality geometry
name
10 Junction 216.408 5.000e-04 POINT (2226389.816 11068715.659)
11 Junction 216.408 5.000e-04 POINT (3339584.724 11068715.659)
12 Junction 213.360 5.000e-04 POINT (5565974.540 11068715.659)
13 Junction 211.836 5.000e-04 POINT (7792364.356 11068715.659)
21 Junction 213.360 5.000e-04 POINT (3339584.724 4865942.280)
elevation initial_quality geometry
name
10 216.408 5.000e-04 POINT (2226389.816 11068715.659)
11 216.408 5.000e-04 POINT (3339584.724 11068715.659)
12 213.360 5.000e-04 POINT (5565974.540 11068715.659)
13 211.836 5.000e-04 POINT (7792364.356 11068715.659)
21 213.360 5.000e-04 POINT (3339584.724 4865942.280)

Snap point geometries to the nearest point or line
----------------------------------------------------
39 changes: 21 additions & 18 deletions documentation/model_io.rst
@@ -206,27 +206,29 @@ GeoJSON files
GeoJSON files are commonly used to store geographic data structures.
More information on GeoJSON files can be found at https://geojson.org.

To use GeoJSON files in WNTR, a set of valid base column names are required.
Valid base GeoJSON column names can be obtained using the
:class:`~wntr.network.io.valid_gis_names` function.
The following example returns valid base GeoJSON column names for junctions.
When reading GeoJSON files into WNTR, only a set of valid column names can be used.
Valid GeoJSON column names can be obtained using the
:class:`~wntr.network.io.valid_gis_names` function. By default, the function
returns all column names, both required and optional.
The following example returns valid GeoJSON column names for junctions.

.. doctest::
:skipif: gpd is None

>>> geojson_column_names = wntr.network.io.valid_gis_names()
>>> print(geojson_column_names['junctions'])
['name', 'elevation', 'coordinates', 'emitter_coefficient', 'initial_quality', 'minimum_pressure', 'required_pressure', 'pressure_exponent', 'tag']
['name', 'elevation', 'geometry', 'emitter_coefficient', 'initial_quality', 'minimum_pressure', 'required_pressure', 'pressure_exponent', 'tag']

A minimal list of valid column names can also be obtained by setting ``complete_list`` to False.
Column names that are optional (i.e., ``initial_quality``) and not included in the GeoJSON file are defined using default values.
A minimal list of required column names can also be obtained by setting ``complete_list`` to False.
Column names that are optional (i.e., ``initial_quality``) and not included in the GeoJSON file are
defined using default values.

.. doctest::
:skipif: gpd is None

>>> geojson_column_names = wntr.network.io.valid_gis_names(complete_list=False)
>>> print(geojson_column_names['junctions'])
['name', 'elevation', 'coordinates']
['name', 'elevation', 'geometry']

Note that GeoJSON files can contain additional custom column names that are assigned to WaterNetworkModel objects.

@@ -253,7 +255,7 @@ Note that patterns, curves, sources, controls, and options are not stored in the

The :class:`~wntr.network.io.read_geojson` function creates a WaterNetworkModel from a
dictionary of GeoJSON files.
Valid base column names and additional custom attributes are added to the model.
Valid column names and additional custom attributes are added to the model.
The function can also be used to append information from GeoJSON files into an existing WaterNetworkModel.

.. doctest::
@@ -300,20 +302,21 @@ To use Esri Shapefiles in WNTR, several formatting requirements are enforced:
assumed that the first 10 characters of each attribute are unique.

* To create WaterNetworkModel from Shapefiles, a set of valid field names are required.
Valid base Shapefiles field names can be obtained using the
:class:`~wntr.network.io.valid_gis_names` function.
For Shapefiles, the `truncate` input parameter should be set to 10 (characters).
The following example returns valid base Shapefile field names for junctions.
Note that attributes like ``base_demand`` are truncated to ``base_deman``.
Valid Shapefiles field names can be obtained using the
:class:`~wntr.network.io.valid_gis_names` function. By default, the function
returns all column names, both required and optional.
For Shapefiles, the `truncate_names` input parameter should be set to 10 (characters).
The following example returns valid Shapefile field names for junctions.
Note that attributes like ``minimum_pressure`` are truncated to ``minimum_pr``.

.. doctest::
:skipif: gpd is None

>>> shapefile_field_names = wntr.network.io.valid_gis_names(truncate_names=10)
>>> print(shapefile_field_names['junctions'])
['name', 'elevation', 'coordinate', 'emitter_co', 'initial_qu', 'minimum_pr', 'required_p', 'pressure_e', 'tag']
['name', 'elevation', 'geometry', 'emitter_co', 'initial_qu', 'minimum_pr', 'required_p', 'pressure_e', 'tag']

A minimal list of valid field names can also be obtained by setting ``complete_list`` to False.
A minimal list of required field names can also be obtained by setting ``complete_list`` to False.
Field names that are optional (i.e., ``initial_quality``) and not included in the Shapefile are defined using default values.

.. doctest::
@@ -322,7 +325,7 @@ To use Esri Shapefiles in WNTR, several formatting requirements are enforced:
>>> shapefile_field_names = wntr.network.io.valid_gis_names(complete_list=False,
... truncate_names=10)
>>> print(shapefile_field_names['junctions'])
['name', 'elevation', 'coordinate']
['name', 'elevation', 'geometry']

* Shapefiles can contain additional custom field names that are assigned to WaterNetworkModel objects.
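The 10-character truncation described in this hunk can be illustrated with plain string slicing. This is a minimal sketch rather than WNTR's implementation: `truncate_field_names` is a hypothetical helper, and the junction names are copied from the doctest output shown above.

```python
def truncate_field_names(names, truncate_names=None):
    """Truncate each name to `truncate_names` characters (hypothetical helper)."""
    if truncate_names is None or truncate_names <= 0:
        # No truncation requested: return the names unchanged
        return list(names)
    return [name[:truncate_names] for name in names]


# Junction column names taken from the documentation example above
junction_names = ['name', 'elevation', 'geometry', 'emitter_coefficient',
                  'initial_quality', 'minimum_pressure', 'required_pressure',
                  'pressure_exponent', 'tag']

short_names = truncate_field_names(junction_names, truncate_names=10)
print(short_names)

# The Shapefile workflow assumes the truncated names remain unique
assert len(set(short_names)) == len(short_names)
```

Checking uniqueness after truncation is a cheap way to confirm that the "first 10 characters are unique" assumption holds for any custom field names added to a Shapefile.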

@@ -349,7 +352,7 @@ Note that patterns, curves, sources, controls, and options are not stored in the

The :class:`~wntr.network.io.read_shapefile` function creates a WaterNetworkModel from a dictionary of
Shapefile directories.
Valid base field names and additional custom field names are added to the model.
Valid field names and additional custom field names are added to the model.
The function can also be used to append information from Shapefiles into an existing WaterNetworkModel.

.. doctest::
67 changes: 53 additions & 14 deletions wntr/gis/network.py
@@ -99,14 +99,21 @@ def _create_gis(self, wn, crs: str = None, pumps_as_points: bool = False,
Represent valves as points (True) or lines (False), by default False
"""

def _extract_geodataframe(df, crs=None, links_as_points=False):
# Drop any column with all NaN
def _extract_geodataframe(df, crs=None, valid_base_names=None,
links_as_points=False):
if valid_base_names is None:
valid_base_names = []

# Drop any column with all NaN, this removes excess attributes
# Valid base attributes that have all None values are added back
# at the end of this routine
df = df.loc[:, ~df.isna().all()]

# Define geom and drop node_type/link_type
if df.shape[0] > 0:
# Define geom
if 'node_type' in df.columns:
geom = [Point((x,y)) for x,y in df['coordinates']]
del df['node_type']
elif 'link_type' in df.columns:
geom = []
for link_name in df['name']:
@@ -120,24 +127,33 @@ def _extract_geodataframe(df, crs=None, links_as_points=False):
ls.append(v)
ls.append(link.end_node.coordinates)
geom.append(LineString(ls))
del df['link_type']

# Drop column if not a str, float, int, or bool
# Drop column if not a str, float, int, or bool (or np.bool_)
# This drops columns like coordinates, vertices
# This could be extended to keep additional data type (list,
# tuple, network elements like Patterns, Curves)
drop_cols = []
for col in df.columns:
if not isinstance(df.iloc[0][col], (str, float, int, bool)):
# Added np.bool_ to the following check
# Returned by df.to_dict('records') for some network models
if not isinstance(df.iloc[0][col], (str, float, int, bool, np.bool_)):
drop_cols.append(col)
df = df.drop(columns=drop_cols)

# Add back in valid base attributes that had all None values
cols = list(set(valid_base_names) - set(df.columns))
if len(cols) > 0:
df[cols] = None

# Set index
if len(df) > 0:
df.set_index('name', inplace=True)

df = gpd.GeoDataFrame(df, crs=crs, geometry=geom)
else:
df = gpd.GeoDataFrame()

return df

# Convert the WaterNetworkModel to a dictionary
@@ -146,29 +162,31 @@ def _extract_geodataframe(df, crs=None, links_as_points=False):
df_nodes = pd.DataFrame(wn_dict['nodes'])
df_links = pd.DataFrame(wn_dict['links'])

valid_base_names = self._valid_names(complete_list=False, truncate_names=None)

# Junctions
df = df_nodes[df_nodes['node_type'] == 'Junction']
self.junctions = _extract_geodataframe(df, crs)
self.junctions = _extract_geodataframe(df, crs, valid_base_names['junctions'])

# Tanks
df = df_nodes[df_nodes['node_type'] == 'Tank']
self.tanks = _extract_geodataframe(df, crs)
self.tanks = _extract_geodataframe(df, crs, valid_base_names['tanks'])

# Reservoirs
df = df_nodes[df_nodes['node_type'] == 'Reservoir']
self.reservoirs = _extract_geodataframe(df, crs)
self.reservoirs = _extract_geodataframe(df, crs, valid_base_names['reservoirs'])

# Pipes
df = df_links[df_links['link_type'] == 'Pipe']
self.pipes = _extract_geodataframe(df, crs, False)
self.pipes = _extract_geodataframe(df, crs, valid_base_names['pipes'], False)

# Pumps
df = df_links[df_links['link_type'] == 'Pump']
self.pumps = _extract_geodataframe(df, crs, pumps_as_points)
self.pumps = _extract_geodataframe(df, crs, valid_base_names['pumps'], pumps_as_points)

# Valves
df = df_links[df_links['link_type'] == 'Valve']
self.valves = _extract_geodataframe(df, crs, valves_as_points)
self.valves = _extract_geodataframe(df, crs, valid_base_names['valves'], valves_as_points)

def _create_wn(self, append=None):
"""
@@ -187,22 +205,32 @@ def _create_wn(self, append=None):
wn_dict['nodes'] = []
wn_dict['links'] = []

for element in [self.junctions, self.tanks, self.reservoirs]:
# Modifications to create a WaterNetworkModel from a dict
# Reset index
# Create coordinates/vertices from geometry
# Add node_type/link_type
for node_type, element in [('Junction', self.junctions),
('Tank', self.tanks),
('Reservoir', self.reservoirs)]:
if element.shape[0] > 0:
assert (element['geometry'].geom_type).isin(['Point']).all()
df = element.reset_index(names="name")
df.rename(columns={'geometry':'coordinates'}, inplace=True)
df['coordinates'] = [[x,y] for x,y in zip(df['coordinates'].x,
df['coordinates'].y)]
df['node_type'] = node_type
wn_dict['nodes'].extend(df.to_dict('records'))

for element in [self.pipes, self.pumps, self.valves]:
for link_type, element in [('Pipe', self.pipes),
('Pump', self.pumps),
('Valve', self.valves)]:
if element.shape[0] > 0:
assert 'start_node_name' in element.columns
assert 'end_node_name' in element.columns
df = element.reset_index(names="name")
df['vertices'] = df.apply(lambda row: list(row.geometry.coords)[1:-1], axis=1)
df.drop(columns=['geometry'], inplace=True)
df['link_type'] = link_type
wn_dict['links'].extend(df.to_dict('records'))

# Create WaterNetworkModel from dictionary
@@ -470,6 +498,17 @@ def _valid_names(self, complete_list=True, truncate_names=None):
if truncate_names is not None and truncate_names > 0:
for element, attributes in valid_names.items():
valid_names[element] = [attribute[:truncate_names] for attribute in attributes]

for key, vals in valid_names.items():
# Remove coordinates and vertices (not used to create GeoDataFrame geometry)
if 'coordinates' in valid_names[key]:
valid_names[key].remove('coordinates')
if 'vertices' in valid_names[key]:
valid_names[key].remove('vertices')

# Add geometry
if 'geometry' not in valid_names[key]:
valid_names[key].append('geometry')

return valid_names
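
The name clean-up added in this hunk can be sketched on a toy dictionary. The helper name `swap_in_geometry` and the example name lists are assumptions made for illustration; the real lists come from WNTR's element classes. Unlike the in-place `remove`/`append` calls above, this sketch builds new lists so the input is left untouched.

```python
def swap_in_geometry(valid_names):
    """Drop 'coordinates'/'vertices' and make sure 'geometry' is listed.

    Hypothetical stand-alone helper; returns a new dict of new lists.
    """
    result = {}
    for key, names in valid_names.items():
        # coordinates and vertices are not GeoDataFrame columns
        cleaned = [n for n in names if n not in ('coordinates', 'vertices')]
        # geometry replaces them as the spatial column
        if 'geometry' not in cleaned:
            cleaned.append('geometry')
        result[key] = cleaned
    return result


# Illustrative inputs only, not WNTR's actual attribute lists
example = {
    'junctions': ['name', 'elevation', 'coordinates'],
    'pipes': ['name', 'start_node_name', 'end_node_name', 'vertices'],
}
updated = swap_in_geometry(example)
print(updated)
```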

4 changes: 2 additions & 2 deletions wntr/network/elements.py
@@ -394,7 +394,7 @@ class Tank(Node):
"min_level",
"max_level",
"diameter",
"min_vol"
"min_vol",
"vol_curve_name",
"overflow",
"coordinates"]
@@ -1041,7 +1041,7 @@ class Pump(Link):
"end_node_name",
"pump_type",
"pump_curve_name",
"power"
"power",
"base_speed",
"speed_pattern_name",
"initial_status"]
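
The missing commas fixed in this file matter because Python concatenates adjacent string literals, so two attribute names silently merge into one list entry. The short lists below are trimmed-down illustrations of that behaviour, not the full attribute lists from `elements.py`.

```python
# Missing comma: the two adjacent literals are joined into a single string
buggy = ["diameter",
         "min_vol"
         "vol_curve_name",
         "overflow"]

# With the comma restored, each attribute is its own entry
fixed = ["diameter",
         "min_vol",
         "vol_curve_name",
         "overflow"]

print(buggy)  # ['diameter', 'min_volvol_curve_name', 'overflow']
print(len(buggy), len(fixed))  # 3 4
```

No error is raised in the buggy case, which is why this kind of mistake tends to surface only when a downstream check looks up one of the merged names.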
5 changes: 3 additions & 2 deletions wntr/network/io.py
@@ -644,12 +644,13 @@ def valid_gis_names(complete_list=True, truncate_names=None):
Valid column/field names for GeoJSON or Shapefiles
Note that Shapefile field names are truncated to 10 characters
(set truncate=10)
(set truncate_names=10)
Parameters
----------
complete_list : bool
Include a complete list of column/field names (beyond basic attributes)
When true, returns both optional and required column/field names.
When false, only returns required column/field names.
truncate_names : None or int
Truncate column/field names to specified number of characters,
set truncate=10 for Shapefiles. None indicates no truncation.