Topological metrics #74

barneydobson · 2024-03-06T13:52:04Z

Description

Place to implement topological metrics

Fixes #50

Summary of changes:

Removed some unused dependencies
Forked netcomp to make it compatible with networkx 3. This package implements the various graph similarity metrics from the paper suggested by @cheginit.
Implemented and tested these metrics in metric_utilities.
Added debug_topology to study whether these metrics are behaving sensibly.

Summary of debug_topology:

Download four street network graphs (using prepare_data.py/download_street): a small graph, a slightly larger graph that contains the small one, a larger graph again that contains both, and finally another larger graph that does not overlap the first three.
Test all graphs for all topology metrics and plot the distances between them.
I would expect the small graph to be closest to the medium graph, then the large graph, and then the separate one.
Here are the heatmaps of the various metrics:

Observations:
In all cases, the medium graph and the smallest graph are closest to each other - good!
Under nc_laplacian_dist, nc_laplacian_norm_dist, nc_adjacency_dist, and kstest_betweenness, the large and separate graphs are very close - not good! Additionally, these four seem to behave somewhat similarly.
nc_deltacon0 and nc_resistance_distance seem to behave similarly.
nc_vertex_edge_distance has very large values in general, but qualitatively it seems closer to the laplacian-type results.

Thus, provided we are happy that these are implemented properly, I think this covers different measures of topological distance quite well. In my tests these metrics are quick to calculate, but if they become prohibitively slow, we have evidence here to select perhaps only 2 or 3 that cover the broad categories of behaviour.
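The nested-graph sanity check described above can be sketched on toy data. This is only an illustration under stated assumptions: the grid graphs are hypothetical stand-ins for the downloaded street networks, and `laplacian_spectral_distance` is a hand-rolled helper (Euclidean distance between the `k` largest Laplacian eigenvalues), not the netcomp implementation used in metric_utilities. The non-overlapping fourth graph is left out because a toy grid has no meaningful equivalent.

```python
import networkx as nx
import numpy as np


def laplacian_spectral_distance(g1: nx.Graph, g2: nx.Graph, k: int = 5) -> float:
    """Euclidean distance between the k largest Laplacian eigenvalues.

    Hypothetical helper for illustration; k must not exceed the node
    count of the smaller graph.
    """

    def top_eigenvalues(g: nx.Graph) -> np.ndarray:
        adjacency = nx.to_numpy_array(g)
        # Combinatorial Laplacian: degree matrix minus adjacency matrix
        laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
        # eigvalsh returns eigenvalues in ascending order, so take the tail
        return np.linalg.eigvalsh(laplacian)[-k:]

    return float(np.linalg.norm(top_eigenvalues(g1) - top_eigenvalues(g2)))


# Assumed stand-ins: each grid contains the previous one as a subgraph
graphs = {
    "small": nx.grid_2d_graph(3, 3),
    "medium": nx.grid_2d_graph(4, 4),
    "large": nx.grid_2d_graph(5, 5),
}

# Pairwise distance table, the data behind one heatmap
distances = {
    (a, b): laplacian_spectral_distance(ga, gb)
    for a, ga in graphs.items()
    for b, gb in graphs.items()
}

for (a, b), d in distances.items():
    print(f"{a:>6} vs {b:<6}: {d:.3f}")
```

On these toy grids the small graph is closer to the medium graph than to the large one, which is the ordering the check looks for. Real street networks may of course behave differently.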

cheginit · 2024-03-08T17:04:34Z

Very nice comparison!

Perhaps you can compute and compare these metrics for the GRIP and OSM datasets over the same bounding box.

cheginit · 2024-03-09T00:03:39Z

For one of my projects, I wrote this function for computing betweenness centrality (BC) in parallel. It can speed up the computations by a factor of 3 or 4, depending on the complexity of the network:

import joblib
import networkx as nx
import cytoolz.curried as tlz
from collections import defaultdict


def edge_betweenness_centrality(
    G: nx.Graph, normalized: bool = True, weight: str = "weight", njobs: int = -1
) -> dict:
    """Compute edge betweenness centrality in parallel over chunks of source nodes."""
    # Use all physical cores when njobs is -1
    njobs = joblib.cpu_count(True) if njobs == -1 else njobs
    # Guard against a chunk size of zero when the graph has fewer nodes than jobs
    chunk_size = max(1, G.order() // njobs)
    node_chunks = tlz.partition_all(chunk_size, G.nodes())
    bt_func = tlz.partial(nx.edge_betweenness_centrality_subset, G=G, normalized=normalized, weight=weight)
    bt_sc = joblib.Parallel(n_jobs=njobs)(
        joblib.delayed(bt_func)(sources=nodes, targets=G.nodes()) for nodes in node_chunks
    )

    # Merge the per-chunk results by summing the contribution of each edge
    bt_c = defaultdict(float)
    for bt in bt_sc:
        for n, v in bt.items():
            bt_c[n] += v
    return bt_c

Also, there's a library called graph-tool that is freakishly fast, but the caveat is that it's not available on PyPI and can only be installed from conda-forge.

In one of my tests, the BC computation took 20 min with networkx, 4 min with the parallel version, about 18 sec with networkit, and less than a second with graph-tool!

barneydobson · 2024-03-11T13:34:16Z

OK, I will make an issue, #80, for graph-tool. I swapped the networkx.betweenness_centrality function for your function above, and the results in the tests are slightly different (your function = 0.38995, networkx = 0.2862). Is that a problem?

barneydobson · 2024-03-11T13:42:55Z

@cheginit Oh, I think I have used betweenness_centrality instead of edge_betweenness_centrality. I didn't realise there was a difference; I guess that explains why the values are different.

cheginit · 2024-03-11T13:43:21Z

Note that this is for edge BC; for node BC, you need to change the function. So you should compare it with nx.edge_betweenness_centrality. I have a test suite for this, so there shouldn't be an issue.
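The comparison suggested above can be sketched without joblib. This is a minimal serial illustration, not the test suite mentioned in the comment: `chunked_edge_betweenness` is a hypothetical helper that sums `nx.edge_betweenness_centrality_subset` over chunks of source nodes, and the karate club graph is an assumed stand-in for a street network. It also shows why node BC and edge BC cannot be compared directly: one is keyed by nodes, the other by edges.

```python
import math

import networkx as nx


def chunked_edge_betweenness(G: nx.Graph, n_chunks: int = 4) -> dict:
    """Sum edge betweenness over chunks of source nodes (serial sketch)."""
    nodes = list(G.nodes())
    size = max(1, math.ceil(len(nodes) / n_chunks))
    chunks = [nodes[i : i + size] for i in range(0, len(nodes), size)]

    total = dict.fromkeys(G.edges(), 0.0)
    for sources in chunks:
        # Each call only counts shortest paths that start in this chunk
        part = nx.edge_betweenness_centrality_subset(
            G, sources=sources, targets=nodes, normalized=False
        )
        for edge, value in part.items():
            total[edge] += value
    return total


G = nx.karate_club_graph()  # assumed toy graph, undirected

node_bc = nx.betweenness_centrality(G, normalized=False)  # keyed by node
edge_bc = nx.edge_betweenness_centrality(G, normalized=False)  # keyed by edge
chunked = chunked_edge_betweenness(G)

# Largest absolute difference between the chunked sum and the direct result
max_diff = max(abs(chunked[e] - edge_bc[e]) for e in G.edges())
print(f"max difference vs nx.edge_betweenness_centrality: {max_diff:.2e}")
```

The chunked sum should agree with `nx.edge_betweenness_centrality` up to floating-point error, whereas `nx.betweenness_centrality` returns a different quantity altogether.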

barneydobson · 2024-03-11T13:43:46Z

Yep yep!

Dobson and others added 22 commits February 29, 2024 11:41

Implement outlet matching

44f304f

-Update and test KStest

Implement outlet matching

1c829e9

-Update and test KStest -Add and test flow/flooding for a 'dominant' outlet

Merge branch 'main' into outlet_match_metric

74bbfb9

Merge branch 'main' into outlet_match_metric

bf69ed7

Merge branch 'main' into outlet_match_metric

2bf7d17

Update metric_utilities.py

d8bcd0d

update to merge with `metric_format`

Merge branch 'main' into outlet_match_metric

792f3ae

Update metric_utilities.py

1cbd9b8

revert kstest_betweenness

Update metric_utilities.py

14b2ce4

try and make the diff tidier

Update test_metric_utilities.py

e16dac3

Revert kstest test

Update test_metric_utilities.py

6dd0e65

typo

Update metric_utilities.py

d1eda73

refactor

Update metric_utilities.py

d9ba711

move utiltiies up

Update metric_utilities.py

03fb75b

keep ks betweenness the same

Update test_metric_utilities.py

2a5f9a0

Update metric_utilities.py

a99950c

Update metric_utilities.py

8465bb9

proper subgraph values->to_numpy

Update metric_utilities.py

3699dd6

avoid intersects

Ensure align_calc_nse dates are consistent format

f12a790

Add netcomp metrics

927c74b

Merge branch 'outlet_match_metric' into topological_metrics

e22020a

test new metrics

1c7b955

barneydobson added the sa_paper Sensitivity analysis paper label Mar 7, 2024

barneydobson self-assigned this Mar 7, 2024

barneydobson mentioned this pull request Mar 11, 2024

Discuss whether to use graph_tool #80

Open

barneydobson added 2 commits March 11, 2024 14:13

BC changes

64d7b03

-Use taher's new function for edge betweenness -introduce new metric for edge betweenness (in contrast to node betweenness) -update requirements

Merge branch 'main' into topological_metrics

893189b

barneydobson mentioned this pull request Mar 11, 2024

Enable different underlying datasets #81

Closed

barneydobson added 4 commits March 11, 2024 14:19

Update metric_utilities.py

eb5a601

formatting

Merge branch 'topological_metrics' of https://github.com/ImperialColl…

ee4e500

…egeLondon/SWMManywhere into topological_metrics

Update pyproject.toml

bdcbd8b

formatting

Update metric_utilities.py

4cf2675

more formatting...

barneydobson merged commit 2fe0f62 into main Mar 11, 2024
10 checks passed

barneydobson deleted the topological_metrics branch March 11, 2024 14:31
