title | emoji | colorFrom | colorTo | sdk | sdk_version | app_file | pinned |
---|---|---|---|---|---|---|---|
Easygraph Bench | 📉 | yellow | indigo | gradio | 3.23.0 | app.py | false |
This repository includes code for benchmarking the performance of graph libraries including easygraph.
Objective 1: benchmarking code that compares the performance of 2 graph libraries, easygraph (with and without the C++ binding) and networkx.
Objective 2: benchmarking code that compares the performance of 6 graph libraries (easygraph, networkx, igraph, graphtool, networkit, snap-stanford) on these methods: loading, 2-hops, shortest path, k-core, page rank, strongly connected components.
timeit.Timer.autorange is used to run the specified methods on the graph objects.
If the method returns a Generator, the result will be exhausted.
See get_Timer_args() for more details.
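As a rough illustration of the timing approach described above, the sketch below wraps a call in timeit.Timer, drains generator results, and lets autorange pick the loop count. The helper names exhaust, time_call and lazy_squares are hypothetical and are not taken from this repository; get_Timer_args() remains the reference for what the benchmark actually does.

```python
import timeit
from collections import deque
from types import GeneratorType


def exhaust(result):
    """Consume generators so that lazy work is included in the timing."""
    if isinstance(result, GeneratorType):
        deque(result, maxlen=0)  # drain the generator without storing items
    return result


def time_call(func, *args):
    """Return average seconds per call; autorange chooses the loop count."""
    timer = timeit.Timer(lambda: exhaust(func(*args)))
    loops, total_seconds = timer.autorange()
    return total_seconds / loops


def lazy_squares(n):
    """Hypothetical stand-in for a graph method that returns a generator."""
    return (i * i for i in range(n))


per_call = time_call(lazy_squares, 1000)
print(f"{per_call:.2e} s per call")
```

Without the exhaust step, timing a generator-returning method would only measure the creation of the generator object, not the work it performs.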
See config.py for more details.
- clustering_methods: ["average_clustering", "clustering"] (eg.average_clustering vs nx.average_clustering, ...)
- shortest_path_methods: [('Dijkstra', 'single_source_dijkstra_path')] (eg.Dijkstra vs nx.single_source_dijkstra_path)
- connected_components_methods: ["is_biconnected", "biconnected_components"]
- mst_methods: ['minimum_spanning_tree'] (C++ binding not supported for this method yet.)
- other_methods: ['density', 'constraint']
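The method groups above mix plain names (same function name in both libraries) with tuples (easygraph name, networkx name). A minimal sketch of how such entries could be normalized into pairs follows; resolve_method_pair is a hypothetical helper, and the way config.py entries are actually consumed is an assumption here.

```python
def resolve_method_pair(entry):
    """Return (easygraph_name, networkx_name) for one method-group entry."""
    if isinstance(entry, tuple):
        eg_name, nx_name = entry  # names differ between the two libraries
    else:
        eg_name = nx_name = entry  # same name in both libraries
    return eg_name, nx_name


# Values copied from the groups listed above.
clustering_methods = ["average_clustering", "clustering"]
shortest_path_methods = [("Dijkstra", "single_source_dijkstra_path")]

pairs = [resolve_method_pair(e) for e in clustering_methods + shortest_path_methods]
for eg_name, nx_name in pairs:
    print(f"eg.{eg_name} vs nx.{nx_name}")
```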
Source: graph-benchmark-code.yaml
The data in this YAML file (except the easygraph-related part) was extracted from timlrx's graph-benchmark repository with semgrep.
See the timlrx directory for more details.
easygraph:
  loading: '''read_edgelist(filename, delimiter="\t", nodetype=int, create_using=eg.DiGraph()).cpp()'''
  loading_undirected: '''read_edgelist(filename, delimiter="\t", nodetype=int, create_using=eg.Graph()).cpp()'''
  # page rank: "'pagerank(g)'"
  shortest path: f'Dijkstra(g, {nodeid})'
  strongly connected components: "'[i for i in strongly_connected_components(g)]'"
graphtool:
  2-hops: '"shortest_distance(g, g.vertex(0), max_dist=2).a"'
  k-core: "'kcore_decomposition(g).a'"
  loading:
    '''''''load_graph_from_csv(filename, directed=True, csv_options={''delimiter'':
    ''\t'', ''quotechar'': ''"''})'''''''
  loading_undirected:
    '''''''load_graph_from_csv(filename, directed=False, csv_options={''delimiter'':
    ''\t'', ''quotechar'': ''"''})'''''''
  page rank: "'pagerank(g, damping=0.85, epsilon=1e-3, max_iter=10000000).a'"
  shortest path: '"shortest_distance(g, g.vertex(0)).a"'
  strongly connected components:
    "'cc, _ = label_components(g, vprop=None, directed=True,
    attractors=False); cc.a'"
igraph:
  k-core: '"g.coreness(mode=''all'')"'
  loading: '"Graph.Read(filename, format=''edges'')"'
  loading_undirected: '"Graph.Read(filename, format=''edges'', directed=False)"'
  page rank: '"g.pagerank(damping=0.85)"'
  shortest path: '"g.shortest_paths([g.vs[0]])"'
  strongly connected components: '"[i for i in g.components(mode=STRONG)]"'
networkit:
  k-core: '"nk.centrality.CoreDecomposition(g).run().scores()"'
  loading:
    '"nk.graphio.EdgeListReader(separator=''\t'', firstNode=0, continuous=True,
    directed =True).read(filename)"'
  loading_undirected: '"nk.graphio.EdgeListReader(separator=''\t'', firstNode=0, continuous=True).read(filename)"'
  page rank: '"nk.centrality.PageRank(g, damp=0.85, tol=1e-3).run().scores()"'
  shortest path: '"nk.distance.BFS(g, 0, storePaths=False).run().getDistances(False)"'
  strongly connected components: '"nk.components.StronglyConnectedComponents(g).run().getPartition().getVector()"'
networkx:
  2-hops: f'single_source_shortest_path_length(g, {nodeid}, cutoff=2)'
  k-core: "'core.core_number(g)'"
  loading: '''read_edgelist(filename, delimiter="\t", nodetype=int, create_using=nx.DiGraph())'''
  loading_undirected: '''read_edgelist(filename, delimiter="\t", nodetype=int, create_using=nx.Graph())'''
  page rank: "'pagerank(g, alpha=0.85, tol=1e-3, max_iter=10000000)'"
  shortest path: f'shortest_path_length(g, {nodeid})'
  strongly connected components: "'[i for i in strongly_connected_components(g)]'"
snap:
  2-hops: '"snap.GetNodesAtHop(g, 0, 2, NodeVec, True)"'
  k-core: '"snap.GetKCoreNodes(g, CoreIDSzV)"'
  loading: '"snap.LoadEdgeListStr(snap.PNGraph, filename, 0, 1)"'
  page rank: '"snap.GetPageRank(g, PRankH, 0.85, 1e-3, 10000000)"'
  shortest path: '"snap.GetShortPath(g, 0, NIdToDistH, True)"'
  strongly connected components: '"snap.GetSccs(g, Components)"'
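Most values in the YAML above are Python string literals captured as source text, so each entry carries an extra layer of quotes. One possible way to turn such an entry back into a statement and time it is sketched below. This is an assumption about how the entries could be used, not a description of the repository's scripts; unwrap_literal is a hypothetical helper, and a plain list stands in for a graph object so the example runs without any graph library. Entries written as f-strings (such as the {nodeid} ones) are not plain literals and would need separate handling.

```python
import ast
import timeit


def unwrap_literal(source_text):
    """Evaluate extracted source text such as '"sorted(g)"' into the statement it holds."""
    value = ast.literal_eval(source_text)
    if not isinstance(value, str):
        raise TypeError("expected a string literal")
    return value


# After YAML parsing, an entry would look like this: a quoted Python literal.
extracted = '"sorted(g)"'
stmt = unwrap_literal(extracted)

# Time the statement against a namespace that provides the name g.
timer = timeit.Timer(stmt, globals={"g": [3, 1, 2]})
elapsed = timer.timeit(number=1000)
print(stmt, f"{elapsed:.2e} s for 1000 runs")
```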
python >= 3.10 is required.
First, you need to download datasets manually or with a script like this one.
To run these scripts, you need to clone the repo and install the dependencies listed in requirements.txt.
To install easygraph:
As of 8/6/2022, a wheel for python-easygraph is not available on PyPI, so you need to build and install the module yourself by running the following commands.
git clone https://github.com/easy-graph/Easy-Graph && cd Easy-Graph && git checkout pybind11
pip install pybind11
python3 setup.py install
To install the other 5 graph libraries (four from conda-forge, networkx from pip), run
conda install -c conda-forge python-igraph graph-tool networkit snap-stanford -y
python3 -m pip install networkx
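After installation, a quick check such as the following sketch can report which of the six libraries are importable in the current environment. The import names in the mapping are the usual module names for these packages (for example, graph-tool imports as graph_tool and snap-stanford as snap); check_installed is a hypothetical helper, not part of this repository.

```python
import importlib.util

# Library label used in this README -> module name used in an import statement.
IMPORT_NAMES = {
    "easygraph": "easygraph",
    "networkx": "networkx",
    "igraph": "igraph",
    "graphtool": "graph_tool",
    "networkit": "networkit",
    "snap-stanford": "snap",
}


def check_installed(import_names=IMPORT_NAMES):
    """Return {label: True/False} depending on whether each module can be found."""
    return {
        label: importlib.util.find_spec(module) is not None
        for label, module in import_names.items()
    }


status = check_installed()
for label, found in status.items():
    print(f"{label:15s} {'ok' if found else 'missing'}")
```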
You can run benchmarking on a single dataset with the ./bench_*.py scripts, on a set of datasets with the ./entrypoint_*.py scripts, or on all of them with ./bench.sh.
You can run benchmarking on a single dataset for a single library with the ./profile_*.py scripts, or use the convenience script ./profile_entrypoint.sh to profile a set of datasets for the 6 libraries (you may need to adjust the dataset locations for this script to work).
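If you prefer to drive the profile scripts from Python rather than from ./profile_entrypoint.sh, the command lines can be assembled as in the sketch below. It assumes the script naming seen in this README (profile_<tool>.py, with an _undirected suffix for the undirected variant) applies to every tool, which is not verified here; build_profile_command is a hypothetical helper.

```python
import shlex


def build_profile_command(tool, dataset_path, iterations=None, undirected=False):
    """Build the argument list for one profile script invocation."""
    suffix = "_undirected" if undirected else ""
    command = [f"./profile_{tool}{suffix}.py", dataset_path]
    if iterations is not None:
        command += ["-n", str(iterations)]  # omit -n to let the script decide
    return command


cmd = build_profile_command("igraph", "my_dataset/my_network.edgelist", iterations=1000)
print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True) from the repository root.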
# Download datasets
cp scripts/download_data.sh . && bash download_data.sh
# Get LCC datasets from the downloaded datasets
./get_lcc_edgelist.py
# Generate the scripts, only bench what you want to bench
gen-scripts-20230328-directed-only:
./gen_profile_scripts_with_suffix_wrapper.py '20230328-pagerank-scc-directed-only' --tools 'igraph' 'easygraph' --methods 'page rank' 'strongly connected components' --directed-datasets-only
# For Objective 1
# the ./bench_*.py scripts are for benchmarking easygraph and networkx on a single or all datasets
$ ./bench_cheminfo.py --help
ENZYMES_g1: nodes: 37 edges: 84
usage: bench_cheminfo.py [-h] [-G {clustering,shortest-path,connected-components,mst} [{clustering,shortest-path,connected-components,mst} ...]] [-C]
EasyGraph & NetworkX side-by-side benchmarking
optional arguments:
-h, --help show this help message and exit
-G {clustering,shortest-path,connected-components,mst} [{clustering,shortest-path,connected-components,mst} ...], --method-group {clustering,shortest-path,connected-components,mst} [{clustering,shortest-path,connected-components,mst} ...]
-C, --skip-cpp-easygraph, --skip-ceg
Skip benchmarking cpp_easygraph methods (default: False)
# for Objective 2
# the ./profile_*.py scripts are for profiling one graph library on a dataset of your choice.
# examples:
# ./profile_igraph.py my_dataset/my_network.edgelist -n 1000
# runs the benchmarked igraph methods on my_network.edgelist 1000 times; the dataset is read as a directed graph.
# ./profile_networkx_undirected.py bio.edgelist
# reads bio.edgelist as an undirected graph.
$ ./profile_easygraph.py D
usage: profile_easygraph.py [-h] [-n INT] PATH
Benchmark easygraph
positional arguments:
PATH path to the dataset file in tab-separated edgelist format
options:
-h, --help show this help message and exit
-n INT, --iteration INT
iteration count when benchmarking, auto-determined if unspecified (default: None)
Fork this repo, go to the Actions tab and click Run Workflow.
timeit results are saved in csv files, and seaborn is used to render and save the figures in the images/ directory.
Image generation is slow; use -D or --skip-draw when running ./bench_*.py to skip image generation.
See dataset_loaders.py and dataset for details.
The er_* Erdos-Renyi random graphs are generated with eg.erdos_renyi_P(), available here.
Dataset Name | nodes | edges | is_directed | average_degree | density | type |
---|---|---|---|---|---|---|
cheminformatics | 37 | 168 | True | 9.08108108108108 | 0.12612612612612611 | easygraph.classes.directed_graph.DiGraph |
cheminformatics_lcc | 37 | 84 | False | 4.54054054054054 | 0.12612612612612611 | networkx.classes.graph.Graph |
eco | 1258 | 7619 | False | 12.112877583465819 | 0.009636338570776308 | networkx.classes.graph.Graph |
eco_lcc | 1258 | 7619 | False | 12.112877583465819 | 0.009636338570776308 | networkx.classes.graph.Graph |
bio | 1458 | 1948 | False | 2.672153635116598 | 0.0018340107310340413 | easygraph.classes.graph.Graph |
bio_lcc | 1458 | 1948 | False | 2.672153635116598 | 0.0018340107310340413 | networkx.classes.graph.Graph |
road_sampled | 2075 | 1132 | False | 1.0910843373493977 | 0.0005260773082687548 | networkx.classes.graph.Graph |
 | 4039 | 88234 | True | 43.69101262688784 | 0.0054099817517196435 | networkx.classes.digraph.DiGraph |
facebook_lcc | 4039 | 88234 | False | 43.69101262688784 | 0.010819963503439287 | networkx.classes.graph.Graph |
coauthorship_sampled | 4340 | 6398 | False | 2.9483870967741934 | 0.0006795084343798557 | networkx.classes.graph.Graph |
uspowergrid | 4941 | 6594 | False | 2.66909532483303 | 0.0005403026973346214 | networkx.classes.graph.Graph |
uspowergrid_lcc | 4941 | 6594 | False | 2.66909532483303 | 0.0005403026973346214 | networkx.classes.graph.Graph |
pgp_sampled | 6465 | 18906 | True | 5.848723897911833 | 0.00045240747972709105 | networkx.classes.digraph.DiGraph |
wikivote_lcc | 7066 | 100736 | False | 28.512878573450326 | 0.004035793145569756 | networkx.classes.graph.Graph |
wikivote | 7115 | 100762 | True | 29.146591707659873 | 0.0020485375110809584 | networkx.classes.digraph.DiGraph |
hepth_lcc | 8638 | 24827 | False | 5.748321370687659 | 0.0006655460658431931 | networkx.classes.graph.Graph |
pgp_undirected_sampled | 8781 | 51939 | False | 11.829859924837718 | 0.0013473644561318586 | networkx.classes.graph.Graph |
enron_sampled | 9301 | 79905 | False | 17.182023438339964 | 0.001847529401972039 | networkx.classes.graph.Graph |
hepth | 9877 | 25998 | False | 5.264351523742027 | 0.000533044909248889 | networkx.classes.graph.Graph |
condmat_lcc | 21363 | 91342 | False | 8.551420680616019 | 0.0004003099279382089 | networkx.classes.graph.Graph |
condmat | 23133 | 93497 | False | 8.083430596982666 | 0.0003494479766981958 | networkx.classes.graph.Graph |
enron_lcc | 33696 | 180811 | False | 10.731896961063628 | 0.0003185011711252004 | networkx.classes.graph.Graph |
enron | 36692 | 183831 | False | 10.020222391802028 | 0.00027309755503535 | networkx.classes.graph.Graph |
pgp | 39796 | 301498 | True | 15.15217609810031 | 0.00019037788790175037 | networkx.classes.digraph.DiGraph |
pgp_undirected | 39796 | 197150 | False | 9.908030957885215 | 0.00024897677994434515 | networkx.classes.graph.Graph |
pgp_lcc | 39796 | 197150 | False | 9.908030957885215 | 0.00024897677994434515 | networkx.classes.graph.Graph |
pgp_undirected_lcc | 39796 | 197150 | False | 9.908030957885215 | 0.00024897677994434515 | networkx.classes.graph.Graph |
road_lcc | 126146 | 161950 | False | 2.567659695907916 | 2.035482734874879e-05 | networkx.classes.graph.Graph |
road | 129164 | 165435 | False | 2.5616270787525934 | 1.9832514564949666e-05 | easygraph.classes.graph.Graph |
amazon | 262111 | 1234877 | True | 9.42254998836371 | 1.7974419114806206e-05 | networkx.classes.digraph.DiGraph |
amazon_lcc | 262111 | 899792 | False | 6.86573245685988 | 2.6194088195261075e-05 | networkx.classes.graph.Graph |
amazon_sampled | 262111 | 1234877 | True | 9.42254998836371 | 1.7974419114806206e-05 | networkx.classes.digraph.DiGraph |
coauthorship | 402392 | 1234019 | False | 6.1334171653512 | 1.5242431280399412e-05 | networkx.classes.graph.Graph |
google_lcc | 855802 | 4291352 | False | 10.028843120254452 | 1.1718662539836308e-05 | networkx.classes.graph.Graph |
 | 875713 | 5105039 | True | 11.659160021605253 | 6.656960291514363e-06 | networkx.classes.digraph.DiGraph |
pokec | 1632803 | 30622564 | True | 37.50919614919865 | 1.148614349725155e-05 | networkx.classes.digraph.DiGraph |
pokec_lcc | 1632803 | 22301964 | False | 27.31739713854029 | 1.6730379518484355e-05 | networkx.classes.graph.Graph |
er_500 | 500 | 2511 | False | 10.044 | 0.020128256513026053 | easygraph.classes.graph.Graph |
er_1000 | 1000 | 4950 | False | 9.9 | 0.00990990990990991 | easygraph.classes.graph.Graph |
er_5000 | 5000 | 24920 | False | 9.968 | 0.001993998799759952 | easygraph.classes.graph.Graph |
er_10000 | 10000 | 50023 | False | 10.0046 | 0.0010005600560056005 | easygraph.classes.graph.Graph |
er_paper_20221213_50000 | 50000 | 60316 | False | 2.41264 | 4.8253765075301506e-05 | easygraph.classes.graph.Graph |
er_paper_20221213_500000 | 500000 | 70315 | False | 0.28126 | 5.625211250422501e-07 | easygraph.classes.graph.Graph |
er_paper_20221213_1000000 | 1000000 | 80266 | False | 0.160532 | 1.6053216053216054e-07 | easygraph.classes.graph.Graph |
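The average_degree and density columns in the table above are consistent with the standard definitions: average degree is 2m/n, and density is m/(n(n-1)) for directed graphs or 2m/(n(n-1)) for undirected graphs, where n is the node count and m the edge count. This is inferred from the table values rather than from the repository code. A small sketch that reproduces two rows:

```python
import math


def average_degree(nodes, edges):
    """Average degree 2m/n (in-degree plus out-degree for directed graphs)."""
    return 2 * edges / nodes


def density(nodes, edges, directed):
    """Fraction of possible edges that are present."""
    possible_ordered_pairs = nodes * (nodes - 1)
    if directed:
        return edges / possible_ordered_pairs
    return 2 * edges / possible_ordered_pairs


# cheminformatics_lcc row: 37 nodes, 84 edges, undirected
print(average_degree(37, 84), density(37, 84, directed=False))
# cheminformatics row: 37 nodes, 168 edges, directed
print(average_degree(37, 168), density(37, 168, directed=True))
```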
Objective 1:
- Try it out:
- For local benchmarking results:
- For server benchmarking results:
Just prepend /s to the URL path, like so:
- Documentation: https://easygraph-bench-results.teddysc.me/docs
You can download the benchmarking results on the Releases page.
The server release contains the benchmarking results from runs on an EC2 server (c5.2xlarge), used for the paper.
Simply write a function load_<dataset_name>() and add it to dataset_loaders.py. Check out the examples in that file.
Then duplicate any of the benchmarking scripts and replace the loading function with your own loader.
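As a starting point for a custom loader, the sketch below parses a tab-separated edge list into integer node pairs using only the standard library. What a loader in dataset_loaders.py must return is not shown in this README, so the return value here (a list of edges) is an assumption; a real loader would presumably build a graph object as the existing examples in that file do. The function names, the path and the handling of # comment lines are all hypothetical.

```python
import io


def read_tab_separated_edges(lines):
    """Parse 'source<TAB>target' lines into a list of (int, int) edges."""
    edges = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):  # assumption: skip blanks and comments
            continue
        source, target = line.split("\t")[:2]
        edges.append((int(source), int(target)))
    return edges


def load_my_network(path="my_dataset/my_network.edgelist"):
    """Hypothetical loader following the load_<dataset_name>() naming convention."""
    with open(path, encoding="utf-8") as handle:
        return read_tab_separated_edges(handle)


sample = io.StringIO("# toy dataset\n0\t1\n1\t2\n")
parsed = read_tab_separated_edges(sample)
print(parsed)
```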