A simple implementation of a Graph Neural Network for graph classification tasks. The code is currently set to operate on a directed graph with node features. The graphs can also have edge weights. This code is leveraging the DGL
dataloader that allow to directly download and use the TUDortmund
datasets. A custom dataloader can easlily be implemented to adapt to any custom graph (e.g., with node labels). The model.py
can also be adapted to perform node classification or link prediction tasks. Note that the layers
could remain unchanged as they were built to be task-agnostic.
Create a conda
environment:
conda env create -f conda.yml
Activate it:
conda activate gnn-min-example
Add the root directory to your PYTHONPATH:
export PYTHONPATH="<YOUR_PATH>/gnn-min-example/"
By default, the GNN is run on the Letter-low dataset. For a comprehensive list of available TU Dortmund datasets, the reader can refer to this link.
The code can simply be run with:
python bin/run_gnn.py --config_fpath core/config/config_file.json
The optional arguments are:
usage: run_gnn.py [-h] --config_fpath CONFIG_FPATH [--gpu GPU] [--lr LR]
[--weight-decay WEIGHT_DECAY] [--n-epochs N_EPOCHS]
[--batch-size BATCH_SIZE] [--eval-every EVAL_EVERY]
Graph Neural Network Minimum Example.
Arguments:
-h, --help show this help message and exit
--config_fpath CONFIG_FPATH
Path to JSON configuration file.
--gpu GPU gpu (-1 for no GPU, 0 otherwise)
--lr LR learning rate
--weight-decay WEIGHT_DECAY
Weight for L2 loss
--n-epochs N_EPOCHS number of epochs
--batch-size BATCH_SIZE
batch size
--eval-every EVAL_EVERY
evaluate model every EVAL_EVERY steps
An example configuration file is provided in core/config/config_files.json
. The
configuration file is dataset dependent, which means that if the GNN is trained on
another dataset, some configuration parameters need to be changed accordingly.
The parameters are:
num_layers
: number of GNN layers. Most probably a number between 3 and 5.node_dim
: input dimension of the nodes. This information is dataset dependent and can be found in theNode Attr. (Dim.)
column here.activation
: activation function used between the GNN layers. Choose among'relu', 'tanh', 'sigmoid', 'elu', 'leaky_relu'
.neighbor_pooling_type
: type of pooling when aggregating node features from the neighbors of each node. Choose among'sum', 'mean''
.- Readout parameters:
num_layers
: number of linear layers in the readout functionhidden_dim
: hidden dimension of the readout functionout_dim
: number of classes. This parameter is dataset dependent.
For more specific information on the GNN parameters, please refer to the implementation
in core/layers/gnn
. The default configuration file running on the Letter-low
dataset
is:
{
"num_layers": 5,
"node_dim": 2,
"activation": "relu",
"neighbor_pooling_type": "sum",
"readout": {
"num_layers": 2,
"hidden_dim": 64,
"out_dim": 15
}
}