[Doc] hot fix incorrect link of training and inference outputs (#974)
*Issue #, if available:*

*Description of changes:*
This PR fixed the incorrect reference link of gs outputs and added a
sentence to brief the output in the model training and inference landing
page.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Ubuntu <[email protected]>
2 people authored and jalencato committed Aug 16, 2024
1 parent ea8b3cc commit 6aea228
Showing 2 changed files with 7 additions and 4 deletions.
@@ -7,7 +7,7 @@ GraphStorm provides dozens of configurable parameters for users to control their

Launch Arguments
--------------------
GraphStorm's `graphstorm.run.launch <https://github.com/awslabs/graphstorm/blob/main/python/graphstorm/run/launch.py>`_ command has a set of parameters to control the launch behavior of training and inference.
GraphStorm's model training and inference CLIs (both task-specific and task-agnostic) have a set of parameters to control the behavior of training and inference.

- **workspace**: the folder where the launch command assumes all artifacts are saved. If the other parameters' file paths are relative paths, the launch command will look for these files in the workspace.
- **part-config**: (**Required**) Path to a file containing the graph partition configuration. The graph partition is generated by GraphStorm Partition tools. **HINT**: Use an absolute path to avoid any path-related problems. Otherwise, the file should be in the workspace.
@@ -51,13 +51,16 @@ GraphStorm provides a set of parameters to config the GNN model structure (input
- Yaml: ``model_encoder_type: rgcn``
- Argument: ``--model-encoder-type rgcn``
- Default value: This parameter must be provided by user.
- **node_feat_name**: User defined feature name. It accepts two formats: a) `fname`, if a node has node features, the corresponding feature name will be fname; b) `ntype0:feat0 ntype1:featA ...`, different node types have different node feature name(s). In the example, "ntype0" has a node feature named "feat0" and "ntype1" has a node feature named "featA". Note: Characters `:` and ` ` are not allowed in node feature names. In Yaml format, put each node type's feature on a separate line that starts with a hyphen.
- **node_feat_name**: User defined feature name. It accepts two formats: a) `fname`, if a node has node features, the corresponding feature name will be fname; b) `ntype0:feat0 ntype1:featA ...`, different node types have different node feature name(s). In the example, "ntype0" has a node feature named "feat0" and "ntype1" has a node feature named "featA".

- Yaml: ``node_feat_name:``
| ``- "ntype1:featA"``
| ``- "ntype0:feat0"``
- Argument: ``--node-feat-name "ntype0:feat0 ntype1:featA"``
- Default value: If not provided, GraphStorm will not use any node features, even if graphs have node features attached.

.. Note:: Characters ``:`` and white space are not allowed in node feature names. In Yaml format, put each node type's feature on a separate line that starts with a hyphen.
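
The two ``node_feat_name`` formats described above can be illustrated with a small parsing sketch. This is not GraphStorm's own parser: ``parse_node_feat_name`` is a hypothetical helper, and rejecting inputs that mix the two formats is an assumption made only for this illustration.

```python
from typing import Dict, List, Union


def parse_node_feat_name(spec: Union[str, List[str]]) -> Union[str, Dict[str, str]]:
    """Hypothetical helper (not part of GraphStorm).

    Accepts either the single command-line string, e.g.
    "ntype0:feat0 ntype1:featA", or the list of strings that the
    Yaml form produces, e.g. ["ntype1:featA", "ntype0:feat0"].
    """
    # Normalize both input shapes into a flat list of whitespace-free entries.
    items = [spec] if isinstance(spec, str) else list(spec)
    entries = [entry for item in items for entry in item.split()]
    if not entries:
        raise ValueError("node_feat_name is empty")

    # Format a): one plain feature name shared by every node type.
    if all(":" not in entry for entry in entries):
        if len(entries) != 1:
            raise ValueError("a plain feature name must not contain white space")
        return entries[0]

    # Format b): one "ntype:feat" entry per node type.
    feats: Dict[str, str] = {}
    for entry in entries:
        ntype, sep, fname = entry.partition(":")
        # ":" may appear exactly once, as the separator.
        if not sep or not ntype or not fname or ":" in fname:
            raise ValueError(f"malformed node_feat_name entry: {entry!r}")
        feats[ntype] = fname
    return feats
```

Under these assumptions, the Yaml list and the quoted command-line string from the example resolve to the same mapping, which is why the order of the Yaml entries does not matter.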

- **num_layers**: Number of GNN layers. Must be an integer larger than 0 if given. By default, it is set to 0, which means no GNN layers.

- Yaml: ``num_layers: 2``
@@ -111,7 +114,7 @@ GraphStorm provides a set of parameters to control how and where to save and res
- Yaml: ``save_model_frequency: 1000``
- Argument: ``--save-model-frequency 1000``
- Default value: ``-1``. GraphStorm will not save models within an epoch.
- **topk_model_to_save**: The number of best-performing GraphStorm models to save. By default, GraphStorm keeps all the saved models on disk, which can consume a large amount of disk space. Users can set a positive integer, e.g. `K`, to let GraphStorm only save `K`` models with the best performance.
- **topk_model_to_save**: The number of best-performing GraphStorm models to save. By default, GraphStorm keeps all the saved models on disk, which can consume a large amount of disk space. Users can set a positive integer, e.g. `K`, to let GraphStorm only save `K` models with the best performance.

- Yaml: ``topk_model_to_save: 3``
- Argument: ``--topk-model-to-save 3``
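
The effect of ``topk_model_to_save`` can be pictured with a short sketch of top-K checkpoint retention. This only illustrates the idea, not GraphStorm's implementation: ``TopKTracker`` and its methods are hypothetical names, and it assumes that a higher validation score means a better model.

```python
import heapq
from typing import List, Optional, Tuple


class TopKTracker:
    """Hypothetical tracker that keeps only the K best-scoring checkpoints."""

    def __init__(self, k: int) -> None:
        if k <= 0:
            raise ValueError("k must be a positive integer")
        self.k = k
        # Min-heap of (score, path): the worst retained checkpoint sits at the root.
        self._heap: List[Tuple[float, str]] = []

    def add(self, score: float, path: str) -> Optional[str]:
        """Record a new checkpoint and return the path that should be deleted, if any."""
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, (score, path))
            return None
        # Push the new entry, then pop the worst one; the popped entry may be
        # the new checkpoint itself when it does not beat the retained ones.
        _, evicted_path = heapq.heappushpop(self._heap, (score, path))
        return evicted_path

    def kept(self) -> List[str]:
        """Return the retained checkpoint paths, best first."""
        return [path for _, path in sorted(self._heap, reverse=True)]


tracker = TopKTracker(k=2)
tracker.add(0.70, "epoch-0")
tracker.add(0.80, "epoch-1")
evicted = tracker.add(0.75, "epoch-2")  # evicts the lowest-scoring checkpoint so far
```

With ``k=2``, the third call returns ``"epoch-0"`` as the checkpoint to delete, so disk usage stays bounded by K saved models instead of growing with every save.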
2 changes: 1 addition & 1 deletion docs/source/cli/model-training-inference/index.rst
@@ -10,7 +10,7 @@ This section provides guidelines of GraphStorm model training and inference on :

GraphStorm CLIs require less- or no-code operations for users to perform Graph Machine Learning (GML) tasks. In most cases, users only need to configure the parameters or arguments provided by GraphStorm to fulfill their GML tasks. Users can find the details of these configurations in the :ref:`Model Training and Inference Configurations<configurations-run>`.

In addition, there are two node ID mapping operations during the graph construction procedure, and these mapping results are saved in a certain folder, which GraphStorm inference pipelines will automatically use to remap prediction results' node IDs back to the original IDs. In case such automatic remapping does not occur, you can do it manually according to the :ref:`GraphStorm Output Node ID Remapping <output-remapping>` guideline.
In addition, there are two node ID mapping operations during the graph construction procedure, and these mapping results are saved in a certain folder, which GraphStorm training and inference CLIs will automatically use to remap prediction results' node IDs back to the original IDs. In case such automatic remapping does not occur, you can find the details of the outputs of model training and inference without remapping in :ref:`GraphStorm Training and Inference Output <gs-output>`. In addition, users can do the remapping manually according to the :ref:`GraphStorm Output Node ID Remapping <gs-output-remapping>` guideline.

.. toctree::
   :maxdepth: 2