diff --git a/docs/user_guide.md b/docs/user_guide.md index 573b016..714656e 100644 --- a/docs/user_guide.md +++ b/docs/user_guide.md @@ -1,35 +1,34 @@ # GroqFlow™ User Guide -The following reviews the different functionality provided by GroqFlow. +Welcome to the GroqFlow User Guide! This guide covers the `groqit()` function, which builds and runs models with GroqFlow; the `GroqModel` instance returned by `groqit()`, which implements your model for Groq hardware; and core GroqFlow concepts. ## Table of Contents -- [Just Groq It](#just-groq-it) -- [`groqit()` Arguments](#groqit-arguments) - - [Quickest Way](#quickest-way) - - [Multi-Chip](#multi-chip) - - [Turn off the Progress Monitor](#turn-off-the-progress-monitor) - - [Rebuild Policy](#rebuild-policy) - - [Set the Build Name](#setting-the-build-name) - - [Build a GroqView™ Visualization](#build-a-groqview™-visualization) - - [Compiler Flags](#compiler-flags) - - [Assembler Flags](#assembler-flags) - - [Choose a Cache Directory](#choose-a-cache-directory) - - [Perform Post-training Quantization](#perform-post-training-quantization) - - [Custom User Sequence](#custom-user-sequence) +- [The `groqit()` Function](#the-groqit-function) + - [`groqit()` Arguments](#groqit-arguments) + - [Multi-Chip](#multi-chip) + - [Turn Off the Progress Monitor](#turn-off-the-progress-monitor) + - [Rebuild Policy](#rebuild-policy) + - [Set the Build Name](#setting-the-build-name) + - [Build and Open a GroqView™ Visualization](#build-and-open-a-groqview-visualization) + - [Compiler Flags](#compiler-flags) + - [Assembler Flags](#assembler-flags) + - [Choose a Cache Directory](#choose-a-cache-directory) + - [Perform Post-training Quantization](#perform-post-training-quantization) + - [Custom User Sequence](#custom-user-sequence) - [GroqModel Methods](#groqmodel-methods) - [GroqModel Class](#groqmodel-class) - [GroqModel Specializations](#groqmodel-specializations) - [Calling an Inference](#inference-forward-pass) - 
[Benchmarking the Model](#benchmark) - [Netron](#netron) - - [Open a GroqView Visualization](#open-a-groqview-visualization) - [Concepts](#concepts) - [GroqFlow Build Cache](#groqflow-build-cache) - [`state.yaml` File](#stateyaml-file) -## Just Groq It +## The `groqit()` Function +The simplest way to use GroqFlow is by calling `groqit()` with your model and a sample input. The `groqit()` function, imported from the `groqflow` library, returns a callable `GroqModel` instance that works like a PyTorch model (torch.nn.Module) or, when given scikit-learn or xgboost inputs, has `predict` and `predict_proba` methods: ``` from groqflow import groqit # import our function @@ -37,18 +36,7 @@ gmodel = groqit(model, inputs) # returns a callable GroqModel gmodel(**inputs) # inference with provided inputs ``` ---- - - -## `groqit()` Arguments - -### Quickest Way - -The simplest way to use GroqFlow is by calling `groqit()` with your model and a sample input. - -Returns a callable `GroqModel` instance that works like a PyTorch model (torch.nn.Module) or, when given scikit-learn or xgboost inputs, has `predict` and `predict_proba` methods. - -**model:** +**`model`:** - Model to be mapped to Groq hardware. - Can be an instance of: @@ -75,7 +63,7 @@ Returns a callable `GroqModel` instance that works like a PyTorch model (torch.n - xgboost.XGBClassifier - xgboost.XGBRegressor -**inputs:** +**`inputs`:** - Used by `groqit()` to determine the shape of input to build against. - Dictates the maximum input size the model will support. @@ -91,6 +79,7 @@ Returns a callable `GroqModel` instance that works like a PyTorch model (torch.n `inputs = tokenizer("I like dogs")` + ### Examples: ``` @@ -103,17 +92,22 @@ See --- +## `groqit()` Arguments + +This section describes each available `groqit()` argument and the values it accepts, which you can use to override the default GroqFlow settings. 
+ ### Multi-Chip -By default, GroqFlow will automatically partition models across multiple GroqChip™ processors, however, a user can still specify the desired number of GroqChip™ processors they would like `groqit()` to target. +By default, GroqFlow will automatically partition models across multiple Groq Language Processing Units™ (LPUs). However, you can specify the desired number of LPUs for `groqit()` to target. -**num_chips** +#### Argument(s): +**`num_chips`** -- Number of GroqChip processors to be used. -- *Default*: `groqit()` automatically selects a number of chips. -- 1, 2, 4, or 8 chips are valid options for systems using GroqCard™ accelerators (GC1-010B/GC1-0109). +- Number of Groq LPUs to be used. +- *Default*: `groqit()` automatically selects a number of available chips. +- Currently, 1, 2, 4, or 8 chips are valid options for systems using Groq LPUs. -### Example: +##### Example: ``` groqit(model, inputs, num_chips=4) @@ -123,7 +117,7 @@ See: `examples/num_chips.py` --- -**topology** +**`topology`** - The topology configuration of chips to use for execution. For more information on topologies, see the [Groq RealScale User Guide](https://support.groq.com/#/downloads/realscale). @@ -132,7 +126,7 @@ see the [Groq RealScale User Guide](https://support.groq.com/#/downloads/realsca - groqflow.common.build.DRAGONFLY - groqflow.common.build.ROTATIONAL -### Example: +### Examples: ``` groqit(model, inputs, num_chips=8) # Use DRAGONFLY topology by default @@ -141,8 +135,9 @@ groqit(model, inputs, num_chips=16, topology=groqflow.common.build.ROTATIONAL) ### Turn off the Progress Monitor -GroqFlow displays a monitor on the command line that updates the progress of `groqit()` as it builds. By default, this monitor is on, however, it can be disabled using the `monitor` flag. +GroqFlow displays a monitor on the command line that updates the progress of `groqit()` as it builds. By default, this monitor is on, but it can be disabled using the `monitor` flag. 
+#### Argument(s): **monitor** - *Default*: `groqit(monitor=True, ...)` displays a progress monitor on the command line. - Set `groqit(monitor=False, ...)` to disable the command line monitor. @@ -163,12 +158,13 @@ By default, GroqFlow will load successfully built models from the [GroqFlow buil However, sometimes you may want to change this policy. The `rebuild` argument has a few settings that allow you to do just that. +#### Argument(s): **rebuild** - *Default*: `groqit(rebuild="if_needed", ...)` will use a cached model if available, build one if it is not available, and rebuild any stale builds. - Set `groqit(rebuild="always", ...)` to force `groqit()` to always rebuild your model, regardless of whether it is available in the cache or not. - Set `groqit(rebuild="never", ...)` to make sure `groqit()` never rebuilds your model, even if it is stale. `groqit()` will attempt to load any previously built model in the cache, however there is no guarantee it will be functional or correct. -### Example: +### Examples: ``` # Rebuild a model every time @@ -192,7 +188,7 @@ However, you can also specify the name using the `build_name` argument. If you want to build multiple models in the same script, you must set a unique `build_name` for each to avoid collisions. - +#### Argument(s): **build_name** - Name of the build in the [GroqFlow build cache](#groqflow-build-cache), specified by `groqit(build_name="name", ...)` @@ -220,15 +216,21 @@ See: `examples/pytorch/build_names.py` --- -### Build a GroqView™ Visualization +### Build and Open a GroqView™ Visualization -GroqView is a visualization and profiler tool that is launched in your web browser. 
For more information about GroqView, see the GroqView User Guide on Groq's Customer Portal at [support.groq.com](https://support.groq.com/#/downloads/groqview-ug) +GroqView is a visualization and profiling tool that runs in your web browser and shows your model's data streams and execution schedule on Groq hardware. To use it, you must pass the `groqview` argument to `groqit()` at build time. - -**groqview** +#### Argument(s): +**`groqview`** - By default, GroqView files are not included in the build because they increase the build time and take up space on disk. -- When calling `groqit()`, set the groqview argument to True such as, `groqit(groqview=True, ...)` to include GroqView files in the build. +- When calling `groqit()`, set the `groqview` argument to `True`, as in `groqit(groqview=True, ...)`, to include GroqView files in the build. + +#### Steps: +- To build the visualization, set the `groqview` argument within the `groqit()` function to `True` (as shown in the example below). - To open the visualization, take the resulting `GroqModel` instance and call the `GroqModel.groqview()` method. +- If running GroqView on a server other than your local machine, open a tunnel to the host server. On your local machine, use SSH to tunnel to the server where GroqView was launched (e.g. `ssh -L 8439:localhost:8439 servername` on your command line). +- To view the GroqView visualization, copy the given URL and paste it into your web browser (e.g. http://localhost:8439). +- When finished, press Ctrl+C on the command line to shut down the GroqView server. ### Examples: @@ -247,9 +249,15 @@ Users familiar with the underlying compiler may want to override the default fla Warning: at this time, `groqit()` does nothing to ensure that you are providing legal flags to the `compiler_flags` argument. If you provide illegal flags, `groqit()` will raise a generic exception and point you to a log file where you can learn more. 
+#### Argument(s): **compiler_flags** - Provide the flags as a list of strings, i.e., `groqit(compiler_flags=["flag 1", "flag 2"], ...)` - *Note*: By providing flags, this overwrites the defaults flags used by GroqFlow. + +### Parameters +The following is a list of available compiler flags and their descriptions that can be used as parameters for the `compiler_flags` argument of the `groqit()` function. + +**TBA** ### Example: @@ -267,6 +275,7 @@ Users familiar with the underlying assembler may want to override the default fl Warning: at this time, `groqit()` does nothing to ensure that you are providing legal flags to the `assembler_flags` argument. If you provide illegal flags, `groqit()` will raise a generic exception and point you to a log file where you can learn more. +#### Argument(s): **assembler_flags** - Provide the flags as a list of strings, i.e., `groqit(assembler_flags=["flag 1", "flag 2"], ...)` - *Note*: By providing flags, this overwrites the defaults flags used by GroqFlow. @@ -284,7 +293,7 @@ See: `examples/pytorch/assembler_flags.py` ### Choose a Cache Directory -The location of the [GroqFlow build cache](#groqflow-build-cache) defaults to `~/.cache/groqflow`. However, there are two ways for you to customize this. +The location of the [GroqFlow build cache](#groqflow-build-cache) defaults to `~/.cache/groqflow`. However, there are two ways for you to customize this: - On a per-build basis, you can set `groqit(cache_dir="path", ...)` to specify a path to use as the cache directory. - To change the global default, set the `GROQFLOW_CACHE_DIR` environment variable to a path of your choosing. @@ -324,12 +333,13 @@ See: `examples/pytorch/cache_dir.py` ### Perform Post Training Quantization -By default, `groqit()` converts the input model into an equivalent ONNX model, optimizes the ONNX model, and converts the model's trained parameters into type float16 before compiling and assembling the model into a groq model. 
+By default, `groqit()` converts the input model into an equivalent ONNX model, optimizes the ONNX model, and converts the model's trained parameters into type `float16` before compiling and assembling the model into a `GroqModel`. -When quantization data samples are specified to the `quantization_samples` argument, `groqit()` performs post-training quantization to int8 on the equivalent ONNX model using the specified samples, before compiling and assembling a GroqModel from the quantized ONNX model. The provided unlabeled samples are used to estimate distribution statistics of the data to pre-compute scales and zero points of the float-to-int8 range mapping for the activation tensors in the model. After static quantization, all conv, matmul, and relu operations in the quantized model will have int8 precision. Please note that rebuild is required when different quantization samples are provided, so the rebuild policy in this case must be set to `always`. +When quantization data samples are specified to the `quantization_samples` argument, `groqit()` performs post-training quantization to `int8` on the equivalent ONNX model using the specified samples, before compiling and assembling a GroqModel from the quantized ONNX model. The provided unlabeled samples are used to estimate distribution statistics of the data to pre-compute scales and zero points of the float-to-int8 range mapping for the activation tensors in the model. After static quantization, all conv, matmul, and relu operations in the quantized model will have int8 precision. Please note that rebuild is required when different quantization samples are provided, so the rebuild policy in this case must be set to `always`. Currently, `groqit()` only provides post training quantization support for PyTorch models. +#### Argument(s): **quantization_samples** - A list of data samples to be used to perform post-training quantization to the input model, specified by `groqit(quantization_samples=my_samples, ...)`. 
Each sample should be a numpy array or similar object @@ -553,20 +563,6 @@ See: `examples/pytorch/netron.py` --- -### Open a GroqView Visualization - -Use a `GroqModel` instance to open a GroqView visualization that was built using the `groqview` argument to `groqit()` (more information [here](#groqview)). - -- Visualize data streams and execution schedule. -- Requires using the groqview argument to groqit() at build time. - -``` -gmodel = groqit(model, inputs, groqview=True) -gmodel.groqview() -``` - -See: `examples/pytorch/groqview.py` - ## Concepts ### GroqFlow Build Cache