Commit

Merge pull request #22 from groq/release/4_2_0
GroqFlow release 4.2.0
hozen-groq authored Sep 21, 2023
2 parents 04c04dc + 90fe448 commit 43da677
Showing 17 changed files with 144 additions and 111 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/cla.yml
@@ -22,7 +22,7 @@ jobs:
 path-to-document: "https://github.com/groq/groqflow/cla.md"
 # branch should not be protected
 branch: "main"
-allowlist: hozen-groq,MihailoMilenkovic,bot*
+allowlist: hozen-groq,MihailoMilenkovic,ataheridezfouli-groq,bot*
 remote-organization-name: groq
 remote-repository-name: cla

26 changes: 2 additions & 24 deletions README.md
@@ -45,29 +45,7 @@ To Groq a PyTorch model, simply provide your model and inputs to the `groqit()`

 ## Contributors

-GroqFlow development is primarily conducted within Groq's internal repo and is periodically synced to GitHub. This approach means that developer contributions are not immediately obvious in the commit log. The following awesome developers have contributed to GroqFlow (order is alphabetical):
-
-<!-- ALL-CONTRIBUTORS-LIST:START - Do not remove or modify this section -->
-<!-- prettier-ignore-start -->
-<!-- markdownlint-disable -->
-<table>
-<tbody>
-<tr>
-<td align="center"><a href="https://www.linkedin.com/in/danielholandanoronha"><img src="https://avatars.githubusercontent.com/u/9607530?v=4" width="100px;" alt="Daniel Holanda"/><br /><sub><b>Daniel Holanda</b></sub></a><br /></td>
-<td align="center"><a href="https://github.com/jeremyfowers"><img src="https://avatars.githubusercontent.com/u/80718789?v=4" width="100px;" alt="Jeremy Fowers"/><br /><sub><b>Jeremy Fowers</b></sub></a><br /></td>
-<td align="center"><a href="https://github.com/levzlotnik"><img src="https://avatars.githubusercontent.com/levzlotnik" width="100px;" alt="Lev Zlotnik"/><br /><sub><b>Lev Zlotnik</b></sub></a><br /></td>
-<td align="center"><a href="https://www.linkedin.com/in/philipcolangelo"><img src="https://lh3.googleusercontent.com/pw/AMWts8CciuaYWKT-YVg86giohRGuQI8Jqm3xYeWlkEh41jO4EuPTSn0FLwHp8m0FfLHLIxJOWOxuBRyppa3blDT_YcKokVFbI6yHBYJ1env5evNRCFUPiIBhIlkOzVKMrMMC7aoTjrBGSk6HWUJ803DvMKFudw=s1426-no?authuser=0" width="100px;" alt="Philip Colangelo"/><br /><sub><b>Philip Colangelo</b></sub></a><br /></td>
-<td align="center"><a href="https://www.linkedin.com/in/ramakrishnansivakumar/"><img src="https://media.licdn.com/dms/image/D5603AQGH0fQ4EWzmnw/profile-displayphoto-shrink_800_800/0/1675440402753?e=1692230400&v=beta&t=Lm44fMhYcYHFKVRUraHZDJAzS3IpOuijsp05XbTkVL8" width="100px;" alt="Ramakrishnan Sivakumar"/><br /><sub><b>Ramakrishnan Sivakumar</b></sub></a><br /></td>
-<td align="center"><a href="https://www.linkedin.com/in/sarah-garrod-massengill-76262728/"><img src="https://media.licdn.com/dms/image/D5635AQGd3_jmheGW7g/profile-framedphoto-shrink_800_800/0/1685551437072?e=1687186800&v=beta&t=jTqUpYhEp6UDl47N6Tl-u7J4jFJ7Y5iFlYlc4JpmUPs" width="100px;" alt="Sarah Garrod Massengill"/><br /><sub><b>Sarah Garrod Massengill</b></sub></a><br /></td>
-<td align="center"><a href="https://github.com/vgodsoe-groq"><img src="https://avatars.githubusercontent.com/u/105250658?v=4" width="100px;" alt="Victoria Godsoe"/><br /><sub><b>Victoria Godsoe</b></sub></a><br /></td>
-</tr>
-</tbody>
-</table>
-
-<!-- markdownlint-restore -->
-<!-- prettier-ignore-end -->
-
-<!-- ALL-CONTRIBUTORS-LIST:END -->
+GroqFlow development is primarily conducted within Groq's internal repo and is periodically synced to GitHub. This approach means that developer contributions are not immediately obvious in the commit log.

 This project follows the [all-contributors](https://allcontributors.org) specification.
-Contributions of any kind are welcome!
+Contributions of any kind are welcome!
2 changes: 1 addition & 1 deletion demo_helpers/setup.py
@@ -17,7 +17,7 @@
 "datasets>=2.3.2",
 "prettytable>=3.3.0",
 "wget>=3.2",
-"setuptools>=61.2.0",
+"setuptools==57.2.0",
 "torchvision>=0.11.3",
 "torchaudio>=0.12.1",
 "path>=16.4.0",
20 changes: 5 additions & 15 deletions docs/install.md
@@ -6,7 +6,7 @@ The following describes how to install GroqFlow. These instructions enable users

 ### Check your versions

-- Ensure that you are using one of the following Linux distributions: Ubuntu 18.04, Ubuntu 22.04 or Rocky 8.4.
+- Ensure that you are using one of the following Linux distributions: Ubuntu 22.04 or Rocky 8.4.
 - Download and install the GroqWare™ Suite version >=0.9.2.1.
 - For more information, see the GroqWare Quick Start Guide at [support.groq.com](https://support.groq.com).
 - To compile your model for Groq hardware, GroqFlow requires the Groq Developer Tools Package (groq-devtools). To run your compiled model on hardware, GroqFlow requires the Groq Runtime Package (groq-runtime).
@@ -24,7 +24,7 @@ Make sure that your combination of GroqWare™ Suite version, OS version, and Py

 ### Install GroqWare

-Download and install the GroqWare Suite version >=0.9.2.
+Download and install the GroqWare Suite version >=0.9.2.1.
 - For more information, see the GroqWare Quick Start Guide at [support.groq.com](https://support.groq.com).
 - To compile your model for Groq hardware, GroqFlow requires the Groq Developer Tools Package (groq-devtools). To run your compiled model on hardware, GroqFlow requires the Groq Runtime Package (groq-runtime).

Expand Down Expand Up @@ -61,8 +61,6 @@ pip install .

where `groqflow` is the directory where you cloned the GroqFlow repo in the [prerequisites](#prerequisites).

**Note:** On GroqNode™ systems you will may run into an installation error that suggests that you install with the `--user` flag. If you encounter this error, please try `pip install . --user`.

_Optional_: if you want to use GroqFlow with TensorFlow, use this install command instead of `pip install .`:

```
@@ -83,15 +81,7 @@ conda env config vars set PYTHONPATH="/opt/groq/runtime/site-packages:$PYTHONPAT
 - You forgot to complete this step.
 - Your GroqWare Suite installation failed and you should attempt to re-install the GroqWare Suite.

-### Step 4: Identify your groqit() card
-
-GroqFlow sets the GroqCard to be of type A1.4 by default. If you have a Legacy A1.1 GroqCard, run the following command before running on hardware so the multi-card workloads can properly bring up the connections between the cards:
-
-```
-export GROQFLOW_LEGACY_A11=True
-```
-
-### Step 5: Rock-It with groqit()
+### Step 4: Rock-It with groqit()

 To confirm that you're setup correctly, navigate to the examples folder at `groqflow/examples/` and run the `hello_world.py` example that can be found in the `keras`, `onnx`, and `pytorch` folder depending on your preferred framework:

@@ -100,7 +90,7 @@ cd groqflow/examples/<framework>
 python hello_world.py
 ```

-### Step 6: Take-off with a Proof Point
+### Step 5: Take-off with a Proof Point

 Included in the directory: `groqflow/proof_points`, are multiple examples of various machine learning and linear algebra workloads. To run these proof points, the `groqflow/demo_helpers` must be installed in your groqflow environment.

@@ -124,4 +114,4 @@ When you are ready to try out your own model with GroqFlow, we recommend taking

 **Note:** The supported Python/OS combinations in [Check your Versions](#check-your-versions) apply here as well.

-**Note:** We recommend using separate conda environments for PyTorch/ONNX/Hummingbird development vs. TensorFlow development. The reason we make TensorFlow support optional in GroqFlow is to help you avoid dependency conflicts between the TensorFlow package and the other Groq/GroqFlow dependencies. Do not `pip install groqflow[tensorflow]` into an environment where you already did `pip install groqflow`, as this will cause errors.
+**Note:** We recommend using separate conda environments for PyTorch/ONNX/Hummingbird development vs. TensorFlow development. The reason we make TensorFlow support optional in GroqFlow is to help you avoid dependency conflicts between the TensorFlow package and the other Groq/GroqFlow dependencies. Do not `pip install groqflow[tensorflow]` into an environment where you already did `pip install groqflow`, as this will cause errors.
16 changes: 16 additions & 0 deletions docs/user_guide.md
@@ -123,6 +123,22 @@ See: `examples/num_chips.py`

 ---

+**topology**
+
+- The topology configuration of chips to use for execution. For more information on topologies,
+see the [Groq RealScale User Guide](https://support.groq.com/#/downloads/realscale).
+- *Default*: `groqit()` automatically will choose the `DRAGONFLY` topology.
+- Options are:
+- groqflow.common.build.DRAGONFLY
+- groqflow.common.build.ROTATIONAL
+
+### Example:
+
+```
+groqit(model, inputs, num_chips=8) # Use DRAGONFLY topology by default
+groqit(model, inputs, num_chips=16, topology=groqflow.common.build.ROTATIONAL)
+```
+
 ### Turn off the Progress Monitor

 GroqFlow displays a monitor on the command line that updates the progress of `groqit()` as it builds. By default, this monitor is on, however, it can be disabled using the `monitor` flag.
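
As a companion to the `topology` option documented in this hunk, the following is a minimal, self-contained sketch of how a caller might check which topology lists a requested chip count before calling `groqit()`. The chip counts are copied from the `supported_topology()` tables this commit adds to `groqflow/common/build.py`; the helper name `pick_topology` and the `SUPPORTED_CHIP_COUNTS` table are hypothetical and not part of GroqFlow.

```python
from typing import Optional

# Topology identifiers, mirroring groqflow.common.build in this commit.
DRAGONFLY = "Dragonfly"
ROTATIONAL = "Rotational"

# Chip counts listed per topology in supported_topology() (A1.4 cards only).
# This table is an illustrative copy, not an import from GroqFlow.
SUPPORTED_CHIP_COUNTS = {
    DRAGONFLY: (2, 4, 8, 16, 32, 64),
    ROTATIONAL: (16, 32, 40, 48, 56, 64, 72),
}


def pick_topology(num_chips: int, preferred: str = DRAGONFLY) -> Optional[str]:
    """Hypothetical helper: return a topology that lists num_chips, or None.

    The preferred topology is tried first, then the remaining ones.
    """
    candidates = [preferred] + [t for t in SUPPORTED_CHIP_COUNTS if t != preferred]
    for topology in candidates:
        if num_chips in SUPPORTED_CHIP_COUNTS.get(topology, ()):
            return topology
    return None


print(pick_topology(8))
print(pick_topology(40))
print(pick_topology(3))
```

In this sketch a 40-chip request falls through to the rotational table, while a count that appears in neither table yields `None`, so the caller can fail early instead of passing an unlisted combination to `groqit()`.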
105 changes: 62 additions & 43 deletions groqflow/common/build.py
@@ -1,28 +1,28 @@
 import os
 import enum
 import math
-from typing import Optional, List
+from typing import Optional, List, Dict
 import dataclasses
 import onnxflow.common.build as of_build
 from groqflow.version import __version__ as groqflow_version


-DEFAULT_ONNX_OPSET = 14
-MINIMUM_ONNX_OPSET = 11
-
+DEFAULT_ONNX_OPSET = 16
+MINIMUM_ONNX_OPSET = 13

 # Identifiers for specific GroqCard Accelerators
 GROQCARD_A14 = "A1.4"
-GROQCARD_A11 = "A1.1"

+# Identifiers for specific chip topologies
+DRAGONFLY = "Dragonfly"
+ROTATIONAL = "Rotational"

 # WARNING: The "internal" env var may cause unexpected behavior if enabled
 # outside of the internal Groq dev environment.
 environment_variables = {
     "cache_dir": "GROQFLOW_CACHE_DIR",
     "rebuild": "GROQIT_REBUILD_POLICY",
     "dont_use_sdk": "GROQFLOW_BAKE_SDK",
-    "target_a11": "GROQFLOW_LEGACY_A11",
     "debug": "GROQFLOW_DEBUG",
     "internal": "GROQFLOW_INTERNAL_FEATURES",
 }
@@ -56,12 +56,12 @@
 else:
     USE_SDK = True

-# Direct builds to target legacy GroqCard A1.1 accelerators instead
-# of the default A1.4 accelerators
-if os.environ.get(environment_variables["target_a11"]) == "True":
-    GROQCARD = GROQCARD_A11
-else:
-    GROQCARD = GROQCARD_A14
+# Direct builds to target the default GroqCard A1.4 accelerators.
+GROQCARD = GROQCARD_A14
+
+# By default, choose the dragonfly topology. Users can change this by passing in
+# the topology argument to groqit().
+TOPOLOGY = DRAGONFLY


 class Backend(enum.Enum):
@@ -71,15 +71,50 @@ class Backend(enum.Enum):
     REMOTE = "remote"


-def supported_topology(groqcard: str):
-    if os.environ.get(environment_variables["internal"]) == "True":
-        return [1, 2, 4] if groqcard == GROQCARD_A11 else [1, 2, 4, 8, 16, 32, 64]
+def supported_topology(groqcard: str, topology: str) -> Dict[int, str]:
+    """
+    Return a map of the number of chips to the topology string, given a groqcard
+    and connection topology. Only groqcard value of GROQCARD_A14 and topologies
+    of value DRAGONFLY, ROTATIONAL are currently supported.
+    """
+
+    topo_df_a14 = {
+        2: "DF_A14_2_CHIP",
+        4: "DF_A14_4_CHIP",
+        8: "DF_A14_8_CHIP",
+        16: "DF_A14_16_CHIP",
+        32: "DF_A14_32_CHIP",
+        64: "DF_A14_64_CHIP",
+    }
+    topo_rt_a14 = {
+        16: "RT09_A14_16_CHIP",
+        32: "RT09_A14_32_CHIP",
+        40: "RT09_A14_40_CHIP",
+        48: "RT09_A14_48_CHIP",
+        56: "RT09_A14_56_CHIP",
+        64: "RT09_A14_64_CHIP",
+        72: "RT09_A14_72_CHIP",
+    }
+
+    if groqcard != GROQCARD_A14:
+        return {}
+
+    if topology == DRAGONFLY:
+        return topo_df_a14
+    elif topology == ROTATIONAL:
+        return topo_rt_a14
     else:
-        return [1, 2, 4] if groqcard == GROQCARD_A11 else [1, 2, 4, 8]
+        return {}


-def max_chips(groqcard: str):
-    return supported_topology(groqcard)[-1]
+def max_chips(groqcard: str, topology: str):
+    chips = list(supported_topology(groqcard, topology).keys())
+    if len(chips) == 0:
+        raise ValueError(
+            f"Could not find the number of chips for groqcard {groqcard}, "
+            f"topology {topology}."
+        )
+    return chips[-1]


 # Each chip can hold approximately 50M parameters
@@ -113,10 +148,11 @@ class GroqConfig(of_build.Config):
     breaking change.
     """

-    compiler_flags: List[str] = None
-    assembler_flags: List[str] = None
+    compiler_flags: Optional[List[str]] = None
+    assembler_flags: Optional[List[str]] = None
     groqview: bool = False
     groqcard: str = GROQCARD
+    topology: str = TOPOLOGY
     num_chips: Optional[int] = None


@@ -145,9 +181,9 @@ class GroqInfo(of_build.Info):
     estimated_pcie_output_latency: Optional[float] = None
     estimated_throughput: Optional[float] = None
     estimated_latency: Optional[float] = None
-    compiled_onnx_input_bytes: int = None
-    compiled_onnx_output_bytes: int = None
-    compiler_ram_bytes: float = None
+    compiled_onnx_input_bytes: Optional[int] = None
+    compiled_onnx_output_bytes: Optional[int] = None
+    compiler_ram_bytes: Optional[float] = None


 @dataclasses.dataclass
@@ -213,26 +249,9 @@ def groqview_file(self):

     @property
     def topology(self):
-        topo_a14 = {
-            1: "n/a",
-            2: "DF_A14_2_CHIP",
-            4: "DF_A14_4_CHIP",
-            8: "DF_A14_8_CHIP",
-            16: "DF_A14_16_CHIP",
-            32: "DF_A14_32_CHIP",
-            64: "DF_A14_64_CHIP",
-        }
-        topo_a11 = {
-            1: "n/a",
-            2: "FC2_A11_2_CHIP",
-            4: "FC2_A11_4_CHIP",
-        }
-
-        # Select topology based on the groqcard gen
-        if self.config.groqcard == GROQCARD_A11:
-            return topo_a11[self.num_chips_used]
-        elif self.config.groqcard == GROQCARD_A14:
-            return topo_a14[self.num_chips_used]
+        topology = supported_topology(self.config.groqcard, self.config.topology)
+        if self.num_chips_used in topology.keys():
+            return topology[self.num_chips_used]
         else:
             return "Unknown"

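
One detail of the new `max_chips()` is that it returns `chips[-1]`, which gives the largest chip count only because the tables in `supported_topology()` are written in ascending order (Python dicts preserve insertion order). The sketch below is a standalone illustration, not the code in this commit: it rebuilds the same tables with comprehensions and uses `max()` so the result does not depend on key order. The constants are copied from this diff; the `tables` dictionary and the comprehension-based construction are assumptions made for brevity.

```python
from typing import Dict

# Identifiers copied from groqflow/common/build.py as changed in this commit.
GROQCARD_A14 = "A1.4"
DRAGONFLY = "Dragonfly"
ROTATIONAL = "Rotational"


def supported_topology(groqcard: str, topology: str) -> Dict[int, str]:
    """Standalone copy of the chip-count lookup (A1.4 cards only)."""
    tables = {
        DRAGONFLY: {n: f"DF_A14_{n}_CHIP" for n in (2, 4, 8, 16, 32, 64)},
        ROTATIONAL: {
            n: f"RT09_A14_{n}_CHIP" for n in (16, 32, 40, 48, 56, 64, 72)
        },
    }
    if groqcard != GROQCARD_A14:
        return {}
    return tables.get(topology, {})


def max_chips(groqcard: str, topology: str) -> int:
    """Variant of max_chips() that uses max() instead of chips[-1]."""
    chips = supported_topology(groqcard, topology)
    if not chips:
        raise ValueError(
            f"Could not find the number of chips for groqcard {groqcard}, "
            f"topology {topology}."
        )
    # max() over a dict iterates its keys, i.e. the chip counts.
    return max(chips)


print(max_chips(GROQCARD_A14, DRAGONFLY))
print(max_chips(GROQCARD_A14, ROTATIONAL))
```

With the tables as written in the commit both approaches return the same values; the `max()` form only matters if an entry is later added out of order.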
4 changes: 2 additions & 2 deletions groqflow/common/sdk_helpers.py
@@ -180,7 +180,7 @@ def validate_devtools(
     os_version: OS,
     required=False,
     exception_type: Type[Exception] = exp.EnvError,
-) -> Union[bool, str]:
+):
     version = _installed_package_version("groq-devtools", os_version)
     hint = "Please contact [email protected] to get access to groq-devtools."
     version_is_valid(version, required, "groq-devtools", exception_type, hint)
@@ -190,7 +190,7 @@ def validate_runtime(
     os_version: OS,
     required=False,
     exception_type: Type[Exception] = exp.EnvError,
-) -> Union[bool, str]:
+):
     version = _installed_package_version("groq-runtime", os_version)
     hint = "Please contact [email protected] to get access to groq-runtime."
     version_is_valid(version, required, "groq-runtime", exception_type, hint)
4 changes: 1 addition & 3 deletions groqflow/groqmodel/execute.py
@@ -26,8 +26,6 @@ def get_multi_tsp_runner(
         "DF_A14_2_CHIP": g.TopologyConfig.DF_A14_2_CHIP,
         "DF_A14_4_CHIP": g.TopologyConfig.DF_A14_4_CHIP,
         "DF_A14_8_CHIP": g.TopologyConfig.DF_A14_8_CHIP,
-        "FC2_A11_2_CHIP": g.TopologyConfig.FC2_A11_2_CHIP,
-        "FC2_A11_4_CHIP": g.TopologyConfig.FC2_A11_4_CHIP,
     }

     if bringup_topology:
@@ -91,7 +89,7 @@ def forward_singlechip(example):
     forward = forward_singlechip if num_chips == 1 else forward_multichip
     batch_size = len(input_batch)
     output_batch = []
-    total_latency = 0
+    total_latency = 0.0
     for idx in range(batch_size):
         example = input_batch[idx]
         latency, output = rtime(forward, repetitions, example)
4 changes: 4 additions & 0 deletions groqflow/groqmodel/groqmodel.py
@@ -65,6 +65,7 @@ def topology_initialized(self, topology):

 class GroqModel:
     def __init__(self, state: build.GroqState, tensor_type=np.array, input_dtypes=None):
+
         self.input_dtypes = input_dtypes
         self.tensor_type = tensor_type
         self.state = state
@@ -194,6 +195,7 @@ def benchmark(
     def benchmark_abunch(
         self, input_collection: Collection, repetitions: int = 1
     ) -> GroqMeasuredPerformance:
+
         self._validate_input_collection(input_collection, "benchmark_abunch")

         _, benchmark_results = self._execute(
@@ -291,6 +293,7 @@ def _select_backend(self):
                 )
             )
         elif backend == build.Backend.REMOTE:
+
             try:
                 import groqflow.groqmodel.remote as remote

@@ -492,6 +495,7 @@ def _execute_locally(self, bringup_topology: bool, repetitions: int) -> None:

     # Launch groqview
     def groqview(self) -> None:
+
         # Select either bake or SDK
         if self.state.use_sdk:
             groqview_path = sdk.find_tool("groqview")
6 changes: 2 additions & 4 deletions groqflow/groqmodel/remote.py
@@ -263,15 +263,13 @@ def execute(
         self,
         state: build.GroqState,
         repetitions: int,
-    ) -> Dict[str, np.ndarray]:
+    ):
         """
-        Executes a build on the given inputs and returns the outputs.
+        Executes a build on the given inputs and saves results to disk.
         Args:
             state: State of the build being executed
             repetitions: Number of times to execute a build
-        Returns:
-            The outputs of the execution
         """
         inputs_file = state.execution_inputs_file
         inputs_data = np.load(inputs_file, allow_pickle=True)

0 comments on commit 43da677