Raspberry Pi 5 M.2 Edge TPU Installation

To get started with either the Mini PCIe or M.2 Accelerator, all you need to do is connect the card to your system, and then install our PCIe driver, Edge TPU runtime, and the TensorFlow Lite runtime. This page walks you through the setup and shows you how to run an example model.

The setup and operation are the same for both M.2 form factors, including the M.2 Accelerator with Dual Edge TPU.

Requirements

  • Raspberry Pi 5 with the following Linux operating system:
    • Raspberry Pi OS (64-bit) based on Debian 10 or newer
    • Ubuntu (64-bit) 23.10 or newer
  • All systems require support for MSI-X as defined in the PCI 3.0 specification
  • At least one available M.2 module slot
  • Python 3.6-3.9.16

1: Connect the module

  1. Make sure the host system where you'll connect the module is shut down.
  2. Carefully connect the M.2 module to the corresponding module slot on the host, according to your host system recommendations. We recommend the PineBerry AI Hat (E-Key) for the Raspberry Pi 5.

2: Install the PCIe driver and Edge TPU runtime

Next, you need to install both the Coral PCIe driver and the Edge TPU runtime. On Linux, you can install these packages on your host computer as follows.

The Coral ("Apex") PCIe driver is required to communicate with any Edge TPU device over a PCIe connection, whereas the Edge TPU runtime provides the required programming interface for the Edge TPU.


  1. Run @dataslayermedia's script, which installs the Edge TPU runtime and the Gasket driver, edits the boot configuration, and modifies the Device Tree Source:

    curl https://gist.githubusercontent.com/dataslayermedia/714ec5a9601249d9ee754919dea49c7e/raw/52545bb7b3a961290a4d7c5042d3fd6eb7bc33d2/coral-ai-pcie-edge-tpu-raspberrypi-5-setup | sh

    If @dataslayermedia's script is no longer available, use the following fork:

    curl https://gist.githubusercontent.com/Reddimus/c6948d08a4f4b54ee9d075270bd79c3b/raw/52545bb7b3a961290a4d7c5042d3fd6eb7bc33d2/coral-ai-pcie-edge-tpu-raspberrypi-5-setup | sh
  2. Reboot if the script has not already done so, then verify that the accelerator module is detected:

    lspci -nn | grep 089a

    You should see something like this:

    03:00.0 System peripheral: Device 1ac1:089a

    The 03 number and System peripheral name might be different, because those are host-system specific, but as long as you see a device listed with 089a then you're okay to proceed.

  3. Also verify that the PCIe driver is loaded:

    ls /dev/apex_0

    You should simply see the name repeated back:

    /dev/apex_0

    If the accelerator module is detected but /dev/apex_0 is not found, then read the troubleshooting section at the end of this guide.

  4. Give permissions to the /dev/apex_0 device by creating a new udev rule:
    Open a terminal and use your favorite text editor with sudo to create a new file in /etc/udev/rules.d/. The file name should end with .rules. It's common practice to start custom rules with a higher number (e.g., 99-) to ensure they are applied after the default rules. For example:

    sudo nano /etc/udev/rules.d/99-coral-edgetpu.rules
  5. Add a rule to the file:
    udev rules commonly match devices by attributes such as idVendor and idProduct, because raw device paths are not always persistent. In practice, the Coral PCIe driver names the first device apex_0, so matching on the KERNEL name is a reasonable approach here.

    Assuming /dev/apex_0 is the device you verified earlier and you want to open up its permissions, the rule can look like this:

    KERNEL=="apex_0", MODE="0666"

    This rule sets the device file /dev/apex_0 to be readable and writable by everyone. Adjust the MODE as necessary for your security requirements.

  6. Reload the udev rules and trigger them: After saving the file, reload the rules and trigger them so the change takes effect without a reboot:

    sudo udevadm control --reload-rules
    sudo udevadm trigger
  7. Verify the /dev/apex_0 permissions and that MSI-X is enabled:
    Check that the permissions of the device file are set as expected:

    ls -l /dev/apex_0

    Also verify that MSI-X (extended Message Signaled Interrupts) is enabled:

    sudo lspci -vvv | grep -i MSI-X

    You should see something like this, where + indicates that MSI-X is enabled and - indicates that it's disabled:

    Capabilities: [d0] MSI-X: Enable+ Count=128 Masked-
    Capabilities: [b0] MSI-X: Enable+ Count=61 Masked-

    Note: The use of MODE="0666" makes the device world-readable and writable, which may not be secure for all environments. Consider your security requirements and adjust the permissions accordingly, possibly using GROUP to restrict access to users within a specific group.
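
If you prefer not to make the device world-writable, a more restrictive sketch is to grant access to a dedicated group instead (the group name apex below is arbitrary and only an example):

sudo groupadd apex
sudo usermod -aG apex $USER   # log out and back in for the new group membership to apply

Then use this rule in place of the one above:

KERNEL=="apex_0", GROUP="apex", MODE="0660"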

Now continue to install PyCoral and TensorFlow Lite.

3: Install the PyCoral library

PyCoral is a Python library built on top of the TensorFlow Lite library to speed up your development and provide extra functionality for the Edge TPU.

We recommend you start with the PyCoral API, and we use it in the example code below, because it reduces the amount of code you must write to run an inference. But you can build your own projects using TensorFlow Lite directly, in either Python or C++.

First check your Linux system's Python version:

python3 --version

PyCoral currently supports Python 3.6 through 3.9.16. If your default version is something else, we suggest you install Python 3.9 with pyenv.
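
A minimal sketch of the pyenv route, assuming pyenv is already installed and initialized in your shell (see https://github.com/pyenv/pyenv):

pyenv install 3.9.16     # build and install Python 3.9.16
pyenv local 3.9.16       # select it for the current directory
python3 --version        # should now report Python 3.9.16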

To install the PyCoral library, use the following commands based on your Python environment.

On Linux with System Python 3.6-3.9.16

sudo apt-get install python3-pycoral
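
The python3-pycoral package is served from the Coral apt repository, which the setup script in section 2 normally configures. If apt cannot find the package, the repository can be added manually; this is a hedged sketch based on the standard Coral instructions, and the key-handling step may differ on your release:

echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" | sudo tee /etc/apt/sources.list.d/coral-edgetpu.list
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo gpg --dearmor -o /etc/apt/trusted.gpg.d/coral-edgetpu.gpg
sudo apt-get update
sudo apt-get install python3-pycoral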

On Linux with Python 3.6-3.9.16 installed with pyenv

python3 -m pip install --extra-index-url https://google-coral.github.io/py-repo/ pycoral~=2.0

Lastly, check the list of installed packages to verify that PyCoral is installed:

pip3 list

The list of installed packages should look roughly like this:

Python 3.9.16

Package        Version
----------------------------
numpy            1.26.4
Pillow           9.5.0
pip              24.0
pycoral          2.0.0
setuptools       58.1.0
tflite-runtime   2.5.0.post1
Note: Pillow must be 9.5.0 or older. If you have a newer version, you can downgrade it with `pip3 install Pillow==9.5.0`.
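
As a quick sanity check that PyCoral can see the accelerator over PCIe, this one-liner lists the detected Edge TPUs; it should print an entry referencing /dev/apex_0:

python3 -c "from pycoral.utils.edgetpu import list_edge_tpus; print(list_edge_tpus())"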

4: Run a model on the Edge TPU

Now you're ready to run an inference on the Edge TPU.

Follow these steps to perform image classification with our example code and MobileNet v2:

  1. Download the example code from GitHub:
    mkdir coral && cd coral
    
    git clone https://github.com/google-coral/pycoral.git
    
    cd pycoral
  2. Download the model, labels, and bird photo:
    bash examples/install_requirements.sh classify_image.py
  3. Run the image classifier with the bird photo (shown in figure 1):
    python3 examples/classify_image.py \
    --model test_data/mobilenet_v2_1.0_224_inat_bird_quant_edgetpu.tflite \
    --labels test_data/inat_bird_labels.txt \
    --input test_data/parrot.jpg

You should see results like this:

----INFERENCE TIME----
Note: The first inference on Edge TPU is slow because it includes loading the model into Edge TPU memory.
11.8ms
2.9ms
2.8ms
2.9ms
2.9ms
-------RESULTS--------
Ara macao (Scarlet Macaw): 0.75781

These speeds are faster than running the same model on the Coral USB Accelerator:

----INFERENCE TIME----
Note: The first inference on Edge TPU is slow because it includes loading the model into Edge TPU memory.
20.6ms
7.0ms
6.8ms
5.2ms
5.1ms
-------RESULTS--------
Ara macao (Scarlet Macaw): 0.75781

Congrats! You just performed an inference on the Edge TPU using TensorFlow Lite.

To demonstrate varying inference speeds, the example repeats the same inference five times. Your inference speeds might differ based on your host system.

The top classification label is printed with the confidence score, from 0 to 1.0.

To learn more about how the code works, take a look at the classify_image.py source code and read about how to run inference with TensorFlow Lite.

Note: The example above uses the PyCoral API, which calls into the TensorFlow Lite Python API, but you can instead directly call the TensorFlow Lite Python API or use the TensorFlow Lite C++ API. For more information about these options, read the Edge TPU inferencing overview.

Next steps

Important: To sustain maximum performance, the Edge TPU must remain below the maximum operating temperature specified in the datasheet. By default, if the Edge TPU gets too hot, the PCIe driver slowly reduces the operating frequency and it may reset the Edge TPU to avoid permanent damage. To learn more, including how to configure the frequency scaling thresholds, read how to manage the PCIe module temperature.

To run some other models, such as real-time object detection, pose estimation, keyphrase detection, on-device transfer learning, and others, check out our example projects. In particular, if you want to try running a model with camera input, try one of the several camera examples.

If you want to train your own model, see the model training tutorials on coral.ai.

How the frequency throttling works and how to customize the trip points is covered in the PCIe module temperature documentation mentioned above.

Troubleshooting

Here are some solutions to possible problems on Linux.

HIB error

If you receive an error message such as the following when you run an inference...

HIB Error. hib_error_status = 0000000000002200, hib_first_error_status = 0000000000000200

... you should be able to solve it by modifying your kernel command line arguments to include gasket.dma_bit_mask=32.

For information about how to modify your kernel command line arguments, refer to your platform's documentation. For bootloaders based on U-Boot, you can usually modify the arguments either by editing the bootargs U-Boot environment variable or by setting the othbootargs environment variable as follows:

=> setenv othbootargs gasket.dma_bit_mask=32
=> printenv othbootargs
othbootargs=gasket.dma_bit_mask=32
=> saveenv

If you make the above change and then receive errors such as DMA: Out of SW-IOMMU space, you need to increase the swiotlb buffer size by adding another kernel command line argument: swiotlb=65536.
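
On a Raspberry Pi 5 there is no U-Boot by default; Raspberry Pi OS reads kernel arguments from the single line in /boot/firmware/cmdline.txt (older images use /boot/cmdline.txt). A sketch of the change:

sudo nano /boot/firmware/cmdline.txt
# append gasket.dma_bit_mask=32 (space-separated) to the end of the existing line, save, then:
sudo reboot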

pcieport error

If you see a lot of errors such as the following:

pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=0008(Transmitter ID)
pcieport 0000:00:01.0: device [10de:0fae] error status/mask=00003100/00002000
pcieport 0000:00:01.0: [ 8] RELAY_NUM Rollover
pcieport 0000:00:01.0: [12] Replay Timer Timeout
pcieport 0000:00:01.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0008(Requester ID)
pcieport 0000:00:01.0: device [10de:0fae] error status/mask=00004000/00000000

... you should be able to solve it by modifying your kernel command line arguments to include pcie_aspm=off.

For information about how to modify your kernel command line arguments, refer to your respective platform documentation. If your device includes U-Boot, see the previous HIB error for an example of how to modify the kernel commands. For certain other devices, you might instead add pcie_aspm=off to an APPEND line in your system /boot/extlinux/extlinux.conf file:

LABEL primary
      MENU LABEL primary kernel
      LINUX /boot/Image
      INITRD /boot/initrd
      APPEND ${cbootargs} quiet pcie_aspm=off
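
On Raspberry Pi OS, which does not use extlinux, the same argument goes on the single line in /boot/firmware/cmdline.txt (see the HIB error section above):

sudo nano /boot/firmware/cmdline.txt
# append pcie_aspm=off to the existing line, save, then reboot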

Workaround to disable Apex and Gasket

The following procedure is necessary only if your system includes a pre-built driver for Apex devices (as per the first steps for installing the PCIe driver). Due to a bug, updating this driver with ours can fail, so you need to first disable the apex and gasket modules as follows:

  1. Create a new file at /etc/modprobe.d/blacklist-apex.conf and add these two lines:
    blacklist gasket
    blacklist apex
  2. Reboot the system.
  3. Verify that the apex and gasket modules did not load by running this:
    lsmod | grep apex
    It should print nothing.
  4. Now follow the rest of the steps to install the PCIe driver.
  5. Finally, delete /etc/modprobe.d/blacklist-apex.conf and reboot your system.
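
Equivalently, steps 1 and 5 can be done from the command line:

printf 'blacklist gasket\nblacklist apex\n' | sudo tee /etc/modprobe.d/blacklist-apex.conf
# ...after the driver installation is complete:
sudo rm /etc/modprobe.d/blacklist-apex.conf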

PCIe (/dev/apex_0) driver not loaded

Ensure the kernel modules are loaded

Check whether the gasket and apex kernel modules are loaded properly:

lsmod | grep gasket
lsmod | grep apex

If they're not listed, try manually loading them:

sudo modprobe gasket
sudo modprobe apex

If you see error messages like the following, continue reading:

modprobe: FATAL: Module gasket not found in directory /lib/modules/6.1.0-rpi8-rpi-2712
modprobe: FATAL: Module apex not found in directory /lib/modules/6.1.0-rpi8-rpi-2712

The error messages from modprobe indicate that the gasket and apex modules are not found in your current kernel's module directory. This suggests that either the modules are not installed correctly, or they are not compatible with your current kernel version (6.1.0 for Raspberry Pi). Here are some steps you can take to address this issue:

  1. Ensure kernel headers are installed: For modules like gasket and apex to be built and installed properly, you need the kernel headers for your currently running kernel. Install the kernel headers with:

    sudo apt-get install raspberrypi-kernel-headers

    After installing the headers, try reinstalling the gasket-dkms and libedgetpu1-std packages, as DKMS should automatically build the modules against your current kernel:

    sudo apt-get reinstall gasket-dkms libedgetpu1-std
  2. Check the DKMS status: After installing the kernel headers and reinstalling the packages, check DKMS to see whether the gasket and apex modules have been built:

    dkms status

    This command will list all DKMS modules and their status. You're looking for gasket and apex to be listed as installed for your kernel version.

    You should roughly see the following output:

    Deprecated feature: REMAKE_INITRD (/var/lib/dkms/gasket/1.0/source/dkms.conf)
    Deprecated feature: REMAKE_INITRD (/var/lib/dkms/gasket/1.0/source/dkms.conf)
    gasket/1.0, 6.1.0-rpi8-rpi-2712, aarch64: installed
    gasket/1.0, 6.1.0-rpi8-rpi-v8, aarch64: installed
  3. Reboot

    sudo reboot

You can now continue with the rest of the steps to install the PCIe driver.
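
After the reboot, repeating the earlier checks should confirm that the modules are loaded and the device node exists:

lsmod | grep -E 'gasket|apex'
ls -l /dev/apex_0
dkms status | grep gasket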