TensorFlow Lite for Microcontrollers (TFLM) is a port of TensorFlow Lite designed to run machine learning models on DSPs, microcontrollers and other devices with limited memory.
We have implemented optimized TFLM NN kernels (mainly using the P and V extensions) for Nuclei RISC-V processors, and you can run the TFLM examples using software emulation or on an FPGA board.
TFLM has been ported to the Nuclei RISC-V Processor and Nuclei SDK. You can evaluate it in Nuclei SDK or in the official TFLM build system.
- If you want to use it directly in the TFLM build system, please check this repo: https://github.com/Nuclei-Software/tflite-micro/tree/nuclei/nsdk_0.6.0
- If you want to use it in Nuclei SDK or Nuclei Studio as a software component, follow the steps below.
There are two ways to use the Nuclei SDK TFLM component:
- Use Nuclei SDK 0.6.0 in terminal
- Use Nuclei Studio IDE 2024.06
To use Nuclei SDK 0.6.0 in a terminal:
- Get Nuclei SDK (v0.6.0) from https://github.com/Nuclei-Software/nuclei-sdk/releases/tag/0.6.0
- Get the tflm zip package from https://github.com/Nuclei-Software/npk-tflm, unzip it, and put it under the Components folder of $NUCLEI_SDK_ROOT.
    nuclei-sdk$ tree -L 2
    .
    ├── application
    │   ├── baremetal
    │   ├── freertos
    │   ├── rtthread
    │   ├── threadx
    │   └── ucosii
    ├── Build
    │   ├── gmsl
    │   ├── Makefile.base
    │   ├── Makefile.components
    │   ├── Makefile.conf
    │   ├── Makefile.core
    │   ├── Makefile.files
    │   ├── Makefile.misc
    │   ├── Makefile.rtos
    │   ├── Makefile.rules
    │   ├── Makefile.soc
    │   ├── Makefile.toolchain
    │   └── toolchain
    ├── Components
    │   ├── profiling
    │   └── tflm
    ├── doc
    │   ├── Makefile
    │   ├── requirements.txt
    │   └── source
    ├── ideprojects
    │   └── iar
    ├── LICENSE
    ├── Makefile
    ├── NMSIS
    │   ├── build.mk
    │   ├── Core
    │   ├── DSP
    │   ├── Library
    │   ├── NN
    │   └── npk.yml
    ├── NMSIS_VERSION
    ├── npk.yml
    ....
- Set up the tools and environment. For details, refer to https://doc.nucleisys.com/nuclei_sdk/quickstart.html.
- Build and run the application.
The following assumes the application runs on the Nuclei evalsoc with an nx900fd CPU.
Run on QEMU (software emulation):
    cd Components/tflm/examples/xxx
    make SOC=evalsoc CORE=nx900fd DOWNLOAD=ilm clean
    make SOC=evalsoc CORE=nx900fd DOWNLOAD=ilm all
    make SOC=evalsoc CORE=nx900fd DOWNLOAD=ilm run_qemu
    # Select ARCH_EXT, for example _xxldsp, v, or v_xxldsp; the pure C version is used if ARCH_EXT is not selected
    ## _xxldsp: Nuclei DSP extension present
    ## v: V extension present
    ## v_xxldsp: Nuclei DSP and V extensions present
    make SOC=evalsoc CORE=nx900fd ARCH_EXT=v_xxldsp DOWNLOAD=ilm all
    make SOC=evalsoc CORE=nx900fd ARCH_EXT=v_xxldsp DOWNLOAD=ilm run_qemu
Run on an FPGA board:
Prepare the correct FPGA board and bitstream (contact Nuclei AE); a bitstream with 512K ILM/DLM is preferred.
Configure the board and open a UART terminal (the default UART baud rate is 115200), then download the executable file.

    cd Components/tflm/examples/xxx
    make SOC=evalsoc CORE=nx900fd DOWNLOAD=ilm clean all
    make SOC=evalsoc CORE=nx900fd DOWNLOAD=ilm upload

The result is then printed in the UART terminal.
Taking tflm/examples/person_detection as an example:

    Nuclei SDK Build Time: May 26 2023, 11:05:15
    Download Mode: ILM
    CPU Frequency 999999078 Hz
    CPU HartID: 0
    person score:-72 no person score 72
    person score:-72 no person score 72
    person score:-72 no person score 72
    person score:-72 no person score 72
    person score:-72 no person score 72
    person score:-72 no person score 72
    person score:-72 no person score 72
    person score:-72 no person score 72
    person score:-72 no person score 72
    person score:-72 no person score 72
    person score:-72 no person score 72
    person score:-72 no person score 72
    person score:-72 no person score 72
    person score:-72 no person score 72
    person score:-72 no person score 72
    person score:-72 no person score 72
    person score:-72 no person score 72
    person score:-72 no person score 72
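To compare the pure C kernels with the optimized variants, the QEMU build commands shown earlier can be generated in a loop over the ARCH_EXT values. The following is a minimal sketch, not part of Nuclei SDK: the helper names tflm_make_cmds and tflm_all_variants are hypothetical, and the functions only print the make commands (a dry run) so they can be reviewed first. It assumes that omitting ARCH_EXT selects the pure C version, as described above.

```shell
#!/bin/sh
# Print (but do not run) the make commands for one example build.
# $1: ARCH_EXT value (_xxldsp, v, v_xxldsp), or empty for the pure C version.
# tflm_make_cmds is a hypothetical helper name, not an SDK command.
tflm_make_cmds() {
    ext_arg=""
    if [ -n "$1" ]; then
        ext_arg=" ARCH_EXT=$1"
    fi
    common="make SOC=evalsoc CORE=nx900fd${ext_arg} DOWNLOAD=ilm"
    echo "$common clean"
    echo "$common all"
    echo "$common run_qemu"
}

# Dry run over the pure C build and the three ARCH_EXT variants.
tflm_all_variants() {
    for ext in "" _xxldsp v v_xxldsp; do
        tflm_make_cmds "$ext"
    done
}
```

Run from Components/tflm/examples/xxx, the printed commands could then be executed in order with, for example, tflm_all_variants | sh -e, which stops at the first failing command.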
FAQs: The default ILM/DLM size in evalsoc is 64K/64K; it must be changed to 512K to run these examples.
If you encounter an error such as "section .text will not fit in region ilm", the ILM is not large enough to hold the code. 64K is not enough to run this application, so please use 512K. If you want to run on hardware, make sure your CPU bitstream is configured with 512K ILM/DLM.
/* file: /path/to/nuclei_sdk/SoC/evalsoc/Board/nuclei_fpga_eval/Source/GCC/gcc_evalsoc_ilm.ld */
/* Partial contents: */
OUTPUT_ARCH( "riscv" )
ENTRY( _start )
MEMORY
{
ilm (rxa!w) : ORIGIN = 0x80000000, LENGTH = 512K /* change 64K to 512K */
ram (wxa!r) : ORIGIN = 0x90000000, LENGTH = 512K /* change 64K to 512K */
}
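The linker script change can also be scripted. The following is a minimal sketch, assuming the ilm and ram region lines in your copy of the linker script match the partial listing above, with LENGTH = 64K before the change; the helper name set_ilm_dlm_512k is hypothetical. It rewrites only lines that begin with ilm or ram and keeps a .bak backup, so review the result before building.

```shell
#!/bin/sh
# Change LENGTH = 64K to LENGTH = 512K on the ilm and ram region lines.
# $1: path to the linker script. A backup is written to "$1.bak".
# set_ilm_dlm_512k is a hypothetical helper name, not an SDK tool.
set_ilm_dlm_512k() {
    cp "$1" "$1.bak" || return 1
    sed -E '/^[[:space:]]*(ilm|ram)[[:space:]]*\(/ s/LENGTH[[:space:]]*=[[:space:]]*64K/LENGTH = 512K/' \
        "$1.bak" > "$1"
}
```

For example: set_ilm_dlm_512k /path/to/nuclei_sdk/SoC/evalsoc/Board/nuclei_fpga_eval/Source/GCC/gcc_evalsoc_ilm.ld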
To use Nuclei Studio IDE 2024.06:
- Download Nuclei Studio IDE 2024.06 from https://www.nucleisys.com/download.php
Refer to the Nuclei IDE User Guide if necessary.
- Open the Nuclei Studio IDE
- Download the zip package of Nuclei SDK
Make sure the SDK version is 0.6.0.
- Import the zip package of tflm
- Create a new Nuclei RISC-V C/C++ Project
Note: If you encounter a memory overflow error when building the project, you can use the DDR download mode (supported by evalsoc with Nuclei 600/900 series processors), which meets the memory requirement.