What is MicroPyTorch?

It’s a PyTorch binding for MicroPython. We produced an executable that runs PyTorch eager code using the MicroPython interpreter and the PyTorch operator library.
It supports the same PyTorch operators with the same function schemas, including in-place / out-place / functional variants, overload resolution, etc. We've tested ~20 commonly used operators, ~40 overloads in total.
It can run unmodified eager model code, including code that uses torch.nn modules. We've tested AlexNet - other, more complex models should work with some extra effort.
The uncompressed x86-64 macOS binary that runs AlexNet (inference only) is <1MB, including the MicroPython runtime and the selected ATen CPU kernels for the selected dtypes. The compressed x86-64 binary (runtime + ops) is ~430KB. That is much smaller than existing PyTorch runtimes that run Python models directly:
- CPython 3.8.3 alone is ~2.6MB (built with the "-Os" compiler flag);
- The stripped libtorch-cpu.so is >50MB;
- The PyTorch mobile OSS prebuilt library (including all forward ops, uncompressed) is >20MB;
- The PyTorch mobile OSS selective build (including ops for MobileNetV2, compressed arm-v7) is ~4.5MB;
- We haven’t squeezed all the juice yet - the binary still contains a REPL interactive shell and MicroPython built-in modules which are not necessary for mobile apps.
The simple-add microbenchmark shows that it’s ~10% faster than CPython linked against the same prebuilt libtorch 1.7.1. The CPU-only static dispatch variant is even faster.
It can run on an ESP32 microcontroller, which has only 520KB RAM / 4MB flash.

Quick Start

Check out the repo and submodules (MicroPython 1.13 + PyTorch dev branch + ESP-IDF SDK)

git clone --recursive https://github.com/ljk53/upytorch

Option A) Build the MicroPython + PyTorch binding locally and dynamically link with the prebuilt LibTorch 1.7 from the official website.

LIBTORCH=prebuilt make test

Option B) Build everything locally, dynamically link with libtorch, and run unit tests.

make test

Option C) Build and statically link with a customized libtorch that only includes selected ops/dtypes/features for specific models.

# The selective builds do not necessarily include all ops needed to pass the unit tests.

# To include the ~40 tested op overloads
LIBTORCH=local_lite OP_SELECTION_YAML=tools/dev.yaml make

# To only include ops needed by AlexNet
LIBTORCH=local_lite OP_SELECTION_YAML=tools/alexnet.yaml make

# To not include any op
LIBTORCH=local_lite OP_SELECTION_YAML=tools/noop.yaml make

Option D) Build ESP32 firmware

# Check out the CI job: https://github.com/ljk53/upytorch/blob/main/.github/workflows/make-esp.yaml

# ESP-IDF SDK is included as git submodule. Set environment variable to its location.
export IDF_PATH=esp32/esp-idf

# Install the toolchain.
$IDF_PATH/install.sh

# Set the path to the toolchain.
source $IDF_PATH/export.sh

# Install PyTorch dependencies to the toolchain's virtual env.
pip3 install -r pytorch/requirements.txt

# Kick off the build.
LIBTORCH=local_esp make

Launch the REPL shell to play with it

MicroPython v1.13 on 2020-12-10; linux version
Use Ctrl-D to exit, Ctrl-E for paste mode
>>> import torch
>>> a = torch.eye(3, 4)
>>> a.mul_(torch.ones(4).mul_(5))
 5  0  0  0
 0  5  0  0
 0  0  5  0
[ CPUFloatType{3,4} ]
>>> torch.mul(a, 3, out=a)
 15   0   0   0
  0  15   0   0
  0   0  15   0
[ CPUFloatType{3,4} ]
>>> a.sum(dtype=int)
45
[ CPULongType{} ]
>>>
>>> a = torch.ones(1, 1, 5, 5)
>>> b = torch.ones(1, 1, 3, 3)
>>> torch.conv2d(a, b, stride=[2, 2], padding=[2, 2], dilation=[2, 2])
(1,1,.,.) =
  4  6  4
  6  9  6
  4  6  4
[ CPUFloatType{1,1,3,3} ]
>>>
>>> import torch.nn as nn
>>> L = nn.Linear(6 * 6, 6)
>>> L.forward(torch.ones(1, 6 * 6))
0.0001 *
-7.6302 -6.7863 -8.4434 -7.6196 -6.8142 -7.6102
[ CPUFloatType{1,6} ]
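
The conv2d result in the session above can be cross-checked by hand. Below is a minimal pure-Python sketch of a single-channel convolution with stride, zero padding and dilation. The helper name `conv2d_single` is hypothetical (it is not part of torch or this repo), and the code only illustrates the arithmetic, not how the ATen kernel is implemented.

```python
def conv2d_single(inp, kernel, stride, padding, dilation):
    """Naive single-channel 2D cross-correlation with zero padding."""
    h, w = len(inp), len(inp[0])
    kh, kw = len(kernel), len(kernel[0])
    # Effective kernel extent once dilation spreads the taps apart.
    eff_h = dilation * (kh - 1) + 1
    eff_w = dilation * (kw - 1) + 1
    out_h = (h + 2 * padding - eff_h) // stride + 1
    out_w = (w + 2 * padding - eff_w) // stride + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    for oy in range(out_h):
        for ox in range(out_w):
            acc = 0.0
            for ky in range(kh):
                for kx in range(kw):
                    # Map the kernel tap back to unpadded input coordinates.
                    y = oy * stride + ky * dilation - padding
                    x = ox * stride + kx * dilation - padding
                    # Taps that fall into the zero padding contribute nothing.
                    if 0 <= y < h and 0 <= x < w:
                        acc += inp[y][x] * kernel[ky][kx]
            out[oy][ox] = acc
    return out


# Same shapes and arguments as the REPL example: 5x5 ones, 3x3 ones kernel.
a = [[1.0] * 5 for _ in range(5)]
b = [[1.0] * 3 for _ in range(3)]
result = conv2d_single(a, b, stride=2, padding=2, dilation=2)
for row in result:
    print(row)
```

With an all-ones input and kernel, each output value simply counts the kernel taps that land inside the unpadded 5x5 input: 2, 3 and 2 per axis, which gives the 4 6 4 / 6 9 6 / 4 6 4 grid printed by the REPL.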

Binary Size Inspection / MicroBenchmark

We set up a GitHub Actions workflow to continuously measure binary size changes and microbenchmark results on macOS and Ubuntu. The results can be found on the Actions tab.

Sample binary artifacts

Name	Size
platform-macos.libtorch-prebuilt.ops-dev	499 KB
platform-macos.libtorch-lite.ops-noop	688 KB
platform-macos.libtorch-lite.ops-alexnet	995 KB
platform-macos.libtorch-lite.ops-dev-with-dummy	1.96 MB
platform-ubuntu.libtorch-prebuilt.ops-dev	540 KB
platform-ubuntu.libtorch-lite.ops-noop	905 KB
platform-ubuntu.libtorch-lite.ops-alexnet	1.21 MB
platform-ubuntu.libtorch-lite.ops-dev-with-dummy	2.4 MB
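
One way to read the table is to subtract the ops-noop build from the other statically linked builds for the same platform, which roughly isolates the size contributed by the selected operators. The sketch below does that arithmetic. It assumes that ops-noop corresponds to the "no op" selection from Option C and that 1 MB = 1000 KB; the table does not state either, so treat the results as approximate.

```python
# Sizes copied from the artifact table above, converted to KB.
# Assumption: 1 MB = 1000 KB (the table does not say which convention is used).
SIZES_KB = {
    ("macos", "noop"): 688,
    ("macos", "alexnet"): 995,
    ("macos", "dev-with-dummy"): 1960,
    ("ubuntu", "noop"): 905,
    ("ubuntu", "alexnet"): 1210,
    ("ubuntu", "dev-with-dummy"): 2400,
}


def ops_overhead_kb(platform, ops):
    """Size added on top of the no-op build for the same platform."""
    return SIZES_KB[(platform, ops)] - SIZES_KB[(platform, "noop")]


for platform in ("macos", "ubuntu"):
    for ops in ("alexnet", "dev-with-dummy"):
        print(platform, ops, ops_overhead_kb(platform, ops), "KB")
```

The libtorch-prebuilt artifacts are left out because they link libtorch dynamically, so their size does not include the operator library.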

Sample wall-time microbenchmark result

System: Linux fv-az59-708 5.4.0-1032-azure #33~18.04.1-Ubuntu SMP Tue Nov 17 11:40:52 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Installed PyTorch: 1.7.1+cpu

# CPython (prebuilt libtorch)
                                         name       ns (avg)       ns (min)          stdev
                       add_s1_outplace_N10000        2537.68        2305.98         262.05
                      add_s1_outplace_N100000        1854.04        1705.02          86.73
                                add_s1_N10000        3173.71        3016.07         161.67
                               add_s1_N100000        2700.19        2605.55          87.78

# MicroPython - Option A (prebuilt libtorch)
                                         name       ns (avg)       ns (min)          stdev
                       add_s1_outplace_N10000        1365.29        1244.62          83.08
                      add_s1_outplace_N100000        1347.03        1309.73          33.61
                                add_s1_N10000        2265.60        2022.70         181.24
                               add_s1_N100000        2664.79        2393.19         153.12

# MicroPython - Option C (optimized size and perf)
                                         name       ns (avg)       ns (min)          stdev
                       add_s1_outplace_N10000        1040.23         980.40          57.20
                      add_s1_outplace_N100000        1047.44         989.96          60.01
                                add_s1_N10000        1265.49        1125.41         163.21
                               add_s1_N100000        1362.06        1250.22          85.14
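
To compare the three runs at a glance, the average times can be turned into ratios against the CPython baseline. This is a small sketch using only the "ns (avg)" column copied from the sample above; the dictionary and function names are made up for illustration. The numbers come from a single CI run with a noticeable standard deviation, so the ratios are indicative at best.

```python
# "ns (avg)" values copied from the sample result above.
CPYTHON = {
    "add_s1_outplace_N10000": 2537.68,
    "add_s1_outplace_N100000": 1854.04,
    "add_s1_N10000": 3173.71,
    "add_s1_N100000": 2700.19,
}
UPY_PREBUILT = {
    "add_s1_outplace_N10000": 1365.29,
    "add_s1_outplace_N100000": 1347.03,
    "add_s1_N10000": 2265.60,
    "add_s1_N100000": 2664.79,
}
UPY_LITE = {
    "add_s1_outplace_N10000": 1040.23,
    "add_s1_outplace_N100000": 1047.44,
    "add_s1_N10000": 1265.49,
    "add_s1_N100000": 1362.06,
}


def speedup(baseline, candidate):
    """Ratio baseline/candidate per benchmark; values above 1 mean candidate is faster."""
    return {name: baseline[name] / candidate[name] for name in baseline}


for label, run in (("Option A", UPY_PREBUILT), ("Option C", UPY_LITE)):
    for name, ratio in speedup(CPYTHON, run).items():
        print(f"{label}  {name:26s} {ratio:.2f}x")
```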

Sample valgrind instruction count benchmark result

# CPython (prebuilt libtorch)
Run ID                                            2N Insts #      N Insts #    Avg Insts #
py3.prebuilt.simple_add.add_s1_outplace.N3000     1510584902     1483545156           9013
py3.prebuilt.simple_add.add_s1_outplace.N6000     1562349179     1509447061           8817
py3.prebuilt.simple_add.add_s1.N3000              1528046284     1492903000          11714
py3.prebuilt.simple_add.add_s1.N6000              1600161890     1528326434          11972

# MicroPython - Option A (prebuilt libtorch)
Run ID                                            2N Insts #      N Insts #    Avg Insts #
upy.prebuilt.simple_add.add_s1_outplace.N3000      542848755      518788766           8019
upy.prebuilt.simple_add.add_s1_outplace.N6000      590984876      542848755           8022
upy.prebuilt.simple_add.add_s1.N3000               558037879      526252708          10595
upy.prebuilt.simple_add.add_s1.N6000               621387655      558037879          10558

# MicroPython - Option C (optimized size and perf)
Run ID                                            2N Insts #      N Insts #    Avg Insts #
upy.lite.dev.simple_add.add_s1_outplace.N3000       41404171       23907421           5832
upy.lite.dev.simple_add.add_s1_outplace.N6000       76415502       41404171           5835
upy.lite.dev.simple_add.add_s1.N3000                46173148       26292985           6626
upy.lite.dev.simple_add.add_s1.N6000                85949713       46173148           6629
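
The "Avg Insts #" column is consistent with taking the difference between the 2N and N runs and dividing by N (rounded down), which cancels the fixed startup cost such as interpreter initialization. That reading is inferred from the numbers in the tables, not from the benchmark scripts, so the sketch below is an assumption about how the column is derived. The function names are hypothetical.

```python
def avg_insts(insts_2n, insts_n, n):
    """Per-iteration instruction count with the fixed startup cost cancelled out."""
    return (insts_2n - insts_n) // n


def reduction(baseline, candidate):
    """Fraction of instructions saved per iteration relative to the baseline."""
    return 1.0 - candidate / baseline


# (2N Insts #, N Insts #, N) copied from the add_s1_outplace.N3000 rows above.
ROWS = {
    "py3.prebuilt.simple_add.add_s1_outplace.N3000": (1510584902, 1483545156, 3000),
    "upy.prebuilt.simple_add.add_s1_outplace.N3000": (542848755, 518788766, 3000),
    "upy.lite.dev.simple_add.add_s1_outplace.N3000": (41404171, 23907421, 3000),
}

averages = {run_id: avg_insts(*row) for run_id, row in ROWS.items()}
baseline = averages["py3.prebuilt.simple_add.add_s1_outplace.N3000"]
for run_id, avg in averages.items():
    print(f"{run_id:48s} {avg:6d}  {reduction(baseline, avg):6.1%} fewer than CPython")
```

Under this reading, Option A executes roughly 11% fewer instructions per add than CPython for this row, and Option C roughly 35% fewer. The "N Insts #" column also suggests that most of the gap between the runs is startup cost rather than per-call cost.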
