This repository contains the code and artifact evaluation guide for "FlexNN: Efficient and Adaptive DNN Inference on Memory-Constrained Edge Devices" (ACM MobiCom 2024). [ABS] [PDF]
The artifact has been awarded all four badges (Artifacts Available, Artifacts Evaluated - Functional, Artifacts Evaluated - Reusable, Results Replicated) in the MobiCom 2024 Artifact Evaluation.
Please refer to Evaluation.md for detailed instructions to evaluate FlexNN and reproduce the main evaluation results.
FlexNN is an efficient and adaptive memory management framework for DNN inference on memory-constrained devices.
For a given memory budget and DNN model, it uses a slicing-loading-computing joint planning approach to achieve optimal memory utilization with minimal memory management overhead.
FlexNN adopts a two-stage design: the offline planning stage performs slicing-loading-computing joint planning according to the memory budget and the given model, and the online execution stage runs model inference based on the offline-generated plans. FlexNN is efficient and adaptive thanks to the following designs:
- Bottleneck-aware layer slicing (§3.2), which significantly reduces peak memory usage through fine-grained partitioning of "bottleneck" layers (see the sketch after this list).
- Preload-aware memory planning (§3.3), which reduces memory fragmentation and I/O waiting time by accounting for the weight-preloading process at the offline planning stage.
- Online execution designs (dependency-based synchronization and type-based static allocation, §3.4), which fill the gaps between the planning results and the actual execution.
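To make the slicing idea concrete, below is a minimal, self-contained C++ sketch. It is not FlexNN source code: the channel-wise split granularity, the struct, and all function names are assumptions made purely for illustration of how partitioning a large convolution along its output channels bounds the amount of weight memory that must be resident at any time.

```cpp
// Illustrative sketch only (not FlexNN source code): slicing a large
// "bottleneck" convolution along its output channels so that only one
// slice of weights has to be resident in memory at a time.
// All names and the channel-wise split granularity are assumptions.
#include <algorithm>
#include <cstddef>
#include <cstdio>

struct ConvLayer {
    size_t in_channels, out_channels;
    size_t kernel_h, kernel_w;

    // Weight bytes for a contiguous range of output channels [begin, end).
    size_t slice_weight_bytes(size_t begin, size_t end) const {
        return (end - begin) * in_channels * kernel_h * kernel_w * sizeof(float);
    }
};

// Choose the largest slice whose weights fit into the budget, then walk the
// layer slice by slice: load weights -> compute -> release.
void run_sliced(const ConvLayer& conv, size_t weight_budget_bytes) {
    size_t per_channel = conv.slice_weight_bytes(0, 1);
    size_t channels_per_slice = std::max<size_t>(1, weight_budget_bytes / per_channel);

    for (size_t c = 0; c < conv.out_channels; c += channels_per_slice) {
        size_t end = std::min(conv.out_channels, c + channels_per_slice);
        // In FlexNN, the next slice's weights would be preloaded asynchronously
        // while the current slice is computing (preload-aware planning, §3.3).
        std::printf("slice [%zu, %zu): %zu bytes of weights resident\n",
                    c, end, conv.slice_weight_bytes(c, end));
        // load_weights(...); compute(...); release_weights(...);  // hypothetical
    }
}

int main() {
    ConvLayer conv{512, 512, 3, 3};                     // a typical heavy layer
    run_sliced(conv, /*weight_budget_bytes=*/2 << 20);  // 2 MiB weight budget
    return 0;
}
```

In the actual system, slice sizes are not chosen per layer in isolation; they are decided jointly with the loading and computing schedule, as described in §3.2 and §3.3 of the paper.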
For more details, please refer to our paper.
FlexNN is built atop NCNN, a high-performance neural network inference framework optimized for mobile platforms. FlexNN is implemented and optimized primarily for floating-point CNN inference on ARMv8 CPUs.
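Since FlexNN is built atop NCNN, the standard NCNN C++ inference flow is a useful reference point. The snippet below is plain NCNN usage, not FlexNN-specific; the file names and the blob names "data"/"output" are placeholders that depend on the converted model. FlexNN's online execution adds plan-driven memory management on top of this kind of flow.

```cpp
// Plain NCNN inference for reference (standard NCNN API, not FlexNN-specific).
// "model.param"/"model.bin" and the blob names "data"/"output" are placeholders.
#include "net.h"   // from NCNN

#include <cstdio>

int main() {
    ncnn::Net net;
    net.opt.num_threads = 4;                 // CPU threads used for inference

    if (net.load_param("model.param") != 0 || net.load_model("model.bin") != 0) {
        std::fprintf(stderr, "failed to load model\n");
        return -1;
    }

    ncnn::Mat in(224, 224, 3);               // W x H x C input tensor
    in.fill(0.5f);                           // dummy data for illustration

    ncnn::Extractor ex = net.create_extractor();
    ex.input("data", in);                    // feed the input blob

    ncnn::Mat out;
    ex.extract("output", out);               // run the graph up to this blob

    std::printf("output shape: %d x %d x %d\n", out.w, out.h, out.c);
    return 0;
}
```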
Given a memory budget and a DNN model, FlexNN flexibly slices the model and plans the memory allocations and execution order. The complete steps and the corresponding compiled binaries are as follows:
- Offline Planning
  - Layer Slicing - implemented in `flexnnslice`.
    - Input: memory budget, model.
    - Output: sliced model.
  - Memory Profiling - implemented in `flexnnprofile`.
    - Input: sliced model.
    - Output: profiles.
  - Memory Planning - implemented in `flexnnschedule` (a toy illustration of this step follows the list).
    - Input: profiles.
    - Output: plans (dependencies and allocations).
- Online Execution
  - Model Inference - implemented in `benchflexnn`.
    - Input: memory budget, model, plans.
    - Output: inference results.
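To give a rough feel for what the memory-planning step produces, here is a minimal, self-contained sketch. This is not the FlexNN planner: the greedy placement strategy, the data layout, and all names are assumptions made for illustration of mapping profiled tensor sizes and lifetimes to offsets in a single statically allocated arena, reusing space whenever lifetimes do not overlap.

```cpp
// Toy sketch of offline memory planning (not FlexNN's actual algorithm):
// given profiled tensors with byte sizes and [first_use, last_use] lifetimes,
// assign each an offset inside one statically allocated arena, reusing space
// whenever two tensors are never alive at the same time.
#include <cstddef>
#include <cstdio>
#include <vector>

struct TensorProfile {
    size_t bytes;
    int first_use;   // index of the first layer/slice that touches the tensor
    int last_use;    // index of the last layer/slice that touches the tensor
};

struct Placement {
    size_t offset, bytes;
    int first_use, last_use;
};

// Greedy placement: bump each tensor past any already-placed tensor it
// conflicts with (overlapping lifetime AND overlapping byte range), and
// repeat until a conflict-free offset is found.
std::vector<Placement> plan(const std::vector<TensorProfile>& tensors) {
    std::vector<Placement> placed;
    for (const TensorProfile& t : tensors) {
        size_t offset = 0;
        bool moved = true;
        while (moved) {
            moved = false;
            for (const Placement& p : placed) {
                bool lifetimes_overlap = t.first_use <= p.last_use && p.first_use <= t.last_use;
                bool ranges_overlap = offset < p.offset + p.bytes && p.offset < offset + t.bytes;
                if (lifetimes_overlap && ranges_overlap) {
                    offset = p.offset + p.bytes;   // bump past the conflict and rescan
                    moved = true;
                }
            }
        }
        placed.push_back({offset, t.bytes, t.first_use, t.last_use});
    }
    return placed;
}

int main() {
    // Three activation tensors; the third is never alive together with the
    // first two, so the planner can give it offset 0 again.
    std::vector<TensorProfile> profiles = {
        {4u << 20, 0, 1},
        {2u << 20, 1, 2},
        {4u << 20, 3, 4},
    };
    for (const Placement& p : plan(profiles))
        std::printf("offset %8zu  size %8zu  live [%d, %d]\n",
                    p.offset, p.bytes, p.first_use, p.last_use);
    return 0;
}
```

Besides allocations, the real plans also contain the loading/computing dependencies that the online execution stage uses for synchronization (§3.3-§3.4).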
For more details, please refer to our paper and Evaluation.md.
Please refer to Evaluation.md for detailed instructions to deploy and evaluate FlexNN.
The current FlexNN implementation is mainly for research purposes and has a few limitations; please refer to Evaluation.md for details.
If you find FlexNN useful for your research, please cite our paper.
@inproceedings{li2024flexnn,
title={FlexNN: Efficient and Adaptive DNN Inference on Memory-Constrained Edge Devices},
author={Li, Xiangyu and Li, Yuanchun and Li, Yuanzhe and Cao, Ting and Liu, Yunxin},
booktitle={Proceedings of the 30th Annual International Conference on Mobile Computing and Networking},
pages={709--723},
year={2024}
}