Moonlit is a collection of our model compression work for efficient AI.
ToP (@KDD'23): Constraint-aware and Ranking-distilled Token Pruning for Efficient Transformer Inference

ToP is a constraint-aware and ranking-distilled token pruning method that selectively removes unnecessary tokens as input sequences pass through the layers, improving online inference speed while preserving accuracy.
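As a rough illustration of the idea (not ToP's actual code), the sketch below keeps only the top-scoring tokens at one layer, using averaged attention as a stand-in for the learned, ranking-distilled importance scores; the function name and the score proxy are assumptions for this example:

```python
import torch

def prune_tokens(hidden_states: torch.Tensor,
                 attention_probs: torch.Tensor,
                 keep_ratio: float) -> torch.Tensor:
    """Keep the top keep_ratio fraction of tokens at one layer.

    hidden_states:   (batch, seq_len, dim)
    attention_probs: (batch, heads, seq_len, seq_len)
    """
    # Proxy importance: attention each token receives, averaged over heads
    # and query positions. ToP instead learns token importance by
    # distilling ranking knowledge rather than using raw attention.
    importance = attention_probs.mean(dim=(1, 2))                  # (batch, seq_len)
    k = max(1, int(hidden_states.size(1) * keep_ratio))
    keep = importance.topk(k, dim=-1).indices.sort(dim=-1).values  # preserve token order
    batch_idx = torch.arange(hidden_states.size(0)).unsqueeze(-1)  # (batch, 1)
    return hidden_states[batch_idx, keep]                          # (batch, k, dim)
```

Applied after each layer with a layer-specific keep_ratio, later layers operate on ever-shorter sequences, which is where the inference speedup comes from.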
SpaceEvo (@ICCV'23): SpaceEvo: Hardware-Friendly Search Space Design for Efficient INT8 Inference

SpaceEvo is an automatic method for designing a dedicated, quantization-friendly search space for target hardware. This work is featured on the Microsoft Research blog: Efficient and hardware-friendly neural architecture search with SpaceEvo.
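A minimal sketch of the underlying idea, evolutionary search-space design: candidate spaces are mutated and the fittest survive, where fitness in SpaceEvo reflects how well INT8-quantized architectures sampled from each space trade accuracy against latency on the target device (the `OPS` list and `score` function here are hypothetical placeholders, not the paper's code):

```python
import random

OPS = ["mbconv_relu6", "mbconv_hswish", "attention", "conv_swish"]  # illustrative op choices

def random_space(depth: int = 6) -> list[str]:
    return [random.choice(OPS) for _ in range(depth)]

def mutate(space: list[str]) -> list[str]:
    child = list(space)
    child[random.randrange(len(child))] = random.choice(OPS)
    return child

def score(space: list[str]) -> float:
    # Placeholder fitness. In SpaceEvo this is estimated from the INT8
    # accuracy and measured latency of architectures sampled from the space.
    return sum(op.endswith("relu6") for op in space) / len(space)

population = [random_space() for _ in range(20)]
for _ in range(50):
    parent = max(random.sample(population, 5), key=score)  # tournament selection
    population.append(mutate(parent))
    population.remove(min(population, key=score))          # keep population size fixed
best_space = max(population, key=score)
```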
ElasticViT (@ICCV'23): ElasticViT: Conflict-aware Supernet Training for Deploying Fast Vision Transformer on Diverse Mobile Devices

ElasticViT is a two-stage NAS approach that first trains a high-quality ViT supernet over a very large search space covering a wide range of mobile devices, and then searches for an optimal sub-network (subnet) that can be deployed directly.
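To make the two-stage flow concrete, here is a toy sketch; all search dimensions, numbers, and helper functions are illustrative assumptions, not ElasticViT's API:

```python
import random

# Hypothetical ViT search dimensions; the real space covers many more
# (depth, width, #heads, MLP ratios, input resolutions, ...).
SEARCH_SPACE = {"depth": [12, 14, 16], "embed_dim": [192, 256, 320], "mlp_ratio": [2, 3, 4]}

def sample_subnet() -> dict:
    return {k: random.choice(v) for k, v in SEARCH_SPACE.items()}

def predict_latency(cfg: dict) -> float:    # stand-in for a device latency model
    return 0.002 * cfg["depth"] * cfg["embed_dim"]

def estimate_accuracy(cfg: dict) -> float:  # stand-in for supernet-based evaluation
    return cfg["embed_dim"] * cfg["mlp_ratio"] / cfg["depth"]

# Stage 1 (schematic): train the supernet by activating one sampled subnet
# per step; ElasticViT's contribution is sampling these subnets in a
# conflict-aware way so weight-shared subnets train well together.
#   for batch in loader:
#       loss = supernet(batch, config=sample_subnet()); loss.backward(); ...

# Stage 2: search for the best subnet under a latency budget, reusing the
# supernet weights so the result deploys directly without retraining.
def search(budget_ms: float, n_candidates: int = 1000) -> dict:
    feasible = [c for c in (sample_subnet() for _ in range(n_candidates))
                if predict_latency(c) <= budget_ms]
    return max(feasible, key=estimate_accuracy)

best_subnet = search(budget_ms=8.0)
```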
LitePred (@NSDI'24): LitePred: Transferable and Scalable Latency Prediction for Hardware-Aware Neural Architecture Search

LitePred is a lightweight, transferable approach for accurately predicting DNN inference latency. Instead of training a latency predictor from scratch, LitePred is the first to transfer pre-existing latency predictors to new edge platforms, achieving accurate predictions at a profiling cost of less than one hour.
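A minimal sketch of the transfer step, assuming a simple MLP latency predictor and a hypothetical checkpoint path (LitePred's actual pipeline additionally selects which pre-existing predictors best match the new platform before adapting them):

```python
import torch
import torch.nn as nn

# Toy latency predictor: architecture features -> predicted latency (ms).
predictor = nn.Sequential(
    nn.Linear(16, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 1),
)
# Start from a predictor trained on a similar, already-profiled device
# ("pretrained_predictor.pt" is a hypothetical path).
predictor.load_state_dict(torch.load("pretrained_predictor.pt"))

# Fine-tune on a small set of latencies profiled on the new device, so
# profiling stays under an hour instead of training from scratch.
features, latencies = torch.randn(200, 16), torch.rand(200, 1)  # stand-in profiled data
opt = torch.optim.Adam(predictor.parameters(), lr=1e-3)
for _ in range(100):
    opt.zero_grad()
    loss = nn.functional.mse_loss(predictor(features), latencies)
    loss.backward()
    opt.step()
```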