Title | Venue | Backbone | Paper | Code |
---|---|---|---|---|
Vision Transformer Adapter for Dense Prediction | arXiv | ViT | https://arxiv.org/pdf/2205.08534.pdf | https://github.com/czczup/ViT-Adapter |
Masked-attention Mask Transformer for Universal Image Segmentation | CVPR | Swin Transformer | https://arxiv.org/pdf/2112.01527.pdf | https://github.com/facebookresearch/Mask2Former |
MPViT: Multi-Path Vision Transformer for Dense Prediction | CVPR | MPViT | https://arxiv.org/pdf/2112.11010.pdf | https://git.io/MPViT |
Revisiting Multi-Scale Feature Fusion for Semantic Segmentation | -- | EfficientNet | https://arxiv.org/pdf/2203.12683.pdf | -- |
Rethinking Semantic Segmentation: A Prototype View | CVPR | -- | https://arxiv.org/pdf/2203.13611.pdf | https://github.com/tfzhou/ProtoSeg |
Cross-Image Relational Knowledge Distillation for Semantic Segmentation | CVPR | -- | https://arxiv.org/pdf/2204.06986.pdf | https://github.com/winycg/CIRKD |
Title | Venue | Backbone | Paper | Code |
---|---|---|---|---|
SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers | NeurIPS | -- | https://arxiv.org/pdf/2105.15203.pdf | https://github.com/NVlabs/SegFormer |
Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers | CVPR | -- | https://arxiv.org/abs/2012.15840 | https://fudan-zvg.github.io/SETR/ |
Per-Pixel Classification is Not All You Need for Semantic Segmentation | NeurIPS | Transformer head | https://papers.nips.cc/paper/2021/file/950a4152c2b4aa3ad78bdd6b366cc179-Paper.pdf | https://github.com/facebookresearch/MaskFormer |
FaPN: Feature-aligned Pyramid Network for Dense Image Prediction | ICCV | ResNet | https://arxiv.org/pdf/2108.07058v2.pdf | https://github.com/EMI-Group/FaPN |
Interlaced Sparse Self-Attention for Semantic Segmentation | IJCV | -- | https://arxiv.org/pdf/1907.12273.pdf | -- |
ISNet: Integrate Image-Level and Semantic-Level Context for Semantic Segmentation | ICCV | -- | https://arxiv.org/abs/2108.12382 | https://github.com/SegmentationBLWX/sssegmentation |
Learning Debiased and Disentangled Representations for Semantic Segmentation | NeurIPS | -- | https://arxiv.org/pdf/2111.00531.pdf | https://github.com/sanghyeokchu/DropClass |
Title | Venue | Backbone | Paper | Code |
---|---|---|---|---|
SegFix: Model-Agnostic Boundary Refinement for Segmentation | ECCV | -- | https://link.springer.com/chapter/10.1007%2F978-3-030-58610-2_29 | https://github.com/openseg-group/openseg.pytorch |
Context Prior for Scene Segmentation | CVPR | -- | https://arxiv.org/pdf/2004.01547.pdf | https://github.com/ycszen/ContextPrior |
Disentangled Non-Local Neural Networks | ECCV | -- | https://arxiv.org/pdf/2006.06668.pdf | https://github.com/yinmh17/DNL-Semantic-Segmentation |
Deep High-Resolution Representation Learning for Visual Recognition | TPAMI | HRNet | https://arxiv.org/pdf/1908.07919.pdf | https://github.com/HRNet |
Title | Venue | Backbone | Paper | Code |
---|---|---|---|---|
Dual Attention Network for Scene Segmentation | CVPR | ResNet | https://arxiv.org/pdf/1809.02983.pdf | https://github.com/junfu1115/DANet/ |
Asymmetric Non-local Neural Networks for Semantic Segmentation | ICCV | ResNet | https://arxiv.org/pdf/1908.07678.pdf | https://github.com/MendelXu/ANN |
GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond | ECCV | ResNet | https://arxiv.org/pdf/1904.11492.pdf | https://github.com/xvjiarui/GCNet |
Dynamic Multi-scale Filters for Semantic Segmentation | ICCV | ResNet | https://openaccess.thecvf.com/content_ICCV_2019/papers/He_Dynamic_Multi-Scale_Filters_for_Semantic_Segmentation_ICCV_2019_paper.pdf | https://github.com/Junjun2016/DMNet |
Title | Venue | Backbone | Paper | Code |
---|---|---|---|---|
Rethinking Atrous Convolution for Semantic Image Segmentation | arXiv | ResNet | https://arxiv.org/pdf/1706.05587.pdf | https://github.com/tensorflow/models/tree/master/research/deeplab |