This is the code repository for our paper "SPL-Net: Spatial-Semantic Patch Learning Network for Facial Attribute Recognition with Limited Labeled Data".
SPL-Net performs facial attribute recognition (FAR) effectively with limited labeled data via a two-stage learning procedure. In the first stage, three auxiliary tasks (PRT, PST, and PCT) are jointly developed to exploit the spatial-semantic information in large-scale unlabeled facial data, yielding a powerful pretrained MSS. In the second stage, only a small number of labeled facial images are used to fine-tune the pretrained MSS, producing the final FAR model.
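The two-stage procedure can be sketched in a few lines of PyTorch. This is a toy illustration only: the modules, shapes, and losses below are placeholders, not the repo's actual classes (the real backbone is the MSS, and the real stage-1 losses come from the PRT/PST/PCT tasks).

```python
# Toy sketch of the two-stage procedure; all module names and shapes are
# placeholders, not the classes used in this repository.
import torch
import torch.nn as nn

# Stand-in for the MSS backbone
backbone = nn.Sequential(nn.Flatten(), nn.Linear(48, 16), nn.ReLU())

# Stage 1: pretrain on auxiliary tasks with unlabeled data
aux_head = nn.Linear(16, 8)  # placeholder head for PRT/PST/PCT
opt = torch.optim.Adam(list(backbone.parameters()) + list(aux_head.parameters()))
for _ in range(2):  # a couple of toy optimization steps
    x = torch.rand(4, 3, 4, 4)                   # unlabeled faces
    loss = aux_head(backbone(x)).pow(2).mean()   # placeholder auxiliary loss
    opt.zero_grad(); loss.backward(); opt.step()

# Stage 2: fine-tune the pretrained backbone with a few labeled samples
far_head = nn.Linear(16, 40)  # e.g. 40 binary attributes, as in CelebA
opt2 = torch.optim.Adam(list(backbone.parameters()) + list(far_head.parameters()), lr=1e-4)
x, y = torch.rand(4, 3, 4, 4), torch.randint(0, 2, (4, 40)).float()
logits = far_head(backbone(x))
loss = nn.functional.binary_cross_entropy_with_logits(logits, y)
opt2.zero_grad(); loss.backward(); opt2.step()
```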
- Python 3.7.11
- torch 1.10.1
- torchvision 0.11.2
- The CelebA-HQ dataset is required. Randomly select 30 (or 300) images from CelebA-HQ and train BiSeNet-v2 on them to obtain a face-parsing model.
- The CelebA, LFWA, and MAAD datasets are required. Use the trained BiSeNet-v2 model to generate semantic masks for these three datasets; these masks are used to train the PST task in SPL-Net.
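Mask generation with the trained parsing model amounts to taking a per-pixel argmax over class logits. The sketch below uses a single 1x1 convolution as a stand-in for the trained BiSeNet-v2 checkpoint, and the class count is an assumption; only the argmax-to-mask step is the point.

```python
# Hedged sketch: turning a parsing model's logits into a semantic mask.
# `SegModel` is a stand-in; in practice, load the BiSeNet-v2 checkpoint
# trained on the CelebA-HQ subset instead.
import torch
import torch.nn as nn

NUM_CLASSES = 19  # assumed number of face-parsing classes

class SegModel(nn.Module):  # placeholder for the trained BiSeNet-v2
    def __init__(self):
        super().__init__()
        self.head = nn.Conv2d(3, NUM_CLASSES, kernel_size=1)

    def forward(self, x):
        return self.head(x)  # (N, C, H, W) per-pixel class logits

@torch.no_grad()
def generate_mask(model, image):
    """Convert per-pixel class logits into an integer semantic mask."""
    model.eval()
    logits = model(image.unsqueeze(0))      # (1, C, H, W)
    return logits.argmax(dim=1).squeeze(0)  # (H, W) class indices

model = SegModel()
mask = generate_mask(model, torch.rand(3, 64, 64))
```

The resulting `(H, W)` integer map would then be saved per image (e.g. as a PNG) for each of the three datasets.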
- Modify `config_files/test.yml`
- Run `python train_pretext_5b_adv.py`
- Modify `config_files/train_downstream.yml`
- Set `PRETRAIN_EPOCH` and `PRETRAIN_PTH` based on the model saved in stage 1
- Run `python train_downstream.py`
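For illustration, the stage-2 config edit might look like the fragment below. Only the two key names come from the instructions above; the values and file layout are hypothetical, so match them to your own stage-1 checkpoint.

```yaml
# Hypothetical fragment of config_files/train_downstream.yml;
# values are placeholders -- point them at your stage-1 checkpoint.
PRETRAIN_EPOCH: 100
PRETRAIN_PTH: ./checkpoints/pretext_epoch_100.pth
```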