no weight decay setting error in SimMIM pretraining #369

Open
wanghaoyucn opened this issue Sep 26, 2024 · 0 comments
wanghaoyucn commented Sep 26, 2024

@torch.jit.ignore
def no_weight_decay_keywords(self):
    if hasattr(self.encoder, 'no_weight_decay_keywords'):
        return {'encoder.' + i for i in self.encoder.no_weight_decay_keywords()}
    return {}

Hello, I found that this function in class SimMIM returns {'encoder.cpb_mlp', 'encoder.logit_scale', 'encoder.relative_position_bias_table'}. When this set is passed into build_optimizer, it eventually calls check_keywords_in_name(name, skip_keywords) to decide whether a parameter's weight decay should be set to 0.

def check_keywords_in_name(name, keywords=()):
    isin = False
    for keyword in keywords:
        if keyword in name:
            isin = True
    return isin
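
For context, the parameter grouping inside build_optimizer works roughly like the following during pretraining (a minimal sketch under my reading of the code, not the repository's exact implementation; get_param_groups is a hypothetical name):

def get_param_groups(model, skip_list=(), skip_keywords=()):
    # Sketch: parameters whose names match a skip keyword (or appear in
    # skip_list) go into the group whose weight decay is forced to 0.
    has_decay, no_decay = [], []
    for name, param in model.named_parameters():
        if not param.requires_grad:
            continue
        if name in skip_list or check_keywords_in_name(name, skip_keywords):
            no_decay.append(param)
        else:
            has_decay.append(param)
    return [{'params': has_decay},
            {'params': no_decay, 'weight_decay': 0.}]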

Sadly, 'encoder.cpb_mlp' in 'encoder.layers.0.blocks.0.attn.cpb_mlp.0.bias' evaluates to False, because cpb_mlp sits several modules below the encoder, so the prefixed keyword is never a substring of the actual parameter name. This means the weight decay of cpb_mlp is not set to 0 during pretraining. A correct implementation of no_weight_decay_keywords would be:

@torch.jit.ignore
def no_weight_decay_keywords(self):
    if hasattr(self.encoder, 'no_weight_decay_keywords'):
        return {i for i in self.encoder.no_weight_decay_keywords()}
    return {}
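
A quick check with one encoder parameter name makes the difference visible (this just reuses check_keywords_in_name from above on the example name already mentioned):

name = 'encoder.layers.0.blocks.0.attn.cpb_mlp.0.bias'

print(check_keywords_in_name(name, {'encoder.cpb_mlp'}))  # False: the prefixed keyword never matches
print(check_keywords_in_name(name, {'cpb_mlp'}))          # True: the unprefixed keyword matches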

Is this intentional behavior or a bug? I appreciate your help!
