What watershed do in precompute_freq_cis function #73

yjhong89 · 2024-06-17T23:53:23Z

Hi, thanks for sharing this great work!

What is the role of scale_watershed in the following line?

Lumina-T2X/lumina_next_t2i/models/model.py

Line 921 in 7bc7d7d

    scale_watershed: float = 1.0,

It is set to 0.3 in the sample.py arguments:
- Lumina-T2X/lumina_next_t2i/sample.py
  
  Line 312 in 7bc7d7d
  
  parser.add_argument(


ChrisLiu6 · 2024-06-18T02:53:31Z

In short, it is a watershed w.r.t. the time step: before it, the position embedding is linearly scaled; after it, the position embedding is NTK scaled.

More details: to make a model trained at 1k resolution generate images at 1.5k or higher resolutions, the position embedding (i.e. RoPE in Lumina) has to be extrapolated. We find that linear RoPE scaling leads to good global structure and composition, but nearby pixels tend not to be harmonious. In contrast, NTK scaling produces good local texture, but the global structure is usually unreasonable. Therefore, we combine the two: linear scaling is applied in the initial diffusion steps to define the global composition (intuitively, like drawing a draft), and then we switch to NTK scaling for high-quality texture. This follows the same intuition as the method introduced in Sec 2.2 of the Lumina-Next paper but usually behaves more stably.
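As a rough illustration of this schedule, the sketch below labels each sampling step with the scaling mode it would use. It assumes the time step runs from 0 to 1 over uniformly spaced steps, which is what the comparison against the watershed implies; scaling_mode and num_steps are hypothetical names used only for this example, not part of the repository.

```python
def scaling_mode(timestep: float, scale_watershed: float = 0.3) -> str:
    """Return which RoPE scaling applies at the given time step.

    Hypothetical helper: time steps below the watershed use linear
    scaling, all later ones use NTK scaling.
    """
    return "linear" if timestep < scale_watershed else "ntk"


# Assumption: uniformly spaced time steps in [0, 1).
num_steps = 10
timesteps = [i / num_steps for i in range(num_steps)]
modes = [scaling_mode(t) for t in timesteps]

for t, mode in zip(timesteps, modes):
    print(f"t={t:.1f} -> {mode}")
```

With the watershed at 0.3, only the first few steps use linear scaling, so the draft of the global composition is fixed early and most of the sampling budget goes to NTK-scaled texture refinement.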

The implementation is very simple:

Lumina-T2X/lumina_next_t2i/models/model.py

Lines 944 to 952 in 7bc7d7d

    
    if timestep < scale_watershed:
        linear_factor = scale_factor
        ntk_factor = 1.0
    else:
        linear_factor = 1.0
        ntk_factor = scale_factor
    theta = theta * ntk_factor
    freqs = 1.0 / (theta ** (torch.arange(0, dim, 4)[: (dim // 4)].float().cuda() / dim)) / linear_factor
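To see what the two branches do to the frequencies, here is a minimal pure-Python sketch of the same logic without torch or CUDA. The function name rope_freqs, its argument order, and the default theta of 10000.0 are assumptions made for illustration; only the branching and the frequency formula are taken from the excerpt above.

```python
def rope_freqs(dim, timestep, scale_factor=1.0, scale_watershed=1.0, theta=10000.0):
    """Compute RoPE base frequencies with time-dependent scaling.

    Hypothetical stand-alone version of the excerpt above; the default
    theta is an assumption, not confirmed by the thread.
    """
    if timestep < scale_watershed:
        # Before the watershed: linear scaling divides every frequency.
        linear_factor, ntk_factor = scale_factor, 1.0
    else:
        # After the watershed: NTK scaling enlarges the base theta instead.
        linear_factor, ntk_factor = 1.0, scale_factor
    theta = theta * ntk_factor
    # Same exponents as torch.arange(0, dim, 4)[: dim // 4] / dim.
    exponents = [i / dim for i in range(0, dim, 4)][: dim // 4]
    return [1.0 / (theta ** e) / linear_factor for e in exponents]


base = rope_freqs(16, timestep=0.1, scale_factor=1.0, scale_watershed=0.3)
early = rope_freqs(16, timestep=0.1, scale_factor=2.0, scale_watershed=0.3)
late = rope_freqs(16, timestep=0.9, scale_factor=2.0, scale_watershed=0.3)
```

In this sketch, early is base with every frequency divided by scale_factor, whereas late keeps the highest frequency unchanged and lowers the remaining ones progressively, which matches the described split between global structure and local texture.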
