Hi, dear ziyi~
I noticed in your code that the BERT output weights are not tied to the input embeddings, which can be seen here (specifically, the code doesn't set the weight of BertLMPredictionHead.decoder to be the same as the weight of BertModel.embeddings).
Did you deliberately modify the BERT LM source code this way? If so, why? Will it affect the final result?
Hi, yes, the input and output embedding layers are not shared because I didn't define get_input/output_embeddings in BERT. This was not intentional, but I found it has almost no impact on the alignment performance.
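For context, here is a minimal sketch of what weight tying means mechanically. This is a simplified stand-in, not the repo's actual code: the class and parameter names (TinyLMHead, vocab_size, hidden) are hypothetical. In Hugging Face transformers, tie_weights() performs this sharing automatically when a model defines get_input_embeddings/get_output_embeddings, which is the hook the reply above says was not defined here.

```python
import torch
import torch.nn as nn

class TinyLMHead(nn.Module):
    """Toy analogue of BertModel.embeddings + BertLMPredictionHead.decoder."""

    def __init__(self, vocab_size=100, hidden=16, tie=True):
        super().__init__()
        # Input embedding table: (vocab_size, hidden)
        self.embeddings = nn.Embedding(vocab_size, hidden)
        # Output projection back to vocab logits; nn.Linear(hidden, vocab_size)
        # stores its weight as (vocab_size, hidden), so the shapes match.
        self.decoder = nn.Linear(hidden, vocab_size, bias=False)
        if tie:
            # Weight tying: reuse the very same Parameter object, so the two
            # layers share storage and gradients accumulate into one tensor.
            self.decoder.weight = self.embeddings.weight

tied = TinyLMHead(tie=True)
untied = TinyLMHead(tie=False)
print(tied.decoder.weight is tied.embeddings.weight)      # True: one shared tensor
print(untied.decoder.weight is untied.embeddings.weight)  # False: two independent tensors
```

Without the tie, the decoder simply trains its own projection matrix, which costs extra parameters but, as noted above, made little difference to alignment performance in this case.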