Hi, I am trying to use the MOE class in the decoder of a transformer, where I want to replace the feed-forward step with a mixture of experts. The class expects input of shape [batch, input_size]. The sequence length varies at each decoding step, which leads to a variable input size. How can I use this class in that case?
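One possible approach, sketched here under assumptions rather than taken from the library: a feed-forward block acts on each token independently, so a [batch, seq_len, d_model] tensor can be flattened to [batch * seq_len, d_model] before the layer and reshaped back afterwards. Only the number of rows changes with sequence length, while input_size stays equal to d_model. In the sketch below, `StandInMoE` and `apply_per_token` are hypothetical names, and the stand-in is a plain linear layer, not the real MOE class, whose constructor and return values may differ.

```python
import torch
import torch.nn as nn


class StandInMoE(nn.Module):
    """Hypothetical placeholder for an MoE layer that only accepts 2-D input."""

    def __init__(self, input_size: int, output_size: int):
        super().__init__()
        self.proj = nn.Linear(input_size, output_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Mirrors the [batch, input_size] contract described in the question.
        assert x.dim() == 2, "expects [num_tokens, input_size]"
        return self.proj(x)


def apply_per_token(layer: nn.Module, hidden: torch.Tensor) -> torch.Tensor:
    """Apply a 2-D-only layer to a [batch, seq_len, d_model] tensor."""
    batch, seq_len, d_model = hidden.shape
    flat = hidden.reshape(batch * seq_len, d_model)  # each token becomes one row
    out = layer(flat)                                # [batch * seq_len, output_size]
    return out.reshape(batch, seq_len, -1)           # restore the sequence dimension


layer = StandInMoE(input_size=16, output_size=16)
short = apply_per_token(layer, torch.randn(2, 3, 16))  # seq_len = 3
long = apply_per_token(layer, torch.randn(2, 7, 16))   # seq_len = 7
```

The same layer handles both sequence lengths because its weights depend only on the feature dimension. If the real class returns extra values (for example an auxiliary load-balancing loss), the wrapper would need to unpack them before reshaping.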