Testing out coatnet style relative attention #182
base: master
Conversation
net.proto to go with my saving tweaks.

The collapsed diff of net.proto shows the standard Leela Chess GNU GPL v3 file header (including the additional permission under GPL version 3 section 7) followed by the message outline:

package pblczero;

message EngineVersion
message Weights
    message ConvBlock
    message SEunit
    message Residual
    message MHRA
    message FFN
    message RRA
    // Input convnet.
    // Residual tower.
    // Policy head
    // Value head
    // Moves left head
    // rra tower.
message TrainingParams
message NetworkFormat
    // Output format of the NN. Used by search code to interpret results.
    // Network architecture. Used by backends to build the network.
    // Policy head architecture
    // Value head architecture
    // Moves left head architecture
message Format
    optional Encoding weights_encoding = 1;
message Net
multi_head_relative_attention is a bit overkill: I forked the entire multi_head_attention from Keras, which supports all kinds of options in terms of dimensions, but with the relative logic added it is now limited to NHWC input, despite all the options indicating otherwise.
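For context, here is a minimal sketch (not the code in this PR) of what CoAtNet-style relative attention over an NHWC feature map looks like: ordinary multi-head self-attention whose logits get a learned per-head bias, looked up by the relative (dy, dx) offset between the query square and the key square. The layer name, argument names, and 8x8 defaults below are illustrative assumptions, not names from this branch.

```python
# Hypothetical sketch of CoAtNet-style relative self-attention on an
# NHWC input (e.g. batch x 8 x 8 x C for a chess board).
import numpy as np
import tensorflow as tf


class RelativeSelfAttention2D(tf.keras.layers.Layer):
    def __init__(self, num_heads, key_dim, height=8, width=8, **kwargs):
        super().__init__(**kwargs)
        self.num_heads = num_heads
        self.key_dim = key_dim
        self.height = height
        self.width = width

    def build(self, input_shape):
        channels = int(input_shape[-1])
        self.qkv = tf.keras.layers.Dense(3 * self.num_heads * self.key_dim,
                                         use_bias=False)
        self.out = tf.keras.layers.Dense(channels, use_bias=False)
        # One learnable bias per head for every possible (dy, dx) offset:
        # (2H - 1) * (2W - 1) distinct relative positions.
        self.rel_bias = self.add_weight(
            name="rel_bias",
            shape=((2 * self.height - 1) * (2 * self.width - 1),
                   self.num_heads),
            initializer="zeros")
        # Precompute, for every (query, key) pair of squares, the index of
        # their relative offset in the bias table.
        coords = np.stack(np.meshgrid(np.arange(self.height),
                                      np.arange(self.width),
                                      indexing="ij"), axis=-1).reshape(-1, 2)
        rel = coords[:, None, :] - coords[None, :, :]           # (N, N, 2)
        rel += [self.height - 1, self.width - 1]                # shift to >= 0
        index = rel[..., 0] * (2 * self.width - 1) + rel[..., 1]
        self.rel_index = tf.constant(index, dtype=tf.int32)     # (N, N)

    def call(self, x):
        # x: NHWC feature map; flatten the board into N = H * W tokens.
        b = tf.shape(x)[0]
        n = self.height * self.width
        qkv = self.qkv(tf.reshape(x, (b, n, -1)))
        qkv = tf.reshape(qkv, (b, n, 3, self.num_heads, self.key_dim))
        q, k, v = tf.unstack(tf.transpose(qkv, (2, 0, 3, 1, 4)), num=3)

        # Content logits plus the learned relative-position bias.
        logits = tf.matmul(q, k, transpose_b=True) / tf.sqrt(
            tf.cast(self.key_dim, x.dtype))                      # (b, h, n, n)
        bias = tf.gather(self.rel_bias, self.rel_index)          # (n, n, h)
        logits += tf.transpose(bias, (2, 0, 1))[None]            # broadcast
        attn = tf.nn.softmax(logits, axis=-1)

        out = tf.matmul(attn, v)                                 # (b, h, n, d)
        out = tf.reshape(tf.transpose(out, (0, 2, 1, 3)),
                         (b, n, self.num_heads * self.key_dim))
        return tf.reshape(self.out(out),
                          (b, self.height, self.width, -1))
```

On an 8x8 board this gives each head a learned bias for all 15x15 possible square offsets, which is the part that ties the attention to the NHWC layout and is why the forked Keras layer no longer supports arbitrary input shapes.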