Tf 214 #220

Ergodice · 2023-11-08T02:28:47Z

No description provided.

At this stage, dytalking heads dynamically generates the projection matrices for the attention weights (the same matrices for all square pairs). Fixed a set_visible_devices error by initializing TensorFlow first in TFProcess and by making the DyDense temperature an instance attribute.

Arcturai and others added 30 commits February 21, 2022 15:22

move encoder layers from the policy head to the end of the net body

f62009a

add in attention policy map

9acf63f

sync and add support for chess transformer

553f9ea

restore tf.function()'s

b9f13f6

bugfixes and DeepNorm implementation {https://arxiv.org/abs/2203.00555}
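For reference, DeepNorm as described in the linked paper scales the residual branch before layer normalization, x_{l+1} = LN(alpha * x_l + f(x_l)), and for an encoder-only stack of N layers uses alpha = (2N)^(1/4), with beta = (8N)^(-1/4) as a gain on the initialization of selected sublayer weights. A minimal sketch of those constants and the residual update in plain Python follows; the function names are illustrative and are not taken from this repository.

```python
import math

def deepnorm_constants(num_layers):
    """Encoder-only DeepNorm constants: alpha scales the residual, beta scales init."""
    alpha = (2 * num_layers) ** 0.25
    beta = (8 * num_layers) ** -0.25
    return alpha, beta

def layer_norm(x, eps=1e-6):
    """Normalize a vector to zero mean and (approximately) unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def deepnorm_residual(x, sublayer_out, alpha):
    """Post-LN residual with the skip connection scaled by alpha."""
    return layer_norm([alpha * a + b for a, b in zip(x, sublayer_out)])
```

With 8 encoder layers this gives alpha = 2.0, so the skip path carries twice the weight of the sublayer output before normalization.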

6a3e292

asdf

9dd755a

try fix for net.py bug

5b37de0

correct rule_50 scaling in net.py

523fa60

'policy map' positional encoding

42292ab

Potpourri of architectural improvements

cd1d5d6

Added dense

9652e93

dense function combines dydense/dyrelu/linearscaling/gating into one function

Small changes

eec873b

Add logit gating, dense_layer, stop file, make dyrelu slopes/intercepts trainable

Sideways and Davit attention, yaml spec improvements

17bb790

Weight gen, buckets

bb26b31

Weight gen (simple_gen) generates attention weights from each square by compressing then doing a batched dense to 64; buckets divides the training data based on material left.
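The bucketing idea can be illustrated as below. This is only a sketch of splitting positions by remaining material: the piece values, the bucket edges, and the function names are all assumptions made for the example, not the values this commit uses.

```python
import bisect

# Assumed conventional piece values; the king counts as 0.
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9, "k": 0}

def material_left(fen_board):
    """Sum piece values for both sides from the board field of a FEN string."""
    return sum(PIECE_VALUES.get(ch.lower(), 0) for ch in fen_board if ch.isalpha())

def bucket_index(material, edges=(20, 40, 60)):
    """Return the index of the bucket containing `material` (edges are hypothetical)."""
    return bisect.bisect_right(edges, material)

start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"
bare_kings = "4k3/8/8/8/8/8/8/4K3"
```

Here `bucket_index(material_left(start))` lands in the last bucket and `bucket_index(material_left(bare_kings))` in the first, so opening and endgame positions end up in separate groups.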

Merge pull request #1 from Ergodice/multiple-nets

65d6ecc

Weight gen, buckets

Fix DyDense saving issues

afd1ad0

DyDense layers had an issue saving sublayers, so they now follow the design of the squeeze-excite layers: the sublayers are moved outside into a function.

Add fullgen

b98b1c4

Fullgen compresses the tokens and combines them to extract global information into attention weights.
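One way to read that description is sketched below in plain Python: each square's token is compressed to a few values, the compressed tokens are concatenated into one global vector, and a dense layer maps that vector to one logit per pair of squares. The sizes, the single generating layer, and all names are assumptions for illustration; the repository's fullgen may use more layers or per-head outputs.

```python
import random

def dense(vec, weights):
    """Multiply a vector by a weight matrix stored as a list of output rows."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]

def random_matrix(rows, cols, rng):
    """Small random weights standing in for trained parameters."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

def generate_attention_bias(tokens, w_compress, w_generate):
    """Compress each token, concatenate, and map the result to one logit per token pair."""
    n = len(tokens)
    compressed = [c for tok in tokens for c in dense(tok, w_compress)]
    flat = dense(compressed, w_generate)  # length n * n
    return [flat[i * n:(i + 1) * n] for i in range(n)]

rng = random.Random(0)
n_tokens, d_model, d_small = 64, 16, 2  # illustrative sizes only
tokens = random_matrix(n_tokens, d_model, rng)
w_compress = random_matrix(d_small, d_model, rng)
w_generate = random_matrix(n_tokens * n_tokens, n_tokens * d_small, rng)
bias = generate_attention_bias(tokens, w_compress, w_generate)
```

The resulting 64 x 64 table would be added to the attention logits of a head, so every entry depends on the whole position rather than only on the two squares involved.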

Horizontal and vertical convolutions

99b5b20

Fix fullgen history

636f502

Add dytalking heads

4462fda

Dytalking heads adds residual to matrix specifying linear transformation
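A rough sketch of the residual formulation: the per-head attention logits are mixed across heads by the matrix (I + delta), so a zero delta leaves plain attention unchanged. The function name and data layout are hypothetical, and in the dynamic variant delta would be generated from the position instead of passed in as a constant.

```python
def mix_heads(logits, delta):
    """Mix per-head attention logits with the matrix (I + delta).

    logits: list over heads of [query][key] score tables.
    delta: heads x heads offsets added to the identity matrix.
    """
    h = len(logits)
    mix = [[(1.0 if i == j else 0.0) + delta[i][j] for j in range(h)] for i in range(h)]
    rows, cols = len(logits[0]), len(logits[0][0])
    return [
        [[sum(mix[i][j] * logits[j][q][k] for j in range(h)) for k in range(cols)]
         for q in range(rows)]
        for i in range(h)
    ]
```

Starting from the identity means the mixing only has to learn a correction, which is the usual motivation for a residual parameterization.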

Remove legacy code, add arc encoding and example yaml

d16db6c

Removed old modules which were not useful including yaml references, also removed legacy resnet code. Added arc's encoding with option in yaml and also added example.yaml

typo in example.yaml

50ae5c0

Update Readme, remove old files

a8e76d8

update Readme to describe talking heads, fullgen, and dynamic kernel methods. Removed leelalogs and configs, except for example.yaml.

Typos, bug fixes

a01b2fa

Fixed typo in tfprocess and Readme, made some config stuff optional, fixed arc encoding, removed fullgen bias

Fix checkpointing, remove obsolete stuff

1a5a860

Remove dyrelu reference

97bc128

Remove use_simple_gating, fix activation

0e7740e

Add metrics, update README

6619318

Added search_loss, which is one over the prediction for the best move, and confident_accuracy, which is the accuracy for positions where there is a clear best move. Removed simple gating. Also updated README to include info on auxiliary losses.
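The two metrics can be sketched per position as follows. This is a plain-Python illustration, not the repository's TensorFlow code: the function signatures are made up for the example, and the rule for what counts as a "clear best move" (top target probability at or above a threshold) is an assumption.

```python
def search_loss(policy, best_move):
    """One over the predicted probability of the best move (lower is better)."""
    return 1.0 / policy[best_move]

def confident_accuracy(policies, targets, threshold=0.9):
    """Top-1 accuracy restricted to positions whose target has a clear best move.

    "Clear" is taken to mean the top target probability is at least `threshold`;
    the threshold value is an assumption, not the repository's setting.
    """
    hits = total = 0
    for policy, target in zip(policies, targets):
        best = max(range(len(target)), key=target.__getitem__)
        if target[best] < threshold:
            continue  # no clear best move, so the position is skipped
        total += 1
        hits += max(range(len(policy)), key=policy.__getitem__) == best
    return hits / total if total else float("nan")
```

A policy that puts probability 0.5 on the best move has a search_loss of 2.0, which can be read loosely as the number of candidate moves a search would need to consider before reaching it.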

Smolgen!

e8c76b2

Smolgen is a more efficient version of fullgen. Also added square ReLU, which adds 0.5% policy accuracy.
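Assuming "square relu" refers to the usual squared ReLU activation, it is simply the ReLU output squared. A scalar sketch with an illustrative function name:

```python
def square_relu(x):
    """Squared ReLU: max(x, 0) ** 2."""
    return max(x, 0.0) ** 2
```

Negative inputs still map to zero, while positive inputs grow quadratically instead of linearly.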

Ergodice and others added 30 commits August 18, 2023 14:12

Add and clean up heads

e26692b

Move sqrt in optwgt gen

7ec4b47

Fix bug with optwgt shape

82bd638

Remove policy_val

055e903

Add opponent policy

bb25b71

Used for regularization; it was found to speed up training in KataGo.

Revert "Add opponent policy"

14ed59c

This reverts commit bb25b71.

Fix q_st rescoring

b0626de

Simplify apply_alpha

6b55227

Change sign on alpha exponent

c51152c

Revert "Change sign on alpha exponent"

4905804

This reverts commit c51152c.

Remove policy val loss

8220533

Fix net.py to match current proto.

52231ad

Fix policy embedding reference for new policy heads.

64b9ba5

Update policy in net.py to match proto

39ce9aa

Fix typo in tfprocess.py

00166e9

Remove arc encoding

d6f59b9

Set new network format for multihead nets.

cb590eb

Update tfprocess.py and net.py to save new embedding and version to proto.

e1a71d9

Merge pull request #5 from almaudoh/attention-net-updates

1eacb21

Fix net.py to match current proto.

Get mixed precision working

8d339ae

Add categorical value

7ccd535

Add BT4 improvements

3a46f31

Update lczero-common

80d46a7

Fix policy loss bug

a0a473f

Remove redundant code in TFProcess init

cb2cf30

Initial tf 2.14 commit

dc83a58

Allow disabling other biases

8acd286

Support tf2.10 and fix future heads

95c2c1a

Turn on test reporting

1317cc3

Update net.proto

dbfb7fd
