At text classification example #945

atuzhykov · 2020-03-10T10:11:19Z

What changes were proposed in this pull request?

(Please fill in changes proposed in this fix)

How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)

Please review
https://github.com/eclipse/deeplearning4j/blob/master/CONTRIBUTING.md before opening a pull request.

AlexDBlack

Looking good, just a few minor improvements to make.
Can we also make a backup of the branch, then flatten + sign on this branch as described here: https://deeplearning4j.org/eclipse-contributors

Otherwise I'm happy with this 👍

AlexDBlack · 2020-03-11T06:03:21Z

...in/java/org/deeplearning4j/examples/nlp/sentencepiecernnexample/SentencePieceRNNExample.java

+ * As far model is predisposed to overfitting we also add l2 regularization and dropout for certain layers.
+ * To prepare reviews we use BertIterator, which is MultiDataSetIterator for training BERT (Transformer) models.
+ * We congigure BertIterator for supervised sequence classification:
+ * 0. As tokenizer we use BertWordPieceTokenizerFactory with provided BERT BASE UNCASED vocabulary.


Maybe let's improve this slightly by adding another line under item 0:
BertIterator and BertWordPieceTokenizer implement the Word Piece sub-word tokenization algorithm, with a vocabulary size of 30522 tokens.
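
To illustrate what "Word Piece sub-word tokenization" means here, the following is a minimal, self-contained sketch of greedy longest-match-first splitting. It is not the BertWordPieceTokenizer implementation: the class name `WordPieceSketch`, the method `tokenizeWord`, and the six-entry toy vocabulary are hypothetical and stand in for the real 30522-token BERT vocabulary. The `##` continuation prefix and the `[UNK]` token follow the usual BERT vocabulary convention.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class WordPieceSketch {

    /**
     * Splits a single lower-cased word into sub-word pieces.
     * At each position the longest vocabulary entry that matches is taken;
     * pieces that do not start the word carry the "##" prefix.
     */
    static List<String> tokenizeWord(String word, Set<String> vocab) {
        List<String> pieces = new ArrayList<>();
        int start = 0;
        while (start < word.length()) {
            int end = word.length();
            String match = null;
            // Shrink the candidate from the right until it is found in the vocabulary
            while (end > start) {
                String candidate = word.substring(start, end);
                if (start > 0) {
                    candidate = "##" + candidate;
                }
                if (vocab.contains(candidate)) {
                    match = candidate;
                    break;
                }
                end--;
            }
            if (match == null) {
                // No piece matches at this position: the whole word maps to the unknown token
                return Collections.singletonList("[UNK]");
            }
            pieces.add(match);
            start = end;
        }
        return pieces;
    }

    public static void main(String[] args) {
        // Toy vocabulary (assumption for illustration only)
        Set<String> vocab = new HashSet<>(
                Arrays.asList("un", "##aff", "##able", "movie", "##s", "great"));
        System.out.println(tokenizeWord("unaffable", vocab));
        System.out.println(tokenizeWord("movies", vocab));
        System.out.println(tokenizeWord("xyz", vocab));
    }
}
```

With the toy vocabulary, "unaffable" splits into `un`, `##aff`, `##able`, and a word with no matching pieces collapses to `[UNK]`. A fixed vocabulary can cover unseen words this way, because rare words break down into known fragments.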

AlexDBlack · 2020-03-11T06:05:20Z

...in/java/org/deeplearning4j/examples/nlp/sentencepiecernnexample/SentencePieceRNNExample.java

+        int listenerFrequency = 20;
+        net.setListeners(new StatsListener(statsStorage, listenerFrequency), new ScoreIterationListener(50));
+        //Attach the StatsStorage instance to the UI: this allows the contents of the StatsStorage to be visualized
+        uiServer.attach(statsStorage);


Maybe let's comment out the UI by default, as it adds some overhead (slows down training a bit). Users can uncomment it if they want to run it with UI. That would look like this:

    /*
    //Uncomment this section to run the example with the user interface
    UIServer uiServer = UIServer.getInstance();

    //Configure where the network information (gradients, activations, score vs. time etc) is to be stored
    //Then add the StatsListener to collect this information from the network, as it trains
    StatsStorage statsStorage = new FileStatsStorage(new File(System.getProperty("java.io.tmpdir"), "ui-stats-" + System.currentTimeMillis() + ".dl4j"));
    int listenerFrequency = 20;
    net.setListeners(new StatsListener(statsStorage, listenerFrequency), new ScoreIterationListener(50));
    //Attach the StatsStorage instance to the UI: this allows the contents of the StatsStorage to be visualized
    uiServer.attach(statsStorage);
    */

    net.setListeners(new ScoreIterationListener(50));

AlexDBlack · 2020-03-11T06:05:22Z

...in/java/org/deeplearning4j/examples/nlp/sentencepiecernnexample/SentencePieceRNNExample.java

+            net.fit(train);
+
+            // Get and print accuracy, precision, recall & F1 and confusion matrix
+            Evaluation eval = net.doEvaluation(test, new Evaluation[]{new Evaluation()})[0];


For MultiLayerNetwork, we can use net.evaluate(test)
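
For readers unfamiliar with the metrics named in the code comment above (accuracy, precision, recall and F1), here is a small plain-Java sketch of how they follow from a two-class confusion matrix. It is not DL4J's Evaluation class and says nothing about how that class is implemented. The class `BinaryMetricsSketch` and the counts in `main` are made up for illustration.

```java
public class BinaryMetricsSketch {

    /** Fraction of all examples that were classified correctly. */
    static double accuracy(int tp, int fp, int fn, int tn) {
        return (tp + tn) / (double) (tp + fp + fn + tn);
    }

    /** Of the examples predicted positive, the fraction that really are positive. */
    static double precision(int tp, int fp) {
        return (tp + fp) == 0 ? 0.0 : tp / (double) (tp + fp);
    }

    /** Of the truly positive examples, the fraction that were found. */
    static double recall(int tp, int fn) {
        return (tp + fn) == 0 ? 0.0 : tp / (double) (tp + fn);
    }

    /** Harmonic mean of precision and recall. */
    static double f1(double precision, double recall) {
        return (precision + recall) == 0.0 ? 0.0 : 2 * precision * recall / (precision + recall);
    }

    public static void main(String[] args) {
        // Hypothetical confusion-matrix counts, not results from this pull request
        int tp = 40, fp = 10, fn = 5, tn = 45;

        double p = precision(tp, fp);
        double r = recall(tp, fn);
        System.out.printf("Accuracy:  %.4f%n", accuracy(tp, fp, fn, tn));
        System.out.printf("Precision: %.4f%n", p);
        System.out.printf("Recall:    %.4f%n", r);
        System.out.printf("F1:        %.4f%n", f1(p, r));
    }
}
```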

AlexDBlack · 2020-03-11T06:05:43Z

pom.xml

@@ -28,7 +28,7 @@
    <properties>
        <!-- Change the nd4j.backend property to nd4j-cuda-9.2-platform,nd4j-cuda-10.0-platform or nd4j-cuda-10.1-platform to use CUDA GPUs -->
        <nd4j.backend>nd4j-native-platform</nd4j.backend>
-<!--        <nd4j.backend>nd4j-cuda-10.2-platform</nd4j.backend>-->
+<!--        <nd4j.backend>nd4j-cuda-10.0-platform</nd4j.backend>-->


Leave this commented out with 10.2

atuzhykov added 30 commits February 20, 2020 20:04

examples added + changed nd4j backend in pom.xml to run on DGX1

de5f44c

Signed-off-by: atuzhykov <[email protected]>

examples added + changed nd4j backend in pom.xml to run on DGX1

d65fa9a

Signed-off-by: atuzhykov <[email protected]>

other small changes

c2967df

Signed-off-by: atuzhykov <[email protected]>

small fix to match cuda version with container

9e87835

Signed-off-by: atuzhykov <[email protected]>

small fix to match cuda version with container

41530e1

Signed-off-by: atuzhykov <[email protected]>

lr 1e-3 > 4e-3 (as multiplying batchsize*k, lr*sqrt(k))

3198021

l2 1e-6 > 1e-3 Signed-off-by: atuzhykov <[email protected]>

lr 1e-3 > 4e-3 (as multiplying batchsize*k, lr*sqrt(k))

9a1ba54

l2 1e-3 > 1e-6 Signed-off-by: atuzhykov <[email protected]>

experiment0

95aa639

Signed-off-by: atuzhykov <[email protected]>

experiment1 (notes belong to commit name are here http://tiny.cc/yashkz)

7501626

Signed-off-by: atuzhykov <[email protected]>

experiment2 (notes belong to commit name are here http://tiny.cc/yashkz)

091c386

Signed-off-by: atuzhykov <[email protected]>

experiment3 (notes belong to commit name are here http://tiny.cc/yashkz)

86a6518

Signed-off-by: atuzhykov <[email protected]>

experiment4 (notes belong to commit name are here http://tiny.cc/yashkz)

d669d6f

Signed-off-by: atuzhykov <[email protected]>

experiment5 (notes belong to commit name are here http://tiny.cc/yashkz)

578a186

Signed-off-by: atuzhykov <[email protected]>

experiment6 (notes belong to commit name are here http://tiny.cc/yashkz)

fe42967

Signed-off-by: atuzhykov <[email protected]>

experiment7 (notes belong to commit name are here http://tiny.cc/yashkz)

302f7bb

Signed-off-by: atuzhykov <[email protected]>

experiment8 (notes belong to commit name are here http://tiny.cc/yashkz)

bb933c5

Signed-off-by: atuzhykov <[email protected]>

experiment9 (notes belong to commit name are here http://tiny.cc/yashkz)

000f7d8

Signed-off-by: atuzhykov <[email protected]>

experiment9 (notes belong to commit name are here http://tiny.cc/yashkz)

79014f8

Signed-off-by: atuzhykov <[email protected]>

experiment10 (notes belong to commit name are here http://tiny.cc/yashkz

4a5cffd

) Signed-off-by: atuzhykov <[email protected]>

experiment10 (notes belong to commit name are here http://tiny.cc/yashkz

8ef7519

) Signed-off-by: atuzhykov <[email protected]>

experiment11 (notes belong to commit name are here http://tiny.cc/yashkz

b99d59a

) Signed-off-by: atuzhykov <[email protected]>

experiment12 (notes belong to commit name are here http://tiny.cc/yashkz

1aabba1

) Signed-off-by: atuzhykov <[email protected]>

experiment12 (notes belong to commit name are here http://tiny.cc/yashkz

dc3f3b3

) Signed-off-by: atuzhykov <[email protected]>

experiment12 (notes belong to commit name are here http://tiny.cc/yashkz

f16d8ac

) Signed-off-by: atuzhykov <[email protected]>

experiment12 (notes belong to commit name are here http://tiny.cc/yashkz

36ae8ee

) Signed-off-by: atuzhykov <[email protected]>

experiment13 (notes belong to commit name are here http://tiny.cc/yashkz

0de07d8

) Signed-off-by: atuzhykov <[email protected]>

experiment14 (notes belong to commit name are here http://tiny.cc/yashkz

f2eece6

) Signed-off-by: atuzhykov <[email protected]>

baseline conf + LengthHandling.FIXED_LENGTH=256

ff91a96

Signed-off-by: atuzhykov <[email protected]>

baselineconf+LengthHandling.FIXED_LENGTH=256+Bidirectional_lstm

2e99c55

Signed-off-by: atuzhykov <[email protected]>

baselineconf+LengthHandling.FIXED_LENGTH=256+Bidirectional_lstm+lr1e-4

e107c2c

Signed-off-by: atuzhykov <[email protected]>

atuzhykov and others added 27 commits March 5, 2020 01:03

experiment10 (notes belong to commit name are here http://tiny.cc/yashkz

b2f6510

) Signed-off-by: atuzhykov <[email protected]>

experiment10 (notes belong to commit name are here http://tiny.cc/yashkz

c5de18f

) Signed-off-by: atuzhykov <[email protected]>

experiment11 (notes belong to commit name are here http://tiny.cc/yashkz

145e1fd

) Signed-off-by: atuzhykov <[email protected]>

experiment12 (notes belong to commit name are here http://tiny.cc/yashkz

2a72f0a

) Signed-off-by: atuzhykov <[email protected]>

experiment12 (notes belong to commit name are here http://tiny.cc/yashkz

3f07a38

) Signed-off-by: atuzhykov <[email protected]>

experiment12 (notes belong to commit name are here http://tiny.cc/yashkz

6256e85

) Signed-off-by: atuzhykov <[email protected]>

experiment12 (notes belong to commit name are here http://tiny.cc/yashkz

da14588

) Signed-off-by: atuzhykov <[email protected]>

experiment13 (notes belong to commit name are here http://tiny.cc/yashkz

cb26a75

) Signed-off-by: atuzhykov <[email protected]>

experiment14 (notes belong to commit name are here http://tiny.cc/yashkz

f0e1241

) Signed-off-by: atuzhykov <[email protected]>

baseline conf + LengthHandling.FIXED_LENGTH=256

c4f9d9a

Signed-off-by: atuzhykov <[email protected]>

baselineconf+LengthHandling.FIXED_LENGTH=256+Bidirectional_lstm

02d47fd

Signed-off-by: atuzhykov <[email protected]>

baselineconf+LengthHandling.FIXED_LENGTH=256+Bidirectional_lstm+lr1e-4

61b63f8

Signed-off-by: atuzhykov <[email protected]>

baselineconf+LengthHandling.FIXED_LENGTH=256+Bidirectional_lstm_256

820eda5

Signed-off-by: atuzhykov <[email protected]>

baselineconf+LengthHandling.FIXED_LENGTH=256+Bidirectional_lstm_256_l…

2c757c0

…r1e-4 Signed-off-by: atuzhykov <[email protected]>

base_conf+bidir_LSTM_256_layersize_Adam_lr1e-3_SGD_lr1e-3_for_EmbdLayer

53efeec

base_conf+bidir_LSTM_256_layersize_Adam_lr1e-3_SGD_lr1e-3_for_EmbdLayer

1bbb9e0

base_conf+bidir_LSTM_256_layersize_Nadam_lr1e-3

880cd30

base_conf+bidir_LSTM_256_layersize_Nadam_lr1e-3

9555d55

Signed-off-by: Andrii Tuzhykov <[email protected]>

base_conf+bidir_LSTM_256_layersize_Nadam_lr1e-3

8aab2fb

Signed-off-by: Andrii Tuzhykov <[email protected]>

base_conf+3x_bidir_LSTM_256_layersize_Nadam_lr1e-3

542db86

Signed-off-by: Andrii Tuzhykov <[email protected]>

base_conf+3xbidir_LSTM_256_layersize_Adam_lr1e-3_l21e-5

46dffc0

Signed-off-by: Andrii Tuzhykov <[email protected]>

base_conf+3x_bidir_LSTM_256_layersize_Adam_Sheduled_lr

c5a979a

Signed-off-by: Andrii Tuzhykov <[email protected]>

base_conf+2x_bidir_LSTM_256_Adam_lr1e-3_lstm_dropout_075

6e390eb

Signed-off-by: Andrii Tuzhykov <[email protected]>

prefinal examples

7945508

Signed-off-by: Andrii Tuzhykov <[email protected]>

prefinal

74162ff

Signed-off-by: Andrii Tuzhykov <[email protected]>

changed package and class name, added trained model URL

13a2392

Signed-off-by: Andrii Tuzhykov <[email protected]>

fixed required changes

5b1b710

Signed-off-by: Andrii Tuzhykov <[email protected]>

AlexDBlack suggested changes Mar 11, 2020

atuzhykov added 2 commits March 11, 2020 12:00

fixed new round of required changes

a22956d

Signed-off-by: Andrii Tuzhykov <[email protected]>

small issue belong to match BertIterator and DataSetIterator in Evalu…

5e7df4f

…ation class Signed-off-by: Andrii Tuzhykov <[email protected]>
