Merge florianmai master to add CNN and LSTM models #10

Open · wants to merge 129 commits into base: master

Commits (129)

eddf2b2
Read data from csv format, consider extra samples.
florianmai Jun 11, 2017
d3ed0da
Removed some unnecessary printouts
florianmai Jun 11, 2017
04cf27b
Max features and bugfix
florianmai Jun 11, 2017
23f7f0f
Specify batch size and adjust learning rate accordingly
florianmai Jun 13, 2017
5bd3286
Move to keras 2
florianmai Jun 13, 2017
9e2e197
Fixed a bug occurring when reading data.
florianmai Jun 13, 2017
66ae849
Move from deprecated sklearn.cross_validation to sklearn.model_select…
florianmai Jun 14, 2017
6c7e22f
Enabled early stopping on validation set.
florianmai Jun 15, 2017
57234ca
Fixed bug in number of steps that occurred when moving from keras1.2 t…
florianmai Jun 15, 2017
a7a8022
Fixed a bug at prediction step where outdated keras functions are used
florianmai Jun 18, 2017
cd1a005
Learning rate and number of epochs for MLP as command line parameter
florianmai Jun 18, 2017
17dec4b
Use model checkpoint of best model after early stopping
florianmai Jun 18, 2017
749e7c0
First version of tensorflow reimplementation of MLP
florianmai Jun 19, 2017
768b704
Wrapper class for tensorflow models in general with mini batching
florianmai Jun 19, 2017
f4e9c67
MLP_Soph now replicates MLP_Base in tensorflow instead of keras
florianmai Jun 19, 2017
14ca61a
Pass sparse tensors directly to tf-models to circumvent tf.constant s…
florianmai Jun 20, 2017
52f4779
Fixed prediction step for sophmlp
florianmai Jun 20, 2017
0d3035d
Fixed dropout parameter
florianmai Jun 20, 2017
007ec5f
Switched from tf.contrib.learn to plain tensorflow due to issues with…
florianmai Jun 21, 2017
7f8ef02
Adopt keras' categorical xentropy computation
florianmai Jun 21, 2017
9d100cb
Fixed a bug in tensorflow prediction step
florianmai Jun 21, 2017
a0136a9
Compute validation loss in tensorflow-models
florianmai Jun 21, 2017
402ce24
Save and use best model in terms of validation loss for prediction
florianmai Jun 21, 2017
e475117
Keep TF-reimplementation of keras mlp as 'mlpbase'
florianmai Jun 21, 2017
c06434e
MLP-Soph now has lookup table similar to fastText
florianmai Jun 21, 2017
5bc5baa
Skip embedding layer by specifying embedding_size=0
florianmai Jun 21, 2017
3e7f0d9
Self normalizing neural networks for MLP-Soph
florianmai Jun 21, 2017
691485b
Bugfix: unable to use 'combined' dataset format with -xX
florianmai Jun 21, 2017
d5a90fc
Use --fixed_folds and --folds=1 to do only a one-fold evaluation with…
florianmai Jun 21, 2017
289bf0d
Quick-fix for extra-samples not being sampled from the 11th fold
florianmai Jun 22, 2017
1ff4c28
Stop after 5 rounds of no improvement in validation loss
florianmai Jun 22, 2017
334ea02
Merge branch 'master' of ren-mai.net:quadflor-code
florianmai Jun 22, 2017
37f4c9b
Made clearer where 'dropout' means keep probability.
florianmai Jun 26, 2017
e40ad5a
Escaped code that would cause crash if no thesaurus is provided
florianmai Jun 28, 2017
3d72aad
May specify 'label_delimiter' in data json file to tell how to split …
florianmai Jun 28, 2017
3fbdae3
Merge branch 'master' of ren-mai.net:quadflor-code
florianmai Jun 28, 2017
16accd0
Added first version of CNN (Kim's sentence classification)
florianmai Jun 29, 2017
932df35
One-Hot encoding of text
florianmai Jun 29, 2017
02b5323
Fixed a crash occurring when using onehot encoding with validation set
florianmai Jun 30, 2017
2f77e06
Option for limiting number of words to be taken into account when usi…
florianmai Jul 1, 2017
436a9bb
Merge branch 'master' of ren-mai.net:quadflor-code
florianmai Jul 1, 2017
43c8093
Fixed an error with type of feature matrix
florianmai Jul 2, 2017
05d584a
Save model of best validation score to distinct path based on current…
florianmai Jul 2, 2017
146b203
Implemented simple multi-layer LSTM.
florianmai Jul 2, 2017
2c5afda
Use dynamic_rnn as a wrapper instead of multi-rnn-cell
florianmai Jul 3, 2017
3bc12ed
Encode in feature matrix and extract from it the size of vocabulary
florianmai Jul 3, 2017
586f30b
May use any sklearn.metrics metric as validation score now, f1-sample…
florianmai Jul 3, 2017
5768e8b
Use --optimize_threshold to optimize threshold on validation set duri…
florianmai Jul 3, 2017
850b9e6
Specify folder to save weights of best model to
florianmai Jul 3, 2017
2515d5c
Merge branch 'master' of ren-mai.net:quadflor-code
florianmai Jul 3, 2017
f96fe46
A bit of refactoring in run.py script
florianmai Jul 23, 2017
41e20fa
Random search and bayesian hyperparameter optimization
florianmai Jul 23, 2017
6ea18fc
Default search strategies adapted to Snoerk
florianmai Jul 26, 2017
2c48f53
Pretrained word embeddings
florianmai Jul 27, 2017
1bb04d7
Added fixed seeds to assure reproducibility of results.
florianmai Aug 7, 2017
9c540eb
You may now use pretrained word embeddings with LSTM and CNN
florianmai Aug 7, 2017
d8694b0
Option for continuing to train the embedding layer alongside the task
florianmai Aug 7, 2017
9c260fb
Bugfix where embedding training from scratch wasn't working
florianmai Aug 8, 2017
12a6f14
Print longest sequence of epochs with no improvement
florianmai Aug 8, 2017
8bbb9a5
Fixed an issue when loading word embeddings
florianmai Aug 8, 2017
ec8863a
Patience for early stopping is now a program argument
florianmai Aug 8, 2017
2bbe544
Merge branch 'master' of ren-mai.net:quadflor-code
florianmai Aug 8, 2017
8055b2f
Fixed an issue with initializing word embeddings
florianmai Aug 14, 2017
1561b83
Feed the embedding lookup table at variable initialization to avoid l…
florianmai Aug 15, 2017
0421773
Print information on label frequency when verbose
florianmai Aug 16, 2017
4f4ed04
Keep prob now interpreted correctly, activation fn is a parameter
florianmai Aug 17, 2017
01291eb
Added option for normalizing in MLP Soph
florianmai Aug 17, 2017
24cd0ba
Define MLP-Base as special case of MLPsoph
florianmai Aug 17, 2017
efd5f39
Grid search as optimization technique
florianmai Aug 17, 2017
08a0094
Bayesian optimization can take more than one initial value
florianmai Aug 17, 2017
7d03b31
Option for using n-grams for BoW
florianmai Aug 24, 2017
52a7e7f
Removed CPU scope which prevented GPU usage when using embedding lookup
florianmai Aug 27, 2017
6867be0
Specify number of steps / weight updates before doing evaluation
florianmai Aug 28, 2017
0319dcd
Merge branch 'master' of ren-mai.net:quadflor-code
florianmai Aug 28, 2017
88009c7
Moved normalization to utils in order to test it
florianmai Aug 28, 2017
64120b9
Fixed division by zero when scaling
florianmai Aug 28, 2017
7a55bb2
Character n-grams as features
florianmai Aug 28, 2017
7d14545
Option for identity as activation function on hidden layer
florianmai Aug 28, 2017
0229b64
Fixed a bug where tf models crash without validation set
florianmai Aug 30, 2017
5455b7a
Max features applies to each ngram group individually
florianmai Aug 30, 2017
b1a1cf9
Restrict word embedding lookup table to only those that occur in the …
florianmai Sep 5, 2017
d64a469
Bottleneck layer
florianmai Sep 29, 2017
f60e22d
Dynamic max pooling
florianmai Sep 29, 2017
2c0de05
Variational recurrent dropout for LSTM
florianmai Sep 29, 2017
40f6d9b
Option for setting fraction of GPU memory to use
florianmai Sep 29, 2017
901b5c0
Specify multiple feature extraction methods as parameters, merge all
florianmai Oct 8, 2017
7939b5b
Apply variational recurrent dropout only at inner states
florianmai Oct 8, 2017
b38c463
Merge branch 'master' of ren-mai.net:quadflor-code
florianmai Oct 8, 2017
7c06abf
Pass activation function on hidden layer to MLP soph as well
florianmai Oct 10, 2017
8ee92eb
Option for bidirectional LSTM by creating a (multi-layered) LSTM for …
florianmai Oct 10, 2017
d4c6a86
Option for LSTM output aggregation strategy
florianmai Oct 10, 2017
6382ce5
Fixed a deprecated import.
florianmai Oct 18, 2017
45d5f71
Use 'vanilla' LSTM and fix averaging over outputs and extracting last…
florianmai Oct 23, 2017
a064ade
Fix sequence length computation
florianmai Oct 23, 2017
0a4d56b
Fix dynamic max pooling
florianmai Oct 23, 2017
baaedc7
Added attention and sum as output aggregation
florianmai Oct 26, 2017
79c8659
Move sequence length computation to utils
florianmai Oct 26, 2017
0d9644e
Merge branch 'master' of git.kd.informatik.uni-kiel.de:kd-group/quadf…
florianmai Oct 26, 2017
fd67da9
Unit test for sequence length
florianmai Oct 26, 2017
9cc274f
Test averaging LSTM outputs
florianmai Oct 26, 2017
89408d4
No input normalization for SNNs, implementation of swish
florianmai Oct 26, 2017
94b8539
Moved dynamic max pooling to tf-utils and tested it.
florianmai Oct 27, 2017
aa2088a
Option for iterating LSTM until max-length
florianmai Oct 27, 2017
1c81d57
Merge branch 'master' of git.kd.informatik.uni-kiel.de:kd-group/quadf…
florianmai Oct 27, 2017
4a91c40
First iteration of Meta-Labeler
florianmai Oct 30, 2017
05bb363
Print progress when using meta-labeler
florianmai Oct 31, 2017
fc0fa6f
Fix in extracting the correct number of labels to predict.
florianmai Oct 31, 2017
37d8f03
Compute score-based Meta-Labeler based on label predictions.
florianmai Oct 31, 2017
592dc09
Grid search file now can have string values
florianmai Oct 31, 2017
c0d63c0
Option for using batch normalization in MLP.
florianmai Nov 7, 2017
3785c6b
Options for choosing number of filters and window sizes for CNN.
florianmai Nov 14, 2017
15cad87
Allow lists for grid search
florianmai Dec 17, 2017
1065e98
Print label statistics when verbose.
florianmai Feb 10, 2018
caa0d91
Merge branch 'master' of git.kd.informatik.uni-kiel.de:kd-group/quadf…
florianmai Feb 10, 2018
c220267
Documentation for two classes
florianmai Feb 11, 2018
d0c92c1
Implement OE-LSTMs but not iterate until maxlength but for a set numb…
florianmai Mar 2, 2018
56332ee
Remove unnecessary printout
florianmai Mar 2, 2018
c4915c6
OverEager attention: attention only over overeager outputs.
florianmai Mar 2, 2018
1c7f1e9
Updated readme and added configuration files for reproducibility.
florianmai Mar 15, 2018
9c1c67e
Updated requirements
florianmai Mar 15, 2018
2cc0984
Update README.md
Apr 3, 2018
f98e49a
Merge branch 'master' of github.com:florianmai/Quadflor
florianmai Apr 3, 2018
d748a5c
Update README.md
florianmai Apr 3, 2018
d15414a
Added long version of the paper.
florianmai Apr 3, 2018
1010936
Merge branch 'master' of github.com:florianmai/Quadflor
florianmai Apr 3, 2018
f2ebfc4
Resolved conflicts
lgalke Aug 23, 2019
3de697b
Actually merge READMEs
lgalke Aug 24, 2019
2e92f9a
Add hint how to get GloVe vectors
lgalke Aug 24, 2019
d052329
Remove word vector file
lgalke Aug 24, 2019
63 changes: 55 additions & 8 deletions Code/lucid_ml/classifying/neural_net.py
@@ -3,13 +3,32 @@
 from scipy import sparse
 
 from sklearn.base import BaseEstimator
-from keras.layers import Dense, Activation, Dropout, BatchNormalization
+from keras.layers import Dense, Activation, Dropout
 from keras.models import Sequential
 from keras.optimizers import Adam
 import numpy as np
 from sklearn.metrics import f1_score
-from sklearn.linear_model import Ridge
+
+from sklearn.linear_model import Ridge
+from keras.callbacks import EarlyStopping, ModelCheckpoint
+
+#===============================================================================
+# class EarlyStoppingBySklearnMetric(Callback):
+#     def __init__(self, metric=lambda y_test, y_pred : f1_score(y_test, y_pred, average='samples'), value=0.00001, verbose=0):
+#         super(Callback, self).__init__()
+#         self.metric = metric
+#         self.value = value
+#         self.verbose = verbose
+#
+#     def on_epoch_end(self, epoch, logs={}):
+#         current = logs.get(self.monitor)
+#         if current is None:
+#             warnings.warn("Early stopping requires %s available!" % self.monitor, RuntimeWarning)
+#
+#         if current < self.value:
+#             if self.verbose > 0:
+#                 print("Epoch %05d: early stopping THR" % epoch)
+#             self.model.stop_training = True
+#===============================================================================
 
 def _batch_generator(X, y, batch_size, shuffle):
     number_of_batches = np.ceil(X.shape[0] / batch_size)
@@ -43,10 +62,22 @@ def _batch_generatorp(X, batch_size):
 
 
 class MLP(BaseEstimator):
-    def __init__(self, verbose=0, model=None, final_activation='sigmoid'):
+    def __init__(self, verbose=0, model=None, final_activation='sigmoid', batch_size = 512, learning_rate = None, epochs = 20):
         self.verbose = verbose
         self.model = model
         self.final_activation = final_activation
+        self.batch_size = batch_size
+        self.validation_data_position = None
+        self.epochs = epochs
+
+        # we scale the learning rate proportionally with the batch size as suggested by
+        # [Thomas M. Breuel, 2015, The Effects of Hyperparameters on SGD
+        # Training of Neural Networks]
+        # we found lr=0.01 to be a good learning rate for batch size 512
+        if learning_rate is None:
+            self.lr = self.batch_size / 512 * 0.01
+        else:
+            self.lr = learning_rate
 
     def fit(self, X, y):
         if not self.model:
@@ -56,16 +87,32 @@ def fit(self, X, y):
             self.model.add(Dropout(0.5))
             self.model.add(Dense(y.shape[1]))
             self.model.add(Activation(self.final_activation))
-            self.model.compile(loss='categorical_crossentropy', optimizer=Adam(lr=0.01))
-        self.model.fit_generator(generator=_batch_generator(X, y, 256, True),
-                                 samples_per_epoch=X.shape[0], nb_epoch=20, verbose=self.verbose)
+            self.model.compile(loss='categorical_crossentropy', optimizer=Adam(lr=self.lr))
+
+        val_pos = self.validation_data_position
+
+        callbacks = []
+        if self.validation_data_position is not None:
+            callbacks.append(EarlyStopping(monitor='val_loss', min_delta=0, patience=5, verbose=0, mode='auto'))
+            callbacks.append(ModelCheckpoint("weights.best.hdf5", monitor='val_loss', verbose=1, save_best_only=True, mode='min'))
+            X_train, y_train, X_val, y_val = X[:val_pos, :], y[:val_pos,:], X[val_pos:, :], y[val_pos:,:]
+        else:
+            X_train, y_train = X, y
+        self.model.fit_generator(generator=_batch_generator(X_train, y_train, self.batch_size, True), callbacks=callbacks,
+                                 steps_per_epoch=int(X.shape[0] / float(self.batch_size)) + 1, nb_epoch=self.epochs, verbose=self.verbose,
+                                 validation_data = _batch_generator(X_val, y_val, self.batch_size, False) if self.validation_data_position is not None else None,
+                                 validation_steps = 10)
+
+        if self.validation_data_position is not None:
+            self.model.load_weights("weights.best.hdf5")
 
     def predict(self, X):
         pred = self.predict_proba(X)
         return sparse.csr_matrix(pred > 0.2)
 
     def predict_proba(self, X):
-        pred = self.model.predict_generator(generator=_batch_generatorp(X, 512), val_samples=X.shape[0])
+        pred = self.model.predict_generator(generator=_batch_generatorp(X, self.batch_size), steps=int(X.shape[0] / float(self.batch_size)) + 1)
         return pred
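For readers skimming the diff, here is a minimal usage sketch of the patched `MLP` estimator. The toy matrices, the 90/10 split, and the import path are illustrative, not part of the PR. It also shows the linear learning-rate scaling from `__init__` (Breuel, 2015): with the reference point of lr 0.01 at batch size 512, a batch size of 256 gives 256 / 512 * 0.01 = 0.005.

```python
import numpy as np
from scipy import sparse
# Module path as in this PR's file layout (Code/lucid_ml/classifying/neural_net.py):
from classifying.neural_net import MLP

# Hypothetical toy data: 1000 documents, 5000 BoW features, 20 binary labels.
X = sparse.random(1000, 5000, density=0.01, format='csr')
y = sparse.csr_matrix(np.random.rand(1000, 20) > 0.9, dtype=np.float32)

clf = MLP(verbose=1, batch_size=256, epochs=20)
print(clf.lr)  # 0.005: scaled from 0.01 by batch_size / 512

# Marking a split position enables early stopping (patience 5) and
# checkpointing of the best weights to "weights.best.hdf5" in fit().
clf.validation_data_position = int(X.shape[0] * 0.9)

clf.fit(X, y)
probas = clf.predict_proba(X)  # dense array of per-label probabilities
labels = clf.predict(X)        # sparse boolean matrix, fixed 0.2 threshold
```

Setting `validation_data_position` as an attribute after construction mirrors how the class exposes it in this diff: it is initialized to None in `__init__`, and fit() falls back to training on all rows when it stays None.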
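`predict` hard-codes a 0.2 decision threshold; commit 5768e8b ("Use --optimize_threshold to optimize threshold on validation set…") tunes that cutoff on held-out data instead. Below is a minimal sketch of the idea, assuming dense 0/1 indicator labels; the helper name and the 0.05-step grid are illustrative, and the PR's actual implementation may differ.

```python
import numpy as np
from sklearn.metrics import f1_score

def best_threshold(y_val, probas, grid=np.arange(0.05, 0.95, 0.05)):
    # Sweep candidate cutoffs and keep the one with the highest
    # sample-averaged F1 on the validation split.
    scores = [f1_score(y_val, (probas > t).astype(int), average='samples')
              for t in grid]
    return float(grid[int(np.argmax(scores))])

# Usage sketch (y_val as a dense 0/1 array, probas from predict_proba):
# t = best_threshold(y_val, clf.predict_proba(X_val))
# y_pred = sparse.csr_matrix(clf.predict_proba(X_test) > t)
```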