Boilerplate for Deep Learning Projects

Model Templates

Multi-layer Perceptron - MNIST (Homemade framework)
CNN from scratch (Homemade framework)
Logistic Regression - MNIST (TensorFlow)
Simple Multi-layer Perceptron - MNIST (TensorFlow)
Enhanced Multi-layer Perceptron using Batch Normalization - MNIST (TensorFlow)
Enhanced Multi-layer Perceptron using TensorFlow Estimator API - MNIST
Simple CNN - MNIST (TensorFlow)
Enhanced CNN - Image Classifier (Keras)
Image classifier (Keras)
Autoencoder - Denoising images, Facial Recognition, Face Generation (Keras)
RNN - Name Generator (Keras)
Part of speech (POS) tagging using an RNN (Keras)
Image Captioning (Keras)
Image Classifier using ResNet and Fast.ai (PyTorch)
Deep Q Network (Keras)
Generative Adversarial Network (GAN) (Keras)
Predicting StackOverflow Tags using Classical NLP
CNN using Sonnet - Signs dataset (DeepMind Sonnet)
Recognize named entities on Twitter using a Bidirectional LSTM (TensorFlow)
Recognize named entities on Twitter using CRF (sklearn-crfsuite)
Recognize named entities on Twitter using Bi-LSTM + CRF (TensorFlow)
Detect Duplicate Questions on StackOverflow using Embeddings
Building a Simple Calculator using a Sequence-to-Sequence Model (TensorFlow)
Reinforcement Learning using the cross-entropy method
Reinforcement Learning using a neural net (sklearn)
Navigate a Frozen Lake using a Markov Decision Process (MDP)
A Sequence-to-Sequence Chatbot (TensorFlow)
Solve the Taxi Challenge using Q-Learning
Training a Deep Q-Learning Network to play Atari Breakout (Keras)
Playing CartPole using REINFORCE (Keras)
Playing Kung Fu Master using Advantage Actor Critic (AAC) (Keras)
Playing CartPole using Monte Carlo Tree Search
Translating Hebrew to English using RL for Seq2Seq Models (TensorFlow)
Bernoulli Bandits - Survey of Model-free RL Algorithms
Q-Table Learning Agent
Multi-armed Bandit (TensorFlow)
Contextual Bandits (TensorFlow)
Vanilla Policy Gradient Agent (TensorFlow)
Model-based example for RL (TensorFlow)
Deep Q-Network (TensorFlow)
Deep Recurrent Q-Network (TensorFlow)
Asynchronous Actor-Critic Agents (A3C) (TensorFlow)
Wake-word Detection (Keras)
Neural Turing Machine (TensorFlow)
DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Adversarial Networks (PyTorch)
Pointer Generator Network for Text Summarization (TensorFlow)
Minimizing network delay using Deep Deterministic Policy Gradients (DDPG) (Keras)
RL from scratch - Using Policy Gradients to play Pong
Multi-class Text Classification using a CNN and RNN (TensorFlow)
Multi-class Text Classification using fastText (fastText)
Multi-class Text Classification using Fastai (Fastai / PyTorch)
Multi-class Text Classification using Logistic Regression (sklearn)
Multi-class Text Classification using Multinomial Naive Bayes (sklearn)
Multi-class Text Classification using NBSVM (SVM with Naive Bayes Features) (sklearn)
Multi-class Text Classification using BiLSTM (TensorFlow)
Multi-class Text Classification using Word-level CNN (TensorFlow)
Multi-class Text Classification using Word-level CNN initialized with Word2Vec Embeddings (TensorFlow)
Multi-class Text Classification using Character-level CNN (Keras)
Multi-class Text Classification using a Transformer Model (TensorFlow)
Multi-class Text Classification using BERT - Transfer Learning using Deep Bidirectional Transformers (TensorFlow)
Siamese CNN for document matching (Keras)
Text Generation via Adversarial Training (TensorFlow)
Question Detector using Word CNN (TensorFlow)
Decomposable Attention to identify question pairs that have the same intent (Keras)
LightGBM model to identify question pairs (sklearn / LightGBM)
lda2vec to mix the best parts of word2vec and LDA (TensorFlow)
Summarization using LSTM (TensorFlow)

Special Topics

Reinforcement Learning -- Survey of Methods
Natural Language Processing
What do you do with...
Exploring state-of-the-art in text classification

Demonstrates

Basic principles of a neural net framework with methods for forward and backward steps
Basic principles of a convolutional neural network
Basics of TensorFlow
Basic setup for a deep network
More complex network using batch normalization
Training with the TensorFlow Estimator API
Basic principles of a convolutional neural network
CNN using Keras
Fine-tuning InceptionV3 for image classification
Autoencoders
Basic principles of a recurrent neural network for character-level text generation
Using an RNN for POS tagging, using the high-level Keras API for building an RNN, creating a bidirectional RNN
Combining a CNN (encoder) and RNN (decoder) to caption images
A higher level framework (3 lines of code for an image classifier)
Deep Reinforcement Learning using CartPole environment in the OpenAI Gym
Basic principles of a GAN to generate doodle images trained on the 'Quick, Draw!' dataset.
Exploring classical NLP techniques for multi-label classification.
Basic usage of Sonnet to organize a TensorFlow model
Basic principles of a Bidirectional LSTM for named entity recognition
Basic principles of Conditional Random Fields (CRF) and comparison with Bi-LSTM on the same task
Combining a Bi-LSTM with CRF to get learned features + constraints
Use of embeddings at a sentence level, testing StarSpace from Facebook Research.
Solving sequence-to-sequence prediction tasks.
Basic principles of reinforcement learning
Approximating cross-entropy with neural nets in an RL model
Using a Markov Decision Process to solve an RL problem.
Building a chatbot using a sequence-to-sequence model approach.
Basic principles of Q-Learning
Tips and tricks to train a Deep Q-Learning Network - Frame Buffer, Experience Replay
Basic principles of using the REINFORCE algorithm
Basic principles of using the Advantage Actor Critic (AAC) algorithm
Introduction to Planning Algorithms using Monte Carlo Tree Search.
Reinforcement learning for sequence-to-sequence models.
Survey of Model-free RL algorithms - Epsilon-greedy, UCB1, and Thompson Sampling.
Introduction to Q-Table Learning.
Building a simple policy-gradient based agent that can solve the multi-armed bandit problem.
Building a simple policy-gradient based agent where the environment has state, but state is not determined by the previous state or action.
Introduction to Policy Gradient methods in RL.
Introduction to model-based RL networks.
Implement a Deep Q-Network using Experience Replay.
Implement a Deep Recurrent Q-Network to handle Partially Observable Markov Decision Processes (POMDPs).
Introduction to Asynchronous Actor-Critic Networks based on DeepMind Paper.
Processing audio using an RNN to detect wake-words.
Introduction to Neural Turing Machines.
Using a GAN to transfer style from one domain to another while preserving key attributes such as orientation and face identity.
Basic principles of Pointer Generator Networks.
Using Reinforcement Learning to optimize a Software Defined Network (SDN).
Introduction to Policy Gradients.
Experiments in finding best-in-class short-text classifier.
fastText (Facebook Research) performance in text classification tasks.
Using Transfer Learning in NLP to achieve state-of-the-art performance in text classification.

Baseline model for Multi-class Text Classification.

Datasets

MNIST - handwritten digits (Keras)
CIFAR-10 - labelled images with 10 classes
Flowers classification dataset
LFW (Labeled Faces in the Wild) - photographs of faces from the web
Names - list of human names
Captioned Images
Tagged sentences from the NLTK Brown Corpus
Quick, Draw! dataset
StackOverflow posts and corresponding tags
Sign language - numbers 0 - 5
Tweets tagged with named entities
Duplicate questions set, with positive and negative examples, from StackOverflow
Cornell movie dialog corpus.
Open Subtitles movie dialog corpus.
Hebrew to English words.
Pix2pix datasets.
San Francisco Crime Classification (for text/intent classification).
Large Movie Review Dataset (for text/intent classification).

Notation

Superscript [l] denotes an object of the l^{th} layer.
- Example: a^{[4]} is the 4^{th} layer activation. W^{[5]} and b^{[5]} are the 5^{th} layer parameters.
Superscript (i) denotes an object from the i^{th} example.
- Example: x^{(i)} is the i^{th} training example input.
Subscript i denotes the i^{th} entry of a vector.
- Example: a^{[l]}_i denotes the i^{th} entry of the activations in layer l, assuming this is a fully connected (FC) layer.
n_H, n_W and n_C denote respectively the height, width and number of channels of a given layer. If you want to reference a specific layer l, you can also write n_H^{[l]}, n_W^{[l]}, n_C^{[l]}.
n_{H_{prev}}, n_{W_{prev}} and n_{C_{prev}} denote respectively the height, width and number of channels of the previous layer. If referencing a specific layer l, this could also be denoted n_H^{[l-1]}, n_W^{[l-1]}, n_C^{[l-1]}.

Naming conventions

Hyperparameters

n_epochs
learning_rate, lr
epsilon

Parameters

features, inp, x, x_train, x_val, x_test
labels, y, y_train, y_val, y_test
weights, w, w1, w2, w3
bias, b, b1, b2, b3
z, z1, z2, z3
a, a1, a2, a3
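
As an illustration of how these names might fit together, here is a minimal sketch of a two-layer forward pass in plain Python. The `dense` and `sigmoid` helpers, the layer sizes, and all numeric values are hypothetical and chosen only to show the naming; they are not taken from any template in this repository.

```python
import math


def dense(inp, weights, bias):
    """Compute z = inp . weights + bias for a single example.

    weights is a list of rows with shape (n_in, n_out); bias has length n_out.
    """
    return [
        sum(x_i * w_ij for x_i, w_ij in zip(inp, column)) + b_j
        for column, b_j in zip(zip(*weights), bias)
    ]


def sigmoid(z):
    """Element-wise logistic activation."""
    return [1.0 / (1.0 + math.exp(-v)) for v in z]


# Hypothetical input and parameters, named per the conventions above.
x = [0.5, -1.0]                              # features for one example
w1 = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]      # layer 1 weights, shape (2, 3)
b1 = [0.0, 0.0, 0.0]                         # layer 1 bias
w2 = [[1.0], [1.0], [1.0]]                   # layer 2 weights, shape (3, 1)
b2 = [0.0]                                   # layer 2 bias

z1 = dense(x, w1, b1)    # pre-activation of layer 1
a1 = sigmoid(z1)         # activation of layer 1
z2 = dense(a1, w2, b2)   # pre-activation of layer 2
a2 = sigmoid(z2)         # activation of layer 2 (the model output)
```

The numeric suffix matches the layer index, so `z1`/`a1` and `w1`/`b1` all belong to layer 1.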

Common tests

Check gradients against a calculated finite-difference approximation
Check shapes
Logits range. If your model has a bounded output range rather than a linear one, test that the outputs stay within it. For example, if the logits pass through a tanh output, all values should fall between -1 and 1.
Input dependencies. Make sure all of the variables in feed_dict affect the train_op.
Variable change. Check that the variables you expect to train actually change with each training op.
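
The first test in this list, checking gradients against a finite-difference approximation, can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions: the toy `loss` function, its hand-derived `loss_grad`, the step size `eps`, and the tolerance are all hypothetical choices for illustration, not part of any template here.

```python
import math


def numerical_gradient(f, params, eps=1e-5):
    """Central-difference approximation of the gradient of a scalar function f."""
    grad = []
    for i in range(len(params)):
        plus = list(params)
        minus = list(params)
        plus[i] += eps
        minus[i] -= eps
        grad.append((f(plus) - f(minus)) / (2 * eps))
    return grad


def relative_error(analytic, numeric):
    """Norm of the difference divided by the sum of the norms."""
    diff = math.sqrt(sum((a - n) ** 2 for a, n in zip(analytic, numeric)))
    scale = (math.sqrt(sum(a * a for a in analytic))
             + math.sqrt(sum(n * n for n in numeric)))
    return diff / scale if scale > 0 else 0.0


# Hypothetical toy loss f(w) = w0^2 + 3*w0*w1 and its hand-derived gradient.
def loss(w):
    return w[0] ** 2 + 3 * w[0] * w[1]


def loss_grad(w):
    return [2 * w[0] + 3 * w[1], 3 * w[0]]


w = [1.5, -0.5]
error = relative_error(loss_grad(w), numerical_gradient(loss, w))
assert error < 1e-7, "analytic gradient disagrees with finite differences"
```

A relative error is used instead of an absolute one so that the same tolerance remains meaningful when gradient magnitudes differ between layers.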

Good practices for tests:

Keep them deterministic. If you really want randomized input, make sure to seed the random number generator so you can rerun the test easily.
Keep the tests short. Don’t have a unit test that trains to convergence and checks against a validation set. You are wasting your own time if you do this.
Make sure you reset the graph between each test.
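
A minimal sketch of the determinism advice, using only the standard library: each test gets a freshly seeded generator in `setUp`, which is also a natural place to reset any shared state (such as a framework graph) between tests. The `make_batch` helper, the seed value, and the batch dimensions are hypothetical.

```python
import random
import unittest


def make_batch(rng, batch_size=4, n_features=3):
    """Build a small random batch from the given random.Random instance."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_features)]
            for _ in range(batch_size)]


class BatchTest(unittest.TestCase):
    def setUp(self):
        # A fresh, seeded generator per test keeps tests independent
        # of each other and reproducible across runs.
        self.rng = random.Random(42)

    def test_batch_is_reproducible(self):
        first = make_batch(self.rng)
        second = make_batch(random.Random(42))
        self.assertEqual(first, second)

    def test_batch_shape(self):
        batch = make_batch(self.rng)
        self.assertEqual(len(batch), 4)
        self.assertTrue(all(len(row) == 3 for row in batch))
```

Both tests build only a tiny batch, in line with keeping unit tests short.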

Useful references

How to test gradient implementations

Ideas

Turn trainers into generators, one epoch at a time
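
One way this idea could look, as a minimal sketch: the trainer is a Python generator that yields after every epoch, so the caller controls logging, checkpointing, and early stopping. The toy one-weight loss, the `train` function name, and the stopping threshold are hypothetical.

```python
def train(n_epochs, learning_rate=0.1):
    """Yield (epoch, loss) after each epoch of gradient descent on a toy loss."""
    w = 0.0  # single weight; the toy loss is (w - 3)^2
    for epoch in range(1, n_epochs + 1):
        grad = 2 * (w - 3.0)
        w -= learning_rate * grad
        loss = (w - 3.0) ** 2
        yield epoch, loss


history = []
for epoch, loss in train(n_epochs=50):
    history.append(loss)
    if loss < 1e-6:  # the caller, not the trainer, decides when to stop
        break
```

Because the training loop is suspended between epochs, a test can also pull a single epoch with `next(train(n_epochs=1))` instead of training to convergence.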
