This chapter kicks off part four, which covers how several deep learning (DL) modeling techniques can be useful for investment and trading. DL has achieved numerous breakthroughs in domains ranging from image and speech recognition to robotics and intelligent agents. These breakthroughs have drawn widespread attention and revived large-scale research into Artificial Intelligence (AI). Expectations are high that this rapid development will continue and that many more solutions to difficult practical problems will emerge.
In this chapter, we will present feedforward neural networks to introduce key elements of working with neural networks relevant to the various DL architectures covered in the following chapters. More specifically, we will demonstrate how to train large models efficiently using the backpropagation algorithm and manage the risks of overfitting. We will also show how to use the popular Keras, TensorFlow 2.0, and PyTorch frameworks, which we will leverage throughout part four.
In the following chapters, we will build on this foundation to design various architectures suitable for different investment applications with a particular focus on alternative text and image data. These include recurrent neural networks (RNNs) tailored to sequential data such as time series or natural language, and Convolutional Neural Networks (CNNs), which are particularly well suited to image data but can also be used with time-series data. We will also cover deep unsupervised learning, including autoencoders and Generative Adversarial Networks (GANs) as well as reinforcement learning to train agents that interactively learn from their environment.
In particular, this chapter will cover:
- How DL solves AI challenges in complex domains
- Key innovations that have propelled DL to its current popularity
- How feedforward networks learn representations from data
- Designing and training deep neural networks in Python
- Implementing deep NN using Keras, TensorFlow, and PyTorch
- Building and tuning a deep NN to predict asset price moves
- Deep Learning, Ian Goodfellow, Yoshua Bengio and Aaron Courville, MIT Press, 2016
- Deep learning, Yann LeCun, Yoshua Bengio, and Geoffrey Hinton, Nature 2015
- Neural Networks and Deep Learning, Michael A. Nielsen, Determination Press, 2015
- The Quest for Artificial Intelligence - A History of Ideas and Achievements, Nils J. Nilsson, Cambridge University Press, 2010
- One Hundred Year Study on Artificial Intelligence (AI100)
- TensorFlow Playground, Interactive, browser-based Deep Learning platform
- Gradient Checking & Advanced Optimization, Unsupervised Feature Learning and Deep Learning, Stanford University
- An overview of gradient descent optimization algorithms, Sebastian Ruder, 2016
To gain a better understanding of how NNs work, the notebook 01_build_and_train_feedforward_nn formulates a simple feedforward architecture and its forward propagation computations using matrix algebra and implements them using NumPy, Python's core library for linear algebra.
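The forward propagation step can be sketched in a few lines of NumPy. This is a minimal illustration rather than the notebook's code: the layer sizes, the sigmoid hidden activation, the softmax output, and all variable names are assumptions chosen for the example.

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic activation."""
    return 1.0 / (1.0 + np.exp(-z))


def softmax(z):
    """Row-wise softmax; subtracting the row maximum avoids overflow."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def forward(X, W_h, b_h, W_o, b_o):
    """One hidden layer: H = sigmoid(X W_h + b_h), output = softmax(H W_o + b_o)."""
    H = sigmoid(X @ W_h + b_h)
    return softmax(H @ W_o + b_o)


# Illustrative dimensions (assumed): 5 samples, 2 features, 3 hidden units, 2 classes
rng = np.random.default_rng(seed=42)
n_samples, n_features, n_hidden, n_classes = 5, 2, 3, 2

X = rng.normal(size=(n_samples, n_features))
W_h = rng.normal(scale=0.1, size=(n_features, n_hidden))
b_h = np.zeros(n_hidden)
W_o = rng.normal(scale=0.1, size=(n_hidden, n_classes))
b_o = np.zeros(n_classes)

probs = forward(X, W_h, b_h, W_o, b_o)
print(probs.shape)  # one row of class probabilities per sample
```

Each layer is a matrix product followed by an element-wise nonlinearity, which is why the whole forward pass reduces to a handful of array operations.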
Currently, the most popular DL libraries are TensorFlow (supported by Google), Keras (led by François Chollet, now at Google), and PyTorch (supported by Facebook). Development is very active: PyTorch has just released version 1.0, and TensorFlow 2.0 is expected in early spring 2019, when it is set to adopt Keras as its main interface.
Additional options include:
- Microsoft Cognitive Toolkit (CNTK)
- Caffe
- Theano, developed at University of Montreal since 2007
- Apache MXNet, used by Amazon
- Chainer, developed by the Japanese company Preferred Networks
- Torch, uses Lua, basis for PyTorch
- Deeplearning4J, uses Java
All popular deep learning libraries support the use of GPUs, and some also allow for parallel training on multiple GPUs. The most common GPUs are produced by NVIDIA, and configuration requires installation and setup of the CUDA environment. The process continues to evolve and can be somewhat challenging depending on your computational environment.
A more straightforward way to leverage GPUs is via the Docker virtualization platform. Numerous images are available that you can run in a local container managed by Docker, which circumvents many of the driver and version conflicts that you may otherwise encounter. TensorFlow provides Docker images on its website that can also be used with Keras.
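Whichever setup route you take, it helps to confirm that the framework actually sees the GPU before training. The following is a minimal sketch that assumes PyTorch is installed; it falls back to the CPU when no CUDA device is visible, and the variable names are illustrative.

```python
import torch

# Use the first CUDA device if PyTorch can see one, otherwise the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Tensors (and models) must be created on, or moved to, the chosen device.
x = torch.ones(2, 3, device=device)

print(f"device: {device}, visible CUDA devices: {torch.cuda.device_count()}")
```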
- Which GPU(s) to Get for Deep Learning: My Experience and Advice for Using GPUs in Deep Learning, Tim Dettmers
Keras was designed as a high-level or meta API to accelerate the iterative workflow when designing and training deep neural networks with computational backends like TensorFlow, Theano, or CNTK. It was integrated into TensorFlow in 2017 and is set to become the principal TensorFlow interface with the 2.0 release. You can also combine code from both libraries to leverage Keras’ high-level abstractions as well as customized TensorFlow graph operations.
The notebook how_to_use_keras demonstrates the functionality.
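To give a flavor of the workflow, here is a minimal sketch of defining, compiling, and fitting a small binary classifier with the Keras API bundled with TensorFlow 2. It is not taken from the notebook: the synthetic data, layer sizes, dropout rate, and training settings are assumptions for illustration only.

```python
import numpy as np
import tensorflow as tf

# Synthetic data (assumed): 256 samples, 10 features, binary label
rng = np.random.default_rng(seed=42)
X = rng.normal(size=(256, 10)).astype("float32")
y = (X[:, 0] + X[:, 1] > 0).astype("float32")

# Stack layers in order; dropout is one simple guard against overfitting
model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# compile() attaches the optimizer, loss, and metrics to the model
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# fit() runs the training loop; 20% of the data is held out for validation
history = model.fit(X, y, epochs=5, batch_size=32, validation_split=0.2, verbose=0)

# predict() returns one probability per sample
preds = model.predict(X[:5], verbose=0)
print(preds.shape)
```

The `history.history` dictionary holds the per-epoch training and validation metrics, which is convenient for spotting overfitting.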
- A Full Hardware Guide to Deep Learning, Tim Dettmers
- Keras documentation
TensorBoard is a suite of visualization tools that comes with TensorFlow and simplifies the understanding, debugging, and optimization of neural networks.
You can use it to visualize the computational graph, plot various execution and performance metrics, and even visualize image data processed by the network. It also permits comparisons of different training runs. Once you have run the how_to_use_keras notebook and have TensorFlow installed, you can launch TensorBoard from the command line:
tensorboard --logdir=/full_path_to_your_logs ## e.g. ./tensorboard
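The directory passed to `--logdir` needs to contain event files written during training. One common way to produce them is the `tf.keras.callbacks.TensorBoard` callback. The sketch below assumes TensorFlow 2 and uses a temporary directory as a hypothetical log location; the tiny model and data are placeholders and not the notebook's setup.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Hypothetical log location; point --logdir at this path afterwards
log_dir = tempfile.mkdtemp(prefix="tensorboard_")

# Placeholder data and model (assumed for illustration)
rng = np.random.default_rng(seed=0)
X = rng.normal(size=(128, 4)).astype("float32")
y = (X.sum(axis=1) > 0).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# The callback writes metrics to log_dir as training proceeds
tb_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir)
model.fit(X, y, epochs=2, batch_size=32, verbose=0, callbacks=[tb_callback])

logged = os.listdir(log_dir)
print(log_dir, logged)
```

Writing each training run to its own subdirectory of a common parent makes it straightforward to compare runs side by side.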
PyTorch was developed by the Facebook AI Research group led by Yann LeCun, and the first alpha version was released in September 2016. It provides deep integration with Python libraries like NumPy that can be used to extend its functionality, strong GPU acceleration, and automatic differentiation using its autograd system. It provides more granular control than Keras through a lower-level API and is mainly used as a deep learning research platform, but it can also replace NumPy while enabling GPU computation.
It employs eager execution, in contrast to the static computation graphs used by, e.g., Theano or TensorFlow. Rather than initially defining and compiling a network for fast but static execution, it relies on its autograd package for automatic differentiation of Tensor operations, i.e., it computes gradients ‘on the fly’ so that network structures can be partially modified more easily. This is called define-by-run, meaning that backpropagation is defined by how your code runs, which in turn implies that every single iteration can be different. The PyTorch documentation provides a detailed tutorial on this.
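A small sketch makes the define-by-run idea concrete. The `forward` function and its `n_steps` argument are hypothetical names for this example: the loop length is an ordinary Python value decided at run time, so autograd records a different graph on each call and differentiates whatever was actually executed.

```python
import torch


def forward(x, w, n_steps):
    """Multiply x by w n_steps times; the graph depends on n_steps at run time."""
    out = x
    for _ in range(n_steps):
        out = out * w
    return out.sum()


x = torch.tensor([1.0, 2.0])
w = torch.tensor(3.0, requires_grad=True)

# First pass: loss = (1 + 2) * w**2, so d(loss)/dw = 6 * w = 18
loss = forward(x, w, n_steps=2)
loss.backward()
grad_two_steps = w.grad.item()

# Gradients accumulate in w.grad, so reset before the next pass
w.grad.zero_()

# Second pass with a different graph: loss = 3 * w**3, so d(loss)/dw = 9 * w**2 = 81
loss = forward(x, w, n_steps=3)
loss.backward()
grad_three_steps = w.grad.item()

print(grad_two_steps, grad_three_steps)
```

No graph was declared or compiled up front; the same Python function produced two different computations and two correct gradients.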
TensorFlow became the leading deep learning library shortly after its release in November 2015, about a year before PyTorch. TensorFlow 2.0 aims to simplify an API that has grown increasingly complex over time by making the Keras API, which has been integrated into TensorFlow as part of the contrib package since 2017, its principal interface, and by adopting eager execution. It will continue to focus on a robust implementation across numerous platforms but will make it easier to experiment and do research.
The notebook how_to_use_tensorflow will illustrate how to use the 2.0 release (updated as the interface stabilizes).
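As a brief illustration of eager execution in TensorFlow 2, the sketch below computes a gradient with `tf.GradientTape`. It assumes a TensorFlow 2.x installation and is not taken from the notebook; the variable and the toy loss are chosen only for the example.

```python
import tensorflow as tf

# Operations run immediately in TensorFlow 2; no session or graph setup is needed
w = tf.Variable(3.0)

# The tape records operations on watched variables as they execute
with tf.GradientTape() as tape:
    loss = w ** 2 + 2.0 * w

# d(loss)/dw = 2 * w + 2 = 8 at w = 3
grad = tape.gradient(loss, w)
value = float(grad.numpy())

print(tf.executing_eagerly(), value)
```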
- TensorFlow.org
- Standardizing on Keras: Guidance on High-level APIs in TensorFlow 2.0
- TensorFlow.js, A JavaScript library for training and deploying ML models in the browser and on Node.js
In practice, we need to explore variations of the design options outlined above because we can rarely be sure from the outset which network architecture best suits the data. The GridSearchCV class provided by scikit-learn, which we encountered in Chapter 6, The Machine Learning Workflow, conveniently automates this process. Just be mindful of the risk of false discoveries and keep track of how many experiments you run so that you can adjust the results accordingly.
The notebook how_to_optimize_a_NN_architecure explores various options to build a simple feedforward neural network that predicts asset price moves over a one-month horizon. The Python script of the same name facilitates running the code on a server to speed up computation.
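The search over architectures can be sketched with GridSearchCV as follows. To keep the example self-contained, it swaps in scikit-learn's own `MLPClassifier` as a stand-in for a Keras model and uses synthetic data; the parameter grid, scoring metric, and fold count are assumptions for illustration. With real time-series data, the plain `cv=3` split would need to be replaced by a scheme that respects temporal order.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# Synthetic data (assumed): 200 samples, 5 features, binary label
rng = np.random.default_rng(seed=42)
X = rng.normal(size=(200, 5))
y = (X[:, 0] * X[:, 1] > 0).astype(int)

# Each combination of these options is one experiment
param_grid = {
    "hidden_layer_sizes": [(8,), (16, 8)],
    "alpha": [1e-4, 1e-2],  # L2 regularization strength
}

search = GridSearchCV(
    estimator=MLPClassifier(max_iter=200, random_state=42),
    param_grid=param_grid,
    cv=3,
    scoring="roc_auc",
)
search.fit(X, y)

# Record how many configurations were tried so results can be adjusted later
n_experiments = len(search.cv_results_["params"])
print(n_experiments, search.best_params_, round(search.best_score_, 3))
```

Logging `n_experiments` alongside the best score keeps the number of trials visible when you later judge whether the winning configuration is a genuine improvement or a false discovery.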