Analytics labs notebooks for Statistics and Business School students

CBravoR/AdvancedAnalyticsLabs

AdvancedAnalyticsLabs

Analytics lab notebooks supporting analytics teaching in BSc and MSc courses. I have taught these at both a business school and a statistics department, so they should suit both audiences reasonably well. There are currently 19 labs, divided into five topics:

Intro to Python

  1. Introduction to Python: First few steps. A simple introduction for people who are already familiar with other languages; not meant for people with no programming experience!

  2. Functions and Revenue Management: Implementation of simple algorithms (Littlewood, EMSR-a and EMSR-b). Covers function creation and an introduction to PyPlot. Taught until 2019 at Southampton University as part of the Advanced Analytics course.
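To give a flavour of the revenue management lab, here is a minimal sketch of Littlewood's rule for two fare classes. It is not taken from the notebook: the function name `littlewood_protection_level` and its parameters are hypothetical, and it assumes high-fare demand is normally distributed.

```python
from statistics import NormalDist


def littlewood_protection_level(high_fare, low_fare, mean_demand, sd_demand):
    """Seats to protect for the high-fare class under Littlewood's rule.

    Assumption (not from the lab): high-fare demand is Normal(mean, sd).
    The rule protects y seats where P(demand > y) = low_fare / high_fare.
    """
    if not 0 < low_fare < high_fare:
        raise ValueError("Expected 0 < low_fare < high_fare")
    critical_ratio = 1 - low_fare / high_fare
    return NormalDist(mean_demand, sd_demand).inv_cdf(critical_ratio)


# Hypothetical numbers: the booking limit for the low fare is
# capacity minus the protection level.
capacity = 100
protection = littlewood_protection_level(200, 100, mean_demand=50, sd_demand=10)
booking_limit = capacity - protection
```

When the low fare is exactly half the high fare, the critical ratio is 0.5 and the protection level equals the mean demand.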

Banking Regulation

  1. Basel Capital Requirements: Covers lambda functions and an introduction to pandas in the context of the Basel capital requirements formulas.

  2. Bond Pricing: Teaches bond pricing, yields and clean/dirty prices. Taught from 2019 at Western University as part of the Banking Analytics course I created. Replaces the Revenue Management lab above, and also covers function creation and an introduction to PyPlot.
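As an illustration of the bond pricing topic, the sketch below discounts a fixed-coupon bond's cash flows at a flat yield and computes accrued interest, which is the difference between the dirty and clean price. The function names are hypothetical and the day-count convention is reduced to a simple "fraction of the coupon period elapsed", which is an assumption rather than what the lab necessarily uses.

```python
def bond_price(face, coupon_rate, yield_rate, years, freq=1):
    """Present value of a fixed-coupon bond, priced on a coupon date."""
    coupon = face * coupon_rate / freq
    periods = years * freq
    y = yield_rate / freq  # periodic yield
    pv_coupons = sum(coupon / (1 + y) ** t for t in range(1, periods + 1))
    pv_face = face / (1 + y) ** periods
    return pv_coupons + pv_face


def accrued_interest(face, coupon_rate, fraction_elapsed, freq=1):
    """Coupon interest earned since the last coupon date.

    fraction_elapsed is the share of the current coupon period that has
    passed (a simplification of real day-count conventions).
    """
    return face * coupon_rate / freq * fraction_elapsed


# Hypothetical example: clean price = dirty price - accrued interest.
dirty_price = 1012.50
clean_price = dirty_price - accrued_interest(1000, 0.06, 0.5, freq=2)
```

A bond whose coupon rate equals its yield prices at par, which is a quick sanity check for `bond_price`.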

Credit Risk Modelling

  1. Data Preprocessing: Simple data preprocessing using pandas and scikit-learn.

  2. Weight of evidence transformation: How to calculate Weight of Evidence transformations in Python. Uses my own fork of the excellent scorecardpy package by @ShichenXie, with some bugs fixed and other personalizations.

  3. Logistic Regression and Scorecards: Intro to scikit-learn, how to run Lasso and Ridge regressions, and how to calculate a scorecard.

  4. Random Forest and XGBoosting: How to run a Random Forest, an XGBoost model, tune parameters over a grid, use Shapley values to explain predictions, and compare ROC curves.

  5. LGD Modelling: How to model LGD using either a GLM or an XGB model.

  6. PD / LGD Calibration: How to define ratings by segmenting the ROC curve and calibrate a long-run PD / downturn LGD adjusted by macroeconomic factors.
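For readers new to the Weight of Evidence transformation mentioned above, here is a minimal sketch in plain Python. It does not use scorecardpy and is not the lab's code: the function name is hypothetical, the bins are assumed to be already defined, and the sign convention (log of good share over bad share) is an assumption, since packages differ on it.

```python
import math


def weight_of_evidence(goods, bads):
    """WoE per bin from counts of good and bad cases.

    goods and bads map bin label -> count. Assumed convention:
    WoE = ln(share of goods in bin / share of bads in bin).
    Bins with a zero count need smoothing first; this sketch does not
    handle them.
    """
    total_good = sum(goods.values())
    total_bad = sum(bads.values())
    return {
        bin_label: math.log(
            (goods[bin_label] / total_good) / (bads[bin_label] / total_bad)
        )
        for bin_label in goods
    }


# Hypothetical counts for a two-bin variable.
woe = weight_of_evidence(
    goods={"low_risk": 80, "high_risk": 20},
    bads={"low_risk": 10, "high_risk": 10},
)
```

Under this convention, bins with relatively more good cases get a positive WoE and bins with relatively more bad cases get a negative one.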

Deep Learning

  1. Introduction to Keras, PyTorch, and Shallow ANN: Gentle introduction to Keras and PyTorch.

  2. 2D CNN and Gradient Backtracing: 2D convolutions for image classification. Use of pre-trained models (VGG16), and gradient backtracing in PyTorch to visualize what the model uses to discriminate.

  3. Multimodal learning: Regression example using ResNet50V2 and the Keras Model API. This is the multimodal example I currently use in my lectures; it combines categorical data and image data.

  4. Recurrent Networks: LSTM and GRU in PyTorch.

  5. Transformers: Applying the Transformer using Hugging Face's packages.

  6. LLM API: Using OpenAI's LLM libraries and examples.

Other labs

  1. SQL Refresher: Refresher on SQL, how to access it from Python, and a very light introduction to SQLAlchemy.

  2. Primer on Visualization: A few plots using pyplot, seaborn and plotly. Very introductory primer.

  3. Explainability and Confounding: How to use the SHAP package to explain XGB models, plus a couple of examples of confounding factors. Taught as part of the DS3000 - Intro to Machine Learning course at Western.
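As a small taste of the SQL Refresher topic (accessing SQL from Python), the sketch below runs a grouped query with the standard library's `sqlite3` module. The table, column names and data are hypothetical, and the lab itself may use a different database or SQLAlchemy instead.

```python
import sqlite3


def average_balance_by_segment(rows):
    """Load (segment, balance) rows into an in-memory table and aggregate."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE loans (segment TEXT, balance REAL)")
        # Parameterized inserts avoid building SQL strings by hand.
        conn.executemany("INSERT INTO loans VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT segment, AVG(balance) FROM loans "
            "GROUP BY segment ORDER BY segment"
        )
        return cursor.fetchall()
    finally:
        conn.close()


# Hypothetical data: two retail loans and one SME loan.
result = average_balance_by_segment(
    [("retail", 100.0), ("retail", 300.0), ("sme", 500.0)]
)
```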

These labs are available under the GPL v3, so feel free to use them as you wish. I'll be grateful if you can link back to this GitHub repository, as I'll keep the notebooks updated in subsequent iterations of the modules where I teach them. As always, these notebooks are provided with no guarantees.

Comments are welcome!