Notebooks

This directory contains Jupyter notebooks that demonstrate how to use the iQual package.

The following notebooks demonstrate how to construct a basic model for a single annotation task:

The following notebooks demonstrate how to construct more advanced models, including models for multiple annotation tasks, models with multiple vectorizers and classifiers, and models with bootstrap resampling.

  • Model with Multiple Vectorizers - This notebook demonstrates how to construct a model for a single annotation task using multiple vectorizers. This can be useful if you want to combine different types of vectorizers (e.g. pretrained-embedding models, count-based models); the first sketch after this list illustrates the general idea.

  • Model with Multiple Classifiers - This notebook demonstrates how to construct a model for a single annotation task using multiple classifiers. This can be useful if you want to combine different types of classifiers and compare their performance on the same data (also covered in the first sketch after this list).

  • Model with Multiple Annotations - This notebook demonstrates how to run the model fitting process for multiple annotation tasks.

  • Model with Bootstrap - This notebook demonstrates how to run the model fitting process with bootstrap resampling; the second sketch after this list shows the general pattern.
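
As a rough illustration of what the multiple-vectorizer and multiple-classifier notebooks do, the sketch below combines two vectorizers with a scikit-learn FeatureUnion and compares two classifiers on the same data. It does not use the iQual API; the texts, labels, and model choices are illustrative placeholders only.

```python
from sklearn.pipeline import Pipeline, FeatureUnion
from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Placeholder documents and binary annotations for a single code (illustrative only).
texts = [
    "school fees are too high",
    "we need medicine for the clinic",
    "no money left for tuition",
    "the clinic is far away",
    "tuition keeps rising every year",
    "hospital visits cost too much",
]
labels = [1, 0, 1, 0, 1, 0]

# Combine word-level and character-level vectorizers into one feature space.
features = FeatureUnion([
    ("word_tfidf", TfidfVectorizer(ngram_range=(1, 2))),
    ("char_counts", CountVectorizer(analyzer="char_wb", ngram_range=(2, 4))),
])

# Compare two classifiers with cross-validation on the same features.
for name, clf in [
    ("logistic_regression", LogisticRegression(max_iter=1000)),
    ("random_forest", RandomForestClassifier(n_estimators=50, random_state=0)),
]:
    pipe = Pipeline([("features", features), ("clf", clf)])
    scores = cross_val_score(pipe, texts, labels, cv=3, scoring="f1")
    print(f"{name}: mean f1 = {scores.mean():.2f}")
```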
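
The bootstrap notebook's resampling pattern, sketched here with plain scikit-learn; the number of bootstrap samples, the data, and the pipeline are assumptions for illustration rather than the notebook's actual setup.

```python
import numpy as np
from sklearn.utils import resample
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Placeholder documents and binary annotations (illustrative only).
texts = [
    "school fees are too high",
    "we need medicine for the clinic",
    "no money left for tuition",
    "the clinic is far away",
    "tuition keeps rising every year",
    "hospital visits cost too much",
]
labels = [1, 0, 1, 0, 1, 0]

predictions = []
for seed in range(25):  # the number of bootstrap samples is an arbitrary choice here
    # Draw a bootstrap sample (with replacement), keeping the label balance.
    X_bs, y_bs = resample(texts, labels, random_state=seed, stratify=labels)
    pipe = Pipeline([
        ("tfidf", TfidfVectorizer()),
        ("clf", LogisticRegression(max_iter=1000)),
    ]).fit(X_bs, y_bs)
    predictions.append(pipe.predict(texts))

# Averaging over bootstrap fits gives a per-document prediction plus a spread.
mean_pred = np.mean(predictions, axis=0)
print("mean bootstrap prediction per document:", mean_pred)
```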

The following notebooks demonstrate how to measure the interpretability of a model, to test whether interpretability improves with increasing Nh (the number of human annotations), and how to plot the distribution of regression coefficients.

  • Interpretability Tests - This notebook demonstrates how to run the interpretability tests on human and enhanced data to determine whether the enhanced data adds value by augmenting the human data.

  • Interpretability with increasing Nh - This notebook demonstrates how the interpretability of ML-assisted enhanced data increases with increasing Nh. It examines the effect of increasing Nh while holding N = Nh + Nm fixed. Intuitively, this can be thought of as adding human annotations to some of the existing interviews that are currently machine annotated.

  • Distribution of Regression Coefficients - This notebook demonstrates how to run the interpretability tests on a model and plot the distribution of regression coefficients. The sizes of both the human-annotated (Nh) and machine-annotated (Nm) samples are varied to evaluate how many documents should be annotated by humans to achieve a certain level of interpretability (the sketch after this list illustrates the coefficient-distribution idea).

  • Bias Test - This notebook demonstrates how to explicitly run bias tests on a model using cross-validated predictions across 25 bootstrap samples.
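
As a rough, non-authoritative sketch of the coefficient-distribution idea referenced above, the code below refits a plain logistic regression on 25 bootstrap samples and summarizes the spread of its coefficients. The data, vectorizer, and sample counts are illustrative assumptions and do not reproduce the notebook's Nh/Nm analysis or the iQual API.

```python
import numpy as np
from sklearn.utils import resample
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Placeholder documents and binary annotations (illustrative only).
texts = [
    "school fees are too high",
    "we need medicine for the clinic",
    "no money left for tuition",
    "the clinic is far away",
    "tuition keeps rising every year",
    "hospital visits cost too much",
]
labels = [1, 0, 1, 0, 1, 0]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(texts)

# Refit the classifier on 25 bootstrap samples and collect its coefficients.
coefs = []
for seed in range(25):
    X_bs, y_bs = resample(X, labels, random_state=seed, stratify=labels)
    clf = LogisticRegression(max_iter=1000).fit(X_bs, y_bs)
    coefs.append(clf.coef_.ravel())

coefs = np.vstack(coefs)
# The spread of each coefficient across bootstrap fits is one way to look at
# how stable (and therefore interpretable) the fitted model is.
terms = vectorizer.get_feature_names_out()
for term, spread in zip(terms, coefs.std(axis=0)):
    print(f"{term}: std = {spread:.3f}")
```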