Skip to content

Latest commit

 

History

History
183 lines (134 loc) · 9.88 KB

README.md

File metadata and controls

183 lines (134 loc) · 9.88 KB

iQual

MIT license PyPI version Docs - GitHub.io

This repository contains the code and resources necessary to implement the techniques described in the paper A Method to Scale-Up Interpretative Qualitative Analysis, with an Application to Aspirations in Cox’s Bazaar, Bangladesh. The iQual package is designed for qualitative analysis of open-ended interviews and aims to extend a small set of interpretative human-codes to a much larger set of documents using natural language processing. The package provides a method for assessing the robustness and reliability of this approach. The iQual package has been applied to analyze 2,200 open-ended interviews on parent’s aspirations for children from Rohingya refugees and their Bangladeshi hosts in Cox’s Bazaar, Bangladesh. It draws on work in anthropology and philosophy to expand conceptions of aspirations in economics to distinguish between material goals, moral and religious values, and navigational capacity—the ability to achieve goals and aspirations, showing that they have very different correlates.

With iQual, researchers can efficiently analyze large amounts of qualitative data while maintaining the nuance and accuracy of human interpretation.


Installation

  • To install iQual using pip, use the following command:
pip install -U iQual
  • Alternatively, you can install iQual from source. To do so, use the following commands:
git clone https://github.com/worldbank/iQual.git
cd iQual
pip install -e .

Dependencies

iQual requires Python 3.7+ and the following dependencies:


Features

iQual is a package designed for qualitative analysis of open-ended interviews. It allows researchers to efficiently analyze large amounts of qualitative data while maintaining the nuance and accuracy of human interpretation.


Basic Usage

The following code demonstrates the basic usage of the iQual package. It shows how to construct a pipeline, fit it to the data, and use it to classify new data.

Import the iqual package and initiate the model class.

from iqual import iqualnlp     # Import `iqualnlp` from the `iqual` package

iqual_model = iqualnlp.Model() # Initiate the model class

Add text features to the model. The add_text_features method takes the following arguments:

  • question_col: The name of the column containing the question text.
  • answer_col: The name of the column containing the answer text.
  • model: Name of a scikit-learn, spaCy, sentence-transformers, or a precomputed vector (picklized dictionary) model. The default is TfidfVectorizer.
  • env: The environment or package which is being used. The default is scikit-learn. Available options are scikit-learn, spacy, sentence-transformers, and saved-dict.
  • **kwargs: Additional keyword arguments to pass to the model.
# Use a scikit-learn feature extraction method
iqual_model.add_text_features(question_col,answer_col,model='TfidfVectorizer',env='scikit-learn') 

# OR - Use a sentence-transformers model
iqual_model.add_text_features(question_col,answer_col,model='all-mpnet-base-v2',env='sentence-transformers') 

# OR - Use a spaCy model
iqual_model.add_text_features(question_col,answer_col,model='en_core_web_lg',env='spacy') 

# OR - Use a precomputed vector (picklized dictionary)
iqual_model.add_text_features(question_col,answer_col,model='qa_precomputed.pkl',env='saved-dict') 

(OPTIONAL) Add a feature transformation layer. The add_feature_transformer method takes the following arguments:

  • name: The name of the feature transformation layer.
  • transformation: The type of transformation. Available options are FeatureScaler and DimensionalityReduction.

To add a feature scaling layer, use the following code:

iqual_model.add_feature_transformer(name='Normalizer', transformation="FeatureScaler") # or any other scikit-learn scaler

To add a dimensionality reduction layer, use the following code:

iqual_model.add_feature_transformer(name='UMAP', transformation="DimensionalityReduction") # supports UMAP or any other scikit-learn decomposition method

Add a classifier layer. The add_classifier method takes the following arguments:

  • name: The name of the classifier layer. The default is LogisticRegression.
  • **kwargs: Additional keyword arguments to pass to the classifier.
iqual_model.add_classifier(name = "LogisticRegression") #  Add a classifier layer from scikit-learn

(OPTIONAL) Add a threshold layer for the classifier using add_threshold

iqual_model.add_threshold() # Add a threshold layer for the classifier, recommended for imbalanced data

Compile the model with compile.

iqual_model.compile() # Compile the model

Fit the model to the data using fit. The fit method takes the following arguments:

  • X_train: The training data. (pandas dataframe)
  • y_train: The training labels. (pandas series)
iqual_model.fit(X_train,y_train) # Fit the model to the data

Predict the labels for new data using predict. The predict method takes the following arguments:

  • X_test: The test data. (pandas dataframe)
y_pred = iqual_model.predict(X_test) # Predict the labels for new data

For examples on cross-validation fitting, model selection & performance evaluation, bias, interpretability and measurement tests, refer to the notebooks folder.


Notebooks

The notebooks folder contains detailed examples on using iQual. The notebooks are organized into the following categories:

  • Basic Modelling These notebooks demonstrates the basic usage of the package, the pipeline construction, and the vectorization and classification options.

  • Advanced Modelling These notebooks demonstrate advanced pipeline construction, mixing and matching of feature extraction and classification methods, and model selection.

  • Interpretability These notebooks demonstrate the interpretability and related tests for measurement and comparison of interpretability across human and enhanced (machine + human) codes.

  • Bias and Efficiency These notebooks demonstrate the bias and efficiency tests for determining the value and validity of enhanced codes.


Citation & Authors

If you use this package, please cite the following paper:

A Method to Scale-Up Interpretative Qualitative Analysis, with an Application to Aspirations in Cox’s Bazaar, Bangladesh

Ashwin,Julian; Rao,Vijayendra; Biradavolu,Monica Rao; Chhabra,Aditya; Haque,Arshia; Khan,Afsana Iffat; Krishnan,Nandini.
A Method to Scale-Up Interpretative Qualitative Analysis, with an Application to Aspirations in Cox’s Bazaar, Bangladesh (English). (Policy Research Working Paper No. WPS 10046)
Paper is funded by the Knowledge for Change Program (KCP) Washington, D.C. : World Bank Group.
http://documents.worldbank.org/curated/en/099759305162210822/IDU0a357362e00b6004c580966006b1c2f2e3996

Maintainers

Please contact the following people for any queries regarding the package: