This repository contains the implementation of ProTST, along with scripts intended to clarify how it works.
The codebase is organized into five main components: Environment setup, Tokenizer preparation, Data preparation, Masked Language Modeling, and ProTST fine-tuning. The core ProTST implementation is in the src/finetune_linear_evaluation directory, which contains the teacher-student transfer learning logic. To customize ProTST's structure, modify the files in this directory.