Releases: geekylink/PicoGPT
A very basic set of scripts to train a GPT model on input.txt text and generate text from a prompt using the trained model.
These scripts grew out of an attempt to train and run medium-sized GPTs on older hardware. Most existing tools require very recent and expensive GPUs. I wanted to create something that runs on lower-end hardware and enables a larger audience to experiment with training LLMs.
Train
The first thing you need to do is train your model, which takes two steps: first tokenize the input data, then train for however many epochs you need.
Prepare - Tokenize your data
Run train.py with --prepare and --input input.txt to tokenize the data and prepare for training epochs. This creates a directory: out/output.model/
python train.py --prepare --input input.txt [out/output.model]
Optional note: you can pass --model to use any other model provided by https://huggingface.co/models, e.g. --model gpt2-xl
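For reference, here is a minimal sketch of what the --prepare step presumably does: tokenize input.txt with a Hugging Face tokenizer and save the token IDs for the training epochs. The tokenizer choice and the tokens.pt file name are assumptions for illustration, not PicoGPT's actual internals.

```python
# Sketch of the prepare step (assumed behaviour, not the actual train.py code).
import os
import torch
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")      # swap for e.g. "gpt2-xl" via --model

with open("input.txt", "r", encoding="utf-8") as f:
    text = f.read()

ids = tokenizer(text, return_tensors="pt").input_ids   # shape: (1, n_tokens)

os.makedirs("out/output.model", exist_ok=True)
torch.save(ids, "out/output.model/tokens.pt")          # hypothetical file name
```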
Train - Run, run, run...
Train for X epochs using the input model and save to the output model, then train again for more epochs until the output is coherent. The input and output paths (out/output.model) should point to the same model to resume and continue training.
Note: change --batch-size for smaller/larger GPUs; the default is 4.
python train.py --model [out/output.model] --epochs X [out/output.model]
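To make the epoch loop concrete, here is a minimal sketch of one training epoch over the prepared tokens, assuming the tokens.pt file from the prepare sketch above; the block size, optimizer settings, and file names are illustrative assumptions, not train.py's actual implementation.

```python
# Sketch of one training epoch (assumed approach, not the actual train.py code).
import torch
from transformers import AutoModelForCausalLM

device = "cuda" if torch.cuda.is_available() else "cpu"
model = AutoModelForCausalLM.from_pretrained("gpt2").to(device)  # or load out/output.model to resume
ids = torch.load("out/output.model/tokens.pt").squeeze(0)        # (n_tokens,)

block_size, batch_size = 512, 4                                  # --batch-size default is 4
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

model.train()
for start in range(0, ids.size(0) - block_size, block_size * batch_size):
    # Build a batch of contiguous blocks; labels == inputs gives the causal LM loss.
    chunks = [ids[i:i + block_size]
              for i in range(start, start + block_size * batch_size, block_size)
              if i + block_size <= ids.size(0)]
    batch = torch.stack(chunks).to(device)
    loss = model(input_ids=batch, labels=batch).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

model.save_pretrained("out/output.model")                        # resume from here next run
```

Smaller --batch-size values trade training speed for lower GPU memory use, which is what makes this workable on older cards.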
Generate text with the model
python run.py [out/output.model] <prompt_text>
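run.py presumably loads the saved model and samples a continuation of the prompt. Here is a minimal sketch of that idea using the transformers generate API; the sampling settings and the use of the gpt2 tokenizer are assumptions, not the actual script.

```python
# Sketch of prompt-based generation (assumed behaviour, not the actual run.py code).
import sys
from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir, prompt = sys.argv[1], " ".join(sys.argv[2:])   # e.g. out/output.model "Once upon a time"
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained(model_dir)

inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs,
                        max_new_tokens=100,
                        do_sample=True,
                        top_k=50,
                        pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```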