This repository covers the complete lifecycle of developing large language models (LLMs), including all stages from model building and pre-training to fine-tuning and deployment.

The repository targets a Python 3.12 environment. If you already have a Python installation on your machine, the quickest way to get started is to install the package requirements from the requirements.txt file by executing the following pip command from the root directory of this code repository:
```bash
pip install -r requirements.txt
```
Tip

- Certain versions of PyTorch exhibit issues with Apple's MPS acceleration device (such as `torch==2.3.1`), resulting in loss-convergence anomalies during training. These issues were resolved in version 2.4.0.
- I am using a computer running macOS (Mac mini M2, 16 GB), but this workflow is similar for Linux machines and may work on other operating systems as well.
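A quick way to confirm your setup (a minimal sketch, not part of the repository's code) is to check the installed PyTorch version and the device that will be used:

```python
import torch

# Print the installed PyTorch version; versions around 2.3.1 showed the MPS
# convergence issue mentioned above, which was fixed in 2.4.0.
print(f"PyTorch version: {torch.__version__}")

# Pick the best available device: Apple MPS, CUDA, or CPU.
if torch.backends.mps.is_available():
    device = torch.device("mps")
elif torch.cuda.is_available():
    device = torch.device("cuda")
else:
    device = torch.device("cpu")

print(f"Using device: {device}")
```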
- `architecture.py`: Describes the architecture of a GPT model implemented in PyTorch.
- `load_weigths.py`: Defines functions that load pre-trained weights from a Hugging Face GPT-2 model into the custom GPT architecture defined in `architecture.py`.
- `trainer.py`: A class that encapsulates the training, evaluation, and testing procedures for a PyTorch model. It supports learning-rate warmup, cosine decay, gradient clipping, and periodic evaluation, and can handle both classification and regression tasks (see the sketch after this list).
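The following is a minimal, self-contained sketch of the learning-rate warmup, cosine decay, and gradient-clipping ideas that `trainer.py` supports; it illustrates the techniques only and is not the repository's actual `Trainer` implementation (hyperparameters and the dummy model are placeholders):

```python
import math
import torch

# Stand-in model and optimizer; the real Trainer wraps the GPT model instead.
model = torch.nn.Linear(10, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4)

total_steps, warmup_steps, peak_lr, min_lr = 100, 10, 5e-4, 1e-5

for step in range(total_steps):
    # Linear warmup followed by cosine decay of the learning rate.
    if step < warmup_steps:
        lr = peak_lr * (step + 1) / warmup_steps
    else:
        progress = (step - warmup_steps) / (total_steps - warmup_steps)
        lr = min_lr + (peak_lr - min_lr) * 0.5 * (1 + math.cos(math.pi * progress))
    for group in optimizer.param_groups:
        group["lr"] = lr

    optimizer.zero_grad()
    loss = model(torch.randn(4, 10)).pow(2).mean()  # dummy objective
    loss.backward()
    # Gradient clipping keeps update magnitudes bounded.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()

    if step % 20 == 0:  # periodic evaluation/logging hook
        print(f"step {step:3d}  lr {lr:.2e}  loss {loss.item():.4f}")
```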
The `pretraining.ipynb` notebook provides a comprehensive description of the pre-training process and its specifics.
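As background, pre-training optimizes a next-token prediction objective: each position predicts the following token and is scored with cross-entropy. A toy illustration (random logits standing in for the model's output, not code from the notebook) looks like this:

```python
import torch
import torch.nn.functional as F

vocab_size, seq_len, batch_size = 50257, 8, 2

# In the notebook the logits come from the GPT model; here they are random.
logits = torch.randn(batch_size, seq_len, vocab_size)
# Targets are the input token ids shifted by one position.
targets = torch.randint(0, vocab_size, (batch_size, seq_len))

loss = F.cross_entropy(logits.flatten(0, 1), targets.flatten())
print(f"cross-entropy loss: {loss.item():.3f}")  # ~ln(50257) ≈ 10.8 for random logits
```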
practice-A and practice-B involve fine-tuning pre-trained models such as GPT-2 for various downstream tasks. The focus is on understanding the specifics of fine-tuning for both classification and autoregressive tasks, and the hardware requirements are low.
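One common pattern for classification fine-tuning is to freeze the pre-trained backbone and train only a small head on top of it. The sketch below illustrates this idea using the Hugging Face `transformers` package (an assumption for illustration; the actual notebooks may load GPT-2 weights differently):

```python
import torch
from transformers import GPT2Model  # assumes the transformers package is installed

num_classes = 2  # e.g., a binary text-classification task

# Load the pre-trained GPT-2 backbone from Hugging Face.
backbone = GPT2Model.from_pretrained("gpt2")

# Freeze the pre-trained weights so only the new head is trained.
for param in backbone.parameters():
    param.requires_grad = False

# Attach a small classification head on top of the final hidden state.
head = torch.nn.Linear(backbone.config.hidden_size, num_classes)

def classify(input_ids):
    hidden = backbone(input_ids).last_hidden_state  # (batch, seq, hidden)
    return head(hidden[:, -1, :])                   # logits from the last token

logits = classify(torch.tensor([[15496, 995]]))     # toy token ids
print(logits.shape)  # torch.Size([1, 2])
```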
- Sebastian Raschka. Build A Large Language Model (From Scratch). Manning, 2024. ISBN: 978-1633437166. Book link. GitHub Repository