ChemLoRA

Leveraging Large Language Models (LLMs) for Accurate Molecular Energy Predictions

Requirements

Data

The QM9-G4MP2 dataset is publicly available through the Materials Data Facility (GitHub link).

Model Fine-Tuning

GPT-3 is fine-tuned on the QM9-G4MP2 dataset using the GPTChem framework. To start this fine-tuning, run the provided Python script:

python gptchem_smiles.py
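For readers unfamiliar with text-based fine-tuning, the sketch below shows one way a (SMILES, energy) pair could be turned into a prompt/completion record. It is an illustration under stated assumptions, not the contents of gptchem_smiles.py: the template strings, the `###` and `@@@` delimiters, the property label, and the helper names `to_record`, `to_jsonl`, and `parse_completion` are all hypothetical, and the energies are placeholders rather than QM9-G4MP2 values.

```python
import json

# Assumed templates: "###" ends the prompt and "@@@" ends the completion.
# These delimiters are illustrative and may differ from what the script uses.
PROMPT_TEMPLATE = "What is the {prop} of {smiles}?###"
COMPLETION_TEMPLATE = " {value:.4f}@@@"


def to_record(smiles, energy, prop="G4MP2 energy"):
    """Build one prompt/completion training record for a molecule."""
    return {
        "prompt": PROMPT_TEMPLATE.format(prop=prop, smiles=smiles),
        "completion": COMPLETION_TEMPLATE.format(value=energy),
    }


def to_jsonl(pairs, prop="G4MP2 energy"):
    """Serialize (smiles, energy) pairs as JSON Lines, one record per line."""
    return "\n".join(json.dumps(to_record(s, e, prop)) for s, e in pairs)


def parse_completion(text):
    """Recover the numeric prediction from a generated completion string."""
    return float(text.split("@@@")[0].strip())


# Placeholder values only; these are NOT real G4MP2 energies.
example_pairs = [("C", -1.0), ("CCO", -2.5)]
print(to_jsonl(example_pairs))
```

Writing the number with a fixed count of decimals keeps completions uniform in length, and the stop delimiter makes it straightforward to cut the generated text before converting it back to a float.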

The runpeft.py script can be used to fine-tune any foundation LLM available on Hugging Face. Pass the model identifier as the first argument; for example, to fine-tune the gpt2 model, run the following command:

python runpeft.py "gpt2"
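The command above passes the model identifier as a positional argument. As a rough sketch of how such a script might wire that argument into a parameter-efficient fine-tuning setup, the example below uses the Hugging Face `transformers` and `peft` libraries with a LoRA adapter. This is an assumption based on the script name, not a copy of runpeft.py: the function names `parse_args` and `build_lora_model` are hypothetical, and the LoRA hyperparameters are illustrative defaults rather than the values used in this repository.

```python
import argparse


def parse_args(argv=None):
    """Read the Hugging Face model identifier from the command line."""
    parser = argparse.ArgumentParser(description="LoRA fine-tuning sketch")
    parser.add_argument("model_name", help='model identifier, e.g. "gpt2"')
    return parser.parse_args(argv)


def build_lora_model(model_name):
    """Load a causal LM and wrap it with a LoRA adapter.

    The heavy libraries are imported lazily so that argument parsing
    works even when they are not installed.
    """
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, TaskType, get_peft_model

    base_model = AutoModelForCausalLM.from_pretrained(model_name)

    # Illustrative hyperparameters (assumed, not taken from the repository).
    # target_modules is left unset so peft falls back to its built-in
    # default for architectures it recognizes, such as gpt2.
    lora_config = LoraConfig(
        task_type=TaskType.CAUSAL_LM,
        r=8,
        lora_alpha=16,
        lora_dropout=0.05,
    )
    return get_peft_model(base_model, lora_config)
```

Calling `build_lora_model(parse_args().model_name)` would download the base weights and return a model in which only the adapter parameters are trainable; `print_trainable_parameters()` on the result reports how small that fraction is.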

License

This software is released under the MIT License.