ChemLoRA

Leveraging Large Language Models (LLMs) for Accurate Molecular Energy Predictions

Requirements

PyTorch
GPTChem
transformers
PEFT
datasets
scikit-learn

Data

The QM9-G4MP2 dataset is publicly available through the Materials Data Facility (GitHub link).

Model Fine-Tuning

GPT-3 is fine-tuned on the QM9-G4MP2 dataset using the GPTChem framework. To run the provided Python script, execute the following command:

python gptchem_smiles.py
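To make the idea concrete, here is a minimal sketch of how a SMILES string and its energy could be cast as a prompt/completion pair for text-based fine-tuning. It is not taken from gptchem_smiles.py or from GPTChem: the function names, the prompt wording, the `###` and `@@@` separators, the rounding, and the example values are all assumptions made for illustration.

```python
# Hypothetical helpers: not part of GPTChem or of gptchem_smiles.py.

def format_example(smiles, energy, decimals=3):
    """Turn one (SMILES, energy) record into a prompt/completion pair.

    The prompt wording and the '###' / '@@@' markers are illustrative choices.
    """
    prompt = f"What is the G4MP2 energy of {smiles}?###"
    # Round the target so the model learns a fixed number of digits.
    completion = f" {energy:.{decimals}f}@@@"
    return {"prompt": prompt, "completion": completion}


def parse_completion(text):
    """Recover the numeric prediction from a generated completion string."""
    return float(text.split("@@@")[0].strip())
```

Rounding the target to a fixed number of decimals keeps the completions short and uniform; the number of digits to keep is a trade-off that this sketch leaves open.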

The runpeft.py script can be used to fine-tune any foundation LLM available on Hugging Face. Pass the model name as the first argument; for example, to fine-tune the gpt2 model, run the following command:

python runpeft.py "gpt2"
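As a rough illustration of what a script with this command line might do, the sketch below parses a single positional model name and wraps the loaded model with LoRA adapters using PEFT. It is not the contents of runpeft.py: the function names and the LoRA hyperparameters (`r`, `lora_alpha`, `lora_dropout`) are assumptions chosen for the example.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line: one positional Hugging Face model name."""
    parser = argparse.ArgumentParser(description="LoRA fine-tuning sketch")
    parser.add_argument("model_name", help='Hugging Face model id, e.g. "gpt2"')
    return parser.parse_args(argv)


def build_lora_model(model_name):
    """Load a causal LM and attach LoRA adapters (hypothetical settings)."""
    # Imported here so that argument parsing works without these packages.
    from peft import LoraConfig, TaskType, get_peft_model
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    # Assumed hyperparameters; some architectures may also need an explicit
    # target_modules list naming the layers that receive adapters.
    config = LoraConfig(
        task_type=TaskType.CAUSAL_LM,
        r=8,
        lora_alpha=16,
        lora_dropout=0.05,
    )
    model = get_peft_model(model, config)
    model.print_trainable_parameters()  # reports how few weights are trained
    return tokenizer, model
```

Calling `build_lora_model(parse_args().model_name)` would download the named model and return it with only the adapter weights trainable; the training loop itself is left out of this sketch.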

License

This software is released under the MIT License.
