Finetuning example
Warning
Finetuning is currently only available for the PET architecture.
This is a simple example of fine-tuning PET-MAD (or a general PET model) that
can be used as a template for fine-tuning with metatrain.
Fine-tuning a pretrained model allows you to obtain a model better suited for
your specific system. You need to provide a dataset of structures that have
been evaluated at a higher reference level of theory, usually DFT. Fine-tuning
a universal model such as PET-MAD allows for reasonable model performance even if little training
data is available.
Fine-tuning requires passing a pre-trained model checkpoint to the mtt train command and
defining the targets for the new level of theory in the options.yaml file.
To obtain a pretrained model, you can use a PET-MAD checkpoint from Hugging Face:
wget https://huggingface.co/lab-cosmo/pet-mad/resolve/v1.1.0/models/pet-mad-v1.1.0.ckpt
Next, we set up the options.yaml file. The fine-tuning method is specified in
the finetune block of the architecture's training options.
Here, the basic full option is chosen, which fine-tunes all weights of the model.
All available fine-tuning methods are described in the advanced concepts section
Fine-tuning, which discusses implementation details,
options and recommended use cases. Other fine-tuning methods can simply be substituted
in this example by changing the finetune block.
Furthermore, you need to specify the checkpoint that you want to fine-tune in
the read_from option.
A simple options.yaml file for this task could look like this:
architecture:
  name: pet
  training:
    num_epochs: 1000
    learning_rate: 1e-5
    finetune:
      method: full
      read_from: path/to/checkpoint.ckpt

training_set:
  systems:
    read_from: dataset.xyz
    reader: ase
    length_unit: angstrom
  targets:
    energy:
      quantity: energy
      read_from: dataset.xyz
      reader: ase
      key: energy
      unit: eV
      forces:
        read_from: dataset.xyz
        reader: ase
        key: forces
      stress:
        read_from: dataset.xyz
        reader: ase
        key: stress

test_set: 0.1
validation_set: 0.1
In this example, we specified generic but reasonable num_epochs and learning_rate
parameters. The learning_rate is chosen to be relatively low to stabilise
training.
Warning
Note that in targets we reuse the PET-MAD energy head. This means that no new head is created
for the new reference energies provided in your dataset, which can lead to poor performance if the reference
energies differ from the ones used in pretraining (a different level of theory, or a different electronic-structure
code). In that case it is recommended to create a new energy target for the new level of theory, as sketched below.
Find more about this in Transfer-Learning
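If you want a separate head to be trained for the new reference energies, you can define the new
level of theory as its own target. The following is only a sketch: the target name mtt::new_energy is a
hypothetical example (custom targets in metatrain typically use the mtt:: prefix), and the exact
conventions are described in Transfer-Learning.

targets:
  mtt::new_energy:   # hypothetical name for the new level of theory
    quantity: energy
    read_from: dataset.xyz
    reader: ase
    key: energy
    unit: eV
    forces:
      read_from: dataset.xyz
      reader: ase
      key: forces
    stress:
      read_from: dataset.xyz
      reader: ase
      key: stress

Defined this way, a new head is created for the new reference energies instead of reusing the PET-MAD one.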
We assume that the dataset dataset.xyz stores the energies in the energy key of the
info dictionary of each structure. Additionally, forces and stresses should be provided under corresponding keys,
which you can specify in the options.yaml file under targets; a short sketch of how such a file can be
prepared follows the note below.
Further information on specifying targets can be found in Customize a Dataset Configuration.
Note
It is important that the length_unit is set to angstrom and the energy unit to eV in order
to match the units PET-MAD was trained on. If your dataset uses different energy units, you need to
convert the energies to eV before fine-tuning.
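To illustrate the two points above, here is a minimal sketch of how such a dataset.xyz file could be
prepared with ase. The input file raw_dataset.xyz, the dft_* keys, and the assumption that the reference
data comes in Hartree-based units are all hypothetical; adapt them to whatever your reference code produces.

import ase.io
from ase.calculators.singlepoint import SinglePointCalculator
from ase.units import Hartree  # 1 Hartree expressed in eV

# Hypothetical raw data: energies, forces and stresses stored in
# Hartree-based units under custom keys of each frame.
frames = ase.io.read("raw_dataset.xyz", ":")

for atoms in frames:
    energy = atoms.info["dft_energy"] * Hartree    # Hartree -> eV
    forces = atoms.arrays["dft_forces"] * Hartree  # Hartree/A -> eV/A
    stress = atoms.info["dft_stress"] * Hartree    # Hartree/A^3 -> eV/A^3
    # Attaching the results as a single-point calculator makes ase.io.write
    # store them under the standard "energy", "forces" and "stress" keys
    # used in the options.yaml above.
    atoms.calc = SinglePointCalculator(
        atoms, energy=energy, forces=forces, stress=stress
    )

ase.io.write("dataset.xyz", frames)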
After setting up your options.yaml file, fine-tuning can then simply be run
via mtt train options.yaml.
Further fine-tuning examples can be found in the Atomistic Cookbook.