Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mismatch in tensor size #102

Closed
thangckt opened this issue Oct 17, 2024 · 12 comments
Closed

mismatch in tensor size #102

thangckt opened this issue Oct 17, 2024 · 12 comments

Comments

@thangckt
Copy link
Contributor

Dear Deverlopers,

I get the below error when set is_train_stress: True

path/python3.11/site-packages/torch/nn/modules/loss.py:535: UserWarning: Using a target size (torch.Size([2, 6])) that is different to the input size (torch.Size([12])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
  return F.mse_loss(input, target, reduction=self.reduction)
Traceback (most recent call last):
  File "/home1/p001cao/app/miniconda3/envs/py11mace/bin/sevenn", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "path/python3.11/site-packages/sevenn/main/sevenn.py", line 105, in main
    train(global_config, working_dir)
  File "path/python3.11/site-packages/sevenn/scripts/train.py", line 85, in train
    processing_epoch(
  File "path/python3.11/site-packages/sevenn/scripts/processing_epoch.py", line 50, in processing_epoch
    trainer.run_one_epoch(
  File "path/python3.11/site-packages/sevenn/train/trainer.py", line 65, in run_one_epoch
    error_recorder.update(output)
  File "path/python3.11/site-packages/sevenn/error_recorder.py", line 271, in update
    self._update(output)
  File "path/python3.11/site-packages/sevenn/error_recorder.py", line 266, in _update
    metric.update(output)
  File "path/python3.11/site-packages/sevenn/error_recorder.py", line 150, in update
    se = self._square_error(y_ref, y_pred, self.vdim)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "path/python3.11/site-packages/sevenn/error_recorder.py", line 146, in _square_error
    return self._se(y_ref, y_pred).view(-1, vdim).sum(dim=1)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "path/python3.11/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "path/python3.11/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "path/python3.11/site-packages/torch/nn/modules/loss.py", line 535, in forward
    return F.mse_loss(input, target, reduction=self.reduction)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "path/python3.11/site-packages/torch/nn/functional.py", line 3365, in mse_loss
    expanded_input, expanded_target = torch.broadcast_tensors(input, target)
                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "path/python3.11/site-packages/torch/functional.py", line 76, in broadcast_tensors
    return _VF.broadcast_tensors(tensors)  # type: ignore[attr-defined]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: The size of tensor a (12) must match the size of tensor b (6) at non-singleton dimension 1

This error disapears when set is_train_stress: False

Can you have a little help?
Thanks.

@YutackPark
Copy link
Member

It is likely to be the consequence of sevenn.train.dataload::atoms_to_graph. The routine tries to load stress from the atoms object but does not ensure whether the shape and types of loaded stress are correct.

Could you share the version and minimal data to reproduce the error?

@thangckt
Copy link
Contributor Author

hi @YutackPark
I attach few frames of extxyz data.
Please use this

data_format_args:                            
    energy_key: 'ref_energy'                 
    force_key: 'ref_forces'                   
    stress_key: 'ref_stress' 

data.txt

@YutackPark
Copy link
Member

atoms.info['y_stress'] = atoms.info[stress_key]

When 'stress' data is loaded via custom key, in your case ref_stress, current code does not check whether it has (1, 6) shapes. 7net has to ensure the stress to have (1, 6) shape. This is a bug and will be patched.

For now, you may preprocess your ref_stress to have (1, 6) shape to avoid the problem. Sorry for the inconvenience. To prevent this kind of bugs, I'm writing pytest codes to have best practice... but not merged yet.

@thangckt
Copy link
Contributor Author

thangckt commented Oct 18, 2024

hi @YutackPark
Thank you for your explain.

you may preprocess your ref_stress to have (1, 6) shape to avoid the problem

ref_stress already has shape (1,6) what preprocess did you mean?

@YutackPark
Copy link
Member

@thangckt, I copy pasted the data you gave to me and read it using ase.io.read and it gives (6,) shaped array. Maybe ASE automatically converts it to have plain (6,) shape before writing a file. Seems only available bypass is storing the results using SinglePointCalculator as I mentioned in previous issue #61.

Plus, I looked more closely and seems the feature is broken, due to the difference between stress notation inside the SevenNet -1 * (xx, yy, zz, xy, yz, zx) and ASE (Voigt: xx, yy, zz, yz, zx, xy). I recommend to not use it before the patch, unless you're very confident of it.

@thangckt
Copy link
Contributor Author

thangckt commented Oct 18, 2024

hi @YutackPark
Thank you for your guide.

About the stress component order, it will be serious misleading. Can I know why you don't follow the well-known Voigt notation?
image

@YutackPark
Copy link
Member

hi @thangckt
It is another side-effect of following our groups's previous MLIP package SIMPLE-NN, or VASP itself.
VASP uses xx, yy, zz, xy, yz, zx notation in its OUTCAR file, and so does SIMPLE-NN.
I'm trying my best to hide this cumbersomeness to users, but it failed in this case. I may refactor this to follow Voigt notation ALWAYS after stabilizing the code.

@thangckt
Copy link
Contributor Author

thangckt commented Oct 18, 2024

hi @YutackPark
Thank you so much for your information.
I will follow your updates.

About stress notation, I think It should better follow Voigt notation. Any output from a specific software should be converted to this convention. Otherwise, it will be serious misleading for users.

@YutackPark
Copy link
Member

It was supposed to follow voigt notation for this recently introduced EFS key feature, and what happened here is simply my fault, and I agree with you.
Anyway, thanks for the bug report. I'll notify you with closing the issue after the fix. The mixed notation inside the code is really confusing (even for me).

@YutackPark
Copy link
Member

Hi @thangckt , could you check current main branch?
After update, try this

$ sevenn_inference 7net-0 your_data.extxyz --kwargs energy_key='ref_energy' force_key='ref_forces' stress_key='ref_stress'

From inference_results/per_graph.csv, you can see stress_yz is correctly assigned following assuming 'ref_stress' follows Voigt notation (it is in kB unit & '-1' multiplied, compared to raw values in the extxyz file)

Also, I confirmed training works without a problem. Lots of things have changed for this version (0.10.0), please check the change log: https://github.com/MDIL-SNU/SevenNet/blob/main/CHANGELOG.md.

By the way, if you don't like the changes I mad for this version, feel free to raise issue or discussion of it.

@thangckt
Copy link
Contributor Author

hi @YutackPark,

Thank you so much for the update and information.

About this

assuming 'ref_stress' follows Voigt notation (it is in kB unit & '-1' multiplied, compared to raw values in the extxyz file)

The data in my extxyz with unit: energy [eV], forces [eV/Angstrom], stress [eV/Angstrom^3]
Do you make any alternative value in your code?

Thanks.

@YutackPark
Copy link
Member

YutackPark commented Oct 21, 2024

@thangckt ,
Sorry for the confusion, what I actually mean was, the results written in the per_graph.csv have that unit (kB). You don't need to change anything. SevenNet always expects EFS obtained from ASE atoms has eV, eV/Ang., eV/Ang^3 units regardless of how it is parsed (specific keys or from calculator)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants