`projwfc.x`: parse from XML instead of parent calc #747

mbercx · 2021-10-03T12:47:05Z

Fixes #299

The current ProjwfcParser uses several input and output nodes from the
parent calculation. This increased the complexity of the tests for this
parser, and made running open_grid.x in between the pw.x and
projwfc.x run impossible without adding these input and output nodes to
the calculation job of open_grid.x.

Here we switch to parsing the XML instead of relying on the parent
calculation. The data-file-schema.xml of the parent calculation is
retrieved and parsed, providing the required information for the
subsequent parsing of the projwfc.x output. All the parsing tests are
updated to include the XML output file and remove the in/output links
for the parent calculation.

The convert_qe_to_kpoints function is added to convert the k-points
data in the XML to a KpointsData node.
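
To illustrate the idea of reading the k-points from the XML rather than from the parent calculation, here is a minimal sketch using only the standard library. It is not the code of `convert_qe_to_kpoints`: the function name `read_kpoints`, the trimmed sample XML and the assumed element layout (`output/band_structure/ks_energies/k_point` with a `weight` attribute) are assumptions for illustration, and the conversion to a `KpointsData` node, including any unit handling, is left out.

```python
import xml.etree.ElementTree as ET

# Hand-written, heavily trimmed stand-in for a QE XML output file.
# A real data-file-schema.xml contains far more content than this.
SAMPLE_XML = """
<espresso>
  <output>
    <band_structure>
      <ks_energies>
        <k_point weight="0.25">0.0 0.0 0.0</k_point>
      </ks_energies>
      <ks_energies>
        <k_point weight="0.75">0.5 0.0 0.0</k_point>
      </ks_energies>
    </band_structure>
  </output>
</espresso>
"""


def read_kpoints(xml_text):
    """Return (kpoints, weights) read from the assumed XML layout.

    Hypothetical helper: only the k-point coordinates and weights are
    extracted, no unit conversion is applied.
    """
    root = ET.fromstring(xml_text)
    kpoints, weights = [], []
    # Restrict the search to the band structure block of the output.
    for element in root.findall('./output/band_structure/ks_energies/k_point'):
        kpoints.append([float(value) for value in element.text.split()])
        weights.append(float(element.attrib['weight']))
    return kpoints, weights


kpoints, weights = read_kpoints(SAMPLE_XML)
print(kpoints, weights)
```

In the plugin, lists like these would then be used to populate the `KpointsData` node that the parser previously took from the parent calculation.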

mbercx · 2021-10-03T12:54:19Z

This PR is still blocked since it builds on #741 (I used the PdosWorkChain to generate the parser test files ^^).

Here I simply switch to parsing the XML, but leave the 🍝 -code of the parser. I'll open a second PR soon with the new (and hopefully more readable) code. I split these up for a couple of reasons:

- Make this PR easier to review since it's time-critical.
- Update the tests to use the XML with the old parsing code. They should then pass unchanged once I switch to my refactored code, demonstrating that the parsing is unchanged by my refactoring.
- Although I'm sure the parsing is the same for QE v6.6 (the version I ran my tests with), I'm also quite confident that e.g. QE v6.3 will no longer parse properly, since I've removed some extra logic required to support it. Since starting from aiida-quantumespresso==4.0.0 we'll only support QE v6.5+, the PR with the refactored code can only be merged when we're getting ready to release 4.0.

mbercx · 2021-10-03T14:03:27Z

@sphuber one thing I was considering is that we duplicate a lot of files in the repository by retrieving the XML file. I suppose it would be better to instead add the XML file to the retrieve_temporary_list.

Additionally, I was wondering if we should move the PDOS files (e.g. aiida.pdos_atm#1(Fe)_wfc#6(p_j1.5)) to the temporary retrieve list as well. Adding all these to the repository adds quite a few files, since you get one of these for each orbital of each atom. Even for the simple two-atom case of Fe, this results in 16 extra files in the repository when running spin-orbit calculations. I don't see the use case for keeping these in the repository save for testing, since all of this data is stored in the ProjectionData output under the link projections.

mbercx · 2021-10-03T17:05:41Z

In the end I decided to only move the XML file to the temporary retrieve list here, since technically no longer retrieving the PDOS files is backwards incompatible. I've added this to #749 instead.

Note: Please don't pay too much attention to coding style for this PR. I've reworked the code a lot in #749, so leave coding style comments for that one. I too was very annoyed by the use of out_info_dict to just pass everything between the various methods.

mbercx · 2021-10-05T12:45:47Z

@qiaojunfeng I've adapted the branch so it no longer builds on top of #741, so it's ready for review. I guess the main question is if this allows you to simplify the definition of the opengrid.x calculation job in #714.

qiaojunfeng

Thanks Marnik! Only a few minor issues.

With (aiidateam#747) the `ProjwfcParser` has no implicit requirements on parent calculation. The code for compatibility with `ProjwfcCalculation` is now removed.

mbercx · 2021-10-07T12:06:50Z

@qiaojunfeng I've fixed the failure to return the XML exit codes and added some tests. I've also refactored the tests a bit to use pytest.mark.parametrize and have added a fixture to avoid too much repetition.

One final question I still had here: Since the projwfc needs to have a pw.x parent calculation (or rather, ancestor calculation in case open_grid.x is called in between), I suppose it's always guaranteed that there is an XML? Or is there a use case I'm missing where the XML file won't be present?

qiaojunfeng · 2021-10-07T13:14:51Z

Thanks @mbercx, all looks good to me!

> One final question I still had here: Since the projwfc needs to have a pw.x parent calculation (or rather, ancestor calculation in case open_grid.x is called in between), I suppose it's always guaranteed that there is an XML? Or is there a use case I'm missing where the XML file won't be present?

Yes, I think so: these post-processing codes need to read the outdir of QE, so if the XML is not there, the code won't run. We can therefore safely assume the XML is always there. 🚀

mbercx · 2021-10-07T13:23:59Z

Thanks @qiaojunfeng! I'll just wait for a final sign-off from @sphuber, since he still raised some questions above.

sphuber

Thanks @mbercx, if you just add a comment to the code explaining why the XML is retrieved as temporary, then this is good to go for me :+1:

mbercx · 2021-10-08T13:51:24Z

@sphuber Locked 'n loaded, jefe!

sphuber

¡Ándale ándale!

mbercx · 2021-10-08T14:00:00Z

The current `ProjwfcParser` uses several input and output nodes from the parent calculation. This increased the complexity of the tests for this parser, and made running `open_grid.x` in between the `pw.x` and `projwfc.x` run impossible without adding these input and output nodes to the calculation job of `open_grid.x`. Here we switch to parsing the XML instead of relying on the parent calculation. The `data-file-schema.xml` of the parent calculation is retrieved and parsed, providing the required information for the subsequent parsing of the `projwfc.x` output. All the parsing tests are updated to include the XML output file and remove the in/output links for the parent calculation. Note that the XML file is added to the temporary retrieve list: although it is required for parsing, it is already in the repository of an ancestor calculation. The `convert_qe_to_kpoints` function is added to convert the k-points data in the XML to a `KpointsData` node.

In aiidateam#747 we adjusted the parser for `projwfc.x` to rely on the XML file instead of the provenance for obtaining the structure. This worked fine for structures that have chemical symbols as kind names, but once the kind labels contain numbers the `convert_qe2aiida_structure` failed to generate the `StructureData` based on the XML content since this case is not considered. Here we rename this method to `convert_qe_to_aiida_structure` and make it more flexible by also adding the case of numbered kind names. This is simply done by removing any digits from the kind names in the XML to obtain the corresponding `symbol` of the kind name.
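
The digit-stripping step described above can be sketched in a few lines. This is only an illustration of the idea, not the code of `convert_qe_to_aiida_structure`; the helper name `kind_name_to_symbol` is hypothetical.

```python
import re


def kind_name_to_symbol(kind_name):
    """Return the kind name with all digits removed.

    Hypothetical helper: assumes that what remains after removing the
    digits is the chemical symbol, e.g. 'Fe1' -> 'Fe'.
    """
    return re.sub(r'\d+', '', kind_name)


for name in ('Fe', 'Fe1', 'Fe2', 'O12'):
    print(name, '->', kind_name_to_symbol(name))
```

Kind names that already are plain chemical symbols pass through unchanged, so the original case keeps working.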

mbercx added the pr/blocked label (PR is blocked by another PR that should be merged first) Oct 3, 2021

mbercx requested review from sphuber and qiaojunfeng October 3, 2021 12:47

mbercx mentioned this pull request Oct 3, 2021

ProjwfcParser: refactor code and temp retrieve PDOS files #749

Open

1 task

mbercx force-pushed the fix/299/projwfc-parse branch from 0d5870f to 20aa631 Compare October 3, 2021 17:05

mbercx mentioned this pull request Oct 4, 2021

PdosWorkChain: Fix initialization and protocol usage of ElectronicType #741

Merged

mbercx force-pushed the fix/299/projwfc-parse branch from 20aa631 to c28d19c Compare October 5, 2021 12:40

mbercx removed the pr/blocked label (PR is blocked by another PR that should be merged first) Oct 5, 2021

qiaojunfeng requested changes Oct 5, 2021

aiida_quantumespresso/calculations/projwfc.py (outdated)

aiida_quantumespresso/parsers/projwfc.py (outdated)

sphuber requested changes Oct 5, 2021

aiida_quantumespresso/calculations/projwfc.py (outdated)

aiida_quantumespresso/calculations/projwfc.py (outdated)

aiida_quantumespresso/parsers/projwfc.py

qiaojunfeng mentioned this pull request Oct 5, 2021

Add OpenGridCalculation for open_grid.x code #714

Merged

mbercx force-pushed the fix/299/projwfc-parse branch from f2ccbc6 to c50af41 Compare October 6, 2021 00:09

mbercx requested review from qiaojunfeng and sphuber October 6, 2021 00:34

mbercx force-pushed the fix/299/projwfc-parse branch from 82a3794 to f9d401e Compare October 7, 2021 12:03

qiaojunfeng previously approved these changes Oct 7, 2021

sphuber previously approved these changes Oct 8, 2021

mbercx dismissed stale reviews from sphuber and qiaojunfeng via 0478536 October 8, 2021 13:50

mbercx added 3 commits October 8, 2021 15:50

Move XML file to temporary retrieve list

ff69301

Apply reviewer suggestions

a1c8cd4

mbercx added 2 commits October 8, 2021 15:50

Fix XML exit code return + add tests

2e1b081

add comment on temporary XML retrieval

51c4c04

mbercx force-pushed the fix/299/projwfc-parse branch from 0478536 to 51c4c04 Compare October 8, 2021 13:50

mbercx requested a review from sphuber October 8, 2021 13:50

sphuber approved these changes Oct 8, 2021

mbercx merged commit 0874d95 into aiidateam:develop Oct 8, 2021

mbercx deleted the fix/299/projwfc-parse branch October 8, 2021 13:59

mbercx mentioned this pull request Dec 9, 2021

projwfc parser fails for atom labels with numbers #761

Closed

mbercx mentioned this pull request Jan 18, 2022

Fix XML parsing for structures with numbered kinds #770

Merged
