♻️ REFACTOR: move basic parsing into `BaseParser` class #756

mbercx · 2021-11-23T16:28:34Z

Fixes #750
Fixes #797

Large refactoring of the basic parsing features to deduplicate code and rely on regular expressions via re instead of iterations over lines. The main changes are:

The Parser class is renamed to BaseParser class to avoid confusion with the aiida.core Parser class.
The BaseParser.parse_stdout_from_retrieved method can be used by all subclasses to obtain the stdout, basic parsed data and logs from retrieving and parsing the stdout.
The error and warning messages are defined as class attributes. Base error messages are defined on the BaseParser class, which can be extended on the subclasses via the class_[error,warning]_map attributes.
A BaseParser.check_base_errors() method is added, which checks for basic errors that indicate the stdout cannot be parsed for more data.
Shared exit codes are moved to the NamelistsCalculation spec from that of the subclasses.
Several changes in the logic of returning exit codes and emitting the logs. Some of the discussions on these topics have not been fully concluded, and will be revisited when integrating the BaseParser into the PwParser, see:
Revisit BaseParser when integrating into PwParser #953

mbercx · 2022-01-27T20:35:19Z

Some timing information for a very long aiida.out file (~140k lines) without errors:

Old parse_output_base:

14.4 ms ± 297 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

New parse_output_base:

10.7 ms ± 125 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

and a very short (131 lines) file with errors:

Old parse_output_base:

11.4 µs ± 262 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

New parse_output_base:

23.2 µs ± 271 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

So for large files the new parser is actually faster, but for small files the parsing is slower. We're taking microseconds here though, so I don't think this is a massive issue? At least I think it's worth the simplified code.

mbercx · 2022-01-28T08:52:10Z

aiida_quantumespresso/parsers/projwfc.py

+
+        :param stdout: the stdout content as a string.
+        """
+        out_info_dict['out_file'] = stdout.split('\n')


I know using the out_info_dict is messy as hell, but that'll be fixed in #749. :)

mbercx · 2022-01-28T09:05:39Z

@sphuber currently the PwParser and CpParser don't reuse the base parsing methods yet, but these are rather large and messy parsers I'd like to clean up in separate PRs for v4.0.

sphuber

Thanks a lot @mbercx great work! This really cleans things up nicely, just have some minor requests/suggestions

aiida_quantumespresso/calculations/projwfc.py

aiida_quantumespresso/calculations/pw2gw.py

aiida_quantumespresso/parsers/base.py

sphuber · 2022-01-28T10:47:15Z

tests/parsers/test_dos/test_dos_default.yml

@@ -10,6 +10,6 @@ dos:
  - states/eV
  - states/eV
 parameters:
-  code_version: v.6.4.1
+  code_version: 6.4.1


Should we keep this the same to not break backwards-compatibility?

Hmm, perhaps. I preferred removing the v., because it's cleaner and more consistent with e.g. the plugin version added to the metadata. But it's not a big deal, so probably backwards compatibility is the stronger argument.

This had been reverted to also include the v.

Looking at this again, I'd be less shy about breaking backwards compatibility regarding this version number. It's probably not being used at all atm, and removing the v. is cleaner in my view.

tests/parsers/test_neb.py

aiida_quantumespresso/parsers/base.py

mbercx · 2023-05-06T11:46:01Z

@sphuber she's almost ready, just a bit more cleaning. I would also:

Remove the v. for the code versions.
Remote the wall_time. It's a useless string and there is wall_time_seconds.

mbercx · 2023-05-07T06:21:20Z

src/aiida_quantumespresso/parsers/matdyn.py


-        if filename_frequencies not in retrieved.base.repository.list_object_names():
-            return self.exit(self.exit_codes.ERROR_OUTPUT_STDOUT_READ)


This is removed on purpose:

I don't think this ever happens if the stdout is "complete"

It should use its own exit code, and I don't want to define one if [1] is true.

mbercx · 2023-05-07T07:00:56Z

src/aiida_quantumespresso/parsers/base.py


        :param exit_code: an `ExitCode`
        :return: the exit code
        """
+        if logs:
+            self.emit_logs(logs)


I was wondering what to do here. I liked the idea of always passing the logs to the exit method, so the logs are always emitted before the parsing stops. But this lead to a few questions:

What does the emit_logs method really do? It seems to only use getattr, how does it really "emit" anything?

What's the purpose of this logger exactly?

this reminds me, perhaps we should end each parse method with an explicit return of a zero exit code via the exit method? Explicit is better than implicit, and then it passes through this method which emits the logs (now).

What does the emit_logs method really do? It seems to only use getattr, how does it really "emit" anything?

Believe it or not, it emits log messages 😄 It is easy to miss but the line

getattr(self.logger, level)(stripped)

will retrieve a method of the Logger instance stored at self.logger. Since level is a log-level in lower-case, e.g., warning or error, getattr(self.logger, level) will return the bound method warning or error in that case. We then call it passing the message ...(stripped). So essentially it is doing something like self.logger.warning(stripped).

What's the purpose of this logger exactly?

It is configured with the associated CalcJobNode whose outputs are being parsed and so any log messages are stored as Log entries linked to the node. They will show up in the output of verdi process report for example. This is useful information for the user.

this reminds me, perhaps we should end each parse method with an explicit return of a zero exit code via the exit method? Explicit is better than implicit, and then it passes through this method which emits the logs (now).

Yes, but we should then also remember to first check whether an exit status was already set by the scheduler plugin. For example if the scheduler parser detected the job was killed due to an OOW error, the associated exit status (I believe 120) is already set on the node. So before we return ExitCode(0), we should check if self.node.exit_status is not None in which case we probably want to return None to keep that status, because returning ExitCode(0) will override it.

Great, thanks for the clarification, @sphuber! 🙏

So essentially it is doing something like self.logger.warning(stripped).

Right, of course, it's quite clear after this explanation. I guess for this kind of use case, a CalcJobNode.get_logger(level) method would be more readable than a logger property. ^^

Now that I understand better, I'd be inclined to merge the emit_logs and exit methods into one method with signature:

def exit(self, logs: AttributeDict = None, exit_code: ExitCode = None) -> ExitCode:

which:

Emits any logs left in the logs input, if provided.

If exit_code has been provided, emit its message via self.logger.

If not, check if there is already an exit_status set on the node and log this message (or will this have been done already?)

Return

exit_code if it was provided.

None if the node already has an exit_status.

ExitCode(0) else

Right, of course, it's quite clear after this explanation. I guess for this kind of use case, a CalcJobNode.get_logger(level) method would be more readable than a logger property. ^^

Don't think that would be a correct name, because you cannot return a logger for a particular level, you log a message at a particular level through a logger. That being said, I think you can easily make the current code more readable. I think the Logger class has a log method where you can pass the desired level as an argument, so that line would become self.logger.log(level, stripped). Pretty sure that should work.

Now that I understand better, I'd be inclined to merge the emit_logs and exit methods into one method with signature:

I would keep emit_logs as a separate function, because it can be useful to call it directly elsewhere still. Fine to move the call in all parsers to the exit method.

If not, check if there is already an exit_status set on the node and log this message (or will this have been done already?)

This will already have been done by the engine before calling the Parser. So nothing to do here but to "keep" it if the parser itself cannot provide a more specific exit code.

I've adapted the method as discussed, see 6529ea6.

I would keep emit_logs as a separate function, because it can be useful to call it directly elsewhere still. Fine to move the call in all parsers to the exit method.

If we always end the parse method with the exit method for clarity, and always emit all the logs there, what would be the use case of calling the emit_logs method separate?

mbercx · 2023-05-07T07:26:16Z

@sphuber ok, I refactored my initial refactor quite a bit, so perhaps this needs a full review again, apologies. It was also quite a bit of work to fix all the merge conflicts after rebase, so I would now really push to get this PR in.

A couple of quick notes that I decided upon:

The code version in the output_parameters shouldn't have a v. prefix. It's redundant.
The wall_time output is similarly redundant and less useful than wall_time_seconds.

Now a more elaborate design discussion on the changes here and TODO's for parsers.

Keeping the `parse` method readable

Understanding what the parser parses when and which exit codes it returns at what point is difficult with many of the parser classes. We want to make sure that:

The "best" exit code is returned, i.e. the one that is most specific or allows for useful error handling.
As much data as possible is parsed and added to the outputs of the calculation job.

To improve clarity and make these two requirements possible, I think the parse method should:

Not contain any actual parsing. Any real parsing should happen in a separate method.
Instantiate a single logging container that is passed through these parsing methods.
Take care of decisions on when to return which exit code.
Output data at the appropriate time.

Note that this PR does not fully implement these ideas for all parser classes. It's already big enough as it is. But it already shows how this would be possible for e.g. the open_grid.x parser.

Other improvements

Exit code registry: Although this PR already make an effort of making sure exit codes are consistent across the calculation job classes, it's still tricky to keep track of all the exit codes they are used any why they are there or have a certain number. I think we should create an (automated) exit code registry in the documentation where we keep track of all this. Although two calculation jobs can in principle have the same exit code for different issues, we should probably also avoid this.
Stop using strings for exit codes: Now we use strings to keep track of the exit codes. I think this is a bit fragile and doesn't allow for nice features like tab-completion. It would be better to have a different approach, but I'd have to think a bit on how to do this.
PhCalculation and PpCalculation aren't subclasses from NamelistCalculation, and hence have to redefine the basic exit codes. So do PwCalculation and CpCalculation. There may be reasons to treat these calculation jobs differently, but we should investigate if we can't further consolidate this calculation job hierarchy.

mbercx · 2023-05-10T07:18:41Z

Because of the size of the PR, and the fact that even though the tests pasts it can definitely break things, perhaps:

@PNOGillespie @superstar54 maybe you can have a look a the XSpectra parsing?
@qiaojunfeng can you have a look at the open_grid.x parsing changes?

qiaojunfeng · 2023-05-11T11:56:56Z

* @qiaojunfeng can you have a look at the `open_grid.x` parsing changes?

Had a quick look and seems good to me!
Anyhow open_grid is very simple so if the tests passed then I wouldn't worry.

src/aiida_quantumespresso/parsers/base.py

sphuber · 2023-05-14T12:08:15Z

src/aiida_quantumespresso/parsers/base.py

+        """Read and parse the ``stdout`` content of a Quantum ESPRESSO calculation.
+
+        :param logs: Logging container that will be updated during parsing.
+        :returns: size 3 tuple: (``stdout`` content, parsed data, updated logs).


Why return the logs if they are passed in by reference and so are updated in place? Typically not a huge fan of tuple return values as it is often not immediately clear what is being returned.

I do see your point. To me, I like making it clear that the logs input is updated by also passing it as an output. I guess it also related to the question I pose here, and makes me inclined to simply generate separate logging containers in each parser method, but that also feels than ideal...

Maybe we should keep track of the logs on a class variable? That seems sensible. Initially I also wasn't sure why we needed this extra logging item because we can just actually use the logger. Then I figured this is so we can keep track of errors and then pick and choose which one to return as an exit code. But can't we just inspect the logger as well? Feels like we doing double work.

Re returning tuples: I can see your point, but what is the alternative for returning multiple outputs? A dictionary is even more opaque to me. Also: with the hover features in VSCode it's quite clear what is being returned: 🙃

You might not believe it, but I am actually VSCode myself now. But I think that in general making code design decisions based on a feature of the particular editor that we ourselves just happen to be using, is a bad idea, because not everyone is using the same tools. But fine, let's keep this as is.

src/aiida_quantumespresso/parsers/base.py

sphuber · 2023-05-14T12:24:51Z

src/aiida_quantumespresso/parsers/base.py

+            'ERROR_OUTPUT_STDOUT_PARSE'
+        ]:
+            if exit_code in logs.error:
+                exception = logs.error[logs.index(exit_code) + 1]


Hmm, interesting approach. I can see how this is the most straightforward as you don't have to adapt the structure of the container and can just keep a list, but it is quite fragile I think. Ideally, the container should become a dictionary, with the key the error code, and the value can be a dictionary with additional information, for example argument to format the exit message.

Hehe, when implementing this I was already 95% sure it wouldn't pass review. Yes, I think adapting the logging container probably makes more sense. Let me resolve the other issues first and give this some thought, also in relation to the other discussions on the container. It's clear we should rethink this aspect.

sphuber · 2023-05-14T12:26:43Z

src/aiida_quantumespresso/parsers/base.py


-    def exit(self, exit_code):
-        """Log the exit message of the give exit code with level `ERROR` and return the exit code.
+    def check_base_errors(self, logs: AttributeDict) -> Optional[ExitCode]:


There might be problems with this approach in the future. For example the PwParser, it may be possible that the stdout could not be parsed, but the XML can and so we can still parse and return useful data.

But in the end the Parser implementation still has full control over what to do, no? This method simply does these basic checks and does some extra formatting. It might not be necessary anymore if we redesign the structure of the logging container, however.

sphuber · 2023-05-14T12:30:49Z

src/aiida_quantumespresso/parsers/base.py

+        if exit_code is not None:
+            self.logger.error(exit_code.message)
+        elif self.node.exit_status is not None:
+            exit_code = ExitCode(self.node.exit_status, self.node.exit_message)
+        else:
+            exit_code = ExitCode(0)
+
        return exit_code


This logic is not correct. Imagine the following situation: job gets submitted, but the node crashes. This is parsed by the SlurmScheduler and it sets 140:ERROR_NODE_FAILURE on the CalcJobNode. Since the job crashed, stdout will either be missing or probably incomplete and so the base output parser will return the ERROR_STDOUT_MISSING exit code (or comparable) and in this logic, this will therefore always override the scheduler status. But I would argue that the scheduler exit code is more useful in this case and we definitely want to keep it.

I think ultimately, each concrete parser implementation needs to parse the output and check itself whether a scheduler exit code already exists and determine whether it can provide a more specific/informative exit code.

This currently not causing the tests to fail, because there only is such a test for PwParser and there you haven't made use of this method since you wanted to not touch it. But this also means that we probably won't be able to use it. Maybe we can leave this for now then as it is, but we will have to anticipate changing it in the future when other parsers will want to become aware of scheduler exit codes and not override them

Good points. I think the logic was indeed that in case the developer wants to return a scheduler exit code they should look for and return it in the parse method. This one just checks that in case the developer doesn't want to do this for whatever reason, the exit code is still added in case self.exit() is called at the end.

But I suppose scheduler failures will almost always be more informative/useful?

src/aiida_quantumespresso/parsers/dos.py

sphuber · 2023-06-04T12:09:16Z

@mbercx did you want to get this merged before the release? If so, is there anything to be done from your side? Otherwise I will approve. We can deal with the open points later. I also would like to have this done so I can finalize my refactor PR for PwParser.

mbercx · 2023-06-08T18:44:31Z

@mbercx did you want to get this merged before the release? If so, is there anything to be done from your side? Otherwise I will approve. We can deal with the open points later. I also would like to have this done so I can finalize my refactor PR for PwParser.

I've opened #953 to keep track of the discussions that are left here. I think it makes sense to simply reconsider these decisions when integrating the BaseParser in the PwParser, we'll have a clearer idea of what to do during this process, I think.

mbercx · 2023-06-08T18:59:54Z

@sphuber it seems the autoapi extension doesn't always play nice with type hints. To avoid going down this rabbit hole here, I simply added the problematic classes to nitpick_ignore. I'll probably look into using autoapi2 soon anyways:

https://pypi.org/project/sphinx-autoapi2/

sphuber · 2023-06-09T18:40:11Z

Thanks @mbercx let's get this merged

yakutovicha · 2023-06-20T13:57:12Z

Hi 👋

Just wondering, are you guys planning on making a major release after this PR is merged?

There were API changes, as far as I can see.

sphuber · 2023-06-20T14:26:56Z

We were getting close to releasing v4.4.0: #951 @mbercx want to finish that one up and release it?

mbercx · 2023-06-22T12:16:10Z

want to finish that one up and release it?

Yessir! Sorry for the slow response.

There were API changes, as far as I can see.

The internal parsing API was changed quite a bit, but I think the vast majority of the changes are backwards incompatible. Looking at the data regression files for the tests, the only changes in parsing output is the code_version and wall_time_seconds/wall_time. Which API changes are you worried about breaking e.g. the QEapp?

yakutovicha · 2023-06-22T15:38:32Z

Which API changes are you worried about breaking e.g. the QEapp?

Mostly about the pp parser (that is the one I noticed). The parse_gaussian method, for instance. Since the name doesn't start with an underscore, it is part of the public API of the class, strictly speaking.

I didn't closely look at the PR, but there might be others.

However, it depends on how closely you want to follow the semantic versioning principles.

mbercx · 2023-06-22T17:00:43Z

However, it depends on how closely you want to follow the semantic versioning principles.

Yeah, that's fair. Perhaps we should be clearer in defining the public API. I don't expect users to use the Parser classes directly, since in many cases you need to parse from a node (which perhaps we should change over time, but then you definitely break the API).

The parser classes are also quite messy, and we've been working on improving that. Fixing this technical debt would be more important than strictly maintaining backwards compatibility for a part of the API that I don't think users are relying on directly.

unkcpz mentioned this pull request Nov 23, 2021

Add diagonalization errors handler to ph base workchain #757

Merged

mbercx force-pushed the fix/748/base-stdout-parsing branch from b08b206 to 0522e9e Compare January 28, 2022 02:19

mbercx changed the title ~~WIP: refactoring base stdout parsing~~ ♻️ REFACTOR: move basic parsing into BaseParser class Jan 28, 2022

mbercx marked this pull request as ready for review January 28, 2022 02:49

mbercx requested a review from sphuber January 28, 2022 02:50

mbercx mentioned this pull request Jan 28, 2022

parse_output_error: fix parser for QE stdout errors #751

Closed

mbercx commented Jan 28, 2022

View reviewed changes

mbercx force-pushed the fix/748/base-stdout-parsing branch 2 times, most recently from e27573f to 92d7bf3 Compare January 28, 2022 10:16

sphuber requested changes Jan 28, 2022

View reviewed changes

mbercx force-pushed the fix/748/base-stdout-parsing branch from 92d7bf3 to 6373e0b Compare January 28, 2022 12:22

mbercx force-pushed the fix/748/base-stdout-parsing branch 2 times, most recently from 0c8c48a to 3edacd7 Compare May 6, 2023 09:32

mbercx commented May 7, 2023

View reviewed changes

mbercx force-pushed the fix/748/base-stdout-parsing branch 4 times, most recently from c9fe647 to 4e0caf7 Compare May 7, 2023 06:53

mbercx commented May 7, 2023

View reviewed changes

mbercx requested a review from sphuber May 7, 2023 07:55

mbercx force-pushed the fix/748/base-stdout-parsing branch from 4e0caf7 to b3655a9 Compare May 10, 2023 06:17

mbercx force-pushed the fix/748/base-stdout-parsing branch from 6529ea6 to 765e0de Compare May 12, 2023 08:30

mbercx commented May 12, 2023

View reviewed changes

src/aiida_quantumespresso/parsers/base.py Show resolved Hide resolved

mbercx commented May 12, 2023

View reviewed changes

src/aiida_quantumespresso/parsers/base.py Show resolved Hide resolved

sphuber requested changes May 14, 2023

View reviewed changes

mbercx mentioned this pull request May 28, 2023

HpCalculation: exit code and handler for cholesky aiidateam/aiida-quantumespresso-hp#38

Merged

bastonero mentioned this pull request May 30, 2023

♻️ REFACTOR: move basic parsing into BaseParser class aiidateam/aiida-quantumespresso-hp#41

Open

mbercx force-pushed the fix/748/base-stdout-parsing branch from 064df48 to fa3d23f Compare June 3, 2023 09:20

mbercx added 8 commits June 4, 2023 10:53

Refactor basic parsing

2e83540

First round of fixes

45aba27

Fix the ph tests

5d4f5c8

Refactor the refactor work

21e5a8f

♻️ Move ERROR_OUT_OF_WALLTIME to BasePwCpInputGenerator

2ec16ab

🔧 Fix type annotiations

4b5e281

👌 Clean up/add exit code returns

9cf5588

tmp commit

c88a681

mbercx force-pushed the fix/748/base-stdout-parsing branch from fa3d23f to c88a681 Compare June 4, 2023 08:54

sphuber mentioned this pull request Jun 8, 2023

Release v4.4.0 #951

Merged

mbercx mentioned this pull request Jun 8, 2023

Revisit BaseParser when integrating into PwParser #953

Open

3 tasks

Update nitpick_ignore

3ff01f9

mbercx requested a review from sphuber June 8, 2023 19:00

sphuber approved these changes Jun 9, 2023

View reviewed changes

mbercx merged commit 1c223c7 into aiidateam:main Jun 10, 2023

mbercx deleted the fix/748/base-stdout-parsing branch June 10, 2023 08:28

yakutovicha mentioned this pull request Aug 30, 2023

Update aiida-qe plugin to ~=4.4 nanotech-empa/aiida-nanotech-empa#142

Merged


		if filename_frequencies not in retrieved.base.repository.list_object_names():
		return self.exit(self.exit_codes.ERROR_OUTPUT_STDOUT_READ)

♻️ REFACTOR: move basic parsing into BaseParser class #756

♻️ REFACTOR: move basic parsing into BaseParser class #756

Conversation

mbercx commented Nov 23, 2021 • edited Loading

mbercx commented Jan 27, 2022 • edited Loading

Choose a reason for hiding this comment

mbercx commented Jan 28, 2022

sphuber left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mbercx commented May 6, 2023

mbercx May 7, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mbercx commented May 7, 2023

Keeping the parse method readable

Other improvements

mbercx commented May 10, 2023

qiaojunfeng commented May 11, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mbercx Jun 4, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sphuber commented Jun 4, 2023

mbercx commented Jun 8, 2023

mbercx commented Jun 8, 2023

sphuber commented Jun 9, 2023

yakutovicha commented Jun 20, 2023

sphuber commented Jun 20, 2023

mbercx commented Jun 22, 2023 • edited Loading

yakutovicha commented Jun 22, 2023 • edited Loading

mbercx commented Jun 22, 2023

♻️ REFACTOR: move basic parsing into `BaseParser` class #756

♻️ REFACTOR: move basic parsing into `BaseParser` class #756

mbercx commented Nov 23, 2021 •

edited

Loading

mbercx commented Jan 27, 2022 •

edited

Loading

mbercx May 7, 2023 •

edited

Loading

Keeping the `parse` method readable

mbercx Jun 4, 2023 •

edited

Loading

mbercx commented Jun 22, 2023 •

edited

Loading

yakutovicha commented Jun 22, 2023 •

edited

Loading