Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#614] schedule execution of functions at and around PRC close-down #616

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

d-w-moore
Copy link
Collaborator

No description provided.

@d-w-moore d-w-moore changed the title schedule execution of functions at and around PRC close-down [#614] schedule execution of functions at and around PRC close-down Aug 28, 2024
@d-w-moore d-w-moore marked this pull request as draft August 28, 2024 13:59
@d-w-moore d-w-moore self-assigned this Aug 28, 2024
irods/at_client_exit.py Outdated Show resolved Hide resolved
@d-w-moore
Copy link
Collaborator Author

Wondering if the *_PRC symbols should be renamed to be more specific, more indicative of what they refer to.
Maybe: BEFORE_PRC_CLEANUP, DURING_PRC_CLEANUP, AFTER_PRC_CLEANUP. Any thoughts?

@trel
Copy link
Member

trel commented Nov 17, 2024

was wondering about that too. we call it PRC verbally... but we don't really... have that in writing in the code? hmmm.

yes, happy to entertain other names.

@trel
Copy link
Member

trel commented Nov 17, 2024

well, that's not apparently true - it's at the VERY TOP of the README

@d-w-moore
Copy link
Collaborator Author

d-w-moore commented Nov 17, 2024

The lingo can be too ingrained, I guess. Maybe a discussion is due on that topic too.

But past that, w.r.t. the names of the 3 cleanup stages themselves, I was wondering whether *_CLEANUP should be the form. Whatever is true of "PRC", BEFORE_PRC isn't really clear enough. Before PRC what? Cleanup, in this case.

Although tbh, BEFORE_PYTHON_IRODSCLIENT_CLEANUP is a bit too long for my tastes.

irods/at_client_exit.py Outdated Show resolved Hide resolved
@trel
Copy link
Member

trel commented Nov 18, 2024

I think BEFORE_PRC_CLEANUP, DURING_PRC_CLEANUP, AFTER_PRC_CLEANUP are fine.

Comment on lines 7 to 11
class priority(enum.Enum):
# Integer value determines relative order of execution among the stages.
BEFORE_PRC_CLEANUP = 10
DURING_PRC_CLEANUP = 20
AFTER_PRC_CLEANUP = 30
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this enum definition needs some attention.

priority is pretty generic. Is there a qualifier that can be used to make its purpose more narrow? If the name can be more specific, then that would likely remove the need to include _PRC_ in the enum value names.

If _PRC_ must be part of the enum value names, then I think they should be changed so that each of them share the same prefix (e.g. PRC_CLEANUP_*). Especially if imported into the global namespace.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could say ExecutionPriority, ExecutionOrder, or ExecutionStage ... ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I'm not sold on PRC being part of the name. I'd be open to suggestions as I've struggle with this for some time.

Copy link
Collaborator Author

@d-w-moore d-w-moore Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do something of this sort:

class ExecutionPriority:
   BEFORE_PYTHON_IRODSCLIENT_LIB_CLEANUP = 10
   DURING_PYTHON_IRODSCLIENT_LIB_CLEANUP = 20
   AFTER_PYTHON_IRODSCLIENT_LIB_CLEANUP = 30
#...

As you probably know, I'm not that enamored of long names because I think they can detract from clarity in many instances. Python lends itself well to a slacker philosophy like mine, too, as it lets you import and rename a symbol using import...as. Thus, from irods.at_client_exit.priority as client_exit_execution_priority becomes an option for those who prefer more fully explicit names in their own applications.... Just a thought. : )

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thoughts?

class CleanupPolicy(enum.Enum):
    BEFORE = 10,
    DURING = 20,
    AFTER  = 30,

If these define the "when", then perhaps Policy can be replaced by something that expresses that.

Can you provide an example which shows usage of this? It doesn't need to be real/complete. It just needs to show how it's handled by the user of the library.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DURING is already used internally by the PRC to schedule the cleaning of data objects, so I guess the answer is yes. Plus, DURING as part of the enum is needed for proper sequencing of the three categories relative to each other. I could try I guess to hide it from outside access, but I'm not sure doing so serves much of a purpose. Are we thinking it's a security issue?

Copy link
Collaborator Author

@d-w-moore d-w-moore Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that DURING might not be used much is, I believe, not a sufficient reason for concealing it. There are cases in which it might be useful.
I thank you for bringing it up, it spurred me to think about relative order within "DURING" for applications that might use it and therefore to add commit 95591b7 , to make sure the registration of sessions' and data objects' automated cleanup function (_cleanup_remaining_sessions) is done at the time irods.session is imported. This makes a lot of sense, because the applications in question will likely want their own DURING-phase callbacks to be done before the whole PRC is built down (per the LIFO thing), and now they can do so, while still being sure that their BEFORE-phase exit callbacks will be done even sooner.
I also think it quite harmless, and convenient in this case, to raise the symbols to module level. There are other precedents for this, like when we use inline namespaces in C++.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's discuss this following standup.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some discussion, the plan is to keep the existing implementation, but treat it as a low-level interface and wrap it in a simpler one i.e. present a high-level interface consisting of two functions (function names not final):

  • register_cleanup_callback_pre
  • register_cleanup_callback_post

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other names

  • execute_before_lib_cleanup
  • execute_after_lib_cleanup
  • cleanup_action_pre
  • cleanup_action_post

return in_memory_stream.getvalue()

if __name__ == '__main__':

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize that this is used exclusively inside of the test noted at the top, but perhaps it could be helpful for future readers to leave a usage comment as well as some example output.

# Run each cleanup stage.
for priority,function_list in ordered_funclists:
# Within each cleanup stage, run all registered functions last-in, first-out. (Per atexit conventions.)
for function in reversed(function_list):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that we'll be dealing with any large lists, but we could possibly use pop to avoid reversing the list. Plus, using append and pop would make the "stack" structure a little more apparent in the code, per the documentation here: https://docs.python.org/3/tutorial/datastructures.html#using-lists-as-stacks Feel free to ignore this.

Copy link
Collaborator Author

@d-w-moore d-w-moore Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the cool things about most of the functions that transform data structures in Python3 is that they return iterators rather than doing a transformed copy of the original data structure So, we don't have to fear duplicating the data or doubling the storage, in effect, contained by a large list. reversed(L) thus returns an iterator to the sequence L, that iterates backward through the list. dict.keys is another such example.

Comment on lines 16 to 20
# So that we can access priority.ENUM_NAME as simply ENUM_NAME at module level.
for _ in priority.__members__:
exec('{}=priority.{}'.format(_,_))

_cleanup_functions = {BEFORE_PRC_CLEANUP:[], DURING_PRC_CLEANUP:[], AFTER_PRC_CLEANUP:[]}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my own understanding... The for loop is here so that the priority levels in _cleanup_functions don't need to have priority. prepended to them (e.g. priority.BEFORE_PRC_CLEANUP vs BEFORE_PRC_CLEANUP)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels like the tests we're adding here do not belong in data_obj_test because they don't do anything with data objects, or even iRODS, really. Should these get their own test module?

Comment on lines 7 to 11
class priority(enum.Enum):
# Integer value determines relative order of execution among the stages.
BEFORE_PRC_CLEANUP = 10
DURING_PRC_CLEANUP = 20
AFTER_PRC_CLEANUP = 30
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would we still need to have DURING around in order for BEFORE and AFTER to have any meaning? Or will there be some mechanism for the internal PRC cleanup to define the DURING methods? Hopefully that question makes sense...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants