Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Standardize exception handling #103

Open
nsoblath opened this issue May 6, 2024 · 0 comments
Open

Standardize exception handling #103

nsoblath opened this issue May 6, 2024 · 0 comments

Comments

@nsoblath
Copy link
Member

nsoblath commented May 6, 2024

Exception handling in dl-py: driplineorg/dripline-python#166

In general, the ultimate decision for how an exception should affect the operation of an executable (or a thread within an executable) is by the application's/thread's execute() or other primary operation (e.g. listen()) functions. Here is how exceptions are currently handled:

  • ✅ agent
    • No error handling
    • Creates sub-agent and executes it (see sub_agent_request[reply] below)
  • ✅ concurrent_receiver
    • try block around main while loop -- will catch exceptions from all message processing that comes through here
      • receiver.cc (439), execute()
      • catch std::exception: uses signal_handler::cancel_all() to shutdown application
  • ⚠️ heartbeater
    • try block around sending a message -- each time a heartbeat message is sent
      • heartbeater.cc (53), execute()
      • catch message_ptr_t and connection_error: assume offline mode; log error and don't do anything
      • catch dripline_error: log error and don't do anything
      • Question: do we really want to do nothing for these?
    • Question: what about other code in the main while loop outside of sending?
  • ✅ monitor
    • try block around starting and joining thread -- this is the main operation of a monitor
      • monitor.cc (149), execute()
      • Same treatment as service
  • ⚠️ relayer
    • try block around sending a message -- each time a message is sent
      • relayer.cc (28), execute()
      • catch message_ptr_t: assume offline mode; log error and do nothing
      • catch connection_error: log error and do nothing
      • catch dripline_error: log error and do nothing
      • Question: do we really want to do nothing for these?
      • Question: why is treatment of connection_error slightly inconsistent with heartbeater?
    • Question: what about other code in the main while loop outside of sending?
  • ❌ scheduler
  • ✅ service
    • try block around starting and joining threads -- this is the main operation of a service
      • service.cc (195), listen()
      • catch std::system_error: assume didn't connect; return false to exit service
      • catch dripline_error: report error; return false to exit service
      • catch std::exception: report error; return false to exit service
  • ⚠️ sub_agent_request[reply]
    • try block around sending a message
      • agent.cc (153[298]), create_and_send_message()
      • Same handling as service except the return value is set in f_agent instead of returning false
    • Question: what about other code, putting together the message, etc? This is unhandled.

There are some other parts of the error-handling chain that should be considered during this exercise:

  • service::submit_message()
    • service.cc (470)
    • Call sequence
      • concurrent_receiver::execute() [as part of service]
      • service::submit_message() [link between concurrent_receiver and endpoint]
      • endpoint::sort_message() [as part of service]
    • Catches several exception types; logs as KTERROR; then rethrows
      • dripline_error
      • amqp::amqp_exception
      • amqp::amqp_lib_exception
      • std::exception
    • Question Do we even need to do the catching and rethrowing here? Is there value in the log messages at this stage?
  • endpoint::on_request_message()
    • endpoint.cc (88)
    • We catch exceptions here so that we can send a reply message
    • Call sequence
      • endpoint::sort_message()
      • endpoint::on_request_message()
      • endpoint::__do_[OP]_request()
    • The try block surrounds the calling of the different __do_[OP]_request() functions.
    • Catches and actions
      • throw_reply
        • This object is only used to send a reply message
        • The return code could be success, a warning, or an error
        • After sending, control returns to the thrower
      • pybind11::error_already_set
        • This is how all raised Python exceptions show up in C++, regardless of Python type
        • How this is handled depends on a keyword that is checked for in the what() string
          • If it's indicated that the Python exception was a ThrowReply, then it's handled as above for throw_reply
          • Otherwise an unhandled exception reply is sent and the exception is rethrown
      • std::exception
        • An unhandled exception reply is sent and the exception is rethrown
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant