From b6aa20ce22bbab8d0e459124f7c6fb5cec9d2e1c Mon Sep 17 00:00:00 2001 From: Marie Backman Date: Fri, 13 Sep 2024 09:59:23 -0400 Subject: [PATCH] reorganize communication flows by function --- .../communication_flows.rst | 261 ++++++++---------- docs/developer/architecture/index.rst | 9 + .../{design => architecture}/overview.rst | 27 +- docs/developer/index.rst | 13 +- 4 files changed, 149 insertions(+), 161 deletions(-) rename docs/developer/{design => architecture}/communication_flows.rst (62%) create mode 100644 docs/developer/architecture/index.rst rename docs/developer/{design => architecture}/overview.rst (75%) diff --git a/docs/developer/design/communication_flows.rst b/docs/developer/architecture/communication_flows.rst similarity index 62% rename from docs/developer/design/communication_flows.rst rename to docs/developer/architecture/communication_flows.rst index d29b43b8..dbacc79c 100644 --- a/docs/developer/design/communication_flows.rst +++ b/docs/developer/architecture/communication_flows.rst @@ -1,20 +1,67 @@ .. _communication_flows: -ActiveMQ Communication Flows -============================ +Communication Flows +=================== + +This section presents communication sequences organized by WebMon functionality. .. contents:: :local: -Autoreducers ------------- +Experiment monitoring +--------------------- + +Instrument status and PV updates +................................ + +DASMON, from Data Acquisition (DAQ) System Monitor, provides instrument status and process variable +(PV) updates from the beamlines to WebMon. DASMON connects to the WebMon message broker to pass +status information, for example the current run number and count rate, to Dasmon listener. Due to +the high volume of PV updates, DASMON writes PV:s directly to the PostgreSQL database. + +.. mermaid:: -Run post-processing -................... + sequenceDiagram + participant DASMON + participant Dasmon listener + participant Workflow DB + par + DASMON->>Workflow DB: PV update + and + DASMON->>Dasmon listener: Instrument status + Dasmon listener->>Workflow DB: Instrument status + end + +Run status updates +.................. + +The Stream Management Service (SMS) posts messages on the queue ``APP.SMS`` at run start, run stop +and when the Streaming Translation Client (STC) completes translation to NeXus. + +.. mermaid:: + + sequenceDiagram + participant SMS + participant Dasmon listener + participant Workflow DB + SMS->>Dasmon listener: Run started + Dasmon listener->>Workflow DB: Add new data run + Dasmon listener->>Workflow DB: Run status + SMS->>Dasmon listener: Run stopped + Dasmon listener->>Workflow DB: Run status + SMS->>Dasmon listener: Translation succeeded + Dasmon listener->>Workflow DB: Run status + + +Experiment data post-processing +------------------------------- + +Autoreduction and cataloging +............................ The sequence diagram below describes the communication flow as a run gets post-processed. -The post-processing workflow is triggered when the Translation Service has finished translating the -data stream to NeXus and sends a message on the queue ``POSTPROCESS.DATA_READY`` specifying the -instrument, IPTS, run number and location of the NeXus file. +The post-processing workflow is triggered when the Streaming Translation Client (STC) has finished +translating the data stream to NeXus and sends a message on the queue ``POSTPROCESS.DATA_READY`` +specifying the instrument, IPTS, run number and location of the NeXus file. The post-processing workflow for the instrument is configurable in the database table ``report_task``. @@ -25,54 +72,26 @@ raw data in `ONCat `_ and cataloging of reduced data in .. mermaid:: sequenceDiagram - participant Translation Service + participant STC participant Workflow Manager participant Autoreducer participant ONCat + participant HFIR/SNS File Archive - Translation Service->>Workflow Manager: POSTPROCESS.DATA_READY + STC->>Workflow Manager: POSTPROCESS.DATA_READY Workflow Manager->>Autoreducer: CATALOG.ONCAT.DATA_READY Autoreducer->>Workflow Manager: CATALOG.ONCAT.STARTED Autoreducer->>ONCat: pyoncat Autoreducer->>Workflow Manager: CATALOG.ONCAT.COMPLETE Workflow Manager->>Autoreducer: REDUCTION.DATA_READY Autoreducer->>Workflow Manager: REDUCTION.STARTED - Autoreducer->>Autoreducer: run reduction + Autoreducer->>HFIR/SNS File Archive: reduced data, reduction log Autoreducer->>Workflow Manager: REDUCTION.COMPLETE Workflow Manager->>Autoreducer: REDUCTION_CATALOG.DATA_READY Autoreducer->>Workflow Manager: REDUCTION_CATALOG.STARTED Autoreducer->>ONCat: pyoncat Autoreducer->>Workflow Manager: REDUCTION_CATALOG.COMPLETE -.. - .. mermaid:: - - sequenceDiagram - participant Translation Service - participant Workflow Manager - participant Autoreducer - - Translation Service->>Workflow Manager: POSTPROCESS.DATA_READY - opt Cataloging - Workflow Manager->>Autoreducer: CATALOG.ONCAT.DATA_READY - Autoreducer->>Workflow Manager: CATALOG.ONCAT.STARTED - Note over Autoreducer: Ingest in ONCat - Autoreducer->>Workflow Manager: CATALOG.ONCAT.COMPLETE - end - opt Autoreduction - Workflow Manager->>Autoreducer: REDUCTION.DATA_READY - Autoreducer->>Workflow Manager: REDUCTION.STARTED - Note over Autoreducer: Execute autoreduction script - Autoreducer->>Workflow Manager: REDUCTION.COMPLETE - end - opt Reduced data cataloging - Workflow Manager->>Autoreducer: REDUCTION_CATALOG.DATA_READY - Autoreducer->>Workflow Manager: REDUCTION_CATALOG.STARTED - Note over Autoreducer: Ingest in ONCat - Autoreducer->>Workflow Manager: REDUCTION_CATALOG.COMPLETE - end - - Configuring the autoreduction ............................. @@ -93,129 +112,54 @@ parameters for instruments that have implemented WebMon->>Autoreducer: REDUCTION.CREATE_SCRIPT Autoreducer->>HFIR/SNS File archive: Update instrument reduction script -DASMON ------- -DASMON, from Data Acquisition (DAQ) System Monitor, provides instrument status and process variable -(PV) updates from the beamlines to WebMon. DASMON connects to the WebMon message broker to pass -status information, for example the current run number and count rate, to Dasmon listener. Due to -the high volume of PV updates, DASMON writes PV:s directly to the PostgreSQL database. - -.. mermaid:: +Live data visualization +-------------------------- - sequenceDiagram - participant DASMON - participant Dasmon listener - participant Workflow DB - DASMON->>Workflow DB: PV update - DASMON->>Dasmon listener: Instrument status - DASMON->>Workflow DB: Instrument status +Live Data Server (https://github.com/neutrons/live_data_server) is a service that serves plots to +the WebMon frontend. It provides a REST API with endpoints to create/update to and retrieve plots +from the Live Data Server database. -Dasmon listener ---------------- +Publish to Live Data Server from live data stream +................................................. -Stream Management Service (SMS) -............................... +Livereduce (https://github.com/mantidproject/livereduce/) allows scientists to reduce +data from an ongoing experiment, i.e. before translation to NeXus, by connecting to the live data +stream from the Stream Management Service (SMS). The instrument-specific livereduce processing +script can make the results available in WebMon by publishing plots to Live Data Server. .. mermaid:: sequenceDiagram participant SMS - participant Dasmon listener - participant Workflow DB - SMS->>Dasmon listener: Run started - Dasmon listener->>Workflow DB: Create new run - SMS->>Dasmon listener: Run stopped - SMS->>Dasmon listener: Translation succeeded - -Heartbeats from services -........................ - -Dasmon listener subscribes to heartbeats from the other services. There is a mechanism for alerting -admins by email when a service has missed heartbeats (needs to be verified that this still works). - -.. - .. mermaid:: - - sequenceDiagram - participant Other services - participant Dasmon listener - participant Workflow DB - actor Subscribed users - loop Every N s - Other services->>Dasmon listener: Heartbeat - Dasmon listener->>Workflow DB: Status update - end - opt Service has 3 missed heartbeats - Dasmon listener->>Subscribed users: Email - end - - -.. mermaid:: - - flowchart LR - SMS["SMS (per beamline)"] - PVSD["PVSD (per beamline)"] - DASMON["DASMON (per beamline)"] - STC - Autoreducers - DasmonListener - WorkflowDB[(DB)] - SMS-->|heartbeat|DasmonListener - PVSD-->|heartbeat|DasmonListener - DASMON-->|heartbeat|DasmonListener - STC-->|heartbeat|DasmonListener - Autoreducers-->|heartbeat|DasmonListener - WorkflowManager-->|heartbeat|DasmonListener - DasmonListener-->|heartbeat|DasmonListener - DasmonListener-->WorkflowDB - DasmonListener-.->|if missed 3 heartbeats|InstrumentScientist - -Live Data Server ----------------------------------------- + participant Livereduce + participant Live Data Server -WebMon has two modes of interaction with Live Data Server: publish (save) plots to the Live Data -Server database and display (fetch) plots from the database. + SMS->>Livereduce: data stream + loop Every N minutes + Livereduce->>Livereduce: run processing script + Livereduce->>Live Data Server: HTTP POST + end Publish to Live Data Server from autoreduction script ..................................................... -The instrument-specific autoreduction script can optionally publish plots (in either JSON format -or HTML div) to Live Data Server. +The instrument-specific autoreduction script can include a step to publish plots (in either JSON +format or HTML div) to Live Data Server. The Post-processing Agent repository includes some +convenience functions for generating and publishing plots in `publish_plot.py +`_. .. mermaid:: sequenceDiagram - participant WebMon + participant Workflow Manager participant Autoreducer participant Live Data Server - WebMon->>Autoreducer: REDUCTION.DATA_READY + Workflow Manager->>Autoreducer: REDUCTION.DATA_READY opt Publish plot - Autoreducer->>Live Data Server: publish_plot + Autoreducer->>Live Data Server: HTTP POST end -Publish to Live Data Server from live data stream -................................................. - -Livereduce (https://github.com/mantidproject/livereduce/) allows scientists to reduce -data from an ongoing experiment, i.e. before translation to NeXus, by connecting to the live data -stream from the Stream Management Service (SMS). The instrument-specific processing -script can make the results available in WebMon by publishing plots to Live Data Server. - -.. mermaid:: - - sequenceDiagram - participant SMS - participant Livereduce - participant Live Data Server - - SMS->>Livereduce: data stream - loop Every N minutes - Livereduce->>Livereduce: run processing script - Livereduce->>Live Data Server: publish plot - end - - Display plot from Live Data Server ................................ @@ -232,3 +176,38 @@ Data Server for a plot for that instrument and run number and display it if avai loop Every 60 s WebMon->>Live Data Server: HTTP GET end + +System diagnostics +------------------ + +WebMon displays system diagnostics information on https://monitor.sns.gov/dasmon/common/diagnostics/ +and diagnostics for DASMON and PVSD at the beamline at +`https://monitor.sns.gov/dasmon//diagnostics/ +`_. +Diagnostics information is primarily collected by Dasmon listener. + +Heartbeats from services +........................ + +Dasmon listener subscribes to heartbeats from the other services. There is a mechanism for alerting +admins by email when a service has missed heartbeats (needs to be verified that this still works). + +.. mermaid:: + + flowchart LR + SMS["SMS (per beamline)"] + PVSD["PVSD (per beamline)"] + DASMON["DASMON (per beamline)"] + STC + Autoreducers + DasmonListener + WorkflowDB[(DB)] + SMS-->|heartbeat|DasmonListener + PVSD-->|heartbeat|DasmonListener + DASMON-->|heartbeat|DasmonListener + STC-->|heartbeat|DasmonListener + Autoreducers-->|heartbeat|DasmonListener + WorkflowManager-->|heartbeat|DasmonListener + DasmonListener-->|heartbeat|DasmonListener + DasmonListener-->WorkflowDB + DasmonListener-.->|if missed 3 heartbeats|InstrumentScientist diff --git a/docs/developer/architecture/index.rst b/docs/developer/architecture/index.rst new file mode 100644 index 00000000..0fedd477 --- /dev/null +++ b/docs/developer/architecture/index.rst @@ -0,0 +1,9 @@ +Architecture +============ + +.. toctree:: + :maxdepth: 1 + :caption: Index + + overview + communication_flows diff --git a/docs/developer/design/overview.rst b/docs/developer/architecture/overview.rst similarity index 75% rename from docs/developer/design/overview.rst rename to docs/developer/architecture/overview.rst index b5e1df00..750ae324 100644 --- a/docs/developer/design/overview.rst +++ b/docs/developer/architecture/overview.rst @@ -1,21 +1,17 @@ -Design Overview -=============== - -.. toctree:: - :maxdepth: 1 - - communication_flows - +Overview +======== High-level architecture ----------------------- The diagram below describes the high-level architecture of WebMon, including some external resources -that WebMon interacts with. The gray box labeled "DAS" are services owned by the Data Acquisition -System team that feed information to WebMon. `ONCat `_ is the experiment -data catalog, which the autoreducers catalog runs to and the frontend fetches run metadata from. -The autoreducers access instrument-specific reduction scripts and experiment data files on the -HFIR/SNS file archive. The autoreducers also write reduced data files and logs to the file archive. +that WebMon interacts with. The direction of the arrows shows the direction of data flow, e.g. +Dasmon listener gets data from DASMON. The gray box labeled "DAS" are services owned by the Data +Acquisition System team that feed information to WebMon. `ONCat `_ is the +experiment data catalog, which the autoreducers catalog runs to and the frontend fetches run +metadata from. The autoreducers access instrument-specific reduction scripts and experiment data +files on the HFIR/SNS file archive. The autoreducers also write reduced data files and reduction log +files to the file archive. .. mermaid:: @@ -56,7 +52,7 @@ HFIR/SNS file archive. The autoreducers also write reduced data files and logs t TranslationService-->FileArchive FileArchive<-->Autoreducers style DAS fill:#D3D3D3, stroke-dasharray: 5 5 - classDef webMonStyle fill:#FFFFE0 + classDef webMonStyle fill:#FFFFE0, stroke:#B8860B class WorkflowManager,DasmonListener,Database,Autoreducers,LiveDataServer,LiveReduce,WebMon,LiveDataDB webMonStyle subgraph Legend @@ -69,6 +65,7 @@ HFIR/SNS file archive. The autoreducers also write reduced data files and logs t style Legend fill:#FFFFFF,stroke:#000000 class Internal webMonStyle +The section :ref:`communication_flows` provides sequence diagrams to show how the services interact. Message broker -------------- @@ -77,8 +74,6 @@ WebMon uses an `ActiveMQ `_ message broker for com services. The message broker also serves as a load balancer by distributing post-processing jobs to the available autoreducers in a round-robin fashion. -Service communication flows are described in :ref:`communication_flows`. - .. mermaid:: flowchart TB diff --git a/docs/developer/index.rst b/docs/developer/index.rst index 26f81a39..76447424 100644 --- a/docs/developer/index.rst +++ b/docs/developer/index.rst @@ -1,13 +1,21 @@ Developer documentation ======================= +Architecture +------------ + +.. toctree:: + :maxdepth: 1 + + architecture/overview + architecture/communication_flows + Developer Guide --------------- .. toctree:: :maxdepth: 1 - design/overview instruction/build instruction/test_fixture instruction/docker @@ -25,15 +33,12 @@ The web-monitor contains three independent Django applications * :py:mod:`webmon `: user facing web interface, visit the production version at `monitor.sns.gov`_. * :py:mod:`workflow`: backend manager. -and a mocked catalog services. - .. toctree:: :maxdepth: 1 dasmon/modules webmon/modules workflow/modules - catalog/modules .. _monitor.sns.gov: https://monitor.sns.gov/