diff --git a/docs/developer/architecture/communication_flows.rst b/docs/developer/architecture/communication_flows.rst index 6fad9165..e5f87398 100644 --- a/docs/developer/architecture/communication_flows.rst +++ b/docs/developer/architecture/communication_flows.rst @@ -1,5 +1,9 @@ .. _communication_flows: +.. Note that the mermaid diagrams are styled using some ugly CSS since styling of sequence diagrams + is an open issue: https://github.com/mermaid-js/mermaid/issues/523 + CSS hack from: https://stackoverflow.com/questions/63587556/color-change-of-one-element-in-a-mermaid-sequence-diagram + Communication Flows =================== @@ -30,6 +34,7 @@ the high volume of PV updates, DASMON writes PV:s directly to the PostgreSQL dat DASMON->>Dasmon listener: Instrument status Dasmon listener->>Workflow DB: Instrument status end + %%{init:{'themeCSS':'g:nth-of-type(6) rect.actor { fill:#faf2e6; stroke:#f2e3cb; };g:nth-of-type(2) rect.actor { fill:#faf2e6; stroke:#f2e3cb; };'}}%% Run status updates .................. @@ -50,6 +55,7 @@ and when the Streaming Translation Client (STC) completes translation to NeXus. Dasmon listener->>Workflow DB: Run status SMS->>Dasmon listener: Translation succeeded Dasmon listener->>Workflow DB: Run status + %%{init:{'themeCSS':'g:nth-of-type(2) rect.actor { fill:#faf2e6; stroke:#f2e3cb; };g:nth-of-type(5) rect.actor { fill:#faf2e6; stroke:#f2e3cb; };'}}%% Experiment data post-processing @@ -76,29 +82,30 @@ configured sequence and the steps included depending on the instrument. 
sequenceDiagram participant STC participant Workflow Manager - participant Autoreducer + participant Post-Processing Agent participant ONCat participant HFIR/SNS File Archive STC->>Workflow Manager: POSTPROCESS.DATA_READY - Workflow Manager->>Autoreducer: CATALOG.ONCAT.DATA_READY - Autoreducer->>Workflow Manager: CATALOG.ONCAT.STARTED - Autoreducer->>ONCat: pyoncat - Autoreducer->>Workflow Manager: CATALOG.ONCAT.COMPLETE - Workflow Manager->>Autoreducer: REDUCTION.DATA_READY - Autoreducer->>Workflow Manager: REDUCTION.STARTED - Autoreducer->>HFIR/SNS File Archive: reduced data, reduction log - Autoreducer->>Workflow Manager: REDUCTION.COMPLETE - Workflow Manager->>Autoreducer: REDUCTION_CATALOG.DATA_READY - Autoreducer->>Workflow Manager: REDUCTION_CATALOG.STARTED - Autoreducer->>ONCat: pyoncat - Autoreducer->>Workflow Manager: REDUCTION_CATALOG.COMPLETE + Workflow Manager->>Post-Processing Agent: CATALOG.ONCAT.DATA_READY + Post-Processing Agent->>Workflow Manager: CATALOG.ONCAT.STARTED + Post-Processing Agent->>ONCat: pyoncat + Post-Processing Agent->>Workflow Manager: CATALOG.ONCAT.COMPLETE + Workflow Manager->>Post-Processing Agent: REDUCTION.DATA_READY + Post-Processing Agent->>Workflow Manager: REDUCTION.STARTED + Post-Processing Agent->>HFIR/SNS File Archive: reduced data, reduction log + Post-Processing Agent->>Workflow Manager: REDUCTION.COMPLETE + Workflow Manager->>Post-Processing Agent: REDUCTION_CATALOG.DATA_READY + Post-Processing Agent->>Workflow Manager: REDUCTION_CATALOG.STARTED + Post-Processing Agent->>ONCat: pyoncat + Post-Processing Agent->>Workflow Manager: REDUCTION_CATALOG.COMPLETE + %%{init:{'themeCSS':'g:nth-of-type(2) rect.actor { fill:#faf2e6; stroke:#f2e3cb; };g:nth-of-type(5) rect.actor { fill:#faf2e6; stroke:#f2e3cb; };g:nth-of-type(6) rect.actor { fill:#faf2e6; stroke:#f2e3cb; };g:nth-of-type(7) rect.actor { fill:#faf2e6; stroke:#f2e3cb; };g:nth-of-type(10) rect.actor { fill:#faf2e6; stroke:#f2e3cb; };g:nth-of-type(11) rect.actor { 
fill:#faf2e6; stroke:#f2e3cb; };'}}%% Configuring the autoreduction ............................. -In addition to run post-processing, the autoreducers handle updating instrument reduction script -parameters for instruments that have implemented +In addition to run post-processing, Post-Processing Agent handles updating instrument reduction +script parameters for instruments that have implemented :doc:`autoreduction parameter configuration<../instruction/autoreduction>` at `monitor.sns.gov/reduction// `_. @@ -107,12 +114,13 @@ parameters for instruments that have implemented sequenceDiagram actor Instrument Scientist participant WebMon - participant Autoreducer + participant Post-Processing Agent participant HFIR/SNS File archive Instrument Scientist->>WebMon: Submit form with parameter values - WebMon->>Autoreducer: REDUCTION.CREATE_SCRIPT - Autoreducer->>HFIR/SNS File archive: Update instrument reduction script + WebMon->>Post-Processing Agent: REDUCTION.CREATE_SCRIPT + Post-Processing Agent->>HFIR/SNS File archive: Update instrument reduction script + %%{init:{'themeCSS':'g:nth-of-type(5) rect.actor { fill:#faf2e6; stroke:#f2e3cb; };g:nth-of-type(9) rect.actor { fill:#faf2e6; stroke:#f2e3cb; };'}}%% Live data visualization -------------------------- @@ -141,12 +149,13 @@ script can make the results available in WebMon by publishing plots to Live Data Livereduce->>Livereduce: run processing script Livereduce->>Live Data Server: HTTP POST end + %%{init:{'themeCSS':'g:nth-of-type(2) rect.actor { fill:#faf2e6; stroke:#f2e3cb; };g:nth-of-type(6) rect.actor { fill:#faf2e6; stroke:#f2e3cb; };'}}%% Publish to Live Data Server from autoreduction script ..................................................... The instrument-specific autoreduction script can include a step to publish plots (in either JSON -format or HTML div) to Live Data Server. The Post-processing Agent repository includes some +format or HTML div) to Live Data Server. 
The Post-Processing Agent repository includes some convenience functions for generating and publishing plots in `publish_plot.py `_. @@ -154,19 +163,24 @@ convenience functions for generating and publishing plots in `publish_plot.py sequenceDiagram participant Workflow Manager - participant Autoreducer + participant Post-Processing Agent participant Live Data Server - Workflow Manager->>Autoreducer: REDUCTION.DATA_READY + Workflow Manager->>Post-Processing Agent: REDUCTION.DATA_READY opt Publish plot - Autoreducer->>Live Data Server: HTTP POST + Post-Processing Agent->>Live Data Server: HTTP POST end Display plot from Live Data Server ................................ Run overview pages (``monitor.sns.gov/report///``) will query the Live -Data Server for a plot for that instrument and run number and display it if available. +Data Server for a plot for that instrument and run number and display it, if available. + +The Live Data Server database stores a single plot for each combination of instrument and run +number. Publishing a new plot automatically replaces the previous plot. When WebMon fetches a plot +it will, therefore, always display the latest plot, whether it was published by Livereduce during +the run or by autoreduction after the run has finished. .. mermaid:: @@ -191,8 +205,9 @@ Diagnostics information is primarily collected by Dasmon listener. Heartbeats from services ........................ -Dasmon listener subscribes to heartbeats from the other services. There is a mechanism for alerting -admins by email when a service has missed heartbeats (needs to be verified that this still works). +Dasmon listener subscribes to heartbeat messages from the other services and stores the last +received status for each service in the database. Post-Processing Agent and Workflow Manager +also include their process ID (PID) in the heartbeat message. .. 
mermaid:: @@ -200,16 +215,54 @@ admins by email when a service has missed heartbeats (needs to be verified that SMS["SMS (per beamline)"] PVSD["PVSD (per beamline)"] DASMON["DASMON (per beamline)"] - STC - Autoreducers + PostProcessingAgent["Post-Processing Agent"] DasmonListener WorkflowDB[(DB)] SMS-->|heartbeat|DasmonListener PVSD-->|heartbeat|DasmonListener DASMON-->|heartbeat|DasmonListener - STC-->|heartbeat|DasmonListener - Autoreducers-->|heartbeat|DasmonListener - WorkflowManager-->|heartbeat|DasmonListener - DasmonListener-->|heartbeat|DasmonListener + PostProcessingAgent-->|heartbeat, PID|DasmonListener + WorkflowManager-->|heartbeat, PID|DasmonListener DasmonListener-->WorkflowDB - DasmonListener-.->|if missed 3 heartbeats|InstrumentScientist + classDef externalStyle fill:#faf2e6, stroke:#f2e3cb + class SMS,PVSD,DASMON externalStyle + + subgraph Legend + direction LR + Internal["Internal resource"] + External["External resource"] + Internal ~~~ External + end + WorkflowManager ~~~ Internal + style Legend fill:#FFFFFF,stroke:#000000 + class External externalStyle + +Dasmon listener treats any message sent to a message broker topic whose name contains the string +"STATUS" as a heartbeat message. For example, Workflow Manager sends a heartbeat message to +``SNS.COMMON.STATUS.WORKFLOW.0`` every 5 seconds. Dasmon listener also records heartbeats from the +beamline-specific services; for example, the PVSD service at the HFIR beamline +CG3 sends heartbeat messages to the topic ``HFIR.CG3.STATUS.PVSD``. Table 2 lists the services that +send heartbeats to Dasmon listener, together with their message broker topics and heartbeat frequencies. + +.. 
list-table:: Table 2: Service heartbeat messages + :widths: 40 40 20 + :header-rows: 1 + + * - Service + - Message broker topic + - Frequency + * - Workflow Manager + - SNS.COMMON.STATUS.WORKFLOW.0 + - 5 s + * - Post-Processing Agent + - SNS.COMMON.STATUS.AUTOREDUCE.0 + - 30 s + * - DASMON + - ..STATUS.DASMON + - 5 s + * - PVSD + - ..STATUS.PVSD + - 5 s + * - SMS + - ..STATUS.SMS + - 5 s diff --git a/docs/developer/architecture/overview.rst b/docs/developer/architecture/overview.rst index 0f6787c0..0828a9e8 100644 --- a/docs/developer/architecture/overview.rst +++ b/docs/developer/architecture/overview.rst @@ -27,7 +27,7 @@ The arrows represent relationships between these services and resources. WorkflowManager[Workflow Manager] DasmonListener[Dasmon listener] Database[(Workflow DB)] - Autoreducers-->ONCat + PostProcessingAgent[Post-Processing Agent]-->ONCat LiveDataServer-->WebMon LiveDataServer<-->LiveDataDB[(LiveData DB)] LiveReduce @@ -38,16 +38,16 @@ The arrows represent relationships between these services and resources. DASMON-->DasmonListener DASMON-->Database WorkflowManager-->Database - WorkflowManager<-->Autoreducers - Autoreducers-->LiveDataServer + WorkflowManager<-->PostProcessingAgent + PostProcessingAgent-->LiveDataServer Database-->WebMon ONCat-->WebMon LiveReduce-->LiveDataServer DasmonListener-->Database TranslationService-->FileArchive - FileArchive<-->Autoreducers + FileArchive<-->PostProcessingAgent style DAS fill:#D3D3D3, stroke-dasharray: 5 5 - classDef externalStyle fill:#FAEFDE, stroke:#B08D55 + classDef externalStyle fill:#faf2e6, stroke:#f2e3cb class DASMON,TranslationService,SMS,FileArchive,ONCat externalStyle subgraph Legend @@ -61,12 +61,12 @@ The arrows represent relationships between these services and resources. class External externalStyle The gray box labeled "DAS" are services managed by the Data Acquisition System team that pass -information to WebMon. 
The autoreducers interact with the HFIR/SNS file archive to access -instrument-specific reduction scripts and experiment data files. The autoreducers also write reduced -data files and reduction log files back to the file archive. +information to WebMon. Post-Processing Agent interacts with the HFIR/SNS file archive to access +instrument-specific reduction scripts and experiment data files. Post-Processing Agent also writes +reduced data files and reduction log files back to the file archive. Another external component is the experiment data catalog, `ONCat `_, where -the autoreducers catalog experiment metadata. The WebMon frontend retrieves and displays this +Post-Processing Agent catalogs experiment metadata. The WebMon frontend retrieves and displays this metadata from ONCat. The section :ref:`communication_flows` includes sequence diagrams that show how the services @@ -77,8 +77,8 @@ Inter-service communication WebMon uses an `ActiveMQ `_ message broker as the main method of communication between services. The message broker also serves as a load balancer by distributing -post-processing jobs to the available autoreducers in a round-robin fashion. Communication with Live -Data Server and ONCat occurs via their respective REST API:s. +post-processing jobs to the available instances of Post-Processing Agent in a round-robin fashion. +Communication with Live Data Server and ONCat occurs via their respective REST APIs. Table 1 lists the type of communication between pairs services, which are loosely categorized as "client" and "service" in that interaction. 
@@ -90,13 +90,13 @@ Table 1 lists the type of communication between pairs services, which are loosel * - "Client" - "Server" - Communication type - * - Autoreducers + * - Post-Processing Agent - Dasmon Listener - Message queue - * - Autoreducers + * - Post-Processing Agent - Live Data Server - REST API - * - Autoreducers + * - Post-Processing Agent - ONCat - REST API * - DASMON @@ -130,7 +130,7 @@ Table 1 lists the type of communication between pairs services, which are loosel - Workflow Manager - Message queue * - Workflow Manager - - Autoreducers + - Post-Processing Agent - Message queue * - Workflow Manager - Dasmon Listener diff --git a/docs/glossary.rst b/docs/glossary.rst new file mode 100644 index 00000000..63316832 --- /dev/null +++ b/docs/glossary.rst @@ -0,0 +1,85 @@ +Glossary +======== + +.. glossary:: + :sorted: + + WebMon + Monitor for instrument status and autoreduction status at SNS and HFIR. WebMon can refer to + either the whole system or the landing page at https://monitor.sns.gov. + + DASMON + Data Acquisition System Monitor (DASMON) is a per-beamline service that reports PV values and + experiment/run metadata to WebMon. Due to the high volume of PV updates, DASMON writes PV + updates straight to the database. DASMON reports its status to WebMon for system diagnostics. + + DAS + Data Acquisition System. + + SMS + Stream Management Service (SMS) is a service that aggregates fast neutron event data + and slow PVs into a data stream available for both live monitoring and file archiving. SMS + reports its status to WebMon for system diagnostics. + + STC + Streaming Translation Client (STC) is a service that translates the experiment data stream + from :term:`SMS` to a NeXus data file. STC triggers the post-processing workflow after the NeXus + file has been created. + + PVSD + Process Variable Streaming Daemon (PVSD) is a + per-beamline service that subscribes to EPICS Control System PVs and forwards PV value + changes to the SMS. 
PVSD reports its status to WebMon for system diagnostics. + + PV + Process Variables (PVs) are variables coming from the control system and can include sample + environment variables (e.g. temperature, magnetic field), instrument geometry and experiment + metadata. Subsets of PVs are used for monitoring experiments. + + ONCat + Catalog for neutron experiment data at SNS and HFIR: https://oncat.ornl.gov. + + SNS + The Spallation Neutron Source at Oak Ridge National Laboratory. + + HFIR + The High Flux Isotope Reactor at Oak Ridge National Laboratory. + + NeXus + Data format for neutron, x-ray, and muon science. + + Workflow Manager + Service that orchestrates the autoreduction/post-processing workflow. + + Dasmon listener + Service that subscribes to messages from :term:`DASMON` and writes to the database. + + Autoreduction + Automated data reduction that can be triggered when the run NeXus data file is available. The + instrument-specific autoreduction process is configured by the instrument scientist. + Autoreduction is a step in the post-processing workflow. + + Cataloging + Cataloging of experiment data in ONCat. Cataloging is a step in the post-processing workflow. + + Live Data Server + Service that serves plots to the WebMon frontend. :term:`Livereduce` and + :term:`autoreduction` can publish plots for a run to Live Data Server. + + Livereduce + Service that monitors the live data stream and creates plots, which are published + to the :term:`Live Data Server`. + + IPTS + The Integrated Proposal Tracking System (IPTS) number is the experiment's proposal ID, which is + used for grouping runs. + + Workflow + Experiment data post-processing workflow. The available tasks are cataloging, autoreduction + and reduced data cataloging. + + Post-Processing Agent + Service that performs post-processing tasks like cataloging and autoreduction. + + Autoreducer + Server where an instance of :term:`Post-Processing Agent` is running. 
diff --git a/docs/index.rst b/docs/index.rst index b5ac721e..1d5b278e 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -11,6 +11,7 @@ Welcome to Web Monitor's documentation! .. toctree:: :maxdepth: 1 + glossary users/index releasenotes/index developer/index
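
Reviewer note on the heartbeat section added to communication_flows.rst: the topic convention and the frequencies in Table 2 could be illustrated with a small sketch like the one below. It is not taken from the Dasmon listener source. The names ``Heartbeat``, ``parse_heartbeat_topic`` and ``is_stale`` are hypothetical, the sketch assumes "STATUS" appears as a whole dot-separated segment of the topic (the documentation only says the name contains the string), and the three-missed-heartbeats threshold is an illustrative default rather than a documented setting.

```python
from dataclasses import dataclass
from typing import Optional

# Expected heartbeat intervals in seconds, copied from Table 2 of the patch.
EXPECTED_INTERVAL_S = {
    "WORKFLOW": 5,
    "AUTOREDUCE": 30,
    "DASMON": 5,
    "PVSD": 5,
    "SMS": 5,
}


@dataclass(frozen=True)
class Heartbeat:
    """Sender of a heartbeat, derived from the topic name (hypothetical type)."""

    scope: str    # e.g. "SNS.COMMON" or "HFIR.CG3"
    service: str  # e.g. "WORKFLOW" or "PVSD"


def parse_heartbeat_topic(topic: str) -> Optional[Heartbeat]:
    """Return the heartbeat sender for a STATUS topic, or None for other topics.

    Assumption: "STATUS" is a whole dot-separated segment, preceded by the
    scope and followed by the service name.
    """
    parts = topic.split(".")
    if "STATUS" not in parts:
        return None
    idx = parts.index("STATUS")
    if idx == 0 or idx + 1 >= len(parts):
        return None
    return Heartbeat(scope=".".join(parts[:idx]), service=parts[idx + 1])


def is_stale(service: str, seconds_since_last: float, missed_allowed: int = 3) -> bool:
    """True if more than `missed_allowed` expected heartbeats have been missed.

    The default threshold is an assumption made for this sketch only.
    """
    return seconds_since_last > missed_allowed * EXPECTED_INTERVAL_S[service]
```

With these definitions, ``parse_heartbeat_topic("HFIR.CG3.STATUS.PVSD")`` yields the scope ``HFIR.CG3`` and the service ``PVSD``, while a workflow topic such as ``POSTPROCESS.DATA_READY`` is not treated as a heartbeat.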