% -----------------------------------------------------------------------------
\section{Purpose}
% -----------------------------------------------------------------------------
Laboratory protocols are critical to biological research and development, but can be difficult to communicate or reproduce due to the differences in context, skills, and resources between different projects, investigators, and organizations.
The Laboratory Open Protocol language (LabOP) aims to address this problem by providing a data model for description of laboratory protocols that is unambiguous enough for precise interpretation and automation, yet simultaneously abstract enough to support reuse and adaptation.
\subsection{Requirements}
To clarify the requirements on LabOP as an artifact, the following list unpacks some of the key computational tasks that LabOP must support in order to further the goals above:
\begin{description}
\item[Instantiation] An \emph{abstract} protocol specifies a recipe to be followed in the laboratory. A concrete \emph{instance} of a protocol is an execution of such a recipe at a specific time and place, on a specific set of equipment, etc. This implies at least two representational requirements:
\begin{description}
\item[Execution Record] We must be able to create some persistent data structure that records an instance in which a protocol is executed.
\item[Derivation] We must be able to record the relationship between the execution record for a protocol instance and the protocol recipe from which it is derived.
\end{description}
\item[Metadata markup] Closely related to instantiation, a LabOP protocol description should support automatic metadata tagging of data that are collected in the course of protocol execution, \emph{to the extent lab automation at the executing lab supports this}.
For example, a protocol that collects flow cytometry data on samples of different strains under varying growth conditions should support automatic association of each FCS file produced by the flow cytometer with the strain, growth conditions, time of data collection, calibration, etc. of the sample from which the FCS file was produced.
\item[Mapping a protocol to a laboratory] LabOP aims to support replication in part by providing automated support for mapping protocols developed at one lab onto another.
%We suspect that for some time to come this will be at best a semi-automated process, and like verification, this task will fall heavily on those who provide tool support for LabOP users. However, also as for verification,
Note, however, that LabOP intentionally will not attempt to provide explicit guarantees about transfer or replicability of protocol executions.
Such guarantees cannot be made for a poorly understood or overly delicate protocol, no matter how it is described.
Rather, the LabOP specification must allow a protocol to say how to {\em predict} if a mapping will produce a correct execution and how to {\em check} if an execution should be considered correct, i.e., what a protocol specification truly requires, what aspects can safely be varied, which must be honored, and what reasonable tolerances are for inputs and outputs.
\item[Execution] To enable replication, LabOP must specify what it means to execute a protocol correctly. This definition must support a wide range of degrees of lab automation, from entirely manual, to manual supported by some automated workstation, to entirely robotic. Specifically, this involves giving a \emph{compositional} account of the execution semantics, in which the meaning of the protocol is a function of its component actions and the relationships between them. This also involves supporting \emph{translation} from LabOP to alternative representations for different uses. For example, an initial LabOP proof of concept translates LabOP to Markdown\footnote{See \url{https://github.com/SD2E/labop} for the translator; \url{https://commonmark.org/} gives the Markdown specification.} for labs operated by technicians, and an initial translation to Autoprotocol\footnote{\url{autoprotocol.org}} targets wholly robotic laboratories using that control standard.
\item[Modification] We must be able to record modifications made to an original protocol. For example, a protocol may be modified so that it can be executed at a lab with different equipment from the lab where it originated, so that it can be scaled up or scaled down, etc.
A use pattern that may end up being common is for a protocol to be ``too strict'' (too specific) to be instantiated in a lab, and thus to need modification. Rather than applying a patch, as one might do in programming, the user would generalize the original protocol \emph{into a new protocol} that can be instantiated both in the labs where it originated \emph{and} in new labs. This new protocol would be released as an updated version of the existing protocol.
In this context, the term ``modification'' is intended to refer to \emph{tracking modifications}: what is required is a combination of \emph{versioning} and \emph{provenance tracking}. A method for computing understandable version differences would also be part of this high-level task, which encompasses ongoing development and versioning of the protocol.
A key requirement here will be maintaining information about the \emph{specific version} of the protocol used in an execution record, and underlying a particular dataset, in order to ensure that new versions released later do not cause incorrect data interpretation.
\item[Authoring derived works] As with most complex formal artifacts (programs, spreadsheets, etc.), we confidently expect that few users will create new protocols from an entirely blank slate.
Instead, many will take an existing protocol, copy it, and edit the copy, possibly by copying and editing material from yet other protocols. LabOP must support this mode of operation.
Key issues here are determining which component structures can be incorporated by reference and which must be copied instead. Tracking the relationship between an original protocol and the protocols that have branched from it would also be desirable, in order to support, for example, propagating fixes from a root protocol to others.
\item[Protocol maintenance] Closely related to the ``Modification'' and ``Authoring derived works'' tasks, LabOP also needs to support the ongoing development, repair, and versioning of a protocol.
\item[Verification] Authoring a formal protocol, in LabOP or otherwise, requires substantial care and effort, and the usefulness of the protocol can be compromised if its specification is ill-formed or erroneous.
Supporting protocol authors in achieving correctness is an important goal.
Some of this will fall onto the shoulders of those building LabOP support tools; however, the specification itself must provide guidance as to what it means for a protocol to be complete, consistent, etc.
\item[Planning and scheduling] Laboratory resources are valuable, and some organizations will want to be able to optimize their use.
To do so, LabOP should support (at least) extraction of resource requirements and estimated durations from the activities in the protocol. Note that \emph{which} resource requirements and duration estimates are required will likely be a function of both the protocol and the lab.
Which resources are limited, and must be considered in a planner or scheduler, versus those that can be effectively treated as unlimited, will vary by lab. Management styles will also vary between organizations.
\end{description}
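The Instantiation, Metadata markup, and Modification requirements above can be made concrete with a minimal sketch.
The Python fragment below is illustrative only and is \emph{not} the LabOP data model: the class names \texttt{Protocol} and \texttt{ExecutionRecord}, their fields, the helper functions, and the example URI are all hypothetical.
It assumes that a protocol is identified by a URI plus an explicit version number, and that an execution record pins the specific version from which it was derived.
\begin{verbatim}
```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Protocol:
    """Abstract recipe, identified by a URI plus an explicit version."""
    uri: str
    version: int
    steps: tuple


@dataclass
class ExecutionRecord:
    """Persistent record of one execution (instance) of a protocol."""
    protocol_uri: str      # Derivation: the recipe this instance came from
    protocol_version: int  # the specific version, pinned at execution time
    lab: str
    started_at: datetime
    data_files: dict = field(default_factory=dict)  # file name -> metadata

    def tag(self, file_name: str, **sample_metadata) -> None:
        """Associate a collected data file with metadata about its sample."""
        self.data_files[file_name] = dict(sample_metadata)


def execute(protocol: Protocol, lab: str,
            started_at: datetime) -> ExecutionRecord:
    """Instantiate a protocol as a record derived from one version."""
    return ExecutionRecord(protocol.uri, protocol.version, lab, started_at)


# Hypothetical protocol and execution, for illustration only.
v1 = Protocol("https://example.org/protocols/growth_curve", 1,
              ("inoculate", "measure"))
record = execute(v1, lab="Lab A",
                 started_at=datetime(2023, 1, 1, tzinfo=timezone.utc))
record.tag("sample1.fcs", strain="strain_1", growth_condition="37C")

# A later release of the same protocol does not alter the existing record.
v2 = Protocol(v1.uri, 2, ("inoculate", "incubate", "measure"))
```
\end{verbatim}
In this sketch the execution record stores the protocol version by value rather than referring to whatever version is latest, so releasing \texttt{v2} leaves the earlier record, and the data files tagged through it, tied to version 1.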
\subsection{Implementation Approach and Scope}
Where possible, LabOP builds on other existing standards.
In particular, LabOP uses the Unified Modeling Language (UML) version 2.5.1~\citep{uml251} to describe the organization of actions in a protocol, the Synthetic Biology Open Language (SBOL) version 3~\citep{SBOL3} to describe biological materials in terms of combinations of strains, media, etc., and the Ontology of Units of Measure (OM)~\citep{om2} to describe parameters with physical units.
As a foundation, LabOP uses existing Semantic Web practices and resources, such as \emph{Uniform Resource Identifiers} (\labop{URI}s) and ontologies, to unambiguously identify and define biological system elements and to provide serialization formats for encoding this information in electronic data files.
LabOP also adopts the SBOL approach to closure in reasoning about knowledge.
This approach also allows LabOP to be extended with additional custom information for particular uses and deployments.
Note that LabOP is intentionally agnostic to any details of computer networking used to discover, share, or otherwise operate with protocols, in order to maximize flexibility in available options for implementation of such services.