-
Notifications
You must be signed in to change notification settings - Fork 3
/
labopDataModel.tex
419 lines (363 loc) · 24.4 KB
/
labopDataModel.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
%
\normalsize%
\subsection{Protocol}%
\label{sec:labop:Protocol}%
A \labop{Protocol} describes how to carry out some form of laboratory or research process.
For example, a \labop{Protocol} could describe DNA miniprep, Golden-Gate assembly, a cell culture experiment.
At present this class adds no additional information over \uml{Activity}, but may in the future.%
\newline%
\linebreak%
\begin{figure}[h!]%
\centering%
\includegraphics[width=0.21595744680851064\textwidth]{labop_classes/Protocol_abstraction_hierarchy.pdf}%
\caption{Protocol}%
\label{fig:Protocol}%
\end{figure}
%
The \labop{Protocol} class is shown in \ref{fig:Protocol}. It is derived from \uml{Activity}.%
%
\subsection{Primitive}%
\label{sec:labop:Primitive}%
A \labop{Primitive} describes a library function that acts as a basic ``building block'' for a \labop{Protocol}.
For example, a \labop{Primitive} could describe pipetting, measuring absorbance in a plate reader, or centrifuging.
At present this class adds no additional information over \uml{Behavior}, but may in the future.%
\newline%
\linebreak%
\begin{figure}[h!]%
\centering%
\includegraphics[width=0.21893617021276596\textwidth]{labop_classes/Primitive_abstraction_hierarchy.pdf}%
\caption{Primitive}%
\label{fig:Primitive}%
\end{figure}
%
The \labop{Primitive} class is shown in \ref{fig:Primitive}. It is derived from \uml{Behavior}.%
%
\subsection{BehaviorExecution}%
\label{sec:labop:BehaviorExecution}%
A \labop{BehaviorExecution} is a record of how a \labop{Protocol}, \labop{Primitive}, or other \uml{Behavior} was carried out.
The execution of the behavior could be either real or simulated.
In specifying a \labop{BehaviorExecution}, the prov:type field inherited from \prov{Activity} is used to indicate the
\uml{Behavior} whose execution is being recorded. Precisely \sbol{one} value of prov:type MUST be a URI for a \uml{Behavior}.
The prov:startedAtTime and prov:endedAtTime fields SHOULD be used to record timing information as this becomes
available.
Finally, the entity carrying out the execution SHOULD be recorded as a \prov{Agent} indicated using a
\prov{Association}.
Note that a \labop{BehaviorExecution} can be used to record both the state of an in-progress execution as well as an
execution that has completed. As a \labop{BehaviorExecution} proceeds, all values of its properties are monotonic,
i.e., they are only added to and never changed.
TODO: need to changing completedNormally to allow indication of an in-progress BehaviorExecution
TODO: Is there a good ontology for agent roles in association?%
\newline%
\linebreak%
\begin{figure}[h!]%
\centering%
\includegraphics[width=0.7327659574468085\textwidth]{labop_classes/BehaviorExecution_abstraction_hierarchy.pdf}%
\caption{BehaviorExecution}%
\label{fig:BehaviorExecution}%
\end{figure}
%
The \labop{BehaviorExecution} class is shown in \ref{fig:BehaviorExecution}. It is derived from \prov{Activity} and includes the following specializations: \labop{ProtocolExecution}. %
This class includes the following properties: \labop{completedNormally}, \labop{parameterValuePair}, \labop{consumedMaterial}. %
\begin{itemize}%
\item%
The \labop{parameterValuePair} property is OPTIONAL and contains URI references to associated objects of type ParameterValueThe parameterValuePair property is used to record the value that was associated with each
uml:Parameter for the uml:Behavior when it was executed, by means of a ParameterValue object.
Any uml:Parameter that is not listed is assumed to have had no value assigned. Conversely, every non-optional
uml:Parameter for the uml:Behavior MUST have an associated parameter value.
Finally, note that this applies both to input uml:Parameter objects, whose value is set before execution begins,
and to output uml:Parameter objects, whose value is set by the time execution ends.
TODO: are multiple values allowed, or do those need to be passed as list/set types?
%
\item%
The \labop{consumedMaterial} property is OPTIONAL and contains URI references to associated objects of type MaterialThis property is used to record the noteworthy consumables used during the execution of the
Behavior. For example, a cell culture protocols will consume various reagents and samples of cells. Materials
with the same specification SHOULD be consolidated, such that the list of materials SHOULD NOT contain two
materials with the same specification.
For example, consuming 5.0 mL of PBS and 2.0 mL of PBS should be recorded as consuming 7.0 mL of PBS.
Complex materials, however, MAY contain the same material more than once in their substructure.
For example, M9 media contains glucose, but it would not be necessary to consolidate the glucose in M9 media
with additional glucose that was added as a supplement, since that would change the definition of the media.%
\item%
The \labop{completedNormally} property is REQUIRED and has a singleton value of type booleanThis boolean should be set to true if the Behavior completed normally and false if there
was some exception condition. At present, no further information is being encoded about exceptions, but this
is an extension that is anticipated for the future.%
\end{itemize}%
\subsubsection{ProtocolExecution}%
\label{sec:labop:ProtocolExecution}%
A \labop{ProtocolExecution} expands on the information in a \labop{BehaviorExecution} by including records for
the nodes and edges defining the Protocol's behavior as a \uml{Activity}. Specifically, the execution property
is used to record each firing of a \uml{ActivityNode} and the flow property is used to record each time a token
moves along a \uml{ActivityEdge}.
Otherwise, a \labop{ProtocolExecution} is used exactly the same way as its parent class \labop{BehaviorExecution}.
TODO: consider dropping the protocol field as redundant with use prov:type field in its parent%
\newline%
\linebreak%
The \labop{ProtocolExecution} class is shown in \ref{fig:BehaviorExecution}. It is derived from \labop{BehaviorExecution}.%
This class includes the following properties: \labop{execution}, \labop{protocol}, \labop{flow}. %
\begin{itemize}%
\item%
The \labop{execution} property is OPTIONAL and contains URI references to associated objects of type ActivityNodeExecutionEach instance of this property links to an ActivityNodeExecution that records one
firing of a uml:ActivityNode during the execution of its containing Protocol%
\item%
The \labop{protocol} property is REQUIRED and contains a URI reference to an associated object of type ProtocolThis property appears to be redundant with the use of prov:type specified by BehaviorExecution, and is likely to be deleted%
\item%
The \labop{flow} property is OPTIONAL and contains URI references to associated objects of type ActivityEdgeFlowEach instance of this property links to an ActivityEdgeFlow that records one movement of a UML
token along a uml:ActivityEdge during the execution of its containing Protocol%
\end{itemize}%
\subsection{ParameterValue}%
\label{sec:labop:ParameterValue}%
This class is used to represent the assignment of a value to a parameter in a BehaviorExecution
that records the execution of a \uml{Behavior}. This class is similar to \prov{Usage}, but instead of always
pointing to an object it uses an arbitrary literal (which might or might not be an object). An example would
be recording that a plate reader absorbance measurement was taken with its absorbance wavelength parameter set
to 600 nm%
\newline%
\linebreak%
\begin{figure}[h!]%
\centering%
\includegraphics[width=0.42148936170212764\textwidth]{labop_classes/ParameterValue_abstraction_hierarchy.pdf}%
\caption{ParameterValue}%
\label{fig:ParameterValue}%
\end{figure}
%
The \labop{ParameterValue} class is shown in \ref{fig:ParameterValue}. It is derived from \sbol{Identified}.%
This class includes the following properties: \labop{parameter}, \labop{parameterValue}. %
\begin{itemize}%
\item%
The \labop{parameter} property is REQUIRED and contains a URI reference to an associated object of type ParameterThis property points to the uml:Parameter associated with the value (e.g., wavelength for a
plate reader absorbance measurement behavior).%
\item%
The \labop{parameterValue} property is REQUIRED and contains a URI reference to an associated object of type LiteralSpecificationThis property points to the literal value used for the parameter during execution (e.g., a
uml:LiteralIdentified for an om:Measure representing a 600 nm wavelength).%
\end{itemize}%
\subsection{Material}%
\label{sec:labop:Material}%
An amount of material allocated for use during the execution of a behavior.
For example a \labop{Material} might be used to specify 1 96-well flat-bottom microplate or 2.5 mL of 10 millimolar glucose.
TODO: consider changing type of specification to allow non-TopLevel descriptions, such as a \labop{ContainerSpec} or sbol:ExternallyDefined
TODO: consider adding a field to distinguish between expended vs. reusable materials.%
\newline%
\linebreak%
\begin{figure}[h!]%
\centering%
\includegraphics[width=0.35893617021276597\textwidth]{labop_classes/Material_abstraction_hierarchy.pdf}%
\caption{Material}%
\label{fig:Material}%
\end{figure}
%
The \labop{Material} class is shown in \ref{fig:Material}. It is derived from \sbol{Identified}.%
This class includes the following properties: \labop{amount}, \labop{specification}. %
\begin{itemize}%
\item%
The \labop{amount} property is REQUIRED and contains a URI reference to an associated object of type MeasureThe amount property of a Material is used to indicate the quantity of material used.
For example, 2.5 mL (referring to a fluid) or 3 (with unit "number", referring to a group of microplates)%
\item%
The \labop{specification} property is REQUIRED and contains a URI reference to an associated object of type TopLevelThe specification property is used to indicate the type of material used.
For example a DNA sample would be described by an sbol:Component.
TODO: add example for glucose and for 96-well plate%
\end{itemize}%
\subsection{ActivityEdgeFlow}%
\label{sec:labop:ActivityEdgeFlow}%
An \labop{ActivityEdgeFlow} records \sbol{one} movement of a UML token along a \uml{ActivityEdge} during the
execution of its containing \labop{Protocol}. If the edge is a \uml{ObjectFlow}, then the value MUST be set.
If the edge is a \uml{ControlFlow}, then the value MUST NOT be set.
For instance, the \labop{ActivityEdgeFlow} for a \uml{ObjectFlow} might record a measurement being sent to an output
\uml{Parameter}, while the \labop{ActivityEdgeFlow} for a \uml{ControlFlow} might record a decision to proceed down a
particular branch from a \uml{DecisionNode}.
Note that a \uml{ActivityEdge} might appear in multiple \labop{ActivityEdgeFlow} records associated with a single
\labop{ProtocolExecution}, e.g., due to a loop in the \labop{Protocol}. It also might not appear in any, if the
\uml{ActivityEdge} is on a path not taken due to branching control flow.
TODO: correct the cardinality: edgeValue is supposed to be optional, not edge
%
\newline%
\linebreak%
\begin{figure}[h!]%
\centering%
\includegraphics[width=0.4319148936170213\textwidth]{labop_classes/ActivityEdgeFlow_abstraction_hierarchy.pdf}%
\caption{ActivityEdgeFlow}%
\label{fig:ActivityEdgeFlow}%
\end{figure}
%
The \labop{ActivityEdgeFlow} class is shown in \ref{fig:ActivityEdgeFlow}. It is derived from \sbol{Identified}.%
This class includes the following properties: \labop{tokenSource}, \labop{edge}, \labop{edgeValue}. %
\begin{itemize}%
\item%
The \labop{tokenSource} property is REQUIRED and contains a URI reference to an associated object of type ActivityNodeExecutionThis property is used to indicate the ActivityNodeExecution that produced the token.%
\item%
The \labop{edge} property is OPTIONAL and contains a URI reference to an associated object of type ActivityEdgeThis property is used to indicate the uml:ActivityEdge down which the token moved.%
\item%
The \labop{edgeValue} property is REQUIRED and contains a URI reference to an associated object of type IdentifiedThis property is used to indicate the value of a token that moved on a uml:ObjectFlow edge.%
\end{itemize}%
\subsection{ActivityNodeExecution}%
\label{sec:labop:ActivityNodeExecution}%
An \labop{ActivityNodeExecution} records \sbol{one} instance in which a \uml{ActivityNode} is executed during the
execution of its containing \labop{Protocol}.
For instance, the \labop{ActivityNodeExecution} for a \uml{CallBehaviorAction} to measure absorbance on a plate reader
would set its node property to point to the \uml{CallBehaviorAction} and might have incomingFlow properties
indicating arrival of information about the samples to measure via a \uml{ObjectFlow} and the arrival a
of permission to begin via a \uml{ControlFlow}.
Note that a \uml{ActivityNode} might appear in multiple \labop{ActivityNodeExecution} records associated with a single
\labop{ProtocolExecution}, e.g., due to a loop in the \labop{Protocol}. It also might not appear in any, if the
\uml{ActivityNode} is on a path not taken due to branching control flow.%
\newline%
\linebreak%
\begin{figure}[h!]%
\centering%
\includegraphics[width=0.6061702127659574\textwidth]{labop_classes/ActivityNodeExecution_abstraction_hierarchy.pdf}%
\caption{ActivityNodeExecution}%
\label{fig:ActivityNodeExecution}%
\end{figure}
%
The \labop{ActivityNodeExecution} class is shown in \ref{fig:ActivityNodeExecution}. It is derived from \sbol{Identified} and includes the following specializations: \labop{CallBehaviorExecution}. %
This class includes the following properties: \labop{node}, \labop{incomingFlow}. %
\begin{itemize}%
\item%
The \labop{node} property is REQUIRED and contains a URI reference to an associated object of type ActivityNodeThis property is used to indicate the uml:ActivityNode that has been execcuted.%
\item%
The \labop{incomingFlow} property is OPTIONAL and contains URI references to associated objects of type ActivityEdgeFlowThis property is used to indicate an ActivityEdgeFlow that delivered a token consumed during
the execution of the uml:ActivityNode.%
\end{itemize}%
\subsubsection{CallBehaviorExecution}%
\label{sec:labop:CallBehaviorExecution}%
A \labop{CallBehaviorExecution} extends \labop{ActivityNodeExecution} by adding a pointer to a BehaviorExecution
record for the \uml{Behavior} that is being executed.
For a primitive action (e.g., measuring absorbance on a plate reader), this is a plain \labop{BehaviorExecution},
while for calling a \labop{Protocol} as a sub-routine (e.g., to run a stage of an Type IIS assembly), this would be a
\labop{ProtocolExecution}.%
\newline%
\linebreak%
The \labop{CallBehaviorExecution} class is shown in \ref{fig:ActivityNodeExecution}. It is derived from \labop{ActivityNodeExecution}.%
This class includes the following properties: \labop{call}. %
\begin{itemize}%
\item%
The \labop{call} property is REQUIRED and contains a URI reference to an associated object of type BehaviorExecutionThis property indicates the BehaviorExecution record for the uml:Behavior that was called.%
\end{itemize}%
\subsection{SampleCollection}%
\label{sec:labop:SampleCollection}%
SampleCollection is the base class for describing the collections of physical materials that are
acted upon by a \labop{Protocol}. For example, a \labop{SampleCollection} might describe a set of 10 cell cultures growing in
96-well plate cells, or a set of 6 streaked agar plates, or a single 500 mL flask filled with media.
There are two types of \labop{SampleCollection}. A \labop{SampleArray} specifies an n-dimensional rectangular array of samples,
all stored in the same type of container. A \labop{SampleMask} specifies a subset of a \labop{SampleCollection} by means of an
array of Boolean values indicating whether each element is included or excluded from the subset.
Note, however, that a \labop{SampleCollection} is a logical object and not a physical object. Thus, while a
\labop{SampleCollection} might describe a set of samples in 96-well plate wells, it does not necessarily identify
a particular 96-well plate or the location of those wells. In practice, these will be determined as a
result of the specific library calls made to generate \labop{SampleCollection} objects, and may not be determined
until the protocol is actually run in a particular execution environment.
This is important for increasing the flexibility with which a \labop{Protocol} can be specified and applied.
Consider, for example, a cell culturing protocol that includes a step to measure \sbol{sample} absorbance on a plate
reader. Describing this step does not require knowing how the samples are laid out on the plate, and in many
cases is even acceptable to run on samples across multiple plates. This flexibility will allow the cell
culturing protocol to be applied for experiments with different numbers and arrangements of samples.%
\newline%
\linebreak%
\begin{figure}[h!]%
\centering%
\includegraphics[width=0.40212765957446805\textwidth]{labop_classes/SampleCollection_abstraction_hierarchy.pdf}%
\caption{SampleCollection}%
\label{fig:SampleCollection}%
\end{figure}
%
The \labop{SampleCollection} class is shown in \ref{fig:SampleCollection}. It is derived from \sbol{Identified} and includes the following specializations: \labop{SampleArray}, \labop{SampleMask}. %
%
\subsubsection{SampleArray}%
\label{sec:labop:SampleArray}%
A \labop{SampleArray} specifies an n-dimensional rectangular array of samples, all stored in the same
type of container. For example, a \labop{SampleCollection} might describe a set of 10 cell cultures growing in
96-well plate cells, or a set of 6 streaked agar plates, or a single 500 mL flask filled with media.
Wells may be full, in which case the contents property should contain a URI to a description of the \sbol{sample},
or empty, in which case the contents should be null.
Note that this is a logical array, and does not necessarily indicate the actual layout of the samples in space.
For example, a 2x4 array of samples in 96-well plate wells might end up being laid out as a 2x4 array in wells
A1 to B4 or as a 2x4 array in wells G5 to H8 or as an 8x1 column in wells A1 to H1, or even as eight wells
scattered arbitrarily around the plate according to an anti-bias quality control schema.
This also allows for higher-dimensional arrays where each dimension represents an experimental factor.
For example, an experiment testing four factors with 3, 3, 4, and 5 values per factor, for a total of 180
combinations, could be represented as a 4-dimensional \sbol{sample} array of 96-well plate wells, and then end up
laid out over two plates.
TODO: need to decide on the format of the contents description.%
\newline%
\linebreak%
The \labop{SampleArray} class is shown in \ref{fig:SampleCollection}. It is derived from \labop{SampleCollection}.%
This class includes the following properties: \labop{contents}, \labop{containerType}. %
\begin{itemize}%
\item%
The \labop{contents} property is REQUIRED and has a singleton value of type stringDescription of the contents.
TODO: need to decide whether this is a multi-valued property with associated array coordinates or a
single-valued property with an array value.
Currently set to string as a "dummy" value that can serialize anything.%
\item%
The \labop{containerType} property is REQUIRED and has a singleton value of type URI%
\end{itemize}%
\subsubsection{SampleMask}%
\label{sec:labop:SampleMask}%
A \labop{SampleMask} is a subset of a \labop{SampleCollection}. The subset of samples to be included is defined
by an array of Boolean values, where true values indicate that a \sbol{sample} is included and false values indicate
that it is excluded.
The dimensions of the mask MUST be identical to the dimensions of the source \labop{SampleCollection}. For this purpose,
the dimensions of a masked subset are not reduced, but remain the same as the original \labop{SampleArray}. This allows
masks to be composed, such that SampleMask(source=SampleMask(source=X,mask=mask1),mask=mask2) is equivalent to
SampleMask(source=X,mask=mask1 AND mask2). Note that this implies masks are commutative and idempotent.%
\newline%
\linebreak%
The \labop{SampleMask} class is shown in \ref{fig:SampleCollection}. It is derived from \labop{SampleCollection}.%
This class includes the following properties: \labop{mask}, \labop{source}. %
\begin{itemize}%
\item%
The \labop{source} property is REQUIRED and contains a URI reference to an associated object of type SampleCollectionThe source indicates the SampleCollection that is being subsetted via the mask%
\item%
The \labop{mask} property is REQUIRED and has a singleton value of type stringThe mask is an N-dimensional array of Booleans values, where each Boolean indicates whether the
sample at the corresponding location in the source is included in the subset.
TODO: format of mask array needs to match the array format chosen for the SampleArray contents property%
\end{itemize}%
\subsection{SampleData}%
\label{sec:labop:SampleData}%
The \labop{SampleData} class is used to associate a set of data with a collection of samples.
This is typically used to capture measurements, e.g., an array of absorbance measurements collected by
a plate reader. Using this data structure allows the values in a dataframe to be automatically linked to
the descriptions of the samples that the data describes, which is critical for data analysis.
The dimensions of the sampleDataValues MUST equal the dimensions of the \labop{SampleCollection} linked with fromSamples.
TODO: the format of the data values needs to be compatible with the array format chosen for the
\labop{SampleArray} contents property. In this case, however, we also need to consider how we want to support
multiple values for each \sbol{sample} (e.g., measurement of both fluorescence and absorbance in a plate reader),
as well as links to more complex data (e.g., results of flow cytometry or omics for each sample)%
\newline%
\linebreak%
\begin{figure}[h!]%
\centering%
\includegraphics[width=0.3946808510638298\textwidth]{labop_classes/SampleData_abstraction_hierarchy.pdf}%
\caption{SampleData}%
\label{fig:SampleData}%
\end{figure}
%
The \labop{SampleData} class is shown in \ref{fig:SampleData}. It is derived from \sbol{Identified}.%
This class includes the following properties: \labop{fromSamples}, \labop{sampleDataValues}. %
\begin{itemize}%
\item%
The \labop{fromSamples} property is REQUIRED and contains a URI reference to an associated object of type SampleCollectionThe fromSamples property indicates the SampleCollection from which the data were collected.%
\item%
The \labop{sampleDataValues} property is REQUIRED and contains a URI reference to an associated object of type stringThe sampleDataValues are an array of data values, one for each sample, format to be determined.%
\end{itemize}%
\subsection{ContainerSpec}%
\label{sec:labop:ContainerSpec}%
A \labop{ContainerSpec} is used to indicate the type of container to be used for a \labop{SampleArray}, e.g.,
a standard 96-well flat-bottom transparent plate.
TODO: determine if we want to use this format or modify it in some way.%
\newline%
\linebreak%
\begin{figure}[h!]%
\centering%
\includegraphics[width=0.29936170212765956\textwidth]{labop_classes/ContainerSpec_abstraction_hierarchy.pdf}%
\caption{ContainerSpec}%
\label{fig:ContainerSpec}%
\end{figure}
%
The \labop{ContainerSpec} class is shown in \ref{fig:ContainerSpec}. It is derived from \sbol{Identified}.%
This class includes the following properties: \labop{prefixMap}, \labop{queryString}. %
\begin{itemize}%
\item%
The \labop{prefixMap} property is REQUIRED and has a singleton value of type stringA prefix map in JSON-LD format, to be applied to a queryString.%
\item%
The \labop{queryString} property is REQUIRED and has a singleton value of type stringA query string, in OWL Manchester syntax, to be used to find matching containers in the ContainerSpec.%
\end{itemize}%