forked from oasis-tcs/virtio-spec
-
Notifications
You must be signed in to change notification settings - Fork 0
/
virtio-mem.tex
483 lines (351 loc) · 18.7 KB
/
virtio-mem.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
\section{Memory Device}\label{sec:Device Types / Memory Device}
The virtio memory device provides and manages a memory region in guest
physical address space. This memory region is partitioned into memory
blocks of fixed size that can either be in the state plugged or unplugged.
Once plugged, a memory block can be used like ordinary RAM. The driver
selects memory blocks to (un)plug and requests the device to perform the
(un)plug.
The device requests the driver to plug a certain amount of memory, by
setting the \field{requested_size} in the device configuration, which can
change at runtime. It is up to the device driver to fulfill this request
by (un)plugging memory blocks. Once the \field{plugged_size} is greater or
equal to the \field{requested_size}, requests to plug memory blocks will be
rejected by the device.
The device-managed memory region is split into two parts, the usable region
and the unusable region. All memory blocks in the unusable region are
unplugged and requests to plug them will be rejected. The device will grow
the usable region to fit the \field{requested_size}. Usually, the usable
region is bigger than the \field{requested_size} of the device, to give the
driver some flexibility when selecting memory blocks to plug.
On initial start, and after a system reset, all memory blocks are
unplugged. In corner cases, memory blocks might still be plugged after a
system reset, and the driver usually requests to unplug all memory while
initializing, before starting to select memory blocks to plug.
The device-managed memory region is not exposed as RAM via other firmware
/ hw interfaces (e.g., e820 on x86). The driver is responsible for
deciding how plugged memory blocks will be used. A common use case is to
expose plugged memory blocks to the operating system as system RAM,
available for the page allocator.
Some platforms provide memory properties for system RAM that are usually
queried and modified using special CPU instructions. Memory properties might
be implicitly queried or modified on memory access. Memory properties can
include advanced memory protection, access and change indication, or memory
usage indication relevant in virtualized environments. \footnote{For example,
s390x provides storage keys for each 4 KiB page and may, depending on the
configuration, provide storage attributes for each 4 KiB page.} The device
provides the exact same properties with the exact same semantics for
plugged device memory as available for comparable RAM in the same configuration.
\subsection{Device ID}\label{sec:Device Types / Memory Device / Device ID}
24
\subsection{Virtqueues}\label{sec:Device Types / Memory Device / Virtqueues}
\begin{description}
\item[0] guest-request
\end{description}
\subsection{Feature bits}\label{sec:Device Types / Memory Device / Feature bits}
\begin{description}
\item[VIRTIO_MEM_F_ACPI_PXM (0)] The field \field{node_id} in the device
configuration is valid and corresponds to an ACPI PXM.
\item[VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE (1)] The driver is not allowed to
access unplugged memory. \footnote{On platforms with memory properties that
might get modified implicitly on memory access, this feature is expected to
be offered by the device.}
\end{description}
\subsection{Device configuration layout}\label{sec:Device Types / Memory Device / Device configuration layout}
All fields of this configuration are always available and read-only for the
driver.
\begin{lstlisting}
struct virtio_mem_config {
le64 block_size;
le16 node_id;
le8 padding[6];
le64 addr;
le64 region_size;
le64 usable_region_size;
le64 plugged_size;
le64 requested_size;
};
\end{lstlisting}
\begin{description}
\item[\field{block_size}] is the size and the alignment in bytes of a
memory block. Cannot change.
\item[\field{node_id}] has no meaning without VIRTIO_MEM_F_ACPI_PXM. With
VIRTIO_MEM_F_ACPI_PXM, this field is valid and corresponds to an ACPI PXM.
Cannot change.
\item[\field{padding}] has no meaning and is reserved for future use.
\item[\field{addr}] is the guest physical address of the start of the
device-managed memory region in bytes. Cannot change.
\item[\field{region_size}] is the size of device-managed memory region in
bytes. Cannot change.
\item[\field{usable_region_size}] is the size of the usable device-managed
memory region. Can grow up to \field{region_size}. Can only shrink due to
VIRTIO_MEM_REQ_UNPLUG_ALL requests.
\item[\field{plugged_size}] is the amount of plugged memory in bytes within
the usable device-managed memory region.
\item[\field{requested_size}] is the requested amount of plugged memory
within the usable device-managed memory region.
\end{description}
\drivernormative{\subsubsection}{Device configuration layout}{Device Types / Memory Device / Device configuration layout}
The driver MUST NOT write to device configuration fields.
The driver MUST ignore the value of \field{padding}.
The driver MUST ignore the value of \field{node_id} without
VIRTIO_MEM_F_ACPI_PXM.
\devicenormative{\subsubsection}{Device configuration layout}{Device Types / Memory Device / Device configuration layout}
The device MAY change \field{usable_region_size} and
\field{requested_size}.
The device MUST NOT change \field{block_size}, \field{node_id},
\field{addr}, and \field{region_size}, except during a system reset.
The device MUST change \field{plugged_size} to reflect the size of plugged
memory blocks.
The device MUST set \field{usable_region_size} to \field{requested_size} or
greater.
The device MUST set \field{block_size} to a power of two.
The device MUST set \field{addr}, \field{region_size},
\field{usable_region_size}, \field{plugged_size}, \field{requested_size} to
multiples of \field{block_size}.
The device MUST set \field{region_size} to 0 or greater.
The device MUST NOT shrink \field{usable_region_size}, except when
processing an UNPLUG ALL request, or during a system reset.
The device MUST send a configuration update notification when changing
\field{usable_region_size} or \field{requested_size}, except when
processing an UNPLUG ALL request.
The device SHOULD NOT send a configuration update notification when
changing \field{plugged_size}.
The device MAY send a configuration update notification even if nothing
changed.
\subsection{Device Initialization}\label{Device Types / Memory Device / Device Initialization}
On initialization, the driver first discovers the device's virtqueues. It
then reads the device configuration.
In case the driver detects that the device still has memory plugged
(\field{plugged_size} in the device configuration is greater than 0), the
driver will either try to re-initialize by issuing STATE requests, or try
to unplug all memory before continuing. Special handling might be
necessary in case some plugged memory might still be relevant (e.g., system
dump, memory still in use after unloading the driver).
\drivernormative{\subsubsection}{Device Initialization}{Device Types / Memory Device / Device Initialization}
The driver SHOULD accept VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE if it is
offered and the driver supports it.
The driver SHOULD issue UNPLUG ALL requests until successful if the device
still has memory plugged and the plugged memory is not in use.
\devicenormative{\subsubsection}{Device Initialization}{Device Types / Memory Device / Device Initialization}
A device MAY fail to operate further if VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE
is not accepted.
The device MUST NOT change the state of memory blocks during device reset.
The device MUST NOT modify memory or memory properties of plugged memory
blocks during device reset.
\subsection{Device Operation}\label{sec:Device Types / Memory Device / Device Operation}
The device notifies the driver about the amount of memory the device wants
the driver to consume via the device. These resize requests from the
device are communciated via the \field{requested_size} in the device
configuration. The driver will react by requesting to (un)plug specific
memory blocks, to make the \field{plugged_size} match the
\field{requested_size} as close as possible.
The driver sends requests to the device on the guest-request virtqueue,
notifies the device, and waits for the device to respond. Requests have a
common header, defining the request type, followed by request-specific
data. All requests are 24 bytes long and have the layout:
\begin{lstlisting}
struct virtio_mem_req {
le16 type;
le16 padding[3];
union {
struct virtio_mem_req_plug plug;
struct virtio_mem_req_unplug unplug;
struct virtio_mem_req_state state;
} u;
}
\end{lstlisting}
Possible request types are:
\begin{lstlisting}
#define VIRTIO_MEM_REQ_PLUG 0
#define VIRTIO_MEM_REQ_UNPLUG 1
#define VIRTIO_MEM_REQ_UNPLUG_ALL 2
#define VIRTIO_MEM_REQ_STATE 3
\end{lstlisting}
Responses have a common header, defining the response type, followed by
request-specific data. All responses are 10 bytes long and have the layout:
\begin{lstlisting}
struct virtio_mem_resp {
le16 type;
le16 padding[3];
union {
struct virtio_mem_resp_state state;
} u;
}
\end{lstlisting}
Possible response types, in general, are:
\begin{lstlisting}
#define VIRTIO_MEM_RESP_ACK 0
#define VIRTIO_MEM_RESP_NACK 1
#define VIRTIO_MEM_RESP_BUSY 2
#define VIRTIO_MEM_RESP_ERROR 3
\end{lstlisting}
\drivernormative{\subsubsection}{Device Operation}{Device Types / Memory Device / Device Operation}
The driver MUST NOT write memory or modify memory properties of
unplugged memory blocks.
The driver MUST NOT read memory or query memory properties of unplugged memory
blocks outside \field{usable_region_size}.
The driver MUST NOT read memory or query memory properties of unplugged memory
blocks inside \field{usable_region_size} via DMA.
If VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE has not been negotiated, the driver
SHOULD NOT read memory or query memory properties of unplugged memory blocks
inside \field{usable_region_size} via the CPU.
If VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE has been negotiated, the driver
MUST NOT read memory or query memory properties of unplugged memory blocks.
The driver MUST NOT request unplug of memory blocks while corresponding memory
or memory properties are still in use.
The driver SHOULD initialize memory blocks after plugging them, the content
is undefined.
The driver SHOULD react to resize requests from the device
(\field{requested_size} in the device configuration changed) by
(un)plugging memory blocks.
The driver SHOULD only plug memory blocks it can actually use.
The driver MAY not reach the requested size (\field{requested_size} in the
device configuration), for example, because it cannot free up any plugged
memory blocks to unplug them, or it would not be able to make use of
unplugged memory blocks after plugging them (e.g., alignment).
\devicenormative{\subsubsection}{Device Operation}{Device Types / Memory Device / Device Operation}
The device MUST provide the exact same memory properties with the exact same
semantics for device memory the platform provides in the same configuration for
comparable RAM.
The device MAY modify memory of unplugged memory blocks or reset memory
properties of such memory blocks to platform defaults at any time.
The device MUST NOT modify memory or memory properties of plugged memory
blocks.
The device MUST allow the driver to read and write memory and to query
and modify memory attributes of plugged memory blocks.
If VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE has not been negotiated, the device
MUST allow the driver to read memory and to query memory properties of
unplugged memory blocks inside \field{usable_region_size} via the CPU.
\footnote{To allow for simplified dumping of memory. The CPU is expected to
copy such memory to another location before starting DMA.}
The device MAY change the state of memory blocks during system resets.
The device SHOULD unplug all memory blocks during system resets.
\subsubsection{PLUG request}\label{sec:Device Types / Memory Device / Device Operation / PLUG request}
Request to plug consecutive memory blocks that are currently unplugged.
The request-specific data in a PLUG request has the format:
\begin{lstlisting}
struct virtio_mem_req_plug {
le64 addr;
le16 nb_blocks;
le16 padding[3];
}
\end{lstlisting}
\field{addr} is the guest physical address of the first memory block.
\field{nb_blocks} is the number of consecutive memory blocks
Responses don't have request-specific data defined.
\drivernormative{\paragraph}{PLUG request}{Device Types / Memory Device / Device Operation / PLUG request}
The driver MUST ignore anything except the response type in a response.
\devicenormative{\paragraph}{PLUG request}{Device Types / Memory Device / Device Operation / PLUG request}
The device MUST ignore anything except the request type and the
request-specific data in a request.
The device MUST ignore the \field{padding} in the request-specific data in
a request.
The device MUST reject requests with VIRTIO_MEM_RESP_ERROR if \field{addr}
is not aligned to the \field{block_size} in the device configuration, if
\field{nb_blocks} is not greater than 0, or if any memory block outside of
the usable device-managed memory region is covered by the request.
The device MUST reject requests with VIRTIO_MEM_RESP_ERROR if any memory
block covered by the request is already plugged.
The device MAY reject requests with VIRTIO_MEM_RESP_BUSY if the request can
currently not be processed.
The device MUST acknowledge requests with VIRTIO_MEM_RESP_ACK in case all
memory blocks were successfully plugged. The device MUST reflect the
change in the device configuration \field{plugged_size}.
\subsubsection{UNPLUG request}\label{sec:Device Types / Memory Device / Device Operation / UNPLUG request}
Request to unplug consecutive memory blocks that are currently plugged.
The request-specific data in an UNPLUG request has the format:
\begin{lstlisting}
struct virtio_mem_req_unplug {
le64 addr;
le16 nb_blocks;
le16 padding[3];
}
\end{lstlisting}
\field{addr} is the guest physical address of the first memory block.
\field{nb_blocks} is the number of consecutive memory blocks
Responses don't have request-specific data defined.
\drivernormative{\paragraph}{UNPLUG request}{Device Types / Memory Device / Device Operation / UNPLUG request}
The driver MUST ignore anything except the response type in a response.
\devicenormative{\paragraph}{UNPLUG request}{Device Types / Memory Device / Device Operation / UNPLUG request}
The device MUST ignore anything except the request type and the
request-specific data in a request.
The device MUST ignore the \field{padding} in the request-specific data in
a request.
The device MUST reject requests with VIRTIO_MEM_RESP_ERROR if \field{addr}
is not aligned to the \field{block_size} in the device configuration, if
\field{nb_blocks} is not greater than 0, or if any memory block outside of
the usable device-managed memory region is covered by the request.
The device MUST reject requests with VIRTIO_MEM_RESP_ERROR if any memory
block covered by the request is already unplugged.
The device MAY reject requests with VIRTIO_MEM_RESP_BUSY if the request can
currently not be processed.
The device MUST acknowledge requests with VIRTIO_MEM_RESP_ACK in case all
memory blocks were successfully unplugged. The device MUST reflect the
change in the device configuration \field{plugged_size}.
\subsubsection{UNPLUG ALL request}\label{sec:Device Types / Memory Device / Device Operation / UNPLUG ALL request}
Request to unplug all memory blocks the device has currently plugged. If
successful, the \field{plugged_size} in the device configuration will be 0
and \field{usable_region_size} might have changed.
Requests don't have request-specific data defined, only the request type is
relevant. Responses don't have request-specific data defined, only the
response type is relevant.
\drivernormative{\paragraph}{UNPLUG request}{Device Types / Memory Device / Device Operation / UNPLUG ALL request}
The driver MUST ignore any data in a response except the response type.
\devicenormative{\paragraph}{UNPLUG request}{Device Types / Memory Device / Device Operation / UNPLUG ALL request}
The device MUST ignore any data in a request except the request type.
The device MUST ignore the \field{padding} in the request-specific data in
a request.
The device MAY reject requests with VIRTIO_MEM_RESP_BUSY if the request can
currently not be processed.
The device MUST acknowledge requests with VIRTIO_MEM_RESP_ACK in case all
memory blocks were successfully unplugged.
The device MUST set \field{plugged_size} to 0 in case the request is
acknowledged with VIRTIO_MEM_RESP_ACK.
The device MAY modify \field{usable_region_size} before responding with
VIRTIO_MEM_RESP_ACK.
\subsubsection{STATE request}\label{sec:Device Types / Memory Device / Device Operation / STATE request}
Request the state (plugged, unplugged, mixed) of consecutive memory blocks.
The request-specific data in a STATE request has the format:
\begin{lstlisting}
struct virtio_mem_req_state {
le64 addr;
le16 nb_blocks;
le16 padding[3];
};
\end{lstlisting}
\field{addr} is the guest physical address of the first memory block.
\field{nb_blocks} is the number of consecutive memory blocks.
The request-specific data in a STATE response has the format:
\begin{lstlisting}
struct virtio_mem_resp_state {
le16 type;
};
\end{lstlisting}
Whereby \field{type} defines one of three different state types:
\begin{lstlisting}
#define VIRTIO_MEM_STATE_PLUGGED 0
#define VIRTIO_MEM_STATE_UNPLUGGED 1
#define VIRTIO_MEM_STATE_MIXED 2
\end{lstlisting}
\drivernormative{\paragraph}{STATE request}{Device Types / Memory Device / Device Operation / STATE request}
The driver MUST ignore anything except the response type and the
request-specific data in a response.
The driver MUST ignore the request-specific data in a response in case the
response type is not VIRTIO_MEM_RESP_ACK.
\devicenormative{\paragraph}{STATE request}{Device Types / Memory Device / Device Operation / STATE request}
The device MUST ignore anything except the request type and the
request-specific data in a request.
The device MUST ignore the \field{padding} in the request-specific data in
a request.
The device MUST reject requests with VIRTIO_MEM_RESP_ERROR if \field{addr}
is not aligned to the \field{block_size} in the device configuration, if
\field{nb_blocks} is not greater than 0, or if any memory block outside of
the usable device-managed memory region is covered by the request.
The device MUST acknowledge requests with VIRTIO_MEM_RESP_ACK, supplying
the state of the memory blocks.
The device MUST set the state type in the response to
VIRTIO_MEM_STATE_PLUGGED if all requested memory blocks are plugged. The
device MUST set the state type in the response to
VIRTIO_MEM_STATE_UNPLUGGED if all requested memory blocks are unplugged.
Otherwise, the device MUST set state type in the response to
VIRTIO_MEM_STATE_MIXED.