Commit

Merge pull request #80 from pathakraul/rpathak_arc_review_79
ARC Feedback Updates and other changes
lftan authored Dec 9, 2024
2 parents 0d09215 + 1387d5d commit 2674515
Showing 6 changed files with 139 additions and 40 deletions.
Binary file modified images/highlevel-arch.png
30 changes: 29 additions & 1 deletion src/message-protocol.adoc
@@ -24,7 +24,7 @@ Multiple related RPMI services are grouped logically into an *RPMI service
group* such as Clock, Voltage, Performance, etc. Depending on the RPMI service,
a RPMI request message may carry data required to perform the control and
management task. An RPMI request message may have an associated response which
is send back as an *RPMI acknowledgement message* on the same RPMI transport
is sent back as an *RPMI acknowledgement message* on the same RPMI transport
channel. The RPMI acknowledgement message carries the status and optional
response data from an RPMI request after it has been processed.

@@ -166,6 +166,34 @@ values as corresponding fields in the RPMI request message. The `DATALEN`
field of the RPMI acknowledgement message must be set according to the data
carried by this acknowledgement.

NOTE: The message token helps an application processor keep track of the
origin of a request when it receives a response. This is useful when
multiple application processors share the same queues. For example, two
different application processors may send the same type of request message with
the same `SERVICEGROUP_ID` and `SERVICE_ID`. When the response messages for both
requests are received from the platform microcontroller, the token helps
distinguish which response belongs to which request.

NOTE: The RPMI specification recommends monotonically increasing token numbers.
The token number can be initialized to any value without constraints.
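
A minimal sketch of how a requester might allocate tokens and match responses to outstanding requests. The `TokenTracker` class, its method names, the single tracker shared by all senders on a queue, and the 16-bit token width are assumptions made for illustration; they are not defined by this specification.

```python
class TokenTracker:
    """Hypothetical helper: hands out tokens and matches responses to requests."""

    def __init__(self, first_token=0, token_bits=16):
        # token_bits=16 is an assumption about the width of the TOKEN field.
        self._mask = (1 << token_bits) - 1
        # Any starting value is acceptable.
        self._next = first_token & self._mask
        # Maps an in-flight token to caller-supplied context.
        self._pending = {}

    def new_request(self, context):
        """Reserve the next token and remember which request it belongs to."""
        token = self._next
        if token in self._pending:
            raise RuntimeError("token still in flight")
        self._pending[token] = context
        # Monotonically increasing, wrapping at the assumed field width.
        self._next = (self._next + 1) & self._mask
        return token

    def match_response(self, token):
        """Return the context of the request that used this token, or None."""
        return self._pending.pop(token, None)


tracker = TokenTracker(first_token=0xFFFE)
t_a = tracker.new_request("AP0: request")
t_b = tracker.new_request("AP1: request")  # same service, different origin
t_c = tracker.new_request("AP0: another request")
```

Even when two requests carry identical `SERVICEGROUP_ID` and `SERVICE_ID` values, `match_response` can tell them apart because each one was sent with a different token.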

When doorbell interrupts are supported and enabled, the application processor
can set the `flags[3]` bit to `1` in the request message header to ask the
platform microcontroller to ring the doorbell after sending the response back.
If the `flags[3]` bit is `0` in the request message
header, the application processor is going to poll for the
response message in the queue and the platform microcontroller should not
ring the doorbell.

NOTE: The `flags[3]` bit can be used for a particular message or for the entire
lifecycle of RPMI message communication. For example, if the application
processor and the platform microcontroller are capable of MSIs and the application
processor has configured the MSI details via the service defined in <<srvgrp_base_set_msi>>,
then the `flags[3]` bit can always be set so that the platform microcontroller
sends an MSI for each response. The application processor can also
selectively clear it so that the platform microcontroller does
not trigger a doorbell for that response.
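
The per-message doorbell choice can be sketched as plain bit manipulation on the flags value. The helper names and the 8-bit width of the flags field are assumptions for illustration only.

```python
DOORBELL_FLAG_BIT = 3  # flags[3], as described in the text above
FLAGS_MASK = 0xFF      # assumption: the flags field is 8 bits wide


def set_doorbell(flags, enable):
    """Return flags with flags[3] set (ring doorbell) or cleared (sender polls)."""
    bit = 1 << DOORBELL_FLAG_BIT
    if enable:
        return (flags | bit) & FLAGS_MASK
    return (flags & ~bit) & FLAGS_MASK


def wants_doorbell(flags):
    """True if the requester asked for a doorbell after the response is sent."""
    return bool((flags >> DOORBELL_FLAG_BIT) & 1)
```

A requester that always wants an MSI would call `set_doorbell(flags, True)` for every request, while one that polls for a particular response would pass `False` for that message only.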

For an RPMI notification message, the platform microcontroller will set
appropriate values for the `TOKEN`, `SERVICEGROUP_ID`, and `DATALEN` fields
whereas the `SERVICE_ID` field must be always set to `0x0`.
6 changes: 6 additions & 0 deletions src/rpmi.bib
@@ -18,3 +18,9 @@ @electronic{libRPMI
title = {libRPMI},
url = {https://github.com/riscv-software-src/librpmi}
}

@electronic{priv_v1_12,
title = {The RISC-V Instruction Set Manual, Volume II: Privileged Architecture},
url = {https://github.com/riscv/riscv-isa-manual/releases/tag/Priv-v1.12},
year = {2021}
}
19 changes: 13 additions & 6 deletions src/service-groups.adoc
@@ -29,8 +29,8 @@ All implemented RPMI service groups must satisfy the following requirements:
level associated with the RPMI context which includes it.
. All RPMI services of the RPMI service groups must be supported except
the dedicated notification service (`SERVICE_ID = 0x00`) which is reserved
for RPMI notification messages. A RPMI service group may be partially
implement its RPMI services only if defines mechanism to discover supported
for RPMI notification messages. A RPMI service group may implement its RPMI
services partially only if it also defines a mechanism to discover supported
RPMI services.
. The RPMI service group must implement a dedicated RPMI service with
`SERVICE_ID = 0x01` to subscribe for event notifications.
@@ -42,9 +42,11 @@ should be invoked.

This specification defines standard RPMI service groups and RPMI services
with the provision to add more service groups as required in the future.
The platform vendors can provide implementation specific RPMI service groups.
The <<table_service_groups>> below list all standard RPMI service groups
defined by this specification.
The RPMI specification also provides an experimental service group ID space
for developing a service group until a standard service group ID is
allocated. The platform vendors can provide implementation specific RPMI
service groups. The <<table_service_groups>> table below lists all standard
RPMI service groups defined by this specification.

[#table_service_groups]
.RPMI Service Groups
@@ -115,11 +117,16 @@ defined by this specification.
| REQUEST_FORWARD
| M-mode, S-mode

| 0x000D - 0x7FFF
| 0x000D - 0x7BFF
|
| _Reserved for Future Use_
|

| 0x7C00 - 0x7FFF
|
| _Experimental Service Groups_
|

| 0x8000 - 0xFFFF
|
| _Implementation Specific Service Groups_
21 changes: 16 additions & 5 deletions src/srvgrp-base.adoc
@@ -65,15 +65,28 @@ The following table lists the services in the BASE service group:
|===

==== RPMI Implementation IDs
The RPMI specification defines space for standard implementation IDs and for
experimental implementation IDs. An implementation can use an experimental
implementation ID until a standard implementation ID is assigned to it.

The RPMI implementations that have been assigned a standard implementation ID
are listed in the table below.

[#table_base_rpmi_impl_id]
.RPMI Implementation IDs
[cols="2, 3a", width=100%, align="center", options="header"]
|===
| Implementation ID
| Name

| 0x0
| 0x00000000
| libRPMI cite:[libRPMI]

| 0x00000001 - 0x7FFFFFFF
| _Reserved for Future Use_

| 0x80000000 - 0xFFFFFFFF
| _Experimental Implementation IDs_
|===
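
The ranges in the table above can be expressed as a small lookup. This is only an illustrative sketch; the function name and the returned labels are hypothetical.

```python
def classify_impl_id(impl_id):
    """Map a 32-bit RPMI implementation ID to the range it falls in."""
    if not 0 <= impl_id <= 0xFFFFFFFF:
        raise ValueError("implementation ID must fit in 32 bits")
    if impl_id == 0x00000000:
        return "libRPMI"
    if impl_id <= 0x7FFFFFFF:
        return "reserved"
    return "experimental"
```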

[#base-notifications]
@@ -525,6 +538,7 @@ service group.
| _Reserved_, must be initialized to `0`.
|===

[#srvgrp_base_set_msi]
==== Service: BASE_SET_MSI (SERVICE_ID: 0x08)
This service is used to configure the MSI address and data which the platform
microcontroller can use as a doorbell to the application processor. The
@@ -537,10 +551,7 @@ appropriate `STATUS` returned.
The platform microcontroller will enable MSI only if support is present and
this service configures MSI address and data successfully.

NOTE: If the platform supports PLIC, the platform need to provide a MMIO
register to inject an edge-triggered interrupt.

NOTE: The platform microcontroller can use MSI for both sending the MSI
NOTE: The platform microcontroller can use MSI either for sending the MSI
directly or for injecting a wired interrupt into the application processor. If the MSI
target address is IMSIC, then the application processor will take MSI whereas
if the MSI target address is `setipnum` of the APLIC then the application
103 changes: 75 additions & 28 deletions src/transport.adoc
@@ -38,6 +38,11 @@ the platform microcontroller can avoid implementing the P2A channel.
The current RPMI specification only defines a shared memory based transport but
other transport types can be added in the future.

NOTE: The shared memory for RPMI transport and fast-channels, whether allocated
in DRAM or in on-chip RAM, requires memory attribute configuration. These
memory attributes, also called PMA (Physical Memory Attributes), are defined in
the RISC-V Privileged Specification cite:[priv_v1_12].

[#transport_bidir_comm]
.Bi-directional Communication
image::transport-bidirectional.png[400,400, align="center"]
@@ -69,26 +74,11 @@ processors must discover it using standard hardware description mechanisms
such as device tree or ACPI.

If the P2A doorbell is a MSI then the application processors must configure
the MSI on the platform microcontroller side using RPMI messages defined by
the MSI on the platform microcontroller side using the RPMI service defined by
the `BASE` service group.

=== Fast-channels
Fast-channels are special shared memory-based channels used in scenarios
requiring lower latency and faster processing of requests from application
processors to the platform microcontroller.

The layout and request format of fast-channels are service group specific
and only a few service groups may support fast-channels. A service group
that supports fast-channels:

* May only enable some services to be used over fast-channels
* Must provide physical address and other attributes (such as optional
fast-channel doorbell) of the fast-channels via a services defined by
the service group

NOTE: To avoid the caching side-effects, the platform can configure the
fast-channel shared memory as non-cacheable or IO memory for both the
application processors and the platform microcontroller.
NOTE: If the platform supports PLIC, the platform needs to provide an MMIO
register to inject an edge-triggered interrupt.

=== Shared Memory Transport
The RPMI shared memory transport defines a mechanism to exchange messages via
@@ -97,9 +87,17 @@ device memory. The RPMI shared memory transport does not specify where the
shared memory resides in a platform, but it must be accessible from both the
application processors and the platform microcontroller.

NOTE: To avoid the caching side-effects, the platform can configure the shared
The platform must set up the PMA for the shared memory used for RPMI transport.

NOTE: It is possible that the application processor and the platform
microcontroller are not cache-coherent and using the shared memory may lead to
caching side effects such as data inconsistency between the platform
microcontroller and the application processor, write propagation delays, and
other issues which may lead to race conditions. To avoid the caching
side-effects, the platform can configure the memory attribute of the shared
memory as non-cacheable or IO memory for both the application processor and the
platform microcontroller.
platform microcontroller. In addition, the implementation can perform manual
cache maintenance using cache flush and invalidate operations.

All data sent or received through the RPMI shared memory transport must follow
little-endian byte-order.
@@ -166,26 +164,37 @@ must be a `power-of-2` and must be at least `64 bytes`. The slot size is same
across all RPMI shared memory queues and the physical address of each slot
must be aligned at slot size boundary.

NOTE: The slot size should match with the maximum cache line size used in a
NOTE: The slot size should match the maximum cache block size used in a
platform. The requirement of `power-of-2` slot size with minimum value of
`64 bytes` is because usual CPU cache line size is `64 bytes` or some
`64 bytes` is because the usual CPU cache block size is `64 bytes` or some
`power-of-2` value.
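
The slot size and alignment rules can be checked with a few integer operations. The following is a sketch only; the function names are hypothetical.

```python
MIN_SLOT_SIZE = 64  # minimum slot size in bytes, from the text above


def is_valid_slot_size(slot_size):
    """True if slot_size is a power of 2 and at least 64 bytes."""
    is_power_of_two = slot_size > 0 and (slot_size & (slot_size - 1)) == 0
    return is_power_of_two and slot_size >= MIN_SLOT_SIZE


def is_slot_aligned(phys_addr, slot_size):
    """True if a slot's physical address lies on a slot size boundary."""
    return phys_addr % slot_size == 0
```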

The slots of the RPMI shared memory queue are assigned sequentially increasing
indices starting with `0`. The slot at index `0` is referred to as the
`head slot` and the slot at index `1` is referred to as the `tail slot`. The
remaining `(M - 2)` slots of the RPMI shared memory queue are message slots.
The first `4 bytes` of the head slot is used as the `head` of the circular
queue which contains a `slot index - 2` value pointing to the message slot from
The first `4 bytes` of the head slot is used as the `head` of the circular
queue which contains a `(slot index - 2)` value pointing to the message slot from
where the next message can be dequeued. The first `4 bytes` of the tail slot is
used as the `tail` of the circular queue which contains a `slot index - 2` value
used as the `tail` of the circular queue which contains a `(slot index - 2)` value
pointing to the message slot from where the next message can be enqueued. The
pictorial view of the RPMI shared memory queue internals is shown in the
<<transport_shared_memory_qint>> below.

NOTE: Of the total `M` slots, only `(M - 2)` slots are used as a queue
that stores RPMI messages. Slot indices count the head slot and the tail slot
as well, so the first message slot has slot index `2`. The values stored in
`head` and `tail` are indices relative to the first message slot, which is
why the message slot at slot index `i` is referred to by the value `(i - 2)`.

NOTE: The requirement of keeping `head` and `tail` in separate slots is
to prevent both `head` and `tail` using the same cache line so that cache
maintenance can be done separately for both `head` and `tail`.
to prevent `head` and `tail` from sharing a cache block, so that cache
maintenance, such as cache flush and invalidate operations, can be done
separately for both `head` and `tail`.

NOTE: There is no explicit indicator of the queue wrapping condition.
Implementations can use `head` == `tail` as the queue empty condition and
`\((tail + 1) % (M - 2)) == head` as the queue full condition.
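
The head slot, tail slot, and empty and full conditions described above can be simulated over a plain byte buffer. This is a single-process sketch under stated assumptions: the `SharedMemoryQueue` class and its methods are hypothetical, a message is treated as opaque bytes that fill one slot, and the cache maintenance and memory ordering that a real implementation needs are omitted.

```python
import struct


class SharedMemoryQueue:
    """Hypothetical model of one RPMI shared memory queue in a bytearray."""

    HEAD_SLOT = 0  # slot index 0 holds `head` in its first 4 bytes
    TAIL_SLOT = 1  # slot index 1 holds `tail` in its first 4 bytes

    def __init__(self, memory, slot_size=64):
        self.mem = memory
        self.slot_size = slot_size
        self.msg_slots = len(memory) // slot_size - 2  # (M - 2)

    def _read_index(self, slot):
        # Indices are stored as little-endian 32-bit values.
        return struct.unpack_from("<I", self.mem, slot * self.slot_size)[0]

    def _write_index(self, slot, value):
        struct.pack_into("<I", self.mem, slot * self.slot_size, value)

    @property
    def head(self):
        return self._read_index(self.HEAD_SLOT)

    @property
    def tail(self):
        return self._read_index(self.TAIL_SLOT)

    def is_empty(self):
        return self.head == self.tail

    def is_full(self):
        return (self.tail + 1) % self.msg_slots == self.head

    def enqueue(self, message):
        """Copy message into the slot at `tail`; return False if it cannot fit."""
        if self.is_full() or len(message) > self.slot_size:
            return False
        # `tail` holds (slot index - 2), so add 2 to reach the real slot.
        offset = (self.tail + 2) * self.slot_size
        padded = message.ljust(self.slot_size, b"\0")
        self.mem[offset:offset + self.slot_size] = padded
        self._write_index(self.TAIL_SLOT, (self.tail + 1) % self.msg_slots)
        return True

    def dequeue(self):
        """Return the slot contents at `head`, or None if the queue is empty."""
        if self.is_empty():
            return None
        offset = (self.head + 2) * self.slot_size
        message = bytes(self.mem[offset:offset + self.slot_size])
        self._write_index(self.HEAD_SLOT, (self.head + 1) % self.msg_slots)
        return message


queue = SharedMemoryQueue(bytearray(6 * 64))  # M = 6, so 4 message slots
queue.enqueue(b"request")
first = queue.dequeue()
```

With this full condition one message slot always stays unused, so a queue with `(M - 2)` message slots holds at most `(M - 3)` messages at a time.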

[#transport_shared_memory_qint]
.Shared Memory Queue Internals
@@ -211,7 +220,7 @@ into two parts where one part belongs to the A2P channel and other belongs
to the P2A channel. The shared memory region sizes of the A2P and P2A channel
can be different. For each channel (A2P or P2A), the corresponding REQ and ACK
queues must be of the same size hence equal number of slots (or queue capacity).
The size of each RPMI shared shared queue must be a multiple of the slot size.
The size of each RPMI shared queue must be a multiple of the slot size.

NOTE: A platform should provide sufficient shared memory for all RPMI shared
memory queues so that the number of slots (queue capacity) does not become
@@ -255,3 +264,41 @@ M = (X / slot-size) : Total slot count in a queue
(M-2) : Message slot count (2 slots less for `HEAD` and `TAIL`)
```
====
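
The slot arithmetic from the example above can be written as a short helper. This is a sketch; the function name is hypothetical, and rejecting queues without any message slot is an assumption rather than a stated requirement.

```python
def message_slot_count(queue_size, slot_size):
    """Return (M - 2) for a queue of queue_size bytes, where M = X / slot-size."""
    if queue_size % slot_size != 0:
        raise ValueError("queue size must be a multiple of the slot size")
    total_slots = queue_size // slot_size  # M
    if total_slots <= 2:
        # Assumption: a usable queue needs slots beyond the head and tail slots.
        raise ValueError("queue has no message slots")
    return total_slots - 2
```

For instance, a 4096-byte queue with 64-byte slots has 64 slots in total, 62 of which are message slots.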

=== Shared Memory based Fast-channels
A fast-channel is a unidirectional shared memory channel with a dedicated RPMI
service type. The data transmitted over a fast-channel is without any message
header and its layout is defined by the service which is dedicated to that
fast-channel. Unlike normal RPMI transport, which can be shared by multiple
service groups and services, a fast-channel is exclusive to a service in a
service group which allows faster exchange of the data. A fast-channel can be
used in scenarios that require lower latency and faster processing of requests
between the application processors and the platform microcontroller.

NOTE: Because of the fixed data format and type associated with a fast-channel, the
requests made over a fast-channel can be processed quickly, but the time required
by the platform microcontroller to complete the requests may not be less than
the time required for completion of requests made over the normal RPMI transport.
The request completion time depends on the platform implementation.

A service group that supports fast-channels for services:

* May only enable some services to be used over fast-channels.
* Must provide the physical address and other attributes (such as an optional
fast-channel doorbell) of the fast-channels via services defined by
the service group.

The layout and data format of a fast-channel are RPMI service specific in a
service group and defined in the respective service group sections.

The platform must set up the PMA for the shared memory used for the fast-channels.

NOTE: It is possible that the application processor and the platform
microcontroller are not cache-coherent and using the shared memory may lead to
caching side effects such as data inconsistency between the platform
microcontroller and the application processor, write propagation delays, and
other issues which may lead to race conditions. To avoid the caching
side-effects, the platform can configure the memory attribute of the shared
memory as non-cacheable or IO memory for both the application processor and the
platform microcontroller. In addition, the implementation can perform manual
cache maintenance using cache flush and invalidate operations.
