eXtension Interface
The eXtension interface enables extending CPU with (custom or standardized) instructions without the need to change the RTL of CPU itself. Extensions can be provided in separate modules external to CPU and are integrated at system level by connecting them to the eXtension interface.
The eXtension interface provides low latency (tightly integrated) read and write access to the CPU register file. All opcodes which are not used (i.e. considered to be invalid) by CPU can be used for extensions. It is recommended however that custom instructions do not use opcodes that are reserved/used by RISC-V International.
The eXtension interface enables extension of CPU with:
Custom ALU type instructions.
Custom CSRs and related instructions.
Control-Transfer type instructions (e.g. branches and jumps) are not supported via the eXtension interface.
CV-X-IF
The terminology eXtension interface and CV-X-IF are used interchangeably.
Parameters
The CV-X-IF specification contains the following parameters:
| Name | Type/Range | Default | Description |
|---|---|---|---|
|  | int unsigned (2..3) | 2 | Number of register file read ports that can be used by the eXtension interface. |
|  | int unsigned (3..32) | 4 | Identification ( |
|  | int unsigned (32, 64) | 32 | Register file read access width for the eXtension interface. Must be at least XLEN. If XLEN = 32, then the legal values are 32 and 64 (e.g. for RV32P). If XLEN = 64, then the legal value is (only) 64. |
|  | int unsigned (32, 64) | 32 | Register file write access width for the eXtension interface. Must be at least XLEN. If XLEN = 32, then the legal values are 32 and 64 (e.g. for RV32D). If XLEN = 64, then the legal value is (only) 64. |
|  | int unsigned (1..2^MXLEN) | 1 | Number of harts (hardware threads) associated with the interface. The CPU determines the legal values for this parameter. |
|  | int unsigned (1..MXLEN) | 1 | Width of |
|  | logic [31:0] | 32’b0 | MISA extensions implemented on the eXtension interface. The CPU determines the legal values for this parameter. |
|  | logic [1:0] | 2’b0 | Initial value for |
|  | int unsigned (0..3) | 0 | Is dual read supported? 0: No, 1: Yes, for |
|  | int unsigned (0..1) | 0 | Is dual write supported? 0: No, 1: Yes. Legal values are determined by the CPU. |
|  | int unsigned (0..1) | 0 | Does the interface pipeline register interface? 0: No, 1: Yes. Legal values are determined by the CPU. If 1, registers are provided after the issue of the instruction. If 0, registers are provided at the same time as issue. |
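As an illustration of the range constraints in the parameter table, the following Python sketch checks a parameter set. The function name `check_xif_params` is hypothetical, and pairing the first four table rows with the names X_NUM_RS, X_ID_WIDTH, X_RFR_WIDTH and X_RFW_WIDTH is an assumption based on how those names are used elsewhere in this specification. Only the ranges stated in the table are checked; CPU-specific restrictions are not modelled.

```python
def check_xif_params(xlen, x_num_rs=2, x_id_width=4, x_rfr_width=32, x_rfw_width=32):
    """Return a list of violations of the ranges given in the parameter table.

    Parameter names are assumed (see lead-in); defaults follow the table.
    """
    errors = []
    if x_num_rs not in (2, 3):
        errors.append("X_NUM_RS must be 2 or 3")
    if not 3 <= x_id_width <= 32:
        errors.append("X_ID_WIDTH must be in 3..32")
    for name, width in (("X_RFR_WIDTH", x_rfr_width), ("X_RFW_WIDTH", x_rfw_width)):
        if width not in (32, 64):
            errors.append(f"{name} must be 32 or 64")
        elif width < xlen:
            # Both access widths must be at least XLEN.
            errors.append(f"{name} must be at least XLEN")
    return errors
```

With XLEN = 32 the default widths pass, while the same defaults with XLEN = 64 are reported as too narrow.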
Note
A CPU shall clearly document which X_MISA values it can support and there is no requirement that a CPU can support all possible X_MISA values. For example, if a CPU only supports machine mode, then it is not reasonable to expect that the CPU will additionally support user mode by just setting the X_MISA[20] (U bit) to 1.
Additionally, the following type definitions are defined to improve readability of the specification and ensure consistency between the interfaces:
Name |
Definition |
Description |
---|---|---|
|
logic [X_NUM_RS+X_DUALREAD-1:0] |
Vector with a flag per possible source register. This depends upon the number of read ports and their ability to read register pairs. The bit positions map to registers as follows: Low indices correspond to low operand numbers, and the even part of the pair has the lower index than the odd one. |
|
logic [X_DUALWRITE:0] |
Bit vector indicating destination registers for write back.
The width depends on the ability to perform dual write.
If |
|
logic [X_NUM_RS-1:0][X_RFR_WIDTH-1:0] |
Privilege level (2’b00 = User, 2’b01 = Supervisor, 2’b10 = Reserved, 2’b11 = Machine). |
|
logic [X_ID_WIDTH-1:0] |
Identification of the offloaded instruction. See Identification for details on the identifiers |
|
logic [X_HARTID_WIDTH-1:0] |
Identification of the hart offloading the instruction.
Only relevant in multi-hart systems. Hart IDs are not required
to be numbered continuously.
The hart ID would usually correspond to |
Major features
The major features of CV-X-IF are:
Minimal requirements on extension instruction encoding.
If an extension instruction relies on reading from or writing to the core’s general purpose register file, then the standard RISC-V bitfield locations for rs1, rs2, rs3, rd as used for non-compressed instructions ([RISC-V-UNPRIV]) must be used. Bitfields for unused read or write operands can be fully repurposed. Extension instructions can either use the compressed or uncompressed instruction format. For offloading compressed instructions the coprocessor must provide the core with the related non-compressed instructions.
Support for dual writeback instructions (optional, based on X_DUALWRITE).
CV-X-IF optionally supports implementation of (custom or standardized) ISA extensions mandating dual register file writebacks. Dual writeback is supported for even-odd register pairs (Xn and Xn+1, with n being an even number extracted from instruction bits [11:7]). Dual register file writeback is only supported for XLEN = 32.
Support for dual read instructions (per source operand) (optional, based on X_DUALREAD).
CV-X-IF optionally supports implementation of (custom or standardized) ISA extensions mandating dual register file reads. Dual read is supported for even-odd register pairs (Xn and Xn+1, with n being an even number extracted from instruction bits [19:15], [24:20] and [31:27], i.e. rs1, rs2 and rs3). Dual read can therefore provide up to six 32-bit operands per instruction. When a dual read is performed with n = 0, the entire operand is 0, i.e. x1 does not need to be accessed by the CPU. Dual register file read is only supported for XLEN = 32.
Support for ternary operations.
CV-X-IF optionally supports ISA extensions implementing instructions which use three source operands. Ternary instructions must be encoded in the R4-type instruction format defined by [RISC-V-UNPRIV].
Support for instruction speculation.
CV-X-IF indicates whether offloaded instructions are allowed to be committed (or should be killed).
CV-X-IF consists of the following interfaces:
Compressed interface. Signaling of compressed instruction to be offloaded.
Issue (request/response) interface. Signaling of the uncompressed instruction to be offloaded.
Register interface. Signaling of General Purpose Registers (GPRs) and CSRs.
Commit interface. Signaling of control signals related to whether instructions can be committed or should be killed.
Result interface. Signaling of the instruction result(s).
Operating principle
CPU will attempt to offload every (compressed or non-compressed) instruction that it does not recognize as a legal instruction itself. In case of a compressed instruction the coprocessor must first provide the core with a matching uncompressed (i.e. 32-bit) instruction using the compressed interface. This non-compressed instruction is then attempted for offload via the issue interface.
Offloading of the (non-compressed, 32-bit) instructions happens via the issue interface.
The external coprocessor can decide to accept or reject the instruction offload. If it accepts, the coprocessor
will further handle the instruction. If it rejects, the core will raise an illegal instruction exception.
The core provides the required register file operand(s) to the coprocessor via the register interface.
If an offloaded instruction uses any of the register file sources rs1, rs2 or rs3, then these are always encoded in instruction bits [19:15], [24:20] and [31:27] respectively. The coprocessor only needs to wait for the register file operands that a specific instruction actually uses.
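The fixed field positions can be illustrated with a short Python sketch. The helper name `operand_fields` is hypothetical; the bit ranges are the ones stated in this specification (rd in bits [11:7], rs1 in [19:15], rs2 in [24:20], rs3 in [31:27]).

```python
def operand_fields(instr):
    """Extract the standard RISC-V register fields from a 32-bit instruction."""
    return {
        "rd": (instr >> 7) & 0x1F,    # bits [11:7]
        "rs1": (instr >> 15) & 0x1F,  # bits [19:15]
        "rs2": (instr >> 20) & 0x1F,  # bits [24:20]
        "rs3": (instr >> 27) & 0x1F,  # bits [31:27]
    }
```

A coprocessor that does not use one of these operands is free to repurpose the corresponding bits, so the extracted value is only meaningful for fields the instruction actually uses.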
The coprocessor informs the core to which register(s) in the register file it will writeback.
The CPU uses this information to track data dependencies between instructions.
Offloaded instructions are speculative; CPU has not necessarily committed to them yet and might decide to kill them (e.g. because they are in the shadow of a taken branch or because they are flushed due to an exception in an earlier instruction). Via the commit interface the core will inform the coprocessor about whether an offloaded instruction will either need to be killed or whether the core will guarantee that the instruction is no longer speculative and is allowed to be committed.
The final result of an accepted offloaded instruction can be written back into the coprocessor itself or into the CPU’s register file. Either way, the
result interface is used to signal to the CPU that the instruction has completed. Apart from a possible writeback into the register file, the result
interface transaction is for example used in the core to increment the minstret
CSR, to implement the fence instructions and to judge if instructions
before a WFI
instruction have fully completed (so that sleep mode can be entered if needed).
In short: From a functional perspective it should not matter whether an instruction is handled inside the CPU or inside a coprocessor. In both cases the instructions need to obey the same instruction dependency rules, memory consistency rules, load/store address checks, fences, etc.
Interfaces
This section describes the interfaces of CV-X-IF. Port directions are described as seen from the perspective of the CPU.
The coprocessor will have opposite pin directions.
The stated signal names are not mandatory, but it is highly recommended to at least include the stated names as part of actual signal names. It is for example allowed to add prefixes and/or postfixes (e.g. x_ prefix or _i, _o postfixes) or to use different capitalization. A name mapping should be provided if non-obvious renaming is applied.
Identification
Most interfaces of CV-X-IF use a signal called id, which serves as a unique identification number for offloaded instructions.
The same id
value shall be used for all transaction packets on all interfaces that logically relate to the same instruction.
An id value can be reused after an earlier instruction related to the same id value is no longer considered in-flight.
The id
values for in-flight offloaded instructions are required to be unique.
The id
values are required to be incremental from one issue transaction to the next.
The increment may be greater than one.
If the next id
would be greater than the maximum value (2**X_ID_WIDTH - 1
), the value of id
wraps.
id
values can only be introduced by the issue interface.
An id
becomes in-flight in the first cycle that issue_valid
is 1 for that id
.
An id ends being in-flight when one of the following scenarios applies:
the corresponding issue request transaction is retracted.
the corresponding issue request transaction is not accepted and the corresponding commit handshake has been performed.
the corresponding result transaction has been performed.
For the purpose of relative identification, an instruction is considered to be preceding another instruction, if it was accepted in an issue transaction at an earlier time. The other instruction is thus succeeding the earlier one.
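The id rules above (unique while in flight, incremental, wrapping at 2**X_ID_WIDTH - 1) can be sketched in Python as follows. The class `IdAllocator` and its method names are hypothetical, and the sketch makes two simplifying assumptions: it always increments by one (the specification also allows larger increments), and it stalls when the next id is still in flight.

```python
class IdAllocator:
    """Hands out incremental, wrapping ids that are unique while in flight."""

    def __init__(self, id_width=4):
        self.modulus = 1 << id_width  # ids range over 0 .. 2**X_ID_WIDTH - 1
        self.next_id = 0
        self.in_flight = set()

    def issue(self):
        """Return the id for a new issue transaction, or None if it must stall."""
        if self.next_id in self.in_flight:
            return None  # reusing this id would break in-flight uniqueness
        new_id = self.next_id
        self.in_flight.add(new_id)
        self.next_id = (new_id + 1) % self.modulus  # wrap past the maximum value
        return new_id

    def retire(self, instr_id):
        """Call when the id stops being in flight (retracted, rejected and
        committed, or result transaction performed)."""
        self.in_flight.discard(instr_id)
```

Stalling is only one possible policy; a CPU could equally size X_ID_WIDTH so that the id space can never be exhausted by its pipeline depth.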
Multiple Harts
The interface can be used in systems with multiple harts (hardware threads).
This includes scenarios with multiple CPUs and multi-threaded implementations of CPUs.
RISC-V distinguishes between harts using hartid
, which we also introduce to the interface.
It is required to identify the source of the offloaded instruction, as multiple harts might be able to offload via a shared interface.
No duplicates of the combination of hartid
and id
may be in flight at any time within one instance of the interface.
Any state within the coprocessor (e.g. custom CSRs) must be duplicated according to the number of harts (indicated by the X_NUM_HARTS
parameter).
Execution units may be shared among threads of the coprocessor, and conflicts around such resources must be managed by the coprocessor.
Note
The interface can be used in scenarios where the CPU is superscalar, i.e. it can issue more than one instruction per cycle. In such scenarios, the coprocessor is usually required to also be able to accept more than one instruction per cycle. Our expectation is that implementers will duplicate the interface according to the issue width.
Compressed interface
Table 4 describes the compressed interface signals.
| Signal | Type | Direction (CPU) | Description |
|---|---|---|---|
|  | logic | output | Compressed request valid. Request to uncompress a compressed instruction. |
|  | logic | input | Compressed request ready. The transactions signaled via |
|  | x_compressed_req_t | output | Compressed request packet. |
|  | x_compressed_resp_t | input | Compressed response packet. |
Table 5 describes the x_compressed_req_t
type.
Signal |
Type |
Description |
---|---|---|
|
logic [15:0] |
Offloaded compressed instruction. |
|
Identification of the hart offloading the instruction. |
The instr[15:0]
signal is used to signal compressed instructions that are considered illegal by CPU itself. A coprocessor can provide an uncompressed instruction
in response to receiving this.
A compressed request transaction is defined as the combination of all compressed_req
signals during which compressed_valid
is 1 and the hartid
remains unchanged.
A CPU is allowed to retract its compressed request transaction before it is accepted with compressed_ready
= 1 and it can do so in the following ways:
Set compressed_valid = 0.
Keep compressed_valid = 1, but change the hartid signal (and if desired change the other signals in compressed_req).
The signals in compressed_req are valid when compressed_valid is 1. These signals remain stable during a compressed request transaction (if hartid changes while compressed_valid remains 1, then a new compressed request transaction has started).
Table 6 describes the x_compressed_resp_t
type.
Signal |
Type |
Description |
---|---|---|
|
logic [31:0] |
Uncompressed instruction. |
|
logic |
Is the offloaded compressed instruction ( |
The signals in compressed_resp
are valid when compressed_valid
and compressed_ready
are both 1. There are no stability requirements.
The CPU will attempt to offload every compressed instruction that it does not recognize as a legal instruction itself. CPU might also attempt to offload compressed instructions that it does recognize as legal instructions itself.
The CPU shall cause an illegal instruction fault when attempting to execute (commit) an instruction that:
is considered to be valid by the CPU and accepted by the coprocessor (accept = 1).
is considered neither to be valid by the CPU nor accepted by the coprocessor (accept = 0).
The accept
signal of the compressed interface merely indicates that the coprocessor accepts the compressed instruction as an instruction that it implements and translates into
its uncompressed counterpart.
Typically an accepted transaction over the compressed interface will be followed by a corresponding transaction over the issue interface, but there is no requirement
on the CPU to do so (as the instructions offloaded over the compressed interface and issue interface are allowed to be speculative). Only when an accept
is signaled over the issue interface, then an instruction is considered accepted for offload.
The coprocessor shall not take the mstatus based extension context status (see [RISC-V-PRIV]) into account when generating the accept signal on its compressed interface (but it shall take it into account when generating the accept signal on its issue interface).
Issue interface
Table 7 describes the issue interface signals.
| Signal | Type | Direction (CPU) | Description |
|---|---|---|---|
|  | logic | output | Issue request valid. Indicates that CPU wants to offload an instruction. |
|  | logic | input | Issue request ready. The transaction signaled via |
|  | x_issue_req_t | output | Issue request packet. |
|  | x_issue_resp_t | input | Issue response packet. |
Table 8 describes the x_issue_req_t
type.
Signal |
Type |
Description |
---|---|---|
|
logic [31:0] |
Offloaded instruction. |
|
Identification of the hart offloading the instruction. |
|
|
Identification of the offloaded instruction. |
An issue request transaction is defined as the combination of all issue_req
signals during which issue_valid
is 1 and the hartid
remains unchanged.
A CPU is allowed to retract its issue request transaction before it is accepted with issue_ready
= 1 and it can do so in the following ways:
Set issue_valid = 0.
Keep issue_valid = 1, but change the hartid signal (and if desired change the other signals in issue_req).
The instr
, hartid
, and id
signals are valid when issue_valid
is 1.
The instr
signal remains stable during an issue request transaction.
Table 10 describes the x_issue_resp_t
type.
Signal |
Type |
Description |
---|---|---|
|
logic |
Is the offloaded instruction ( |
|
Will the coprocessor perform a writeback in the core to |
|
|
Will the coprocessor require specific registers to be read?
A coprocessor may only request an odd register of a pair, if it also requests the even register of a pair.
A coprocessor must signal |
|
|
logic |
Will the coprocessor perform a writeback in the core to |
The core shall attempt to offload instructions via the issue interface for the following two main scenarios:
The instruction is originally non-compressed and it is not recognized as a valid instruction by the CPU’s non-compressed instruction decoder.
The instruction is originally compressed and the coprocessor accepted the compressed instruction and provided a 32-bit uncompressed instruction. In this case the 32-bit uncompressed instruction will be attempted for offload even if it matches in the CPU’s non-compressed instruction decoder.
Apart from the above two main scenarios a CPU may also attempt to offload (compressed/uncompressed) instructions that it does recognize as legal instructions itself. In case that both the CPU and the coprocessor accept the same instruction as being valid, the instruction will cause an illegal instruction fault upon execution.
The CPU shall cause an illegal instruction fault when attempting to execute (commit) an instruction that:
is considered to be valid by the CPU and accepted by the coprocessor (accept = 1).
is considered neither to be valid by the CPU nor accepted by the coprocessor (accept = 0).
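The two fault conditions amount to "exactly one side must claim the instruction". A minimal Python sketch of that rule is shown below; the function name `raises_illegal_instruction` is hypothetical.

```python
def raises_illegal_instruction(cpu_valid, coproc_accept):
    """True when committing the instruction must raise an illegal instruction fault."""
    both_claim = cpu_valid and coproc_accept            # valid in CPU and accept = 1
    nobody_claims = not cpu_valid and not coproc_accept  # invalid in CPU and accept = 0
    return both_claim or nobody_claims
```

The instruction executes normally only when it is valid in the CPU or accepted by the coprocessor, but not both.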
A coprocessor can (only) accept an offloaded instruction when:
It can handle the instruction (based on decoding instr).
There are no structural hazards that would prevent execution.
A transaction is considered offloaded/accepted on the positive edge of clk when issue_valid and issue_ready are asserted and accept is 1.
A transaction is considered not offloaded/rejected on the positive edge of clk when issue_valid and issue_ready are asserted while accept is 0.
The signals in issue_resp
are valid when issue_valid
and issue_ready
are both 1. There are no stability requirements.
Register interface
Table 12 describes the register interface signals.
| Signal | Type | Direction (CPU) | Description |
|---|---|---|---|
|  | logic | output | Register request valid. Indicates that CPU provides register contents related to an instruction. |
|  | logic | input | Register request ready. The transaction signaled via |
|  | x_register_t | output | Register packet. |
Table 13 describes the x_register_t
type.
Signal |
Type |
Description |
---|---|---|
|
Identification of the hart offloading the instruction. |
|
|
Identification of the offloaded instruction. |
|
|
logic [X_RFR_WIDTH-1:0] |
Register file source operands for the offloaded instruction. |
|
Validity of the register file source operand(s). If register pairs are supported, the validity is signaled for each register within the pair individually. |
|
|
logic [5:0] |
Extension Context Status ({ |
|
logic |
Validity of the Extension Context Status. |
There are two main scenarios for how the register interface is used. They are selected by X_ISSUE_REGISTER_SPLIT:
X_ISSUE_REGISTER_SPLIT = 0: A register transaction can be started in the same clock cycle as the issue transaction (issue_valid = register_valid, issue_ready = register_ready, issue_req.hartid = register.hartid and issue_req.id = register.id). In this case, the CPU will speculatively provide all possible source registers via register.rs when they become available (signalled via the respective rs_valid signals). The coprocessor will delay accepting the instruction until all necessary registers are provided, and only then assert issue_ready and register_ready. The rs_valid bits are not required to be stable during the transaction. Each bit can transition from 0 to 1, but is not allowed to transition back to 0 during a transaction. A coprocessor is not expected to wait for all rs_valid bits to be 1, but only for those registers it intends to read. The rs signals are only required to be stable during the part of a transaction in which these signals are considered to be valid. The ecs_valid bit is not required to be stable during the transaction. It can transition from 0 to 1, but is not allowed to transition back to 0 during a transaction. The ecs signal is only required to be stable during the part of a transaction in which this signal is considered to be valid.
X_ISSUE_REGISTER_SPLIT = 1: For a CPU which splits the issue and register interface into subsequent pipeline stages (e.g. because it has a dedicated read registers (RR) stage), the registers will be provided after the issue transaction has completed. The CPU initiates the register transaction once all registers are available. If the coprocessor is able to accept multiple issue transactions before receiving the registers, the register transactions can occur in a different order than the issue transactions. This allows the CPU to reorder instructions based on the availability of operands. The coprocessor is always expected to be ready to retrieve its operands via the register interface after accepting the issue of an instruction. Therefore, register_ready is tied to 1. The register_valid signal will be 1 for one cycle, and rs_valid is guaranteed to be equal to the corresponding issue_resp.register_read. Thus, a coprocessor can ignore rs_valid in this case and a CPU may choose not to implement the signal. The same applies to the ecs and ecs_valid signals.
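For the X_ISSUE_REGISTER_SPLIT = 0 case, the readiness condition and the "0 to 1 only" rule for rs_valid can be sketched in Python. Both function names are hypothetical, and the bit vectors are modelled as plain integers with one bit per source register; `needed` stands for the set of registers the instruction intends to read.

```python
def coprocessor_ready(rs_valid, needed):
    """Ready once every operand the instruction actually reads is valid.

    Bits of rs_valid for registers outside `needed` are ignored.
    """
    return (rs_valid & needed) == needed


def rs_valid_is_monotonic(samples):
    """Check that no rs_valid bit falls from 1 back to 0 within one transaction.

    `samples` holds the rs_valid value observed in consecutive clock cycles.
    """
    return all((prev & ~cur) == 0 for prev, cur in zip(samples, samples[1:]))
```

A coprocessor that only reads rs1 would pass `needed = 0b001` and could assert issue_ready and register_ready while the other operands are still unavailable.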
In both scenarios, the following applies:
The hartid
, id
, ecs_valid
and rs_valid
signals are valid when register_valid
is 1.
The rs
signal is only considered valid when register_valid
is 1 and the corresponding bit in rs_valid
is 1 as well.
The ecs
signal is only considered valid when register_valid
is 1 and ecs_valid
is 1 as well.
The rs[X_NUM_RS-1:0]
signals provide the register file operand(s) to the coprocessor. In case that XLEN
= X_RFR_WIDTH
, then the regular register file
operands corresponding to rs1
, rs2
or rs3
are provided. In case XLEN
!= X_RFR_WIDTH
(i.e. XLEN
= 32 and X_RFR_WIDTH
= 64), then the
rs[X_NUM_RS-1:0]
signals provide two 32-bit register file operands per index (corresponding to even/odd register pairs) with the even register specified
in rs1
, rs2
or rs3
. The register file operand for the even register file index is provided in the lower 32 bits; the register file operand for the
odd register file index is provided in the upper 32 bits. When reading from the x0
, x1
pair, then a value of 0 is returned for the entire operand.
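The operand packing described above can be sketched in Python. The function name `read_operand` is hypothetical, the register file is modelled as a list of 32 integers, and the sketch assumes XLEN = 32 and that `n` is the even register index taken from the instruction when a pair is read.

```python
def read_operand(regfile, n, x_rfr_width=32):
    """Return the operand presented on one rs index for XLEN = 32.

    With X_RFR_WIDTH = 32 a single register is returned. With X_RFR_WIDTH = 64
    the even register occupies the lower 32 bits and the odd register the
    upper 32 bits; the x0, x1 pair reads as 0 for the entire operand.
    """
    mask = 0xFFFFFFFF
    if x_rfr_width == 32:
        return regfile[n] & mask
    if n == 0:
        return 0  # x1 is not accessed at all
    return ((regfile[n + 1] & mask) << 32) | (regfile[n] & mask)
```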
The X_DUALREAD
parameter defines whether dual read is supported and for which register file sources it is supported.
The ecs
signal provides the Extension Context Status from the mstatus
CSR to the coprocessor.
Commit interface
Table 14 describes the commit interface signals.
| Signal | Type | Direction (CPU) | Description |
|---|---|---|---|
|  | logic | output | Commit request valid. Indicates that CPU has valid commit or kill information for an offloaded instruction. There is no corresponding ready signal (it is implicit and assumed 1). The coprocessor shall be ready to observe the |
|  | x_commit_t | output | Commit packet. |
Table 15 describes the x_commit_t
type.
Signal |
Type |
Description |
---|---|---|
|
Identification of the hart offloading the instruction. |
|
|
Identification of the offloaded instruction. Valid when |
|
|
logic |
If |
The commit_valid signal will be 1 for exactly one clk cycle.
It is not required that a commit transaction is performed for each offloaded instruction individually.
Instructions can be signalled to be non-speculative or to be killed in batch.
E.g. signalling the oldest instruction to be killed is equivalent to requesting a flush of the coprocessor.
The first instruction to be considered not-to-be-killed after a commit transaction with commit_kill
as 1,
is at earliest an instruction with successful issue transaction starting at least one clock cycle later.
Note
If an instruction is marked in the coprocessor as killed or committed, the coprocessor shall ignore any subsequent commit transaction related to that instruction.
Note
A coprocessor must be tolerant to any possible commit.id, whether this represents an in-flight instruction or not.
In this case, the coprocessor may still need to process the request by considering the relevant instructions (either preceding or succeeding) as no longer speculative or to be killed.
This behavior supports scenarios in which more than one coprocessor is connected to an issue interface.
A CPU is required to mark every instruction that has completed the issue transaction as either killed or non-speculative. This includes accepted (issue_resp.accept = 1) and rejected instructions (issue_resp.accept = 0).
A coprocessor does not have to wait for commit_valid
to
become asserted. It can speculate that an offloaded accepted instruction will not get killed, but in case this speculation turns out to be wrong because the instruction actually did get killed,
then the coprocessor must undo any of its internal architectural state changes that are due to the killed instruction.
A coprocessor is not allowed to perform speculative result transactions and shall therefore never initiate a result transaction for instructions that have not yet received a commit transaction
with commit_kill
= 0. The earliest point at which a coprocessor can initiate a result handshake for an instruction is therefore the cycle in which commit_valid
= 1 and commit_kill
= 0
for that instruction.
The signals in commit
are valid when commit_valid
is 1.
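The per-instruction bookkeeping implied by the commit rules can be sketched in Python. The class `CommitTracker` and its methods are hypothetical. The sketch only tracks the instruction named by commit.id; batch commit or kill of preceding or succeeding instructions is deliberately not modelled here.

```python
class CommitTracker:
    """Tracks whether accepted instructions are speculative, committed or killed."""

    SPECULATIVE, COMMITTED, KILLED = "speculative", "committed", "killed"

    def __init__(self):
        self.state = {}

    def accept(self, instr_id):
        """Record an instruction that was accepted on the issue interface."""
        self.state[instr_id] = self.SPECULATIVE

    def commit(self, instr_id, commit_kill):
        """Apply a commit transaction (commit_valid = 1) for one id.

        Unknown ids are tolerated, and an instruction already marked as
        killed or committed ignores any later commit transaction.
        """
        if self.state.get(instr_id) == self.SPECULATIVE:
            self.state[instr_id] = self.KILLED if commit_kill else self.COMMITTED

    def may_send_result(self, instr_id):
        """A result transaction is only allowed after a commit with commit_kill = 0."""
        return self.state.get(instr_id) == self.COMMITTED
```

A coprocessor may still execute speculatively before the commit arrives; the tracker only gates the result handshake.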
Memory (request/response) interface
The memory (request/response) interface is not included in this version of the specification.
Memory result interface
The memory result interface is not included in this version of the specification.
Result interface
Table 21 describes the result interface signals.
| Signal | Type | Direction (CPU) | Description |
|---|---|---|---|
|  | logic | input | Result request valid. Indicates that the coprocessor has a valid result (write data or exception) for an offloaded instruction. |
|  | logic | output | Result request ready. The result signaled via |
|  | x_result_t | input | Result packet. |
The coprocessor shall provide results to the core via the result interface. A coprocessor is allowed to provide results to the core in an out of order fashion. A coprocessor is only allowed to provide a result for an instruction once the core has indicated (via the commit interface) that this instruction is allowed to be committed. Each accepted offloaded (committed and not killed) instruction shall have exactly one result transaction (even if no data needs to be written back to the CPU’s register file). No result transaction shall be performed for instructions which have not been accepted for offload or for instructions that have been killed.
Table 22 describes the x_result_t
type.
Signal |
Type |
Description |
---|---|---|
|
Identification of the hart offloading the instruction. |
|
|
Identification of the offloaded instruction. |
|
|
logic [X_RFW_WIDTH-1:0] |
Register file write data value(s). |
|
logic [4:0] |
Register file destination address(es). |
|
Register file write enable(s). |
|
|
logic [2:0] |
Write enables for |
|
logic [5:0] |
Write data value for { |
|
logic |
Did the instruction cause a synchronous exception? |
|
logic [5:0] |
Exception code. |
|
logic |
Did the instruction cause a debug trigger match with |
|
logic |
Did the instruction cause a bus error? |
A result transaction starts in the cycle that result_valid
= 1 and ends in the cycle that both result_valid
= 1 and result_ready
= 1. The signals in result
are
valid when result_valid
is 1. The signals in result
shall remain stable during a result transaction.
we
is 2 bits wide when XLEN
= 32 and X_RFW_WIDTH
= 64, and 1 bit wide otherwise. The CPU shall ignore writeback to x0
.
When a dual writeback is performed to the x0
, x1
pair, the entire write shall be ignored, i.e. neither x0
nor x1
shall be written by the CPU.
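The x0 handling for single and dual writeback can be sketched in Python for XLEN = 32. The function name `apply_writeback` is hypothetical. The sketch assumes that we[0] and the lower 32 data bits belong to the even register and we[1] and the upper 32 data bits to the odd register; this section states that layout for reads but not explicitly for writes, so treat it as an assumption.

```python
def apply_writeback(regfile, rd, data, we, dual=False):
    """Apply a result writeback to a 32-entry register file (XLEN = 32).

    Writes to x0 are ignored; a dual writeback to the x0, x1 pair is
    ignored entirely. For dual writeback rd is assumed to be even.
    """
    mask = 0xFFFFFFFF
    if not dual:
        if (we & 0b1) and rd != 0:
            regfile[rd] = data & mask
        return
    if rd == 0:
        return  # neither x0 nor x1 is written
    if we & 0b01:
        regfile[rd] = data & mask              # even register, lower half (assumed)
    if we & 0b10:
        regfile[rd + 1] = (data >> 32) & mask  # odd register, upper half (assumed)
```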
For an instruction instance, the we
signal must be the same as issue_resp.writeback
.
The CPU is not required to check that these signals match.
Note
issue_resp.writeback
and result.we
carry the same information.
Nevertheless, result.we
is provided to simplify the CPU logic.
Without this signal, the CPU would have to look this information up based on the instruction id
.
If ecswe[2]
is 1, then the value in ecsdata[5:4]
is written to mstatus.xs
.
If ecswe[1]
is 1, then the value in ecsdata[3:2]
is written to mstatus.fs
.
If ecswe[0]
is 1, then the value in ecsdata[1:0]
is written to mstatus.vs
.
The writes to the stated mstatus
bitfields will take into account any WARL rules that might exist for these bitfields in the CPU.
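The mapping from ecswe and ecsdata to the mstatus fields can be sketched in Python. The function name `apply_ecs_write` and the dictionary representation of the xs, fs and vs fields are hypothetical, and any WARL filtering performed by the CPU is not modelled.

```python
def apply_ecs_write(mstatus_fields, ecswe, ecsdata):
    """Return updated xs/fs/vs fields for one result transaction.

    ecswe[2] gates ecsdata[5:4] -> xs, ecswe[1] gates ecsdata[3:2] -> fs,
    ecswe[0] gates ecsdata[1:0] -> vs. The input dict is left unmodified.
    """
    updated = dict(mstatus_fields)
    if ecswe & 0b100:
        updated["xs"] = (ecsdata >> 4) & 0b11
    if ecswe & 0b010:
        updated["fs"] = (ecsdata >> 2) & 0b11
    if ecswe & 0b001:
        updated["vs"] = ecsdata & 0b11
    return updated
```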
Interface dependencies
The following rules apply to the relative ordering of the interface handshakes:
The compressed interface transactions are in program order (possibly a subset) and the CPU will at least attempt to offload instructions that it does not consider to be valid itself.
The issue interface transactions are in program order (possibly a subset) and the CPU will at least attempt to offload instructions that it does not consider to be valid itself.
Every issue interface transaction has an associated register interface transaction. It is not required for register transactions to be in the same order as the issue transactions.
Every issue interface transaction (whether accepted or not) has an associated commit interface transaction and both interfaces use a matching transaction ordering.
If an offloaded instruction is accepted and allowed to commit, then for each such instruction one result transaction must occur via the result interface (even if no writeback needs to happen to the core’s register file). The transaction ordering on the result interface does not have to correspond to the transaction ordering on the issue interface.
A commit interface handshake cannot be initiated before the corresponding issue interface handshake is initiated. It is allowed to be initiated at the same time or later.
A result interface handshake cannot be initiated before the corresponding issue interface handshake is initiated. It is allowed to be initiated at the same time or later.
A result interface handshake cannot be initiated before the corresponding commit interface handshake is initiated (and the instruction is allowed to commit). It is allowed to be initiated at the same time or later.
A result interface handshake cannot be (or have been) initiated for killed instructions.
Handshake rules
The following handshake pairs exist on the eXtension interface:
compressed_valid with compressed_ready.
issue_valid with issue_ready.
register_valid with register_ready.
commit_valid with implicit always ready signal.
result_valid with result_ready.
The only rule related to valid and ready signals is that:
A transaction is considered accepted on the positive clk edge when both valid and (implicit or explicit) ready are 1.
Specifically note the following:
The valid signals are allowed to be retracted by a CPU (e.g. in case that the related instruction is killed in the CPU’s pipeline before the corresponding ready is signaled). A new transaction can be started by a CPU by changing the id signal and keeping the valid signal asserted (thereby possibly terminating a previous transaction before it completed).
The valid signals are not allowed to be retracted by a coprocessor (e.g. once result_valid is asserted it must remain asserted until the handshake with result_ready has been performed). A new transaction can therefore not be started by a coprocessor by just changing the id signal and keeping the valid signal asserted if no ready has been received yet for the original transaction. The cycle after receiving the ready signal, a next (back-to-back) transaction is allowed to be started by just keeping the valid signal high and changing the id to that of the next transaction.
The ready signals are allowed to be 1 when the corresponding valid signal is not asserted.
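As a non-normative sketch of the handshake rules above, the following model samples one handshake pair once per positive clk edge. The Cycle record and both function names are hypothetical. For the commit interface the implicit ready would simply be modelled as 1 in every cycle.

```python
from typing import NamedTuple


class Cycle(NamedTuple):
    valid: int  # valid signal sampled at the positive clk edge
    ready: int  # explicit ready, or 1 for an implicit always-ready
    id: int     # id accompanying the transaction


def accepted(trace):
    """Return (cycle index, id) for each edge where valid and ready are both 1."""
    return [(n, c.id) for n, c in enumerate(trace) if c.valid and c.ready]


def coprocessor_violations(trace):
    """Return cycle indices where a coprocessor-driven valid was retracted,
    or its id changed, before the pending handshake completed."""
    bad = []
    for n in range(1, len(trace)):
        prev, cur = trace[n - 1], trace[n]
        pending = prev.valid and not prev.ready  # offered but not yet accepted
        if pending and (not cur.valid or cur.id != prev.id):
            bad.append(n)
    return bad


# result interface: id 3 waits one cycle for ready, id 4 follows back-to-back
result_trace = [Cycle(1, 0, 3), Cycle(1, 1, 3), Cycle(1, 1, 4), Cycle(0, 1, 0)]
print(accepted(result_trace))                # prints [(1, 3), (2, 4)]
print(coprocessor_violations(result_trace))  # prints []
```

The second check applies only to valid signals driven by a coprocessor. A CPU is allowed to retract its valid signals or change id while no ready has been received, so the same pattern on a CPU-driven interface is legal.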
Signal dependencies
A CPU shall not have combinatorial paths from its eXtension interface input signals to its eXtension interface output signals, except for the following allowed paths:
paths from result_valid, result to rs, rs_valid.
Note
The above implies that the non-compressed instruction instr[31:0] received via the compressed interface is not allowed to combinatorially feed into the issue interface’s instr[31:0] instruction.
A coprocessor is allowed (and expected) to have combinatorial paths from its eXtension interface input signals to its eXtension interface output signals. In order to prevent combinatorial loops the following combinatorial paths are not allowed in a coprocessor:
paths from rs, rs_valid to result_valid, result.
Note
The above implies that a coprocessor has a pipeline stage separating the register file operands from its result generating circuit (similar to the separation between decode stage and execute stage found in many CPUs).
Note
As a CPU is allowed to retract transactions on its compressed and issue interfaces, the compressed_ready and issue_ready signals will have to depend on signals received from the CPU in a combinatorial manner (otherwise these ready signals might be signaled for the wrong id).
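The reason for the forbidden coprocessor paths can be illustrated by treating combinatorial paths as edges of a directed graph and searching for a cycle. This is a minimal, non-normative sketch: the function name and the edge lists are hypothetical, and the path from issue_valid to issue_ready is only an assumed example of a legal coprocessor path.

```python
def has_combinatorial_loop(paths):
    """Return True if the directed graph of (source, sink) paths has a cycle."""
    graph = {}
    for src, dst in paths:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, done
    colour = {node: WHITE for node in graph}

    def visit(node):
        colour[node] = GREY
        for nxt in graph[node]:
            if colour[nxt] == GREY:  # back edge: the path closes on itself
                return True
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in graph)


# Paths a CPU is allowed to have (result_valid, result to rs, rs_valid).
CPU_PATHS = [
    ("result_valid", "rs"), ("result_valid", "rs_valid"),
    ("result", "rs"), ("result", "rs_valid"),
]

# Assumed legal coprocessor path: ready derived combinatorially from valid.
legal = CPU_PATHS + [("issue_valid", "issue_ready")]
# Forbidden coprocessor path: register operands feeding the result directly.
illegal = CPU_PATHS + [("rs", "result")]

print(has_combinatorial_loop(legal))    # prints False
print(has_combinatorial_loop(illegal))  # prints True
```

With the allowed CPU paths in place, any coprocessor path from rs or rs_valid back to result_valid or result closes a loop through both blocks, which is why those paths are excluded.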
Handshake dependencies
In order to avoid system level deadlock both the CPU and the coprocessor shall obey the following rules:
The valid signal of a transaction shall not be dependent on the corresponding ready signal.
Transactions related to an earlier part of the instruction flow shall not depend on transactions with the same id related to a later part of the instruction flow. The instruction flow is defined from earlier to later as follows:
compressed transaction
issue transaction
register transaction
commit transaction
result transaction.
Transactions with an earlier issued id shall not depend on transactions with a later issued id (e.g. a coprocessor is not allowed to delay generating result_valid = 1 because it first wants to see commit_valid = 1 for a newer instruction).
Note
The words depend and dependent refer to logical relationships, which are broader than combinatorial relationships.
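The second and third deadlock-avoidance rules can be expressed as a simple predicate. This is a non-normative sketch: the function name and the (issue order, stage) representation are hypothetical, the issue order stands in for the id, and the first rule (valid not dependent on its own ready) is not modelled.

```python
# Instruction flow from earlier to later, as listed in the rules above.
FLOW = ["compressed", "issue", "register", "commit", "result"]


def dependency_allowed(txn, on):
    """Return True if transaction txn may wait for transaction on.

    Both arguments are (issue_order, stage) pairs describing two distinct
    transactions; a lower issue_order means an earlier issued id.
    """
    seq, stage = txn
    on_seq, on_stage = on
    if seq == on_seq:
        # Same id: an earlier part of the flow must not wait for a later part.
        return FLOW.index(stage) > FLOW.index(on_stage)
    # Different ids: an earlier issued id must not wait for a later issued id.
    return seq > on_seq


# The example from the rules: the result of an older instruction waiting for
# the commit transaction of a newer instruction is not allowed.
print(dependency_allowed((0, "result"), (1, "commit")))  # prints False
# A result waiting for the commit transaction of the same instruction is fine.
print(dependency_allowed((1, "result"), (1, "commit")))  # prints True
```

Because depend is meant in the logical sense, such a predicate applies to any mechanism that makes one transaction wait for another, not only to combinatorial paths.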
Appendix
This appendix contains several useful, non-normative pieces of information that help in implementing the eXtension Interface.
SystemVerilog example
In the src folder of this project, the file https://github.com/openhwgroup/core-v-xif/blob/main/src/core_v_xif.sv contains a non-normative realization of this specification based on SystemVerilog interfaces.
Of course the use of SystemVerilog (interfaces) is not mandatory.
Coprocessor recommendations
A coprocessor is recommended (but not required) to follow these suggestions to maximize its re-use potential:
Avoid using opcodes that are reserved or already used by RISC-V International unless for supporting a standard RISC-V extension.
Make it easy to change opcode assignments such that a coprocessor can easily be updated if it conflicts with another coprocessor.
Clearly document the supported and required parameter values.
Timing recommendations
The integration of the eXtension interface will vary from CPU to CPU, and each integration will thus require its own set of timing constraints.
CV32E40X eXtension timing budget shows the recommended timing budgets for the coprocessor and (optional) interconnect for the case in which a coprocessor is paired with the CV32E40X (https://github.com/openhwgroup/cv32e40x) processor.
As is shown in that timing budget, the coprocessor only receives a small part of the timing budget on the paths through xif_issue_if.issue_req.rs*.
This enables the coprocessor to source its operands directly from the CV32E40X register file bypass network, thereby preventing stall cycles in case an offloaded instruction depends on the result of a preceding non-offloaded instruction. This implies that, if a coprocessor is intended for pairing with the CV32E40X, it will be beneficial timing-wise if the coprocessor does not directly operate on the rs* source inputs, but registers them instead. To maximize utilization of a coprocessor with various CPUs, such registers could be made optional via a parameter.
Verification
A UVM agent for the interface was developed for the verification of CVA6. It can be accessed under https://github.com/openhwgroup/core-v-verif/tree/master/lib/uvm_agents/uvma_cvxif.