Performance Counters
CV32E40P implements performance counters according to the RISC-V Privileged Specification, version 1.11 (see Hardware Performance Monitor, Section 3.1.11).
The performance counters are placed inside the Control and Status Registers (CSRs) and can be accessed with the CSRRW(I) and CSRRS/C(I) instructions.
CV32E40P implements the clock cycle counter mcycle(h), the retired instruction counter minstret(h), a parameterizable number of event counters mhpmcounter3(h) - mhpmcounter31(h) with the corresponding event selector CSRs mhpmevent3 - mhpmevent31, and the mcountinhibit CSR to individually enable/disable the counters.
mcycle(h) and minstret(h) are always available.
All counters are 64 bits wide.
The number of event counters is determined by the parameter NUM_MHPMCOUNTERS with a range from 0 to 29 (default value of 1).
Unimplemented counters always read 0.
Note
All performance counters use the gated version of clk_i. The wfi instruction, the cv.elw instruction, and pulp_clock_en_i impact the gating of clk_i as explained in Sleep Unit and can therefore affect the counters.
Event Selector
The following events can be monitored using the performance counters of CV32E40P.
Bit # | Event Name | Description |
---|---|---|
0 | CYCLES | Number of cycles |
1 | INSTR | Number of instructions retired |
2 | LD_STALL | Number of load use hazards |
3 | JMP_STALL | Number of jump register hazards |
4 | IMISS | Cycles waiting for instruction fetches, excluding jumps and branches |
5 | LD | Number of load instructions |
6 | ST | Number of store instructions |
7 | JUMP | Number of jumps (unconditional) |
8 | BRANCH | Number of branches (conditional) |
9 | BRANCH_TAKEN | Number of branches taken (conditional) |
10 | COMP_INSTR | Number of compressed instructions retired |
11 | PIPE_STALL | Cycles from stalled pipeline |
12 | APU_TYPE | Number of type conflicts on APU/FP |
13 | APU_CONT | Number of contentions on APU/FP |
14 | APU_DEP | Number of dependency stalls on APU/FP |
15 | APU_WB | Number of write backs on APU/FP |
The event selector CSRs mhpmevent3 - mhpmevent31 define which of these events are counted by the event counters mhpmcounter3(h) - mhpmcounter31(h).
If a bit in an event selector CSR is set to 1, the counter associated with that selector CSR counts the event whose ID equals that bit number.
If an event selector CSR is 0, the corresponding counter does not count any event.
Note
At most 1 bit should be set in an event selector. If multiple bits are set in an event selector, then the operation of the associated counter is undefined.
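The one-hot encoding described above can be sketched in portable C. This is only an illustration: the enum constants mirror the event names in the table, while the helper names mhpmevent_value and mhpmevent_is_valid are hypothetical and not part of any CV32E40P software package.

```c
#include <assert.h>
#include <stdint.h>

/* Event IDs from the table above; each ID is a bit position in mhpmeventX. */
enum cv32e40p_event {
    EVT_CYCLES = 0, EVT_INSTR = 1, EVT_LD_STALL = 2, EVT_JMP_STALL = 3,
    EVT_IMISS = 4, EVT_LD = 5, EVT_ST = 6, EVT_JUMP = 7,
    EVT_BRANCH = 8, EVT_BRANCH_TAKEN = 9, EVT_COMP_INSTR = 10,
    EVT_PIPE_STALL = 11, EVT_APU_TYPE = 12, EVT_APU_CONT = 13,
    EVT_APU_DEP = 14, EVT_APU_WB = 15
};

/* Hypothetical helper: selector value that counts exactly one event. */
static inline uint32_t mhpmevent_value(enum cv32e40p_event evt)
{
    assert((unsigned)evt <= 15u);      /* only bits 0..15 are defined */
    return UINT32_C(1) << (unsigned)evt;
}

/* Hypothetical helper: nonzero if zero or one bit is set, which is the
 * only configuration with defined counter behaviour. */
static inline int mhpmevent_is_valid(uint32_t value)
{
    return (value & (value - 1u)) == 0u;
}
```

On the target, the value returned by mhpmevent_value would then be written to the chosen mhpmeventX CSR with a CSR write instruction.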
Controlling the counters from software
By default, all available counters are disabled after reset in order to provide the lowest power consumption.
They can be individually enabled/disabled by writing the corresponding bit in the mcountinhibit CSR at address 0x320, as described in the RISC-V Privileged Specification, version 1.11 (see Machine Counter-Inhibit CSR, Section 3.1.13). A bit set to 1 inhibits (disables) the counter; a bit cleared to 0 enables it.
In particular, mcycle(h) is controlled by bit 0, minstret(h) by bit 2, and event counter mhpmcounterX(h) by bit X.
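The bit assignment above can be turned into a few mask helpers. The following is a minimal sketch in portable C; the macro and function names are hypothetical. It relies on the polarity defined by the RISC-V Privileged Specification for mcountinhibit: a set bit stops the counter, a cleared bit lets it run.

```c
#include <assert.h>
#include <stdint.h>

#define MCOUNTINHIBIT_CY (UINT32_C(1) << 0)  /* controls mcycle(h)   */
#define MCOUNTINHIBIT_IR (UINT32_C(1) << 2)  /* controls minstret(h) */

/* Inhibit bit for event counter mhpmcounterX(h), X in 3..31. */
static inline uint32_t mcountinhibit_hpm(unsigned x)
{
    assert(x >= 3u && x <= 31u);
    return UINT32_C(1) << x;
}

/* New register value that enables the counters in mask (clears bits). */
static inline uint32_t mcountinhibit_enable(uint32_t current, uint32_t mask)
{
    return current & ~mask;
}

/* New register value that disables the counters in mask (sets bits). */
static inline uint32_t mcountinhibit_disable(uint32_t current, uint32_t mask)
{
    return current | mask;
}
```

On the target the same effect is typically obtained without a read-modify-write sequence, by using a CSRRC(I) instruction to clear bits (enable) or a CSRRS(I) instruction to set bits (disable) at CSR address 0x320.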
The lower 32 bits of all counters can be accessed through the base register, whereas the upper 32 bits are accessed through the h-register.
Reads of all these registers are non-destructive.
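Because the two halves are read with separate instructions, the lower half can wrap between the two reads. A common remedy, shown here as a sketch, is to read the upper half twice and retry if it changed. The csr_read_fn accessor type and the sim_* functions are hypothetical stand-ins so the example can be tried on a host; on the target the accessors would wrap CSR read instructions for, for example, mcycle and mcycleh.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical accessor type: returns the current value of one 32-bit CSR. */
typedef uint32_t (*csr_read_fn)(void);

/* Read a 64-bit counter from its low and high halves without tearing. */
static inline uint64_t read_counter64(csr_read_fn read_lo, csr_read_fn read_hi)
{
    uint32_t hi, lo, hi2;
    assert(read_lo != 0 && read_hi != 0);
    do {
        hi  = read_hi();
        lo  = read_lo();
        hi2 = read_hi();
    } while (hi != hi2);  /* upper half changed: lower half wrapped, retry */
    return ((uint64_t)hi << 32) | lo;
}

/* Host-side stand-in: a counter that advances by one on every access,
 * started just below a 32-bit wrap of the lower half. */
static uint64_t sim_counter = UINT64_C(0x00000001FFFFFFFF);

static uint32_t sim_read_lo(void)
{
    uint32_t v = (uint32_t)sim_counter;
    sim_counter += 1;
    return v;
}

static uint32_t sim_read_hi(void)
{
    uint32_t v = (uint32_t)(sim_counter >> 32);
    sim_counter += 1;
    return v;
}
```

With the simulated counter, the first attempt sees the upper half change from 1 to 2 and is discarded; the second attempt returns a consistent value.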
Parametrization at synthesis time
The mcycle(h) and minstret(h) counters are always available and 64 bits wide.
The number of available event counters mhpmcounterX(h) can be controlled via the NUM_MHPMCOUNTERS parameter.
By default NUM_MHPMCOUNTERS is set to 1.
Each increment of NUM_MHPMCOUNTERS by 1 adds the following:
- 64 flops for mhpmcounterX
- 15 flops for mhpmeventX
- 1 flop for mcountinhibit[X]
- Adder and event enablement logic
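As a rough worked example, the per-counter figures listed above add up to 80 flops for each event counter. The sketch below only multiplies out those numbers; the function name is hypothetical, and the result excludes the adder and event enablement logic as well as the always-present mcycle(h) and minstret(h) counters.

```c
#include <assert.h>

enum {
    FLOPS_PER_MHPMCOUNTER = 64,  /* mhpmcounterX      */
    FLOPS_PER_MHPMEVENT   = 15,  /* mhpmeventX        */
    FLOPS_PER_INHIBIT_BIT = 1    /* mcountinhibit[X]  */
};

/* Flops added by the event counters for a given NUM_MHPMCOUNTERS value. */
static inline unsigned event_counter_flops(unsigned num_mhpmcounters)
{
    assert(num_mhpmcounters <= 29u);  /* documented parameter range */
    return num_mhpmcounters *
           (FLOPS_PER_MHPMCOUNTER + FLOPS_PER_MHPMEVENT + FLOPS_PER_INHIBIT_BIT);
}
```

For the default NUM_MHPMCOUNTERS = 1 this gives 80 flops, and for the maximum of 29 it gives 2320 flops.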
Time Registers (time(h))
The user mode time(h) registers are not implemented. Any access to these registers will cause an illegal instruction trap. It is recommended that a software trap handler is implemented to detect accesses to these CSRs and convert them into accesses to the platform-defined mtime register (if implemented in the platform).
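The detection step of such a trap handler could look like the sketch below. It assumes the handler has already obtained the trapping 32-bit instruction word (for example by loading it from the address held in mepc) and has a 64-bit sample of the platform mtime register; both steps are platform specific and left out. The function and enum names are hypothetical. The CSR addresses 0xC01 (time) and 0xC81 (timeh) and the instruction field positions follow the RISC-V base ISA encoding.

```c
#include <assert.h>
#include <stdint.h>

#define CSR_TIME  0xC01u  /* user-mode time CSR address  */
#define CSR_TIMEH 0xC81u  /* user-mode timeh CSR address */

enum time_access { NOT_TIME_ACCESS = 0, TIME_ACCESS = 1, TIMEH_ACCESS = 2 };

/* Classify a trapped 32-bit instruction word. */
static inline enum time_access classify_time_access(uint32_t insn)
{
    uint32_t opcode = insn & 0x7Fu;          /* bits 6:0   */
    uint32_t funct3 = (insn >> 12) & 0x7u;   /* bits 14:12 */
    uint32_t csr    = insn >> 20;            /* bits 31:20 */

    /* CSR instructions use the SYSTEM opcode with funct3 1,2,3,5,6 or 7. */
    if (opcode != 0x73u || funct3 == 0u || funct3 == 4u)
        return NOT_TIME_ACCESS;
    if (csr == CSR_TIME)
        return TIME_ACCESS;
    if (csr == CSR_TIMEH)
        return TIMEH_ACCESS;
    return NOT_TIME_ACCESS;
}

/* Index of the destination register (rd) that receives the emulated value. */
static inline unsigned insn_rd(uint32_t insn)
{
    return (insn >> 7) & 0x1Fu;
}

/* Value to place in rd, given a 64-bit sample of the platform mtime. */
static inline uint32_t emulated_time_value(enum time_access kind, uint64_t mtime)
{
    return (kind == TIMEH_ACCESS) ? (uint32_t)(mtime >> 32) : (uint32_t)mtime;
}
```

For example, the word 0xC0102573 (a CSRRS read of time into register x10) is classified as TIME_ACCESS with rd = 10. A complete handler would also need to write the value into the saved copy of rd, advance mepc past the instruction, and decide how to treat attempted writes, since time(h) are read-only CSRs.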