Floating Point Unit (FPU)
The RV32F ISA extension for floating-point support in the form of IEEE-754 single
precision can be enabled by setting the parameter FPU of the cv32e40p_top top level module to 1.
This will extend the CV32E40P decoder accordingly and will instantiate the FPU.
The FPU repository used by the CV32E40P is available at https://github.com/openhwgroup/cvfpu, which also hosts its documentation.
The CVFPU v0.8.1 release has been copied into the CV32E40P repository under rtl/vendor (it is used for verification and implementation), so all core and FPU RTL files should be taken from the CV32E40P repository.
The cv32e40p_fpu_manifest file lists all files needed for both the core and CVFPU.
CVFPU parameters
As CVFPU is a highly configurable IP, the tables below list its parameters and the values used when CVFPU is instantiated through a wrapper in the cv32e40p_top module.
| Name | Type/Range | Value | Description |
|---|---|---|---|
| | int | 32 | Datapath Width. Specifies the width of the input and output data ports and of the datapath. |
| | logic | 0 | Vectorial Hardware Generation. Controls the generation of packed-SIMD computation units. |
| | logic | 0 | NaN-Boxing Check Control. Controls whether input value NaN-boxing is enforced. |
| | fmt_logic_t | {1, 0, 0, 0, 0} | Enabled Floating-Point Formats. Enables respectively: IEEE Single-Precision format, IEEE Double-Precision format, IEEE Half-Precision format, Custom Byte-Precision format, Custom Alternate Half-Precision format. |
| | ifmt_logic_t | {0, 0, 1, 0} | Enabled Integer Formats. Enables respectively: Byte format, Half-Word format, Word format, Double-Word format. |
| Name | Type/Range | Value | Description |
|---|---|---|---|
| | opgrp_fmt_unsigned_t | { { {default: 1}, {default: {default: } | Number of Pipelining Stages. This parameter sets the number of pipeline stages to be inserted into the computational units per operation group, per FP format. As such, latencies for different operations and different formats can be freely configured. Respectively: ADDition/MULtiplication operation group, DIVision/SQuare RooT operation group, NON COMPuting operation group, CONVersion operation group. |
| | opgrp_fmt_unit_types_t | { {default: MERGED}, {default: MERGED}, {default: PARALLEL}, {default: MERGED} } | HW Unit Implementation. This parameter controls resource usage by either removing operation units for certain formats and operations, or merging multiple formats into one. Respectively: ADDition/MULtiplication operation group, DIVision/SQuare RooT operation group, NON COMPuting operation group, CONVersion operation group. |
| | pipe_config_t | AFTER | Pipeline Register Placement. This parameter controls where the pipelining registers (their number is set by the Number of Pipelining Stages parameter) are placed. AFTER means they are all placed at the output of each operational unit. See Synthesizing with the FPU for advice on getting the best synthesis results. |
| Name | Type/Range | Value | Description |
|---|---|---|---|
| | logic | | The SystemVerilog data type of the operation tag input and output ports. |
| | int | 0 | Vectorial mode classify operation RISC-V compliance. |
| | int | 0 | Inactive vectorial lanes floating-point status flags masking. |
FP Register File
By default, a dedicated register file consisting of 32 floating-point registers, f0-f31, is instantiated.
This default behavior can be overridden by setting the parameter ZFINX of the cv32e40p_top top level
module to 1, in which case the dedicated register file is
not included and the general purpose register file is used instead to
host the floating-point operands.
The latencies of the individual instructions are given in the Cycle counts per instruction type table.
FP CSR
When using floating-point extensions, the standard specifies a floating-point control and status register (fcsr), which contains the rounding mode and the exceptions that have occurred since it was last reset. The floating-point accrued exceptions (fflags) and the floating-point dynamic rounding mode (frm) can be accessed directly or via fcsr, which maps onto those two registers.
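The relationship between fcsr, fflags and frm can be illustrated with a small host-side C model. This is only a sketch: the helper names (fcsr_pack, fcsr_get_frm, fcsr_get_fflags) and the FFLAG_* constants are hypothetical and not part of the CV32E40P deliverables. The bit positions follow the RISC-V F extension, where fflags occupies fcsr bits 4:0 and frm occupies bits 7:5.

```c
#include <assert.h>
#include <stdint.h>

/* fcsr field layout from the RISC-V F extension:
 * fflags in bits 4:0, frm in bits 7:5. */
#define FCSR_FFLAGS_MASK 0x1Fu
#define FCSR_FRM_SHIFT   5
#define FCSR_FRM_MASK    0x7u

/* Accrued exception flags inside fflags (hypothetical constant names). */
enum {
    FFLAG_NX = 1 << 0, /* inexact */
    FFLAG_UF = 1 << 1, /* underflow */
    FFLAG_OF = 1 << 2, /* overflow */
    FFLAG_DZ = 1 << 3, /* divide by zero */
    FFLAG_NV = 1 << 4  /* invalid operation */
};

/* Build an fcsr value from separate frm and fflags values. */
static inline uint32_t fcsr_pack(uint32_t frm, uint32_t fflags)
{
    assert(frm <= FCSR_FRM_MASK);
    assert(fflags <= FCSR_FFLAGS_MASK);
    return (frm << FCSR_FRM_SHIFT) | fflags;
}

/* Extract the dynamic rounding mode from an fcsr value. */
static inline uint32_t fcsr_get_frm(uint32_t fcsr)
{
    return (fcsr >> FCSR_FRM_SHIFT) & FCSR_FRM_MASK;
}

/* Extract the accrued exception flags from an fcsr value. */
static inline uint32_t fcsr_get_fflags(uint32_t fcsr)
{
    return fcsr & FCSR_FFLAGS_MASK;
}
```

For instance, with frm = 1 and the inexact and divide-by-zero flags set, fcsr_pack returns 0x29, and the two getters recover 1 and 0x09 from that value. On the target, the same values would be read and written through CSR instructions rather than through this model.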
FPU Sleeping mode
To reduce power consumption, the FPU clock is stopped when no FP instruction is being executed. To do so, a dedicated clock gating cell is instantiated in the cv32e40p_top top level module, with its enable signal depending on both the apu_req_o and apu_busy_o core outputs.
Reminder for programmers
As mentioned in the RISC-V Privileged Architecture specification, mstatus.FS should be set to Initial to be able to use FP instructions.
If mstatus.FS = Off (reset value), any instruction that attempts to read or write the floating-point state (F registers or F CSRs) will cause an illegal instruction exception.
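As an illustration of the FS field handling described above, here is a minimal host-side C sketch. It assumes the standard mstatus layout, with FS in bits 14:13 and the encodings Off = 0, Initial = 1, Clean = 2, Dirty = 3. The type and function names (fs_state_t, mstatus_get_fs, mstatus_set_fs, fp_access_is_illegal) are hypothetical. The sketch only manipulates a plain 32-bit value; on the target, mstatus itself would be read and written with CSR instructions.

```c
#include <assert.h>
#include <stdint.h>

/* mstatus.FS occupies bits 14:13 (RISC-V Privileged Architecture). */
#define MSTATUS_FS_SHIFT 13
#define MSTATUS_FS_MASK  (0x3u << MSTATUS_FS_SHIFT)

/* FS field encodings (hypothetical type name). */
typedef enum {
    FS_OFF     = 0,
    FS_INITIAL = 1,
    FS_CLEAN   = 2,
    FS_DIRTY   = 3
} fs_state_t;

/* Read the FS field out of an mstatus value. */
static inline fs_state_t mstatus_get_fs(uint32_t mstatus)
{
    return (fs_state_t)((mstatus & MSTATUS_FS_MASK) >> MSTATUS_FS_SHIFT);
}

/* Return a copy of mstatus with the FS field replaced; other bits are kept. */
static inline uint32_t mstatus_set_fs(uint32_t mstatus, fs_state_t fs)
{
    assert(fs >= FS_OFF && fs <= FS_DIRTY);
    return (mstatus & ~MSTATUS_FS_MASK) | ((uint32_t)fs << MSTATUS_FS_SHIFT);
}

/* Non-zero when an access to the FP state would raise an illegal
 * instruction exception, i.e. when FS = Off. */
static inline int fp_access_is_illegal(uint32_t mstatus)
{
    return mstatus_get_fs(mstatus) == FS_OFF;
}
```

Starting from a value with FS = Off, mstatus_set_fs(value, FS_INITIAL) yields a value for which fp_access_is_illegal returns 0, which mirrors the enabling step a startup routine would perform before executing any FP instruction.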
Upon interrupt or context switch events, mstatus.SD should be read to see whether the floating-point state has been altered.
If the program executed next (an interrupt routine or anything else) is going to use FP instructions, and only if mstatus.SD = 1 (meaning FS = Dirty), then the whole FP state (F registers and F CSRs) should be saved to memory and the program should set mstatus.FS to Clean.
When returning to the interrupted or main program, if mstatus.FS = Clean, then the whole FP state should be restored from memory.
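The save and restore rules above can be summarized as a small decision model in C. This is a sketch under stated assumptions, not a context-switch implementation: it assumes RV32, where mstatus.SD is bit 31 and FS is in bits 14:13, and the function names (fp_save_needed, mstatus_mark_fs_clean, fp_restore_needed) are hypothetical. The actual saving and restoring of the F registers and F CSRs is target-specific and is left out.

```c
#include <stdint.h>

/* RV32 mstatus fields used here: SD is bit 31, FS is bits 14:13. */
#define MSTATUS_SD       (1u << 31)
#define MSTATUS_FS_SHIFT 13
#define MSTATUS_FS_MASK  (0x3u << MSTATUS_FS_SHIFT)
#define FS_CLEAN         2u

/* Save the FP state only if the next program uses FP instructions
 * and SD reports that the state has been altered. */
static inline int fp_save_needed(uint32_t mstatus, int next_uses_fp)
{
    return next_uses_fp && (mstatus & MSTATUS_SD) != 0;
}

/* After saving, FS is set to Clean. Only the FS field is changed in this
 * model; SD is a read-only summary bit that the hardware derives itself. */
static inline uint32_t mstatus_mark_fs_clean(uint32_t mstatus)
{
    return (mstatus & ~MSTATUS_FS_MASK) | (FS_CLEAN << MSTATUS_FS_SHIFT);
}

/* When returning to the interrupted or main program, restore the FP
 * state from memory if FS = Clean. */
static inline int fp_restore_needed(uint32_t mstatus)
{
    return ((mstatus & MSTATUS_FS_MASK) >> MSTATUS_FS_SHIFT) == FS_CLEAN;
}
```

An interrupt handler that uses FP instructions would call fp_save_needed on entry, save the state and apply mstatus_mark_fs_clean when it returns non-zero, and check fp_restore_needed before returning.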