Add FP8alt, low and mixed-precision SDOTP with stochastic rounding support, and compressed vector cmp (#3)
Added support for:
- FP8alt (1, 4, 3)
- low and mixed-precision SDOTP with stochastic rounding support
- compressed vector compare results (one bit per comparison in the LSBs)
---------
Co-authored-by: Gianna Paulin <[email protected]>
+All notable changes to this project will be documented in this file.
+
+The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/) and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).
+
+In this sense, we interpret the "Public API" of a hardware module as its port/parameter list.
+Versions of the IP in the same major release are "pin-compatible" with each other. Minor releases are permitted to add new parameters as long as their default bindings ensure backwards compatibility.
+
+## [0.1.0] - 2023-05-04
+
+### Added
+- Add low and mixed-precision SDOTP with support for stochastic rounding
+- Add `FP8alt (1,4,3)` format
+- Add support for compressed vector compare results (one bit per comparison in the LSBs)
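To make "one bit per comparison in the LSBs" concrete, here is a minimal behavioural sketch in Python. It is not taken from the RTL: the function name `pack_compare_bits`, the 32-bit result width, and the mapping of lane `i` to bit `i` are all assumptions made for illustration.

```python
def pack_compare_bits(lane_results, width=32):
    """Pack one boolean per vector lane into the LSBs of a result word."""
    if len(lane_results) > width:
        raise ValueError("more lanes than result bits")
    word = 0
    for lane, flag in enumerate(lane_results):
        if flag:
            word |= 1 << lane  # assumption: lane i drives result bit i
    return word


# Four lanes compared element-wise (a < b); the flags land in bits [3:0].
a = [1.0, 2.0, 3.0, 4.0]
b = [2.0, 2.0, 4.0, 1.0]
packed = pack_compare_bits([x < y for x, y in zip(a, b)])
print(f"{packed:#06b}")
```

With the compressed layout, a core can test all lane results with a single integer mask instead of unpacking one result field per lane.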
docs/README.md: 53 additions & 18 deletions
@@ -40,6 +40,8 @@ For more in-depth explanations on how to configure the unit and the layout of th
 |`TagType`| The SystemVerilog data type of the operation tag |
 |`TrueSIMDClass`| If enabled, the result of a classify operation in vectorial mode will be RISC-V compliant if each output has at least 10 bits|
 |`EnableSIMDMask`| Enable the RISC-V floating-point status flags masking of inactive vectorial lanes. When disabled, `simd_mask_i` is inactive |
+|`StochasticRndImplementation`| Enable stochastic rounding support for SDOTP, define LFSR bitwidth and number of trailing bits considered for the SR decision |
+|`CompressedVecCmpResult`| Compress the result of a vector compare in the LSBs, conceived for RV32FD cores |

 ### Ports

@@ -50,6 +52,7 @@ As the width of some input/output signals is defined by the configuration, it is
|`RNE`|`3'b000`| To nearest, tie to even (default) |
+|`RTZ`|`3'b001`| Toward zero |
+|`RDN`|`3'b010`| Toward negative infinity |
+|`RUP`|`3'b011`| Toward positive infinity |
+|`RMM`|`3'b100`| To nearest, tie away from zero |
+|`ROD`|`3'b101`| To odd |
+|`RSR`|`3'b110`| Stochastic Rounding (available only on SDOTP operations) |
+|`DYN`|`3'b111`|*RISC-V Dynamic RM, invalid if passed to operations*|

 ##### `operation_e` - FP Operation

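The `RSR` mode rounds up with a probability equal to the discarded fraction. The Python sketch below models that idea on an unsigned integer mantissa. It is an illustrative model only, not the unit's implementation: `stochastic_round`, `trailing_bits`, and `rnd` are hypothetical names, and the random value is passed in explicitly where hardware would draw it from an LFSR.

```python
def stochastic_round(mantissa, trailing_bits, rnd):
    """Drop `trailing_bits` LSBs, rounding up with probability discarded / 2**trailing_bits.

    `rnd` must be a uniform integer in [0, 2**trailing_bits).
    """
    mask = (1 << trailing_bits) - 1
    discarded = mantissa & mask          # bits that do not fit in the result
    truncated = mantissa >> trailing_bits
    # Round up when the random draw falls below the discarded fraction.
    return truncated + (1 if rnd < discarded else 0)


# Enumerate every possible random draw for 0b10110 with 3 trailing bits.
results = [stochastic_round(0b10110, 3, r) for r in range(8)]
mean = sum(results) / len(results)
print(results, mean)
```

Averaged over all draws, the result equals the exact value `0b10110 / 8`, which is the property that makes stochastic rounding unbiased in long accumulations.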
@@ -104,6 +108,8 @@ Unless noted otherwise, the first operand `op[0]` is used for the operation.
 |`ADD`|`0`| Addition (`op[1] + op[2]`) *note the operand indices*|
 |`ADD`|`1`| Subtraction (`op[1] - op[2]`) *note the operand indices*|
 |`MUL`|`0`| Multiplication (`op[0] * op[1]`) |
+|`SDOTP`|`0`| Sum of dot product |
+|`VSUM`|`0`| Vector Inner Sum |
 |`DIV`|`0`| Division (`op[0] / op[1]`) |
 |`SQRT`|`0`| Square root |
 |`SGNJ`|`0`| Sign injection, operation encoded in rounding mode<br>`RNE`: `op[0]` with `sign(op[1])`<br>`RTZ`: `op[0]` with `~sign(op[1])`<br>`RDN`: `op[0]` with `sign(op[0]) ^ sign(op[1])`<br>`RUP`: `op[0]` (passthrough) |
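The `SGNJ` row packs four behaviours into the rounding-mode field. A small Python model on raw bit patterns, following the table entry, may help when reading it. The function name `sgnj` and the default 8-bit width are assumptions for illustration.

```python
def sgnj(op0, op1, mode, width=8):
    """Sign injection on raw bit patterns, selected by rounding mode."""
    sign_mask = 1 << (width - 1)
    magnitude = op0 & (sign_mask - 1)    # op[0] without its sign bit
    s0 = op0 & sign_mask
    s1 = op1 & sign_mask
    if mode == "RNE":
        sign = s1                        # sign(op[1])
    elif mode == "RTZ":
        sign = s1 ^ sign_mask            # ~sign(op[1])
    elif mode == "RDN":
        sign = s0 ^ s1                   # sign(op[0]) ^ sign(op[1])
    elif mode == "RUP":
        sign = s0                        # passthrough of op[0]
    else:
        raise ValueError(f"unsupported mode for SGNJ: {mode}")
    return magnitude | sign


# op[0] is positive, op[1] is negative (only the sign bit is set).
for mode in ("RNE", "RTZ", "RDN", "RUP"):
    print(mode, hex(sgnj(0x3C, 0x80, mode)))
```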
@@ -132,10 +138,11 @@ Enumeration of type `logic [2:0]` holding the supported FP formats.
 |`FP16`| IEEE binary16 | 16 bit | 5 | 10 |
 |`FP8`| binary8 | 8 bit | 5 | 2 |
 |`FP16ALT`| binary16alt | 16 bit | 8 | 7 |
+|`FP8ALT`| binary8alt | 8 bit | 4 | 3 |

 The following global parameters associated with FP formats are set in `fpnew_pkg`:
 ```SystemVerilog
-localparam int unsigned NUM_FP_FORMATS = 5;
+localparam int unsigned NUM_FP_FORMATS = 6;
 localparam int unsigned FP_FORMAT_BITS = $clog2(NUM_FP_FORMATS);
 ```

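To see what the new `FP8ALT` row means for actual bit patterns, the Python sketch below decodes a minifloat from its exponent and mantissa widths. It assumes an IEEE-754-style encoding (bias `2**(exp_bits-1) - 1`, subnormals, all-ones exponent reserved for infinity/NaN); that is an assumption of this sketch, not a statement taken from the diff. The helpers `decode_minifloat` and `clog2` are hypothetical names.

```python
import math


def decode_minifloat(bits, exp_bits, man_bits):
    """Decode a sign/exponent/mantissa bit pattern, assuming IEEE-754-style rules."""
    total = 1 + exp_bits + man_bits
    sign = (bits >> (total - 1)) & 1
    exp = (bits >> man_bits) & ((1 << exp_bits) - 1)
    man = bits & ((1 << man_bits) - 1)
    bias = (1 << (exp_bits - 1)) - 1
    s = -1.0 if sign else 1.0
    if exp == (1 << exp_bits) - 1:       # all-ones exponent: inf or NaN
        return s * math.inf if man == 0 else math.nan
    if exp == 0:                         # subnormal (or zero)
        return s * man * 2.0 ** (1 - bias - man_bits)
    return s * (1 + man / (1 << man_bits)) * 2.0 ** (exp - bias)


def clog2(n):
    """Ceiling of log2(n) for n >= 1, mirroring SystemVerilog's $clog2."""
    return (n - 1).bit_length()


# The same byte means different values in the two 8-bit formats.
as_fp8 = decode_minifloat(0x3C, exp_bits=5, man_bits=2)
as_fp8alt = decode_minifloat(0x3C, exp_bits=4, man_bits=3)
print(as_fp8, as_fp8alt, clog2(5), clog2(6))
```

Note that `$clog2(5)` and `$clog2(6)` both evaluate to 3, so raising `NUM_FP_FORMATS` from 5 to 6 leaves `FP_FORMAT_BITS` unchanged.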
@@ -230,7 +237,7 @@ typedef struct packed {
 ```
 The fields of this struct behave as follows:

-##### `Width` - Datapath Wdith
+##### `Width` - Datapath Width

 Specifies the width of the FPU datapath and of the input and output data ports (`operands_i`/`result_o`).
 It must be larger than or equal to the width of the widest enabled FP and integer format.
@@ -278,7 +285,7 @@ Otherwise, synthesis tools can optimize away any logic associated with this form

 #### `Implementation` - Implementation Options

-The FPU is divided into four operation groups, `ADDMUL`, `DIVSQRT`, `NONDOMP`, and `CONV` (see [Architecture: Top-Level](#top-level)).
+The FPU is divided into five operation groups, `ADDMUL`, `DIVSQRT`, `NONDOMP`, `CONV`, and `DOTP` (see [Architecture: Top-Level](#top-level)).
 The `Implementation` parameter controls the implementation of these operation groups.
 It is of type `fpu_implementation_t` which is defined as:
 ```SystemVerilog
@@ -320,17 +327,18 @@ The unit type `unit_type_t` is an enumeration of type `logic [1:0]` holding the
 The `UnitTypes` parameter makes it possible to control the resources used by the FPU, either by removing operation units for certain formats and operations, or by merging multiple formats into one.
 Currently, the following unit types are available for the FPU operation groups: