[ImportVerilog] add stream concat operation #7784

chenbo-again · 2024-11-08T03:09:31Z

@fabianschuiki, @hailongSun2000
Add moore.stream_concat operation for streaming like concat, it's currently a draft version and call for comments. for lvalue, {1 << {i32, i32, i32}} will be treated like

(!moore.ref<i32>, !moore.ref<i32>, !moore.ref<i32> 0) -> <stream_concat<{!moore.i32, !moore.i32, !moore.i32}, 0>>

and it will be process in the core dialect.

Because the stream concat operation is so flexible, various datatype is acceptable as input, so I just Postponed the actual analysis. And stream concat oepration just be treated like a conversion from input to a packed type.

fabianschuiki · 2024-11-08T22:59:19Z

Hey @chenbo-again, thanks a lot for working on this! 🥳 I'm wondering if we need a dedicated type for the streams. Do you know whether SystemVerilog has a specific type that it returns for streaming operations such as {<<{...}}? If you can store a stream in a variable somewhere (as a stream type), we probably have to be able to represent streams as separate types and values. But if that's not possible, we might be able to fully handle streams inside of ImportVerilog without creating special streaming operations in the IR. It would be interesting to find out what all the different streaming operations are, and if they are just syntactic sugar for bit slices and concats, or if there's more to it.

chenbo-again · 2024-11-09T05:48:28Z

I thought that before but have no answer, because streaming concat operation can be operated on bit-stream type(described in IEEE 1800-2017), dynamically sized array can be included.

Both source_t and dest_t can include one or more dynamically sized data in any position (for example, a structure containing a dynamic array followed by a queue of bytes). If the source type, source_t, includes dynamically sized variables, they are all included in the bit stream. If the destination type, dest_t, includes unbounded dynamically sized types, the conversion process is greedy: compute the size of the source_t, subtract the size of the fixed-size data items in the destination, and then adjust the size of the first dynamically sized item in the destination to the remaining size; any remaining dynamically sized items are left empty.
IEEE 1800-2017 6.24.3

And @hailongSun2000 proposed that we can distinguish sized one and unsized one, and maybe we don't have any choice to deal all the things in ImportVerilog stage.

hailongSun2000 · 2024-11-11T01:59:16Z

Hey, @chenbo-again! Thanks for your work on the streaming operator. Presently, we have a dedicated op(moore.dyn_extract) to slice a value with a dynamic low bit. So maybe we can use moore.(dyn_)extract(_ref) and moore.concat(_ref) to represent moore.streaming_concat. WDYT?

chenbo-again · 2024-11-11T17:55:33Z

Hey, @chenbo-again! Thanks for your work on the streaming operator. Presently, we have a dedicated op(moore.dyn_extract) to slice a value with a dynamic low bit. So maybe we can use moore.(dyn_)extract(_ref) and moore.concat(_ref) to represent moore.streaming_concat. WDYT?

Hi, @hailongSun2000 .
I tried it, but I'm afraid it is not suitable here, we need a dynamic sliceWidth extract here, it seems like a unsythesisable expression and not present in moore for now.

hailongSun2000 · 2024-11-12T03:19:05Z

Here are some cases:

module Foo();
  int a = 1;
  int b = 2;
  int c = 3;
  bit [127:0] d;
  int size1 = 32;
  const int size2 = 32;
  parameter size3 = 32;
  initial begin
    d = {<<int{a, b, c}};           // Ok, slice size = 32
    d = {<<logic[3:0]{a, b, c}};    // Ok, slice size = 4
    d = {<<size3{a, b, c}};         // Ok, slice size = 32   parameter is a constant

    d = {<<size1{a, b, c}};         // Not, error: reference to non-constant variable 'size1' is not allowed in a constant expression
    d = {<<size2{a, b, c}};         // Not, as above
  end
endmodule

Slang will throw an error if slice size is not a constant value. So maybe we only need to use moore.extract and moore.extract_ref. The key point is how to determine the size of the dynamic array and queue.

chenbo-again · 2024-11-12T17:32:19Z

Recently I pushed a commit try to implement streaming concat operation with concat & extract operation. But currently it's limited that the operand must be sized IntType. It's not easy to deal the cases which the operands is unsized, and I believe in the future we will find an answer and implement it.

for now, streaming concat assignment expression {<< byte {a, b, c}} = {<< byte {a, b, c}}; will emit moore ir like below:

      %0 = moore.extract_ref %a from 0 : <i32> -> <i8>
      %1 = moore.extract_ref %a from 8 : <i32> -> <i24>
      %2 = moore.extract_ref %1 from 0 : <i24> -> <i8>
      %3 = moore.extract_ref %1 from 8 : <i24> -> <i16>
      %4 = moore.extract_ref %3 from 0 : <i16> -> <i8>
      %5 = moore.extract_ref %3 from 8 : <i16> -> <i8>
      %6 = moore.extract_ref %b from 0 : <i32> -> <i8>
      %7 = moore.extract_ref %b from 8 : <i32> -> <i24>
      %8 = moore.extract_ref %7 from 0 : <i24> -> <i8>
      %9 = moore.extract_ref %7 from 8 : <i24> -> <i16>
      %10 = moore.extract_ref %9 from 0 : <i16> -> <i8>
      %11 = moore.extract_ref %9 from 8 : <i16> -> <i8>
      %12 = moore.extract_ref %c from 0 : <i32> -> <i8>
      %13 = moore.extract_ref %c from 8 : <i32> -> <i24>
      %14 = moore.extract_ref %13 from 0 : <i24> -> <i8>
      %15 = moore.extract_ref %13 from 8 : <i24> -> <i16>
      %16 = moore.extract_ref %15 from 0 : <i16> -> <i8>
      %17 = moore.extract_ref %15 from 8 : <i16> -> <i8>
      %18 = moore.concat_ref %17, %16, %14, %12, %11, %10, %8, %6, %5, %4, %2, %0 : (!moore.ref<i8>, !moore.ref<i8>, !moore.ref<i8>, !moore.ref<i8>, !moore.ref<i8>, !moore.ref<i8>, !moore.ref<i8>, !moore.ref<i8>, !moore.ref<i8>, !moore.ref<i8>, !moore.ref<i8>, !moore.ref<i8>) -> <i96>
      %19 = moore.read %a : <i32>
      %20 = moore.read %b : <i32>
      %21 = moore.read %c : <i32>
      %22 = moore.extract %19 from 0 : i32 -> i8
      %23 = moore.extract %19 from 8 : i32 -> i24
      %24 = moore.extract %23 from 0 : i24 -> i8
      %25 = moore.extract %23 from 8 : i24 -> i16
      %26 = moore.extract %25 from 0 : i16 -> i8
      %27 = moore.extract %25 from 8 : i16 -> i8
      %28 = moore.extract %20 from 0 : i32 -> i8
      %29 = moore.extract %20 from 8 : i32 -> i24
      %30 = moore.extract %29 from 0 : i24 -> i8
      %31 = moore.extract %29 from 8 : i24 -> i16
      %32 = moore.extract %31 from 0 : i16 -> i8
      %33 = moore.extract %31 from 8 : i16 -> i8
      %34 = moore.extract %21 from 0 : i32 -> i8
      %35 = moore.extract %21 from 8 : i32 -> i24
      %36 = moore.extract %35 from 0 : i24 -> i8
      %37 = moore.extract %35 from 8 : i24 -> i16
      %38 = moore.extract %37 from 0 : i16 -> i8
      %39 = moore.extract %37 from 8 : i16 -> i8
      %40 = moore.concat %39, %38, %36, %34, %33, %32, %30, %28, %27, %26, %24, %22 : (!moore.i8, !moore.i8, !moore.i8, !moore.i8, !moore.i8, !moore.i8, !moore.i8, !moore.i8, !moore.i8, !moore.i8, !moore.i8, !moore.i8) -> i96
      moore.blocking_assign %18, %40 : i96

maybe it's usable for many case for now.

fabianschuiki · 2024-11-12T19:58:02Z

This looks nice! I like your idea of splitting this into a dynamically-sized and statically-sized problem. Only supporting streaming operators with statically-known sizes sounds like a great starting point. Especially since Moore doesn't support dynamic arrays yet. It might be easier to add streaming operator support for dynamic arrays after we add support for dynamic arrays themselves, because we've established all the available ops at that point. Very cool work!

hailongSun2000 · 2024-11-13T10:24:31Z

Your example is LGTM! However, could we reduce the moore.extract_ref amount? My idea is we can moore.concat(_ref) a, b, c, and we get a 96-bit result( assume it's the %0). Then, we can create

%1 = moore.extract(_ref) %0, 0    // [7:0]
%2 = moore.extract(_ref) %0, 8    // [15:8]
... 
%11 = moore.extrac(_ref) %0, 80  // [87:80]
%12 = moore.extrac(_ref) %0, 88  // [95:88]

finally, we can create moore.concat(_ref) %1, %2,..., %11, %12 to together them. WDYT @fabianschuiki , @chenbo-again 🤔?

lib/Conversion/ImportVerilog/Expressions.cpp

lib/Conversion/ImportVerilog/FormatStrings.cpp

test/Conversion/ImportVerilog/basic.sv

chenbo-again · 2024-11-14T21:01:11Z

thank you for your review! :)

lib/Conversion/ImportVerilog/Expressions.cpp

fabianschuiki · 2024-11-15T19:06:38Z

lib/Conversion/ImportVerilog/Expressions.cpp

+    for (auto stream : expr.streams()) {
+      auto value = context.convertLvalueExpression(*stream.operand);
+      if (!value)
+        return {};
+      operands.push_back(value);
+    }


I noticed that each stream can also have a withExpr and constantWithWidth. Could you add error messages if either of these are set, and tell the user that we don't support those cases? It's important that we only succeed if we were able to process everything the user has type, and not silently ignore parts of the input.

fabianschuiki · 2024-11-15T19:06:44Z

lib/Conversion/ImportVerilog/Expressions.cpp

+    for (auto stream : expr.streams()) {
+      auto value = context.convertRvalueExpression(*stream.operand);
+      if (!value)


I noticed that each stream can also have a withExpr and constantWithWidth. Could you add error messages if either of these are set, and tell the user that we don't support those cases? It's important that we only succeed if we were able to process everything the user has type, and not silently ignore parts of the input.

fabianschuiki · 2024-11-15T19:07:41Z

test/Conversion/ImportVerilog/basic.sv

+    // CHECK: [[BYTE4:%.+]] = moore.extract [[CONCAT]] from 32 : l48 -> l8
+    // CHECK: [[BYTE5:%.+]] = moore.extract [[CONCAT]] from 40 : l48 -> l8
+    // CHECK: [[RESULT:%.+]] = moore.concat [[BYTE5]], [[BYTE4]], [[BYTE3]], [[BYTE2]], [[BYTE1]], [[BYTE0]] : (!moore.l8, !moore.l8, !moore.l8, !moore.l8, !moore.l8, !moore.l8) -> l48
+    vec_5 = {<<byte{vec_3, vec_4}};


Nice, thanks for checking different-sized variables here!

fabianschuiki · 2024-11-15T19:20:03Z

test/Conversion/ImportVerilog/basic.sv

+    // CHECK: [[BYTE4_REF:%.+]] = moore.extract_ref [[CONCAT_REF]] from 32 : <l48> -> <l8>
+    // CHECK: [[BYTE5_REF:%.+]] = moore.extract_ref [[CONCAT_REF]] from 40 : <l48> -> <l8>
+    // CHECK: [[RESULT_REF:%.+]] = moore.concat_ref [[BYTE5_REF]], [[BYTE4_REF]], [[BYTE3_REF]], [[BYTE2_REF]], [[BYTE1_REF]], [[BYTE0_REF]] : (!moore.ref<l8>, !moore.ref<l8>, !moore.ref<l8>, !moore.ref<l8>, !moore.ref<l8>, !moore.ref<l8>) -> <l48>
+    {<<byte{vec_3, vec_4}} = vec_5;


Can you add a few more test cases that check other ways of how streams can be described? For example, I think your implementation already supports nested streams. And there's also the possibility of having a stream go in the other direction >>. I also noticed that the spec allows for streams to specify a slice size instead of a type, so something like {<<16{...}}.

The SystemVerilog specification contains a few examples for streams which might be great to include here, or use for inspiration for the different kinds of streams that we should test. For example, in IEEE 1800-2017 section 11.4.14.2 "Re-ordering of the generic stream" I see the following test:

int j = { "A", "B", "C", "D" }; { >> {j}} // generates stream "A" "B" "C" "D" { << byte {j}} // generates stream "D" "C" "B" "A" (little endian) { << 16 {j}} // generates stream "C" "D" "A" "B" { << { 8'b0011_0101 }} // generates stream 'b1010_1100 (bit reverse) { << 4 { 6'b11_0101 }} // generates stream 'b0101_11 { >> 4 { 6'b11_0101 }} // generates stream 'b1101_01 (same) { << 2 { { << { 4'b1101 }} }} // generates stream 'b1110

Pretty sure you could remove the string constants "A", "B", "C", "D" and then just check whether you get the correct slices of the variable j with your implementation. Or replace the string with a easily-recognizable constant, like 32'h87654321.

In IEEE 1800-2017 section 11.4.14.3 "Streaming concatenation as an assignment target (unpack)" I see the following test:

int a, b, c; logic [10:0] up [3:0]; logic [11:1] p1, p2, p3, p4; bit [96:1] y = {>>{ a, b, c }}; // OK: pack a, b, c int j = {>>{ a, b, c }}; // error: j is 32 bits < 96 bits bit [99:0] d = {>>{ a, b, c }}; // OK: d is padded with 4 bits {>>{ a, b, c }} = 23'b1; // error: too few bits in stream {>>{ a, b, c }} = 96'b1; // OK: unpack a = 0, b = 0, c = 1 {>>{ a, b, c }} = 100'b11111; // OK: unpack a = 0, b = 0, c = 1; 96 MSBs unpacked, 4 LSBs truncated { >> {p1, p2, p3, p4}} = up; // OK: unpack p1 = up[3], p2 = up[2], p3 = up[1], p4 = up[0]

Variations of these tests might be good to have. You don't need to check for error reporting if the error is reported by Slang and not your code; Slang already contains tests for this and we don't need to re-check Slang. But for any error message you produce in your code, and every branch/early exit in your code, you'd want to have a test that checks whether things work as expected there 😃

lib/Conversion/ImportVerilog/Expressions.cpp

hailongSun2000 · 2024-11-19T06:03:24Z

lib/Conversion/ImportVerilog/Expressions.cpp

+      if (stream.constantWithWidth.has_value() || stream.withExpr) {
+        mlir::emitError(operandLoc) << "Moore only support streaming "
+                                       "concatenation without 'with operation'";
+        return {};
+      }


I think we can invoke context.convertRvalueExpression() to handle withExpr if it's existing. And then ignore the same level variable.
For your case: (Only handle withExpr)

vec_1 = {<<byte{vec_0, arr with [7:0]}};

It's AST looks like

{ "operand": { "kind": "NamedValue", "type": "logic$[63:0]", "symbol": "2199025903840 arr" }, "withExpr": { "kind": "RangeSelect", "type": "logic$[7:0]", "selectionKind": "Simple", "value": { "kind": "NamedValue", "type": "logic$[63:0]", "symbol": "2199025903840 arr" }, ......

However, when the stream contains multiple variable-size data packets, or each data packet contains more than one variable-size data item, or the size of the data to be unpacked is stored in the middle of the stream, this mechanism can become cumbersome and error-prone. To overcome these problems, the unpack operation allows a with expression to explicitly specify the extent of a variable-size field within the unpack operation.

with expression is dedicated for dynamic stream operands, and it is only used for unpacked operands. however concat operation only supports packed sized operands in systemverilog, so currently we have no method to implement with expression.

By the way, I think to support full stream concat operation, the key is to create a conversion from any type to packed type, and then we can use packed value to do concat, extract and finally implement stream concat.

I noticed that we have a powerful op moore.conversion can convert any type to any type, so I make a demo in 0e7f6bd, but currently I don't know how it works

// CHECK: %[[TMP1:.*]] = moore.read %vec_3 : <l16> // CHECK: %[[TMP2:.*]] = moore.read %arr_1 : <uarray<64 x l1>> // CHECK: %[[TMP3:.*]] = moore.extract %[[TMP2]] from 0 : uarray<64 x l1> -> uarray<16 x l1> // CHECK: %[[TMP4:.*]] = moore.conversion %[[TMP3]] : !moore.uarray<16 x l1> -> !moore.l16 // CHECK: %[[TMP5:.*]] = moore.concat %[[TMP1]], %[[TMP4]] : (!moore.l16, !moore.l16) -> l32 // CHECK: %[[TMP6:.*]] = moore.extract %[[TMP5]] from 0 : l32 -> l8 // CHECK: %[[TMP7:.*]] = moore.extract %[[TMP5]] from 8 : l32 -> l8 // CHECK: %[[TMP8:.*]] = moore.extract %[[TMP5]] from 16 : l32 -> l8 // CHECK: %[[TMP9:.*]] = moore.extract %[[TMP5]] from 24 : l32 -> l8 // CHECK: %[[TMP10:.*]] = moore.concat %[[TMP6]], %[[TMP7]], %[[TMP8]], %[[TMP9]] : (!moore.l8, !moore.l8, !moore.l8, !moore.l8) -> l32 // CHECK: moore.blocking_assign %vec_1, %[[TMP10]] : l32 vec_1 = {<<byte{vec_3, arr_1 with [15:0]}}; // CHECK: %[[TMP1:.*]] = moore.extract_ref %arr_1 from 0 : <uarray<64 x l1>> -> <uarray<16 x l1>> // CHECK: %[[TMP2:.*]] = moore.conversion %[[TMP1]] : !moore.ref<uarray<16 x l1>> -> !moore.ref<l16> // CHECK: %[[TMP3:.*]] = moore.concat_ref %vec_3, %[[TMP2]] : (!moore.ref<l16>, !moore.ref<l16>) -> <l32> // CHECK: %[[TMP4:.*]] = moore.extract_ref %[[TMP3]] from 0 : <l32> -> <l8> // CHECK: %[[TMP5:.*]] = moore.extract_ref %[[TMP3]] from 8 : <l32> -> <l8> // CHECK: %[[TMP6:.*]] = moore.extract_ref %[[TMP3]] from 16 : <l32> -> <l8> // CHECK: %[[TMP7:.*]] = moore.extract_ref %[[TMP3]] from 24 : <l32> -> <l8> // CHECK: %[[TMP8:.*]] = moore.concat_ref %[[TMP4]], %[[TMP5]], %[[TMP6]], %[[TMP7]] : (!moore.ref<l8>, !moore.ref<l8>, !moore.ref<l8>, !moore.ref<l8>) -> <l32> // CHECK: %[[TMP9:.*]] = moore.read %vec_1 : <l32> // CHECK: moore.blocking_assign %[[TMP8]], %[[TMP9]] : l32 {<<byte{vec_3, arr_1 with [15:0]}} = vec_1;

At the moment, the moore.conversion operation is more or less a placeholder for more detailed conversion operations in the future. Ideally we would have more precise operations, such as moore.pack_sbv and moore.unpack_sbv, moore.int_to_logic and moore.logic_to_int, and other operations to convert between compatible unpacked types. I have started doing this for the sign/zero extension that is needed for resizing, but more work is needed. For now I think it's fine to just rely on moore.conversion to express whatever type conversion you need, and we'll fill in the details of how those conversions work later.

lib/Conversion/ImportVerilog/Expressions.cpp

test/Conversion/ImportVerilog/errors.sv

lib/Conversion/ImportVerilog/Expressions.cpp

hailongSun2000 · 2024-11-19T07:10:38Z

By the way, please don't forget to update the commit history. @fabianschuiki had tweaked the moore.conversion in this PR(#7783).

hailongSun2000 · 2024-11-19T07:11:34Z

Thanks for your work again 😃!

fabianschuiki · 2024-11-20T19:37:28Z

@chenbo-again Feel free to request commit access to LLVM such that your CI builds run automatically without anyone approving them. That way you can also land the PR yourself once CI passes and you're happy with the reviews and code 🙂

fabianschuiki

Very nice! Thanks a lot for making this all work 🙌. Let me know if you want to merge this yourself once you're happy with it, or if you would like me to merge this for you.

hailongSun2000 requested review from fabianschuiki, maerhart and hailongSun2000 November 9, 2024 06:11

hailongSun2000 added the Moore label Nov 9, 2024

[ImportVerilog] cleanup and add testcase

ce040b7

chenbo-again force-pushed the stream_concat branch from 6f6d3ae to ce040b7 Compare November 14, 2024 18:04

fabianschuiki reviewed Nov 14, 2024

View reviewed changes

lib/Conversion/ImportVerilog/Expressions.cpp Outdated Show resolved Hide resolved

fabianschuiki reviewed Nov 14, 2024

View reviewed changes

lib/Conversion/ImportVerilog/Expressions.cpp Outdated Show resolved Hide resolved

fabianschuiki reviewed Nov 14, 2024

View reviewed changes

lib/Conversion/ImportVerilog/FormatStrings.cpp Outdated Show resolved Hide resolved

fabianschuiki reviewed Nov 14, 2024

View reviewed changes

test/Conversion/ImportVerilog/basic.sv Outdated Show resolved Hide resolved

chenbo-again added 2 commits November 15, 2024 04:57

[ImportVerilog] fix bugs, enhance testcase and remove irrelevant changes

5aff8ac

[ImportVerilog] run clang-format

3641f24

fabianschuiki reviewed Nov 15, 2024

View reviewed changes

[ImportVerilog] polish test case and diagnostics infomation

4cca01a

hailongSun2000 reviewed Nov 19, 2024

View reviewed changes

chenbo-again and others added 3 commits November 21, 2024 02:35

Merge branch 'llvm:main' into stream_concat

e8cca6c

[ImportVerilog] remove wrong 'reverse', update testcase and comment

0601a17

[ImportVerilog] improve file check comment readability

e21121c

[ImportVerilog] fix testcase

b11311b

fabianschuiki approved these changes Nov 20, 2024

View reviewed changes

[ImportVerilog] add fixed size 'with expression' support

0e7f6bd

hailongSun2000 approved these changes Nov 21, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[ImportVerilog] add stream concat operation #7784

[ImportVerilog] add stream concat operation #7784

chenbo-again commented Nov 8, 2024 •

edited

Loading

fabianschuiki commented Nov 8, 2024

chenbo-again commented Nov 9, 2024

hailongSun2000 commented Nov 11, 2024 •

edited

Loading

chenbo-again commented Nov 11, 2024 •

edited

Loading

hailongSun2000 commented Nov 12, 2024

chenbo-again commented Nov 12, 2024

fabianschuiki commented Nov 12, 2024

hailongSun2000 commented Nov 13, 2024 •

edited

Loading

chenbo-again commented Nov 14, 2024

fabianschuiki Nov 15, 2024

fabianschuiki Nov 15, 2024

fabianschuiki Nov 15, 2024

fabianschuiki Nov 15, 2024

hailongSun2000 Nov 19, 2024

chenbo-again Nov 20, 2024

chenbo-again Nov 20, 2024

fabianschuiki Nov 20, 2024

hailongSun2000 commented Nov 19, 2024

hailongSun2000 commented Nov 19, 2024

fabianschuiki commented Nov 20, 2024

fabianschuiki left a comment

[ImportVerilog] add stream concat operation #7784

Are you sure you want to change the base?

[ImportVerilog] add stream concat operation #7784

Conversation

chenbo-again commented Nov 8, 2024 • edited Loading

fabianschuiki commented Nov 8, 2024

chenbo-again commented Nov 9, 2024

hailongSun2000 commented Nov 11, 2024 • edited Loading

chenbo-again commented Nov 11, 2024 • edited Loading

hailongSun2000 commented Nov 12, 2024

chenbo-again commented Nov 12, 2024

fabianschuiki commented Nov 12, 2024

hailongSun2000 commented Nov 13, 2024 • edited Loading

chenbo-again commented Nov 14, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

hailongSun2000 commented Nov 19, 2024

hailongSun2000 commented Nov 19, 2024

fabianschuiki commented Nov 20, 2024

fabianschuiki left a comment

Choose a reason for hiding this comment

chenbo-again commented Nov 8, 2024 •

edited

Loading

hailongSun2000 commented Nov 11, 2024 •

edited

Loading

chenbo-again commented Nov 11, 2024 •

edited

Loading

hailongSun2000 commented Nov 13, 2024 •

edited

Loading