Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Verible standalone preprocessor #1360

Open
wants to merge 13 commits into
base: master
Choose a base branch
from

Conversation

karimtera
Copy link
Contributor

@karimtera karimtera commented Jul 10, 2022

Description:
This PR introduces takes another step to enhance the standalone preprocessor tool by implementing multiple-cu mode.
Please refer to issue #1358 for more information about the proposed design.

Goal:

  • In case multiple files are passed to the tool, each one of them will be compiled in a separate compilation unit, thus declarations scopes will end by the end of each file (this is the default behavior).
  • Compiler directives should be correctly interpreted (files inclusion, macros expansion and conditionals are a good start).

Flags description

Using the Abseil flags library allows an easy access to flags and their arguments.

Flags Type Default Description State
--strip_comments bool false This replaces comments with equal number of whitespaces, to preserve the same byte offsets. DONE
--generate_variants bool false This generates all the possible variants for conditionals. DONE
--multiple_cu bool true In case of true multiple files are passed to the tool, each one of them will be compiled in a separate compilation unit, thus declarations scopes will end by the end of each file (this is the default behavior), otherwise all files will be compiled in the same compilation unit. DONE

Positional arguments

Those mainly are the SV files to preprocess, defines and search-directory paths for includes.
Any of the following positional argument can be passed one or more times.

Argument Description State
<file.sv> Specifies a SV file to be preprocessed. DONE
+define+<foo>[=<text>] Defines one or more macros same as `define foo text DONE
+incdir+<dir> Specifies the directory to search for files included with `include "file_name" DONE

TODO:

  • Fixing testcases reported in mglb's review.
  • Better comment placing.
  • const qualifier in cmd_positional_arguments.h.
  • Fixing +define+macro+="a=b" issue.
  • Removing the full usage message, relying on --helpfull from absl/flags.
  • Better error messages.

@codecov-commenter
Copy link

codecov-commenter commented Jul 10, 2022

Codecov Report

Merging #1360 (18e990c) into master (80b07f9) will decrease coverage by 0.99%.
The diff coverage is 19.06%.

@@            Coverage Diff             @@
##           master    #1360      +/-   ##
==========================================
- Coverage   92.90%   91.91%   -1.00%     
==========================================
  Files         340      344       +4     
  Lines       23832    24125     +293     
==========================================
+ Hits        22141    22174      +33     
- Misses       1691     1951     +260     
Impacted Files Coverage Δ
verilog/analysis/flow_tree.cc 0.00% <0.00%> (ø)
verilog/analysis/flow_tree.h 0.00% <0.00%> (ø)
verilog/preprocessor/verilog_preprocess.h 100.00% <ø> (ø)
verilog/preprocessor/verilog_preprocess.cc 74.68% <17.58%> (-16.90%) ⬇️
verilog/tools/preprocessor/verilog_preprocessor.cc 28.07% <23.30%> (-65.04%) ⬇️
common/util/cmd_positional_arguments.cc 35.18% <35.18%> (ø)
common/util/cmd_positional_arguments.h 100.00% <100.00%> (ø)

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@karimtera karimtera force-pushed the standalone_preprocessor branch from fa4aa99 to 392c605 Compare July 21, 2022 12:01
@karimtera
Copy link
Contributor Author

Hello @hzeller ,

Wish you had a good vacation!

Here is a quick overview about what's going on here:

  1. The preprocessor tool was getting arguments through a class defined in common/util/subcommand.h, It defines a subcommands such that each one has its own flags.
    I changed that to the normal abseil flags to be similar to other Verible tools.
    As @tgorochowik's request to standardize the way of passing arguments, What do you think about my approach?
  2. About the generate_variants feature, I added comments so the code is understandable, but let me give you a quick overview:
    Now, this feature doesn't know if a macro is defined or not, It just assumes all the possibilities and outputs the SV code for each one by constructing a control flow tree, which defines the edges between each token and the possible next tokens, then traveses with a depth first search to generate all possible vairants.

@karimtera karimtera changed the title [WIP] Verible standalone preprocessor: multiple-cu mode [WIP] Verible standalone preprocessor Jul 23, 2022
"verible::CmdPositionalArguments" class only supports these types so far: SV files, +define+<name>[=<value>], and +incdir+<dir>.
@karimtera
Copy link
Contributor Author

karimtera commented Jul 25, 2022

verible::CmdPositionalArguments aims to parse the preprocessor tool's positional arguments which includes:

  1. <file.sv>
  2. +define+<name>[=value]
  3. +incdir+<dir>

Note:

  1. For each type multiple arguments are supported in a single tool run.
  2. For +define: multiple macros are supported in the same argument.
  3. For +incdir: multiple directories are supported in the same argument.

TODO:

  • The arguments are accessible now from the tool, but only SV files are used, need to use the rest.
  • Need to implement a HandleIncludes function in verilog/analysis/preprocessor/.
  • Need to implement a AddDefineFromCmdLine interface function in 'verilog/analysis/preprocessor`.

- Adding an interface function "AddDefineFromCmdLine" to use the macro added by +define+<foo> argument to "VerilogPreprocess".
- Added a feature to VerilogPreprocess which allows to store paths,
  That can be used later to find the SV file to include.
- The preprocessor tool takes these paths from the user,
  as a +incdir+<path>[+<another_path>] and set then in VerilogPreprocess.
- The included files macros and conditionals can be expanded and evaluated.
- Some limitations exists and were written as TODOs in place,
  need to open issues for these.
@karimtera
Copy link
Contributor Author

Include directives supported usage:

  • "file_path" can be an absolute path, or a relative one.
  • In case of relative one, search_paths_ of verilog::VerilogPreprocess are searched.
  • search_paths_ are passed to the tool through +incdir+<dir>[+<dir>].

Limitations:

  • Number of nested includes should be limited (min is 15 as per LRM).
  • Come up with a more efficient way to extend the included file TokenInfos life (more info is documented verilog/preprocess/verilog_preprocess.cc).
  • Share defined macros with the included files.
  • Minimal testing was done, need to add more tests.

@tgorochowik tgorochowik requested review from mglb and rkapuscik July 27, 2022 13:29
- The preprocessor tool preserves the white-spaces in the SV files.
- Changed the output of the -generate_variants and -multipecu from tokens (enum,text) into normal text.
@karimtera
Copy link
Contributor Author

karimtera commented Jul 29, 2022

To solve the white-spaces problem I have added VerilogPreprocess::GenerateBypassWhiteSpaces function which skips white-spaces tokens, and returns an iterator to a non-whitespace token.

TokenStreamView::const_iterator VerilogPreprocess::GenerateBypassWhiteSpaces(
    const StreamIteratorGenerator& generator) {
  auto iterator =
      generator();  // iterator should be pointing to a non-whitespace token;
  while (verilog::VerilogLexer::KeepSyntaxTreeTokens(**iterator) == 0) {
    iterator = generator();
  }
  return iterator;
}

Then replaced each generator() into GenerateBypassWhiteSpaces(generator)) in VerilogPreprocess methods.

Example:
verible-verilog-preprocessor m5.sv

Example input file:

`include "file.sv"
module m();
  wire x = 1;
`ifdef KARIM
  real is_defined = 1;
`else
  real is_defined = 0;
`endif
// pre-blank line

// post-blank line
endmodule

Example output:

module included_module();
  wire x=1;
endmodule

module m();
  wire x = 1;

  real is_defined = 0;

// pre-blank line

// post-blank line
endmodule

Note: if added +define+KARIM then the other branch will be selected.

TODO:

  • Add tests for all the implemented features, didn't want to do it earlier as I will have to revisit them once I fix the white-spaces problem.
    • File inclusion.
    • Multiple compilation unit mode.
    • Generate variants mode.
  • Open issues on the known limitations.
  • Maybe add a flag to disable blank lines.

Copy link
Contributor

@mglb mglb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi! I've played a bit with the tool and checked part of the code - more comments to come.
So far:

Maybe add a flag to disable blank lines.

Suggestion: this could include a comment following `endif/`else in the same line.

Question mostly out of curiosity: do SystemVerilog allows any tokens other than comments in the same line as `else and `endif? Because this works:

module m();
  wire x = `ifdef KARIM
      1`else/* no KARIM */0`endif; // endif KARIM
endmodule

Failing use cases:

input:

`ifndef KARIM
    // not defined: define
    `define KARIM
`endif

module m();
`ifdef KARIM
    // defined
`else
    // NOT defined
`endif
endmodule

output:

    // not defined: define



module m();

    // NOT defined

endmodule

Expected // defined.


file: file.sv:

module included_module();
  wire x=1;
endmodule

file: test.m01.inc.sv:

// begin: test.m01.inc.sv
`include "file.sv"
// end: test.m01.inc.sv

file: test.m01.sv (input):

// begin: test.m01.sv
`include "test.m01.inc.sv"
module m();
  wire x = 1;
endmodule
// end: test.m01.sv

output:

// begin: test.m01.sv
// begin: test.m01.inc.sv

module m();
  wire x = 1;
endmodule
// end: test.m01.sv

Expected inclusion of nested `include or error.

Cosmetic "issues"

More informative error reporting would be nice. At this moment it looks like:

$ ./bazel-bin/verilog/tools/preprocessor/verible-verilog-preprocessor - <<< $'// before\n`include "no-such-file.sv"\n// after\n'

// before
no such file found.

It is even more confusing when there's no comment before the include.

Example error from verible-verilog-syntax:

$ ./bazel-bin/verilog/tools/syntax/verible-verilog-syntax --printtokens - <<< "module 4"

-:1:8: syntax error at token "4"

I would rename --strip_comments to --erase_comments or something. I think "strip" suggests that they will be removed (replaced with empty strings).


Instead of Variant number 1: it would be nice to print something like //////// variant: 1 //////// for easier parsing (also for human eyes).

common/util/cmd_positional_arguments.cc Show resolved Hide resolved
common/util/cmd_positional_arguments.h Outdated Show resolved Hide resolved
verilog/tools/preprocessor/verilog_preprocessor.cc Outdated Show resolved Hide resolved
Copy link
Contributor

@mglb mglb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To solve the white-spaces problem I have added VerilogPreprocess::GenerateBypassWhiteSpaces function which skips white-spaces tokens, and returns an iterator to a non-whitespace token.

It looks good generally, and seems to work here.

Comment on lines 35 to 41
std::vector<absl::string_view> GetIncludeDirs();

// Gets macro defines arguments.
std::vector<std::pair<absl::string_view, absl::string_view>> GetDefines();

// Gets SV files arguments.
std::vector<absl::string_view> GetFiles();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const std::vector<...>& GetX() const; and auto& in assignments in verilog_preprocessor.cc

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first const is giving me a warning with clang-tidy:
Clang-Tidy: Return type 'const TYPE' (aka 'const TYPE') is 'const'-qualified at the top level, which may reduce code readability without improving const correctness

I think it means that it leads the reader to think the caller can't change the returned value, which is not true in general.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you forgot & after const std::vector<...>. This warnings appears only when returning const values/objects (which makes sense, because such const does nothing), not references to const value (which is used a lot to avoid forced copy).

Comment on lines +34 to +51
absl::Status DepthFirstSearch(
int index); // travese the tree in a depth first manner.
std::vector<verible::TokenSequence>
variants_; // a memory for all variants generated.

private:
std::vector<int>
ifs_; // vector of all `ifdef/`ifndef indexes in source_sequence_.
std::map<int, std::vector<int>> elses_; // indexes of `elsif/`else indexes in
// source_sequence_ of each if block.
std::map<int, std::vector<int>>
edges_; // the tree edges which defines the possible next childs of each
// token in source_sequence_.
verible::TokenSequence
source_sequence_; // the original source code lexed token seqouence.
verible::TokenSequence
current_sequence_; // the variant's token sequence currrently being built
// by DepthFirstSearch.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put comments above code to avoid excessive wrapping. Also in other places.

Comment on lines +573 to +574
// karimtera(TODO): how to differentiate between <file> and "file"?? both are
// the same token, need to edit the lexer.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK `include <file> is not yet supported in Verible at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That should be enabled from the lexer's side to begin with, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's right.

@hzeller hzeller requested review from kbieganski and removed request for rkapuscik May 18, 2023 17:38
@hzeller
Copy link
Collaborator

hzeller commented May 18, 2023

@kbieganski : here are some bits and pieces for the preprocessor, that have been addressed in various sub-pull request.

Can you have a look through these to make sure everything actually made it to the final preprocessor or if there are pieces missing that we should reformulate in a new PR ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants