Allow "lite" Chipyard builds with a minimal subset of submodules #2212

jerryz123 · 2025-03-19T18:49:52Z

There are a lot of submodules in generators/. Currently they are all cloned, initialized, and compiled, even though most people only use a small subset of these at a time.

This PR demonstrates a path towards a "modular generator submodule" approach, where some submodules can be left uninitialized. As a consequence, configs which depend on those submodules will not appear on the classpath and will not be available to the user.

This PR demonstrates this with some submodules, but the approach should scale to any submodule that does not have a crazy dependency structure. Currently modularized submods are:

ara
compress-acc
caliptra-aes

The changes are:

init-submodules now initializes only the minimal submodules; optional submodules can be initialized by adding specific flags or --full
The submodule's custom Configs.scala is now in the submodule, but still retains the chipyard classpath. This is symlinked into the old place in generators/chipyard/config
build.sbt only injects the scala dependency if it finds the .git in the submodule directory - indicating the submodule has been initialized
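
As a rough illustration of the first change, flag handling in init-submodules could look something like the sketch below. This is not the actual script: the per-submodule flag spellings (--ara, --compress-acc, --caliptra-aes) and the function name select_optional_submodules are assumptions made for the example. Only --full and the three submodule names come from the description above.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: decide which optional generator submodules to initialize.
# The list mirrors the submodules named in this PR description.
OPTIONAL_SUBMODULES="ara compress-acc caliptra-aes"

# Prints one selected optional submodule per line.
# --full selects everything; --<name> (assumed flag spelling) selects one.
# With no flags nothing is printed, i.e. only the minimal set is initialized.
select_optional_submodules() {
    local selected="" arg name
    for arg in "$@"; do
        case "$arg" in
            --full)
                selected="$OPTIONAL_SUBMODULES"
                break
                ;;
            --*)
                name="${arg#--}"
                # Accept the flag only if it names a known optional submodule.
                case " $OPTIONAL_SUBMODULES " in
                    *" $name "*) selected="$selected $name" ;;
                    *) echo "unknown flag: $arg" >&2; return 1 ;;
                esac
                ;;
        esac
    done
    # Word-split on purpose so each name lands on its own line.
    for name in $selected; do
        echo "$name"
    done
}
```

A caller could then loop over the output and run git submodule update --init on generators/<name> for each entry.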

I rate the hackiness of this approach as a 4/10, mostly due to the symlinking. I don't want to mess with the directory structure, package structure, or the way the generator searches the classpath for the Config to build. Changing those could make this less hacky.

Fi50 · 2025-03-19T20:19:21Z

This is a really cool idea! Some thoughts that come to mind:

Is it better to have the default be full Chipyard or lean Chipyard? The default for conda build right now is full, with lean being a flag, so it feels like a potential confusion to have Chipyard be the opposite.
Connected to the above, potentially many Chipyard users (I'm thinking of class labs, tapeout/bringup etc) will need to amend their documentation to specify which submodules are necessary, which is likely to not get done (so people will run into bugs with modules not existing and complain)? (Maybe asking people to read some blurb in the README would cover this though?)
So might be worth exploring which submodules should be part of a minimal build. Are there enough of them unused to have a speedup/size reduction?
Maybe there could be a separate --lean flag which allows a more stripped down baseline, the --full flag which adds everything in, and the default is some in between which hopefully covers most existing use cases.

The other thought is, how to support people integrating new generators according to the rules, now that the rule also includes editing the shell script and a slightly different sbt process? Since all the effort to refactor things just collapses if people add their new submodules to default build anyway. Or maybe that doesn't matter as long as proper integration is enforced when upstreaming?

(+mandatory "idrk" disclaimer xD)

jerryz123 · 2025-03-19T20:35:57Z

Is it better to have the default be full Chipyard or lean Chipyard? The default for conda build right now is full, with lean being a flag, so it feels like a potential confusion to have Chipyard be the opposite.

Default should probably be full.

So might be worth exploring which submodules should be part of a minimal build. Are there enough of them unused to have a speedup/size reduction?

Some submodules have internal submodules, which might need to be initialized as well. I think the speed improvement will be noticeable once all the optional blocks are not initialized.

Another motivation for this is to lower the barrier-to-entry for adding new generators to chipyard. We can add new projects without having to rationalize that the usefulness outweighs the additional bloat.

Maybe there could be a separate --lean flag which allows a more stripped down baseline, the --full flag which adds everything in, and the default is some in between which hopefully covers most existing use cases.

Yes, I haven't gone through the process of adding the flags to build-setup. I've only added them to init-submodules for now. Feedback here on the right user interface would be appreciated.

The other thought is, how to support people integrating new generators according to the rules, now that the rule also includes editing the shell script and a slightly different sbt process? Since all the effort to refactor things just collapses if people add their new submodules to default build anyway. Or maybe that doesn't matter as long as proper integration is enforced when upstreaming?

Yes, the documentation for adding new generators, and the process for doing so, should be simplified.

joonho3020 · 2025-03-19T20:55:51Z

I like CY-lite. It would be ideal if we can flatten stuff like testchipip, rocketchip, inclusive l2 during this process, but I know it is a lot of work...

jerryz123 · 2025-03-19T20:58:39Z

Flattening RC and RC-LLC are non-starters (and I don't think this buys anything).

abejgonzalez

Is it better to have the default be full Chipyard or lean Chipyard? The default for conda build right now is full, with lean being a flag, so it feels like a potential confusion to have Chipyard be the opposite.

Default should probably be full.

So might be worth exploring which submodules should be part of a minimal build. Are there enough of them unused to have a speedup/size reduction?

Some submodules have internal submodules, which might need to be initialized as well. I think the speed improvement will be noticeable once all the optional blocks are not initialized.

Another motivation for this is to lower the barrier-to-entry for adding new generators to chipyard. We can add new projects without having to rationalize that the usefulness outweighs the additional bloat.

Maybe there could be a separate --lean flag which allows a more stripped down baseline, the --full flag which adds everything in, and the default is some in between which hopefully covers most existing use cases.

Assuming some people clone "minimal" Chipyard, then the default flow would be to run the init-submod...* script to add those submodules back in? IMO a minimal Chipyard makes a lot of sense as the default, then we tell users to use the said script to re-build it up to what they want (most initial users probably only care about the core submodules - Rocket + L2 + testchipip + etc - for the most part)

Yes, I haven't gone through the process of adding the flags to build-setup. I've only added them to init-submodules for now. Feedback here on the right user interface would be appreciated.

Spitballing the minimal setup: build-setup defaults to minimal Chipyard. Then we move init-submodules to the top level of the repo and clean up its API/name to make it clear this is how to add/remove packages (i.e. submodules). After build-setup, users can then modify the packages using that script.
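
One way to picture the package-style interface suggested here is the sketch below. It is only an illustration: the function name submodule_pkg_cmd and the add/remove verbs are hypothetical, and the function prints the git command it would run instead of running it, so it has no side effects.

```shell
#!/usr/bin/env bash
# Hypothetical package-style wrapper around git submodule commands.
# $1 = action (add|remove), $2 = submodule name under generators/.
# Prints the command instead of executing it.
submodule_pkg_cmd() {
    local action="$1" name="$2"
    local path="generators/$name"
    case "$action" in
        add)    echo "git submodule update --init $path" ;;
        remove) echo "git submodule deinit $path" ;;
        *)      echo "usage: submodule_pkg_cmd add|remove <name>" >&2; return 1 ;;
    esac
}
```

Submodules that carry their own nested submodules would probably also need --recursive on the update command; that detail is left out of the sketch.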

The other thought is, how to support people integrating new generators according to the rules, now that the rule also includes editing the shell script and a slightly different sbt process? Since all the effort to refactor things just collapses if people add their new submodules to default build anyway. Or maybe that doesn't matter as long as proper integration is enforced when upstreaming?

Yes, the documentation for adding new generators, and the process for doing so, should be simplified.

I personally am not a fan of the symlinking but I don't know of a better way to do this right now.

scripts/init-submodules-no-riscv-tools-nolog.sh

abejgonzalez · 2025-03-19T23:16:27Z

common.mk

 include $(base_dir)/generators/tracegen/tracegen.mk
-include $(base_dir)/generators/nvdla/nvdla.mk
-include $(base_dir)/generators/radiance/radiance.mk
 include $(base_dir)/tools/torture.mk


Nit: Do we want to also ignore unfound files for the other repositories?

This PR will eventually make those repos also optional

abejgonzalez · 2025-03-19T23:20:45Z

common.mk


 #########################################################################################
 # Prerequisite lists
 #########################################################################################
 # Returns a list of files in directories $1 with single file extension $2.
 # If available, use 'fd' to find the list of files, which is faster than 'find'.
 ifeq ($(shell which fd 2> /dev/null),)
-	lookup_srcs = $(shell find -L $(1)/ -name target -prune -o \( -iname "*.$(2)" ! -iname ".*" \) -print 2> /dev/null)
+	lookup_srcs = $(shell find -L $(1)/ -name target -prune -o \( ! -xtype l -a -iname "*.$(2)" ! -iname ".*" \) -print 2> /dev/null)


Can we make this find expression easier to read, i.e. not a symlink and (name is $(2) and not name .*). Currently, this is quite hard to parse.
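
For reference, the same predicate can be laid out one test per line. The sketch below is a shell rendering of the new find expression, not the Makefile definition itself, and it assumes GNU find because -xtype is a GNU extension. Adjacent tests are joined by an implicit AND, so the explicit -a from the diff is dropped.

```shell
#!/usr/bin/env bash
# Sketch of a more readable lookup_srcs; assumes GNU find (for -xtype).
# $1 = directory to search, $2 = file extension without the dot.
#
# Reading the expression top to bottom:
#   - prune any directory named "target"
#   - otherwise print entries that are
#       not symlinks            (! -xtype l, evaluated under -L)
#       AND match *.<ext>       (case-insensitive)
#       AND are not dotfiles    (! -iname ".*")
lookup_srcs() {
    local dir="$1" ext="$2"
    find -L "$dir"/ \
        -name target -prune \
        -o \( \
            ! -xtype l \
            -iname "*.$ext" \
            ! -iname ".*" \
        \) -print 2> /dev/null
}
```

Calling lookup_srcs generators scala would list the Scala sources under generators/ while skipping symlinked files, hidden files, and anything below a target directory.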

abejgonzalez · 2025-03-19T23:23:29Z

common.mk

 else
 	lookup_srcs = $(shell fd -L -t f -e $(2) . $(1))
 endif

 # Returns a list of files in directories $1 with *any* of the file extensions in $2
 lookup_srcs_by_multiple_type = $(foreach type,$(2),$(call lookup_srcs,$(1),$(type)))

-CHECK_SUBMODULES_COMMAND = echo "Checking all submodules in generators/ are initialized. Uninitialized submodules will be displayed" ; ! git submodule status $(base_dir)/generators | grep ^-
+CHECK_SUBMODULES_COMMAND = echo "Checking required submodules in generators/ are initialized. Uninitialized submodules will be displayed" ; ! git submodule status $(base_dir)/generators | grep '^-.*' | grep -vE "(ara|caliptra|compress)"


This seems brittle to me; I expect users to forget to add things to this list. Can we auto-generate this list based on something else? (Maybe we just specify the core set of submodules instead of excluding the extension submodules.)
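
A minimal sketch of the "specify the core set" variant, for illustration only. The contents of REQUIRED_SUBMODULES and the function name missing_required_submodules are assumptions; this thread does not settle which submodules count as required. The function reads git submodule status output on stdin, where an uninitialized submodule's line starts with "-".

```shell
#!/usr/bin/env bash
# Illustrative core list only; the real required set is not decided in this thread.
REQUIRED_SUBMODULES="rocket-chip testchipip"

# Reads `git submodule status` output on stdin and prints the path of every
# required submodule that is still uninitialized (status line starts with '-').
missing_required_submodules() {
    local line path name
    while IFS= read -r line; do
        case "$line" in
            -*) ;;            # uninitialized: inspect further
            *) continue ;;    # initialized or modified: nothing to report
        esac
        # Line format is "-<sha1> <path>"; keep only the path field.
        path="${line#* }"
        path="${path%% *}"
        name="${path##*/}"
        # Report the path only if its last component is in the required list.
        case " $REQUIRED_SUBMODULES " in
            *" $name "*) echo "$path" ;;
        esac
    done
}
```

It could be used as git submodule status generators | missing_required_submodules, with the check failing whenever the output is non-empty. Optional submodules then never need to be listed.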

This would need a "single source of truth" listing which submodules are required/optional. I don't believe build.sbt should be that single source of truth... it would be another consumer.

We could define a yaml/json somewhere in chipyard that all the scripts reference.

Is the intent that this YAML/JSON feeds into the build.sbt, which then parses this file? If so, then we need to encode dependencies, which, in my opinion, is too heavy-handed. If it's separate then you need to update the YAML/JSON and the build.sbt which also leads to the same issue (de-sync from both sets of inputs).

Brainstorming: We have a map[string -> SBTProjects] in the build.sbt. Users must add their project to this map, then there is a required list[string] that says the default required submodules. This is then passed to the main Chipyard project? Then you just need to parse that build.sbt line (this would still be janky).

So there are three options I see:

1. Heavy-handed approach, with a new YAML/JSON that serves as the single source of truth that all the other scripts and build.sbt parse

2. Use build.sbt as the single source of truth, which requires parsing build.sbt in various scripts... (very ugly)

3. No single source of truth

I'm of the opinion that approach 1 makes the most sense but requires the most engineering. Approach 3 is the easiest for now. I'm not a fan of the intermediate approach 2.

FWIW I don't think (2) is that bad (bad, but significantly less effort than (1)). For example:

val projectNames: Seq[String] = Seq("projectA", "projectB", "projectC", "projectD")

val projects = projectNames.map(name =>
  project(name).settings(
    // Add common settings here, if needed.
    version := "1.0.0",
    scalaVersion := "2.13.8" // or your desired Scala version
  )
)

With that being said, I vote for (3) for now, if people want to fix this later (when they have time - or get annoyed enough) then they can.

abejgonzalez · 2025-03-19T23:26:44Z

build.sbt

-    compressacc, saturn, ara, firrtl2_bridge, vexiiriscv, tacit)
-  .settings(libraryDependencies ++= rocketLibDeps.value)
-  .settings(
+lazy val chipyard = {


Surprisingly not horrible. I wonder if we can encode this a bit better though... (maybe we can write a quick SBT plugin to do this for us for any Chipyard submodule). Dropping https://github.com/sbt/sbt-sriracha/blob/master/src/main/scala/SrirachaPlugin.scala for a reference on how to write a plugin. Or maybe we can write a quick SBT function to wrap this (like freshProject).

SBT function seems fine.

abejgonzalez · 2025-03-19T23:28:06Z

build.sbt

    libraryDependencies ++= Seq(
      "org.reflections" % "reflections" % "0.10.2"
    )
  )
  .settings(commonSettings)
  .settings(Compile / unmanagedSourceDirectories += file("tools/stage/src/main/scala"))

+  val includeAra = file("generators/ara/.git").exists()


For now, why not add this for every SBT project Chipyard depends on (for most this would automatically add the dependency)?

This scheme currently supports only leaf projects, so we can't blanket apply this.
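
For scripts outside sbt that need to agree with this check, the same ".git exists" test can be written in shell. The sketch below is an assumption-laden illustration rather than code from the PR: the function name initialized_optional_submodules and the idea of passing the candidate names as arguments are made up for the example.

```shell
#!/usr/bin/env bash
# Prints the names of the given submodules that look initialized under
# <root>/generators, mirroring the file("generators/ara/.git").exists() check.
# $1 = repository root, remaining arguments = submodule names to test.
# Uses -e rather than -d because a submodule's .git is usually a file,
# not a directory.
initialized_optional_submodules() {
    local root="$1" name
    shift
    for name in "$@"; do
        if [ -e "$root/generators/$name/.git" ]; then
            echo "$name"
        fi
    done
}
```

For example, initialized_optional_submodules . ara compress-acc would print only the names whose directories contain a .git entry.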

Co-authored-by: Abraham Gonzalez <[email protected]>

jerryz123 added 8 commits March 19, 2025 10:38

modular ara

9b76fa2

Chance AraConfigs to symlink to submodule'd file

76cba97

Ignore errors on makefile includes of submodule'd makefiles

dbf7fa9

Do not depend on broken scala symlinks, as these may come from un init'd submodules

c7022ba

Add ara to list of allowably unitialized submods

23bcc54

Change init-submodules script to do a lite-clone by default

e45e0c8

Perform full submod init for CI

afe6662

Remove radiance optionality for now

53e33a5

jerryz123 added the changelog:added label Mar 19, 2025

Bump caliptra submod

443ef94

Modularize caliptra

b73fc75

jerryz123 force-pushed the modular branch 2 times, most recently from 06340bd to 740c9ea Compare March 19, 2025 21:54

Modularize compressacc

df2e679

jerryz123 force-pushed the modular branch from 740c9ea to df2e679 Compare March 19, 2025 23:09

abejgonzalez requested changes Mar 19, 2025

Update scripts/init-submodules-no-riscv-tools-nolog.sh

8f1ae68

Co-authored-by: Abraham Gonzalez <[email protected]>

jerryz123 mentioned this pull request Mar 20, 2025

Remove cake-pattern requirement for no-IO devices #2214

Draft

16 tasks

jerryz123 force-pushed the modular branch from cf945dd to 62ebb39 Compare March 23, 2025 19:35

modular mempress

d96d34c

jerryz123 force-pushed the modular branch from 62ebb39 to d96d34c Compare March 23, 2025 19:50

Modularize saturn

1674c1d
