diff --git a/docs/rad-sim-code-structure.rst b/docs/rad-sim-code-structure.rst index 1af111a..c2018fc 100644 --- a/docs/rad-sim-code-structure.rst +++ b/docs/rad-sim-code-structure.rst @@ -25,8 +25,12 @@ The code structure of RAD-Sim is summarized as follows: | | |- sc_flit.{cpp/hpp} | | |- radsim_noc.{cpp/hpp} | |- design_context.{cpp/hpp} + | |- design_system.hpp + | |- design_top.hpp + | |- radsim_cluster.{cpp/hpp} | |- radsim_config.{cpp/hpp} | |- radsim_defines.hpp + | |- radsim_inter_rad.{cpp/hpp} | |- radsim_module.{cpp/hpp} | |- radsim_telemetry.{cpp/hpp} | |- radsim_utils.{cpp/hpp} @@ -52,7 +56,7 @@ Simulator Infrastructure (``sim``) ---------------------------------- This directory includes all the RAD-Sim simulation infrastructure and utilities: -* The ``noc`` directory which contains everything related to the NoC modeling: +* The ``noc`` directory contains everything related to the NoC modeling: * `Booksim 2.0 `_ NoC simulator source code. * Definitions of the AXI memory mapped (AXI-MM) and streaming (AXI-S) interfaces (``{aximm/axis}_interface.hpp``). @@ -66,23 +70,31 @@ This directory includes all the RAD-Sim simulation infrastructure and utilities: * `DRAMsim3 `_ memory simulator source code. * SystemC wrapper for DRAMsim that presents an AXI-MM interface and implements functionality book-keeping to be instantiated in application designs (``mem_controller.{cpp/hpp}``). -* The ``RADSimDesignContext`` class in ``design_context.{cpp/hpp}`` which stores all the details of a RAD-Sim design such as NoCs and modules of the design, their clocks, module NoC placement, and connections between modules and NoC adapters. For each RAD-Sim simulation, there is a single global variable of this class type (``radsim_design``) that stores these information to be used from any part of the simulator. 
+* The ``RADSimDesignContext`` class in ``design_context.{cpp/hpp}`` stores all the details of a RAD-Sim design such as NoCs and modules of the design, their clocks, module NoC placement, and connections between modules and NoC adapters. For each device in the RAD-Sim simulation, there is a variable of this class type (``radsim_design``) that stores this information to be used from any part of the simulator. -* The ``RADSimConfig`` class in ``radsim_config.{cpp/hpp}`` which stores all the RAD-Sim configuration parameters. +* The ``RADSimCluster`` class in ``radsim_cluster.{cpp/hpp}`` stores details for the cluster of RADs in the RAD-Sim simulation. This is the top level of the simulation hierarchy. Single-RAD simulation is implemented as a cluster of one RAD. -* RAD-Sim constant definitions in ``radsim_defines.hpp``. This header file is automatically generated by the RAD-Sim configuration script (``config.py``). +* The ``RADSimDesignSystem`` class in ``design_system.hpp`` is a generalized parent class used per design. ``RADSimDesignSystem`` wraps around the device-under-test (DUT) and testbench. Each design in the ``example-designs`` directory has its own system class that should inherit from this class. This class has ``sc_module`` as its virtual parent class. + +* The ``RADSimDesignTop`` class in ``design_top.hpp`` is a parent class for the DUT (top) class used within any design. It creates a portal module which is used to interface with the inter-RAD network. This class has ``sc_module`` as its virtual parent class. -* The ``RADSimModule`` class in ``radsim_module.{cpp/hpp}`` which implements an abstract class from which all RAD-Sim application modules are derived. This class stores information about each module in the design such as its name, its clock, pointers to its AXI-MM/AXI-S ports and their data widths. 
Each module in the application design must implement the pure virtual funtion ``RegisterModuleInfo()`` with adds the module AXI-MM and AXI-S master/slave ports to the ``RADSimDesignContext`` class. +* The ``RADSimConfig`` class in ``radsim_config.{cpp/hpp}`` stores all the RAD-Sim configuration parameters. -* Logging and trace recording functions and classes in ``radsim_telemetry.{cpp/hpp}``. +* RAD-Sim constant definitions are in ``radsim_defines.hpp``. This header file is automatically generated by the RAD-Sim configuration script (``config.py``). - * The ``NoCTransactionTrace`` and ``NoCTransactionTelemetry`` for collecting NoC statistics. - * The ``SimLog`` class for logging simulator messages. - * The ``SimTraceRecording`` class for recording timestamps at any time during the simulation and dumping them as simulation traces at the end of the simulation. +* The ``RADSimInterRad`` class in ``radsim_inter_rad.{cpp/hpp}`` implements a latency- and bandwidth-constrained network for communication between RADs. -* Utility functions functions and struct definitions in ``radsim_utils.{cpp/hpp}``. +* The ``RADSimModule`` class in ``radsim_module.{cpp/hpp}`` implements an abstract class from which all RAD-Sim application modules are derived. This class stores information about each module in the design such as its name, its clock, pointers to its AXI-MM/AXI-S ports and their data widths. Each module in the application design must implement the pure virtual function ``RegisterModuleInfo()``, which adds the module AXI-MM and AXI-S master/slave ports to the ``RADSimDesignContext`` class. -* The ``main.cpp`` file which declares all the global variables, instantiates the system to be simulated and starts the SystemC simulation. +* Logging and trace recording functions and classes are in ``radsim_telemetry.{cpp/hpp}``. + + * The ``NoCTransactionTrace`` and ``NoCTransactionTelemetry`` classes are used for collecting NoC statistics. 
+ * The ``SimLog`` class is for logging simulator messages. + * The ``SimTraceRecording`` class is for recording timestamps at any time during the simulation and dumping them as simulation traces at the end of the simulation. + +* Utility functions and struct definitions are in ``radsim_utils.{cpp/hpp}``. + +* The ``main.cpp`` file declares all the global variables, instantiates the system to be simulated, and starts the SystemC simulation. Application Designs (``example-designs``) ----------------------------------------- @@ -92,13 +104,13 @@ own sub-directory (``/``) which must contain the following files/di Modules Directory (``modules/``) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -This directory includes the SystemC definitions of all the modules in the design. All these modules have to be derived +This directory includes the SystemC definitions of all the modules in the design. All of these modules have to be derived from the ``RADSimModule`` abstract class. If a module is to be attached to the NoC, it must have AXI-MM and/or AXI-S ports which are defined in the ``sim/{aximm|axi_s}_interface.hpp`` files. Design Top-level (``_top.{cpp/hpp}``) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -These files define a SystemC module (``sc_module``) that instantiates all the modules in the design and connects any +These files define a RADSimDesignTop class which in turn defines a SystemC module (``sc_module``) that instantiates all the modules in the design and connects any non-NoC signals between the modules in its constructor using conventional SystemC syntax. At the end of its constructor, it must include the following lines of code to build the design context, create the system NoCs, and automatically connect the ports of NoC-attached modules to the NoC based on the NoC placement file: @@ -106,15 +118,15 @@ connect the ports of NoC-attached modules to the NoC based on the NoC placement .. 
code-block:: c++ // mydesign_top Constructor - mydesign_top::mydesign_top(const sc_module_name &name): sc_module(name) { - + mydesign_top::mydesign_top(const sc_module_name &name, RADSimDesignContext* radsim_design) : RADSimDesignTop(radsim_design) { + this->radsim_design = radsim_design; //to use within design // Module Instantiations and Connections Start Here // ... // Module Instantiations and Connections End Here - radsim_design.BuildDesignContext("mydesign.place", "mydesign.clks"); - radsim_design.CreateSystemNoCs(rst); - radsim_design.ConnectModulesToNoC(); + radsim_design->BuildDesignContext("mydesign.place", "mydesign.clks"); + radsim_design->CreateSystemNoCs(rst); + radsim_design->ConnectModulesToNoC(); } The design top-level SystemC module will typically have input/output ports (``sc_in/sc_out``) which will be used to @@ -127,15 +139,15 @@ It has two SystemC threads (``SC_CTHREAD``): a ``source`` thread that sends inpu and a ``sink`` thread that listens on the design top-level output ports to receive outputs. A common scenario is that this driver module performs the following steps: -1. Parse test inputs and golden outputs from files -2. Use the ``source`` thread to send inputs to design top-level when ready -3. Use ``sink`` thread to listen for outputs from the design top-level when available -4. Compare received outputs to golden outputs to verify functionality -5. Stop simulation when all outputs are received +1. Parse test inputs and golden outputs from files. +2. Use the ``source`` thread to send inputs to the design top-level when ready. +3. Use the ``sink`` thread to listen for outputs from the design top-level when available. +4. Compare received outputs to golden outputs to verify functionality. +5. Raise the per-RAD done flag when all testbench outputs are received. When all testbenches (for all RADs in the simulation) raise their done flags, the simulation stops. 
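The stop condition in step 5 can be sketched in a few lines of Python. This is only an illustration of the idea that each RAD raises its own done flag and the simulation ends once every flag is raised. ``DoneFlagTracker`` and its methods are hypothetical names chosen for this sketch and are not part of the RAD-Sim API.

```python
class DoneFlagTracker:
    """Tracks one done flag per RAD (hypothetical helper, not a RAD-Sim class)."""

    def __init__(self, num_rads: int) -> None:
        if num_rads < 1:
            raise ValueError("a cluster needs at least one RAD")
        # One flag per RAD, all initially lowered.
        self._done = [False] * num_rads

    def raise_done(self, rad_id: int) -> None:
        # Called by the testbench of RAD `rad_id` once all its outputs arrived.
        self._done[rad_id] = True

    def all_done(self) -> bool:
        # The simulation may stop only when every RAD has raised its flag.
        return all(self._done)


tracker = DoneFlagTracker(num_rads=2)
tracker.raise_done(0)
print(tracker.all_done())  # prints False: only RAD 0 has finished
tracker.raise_done(1)
print(tracker.all_done())  # prints True: both RADs have finished
```

A single-RAD simulation is the special case ``num_rads=1``, where raising the only flag ends the run.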
Design System (``_system.{cpp/hpp}``) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -This is a simple SystemC module (``sc_module``) that instantiates and connects the design top-level and simulation +This class inherits from ``RADSimDesignSystem`` and is a simple SystemC module (``sc_module``) that instantiates and connects the design top-level and simulation driver modules. This is the single module that will be instantiated inside the ``sc_main()`` function in the ``main.cpp`` file. @@ -155,6 +167,10 @@ design's ``config.yml`` file. For example, if the ``config.yml`` file, had the f adapters of both modules are operating at 1.25 ns clock period (800 MHz), while ``module_a`` has a clock period of 2.5 ns (400 MHz) and ``module_b`` has a clock period of 5.0 ns (200 MHz). +.. note:: +   For designs containing multiple RADs, RAD-Sim adds a portal module to the design, which allows for communication between +   RADs. The clock configuration for the portal module should be added to the clock configuration file. + .. code-block:: yaml noc_adapters: @@ -188,9 +204,15 @@ and the bottom-right router has ID :math:`N^2-1` for an :math:`N \times N` mesh. interfaces, it is possible to only write the module name and this will result in all its ports to be connected to the same NoC router with arbitration logic between them. +.. note:: +   For designs containing multiple RADs, RAD-Sim adds a portal module to the design, which allows for communication between +   RADs. The NoC configuration for the portal module should be added to the configuration file. AXI-S is the correct +   interface type. Verify that the design configuration YAML file has a large enough NoC size to include the portal module. +   Any unused NoC ID can be selected. 
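As a rough aid for picking a NoC ID for the portal module, the sketch below lists the router IDs that are still free. It assumes a square :math:`N \times N` mesh with router IDs 0 to :math:`N^2-1`, and it assumes the IDs already used by the design have been collected into a set by hand. ``unused_router_ids`` is a hypothetical helper, not a RAD-Sim function, and it does not parse the placement file.

```python
def unused_router_ids(mesh_dim, used_ids):
    """Return the sorted router IDs of an N x N mesh that no module occupies.

    mesh_dim: N, the number of routers per mesh dimension (assumed square mesh).
    used_ids: iterable of router IDs already taken in the placement file.
    """
    total = mesh_dim * mesh_dim
    used = set(used_ids)
    # Reject IDs that cannot exist in a mesh of this size.
    out_of_range = sorted(i for i in used if not 0 <= i < total)
    if out_of_range:
        raise ValueError(
            f"router IDs outside the {mesh_dim}x{mesh_dim} mesh: {out_of_range}"
        )
    return [i for i in range(total) if i not in used]


# A 3x3 mesh in which six routers are already occupied.
free = unused_router_ids(3, {0, 1, 2, 4, 5, 7})
print(free)  # prints [3, 6, 8]
```

An empty result means the NoC is full, in which case the NoC dimensions in the configuration file need to grow before the portal module can be placed.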
+ CMakeLists File (``CMakeLists.txt``) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -This is a convntional CMakeLists file that lists all your modules, top, driver, and system header and source files +This is a conventional CMakeLists file that lists all your modules, top, driver, and system header and source files for CMake to compile correctly when you build RAD-Sim for the application design. For a new application design, it is recommended that you copy the ``CMakeLists.txt`` file from one of the provided example design directories and edit the ``hdrfiles`` and ``srcfiles`` variables to include all your design ``.hpp`` and ``.cpp`` files. @@ -198,8 +220,15 @@ recommended that you copy the ``CMakeLists.txt`` file from one of the provided e RAD-Sim Configuration File (``config.yml``) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ This YAML file configures all the RAD-Sim parameters for the simulation of the application design under 4 main tags: -``noc``, ``noc_adapters``, ``design``, and ``telemetry``. An example configuration file is shown below, followed by -an explanation for each configuration parameter. +``noc``, ``noc_adapters``, ``config ``, and ``cluster``. The ``noc`` and ``noc_adapters`` parameters are shared across all RADs. +There may be multiple ``config `` sections, each describing a RAD configuration that can be applied to a single or multiple devices in the cluster. +The ``cluster`` tag describes the cluster of RADs, including the number of RADs and their configurations. + +This file should be located in the same directory as the ``config.py`` script. For a new design, you should copy +the ``config.yml`` file from one of the provided example design directories and make modifications for your use case. + +Note that the parameters within a ``config `` subsection can be applied to a single RAD or shared among multiple RADs. +An example configuration file is shown below, followed by an explanation for each configuration parameter. .. 
code-block:: yaml @@ -232,14 +261,40 @@ an explanation for each configuration parameter. out_arbiter: ['priority_rr'] vc_mapping: ['direct'] - design: - name: 'aximm_hello_world' - noc_placement: ['aximm_hello_world.place'] - clk_periods: [5.0] - - telemetry: - log_verbosity: 2 - traces: [] + config rad1: + dram: + num_controllers: 4 + clk_periods: [3.32, 3.32, 2.0, 2.0] + queue_sizes: [64, 64, 64, 64] + config_files: ['DDR4_8Gb_x16_2400', 'DDR4_8Gb_x16_2400', 'HBM2_8Gb_x128', 'HBM2_8Gb_x128'] + + design: + name: 'dlrm' + noc_placement: ['dlrm.place'] + clk_periods: [5.0, 2.0, 3.32, 1.5] + + config anotherconfig: + dram: + num_controllers: 4 + clk_periods: [3.32, 3.32, 2.0, 2.0] + queue_sizes: [64, 64, 64, 64] + config_files: ['DDR4_8Gb_x16_2400', 'DDR4_8Gb_x16_2400', 'HBM2_8Gb_x128', 'HBM2_8Gb_x128'] + + design: + name: 'dlrm' + noc_placement: ['dlrm.place'] + clk_periods: [5.0, 2.0, 3.32, 1.5] + + cluster: + sim_driver_period: 5.0 + telemetry_log_verbosity: 2 + telemetry_traces: ['Embedding LU', 'Mem0', 'Mem1', 'Mem2', 'Mem3', 'Feature Inter.', 'MVM first', 'MVM last'] + num_rads: 2 + cluster_configs: ['rad1', 'anotherconfig'] #use config 'rad1' for the first RAD and config 'anotherconfig' for the second RAD under simulation + cluster_topology: 'all-to-all' #this parameter is not currently used + inter_rad_latency: 2100 #in nanoseconds + inter_rad_bw: 102.4 #in bits per nanosecond + inter_rad_fifo_num_slots: 1000 **NoC Configuration Parameters** @@ -293,19 +348,47 @@ an explanation for each configuration parameter. :menuselection:`vc_mapping` -**Design Configuration Parameters** +**Configuration Parameters** + +**Config subsection: DRAM Configuration Parameters** + +:menuselection:`num_controllers` is the number of DRAM controllers + +:menuselection:`clk_periods` are the clock periods per DRAM + +:menuselection:`queue_sizes` are the queue sizes, one per DRAM controller
+ +:menuselection:`config_files` are the names of the ``DRAMsim3`` configuration files that specify the memory configuration, one per DRAM. For a complete list of configuration options, check the ``rad-flow/rad-sim/sim/dram/DRAMsim3/configs/`` directory. + +**Config subsection: Design Configuration Parameters** + +:menuselection:`name` is the name of the design being run in this configuration + +:menuselection:`noc_placement` is the NoC placement file to use + +:menuselection:`clk_periods` is a list of all clock periods used in this design + +**Cluster Configuration Parameters** + +:menuselection:`sim_driver_period` is the max clock period in nanoseconds for the entire simulation. Simulation cycle counts are reported based upon this. + +:menuselection:`telemetry_log_verbosity` specifies how much detail to use for the telemetry logging + +:menuselection:`telemetry_traces` specifies which simulation traces to use for telemetry + +:menuselection:`num_rads` is the number of RADs being simulated -:menuselection:`name` +:menuselection:`cluster_configs` is a list of which configuration to use for each RAD. These names must match the names of the ``config`` sections. -:menuselection:`noc_placement` +:menuselection:`cluster_topology` is not currently used but is meant to specify the connection of RADs within the cluster. +Currently only all-to-all is supported, wherein each RAD can send data to and receive data from any other RAD directly over the inter-RAD network. 
-:menuselection:`clk_periods` +:menuselection:`inter_rad_latency` is the latency in nanoseconds for data transfer between RADs over the inter-RAD network -**Telemetry Configuration Parameters** +:menuselection:`inter_rad_bw` is the bandwidth in bits per nanosecond for data transfer between RADs over the inter-RAD network -:menuselection:`log_verbosity` +:menuselection:`inter_rad_fifo_num_slots` is the number of FIFO slots available for buffering within the inter-RAD network -:menuselection:`traces` -Testing Scripts (``test``) --------------------------- +.. 
Testing Scripts (``test``) +.. -------------------------- diff --git a/docs/rad-sim-developer.rst b/docs/rad-sim-developer.rst index 3e9ad03..7e31f3a 100644 --- a/docs/rad-sim-developer.rst +++ b/docs/rad-sim-developer.rst @@ -6,6 +6,9 @@ RAD-Sim Testing Infrastructure Python Scripts Tests ^^^^^^^^^^^^^^^^^^^^^ +.. note:: + This script does not currently work in multi-RAD RAD-Sim. + To run python tests, ensure the current working directory is in the ``rad-sim`` folder and run the following steps: #. ``python -m unittest discover .`` diff --git a/docs/rad-sim-quick-start.rst b/docs/rad-sim-quick-start.rst index 3ac3afc..6332828 100644 --- a/docs/rad-sim-quick-start.rst +++ b/docs/rad-sim-quick-start.rst @@ -74,7 +74,6 @@ Building RAD-Sim ---------------- You can configure RAD-Sim for your example design simulation using the following commands executed at the ``rad-sim`` root directory (the commands use the ``mlp`` example design which can be replaced by your own design under the ``rad-flow/rad-sim/example-designs`` directory): - .. code-block:: bash $ cd /rad-sim diff --git a/docs/rad-sim-rtl-code.rst b/docs/rad-sim-rtl-code.rst index 7e477cb..fd22815 100644 --- a/docs/rad-sim-rtl-code.rst +++ b/docs/rad-sim-rtl-code.rst @@ -58,6 +58,17 @@ RAD-Sim has a pre-defined file structure for supporting RTL code. All RTL code m An example design that utilizes RTL modules can be found in the ``rad-sim/example-designs/rtl_add`` folder. +.. note:: + For designs containing multiple RADs, RAD-Sim adds a portal module to each RAD for communication between devices. + Bitwidths for AXI-S signals carrying destination address (``tdest``) should match the ``DESTW`` set in + ``sim/radsim_defines.hpp``, which is generated by running: + .. code-block:: bash + $ python config.py + +.. note:: + RAD-Sim adds a portal module for designs containing multiple RADs. The NoC, clock, and general configuration files + should be modified according to the code structure guide. 
+ RTL CMakeLists --------------- The RTL source folder additionally contains a CMakeLists script, and an optional port mapping file used for :ref:`automatic wrapper generation `. diff --git a/docs/rad-sim-two-rad-dlrm-example.rst b/docs/rad-sim-two-rad-dlrm-example.rst new file mode 100644 index 0000000..0995786 --- /dev/null +++ b/docs/rad-sim-two-rad-dlrm-example.rst @@ -0,0 +1,48 @@ +Two-RAD DLRM Example Design +=========================== + +This guide explains how to use the two-RAD DLRM example design. RAD 1 is responsible for the DLRM up to and including the embedding table lookups. +The lookup results are then transmitted over the inter-RAD network to RAD 2, which completes the remaining model stages. + +Building RAD-Sim +---------------- + +You can configure RAD-Sim for the two-RAD DLRM design simulation using the following commands executed at the ``rad-sim`` root directory. + +.. code-block:: bash + + $ cd /rad-sim + $ python config.py dlrm_two_rad #dlrm_two_rad is the name of the design directory within the example-designs parent directory + +Running RAD-Sim +---------------- + +You can then simulate this two-RAD DLRM example design following these steps: + + +1. Generate a DLRM test case using the provided compiler: + + .. code-block:: bash + + $ cd /rad-sim/example-designs/dlrm_two_rad/compiler + $ python dlrm.py + +2. Run the RAD-Sim simulation: + + .. code-block:: bash + + $ cd /rad-sim/build + $ make run + # Info: /OSCI/SystemC: Simulation stopped by user. + # Simulation Cycles from main.cpp = 20390 + # [100%] Built target run + # dlrm_system.driver: Finished sending all inputs to embedding lookup module! + # dlrm_system.dut.feature_interaction_inst: Got all memory responses at cycle 6113! + # [==================================================] 100 % + # Got 2048 output(s)! + # Simulation PASSED! All outputs matching! + # Simulated 19958 cycle(s) + + # Info: /OSCI/SystemC: Simulation stopped by user. 
+ # Simulation Cycles from main.cpp = 19971 + # [100%] Built target run \ No newline at end of file diff --git a/rad-sim/.gitignore b/rad-sim/.gitignore index c567747..b501b2f 100644 --- a/rad-sim/.gitignore +++ b/rad-sim/.gitignore @@ -3,4 +3,8 @@ radsim_knobs __pycache__ -test/*.xml \ No newline at end of file +test/*.xml +*.cmake +*.a +CMakeFiles +MakeFile \ No newline at end of file diff --git a/rad-sim/CMakeLists.txt b/rad-sim/CMakeLists.txt index 1bd5738..c5a86b2 100644 --- a/rad-sim/CMakeLists.txt +++ b/rad-sim/CMakeLists.txt @@ -5,8 +5,11 @@ project(RADSim) set(CMAKE_BINARY_DIR "./build/") set(CMAKE_RUNTIME_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}) -SET(DESIGN "dlrm" CACHE STRING "Design directory to be compiled. Must be under rad-flow/rad-sim/example-designs") -message(STATUS "Compiling the ${DESIGN} design") +#SET(DESIGN "dlrm" CACHE STRING "Design directory to be compiled. Must be under rad-flow/rad-sim/example-designs") +FOREACH(DESIGN_NAME ${DESIGN_NAMES}) + #MESSAGE("<<${DESIGN_NAME}>>") + message(STATUS "Compiling the ${DESIGN_NAME} design") +ENDFOREACH() add_subdirectory(sim) add_subdirectory(example-designs) diff --git a/rad-sim/config.py b/rad-sim/config.py index dfd230f..5c0a507 100644 --- a/rad-sim/config.py +++ b/rad-sim/config.py @@ -3,148 +3,197 @@ import yaml import sys import shutil +from itertools import repeat +from copy import deepcopy +from math import ceil -def parse_config_file(config_filename, booksim_params, radsim_header_params, radsim_knobs): +def parse_config_file(config_filename, booksim_params, radsim_header_params, radsim_knobs, cluster_knobs): with open(config_filename, 'r') as yaml_config: config = yaml.safe_load(yaml_config) - - for param_category in config: - for param, param_value in config[param_category].items(): - param_name = param_category + '_' + param - invalid_param = True - if param_name in booksim_params: - booksim_params[param_name] = param_value - invalid_param = False - if param_name in radsim_header_params: - 
radsim_header_params[param_name] = param_value - invalid_param = False - if param_name in radsim_knobs: - radsim_knobs[param_name] = param_value - invalid_param = False - - if invalid_param: - print("Config Error: Parameter " + param_name + " is invalid!") - exit(1) - - '''noc_num_nodes = [] - for n in range(radsim_knobs["noc_num_nocs"]): - noc_num_nodes.append(0) - radsim_knobs["noc_num_nodes"] = noc_num_nodes - radsim_header_params["noc_num_nodes"] = noc_num_nodes''' - radsim_knobs["radsim_user_design_root_dir"] = radsim_knobs["radsim_root_dir"] + "/example-designs/" + radsim_knobs["design_name"] - - longest_clk_period = radsim_knobs["design_clk_periods"][0] - for p in radsim_knobs["design_clk_periods"]: - if p > longest_clk_period: - longest_clk_period = p - radsim_knobs["sim_driver_period"] = longest_clk_period + config_counter = 0 + for config_section in config: + print(config_section + ':') + if 'config' in config_section: + print('NAME OF CONFIG: ' + str(config_section.split()[1])) + config_names.append(str(config_section.split()[1])) + for param_category, param in config[config_section].items(): + if 'config' in config_section and (isinstance(param, dict)): + print(' ' + param_category + ':') + for param, param_value in param.items(): + print(' ' + param, param_value) + param_name = param_category + '_' + param + invalid_param = True + if param_name in booksim_params[config_counter]: + booksim_params[config_counter][param_name] = param_value + invalid_param = False + if config_counter == 0 and param_name in radsim_header_params: #all header params are shared across RADs, so only need to store one time + radsim_header_params[param_name] = param_value + invalid_param = False + if param_name in radsim_knobs[config_counter]: + radsim_knobs[config_counter][param_name] = param_value + invalid_param = False + + if invalid_param: + print("Config Error: Parameter " + param_name + " is invalid!") + exit(1) + + elif config_section == "noc" or config_section == 
"noc_adapters" or config_section == "interfaces": + param_value = param #bc no subsection, so correction + param = param_category #bc no subsection, so correction + print(' ' + param, param_value) + param_name = config_section + '_' + param + invalid_param = True + for i in range(0, num_configs): #use num_configs from command line in case NoC sections are earlier in yaml file + if param_name in booksim_params[i]: + booksim_params[i][param_name] = param_value + invalid_param = False + if param_name in radsim_header_params: #all header params are shared across RADs, so only need to store one time + radsim_header_params[param_name] = param_value + invalid_param = False + if param_name in radsim_knobs[i]: + radsim_knobs[i][param_name] = param_value + invalid_param = False + if invalid_param: + print("Config Error: Parameter " + param_name + " is invalid!") + exit(1) + + elif config_section == "cluster": + param_value = param #bc no subsection, so correction + param_name = param_category #bc no subsection, so correction + print(' ' + param_name, param_value) + if param_name in cluster_knobs: + cluster_knobs[param_name] = param_value + else: + print("Config Error: Parameter " + param_name + " is invalid!") + exit(1) + if 'config' in config_section: + config_counter += 1 + + for i in range(0, num_configs): + radsim_knobs[i]["radsim_user_design_root_dir"] = cluster_knobs["radsim_root_dir"] + "/example-designs/" + radsim_knobs[i]["design_name"] + + if config_counter != num_configs: + print('number of unique config sections in config YAML file does not match commandline argument') + exit(-1) def print_config(booksim_params, radsim_header_params, radsim_knobs): print("*****************************") print("** RAD-FLOW CONFIGURATION **") print("*****************************") - for param in booksim_params: - print(param + " : " + str(booksim_params[param])) + for config_count in range(num_configs): + for param in booksim_params[config_count].keys(): + print("config " + 
str(config_count) + " : " + param + " : " + str(booksim_params[config_count][param])) + for param in radsim_knobs[config_count].keys(): + print("config " + str(config_count) + " : " + param + " : " + str(radsim_knobs[config_count][param])) for param in radsim_header_params: print(param + " : " + str(radsim_header_params[param])) - for param in radsim_knobs: - print(param + " : " + str(radsim_knobs[param])) - - -def generate_booksim_config_files(booksim_params, radsim_header_params, radsim_knobs): - for i in range(booksim_params["noc_num_nocs"]): - booksim_config_file = open(booksim_params["radsim_root_dir"] + "/sim/noc/noc" + str(i) + "_config", "w") - - # Booksim topology configuration - booksim_config_file.write("// Topology\n") - noc_topology = booksim_params["noc_topology"][i] - noc_type = booksim_params["noc_type"][i] - if noc_topology == "mesh" or noc_topology == "torus": - # A 3D RAD instance is modeled as a concenterated mesh NoC - if noc_type == "2d": - booksim_config_file.write("topology = " + noc_topology + ";\n") - elif noc_type == "3d" and noc_topology == "mesh": - booksim_config_file.write("topology = cmesh;\n") - else: - print("Config Error: noc_type parameter value has to be 2d or 3d") - exit(1) - # Booksim does not support assymetric meshes so it is simplified as a square mesh assuming that a simple dim - # order routing will never use the links/routers outside the specified grid - noc_dim_x = booksim_params["noc_dim_x"][i] - noc_dim_y = booksim_params["noc_dim_y"][i] - larger_noc_dim = noc_dim_x - if noc_dim_y > noc_dim_x: - larger_noc_dim = noc_dim_y - booksim_config_file.write("k = " + str(larger_noc_dim) + ";\n") - booksim_config_file.write("n = 2;\n") - - # Booksim supports concentrated meshes of 4 nodes per router only -- RAD-Sim works around that by modeling - # 3D RAD instances as a concentrated mesh of FPGA node, base die node, and two "empty" nodes by adjusting - # their IDs - if noc_type == "3d": - 
radsim_header_params["noc_num_nodes"][i] = (larger_noc_dim * larger_noc_dim * 4) - radsim_knobs["noc_num_nodes"][i] = larger_noc_dim * larger_noc_dim * 4 - booksim_config_file.write("c = 4;\n") - booksim_config_file.write("xr = 2;\n") - booksim_config_file.write("yr = 2;\n") + +def generate_booksim_config_files(booksim_params, radsim_header_params, radsim_knobs, cluster_knobs): + num_configs = len(cluster_knobs["cluster_configs"]) + for config_idx in range(num_configs): #curr_config_num in range(num_configs): + curr_config_name = cluster_knobs["cluster_configs"][config_idx] #retrieve the config num by rad ID + curr_config_num = config_names.index(curr_config_name) + print('generate_booksim_config_files fn ' + curr_config_name + ' ' + str(curr_config_num)) + for noc_idx in range(booksim_params[curr_config_num]["noc_num_nocs"]): + booksim_config_file = open(booksim_params[curr_config_num]["radsim_root_dir"] + "/sim/noc/noc" + str(noc_idx) + "_rad" + str(curr_config_num) + "_config", "w") + + # Booksim topology configuration + booksim_config_file.write("// Topology\n") + noc_topology = booksim_params[curr_config_num]["noc_topology"][noc_idx] + noc_type = booksim_params[curr_config_num]["noc_type"][noc_idx] + if noc_topology == "mesh" or noc_topology == "torus": + # A 3D RAD instance is modeled as a concenterated mesh NoC + if noc_type == "2d": + booksim_config_file.write("topology = " + noc_topology + ";\n") + elif noc_type == "3d" and noc_topology == "mesh": + booksim_config_file.write("topology = cmesh;\n") + else: + print("Config Error: noc_type parameter value has to be 2d or 3d") + exit(1) + + # Booksim does not support assymetric meshes so it is simplified as a square mesh assuming that a simple dim + # order routing will never use the links/routers outside the specified grid + noc_dim_x = booksim_params[curr_config_num]["noc_dim_x"][noc_idx] + noc_dim_y = booksim_params[curr_config_num]["noc_dim_y"][noc_idx] + larger_noc_dim = noc_dim_x + if noc_dim_y > 
noc_dim_x: + larger_noc_dim = noc_dim_y + booksim_config_file.write("k = " + str(larger_noc_dim) + ";\n") + booksim_config_file.write("n = 2;\n") + + # Booksim supports concentrated meshes of 4 nodes per router only -- RAD-Sim works around that by modeling + # 3D RAD instances as a concentrated mesh of FPGA node, base die node, and two "empty" nodes by adjusting + # their IDs + if noc_type == "3d": + radsim_header_params["noc_num_nodes"][noc_idx] = (larger_noc_dim * larger_noc_dim * 4) + #TODO: make changes to have per-RAD booksim config too. for now, just hacky workaround to get radsim_knobs set + #for j in range(cluster_knobs["num_rads"]): + #radsim_knobs[j]["noc_num_nodes"][noc_idx] = larger_noc_dim * larger_noc_dim * 4 + radsim_knobs[curr_config_num]["noc_num_nodes"][noc_idx] = larger_noc_dim * larger_noc_dim * 4 + booksim_config_file.write("c = 4;\n") + booksim_config_file.write("xr = 2;\n") + booksim_config_file.write("yr = 2;\n") + else: + radsim_header_params["noc_num_nodes"][noc_idx] = (larger_noc_dim * larger_noc_dim) + #TODO: make changes to have per-RAD booksim config AND radsim_header_params too. 
for now, just hacky workaround to get radsim_knobs set + # for j in range(cluster_knobs["num_rads"]): + # radsim_knobs[j]["noc_num_nodes"][noc_idx] = larger_noc_dim * larger_noc_dim + radsim_knobs[curr_config_num]["noc_num_nodes"][noc_idx] = larger_noc_dim * larger_noc_dim + + elif noc_topology == "anynet": + booksim_config_file.write("topology = anynet;\n") + booksim_config_file.write("network_file = " + booksim_params[curr_config_num]["noc_anynet_file"][noc_idx] + ";\n") + if radsim_header_params["noc_num_nodes"][noc_idx] == 0: + print("Config Error: Number of nodes parameter missing for anynet NoC topologies!") + exit(1) + else: - radsim_header_params["noc_num_nodes"][i] = (larger_noc_dim * larger_noc_dim) - radsim_knobs["noc_num_nodes"][i] = larger_noc_dim * larger_noc_dim - - elif noc_topology == "anynet": - booksim_config_file.write("topology = anynet;\n") - booksim_config_file.write("network_file = " + booksim_params["noc_anynet_file"][i] + ";\n") - if radsim_header_params["noc_num_nodes"][i] == 0: - print("Config Error: Number of nodes parameter missing for anynet NoC topologies!") + print("Config Error: This NoC topology is not supported by RAD-Sim!") exit(1) - - else: - print("Config Error: This NoC topology is not supported by RAD-Sim!") - exit(1) - booksim_config_file.write("\n") - - # Booksim routing function configuration - booksim_config_file.write("// Routing\n") - booksim_config_file.write("routing_function = " + booksim_params["noc_routing_func"][i] + ";\n") - booksim_config_file.write("\n") - - # Booksim flow control configuration - booksim_config_file.write("// Flow control\n") - noc_vcs = booksim_params["noc_vcs"][i] - noc_num_packet_types = booksim_params["noc_num_packet_types"][i] - if noc_vcs % noc_num_packet_types != 0: - print("Config Error: Number of virtual channels has to be a multiple of the number of packet types!") - exit(1) - if noc_num_packet_types > 5: - print("Config Error: RAD-Sim supports up to 5 packet types") 
- exit(1) - noc_num_vcs_per_packet_type = int(noc_vcs / noc_num_packet_types) - booksim_config_file.write("num_vcs = " + str(noc_vcs) + ";\n") - booksim_config_file.write("vc_buf_size = " + str(booksim_params["noc_vc_buffer_size"][i]) + ";\n") - booksim_config_file.write("output_buffer_size = "+ str(booksim_params["noc_output_buffer_size"][i])+ ";\n") - booksim_flit_types = ["read_request", "write_request", "write_data", "read_reply", "write_reply"] - vc_count = 0 - for t in range(noc_num_packet_types): - booksim_config_file.write(booksim_flit_types[t] + "_begin_vc = " + str(vc_count) + ";\n") - vc_count = vc_count + noc_num_vcs_per_packet_type - booksim_config_file.write(booksim_flit_types[t] + "_end_vc = " + str(vc_count - 1) + ";\n") - booksim_config_file.write("\n") - - # Booksim router architecture and delays configuration - booksim_config_file.write("// Router architecture & delays\n") - booksim_config_file.write("router = " + booksim_params["noc_router_uarch"][i] + ";\n") - booksim_config_file.write("vc_allocator = " + booksim_params["noc_vc_allocator"][i] + ";\n") - booksim_config_file.write("sw_allocator = " + booksim_params["noc_sw_allocator"][i] + ";\n") - booksim_config_file.write("alloc_iters = 1;\n") - booksim_config_file.write("wait_for_tail_credit = 0;\n") - booksim_config_file.write("credit_delay = " + str(booksim_params["noc_credit_delay"][i]) + ";\n") - booksim_config_file.write("routing_delay = " + str(booksim_params["noc_routing_delay"][i]) + ";\n") - booksim_config_file.write("vc_alloc_delay = " + str(booksim_params["noc_vc_alloc_delay"][i]) + ";\n") - booksim_config_file.write("sw_alloc_delay = " + str(booksim_params["noc_sw_alloc_delay"][i]) + ";\n") - booksim_config_file.close() + booksim_config_file.write("\n") + + # Booksim routing function configuration + booksim_config_file.write("// Routing\n") + booksim_config_file.write("routing_function = " + booksim_params[curr_config_num]["noc_routing_func"][noc_idx] + ";\n") + 
booksim_config_file.write("\n") + + # Booksim flow control configuration + booksim_config_file.write("// Flow control\n") + noc_vcs = booksim_params[curr_config_num]["noc_vcs"][noc_idx] + noc_num_packet_types = booksim_params[curr_config_num]["noc_num_packet_types"][noc_idx] + if noc_vcs % noc_num_packet_types != 0: + print("Config Error: Number of virtual channels has to be a multiple of the number of packet types!") + exit(1) + if noc_num_packet_types > 5: + print("Config Error: RAD-Sim supports up to 5 packet types") + exit(1) + noc_num_vcs_per_packet_type = int(noc_vcs / noc_num_packet_types) + booksim_config_file.write("num_vcs = " + str(noc_vcs) + ";\n") + booksim_config_file.write("vc_buf_size = " + str(booksim_params[curr_config_num]["noc_vc_buffer_size"][noc_idx]) + ";\n") + booksim_config_file.write("output_buffer_size = "+ str(booksim_params[curr_config_num]["noc_output_buffer_size"][noc_idx])+ ";\n") + booksim_flit_types = ["read_request", "write_request", "write_data", "read_reply", "write_reply"] + vc_count = 0 + for t in range(noc_num_packet_types): + booksim_config_file.write(booksim_flit_types[t] + "_begin_vc = " + str(vc_count) + ";\n") + vc_count = vc_count + noc_num_vcs_per_packet_type + booksim_config_file.write(booksim_flit_types[t] + "_end_vc = " + str(vc_count - 1) + ";\n") + booksim_config_file.write("\n") + + # Booksim router architecture and delays configuration + booksim_config_file.write("// Router architecture & delays\n") + booksim_config_file.write("router = " + booksim_params[curr_config_num]["noc_router_uarch"][noc_idx] + ";\n") + booksim_config_file.write("vc_allocator = " + booksim_params[curr_config_num]["noc_vc_allocator"][noc_idx] + ";\n") + booksim_config_file.write("sw_allocator = " + booksim_params[curr_config_num]["noc_sw_allocator"][noc_idx] + ";\n") + booksim_config_file.write("alloc_iters = 1;\n") + booksim_config_file.write("wait_for_tail_credit = 0;\n") + booksim_config_file.write("credit_delay = " + 
str(booksim_params[curr_config_num]["noc_credit_delay"][noc_idx]) + ";\n") + booksim_config_file.write("routing_delay = " + str(booksim_params[curr_config_num]["noc_routing_delay"][noc_idx]) + ";\n") + booksim_config_file.write("vc_alloc_delay = " + str(booksim_params[curr_config_num]["noc_vc_alloc_delay"][noc_idx]) + ";\n") + booksim_config_file.write("sw_alloc_delay = " + str(booksim_params[curr_config_num]["noc_sw_alloc_delay"][noc_idx]) + ";\n") + booksim_config_file.close() def generate_radsim_params_header(radsim_header_params): @@ -153,6 +202,9 @@ def generate_radsim_params_header(radsim_header_params): radsim_params_header_file.write("// clang-format off\n") radsim_params_header_file.write('#define RADSIM_ROOT_DIR "' + radsim_header_params["radsim_root_dir"] + '"\n\n') + if (cluster_knobs["num_rads"] <= 1): + radsim_params_header_file.write('#define SINGLE_RAD 1\n\n') + radsim_params_header_file.write("// NoC-related Parameters\n") # Finding maximum NoC payload width and setting its definition max_noc_payload_width = 0 @@ -189,8 +241,8 @@ def generate_radsim_params_header(radsim_header_params): for n in radsim_header_params["noc_num_nodes"]: if n > max_num_nodes: max_num_nodes = n - max_destination_bitwidth = int(math.ceil(math.log(max_num_nodes, 2))) * 3 # TO-DO-MR: Multiply by 3 for (rad_id, node_id1, node_id0). If single RAD, node_id0 and node_id1 is the same. If not, node_id0 is portal and node_id1 is destination node on other RAD. - max_destination_field_bitwidth = int(math.ceil(math.log(max_num_nodes, 2))) # TO-DO-MR: Bitwidth of a single field of the destination described above. + max_destination_bitwidth = int(math.ceil(math.log(max_num_nodes, 2))) * 3 #Multiply by 3 for (rad_id, node_id1, node_id0). If single RAD, node_id0 and node_id1 is the same. If not, node_id0 is portal and node_id1 is destination node on other RAD. 
+ max_destination_field_bitwidth = int(math.ceil(math.log(max_num_nodes, 2))) #Bitwidth of a single field of the destination described above. radsim_params_header_file.write("#define NOC_LINKS_DEST_WIDTH " + str(max_destination_bitwidth) + "\n") dest_interface_bitwidth = int(math.ceil(math.log(radsim_header_params["noc_max_num_router_dest_interfaces"], 2))) @@ -213,7 +265,8 @@ def generate_radsim_params_header(radsim_header_params): radsim_params_header_file.write("#define AXIS_KEEPW " + str(radsim_header_params["interfaces_axis_tkeep_width"]) + "\n") radsim_params_header_file.write("#define AXIS_IDW NOC_LINKS_PACKETID_WIDTH\n") radsim_params_header_file.write("#define AXIS_DESTW NOC_LINKS_DEST_WIDTH\n") - radsim_params_header_file.write("#define AXIS_DEST_FIELDW " + str(max_destination_field_bitwidth) + "\n") # TO-DO-MR: Define parameter for destination field width (to separate 3 fields) + #NOTE: AXIS_DEST_FIELDW is NOC_LINKS_DEST_WIDTH/3 to fit RAD_DEST_ID, REMOTE_NODE_ID, and LOCAL_NODE_ID + radsim_params_header_file.write("#define AXIS_DEST_FIELDW " + str(max_destination_field_bitwidth) + "\n") radsim_params_header_file.write("#define AXI4_IDW " + str(radsim_header_params["interfaces_axi_id_width"]) + "\n") radsim_params_header_file.write("#define AXI4_ADDRW 64\n") radsim_params_header_file.write("#define AXI4_LENW 8\n") @@ -258,45 +311,119 @@ def generate_radsim_params_header(radsim_header_params): radsim_params_header_file.close() +def get_fraction(input_val): + '''Returns tuple, where first value is numerator and second is denom''' + print(input_val) + remainder = float('{:,.3f}'.format(input_val % 1)) #choosing to keep only 3 dec places. 
need to do this bc python stores floats as binary fraction + if remainder == 0: #whole number + return (int(input_val), 1) + else: + count = 0 + while (remainder % 1 != 0): + remainder *= 10 + count += 1 + b = 10 ** count #denominator is 10^(number of decimal places shifted) + a = (( input_val-(input_val%1) ) * b) + remainder + return (int(a), b) + -def generate_radsim_config_file(radsim_knobs): +def generate_radsim_config_file(radsim_knobs, cluster_knobs): radsim_config_file = open(radsim_header_params["radsim_root_dir"] + "/sim/radsim_knobs", "w") - for param in radsim_knobs: - radsim_config_file.write(param + " ") - if isinstance(radsim_knobs[param], list): - for value in radsim_knobs[param]: + for config_id in range(len(cluster_knobs["cluster_configs"])): + curr_config_name = cluster_knobs["cluster_configs"][config_id] #retrieve the config num by rad ID + curr_config_num = config_names.index(curr_config_name) + for param in radsim_knobs[config_id]: + radsim_config_file.write(param + " " + str(config_id) + " ") # second element is RAD ID + if isinstance(radsim_knobs[curr_config_num][param], list): + for value in radsim_knobs[curr_config_num][param]: + radsim_config_file.write(str(value) + " ") + radsim_config_file.write("\n") + else: + radsim_config_file.write(str(radsim_knobs[curr_config_num][param]) + "\n") + for param in cluster_knobs: #for params shared across cluster + if param == 'configs': + continue + elif param == 'inter_rad_latency': + radsim_config_file.write("inter_rad_latency_cycles " + str(ceil(cluster_knobs[param]/cluster_knobs["sim_driver_period"])) + "\n") + continue + elif param == 'inter_rad_bw': + (inter_rad_bw_accept_cycles, inter_rad_bw_total_cycles) = get_fraction(cluster_knobs[param] * cluster_knobs["sim_driver_period"] / radsim_header_params["interfaces_max_axis_tdata_width"]) + if (inter_rad_bw_accept_cycles <= inter_rad_bw_total_cycles): + radsim_config_file.write("inter_rad_bw_accept_cycles " + str(inter_rad_bw_accept_cycles) + "\n") + radsim_config_file.write("inter_rad_bw_total_cycles " 
+ str(inter_rad_bw_total_cycles) + "\n") + else: + print('generate_radsim_config_file error: Invalid or missing inter_rad_bw. Proceeding with default bandwidth of AXIS_DATAW per cycle.') + radsim_config_file.write("inter_rad_bw_accept_cycles " + str(1) + "\n") + radsim_config_file.write("inter_rad_bw_total_cycles " + str(1) + "\n") + continue + else: + radsim_config_file.write(param + " " ) + + if isinstance(cluster_knobs[param], list): + for value in cluster_knobs[param]: radsim_config_file.write(str(value) + " ") radsim_config_file.write("\n") else: - radsim_config_file.write(str(radsim_knobs[param]) + "\n") + radsim_config_file.write(str(cluster_knobs[param]) + "\n") radsim_config_file.close() -def generate_radsim_main(design_name): +def generate_radsim_main(design_names, radsim_knobs): main_cpp_file = open(radsim_header_params["radsim_root_dir"] + "/sim/main.cpp", "w") main_cpp_file.write("#include \n") main_cpp_file.write("#include \n") main_cpp_file.write("#include \n") main_cpp_file.write("#include \n") main_cpp_file.write("#include \n") - main_cpp_file.write("#include \n\n") - main_cpp_file.write("#include <" + design_name + "_system.hpp>\n\n") - main_cpp_file.write("RADSimConfig radsim_config;\n") - main_cpp_file.write("RADSimDesignContext radsim_design;\n") + main_cpp_file.write("#include \n") + main_cpp_file.write("#include \n") + main_cpp_file.write("#include \n\n") + for design_name in design_names: #iterate thru set of design names + main_cpp_file.write("#include <" + design_name + "_system.hpp>\n") + main_cpp_file.write("#define NUM_RADS " + str(cluster_knobs["num_rads"]) + " \n") + main_cpp_file.write("\nRADSimConfig radsim_config;\n") + #main_cpp_file.write("RADSimDesignContext radsim_design;\n") main_cpp_file.write("std::ostream *gWatchOut;\n") main_cpp_file.write("SimLog sim_log;\n") main_cpp_file.write("SimTraceRecording sim_trace_probe;\n\n") main_cpp_file.write("int sc_main(int argc, char *argv[]) {\n") + main_cpp_file.write("\tstd::string 
radsim_knobs_filename = \"/sim/radsim_knobs\";\n") + main_cpp_file.write("\tstd::string radsim_knobs_filepath = RADSIM_ROOT_DIR + radsim_knobs_filename;\n") + main_cpp_file.write("\tradsim_config.ResizeAll(NUM_RADS);\n") + main_cpp_file.write("\tParseRADSimKnobs(radsim_knobs_filepath);\n\n") + main_cpp_file.write("\tRADSimCluster* cluster = new RADSimCluster(NUM_RADS);\n\n") main_cpp_file.write("\tgWatchOut = &cout;\n") - main_cpp_file.write("\tint log_verbosity = radsim_config.GetIntKnob(\"telemetry_log_verbosity\");\n") + main_cpp_file.write("\tint log_verbosity = radsim_config.GetIntKnobShared(\"telemetry_log_verbosity\");\n") main_cpp_file.write("\tsim_log.SetLogSettings(log_verbosity, \"sim.log\");\n\n") - main_cpp_file.write("\tint num_traces = radsim_config.GetIntKnob(\"telemetry_num_traces\");\n") + main_cpp_file.write("\tint num_traces = radsim_config.GetIntKnobShared(\"telemetry_num_traces\");\n") main_cpp_file.write("\tsim_trace_probe.SetTraceRecordingSettings(\"sim.trace\", num_traces);\n\n") - main_cpp_file.write("\tsc_clock *driver_clk_sig = new sc_clock(\n") - main_cpp_file.write("\t\t\"node_clk0\", radsim_config.GetDoubleKnob(\"sim_driver_period\"), SC_NS);\n\n") - main_cpp_file.write("\t" + design_name + "_system *system = new " + design_name + "_system(\"" + design_name + "_system\", driver_clk_sig);\n") - main_cpp_file.write("\tsc_start();\n\n") - main_cpp_file.write("\tdelete system;\n") - main_cpp_file.write("\tdelete driver_clk_sig;\n") + for i in range(cluster_knobs["num_rads"]): + design_name = radsim_knobs[i]["design_name"] + main_cpp_file.write("\tsc_clock *driver_clk_sig" + str(i) + " = new sc_clock(\n") + main_cpp_file.write("\t\t\"node_clk0\", radsim_config.GetDoubleKnobShared(\"sim_driver_period\"), SC_NS);\n") + main_cpp_file.write("\t" + design_name + "_system *system" + str(i) + " = new " + design_name + "_system(\"" + + design_name + "_system\", driver_clk_sig" + str(i) + + ", cluster->all_rads[" + str(i) + "]);\n") + 
main_cpp_file.write("\tcluster->StoreSystem(system" + str(i) + ");\n") + if (cluster_knobs["num_rads"] > 1): + main_cpp_file.write("\n\tsc_clock *inter_rad_clk_sig = new sc_clock(\n") + main_cpp_file.write("\t\t\"node_clk0\", radsim_config.GetDoubleKnobShared(\"sim_driver_period\"), SC_NS);\n") + main_cpp_file.write("\tRADSimInterRad* blackbox = new RADSimInterRad(\"inter_rad_box\", inter_rad_clk_sig, cluster);\n\n") + for i in range(cluster_knobs["num_rads"]): + main_cpp_file.write("\tblackbox->ConnectClusterInterfaces(" + str(i) +");\n") + #main_cpp_file.write("\tsc_start();\n\n") + main_cpp_file.write("\n\tint start_cycle = GetSimulationCycle(radsim_config.GetDoubleKnobShared(\"sim_driver_period\"));\n") + main_cpp_file.write("\twhile (cluster->AllRADsNotDone()) {\n") + main_cpp_file.write("\t\tsc_start(1, SC_NS);\n") + main_cpp_file.write("\t}\n") + main_cpp_file.write("\tint end_cycle = GetSimulationCycle(radsim_config.GetDoubleKnobShared(\"sim_driver_period\"));\n") + main_cpp_file.write("\tsc_stop();\n") + main_cpp_file.write("\tstd::cout << \"Simulation Cycles from main.cpp = \" << end_cycle - start_cycle << std::endl;\n\n") + for i in range(cluster_knobs["num_rads"]): + main_cpp_file.write("\tdelete system" + str(i) + ";\n") + main_cpp_file.write("\tdelete driver_clk_sig" + str(i) + ";\n") + if (cluster_knobs["num_rads"] > 1): + main_cpp_file.write("\tdelete blackbox;\n") + main_cpp_file.write("\tdelete inter_rad_clk_sig;\n\n") main_cpp_file.write("\tsc_flit scf;\n") main_cpp_file.write("\tscf.FreeAllFlits();\n") main_cpp_file.write("\tFlit *f = Flit::New();\n") @@ -306,101 +433,148 @@ def generate_radsim_main(design_name): main_cpp_file.write("\tsim_trace_probe.dump_traces();\n") main_cpp_file.write("\t(void)argc;\n") main_cpp_file.write("\t(void)argv;\n") - main_cpp_file.write("\treturn radsim_design.GetSimExitCode();\n") + #device with RAD ID 0 is special as it is used to generate simulation exit codes + main_cpp_file.write("\treturn 
cluster->all_rads[0]->GetSimExitCode();\n") main_cpp_file.write("}\n") -def prepare_build_dir(design_name): +def prepare_build_dir(design_names): if os.path.isdir("build"): shutil.rmtree("build", ignore_errors=True) os.makedirs("build") - os.system("cd build; cmake -DDESIGN:STRING=" + design_name + " ..; cd ..;") - -# Get design name from command line argument -if len(sys.argv) < 2: - print("Invalid arguments: python config.py ") - exit(1) -design_name = sys.argv[1] - -# Check if design directory exists -if not(os.path.isdir(os.getcwd() + "/example-designs/" + design_name)): - print("Cannot find design directory under rad-sim/example-designs/") - exit(1) - -# Point to YAML configuration file -config_filename = "example-designs/" + design_name + "/config.yml" - -# List default parameter values -booksim_params = { - "radsim_root_dir": os.getcwd(), - "noc_type": "2d", - "noc_num_nocs": 1, - "noc_topology": ["mesh"], - "noc_anynet_file": [os.getcwd() + "/sim/noc/anynet_file"], - "noc_dim_x": [8], - "noc_dim_y": [8], - "noc_routing_func": ["dim_order"], - "noc_vcs": [5], - "noc_vc_buffer_size": [8], - "noc_output_buffer_size": [8], - "noc_num_packet_types": [3], - "noc_router_uarch": ["iq"], - "noc_vc_allocator": ["islip"], - "noc_sw_allocator": ["islip"], - "noc_credit_delay": [1], - "noc_routing_delay": [1], - "noc_vc_alloc_delay": [1], - "noc_sw_alloc_delay": [1], -} -radsim_header_params = { - "radsim_root_dir": os.getcwd(), - "noc_payload_width": [166], - "noc_packet_id_width": 32, - "noc_vcs": [3], - "noc_num_packet_types": [3], - "noc_num_nodes": [0], - "noc_max_num_router_dest_interfaces": 32, - "interfaces_max_axis_tdata_width": 512, - "interfaces_axis_tkeep_width": 8, - "interfaces_axis_tstrb_width": 8, - "interfaces_axis_tuser_width": 75, - "interfaces_axi_id_width": 8, - "interfaces_axi_user_width": 64, - "interfaces_max_axi_data_width": 512, -} -radsim_knobs = { - "radsim_root_dir": os.getcwd(), - "design_name": design_name, - "noc_num_nocs": 1, - 
"noc_clk_period": [0.571], - "noc_vcs": [3], - "noc_payload_width": [146], - "noc_num_nodes": [0], - "design_noc_placement": ["noc.place"], - "noc_adapters_clk_period": [1.25], - "noc_adapters_fifo_size": [16], - "noc_adapters_obuff_size": [2], - "noc_adapters_in_arbiter": ["fixed_rr"], - "noc_adapters_out_arbiter": ["priority_rr"], - "noc_adapters_vc_mapping": ["direct"], - "design_clk_periods": [5.0], - "sim_driver_period": 5.0, - "telemetry_log_verbosity": 0, - "telemetry_traces": ["trace0", "trace1"], - "dram_num_controllers": 0, - "dram_clk_periods": [2.0], - "dram_queue_sizes": [64], - "dram_config_files": ["HBM2_8Gb_x128"], -} - -# Parse configuration file -parse_config_file(config_filename, booksim_params, radsim_header_params, radsim_knobs) -#print_config(booksim_params, radsim_header_params, radsim_knobs) - -# Generate RAD-Sim input files -generate_booksim_config_files(booksim_params, radsim_header_params, radsim_knobs) -generate_radsim_params_header(radsim_header_params) -generate_radsim_config_file(radsim_knobs) -generate_radsim_main(design_name) -prepare_build_dir(design_name) - -print("RAD-Sim was configured successfully!") + semicol_sep_design_names = '' + count = 0 + count_max = len(design_names) + for design_name in design_names: + semicol_sep_design_names += design_name + if count < count_max-1: + semicol_sep_design_names += ';' + count = count+1 + os.system("cd build; cmake -DCMAKE_BUILD_TYPE=Debug -DDESIGN_NAMES=\"" + semicol_sep_design_names + "\" ..; cd .;") + +def find_num_configs(config_filename): + with open(config_filename, 'r') as yaml_config: + config = yaml.safe_load(yaml_config) + config_counter = 0 + for config_section in config: + if 'config' in config_section: + config_counter += 1 + return config_counter + +if __name__ == "__main__": + # Get design name from command line argument + if len(sys.argv) < 2: + print("Invalid arguments: python config.py <[optional] other_unique_design_names>") + exit(1) + + design_names = set() #No 
duplicating design include statements and cmake commands + for i in range(1, len(sys.argv)): #skip 0th argument (that is current program name) + design_names.add(sys.argv[i]) + print(sys.argv[i]) + + # Check if design directory exists + for design_name in design_names: + if not(os.path.isdir(os.getcwd() + "/example-designs/" + design_name)): + print("Cannot find design directory under rad-sim/example-designs/") + exit(1) + + # Point to YAML configuration file + config_filename = "config.yml" + config_names = [] + + # List default parameter values + booksim_params = { + "radsim_root_dir": os.getcwd(), + "noc_type": "2d", + "noc_num_nocs": 1, + "noc_topology": ["mesh"], + "noc_anynet_file": [os.getcwd() + "/sim/noc/anynet_file"], + "noc_dim_x": [8], + "noc_dim_y": [8], + "noc_routing_func": ["dim_order"], + "noc_vcs": [5], + "noc_vc_buffer_size": [8], + "noc_output_buffer_size": [8], + "noc_num_packet_types": [3], + "noc_router_uarch": ["iq"], + "noc_vc_allocator": ["islip"], + "noc_sw_allocator": ["islip"], + "noc_credit_delay": [1], + "noc_routing_delay": [1], + "noc_vc_alloc_delay": [1], + "noc_sw_alloc_delay": [1], + } + radsim_header_params = { #shared across all RADs + "radsim_root_dir": os.getcwd(), + "noc_payload_width": [166], + "noc_packet_id_width": 32, + "noc_vcs": [3], + "noc_num_packet_types": [3], + "noc_num_nodes": [0], + "noc_max_num_router_dest_interfaces": 32, + "interfaces_max_axis_tdata_width": 512, + "interfaces_axis_tkeep_width": 8, + "interfaces_axis_tstrb_width": 8, + "interfaces_axis_tuser_width": 75, + "interfaces_axi_id_width": 8, + "interfaces_axi_user_width": 64, + "interfaces_max_axi_data_width": 512, + } + radsim_knobs = { #includes cluster config + "design_name": design_name, + "noc_num_nocs": 1, + "noc_clk_period": [0.571], + "noc_vcs": [3], + "noc_payload_width": [146], + "noc_num_nodes": [0], + "design_noc_placement": ["noc.place"], + "noc_adapters_clk_period": [1.25], + "noc_adapters_fifo_size": [16], + "noc_adapters_obuff_size": 
[2], + "noc_adapters_in_arbiter": ["fixed_rr"], + "noc_adapters_out_arbiter": ["priority_rr"], + "noc_adapters_vc_mapping": ["direct"], + "design_clk_periods": [5.0], + "dram_num_controllers": 0, + "dram_clk_periods": [2.0], + "dram_queue_sizes": [64], + "dram_config_files": ["HBM2_8Gb_x128"] + } + + cluster_knobs = { #shared among all RADs + "radsim_root_dir": os.getcwd(), + "sim_driver_period": 5.0, + "telemetry_log_verbosity": 0, + "telemetry_traces": ["trace0", "trace1"], + "num_rads": 1, + "cluster_configs": [], + "cluster_topology": 'all-to-all', + "inter_rad_latency": 5.0, #ns + "inter_rad_bw": 25.6, #bits per ns + "inter_rad_fifo_num_slots": 1000 + + } + + num_configs = find_num_configs(config_filename) + #print('num_configs: ' + str(num_configs)) + config_indices = [] + for n in range(0, num_configs): + config_indices.append(n) + print(config_indices) + + #deep copy (to allow changes to each dict) + radsim_knobs_per_rad = list(deepcopy(radsim_knobs) for i in range(num_configs)) + booksim_params_per_rad = list(deepcopy(booksim_params) for i in range(num_configs)) + + # Parse configuration file + parse_config_file(config_filename, booksim_params_per_rad, radsim_header_params, radsim_knobs_per_rad, cluster_knobs) + #print_config(booksim_params_per_rad, radsim_header_params, radsim_knobs_per_rad) + + # Generate RAD-Sim input files + generate_booksim_config_files(booksim_params_per_rad, radsim_header_params, radsim_knobs_per_rad, cluster_knobs) + generate_radsim_params_header(radsim_header_params) + generate_radsim_config_file(radsim_knobs_per_rad, cluster_knobs) + generate_radsim_main(design_names, radsim_knobs_per_rad) + + prepare_build_dir(design_names) + + print("RAD-Sim was configured successfully!") diff --git a/rad-sim/config.yml b/rad-sim/config.yml new file mode 100644 index 0000000..0816b92 --- /dev/null +++ b/rad-sim/config.yml @@ -0,0 +1,63 @@ +config rad1: + dram: + num_controllers: 4 + clk_periods: [3.32, 3.32, 2.0, 2.0] + queue_sizes: [64, 64, 
64, 64] + config_files: ['DDR4_8Gb_x16_2400', 'DDR4_8Gb_x16_2400', 'HBM2_8Gb_x128', 'HBM2_8Gb_x128'] + + design: + name: 'dlrm_two_rad' + noc_placement: ['dlrm_two_rad.place'] + clk_periods: [5.0, 2.0, 3.32, 1.5] + +config anotherconfig: + dram: + num_controllers: 4 + clk_periods: [3.32, 3.32, 2.0, 2.0] + queue_sizes: [64, 64, 64, 64] + config_files: ['DDR4_8Gb_x16_2400', 'DDR4_8Gb_x16_2400', 'HBM2_8Gb_x128', 'HBM2_8Gb_x128'] + + design: + name: 'dlrm_two_rad' + noc_placement: ['dlrm_two_rad.place'] + clk_periods: [5.0, 2.0, 3.32, 1.5] + +noc: + type: ['2d'] + num_nocs: 1 + clk_period: [1.0] + payload_width: [82] + topology: ['mesh'] + dim_x: [10] + dim_y: [10] + routing_func: ['dim_order'] + vcs: [5] + vc_buffer_size: [16] + output_buffer_size: [8] + num_packet_types: [5] + router_uarch: ['iq'] + vc_allocator: ['islip'] + sw_allocator: ['islip'] + credit_delay: [1] + routing_delay: [1] + vc_alloc_delay: [1] + sw_alloc_delay: [1] + +noc_adapters: + clk_period: [1.25] + fifo_size: [16] + obuff_size: [2] + in_arbiter: ['fixed_rr'] + out_arbiter: ['priority_rr'] + vc_mapping: ['direct'] + +cluster: + sim_driver_period: 5.0 + telemetry_log_verbosity: 2 + telemetry_traces: ['Embedding LU', 'Mem0', 'Mem1', 'Mem2', 'Mem3', 'Feature Inter.', 'MVM first', 'MVM last'] + num_rads: 2 + cluster_configs: ['rad1', 'anotherconfig'] + cluster_topology: 'all-to-all' + inter_rad_latency: 2100 + inter_rad_bw: 102.4 + inter_rad_fifo_num_slots: 1000 \ No newline at end of file diff --git a/rad-sim/example-designs/CMakeLists.txt b/rad-sim/example-designs/CMakeLists.txt index 192bfb9..8a57faa 100644 --- a/rad-sim/example-designs/CMakeLists.txt +++ b/rad-sim/example-designs/CMakeLists.txt @@ -1,4 +1,7 @@ cmake_minimum_required(VERSION 3.16) find_package(SystemCLanguage CONFIG REQUIRED) -add_subdirectory(${DESIGN}) +FOREACH(DESIGN_NAME ${DESIGN_NAMES}) + MESSAGE("<<${DESIGN_NAME}>>") + add_subdirectory(${DESIGN_NAME}) +ENDFOREACH() \ No newline at end of file diff --git 
a/rad-sim/example-designs/add/.gitignore b/rad-sim/example-designs/add/.gitignore new file mode 100644 index 0000000..e2d03be --- /dev/null +++ b/rad-sim/example-designs/add/.gitignore @@ -0,0 +1,3 @@ +CMakeFiles/ +Makefile +CMakeCache.txt \ No newline at end of file diff --git a/rad-sim/example-designs/add/CMakeLists.txt b/rad-sim/example-designs/add/CMakeLists.txt index df3d545..4110dd8 100644 --- a/rad-sim/example-designs/add/CMakeLists.txt +++ b/rad-sim/example-designs/add/CMakeLists.txt @@ -31,5 +31,5 @@ set(hdrfiles add_compile_options(-Wall -Wextra -pedantic) -add_library(design STATIC ${srcfiles} ${hdrfiles}) -target_link_libraries(design PUBLIC SystemC::systemc booksim noc) \ No newline at end of file +add_library(add STATIC ${srcfiles} ${hdrfiles}) +target_link_libraries(add PUBLIC SystemC::systemc booksim noc) \ No newline at end of file diff --git a/rad-sim/example-designs/add/add_driver.cpp b/rad-sim/example-designs/add/add_driver.cpp index 4388d2d..d28e44b 100644 --- a/rad-sim/example-designs/add/add_driver.cpp +++ b/rad-sim/example-designs/add/add_driver.cpp @@ -2,8 +2,14 @@ #define NUM_ADDENDS 3 -add_driver::add_driver(const sc_module_name &name) +add_driver::add_driver(const sc_module_name &name, RADSimDesignContext* radsim_design_) : sc_module(name) { + + this->radsim_design = radsim_design_; + + //for simulation cycle count + start_cycle = 0; + end_cycle = 0; // Random Seed srand (time(NULL)); @@ -31,7 +37,7 @@ void add_driver::source() { client_valid.write(false); wait(); rst.write(false); - start_cycle = GetSimulationCycle(radsim_config.GetDoubleKnob("sim_driver_period")); + start_cycle = GetSimulationCycle(radsim_config.GetDoubleKnobShared("sim_driver_period")); start_time = std::chrono::steady_clock::now(); wait(); @@ -52,24 +58,29 @@ void add_driver::source() { } void add_driver::sink() { - while (!response_valid.read()) { + while (!(response_valid.read())) { wait(); } //std::cout << "Received " << response.read().to_uint64() << " sum from 
the adder!" << std::endl; //std::cout << "The actual sum is " << actual_sum << std::endl; - if (response.read() != actual_sum) { + if (response.read() != actual_sum) { std::cout << "FAILURE - Output is not matching!" << std::endl; - radsim_design.ReportDesignFailure(); + radsim_design->ReportDesignFailure(); } else { std::cout << "SUCCESS - Output is matching!" << std::endl; } - end_cycle = GetSimulationCycle(radsim_config.GetDoubleKnob("sim_driver_period")); + end_cycle = GetSimulationCycle(radsim_config.GetDoubleKnobShared("sim_driver_period")); end_time = std::chrono::steady_clock::now(); std::cout << "Simulation Cycles = " << end_cycle - start_cycle << std::endl; std::cout << "Simulation Time = " << std::chrono::duration_cast (end_time - start_time).count() << " us" << std::endl; NoCTransactionTelemetry::DumpStatsToFile("stats.csv"); - sc_stop(); + end_cycle = GetSimulationCycle(radsim_config.GetDoubleKnobShared("sim_driver_period")); + std::cout << "Simulation Cycles for Just Adder Portion = " << end_cycle - start_cycle << std::endl; + + this->radsim_design->set_rad_done(); + return; + } \ No newline at end of file diff --git a/rad-sim/example-designs/add/add_driver.hpp b/rad-sim/example-designs/add/add_driver.hpp index 5bbef5a..311672e 100644 --- a/rad-sim/example-designs/add/add_driver.hpp +++ b/rad-sim/example-designs/add/add_driver.hpp @@ -10,10 +10,11 @@ class add_driver : public sc_module { private: + int start_cycle, end_cycle; std::queue numbers_to_send; int actual_sum; - int start_cycle, end_cycle; std::chrono::steady_clock::time_point start_time, end_time; + RADSimDesignContext* radsim_design; public: sc_in clk; @@ -25,7 +26,7 @@ class add_driver : public sc_module { sc_in> response; sc_in response_valid; - add_driver(const sc_module_name &name); + add_driver(const sc_module_name &name, RADSimDesignContext* radsim_design_); ~add_driver(); void source(); diff --git a/rad-sim/example-designs/add/add_system.cpp 
b/rad-sim/example-designs/add/add_system.cpp index faec250..5a32d68 100644 --- a/rad-sim/example-designs/add/add_system.cpp +++ b/rad-sim/example-designs/add/add_system.cpp @@ -1,10 +1,10 @@ #include -add_system::add_system(const sc_module_name &name, sc_clock *driver_clk_sig) +add_system::add_system(const sc_module_name &name, sc_clock *driver_clk_sig, RADSimDesignContext* radsim_design) : sc_module(name) { // Instantiate driver - driver_inst = new add_driver("driver"); + driver_inst = new add_driver("driver", radsim_design); driver_inst->clk(*driver_clk_sig); driver_inst->rst(rst_sig); driver_inst->client_tdata(client_tdata_sig); @@ -15,7 +15,7 @@ add_system::add_system(const sc_module_name &name, sc_clock *driver_clk_sig) driver_inst->response_valid(response_valid_sig); // Instantiate design top-level - dut_inst = new add_top("dut"); + dut_inst = new add_top("dut", radsim_design); dut_inst->rst(rst_sig); dut_inst->client_tdata(client_tdata_sig); dut_inst->client_tlast(client_tlast_sig); @@ -23,6 +23,8 @@ add_system::add_system(const sc_module_name &name, sc_clock *driver_clk_sig) dut_inst->client_ready(client_ready_sig); dut_inst->response(response_sig); dut_inst->response_valid(response_valid_sig); + //add add_top as dut instance for parent class RADSimDesignSystem + this->design_dut_inst = dut_inst; } add_system::~add_system() { diff --git a/rad-sim/example-designs/add/add_system.hpp b/rad-sim/example-designs/add/add_system.hpp index 6911498..ace1ecb 100644 --- a/rad-sim/example-designs/add/add_system.hpp +++ b/rad-sim/example-designs/add/add_system.hpp @@ -4,8 +4,9 @@ #include #include #include +#include -class add_system : public sc_module { +class add_system : public RADSimDesignSystem { private: sc_signal> client_tdata_sig; sc_signal client_tlast_sig; @@ -21,6 +22,6 @@ class add_system : public sc_module { add_top *dut_inst; add_system(const sc_module_name &name, - sc_clock *driver_clk_sig); + sc_clock *driver_clk_sig, RADSimDesignContext* radsim_design); 
~add_system(); }; \ No newline at end of file diff --git a/rad-sim/example-designs/add/add_top.cpp b/rad-sim/example-designs/add/add_top.cpp index 0b286c0..c9dd366 100644 --- a/rad-sim/example-designs/add/add_top.cpp +++ b/rad-sim/example-designs/add/add_top.cpp @@ -1,7 +1,9 @@ #include -add_top::add_top(const sc_module_name &name) - : sc_module(name) { +add_top::add_top(const sc_module_name &name, RADSimDesignContext* radsim_design) + : RADSimDesignTop(radsim_design) { + + this->radsim_design = radsim_design; std::string module_name_str; char module_name[25]; @@ -9,7 +11,7 @@ add_top::add_top(const sc_module_name &name) module_name_str = "client_inst"; std::strcpy(module_name, module_name_str.c_str()); - client_inst = new client(module_name); + client_inst = new client(module_name, radsim_design); client_inst->rst(rst); client_inst->client_tdata(client_tdata); client_inst->client_tlast(client_tlast); @@ -18,15 +20,15 @@ add_top::add_top(const sc_module_name &name) module_name_str = "adder_inst"; std::strcpy(module_name, module_name_str.c_str()); - adder_inst = new adder(module_name); + adder_inst = new adder(module_name, radsim_design); adder_inst->rst(rst); adder_inst->response(response); adder_inst->response_valid(response_valid); - radsim_design.BuildDesignContext("add.place", - "add.clks"); - radsim_design.CreateSystemNoCs(rst); - radsim_design.ConnectModulesToNoC(); + this->connectPortalReset(&rst); + radsim_design->BuildDesignContext("add.place", "add.clks"); + radsim_design->CreateSystemNoCs(rst); + radsim_design->ConnectModulesToNoC(); } add_top::~add_top() { diff --git a/rad-sim/example-designs/add/add_top.hpp b/rad-sim/example-designs/add/add_top.hpp index 23a51fb..6ca9d2c 100644 --- a/rad-sim/example-designs/add/add_top.hpp +++ b/rad-sim/example-designs/add/add_top.hpp @@ -5,11 +5,14 @@ #include #include #include +#include +#include -class add_top : public sc_module { +class add_top : public RADSimDesignTop { private: adder *adder_inst; client 
*client_inst; + RADSimDesignContext* radsim_design; public: sc_in rst; @@ -21,6 +24,6 @@ class add_top : public sc_module { sc_out> response; sc_out response_valid; - add_top(const sc_module_name &name); + add_top(const sc_module_name &name, RADSimDesignContext* radsim_design); ~add_top(); }; \ No newline at end of file diff --git a/rad-sim/example-designs/add/config.yml b/rad-sim/example-designs/add/config.yml index 24749d8..b65f928 100644 --- a/rad-sim/example-designs/add/config.yml +++ b/rad-sim/example-designs/add/config.yml @@ -1,3 +1,9 @@ +config rad1: + design: + name: 'add' + noc_placement: ['add.place'] + clk_periods: [5.0] + noc: type: ['2d'] num_nocs: 1 @@ -27,11 +33,10 @@ noc_adapters: out_arbiter: ['priority_rr'] vc_mapping: ['direct'] -design: - name: 'add' - noc_placement: ['add.place'] - clk_periods: [5.0] -telemetry: - log_verbosity: 2 - traces: [] \ No newline at end of file +cluster: + sim_driver_period: 5.0 + telemetry_log_verbosity: 2 + telemetry_traces: [] + num_rads: 1 + cluster_configs: ['rad1'] \ No newline at end of file diff --git a/rad-sim/example-designs/add/modules/adder.cpp b/rad-sim/example-designs/add/modules/adder.cpp index a1acbdb..5d8f1e0 100644 --- a/rad-sim/example-designs/add/modules/adder.cpp +++ b/rad-sim/example-designs/add/modules/adder.cpp @@ -1,7 +1,9 @@ #include -adder::adder(const sc_module_name &name) - : RADSimModule(name) { +adder::adder(const sc_module_name &name, RADSimDesignContext* radsim_design) + : RADSimModule(name, radsim_design) { + + this->radsim_design = radsim_design; // Combinational logic and its sensitivity list SC_METHOD(Assign); @@ -15,7 +17,8 @@ adder::adder(const sc_module_name &name) this->RegisterModuleInfo(); } -adder::~adder() {} +adder::~adder() { +} void adder::Assign() { if (rst) { @@ -30,20 +33,28 @@ void adder::Assign() { void adder::Tick() { response_valid.write(0); response.write(0); + int count_in_addends = 0; wait(); + int curr_cycle = 
GetSimulationCycle(radsim_config.GetDoubleKnobShared("sim_driver_period")); + //std::cout << "adder.cpp is before while loop at cycle " << curr_cycle << std::endl; + // Always @ positive edge of the clock while (true) { + curr_cycle = GetSimulationCycle(radsim_config.GetDoubleKnobShared("sim_driver_period")); + // Receiving transaction from AXI-S interface if (axis_adder_interface.tvalid.read() && - axis_adder_interface.tready.read()) { - uint64_t current_sum = adder_rolling_sum.to_uint64(); - adder_rolling_sum = current_sum + axis_adder_interface.tdata.read().to_uint64(); - t_finished.write(axis_adder_interface.tlast.read()); - //std::cout << module_name << ": Got Transaction (user = " - // << axis_adder_interface.tuser.read().to_uint64() << ") (addend = " - // << axis_adder_interface.tdata.read().to_uint64() << ")!" - // << std::endl; + axis_adder_interface.tready.read() + ){ + count_in_addends++; + uint64_t current_sum = adder_rolling_sum.to_uint64(); + adder_rolling_sum = current_sum + axis_adder_interface.tdata.read().to_uint64(); + t_finished.write(axis_adder_interface.tlast.read()); + std::cout << module_name << ": Got Transaction " << count_in_addends << " on cycle " << curr_cycle << " (user = " + << axis_adder_interface.tuser.read().to_uint64() << ") (addend = " + << axis_adder_interface.tdata.read().to_uint64() << ")!" 
+ << std::endl; } // Print Sum and Exit @@ -64,4 +75,5 @@ void adder::RegisterModuleInfo() { port_name = module_name + ".axis_adder_interface"; RegisterAxisSlavePort(port_name, &axis_adder_interface, DATAW, 0); + } \ No newline at end of file diff --git a/rad-sim/example-designs/add/modules/adder.hpp b/rad-sim/example-designs/add/modules/adder.hpp index 364ad05..9abffa2 100644 --- a/rad-sim/example-designs/add/modules/adder.hpp +++ b/rad-sim/example-designs/add/modules/adder.hpp @@ -9,6 +9,8 @@ #include #include #include +#include +#include class adder : public RADSimModule { private: @@ -16,13 +18,14 @@ class adder : public RADSimModule { sc_signal t_finished; // Signal flagging that the transaction has terminated public: + RADSimDesignContext* radsim_design; sc_in rst; sc_out response_valid; sc_out> response; // Interface to the NoC axis_slave_port axis_adder_interface; - adder(const sc_module_name &name); + adder(const sc_module_name &name, RADSimDesignContext* radsim_design); ~adder(); void Assign(); // Combinational logic process diff --git a/rad-sim/example-designs/add/modules/client.cpp b/rad-sim/example-designs/add/modules/client.cpp index 2891c66..3e395f5 100644 --- a/rad-sim/example-designs/add/modules/client.cpp +++ b/rad-sim/example-designs/add/modules/client.cpp @@ -1,7 +1,9 @@ #include -client::client(const sc_module_name &name) - : RADSimModule(name) { +client::client(const sc_module_name &name, RADSimDesignContext* radsim_design) + : RADSimModule(name, radsim_design) { + + this->radsim_design = radsim_design; char fifo_name[25]; std::string fifo_name_str; @@ -58,13 +60,13 @@ void client::Tick() { while (true) { if (client_ready.read() && client_valid.read()) { std::cout << this->name() << " @ cycle " - << GetSimulationCycle(radsim_config.GetDoubleKnob("sim_driver_period")) << ": " + << GetSimulationCycle(radsim_config.GetDoubleKnobShared("sim_driver_period")) << ": " << " Pushed request to FIFO!" 
<< std::endl; } if (axis_client_interface.tvalid.read() && axis_client_interface.tready.read()) { std::cout << this->name() << " @ cycle " - << GetSimulationCycle(radsim_config.GetDoubleKnob("sim_driver_period")) << ": " + << GetSimulationCycle(radsim_config.GetDoubleKnobShared("sim_driver_period")) << ": " << " Sent Transaction!" << std::endl; } wait(); @@ -83,10 +85,13 @@ void client::Assign() { bool tlast = client_tlast_fifo_rdata_signal.read(); std::string src_port_name = module_name + ".axis_client_interface"; std::string dst_port_name = "adder_inst.axis_adder_interface"; - uint64_t dst_addr = radsim_design.GetPortDestinationID(dst_port_name); - uint64_t src_addr = radsim_design.GetPortDestinationID(src_port_name); - - axis_client_interface.tdest.write(dst_addr); + uint64_t dst_addr = radsim_design->GetPortDestinationID(dst_port_name); + uint64_t src_addr = radsim_design->GetPortDestinationID(src_port_name); + sc_bv dest_id_concat; + DEST_REMOTE_NODE(dest_id_concat) = 0; //bc staying on same RAD + DEST_LOCAL_NODE(dest_id_concat) = dst_addr; + DEST_RAD(dest_id_concat) = radsim_design->rad_id; + axis_client_interface.tdest.write(dest_id_concat); axis_client_interface.tid.write(0); axis_client_interface.tstrb.write(0); axis_client_interface.tkeep.write(0); diff --git a/rad-sim/example-designs/add/modules/client.hpp b/rad-sim/example-designs/add/modules/client.hpp index 2dbf267..2bc9865 100644 --- a/rad-sim/example-designs/add/modules/client.hpp +++ b/rad-sim/example-designs/add/modules/client.hpp @@ -8,6 +8,7 @@ #include #include #include +#include #include #define FIFO_DEPTH 16 @@ -47,8 +48,9 @@ class client : public RADSimModule { sc_out client_ready; // Interface to the NoC axis_master_port axis_client_interface; + RADSimDesignContext* radsim_design; - client(const sc_module_name &name); + client(const sc_module_name &name, RADSimDesignContext* radsim_design); ~client(); void Assign(); // Combinational logic process diff --git 
a/rad-sim/example-designs/dlrm/.gitignore b/rad-sim/example-designs/dlrm/.gitignore new file mode 100644 index 0000000..5379a53 --- /dev/null +++ b/rad-sim/example-designs/dlrm/.gitignore @@ -0,0 +1,3 @@ +CMakeFiles/ +Makefile +*.log \ No newline at end of file diff --git a/rad-sim/example-designs/dlrm/CMakeLists.txt b/rad-sim/example-designs/dlrm/CMakeLists.txt index da62070..0416e1e 100644 --- a/rad-sim/example-designs/dlrm/CMakeLists.txt +++ b/rad-sim/example-designs/dlrm/CMakeLists.txt @@ -50,5 +50,5 @@ set(hdrfiles add_compile_options(-Wall -Wextra -pedantic) -add_library(design STATIC ${srcfiles} ${hdrfiles}) -target_link_libraries(design PUBLIC SystemC::systemc booksim noc dram) \ No newline at end of file +add_library(dlrm STATIC ${srcfiles} ${hdrfiles}) +target_link_libraries(dlrm PUBLIC SystemC::systemc booksim noc dram) diff --git a/rad-sim/example-designs/dlrm/compiler/dlrm.py b/rad-sim/example-designs/dlrm/compiler/dlrm.py index dfeeb12..634b95a 100644 --- a/rad-sim/example-designs/dlrm/compiler/dlrm.py +++ b/rad-sim/example-designs/dlrm/compiler/dlrm.py @@ -782,6 +782,8 @@ def generate_dlrm_defines_hpp(): dlrm_defines.write("#define INST_MEM_DEPTH 2048\n") dlrm_defines.write("#define DOT_PRODUCTS LANES\n") dlrm_defines.write("#define DATAW (BITWIDTH * LANES)\n") + dlrm_defines.write("#define TDATA_ELEMS 32\n") + dlrm_defines.write("#define TDATA_WIDTH 16\n") dlrm_defines.close() diff --git a/rad-sim/example-designs/dlrm/compiler/make.log b/rad-sim/example-designs/dlrm/compiler/make.log deleted file mode 100644 index e69de29..0000000 diff --git a/rad-sim/example-designs/dlrm/config.yml b/rad-sim/example-designs/dlrm/config.yml index 9095cc5..0ef30da 100644 --- a/rad-sim/example-designs/dlrm/config.yml +++ b/rad-sim/example-designs/dlrm/config.yml @@ -1,3 +1,15 @@ +config rad1: + dram: + num_controllers: 4 + clk_periods: [3.32, 3.32, 2.0, 2.0] + queue_sizes: [64, 64, 64, 64] + config_files: ['DDR4_8Gb_x16_2400', 'DDR4_8Gb_x16_2400', 'HBM2_8Gb_x128', 
'HBM2_8Gb_x128'] + + design: + name: 'dlrm' + noc_placement: ['dlrm.place'] + clk_periods: [5.0, 2.0, 3.32, 1.5] + noc: type: ['2d'] num_nocs: 1 @@ -27,17 +39,13 @@ noc_adapters: out_arbiter: ['priority_rr'] vc_mapping: ['direct'] -dram: - num_controllers: 4 - clk_periods: [3.32, 3.32, 2.0, 2.0] - queue_sizes: [64, 64, 64, 64] - config_files: ['DDR4_8Gb_x16_2400', 'DDR4_8Gb_x16_2400', 'HBM2_8Gb_x128', 'HBM2_8Gb_x128'] - -design: - name: 'dlrm' - noc_placement: ['dlrm.place'] - clk_periods: [5.0, 2.0, 3.32, 1.5] - -telemetry: - log_verbosity: 2 - traces: ['Embedding LU', 'Mem0', 'Mem1', 'Mem2', 'Mem3', 'Feature Inter.', 'MVM first', 'MVM last'] \ No newline at end of file +cluster: + sim_driver_period: 5.0 + telemetry_log_verbosity: 2 + telemetry_traces: ['Embedding LU', 'Mem0', 'Mem1', 'Mem2', 'Mem3', 'Feature Inter.', 'MVM first', 'MVM last'] + num_rads: 1 + cluster_configs: ['rad1'] + cluster_topology: 'all-to-all' + inter_rad_latency: 2100 + inter_rad_bw: 102.4 + inter_rad_fifo_num_slots: 1000 \ No newline at end of file diff --git a/rad-sim/example-designs/dlrm/dlrm.place b/rad-sim/example-designs/dlrm/dlrm.place index 1745288..0b13524 100644 --- a/rad-sim/example-designs/dlrm/dlrm.place +++ b/rad-sim/example-designs/dlrm/dlrm.place @@ -64,4 +64,4 @@ layer1_mvm0 0 30 axis layer1_mvm1 0 20 axis layer2_mvm0 0 10 axis layer2_mvm1 0 0 axis -output_collector 0 31 axis +output_collector 0 31 axis \ No newline at end of file diff --git a/rad-sim/example-designs/dlrm/dlrm_driver.cpp b/rad-sim/example-designs/dlrm/dlrm_driver.cpp index a84e65e..1943dee 100644 --- a/rad-sim/example-designs/dlrm/dlrm_driver.cpp +++ b/rad-sim/example-designs/dlrm/dlrm_driver.cpp @@ -71,11 +71,12 @@ bool ParseOutputs(std::vector> &fi_outputs, return true; } -dlrm_driver::dlrm_driver(const sc_module_name &name) : sc_module(name) { +dlrm_driver::dlrm_driver(const sc_module_name &name, RADSimDesignContext* radsim_design_) : sc_module(name) { + this->radsim_design = radsim_design_; // Parse 
design configuration (number of layers & number of MVM per layer) std::string design_root_dir = - radsim_config.GetStringKnob("radsim_user_design_root_dir"); + radsim_config.GetStringKnobPerRad("radsim_user_design_root_dir", radsim_design->rad_id); std::string inputs_filename = design_root_dir + "/compiler/embedding_indecies.in"; @@ -112,7 +113,7 @@ void dlrm_driver::source() { unsigned int idx = 0; _start_cycle = - GetSimulationCycle(radsim_config.GetDoubleKnob("sim_driver_period")); + GetSimulationCycle(radsim_config.GetDoubleKnobShared("sim_driver_period")); while (idx < _lookup_indecies.size()) { lookup_indecies_data.write(_lookup_indecies[idx]); lookup_indecies_target_channels.write(_target_channels[idx]); @@ -167,7 +168,7 @@ void dlrm_driver::sink() { matching = (dut_output[e] == _mlp_outputs[outputs_count][e]); } if (!matching) { - std::cout << "Output " << outputs_count << " does not match!\n"; + std::cout << "Output " << outputs_count << " on rad " << radsim_design->rad_id << " does not match!\n"; std::cout << "TRUE: [ "; for (unsigned int e = 0; e < _mlp_outputs[outputs_count].size(); e++) { std::cout << _mlp_outputs[outputs_count][e] << " "; @@ -180,10 +181,25 @@ void dlrm_driver::sink() { std::cout << "]\n"; std::cout << "-------------------------------\n"; } + // else { + // std::cout << "Output " << outputs_count << " on rad " << radsim_design->rad_id << " does match :)\n"; + // std::cout << "TRUE: [ "; + // for (unsigned int e = 0; e < _mlp_outputs[outputs_count].size(); e++) { + // std::cout << _mlp_outputs[outputs_count][e] << " "; + // } + // std::cout << "]\n"; + // std::cout << "DUT : [ "; + // for (unsigned int e = 0; e < dut_output.size(); e++) { + // std::cout << dut_output[e] << " "; + // } + // std::cout << "]\n"; + // std::cout << "-------------------------------\n"; + // } outputs_count++; all_outputs_matching &= matching; print_progress_bar(outputs_count, _num_mlp_outputs); + //std::cout << "outputs_count " << outputs_count << " and 
_num_mlp_outputs " << _num_mlp_outputs << std::endl; } wait(); } @@ -195,15 +211,17 @@ void dlrm_driver::sink() { std::cout << "Simulation PASSED! All outputs matching!" << std::endl; } else { std::cout << "Simulation FAILED! Some outputs are NOT matching!" << std::endl; - radsim_design.ReportDesignFailure(); + radsim_design->ReportDesignFailure(); } _end_cycle = - GetSimulationCycle(radsim_config.GetDoubleKnob("sim_driver_period")); + GetSimulationCycle(radsim_config.GetDoubleKnobShared("sim_driver_period")); std::cout << "Simulated " << (_end_cycle - _start_cycle) << " cycle(s)" << std::endl; for (unsigned int i = 0; i < 10; i++) { wait(); } - sc_stop(); + //sc_stop(); + this->radsim_design->set_rad_done(); //flag to replace sc_stop calls + return; } \ No newline at end of file diff --git a/rad-sim/example-designs/dlrm/dlrm_driver.hpp b/rad-sim/example-designs/dlrm/dlrm_driver.hpp index cb6ab4c..409152d 100644 --- a/rad-sim/example-designs/dlrm/dlrm_driver.hpp +++ b/rad-sim/example-designs/dlrm/dlrm_driver.hpp @@ -19,6 +19,7 @@ class dlrm_driver : public sc_module { unsigned int _num_feature_interaction_outputs; unsigned int _num_mlp_outputs; unsigned int _start_cycle, _end_cycle; + RADSimDesignContext* radsim_design; public: sc_in clk; @@ -35,7 +36,7 @@ class dlrm_driver : public sc_module { sc_out collector_fifo_ren; sc_in> collector_fifo_rdata; - dlrm_driver(const sc_module_name &name); + dlrm_driver(const sc_module_name &name, RADSimDesignContext* radsim_design_); ~dlrm_driver(); void assign(); diff --git a/rad-sim/example-designs/dlrm/dlrm_system.cpp b/rad-sim/example-designs/dlrm/dlrm_system.cpp index f8a293d..f278b91 100644 --- a/rad-sim/example-designs/dlrm/dlrm_system.cpp +++ b/rad-sim/example-designs/dlrm/dlrm_system.cpp @@ -1,10 +1,10 @@ #include -dlrm_system::dlrm_system(const sc_module_name &name, sc_clock *driver_clk_sig) +dlrm_system::dlrm_system(const sc_module_name &name, sc_clock *driver_clk_sig, RADSimDesignContext* radsim_design) : 
sc_module(name) { // Instantiate driver - driver_inst = new dlrm_driver("driver"); + driver_inst = new dlrm_driver("driver", radsim_design); driver_inst->clk(*driver_clk_sig); driver_inst->rst(rst_sig); driver_inst->lookup_indecies_data(lookup_indecies_data_sig); @@ -20,7 +20,7 @@ dlrm_system::dlrm_system(const sc_module_name &name, sc_clock *driver_clk_sig) driver_inst->collector_fifo_rdata(collector_fifo_rdata_sig); // Instantiate design top-level - dut_inst = new dlrm_top("dut"); + dut_inst = new dlrm_top("dut", radsim_design); dut_inst->rst(rst_sig); dut_inst->lookup_indecies_data(lookup_indecies_data_sig); dut_inst->lookup_indecies_target_channels( @@ -32,6 +32,8 @@ dlrm_system::dlrm_system(const sc_module_name &name, sc_clock *driver_clk_sig) dut_inst->collector_fifo_rdy(collector_fifo_rdy_sig); dut_inst->collector_fifo_ren(collector_fifo_ren_sig); dut_inst->collector_fifo_rdata(collector_fifo_rdata_sig); + //add _top as dut instance for parent class RADSimDesignSystem + this->design_dut_inst = dut_inst; } dlrm_system::~dlrm_system() { diff --git a/rad-sim/example-designs/dlrm/dlrm_system.hpp b/rad-sim/example-designs/dlrm/dlrm_system.hpp index e6b962a..264c0d6 100644 --- a/rad-sim/example-designs/dlrm/dlrm_system.hpp +++ b/rad-sim/example-designs/dlrm/dlrm_system.hpp @@ -4,8 +4,9 @@ #include #include #include +#include -class dlrm_system : public sc_module { +class dlrm_system : public RADSimDesignSystem { //sc_module { private: sc_signal> lookup_indecies_data_sig; sc_signal> lookup_indecies_target_channels_sig; @@ -24,6 +25,6 @@ class dlrm_system : public sc_module { dlrm_driver *driver_inst; dlrm_top *dut_inst; - dlrm_system(const sc_module_name &name, sc_clock *driver_clk_sig); + dlrm_system(const sc_module_name &name, sc_clock *driver_clk_sig, RADSimDesignContext* radsim_design); ~dlrm_system(); }; \ No newline at end of file diff --git a/rad-sim/example-designs/dlrm/dlrm_top.cpp b/rad-sim/example-designs/dlrm/dlrm_top.cpp index 6ac458a..0667275 100644 
--- a/rad-sim/example-designs/dlrm/dlrm_top.cpp +++ b/rad-sim/example-designs/dlrm/dlrm_top.cpp @@ -1,14 +1,14 @@ #include -dlrm_top::dlrm_top(const sc_module_name &name) : sc_module(name) { - +dlrm_top::dlrm_top(const sc_module_name &name, RADSimDesignContext* radsim_design) : RADSimDesignTop(radsim_design) { + this->radsim_design = radsim_design; unsigned int line_bitwidth = 512; unsigned int element_bitwidth = 16; std::vector mem_channels = {1, 1, 8, 8}; unsigned int embedding_lookup_fifos_depth = 16; unsigned int feature_interaction_fifos_depth = 64; unsigned int num_mem_controllers = - radsim_config.GetIntKnob("dram_num_controllers"); + radsim_config.GetIntKnobPerRad("dram_num_controllers", radsim_design->rad_id); assert(num_mem_controllers == mem_channels.size()); unsigned int total_mem_channels = 0; for (auto &num_channels : mem_channels) { @@ -20,7 +20,7 @@ dlrm_top::dlrm_top(const sc_module_name &name) : sc_module(name) { // Parse MVM configuration std::string design_root_dir = - radsim_config.GetStringKnob("radsim_user_design_root_dir"); + radsim_config.GetStringKnobPerRad("radsim_user_design_root_dir", radsim_design->rad_id); std::string design_config_filename = design_root_dir + "/compiler/mvms.config"; @@ -45,7 +45,7 @@ dlrm_top::dlrm_top(const sc_module_name &name) : sc_module(name) { module_name_str = "embedding_lookup_inst"; std::strcpy(module_name, module_name_str.c_str()); embedding_lookup_inst = new embedding_lookup( - module_name, line_bitwidth, mem_channels, embedding_lookup_fifos_depth); + module_name, line_bitwidth, mem_channels, embedding_lookup_fifos_depth, radsim_design); embedding_lookup_inst->rst(rst); embedding_lookup_inst->lookup_indecies_data(lookup_indecies_data); embedding_lookup_inst->lookup_indecies_target_channels( @@ -59,12 +59,12 @@ dlrm_top::dlrm_top(const sc_module_name &name) : sc_module(name) { module_name_str = "feature_interaction_inst"; std::strcpy(module_name, module_name_str.c_str()); std::string 
feature_interaction_inst_file = - radsim_config.GetStringKnob("radsim_user_design_root_dir") + + radsim_config.GetStringKnobPerRad("radsim_user_design_root_dir", radsim_design->rad_id) + "/compiler/instructions/feature_interaction.inst"; feature_interaction_inst = new custom_feature_interaction( module_name, line_bitwidth, element_bitwidth, total_mem_channels, feature_interaction_fifos_depth, num_mvms[0], - feature_interaction_inst_file); + feature_interaction_inst_file, radsim_design); feature_interaction_inst->rst(rst); feature_interaction_inst->received_responses(received_responses); @@ -79,7 +79,7 @@ dlrm_top::dlrm_top(const sc_module_name &name) : sc_module(name) { std::strcpy(module_name, module_name_str.c_str()); std::string inst_filename = design_root_dir + "/compiler/instructions/" + module_name_str + ".inst"; - mvms[l][m] = new mvm(module_name, m, l, inst_filename); + mvms[l][m] = new mvm(module_name, m, l, inst_filename, radsim_design); mvms[l][m]->rst(rst); axis_signal_count++; } @@ -106,7 +106,7 @@ dlrm_top::dlrm_top(const sc_module_name &name) : sc_module(name) { // Instantiate Output Collector module_name_str = "output_collector"; std::strcpy(module_name, module_name_str.c_str()); - output_collector = new collector(module_name); + output_collector = new collector(module_name, radsim_design); output_collector->rst(rst); output_collector->data_fifo_rdy(collector_fifo_rdy); output_collector->data_fifo_ren(collector_fifo_ren); @@ -116,11 +116,11 @@ dlrm_top::dlrm_top(const sc_module_name &name) : sc_module(name) { mem_clks.resize(num_mem_controllers); unsigned int ch_id = 0; std::string mem_content_init_prefix = - radsim_config.GetStringKnob("radsim_user_design_root_dir") + + radsim_config.GetStringKnobPerRad("radsim_user_design_root_dir", radsim_design->rad_id) + "/compiler/embedding_tables/channel_"; for (unsigned int ctrl_id = 0; ctrl_id < num_mem_controllers; ctrl_id++) { double mem_clk_period = - radsim_config.GetDoubleVectorKnob("dram_clk_periods", 
ctrl_id); + radsim_config.GetDoubleVectorKnobPerRad("dram_clk_periods", ctrl_id, radsim_design->rad_id); module_name_str = "ext_mem_" + to_string(ctrl_id) + "_clk"; std::strcpy(module_name, module_name_str.c_str()); mem_clks[ctrl_id] = new sc_clock(module_name, mem_clk_period, SC_NS); @@ -128,15 +128,17 @@ dlrm_top::dlrm_top(const sc_module_name &name) : sc_module(name) { std::strcpy(module_name, module_name_str.c_str()); std::string mem_content_init = mem_content_init_prefix + to_string(ch_id); ext_mem[ctrl_id] = - new mem_controller(module_name, ctrl_id, mem_content_init); + new mem_controller(module_name, ctrl_id, radsim_design, mem_content_init); ext_mem[ctrl_id]->mem_clk(*mem_clks[ctrl_id]); ext_mem[ctrl_id]->rst(rst); ch_id += mem_channels[ctrl_id]; } - radsim_design.BuildDesignContext("dlrm.place", "dlrm.clks"); - radsim_design.CreateSystemNoCs(rst); - radsim_design.ConnectModulesToNoC(); + this->connectPortalReset(&rst); + + radsim_design->BuildDesignContext("dlrm.place", "dlrm.clks"); + radsim_design->CreateSystemNoCs(rst); + radsim_design->ConnectModulesToNoC(); } dlrm_top::~dlrm_top() { @@ -150,4 +152,5 @@ dlrm_top::~dlrm_top() { delete mvm; } } + } \ No newline at end of file diff --git a/rad-sim/example-designs/dlrm/dlrm_top.hpp b/rad-sim/example-designs/dlrm/dlrm_top.hpp index d59a868..71d699e 100644 --- a/rad-sim/example-designs/dlrm/dlrm_top.hpp +++ b/rad-sim/example-designs/dlrm/dlrm_top.hpp @@ -9,8 +9,10 @@ #include #include #include +#include +#include -class dlrm_top : public sc_module { +class dlrm_top : public RADSimDesignTop { private: embedding_lookup *embedding_lookup_inst; custom_feature_interaction *feature_interaction_inst; @@ -20,6 +22,7 @@ class dlrm_top : public sc_module { std::vector axis_sig; std::vector mem_clks; + RADSimDesignContext* radsim_design; public: sc_in rst; @@ -36,6 +39,6 @@ class dlrm_top : public sc_module { sc_in collector_fifo_ren; sc_out> collector_fifo_rdata; - dlrm_top(const sc_module_name &name); + 
dlrm_top(const sc_module_name &name, RADSimDesignContext* radsim_design); ~dlrm_top(); }; \ No newline at end of file diff --git a/rad-sim/example-designs/dlrm/modules/collector.cpp b/rad-sim/example-designs/dlrm/modules/collector.cpp index 0742ddd..448b47e 100644 --- a/rad-sim/example-designs/dlrm/modules/collector.cpp +++ b/rad-sim/example-designs/dlrm/modules/collector.cpp @@ -1,10 +1,11 @@ #include -collector::collector(const sc_module_name &name) - : RADSimModule(name), rst("rst"), data_fifo_rdy("data_fifo_rdy"), +collector::collector(const sc_module_name &name, RADSimDesignContext* radsim_design) + : RADSimModule(name, radsim_design), rst("rst"), data_fifo_rdy("data_fifo_rdy"), data_fifo_ren("data_fifo_ren"), data_fifo_rdata("data_fifo_rdata") { module_name = name; + this->radsim_design = radsim_design; char fifo_name[25]; std::string fifo_name_str; diff --git a/rad-sim/example-designs/dlrm/modules/collector.hpp b/rad-sim/example-designs/dlrm/modules/collector.hpp index 4e9f9ad..73f6f09 100644 --- a/rad-sim/example-designs/dlrm/modules/collector.hpp +++ b/rad-sim/example-designs/dlrm/modules/collector.hpp @@ -20,13 +20,14 @@ class collector : public RADSimModule { data_fifo_almost_empty_signal; public: + RADSimDesignContext* radsim_design; sc_in rst; sc_out data_fifo_rdy; sc_in data_fifo_ren; sc_out> data_fifo_rdata; axis_slave_port rx_interface; - collector(const sc_module_name &name); + collector(const sc_module_name &name, RADSimDesignContext* radsim_design); ~collector(); void Assign(); diff --git a/rad-sim/example-designs/dlrm/modules/custom_feature_interaction.cpp b/rad-sim/example-designs/dlrm/modules/custom_feature_interaction.cpp index 42c9555..79522bc 100644 --- a/rad-sim/example-designs/dlrm/modules/custom_feature_interaction.cpp +++ b/rad-sim/example-designs/dlrm/modules/custom_feature_interaction.cpp @@ -42,15 +42,17 @@ custom_feature_interaction::custom_feature_interaction( const sc_module_name &name, unsigned int dataw, unsigned int 
element_bitwidth, unsigned int num_mem_channels, unsigned int fifos_depth, unsigned int num_output_channels, - std::string &instructions_file) - : RADSimModule(name) { + std::string &instructions_file, + RADSimDesignContext* radsim_design) + : RADSimModule(name, radsim_design) { + this->radsim_design = radsim_design; _fifos_depth = fifos_depth; _num_received_responses = 0; _num_mem_channels = num_mem_channels; _dataw = dataw; _bitwidth = element_bitwidth; - _num_input_elements = dataw / element_bitwidth; + _num_input_elements = dataw / element_bitwidth; //512/16=32 _num_output_elements = DATAW / element_bitwidth; _num_output_channels = num_output_channels; @@ -66,7 +68,7 @@ custom_feature_interaction::custom_feature_interaction( _ofifo_empty.init(_num_output_channels); std::string resp_filename = - radsim_config.GetStringKnob("radsim_user_design_root_dir") + + radsim_config.GetStringKnobPerRad("radsim_user_design_root_dir", radsim_design->rad_id) + "/compiler/embedding_indecies.in"; ParseFeatureInteractionInstructions(instructions_file, _instructions, resp_filename, _num_expected_responses); @@ -165,8 +167,11 @@ void custom_feature_interaction::Tick() { _pc.write(0); wait(); + int no_val_counter = 0; + bool got_all_mem_responses = false; + // Always @ positive edge of the clock - while (true) { + while (true ) { // Accept R responses from the NoC for (unsigned int ch_id = 0; ch_id < _num_mem_channels; ch_id++) { if (_input_fifos[ch_id].size() < _fifos_depth && @@ -179,6 +184,7 @@ void custom_feature_interaction::Tick() { if (_num_received_responses == _num_expected_responses) { std::cout << this->name() << ": Got all memory responses at cycle " << GetSimulationCycle(5.0) << "!" 
<< std::endl; + got_all_mem_responses = true; } } } @@ -231,14 +237,20 @@ void custom_feature_interaction::Tick() { } // Interface with AXI-S NoC + bool non_empty_output_fifo = false; for (unsigned int ch_id = 0; ch_id < _num_output_channels; ch_id++) { if (axis_interface[ch_id].tready.read() && axis_interface[ch_id].tvalid.read()) { + //int curr_cycle = GetSimulationCycle(radsim_config.GetDoubleKnobShared("sim_driver_period")); + data_vector tx_tdata = _output_fifos[ch_id].front(); + //std::cout << "custom_feature_interaction @ cycle " << curr_cycle << ": tx_tdata sent " << tx_tdata << " from RAD " << radsim_design->rad_id << " with tdest field " << axis_interface[ch_id].tdest.read() << std::endl; _output_fifos[ch_id].pop(); } - if (!_output_fifos[ch_id].empty()) { + if ( (!_output_fifos[ch_id].empty()) ) { + non_empty_output_fifo = true; data_vector tx_tdata = _output_fifos[ch_id].front(); + //std::cout << "custom_feature_interaction: tx_tdata sent " << tx_tdata << " from RAD " << radsim_design->rad_id << std::endl; sc_bv tx_tdata_bv; data_vector_to_bv(tx_tdata, tx_tdata_bv, _num_output_elements); axis_interface[ch_id].tvalid.write(true); @@ -247,10 +259,17 @@ void custom_feature_interaction::Tick() { axis_interface[ch_id].tid.write(0); std::string dest_name = "layer0_mvm" + std::to_string(ch_id) + ".rx_interface"; + //std::cout << "radsim_design->GetPortDestinationID(dest_name) on RAD " << radsim_design->rad_id << ": " << radsim_design->GetPortDestinationID(dest_name) << std::endl; + sc_bv dest_id_concat; + DEST_RAD(dest_id_concat) = radsim_design->rad_id; //keep data on current RAD + DEST_LOCAL_NODE(dest_id_concat) = radsim_design->GetPortDestinationID(dest_name); + DEST_REMOTE_NODE(dest_id_concat) = radsim_design->GetPortDestinationID(dest_name); axis_interface[ch_id].tdest.write( - radsim_design.GetPortDestinationID(dest_name)); + dest_id_concat); + no_val_counter = 0; } else { axis_interface[ch_id].tvalid.write(false); + no_val_counter++; } } diff --git 
a/rad-sim/example-designs/dlrm/modules/custom_feature_interaction.hpp b/rad-sim/example-designs/dlrm/modules/custom_feature_interaction.hpp index 85fdb72..bf481e0 100644 --- a/rad-sim/example-designs/dlrm/modules/custom_feature_interaction.hpp +++ b/rad-sim/example-designs/dlrm/modules/custom_feature_interaction.hpp @@ -47,6 +47,7 @@ class custom_feature_interaction : public RADSimModule { ofstream *_debug_feature_interaction_out; public: + RADSimDesignContext* radsim_design; sc_in rst; // Interface to driver logic sc_out received_responses; @@ -59,7 +60,8 @@ class custom_feature_interaction : public RADSimModule { unsigned int num_mem_channels, unsigned int fifos_depth, unsigned int num_output_channels, - std::string &instructions_file); + std::string &instructions_file, + RADSimDesignContext* radsim_design); ~custom_feature_interaction(); void Assign(); // Combinational logic process diff --git a/rad-sim/example-designs/dlrm/modules/dlrm_defines.hpp b/rad-sim/example-designs/dlrm/modules/dlrm_defines.hpp index aa0cc7a..139ca30 100644 --- a/rad-sim/example-designs/dlrm/modules/dlrm_defines.hpp +++ b/rad-sim/example-designs/dlrm/modules/dlrm_defines.hpp @@ -7,3 +7,5 @@ #define INST_MEM_DEPTH 2048 #define DOT_PRODUCTS LANES #define DATAW (BITWIDTH * LANES) +#define TDATA_ELEMS 32 +#define TDATA_WIDTH 16 diff --git a/rad-sim/example-designs/dlrm/modules/embedding_lookup.cpp b/rad-sim/example-designs/dlrm/modules/embedding_lookup.cpp index 63a0c8a..a57070e 100644 --- a/rad-sim/example-designs/dlrm/modules/embedding_lookup.cpp +++ b/rad-sim/example-designs/dlrm/modules/embedding_lookup.cpp @@ -3,8 +3,10 @@ embedding_lookup::embedding_lookup( const sc_module_name &name, unsigned int dataw, std::vector &num_mem_channels_per_controller, - unsigned int fifo_depth) - : RADSimModule(name) { + unsigned int fifo_depth, RADSimDesignContext* radsim_design) + : RADSimModule(name, radsim_design) { + + this->radsim_design = radsim_design; _total_num_channels = 0; unsigned int 
ctrl_id = 0; @@ -89,7 +91,6 @@ void embedding_lookup::Tick() { // Always @ positive edge of the clock while (true) { - // Interface with testbench driver if (lookup_indecies_ready.read() && lookup_indecies_valid.read()) { data_vector lookup_indecies = lookup_indecies_data.read(); data_vector target_channels = @@ -119,12 +120,12 @@ void embedding_lookup::Tick() { uint64_t table_base_addr = _base_addresses_fifo[ch_id].front(); std::string dst_port_name = _dst_port_names[ch_id]; - uint64_t dst_addr = radsim_design.GetPortBaseAddress(dst_port_name) + + uint64_t dst_addr = radsim_design->GetPortBaseAddress(dst_port_name) + table_base_addr + lookup_index; std::string src_port_name = "feature_interaction_inst.aximm_interface_" + std::to_string(ch_id); - uint64_t src_addr = radsim_design.GetPortBaseAddress(src_port_name); + uint64_t src_addr = radsim_design->GetPortBaseAddress(src_port_name); /*if (ctrl_id == 0) { std::cout << "Base address: " << table_base_addr << std::endl; diff --git a/rad-sim/example-designs/dlrm/modules/embedding_lookup.hpp b/rad-sim/example-designs/dlrm/modules/embedding_lookup.hpp index c7b01aa..19e3313 100644 --- a/rad-sim/example-designs/dlrm/modules/embedding_lookup.hpp +++ b/rad-sim/example-designs/dlrm/modules/embedding_lookup.hpp @@ -28,6 +28,7 @@ class embedding_lookup : public RADSimModule { unsigned int _debug_sent_request_counter; public: + RADSimDesignContext* radsim_design; sc_in rst; // Interface to driver logic sc_in> lookup_indecies_data; @@ -40,7 +41,7 @@ class embedding_lookup : public RADSimModule { embedding_lookup(const sc_module_name &name, unsigned int dataw, std::vector &num_mem_channels_per_controller, - unsigned int fifo_depth); + unsigned int fifo_depth, RADSimDesignContext* radsim_design); ~embedding_lookup(); void Assign(); // Combinational logic process diff --git a/rad-sim/example-designs/dlrm/modules/feature_interaction.cpp b/rad-sim/example-designs/dlrm/modules/feature_interaction.cpp index fde6e27..1479095 100644 --- 
a/rad-sim/example-designs/dlrm/modules/feature_interaction.cpp +++ b/rad-sim/example-designs/dlrm/modules/feature_interaction.cpp @@ -50,9 +50,10 @@ feature_interaction::feature_interaction(const sc_module_name &name, unsigned int num_mem_channels, unsigned int fifos_depth, unsigned int num_output_channels, - std::string &instructions_file) - : RADSimModule(name) { - + std::string &instructions_file, + RADSimDesignContext* radsim_design) + : RADSimModule(name, radsim_design) { + this->radsim_design = radsim_design; _fifos_depth = fifos_depth; _afifo_width_ratio_in = 32 / 4; _afifo_width_ratio_out = LANES / 4; @@ -79,7 +80,7 @@ feature_interaction::feature_interaction(const sc_module_name &name, _ofifo_empty.init(_num_output_channels); std::string resp_filename = - radsim_config.GetStringKnob("radsim_user_design_root_dir") + + radsim_config.GetStringKnobPerRad("radsim_user_design_root_dir", radsim_design->rad_id) + "/compiler/embedding_indecies.in"; ParseFeatureInteractionInstructions(instructions_file, _instructions, resp_filename, _num_expected_responses); @@ -285,8 +286,11 @@ void feature_interaction::Tick() { axis_interface[ch_id].tid.write(0); std::string dest_name = "layer0_mvm" + std::to_string(ch_id) + ".rx_interface"; + sc_bv dest_id_concat = radsim_design->GetPortDestinationID(dest_name); + DEST_RAD(dest_id_concat) = radsim_design->rad_id; axis_interface[ch_id].tdest.write( - radsim_design.GetPortDestinationID(dest_name)); + dest_id_concat); + //radsim_design->GetPortDestinationID(dest_name)); } else { axis_interface[ch_id].tvalid.write(false); } diff --git a/rad-sim/example-designs/dlrm/modules/feature_interaction.hpp b/rad-sim/example-designs/dlrm/modules/feature_interaction.hpp index 67fc2c7..88e1528 100644 --- a/rad-sim/example-designs/dlrm/modules/feature_interaction.hpp +++ b/rad-sim/example-designs/dlrm/modules/feature_interaction.hpp @@ -50,6 +50,7 @@ class feature_interaction : public RADSimModule { ofstream *_debug_feature_interaction_out; 
public: + RADSimDesignContext* radsim_design; sc_in rst; // Interface to driver logic sc_out received_responses; @@ -61,7 +62,8 @@ class feature_interaction : public RADSimModule { unsigned int element_bitwidth, unsigned int num_mem_channels, unsigned int fifos_depth, unsigned int num_output_channels, - std::string &instructions_file); + std::string &instructions_file, + RADSimDesignContext* radsim_design); ~feature_interaction(); void Assign(); // Combinational logic process diff --git a/rad-sim/example-designs/dlrm/modules/mvm.cpp b/rad-sim/example-designs/dlrm/modules/mvm.cpp index cf25b9f..a21c839 100644 --- a/rad-sim/example-designs/dlrm/modules/mvm.cpp +++ b/rad-sim/example-designs/dlrm/modules/mvm.cpp @@ -39,8 +39,8 @@ bool ParseInstructions(std::vector &inst_mem, } mvm::mvm(const sc_module_name &name, unsigned int id_mvm, unsigned int id_layer, - const std::string &inst_filename) - : RADSimModule(name), matrix_mem_rdata("matrix_mem_rdata", DOT_PRODUCTS), + const std::string &inst_filename, RADSimDesignContext* radsim_design) + : RADSimModule(name, radsim_design), matrix_mem_rdata("matrix_mem_rdata", DOT_PRODUCTS), matrix_mem_wen("matrix_mem_wen", DOT_PRODUCTS), ififo_pipeline("ififo_pipeline", RF_RD_LATENCY), reduce_pipeline("reduce_pipeline", RF_RD_LATENCY), @@ -54,6 +54,7 @@ mvm::mvm(const sc_module_name &name, unsigned int id_mvm, unsigned int id_layer, dest_mvm_pipeline("mvm_layer_pipeline", COMPUTE_LATENCY + RF_RD_LATENCY), tdata_vec(LANES), result(DOT_PRODUCTS), rst("rst") { + this->radsim_design = radsim_design; module_name = name; mvm_id = id_mvm; layer_id = id_layer; @@ -71,7 +72,7 @@ mvm::mvm(const sc_module_name &name, unsigned int id_mvm, unsigned int id_layer, std::string mem_name_str; matrix_memory.resize(DOT_PRODUCTS); std::string mvm_dir = - radsim_config.GetStringKnob("radsim_user_design_root_dir"); + radsim_config.GetStringKnobPerRad("radsim_user_design_root_dir", radsim_design->rad_id); std::string mem_init_file; for (unsigned int dot_id 
= 0; dot_id < DOT_PRODUCTS; dot_id++) { mem_init_file = mvm_dir + "/compiler/mvm_weights/layer" + @@ -314,6 +315,13 @@ void mvm::Tick() { if (rx_input_interface.tvalid.read() && rx_input_interface.tready.read()) { sc_bv tdata = rx_input_interface.tdata.read(); + data_vector tdatavector(TDATA_ELEMS); + unsigned int start_idx, end_idx; + for (unsigned int e = 0; e < TDATA_ELEMS; e++) { + start_idx = e * TDATA_WIDTH; + end_idx = (e + 1) * TDATA_WIDTH; + tdatavector[e] = tdata.range(end_idx - 1, start_idx).to_int(); + } if (rx_input_interface.tuser.read().range(15, 13).to_uint() == 1) { unsigned int waddr = @@ -507,8 +515,11 @@ void mvm::Assign() { dest_name = "layer" + std::to_string(dest_layer_int - 1) + "_mvm" + std::to_string(dest_mvm_int) + ".rx_interface"; } - dest_id = radsim_design.GetPortDestinationID(dest_name); - + dest_id = radsim_design->GetPortDestinationID(dest_name); + sc_bv dest_id_concat; + DEST_LOCAL_NODE(dest_id_concat) = dest_id; + DEST_REMOTE_NODE(dest_id_concat) = dest_id; + DEST_RAD(dest_id_concat) = radsim_design->rad_id; //stay on current RAD unsigned int dest_interface; // which FIFO unsigned int dest_interface_id; // added for separate ports // If destination is the same layer, send to reduce FIFO @@ -537,7 +548,7 @@ void mvm::Assign() { tx_input_interface.tdata.write(tx_tdata_bv); tx_input_interface.tvalid.write(true); tx_input_interface.tuser.write(dest_interface); - tx_input_interface.tdest.write(dest_id); + tx_input_interface.tdest.write(dest_id_concat); //dest_id); tx_input_interface.tid.write(dest_interface_id); tx_reduce_interface.tvalid.write(false); // if (mvm_id == 1 && layer_id == 2 && !ofifo_empty_signal) { @@ -555,7 +566,7 @@ void mvm::Assign() { tx_reduce_interface.tdata.write(tx_tdata_bv); tx_reduce_interface.tvalid.write(true); tx_reduce_interface.tuser.write(dest_interface); - tx_reduce_interface.tdest.write(dest_id); + tx_reduce_interface.tdest.write(dest_id_concat); //dest_id); 
tx_reduce_interface.tid.write(dest_interface_id); tx_input_interface.tvalid.write(false); } else { diff --git a/rad-sim/example-designs/dlrm/modules/mvm.hpp b/rad-sim/example-designs/dlrm/modules/mvm.hpp index 3dc6d1d..8c61837 100644 --- a/rad-sim/example-designs/dlrm/modules/mvm.hpp +++ b/rad-sim/example-designs/dlrm/modules/mvm.hpp @@ -73,6 +73,7 @@ class mvm : public RADSimModule { sc_signal dot_op, dot_reduce_op; public: + RADSimDesignContext* radsim_design; sc_in rst; axis_slave_port rx_input_interface; axis_slave_port rx_reduce_interface; @@ -80,7 +81,7 @@ class mvm : public RADSimModule { axis_master_port tx_reduce_interface; mvm(const sc_module_name &name, unsigned int id_mvm, unsigned int id_layer, - const std::string &inst_filename); + const std::string &inst_filename, RADSimDesignContext* radsim_design); ~mvm(); void Assign(); diff --git a/rad-sim/example-designs/dlrm_two_rad/.gitignore b/rad-sim/example-designs/dlrm_two_rad/.gitignore new file mode 100644 index 0000000..5dfc5be --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/.gitignore @@ -0,0 +1,9 @@ +CMakeFiles/ +Makefile +*.log +compiler/embedding_tables/ +compiler/instructions/ +compiler/mvm_weights/ +compiler/*.out +compiler/*.in +compiler/mvms.config \ No newline at end of file diff --git a/rad-sim/example-designs/dlrm_two_rad/CMakeLists.txt b/rad-sim/example-designs/dlrm_two_rad/CMakeLists.txt new file mode 100644 index 0000000..37906ec --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/CMakeLists.txt @@ -0,0 +1,54 @@ +cmake_minimum_required(VERSION 3.16) +find_package(SystemCLanguage CONFIG REQUIRED) + +include_directories( + ./ + modules + ../../sim + ../../sim/noc + ../../sim/noc/booksim + ../../sim/noc/booksim/networks + ../../sim/noc/booksim/routers + ../../sim/dram + ../../sim/dram/DRAMsim3 + ../../sim/dram/DRAMsim3/src + ../../sim/dram/DRAMsim3/ext/headers +) + +set(srcfiles + modules/embedding_lookup.cpp + modules/feature_interaction.cpp + 
modules/custom_feature_interaction.cpp + modules/sim_utils.cpp + modules/afifo.cpp + modules/register_file.cpp + modules/mvm.cpp + modules/fifo.cpp + modules/instructions.cpp + modules/collector.cpp + dlrm_top.cpp + dlrm_driver.cpp + dlrm_two_rad_system.cpp +) + +set(hdrfiles + modules/embedding_lookup.hpp + modules/feature_interaction.hpp + modules/custom_feature_interaction.hpp + modules/sim_utils.hpp + modules/afifo.hpp + modules/register_file.hpp + modules/mvm.hpp + modules/fifo.hpp + modules/instructions.hpp + modules/collector.hpp + modules/dlrm_defines.hpp + dlrm_top.hpp + dlrm_driver.hpp + dlrm_two_rad_system.hpp +) + +add_compile_options(-Wall -Wextra -pedantic) + +add_library(dlrm_two_rad STATIC ${srcfiles} ${hdrfiles}) +target_link_libraries(dlrm_two_rad PUBLIC SystemC::systemc booksim noc dram) diff --git a/rad-sim/example-designs/dlrm_two_rad/compiler/ab_large.csv b/rad-sim/example-designs/dlrm_two_rad/compiler/ab_large.csv new file mode 100644 index 0000000..59a47d5 --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/compiler/ab_large.csv @@ -0,0 +1,99 @@ +# Vector Elements,Table Entries +4,500 +4,500 +4,1000 +4,1000 +4,1000 +4,1000 +8,1000 +8,1000 +8,1000 +8,1000 +8,1000 +8,5000 +8,5000 +8,5000 +8,5000 +8,5000 +8,5000 +8,5000 +8,5000 +8,5000 +8,5000 +8,5000 +8,5000 +8,5000 +8,5000 +8,15000 +8,15000 +8,50000 +8,50000 +8,50000 +8,50000 +8,50000 +8,50000 +8,50000 +8,50000 +8,50000 +8,50000 +8,50000 +8,50000 +8,50000 +8,50000 +8,50000 +8,50000 +8,50000 +8,50000 +8,50000 +8,50000 +8,50000 +8,50000 +8,50000 +8,50000 +8,50000 +8,50000 +8,50000 +8,50000 +8,50000 +8,50000 +8,100000 +8,100000 +8,150000 +8,150000 +16,500000 +16,500000 +16,500000 +16,500000 +16,500000 +16,500000 +16,500000 +16,500000 +16,500000 +16,500000 +16,500000 +16,500000 +16,500000 +16,500000 +16,500000 +16,500000 +16,1000000 +16,1000000 +16,5000000 +32,10000000 +32,100000000 +4,100 +4,100 +4,100 +4,100 +4,100 +4,100 +4,100 +4,100 +4,100 +4,100 +4,100 +4,100 +4,500 +4,500 +4,500 
+4,500 \ No newline at end of file diff --git a/rad-sim/example-designs/dlrm_two_rad/compiler/ab_small.csv b/rad-sim/example-designs/dlrm_two_rad/compiler/ab_small.csv new file mode 100644 index 0000000..c995b61 --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/compiler/ab_small.csv @@ -0,0 +1,48 @@ +# Vector Elements, Table Entries +4,100 +4,100 +4,100 +4,100 +4,500 +4,500 +4,1000 +4,1000 +4,1000 +4,1000 +4,1000 +4,1000 +4,1000 +4,1000 +4,1000 +4,1000 +8,3000 +8,10000 +8,10000 +8,10000 +8,10000 +8,10000 +8,10000 +8,10000 +8,10000 +8,10000 +8,10000 +8,10000 +8,10000 +8,10000 +8,10000 +8,10000 +8,20000 +8,30000 +8,100000 +8,100000 +8,100000 +8,100000 +8,100000 +8,100000 +8,100000 +8,100000 +8,100000 +8,100000 +16,500000 +16,1000000 +32,10000000 \ No newline at end of file diff --git a/rad-sim/example-designs/dlrm_two_rad/compiler/dlrm.py b/rad-sim/example-designs/dlrm_two_rad/compiler/dlrm.py new file mode 100644 index 0000000..634b95a --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/compiler/dlrm.py @@ -0,0 +1,846 @@ +import math +import random +import os +import glob +import numpy as np +import sys + +# Input parameters +model_csv = "ab_small.csv" +read_bytewidth = 64 +element_bytewidth = 2 +hbm_channels = 16 +hbm_channel_words = 1 * 1024 * 1024 * 1024 / read_bytewidth +ddr_channels = 2 +ddr_channel_words = 8 * 1024 * 1024 * 1024 / read_bytewidth +num_test_inputs = 256 + +# MLP parameters +native_dim = 32 # int(read_bytewidth / element_bytewidth) +num_layers = 3 +hidden_dims = [1024, 512, 256] +num_mvms = [4, 2, 2] +hard_mvms = False + +# Model parsing +table_info = [] +smallest_table_bytewidth = 8 +input_dim = 0 + +# Memory allocation +hbm_channels_used_words = np.zeros(hbm_channels, dtype=int) +hbm_channels_rounds = np.zeros(hbm_channels, dtype=int) +ddr_channels_used_words = np.zeros(ddr_channels, dtype=int) +ddr_channels_rounds = np.zeros(ddr_channels, dtype=int) +tables_per_ddr_channel = {} +tables_per_hbm_channel = {} 
+base_addr_per_ddr_channel = {} +base_addr_per_hbm_channel = {} + +# Testing +test_input_data = [] +test_input_base_addr = [] +test_input_target_ch = [] +test_feature_interaction_outputs = [] +test_golden_outputs = [] +mem_contents_per_channel = [{} for channel in range(ddr_channels + hbm_channels)] + +tobin = lambda x, count=8: "".join( + map(lambda y: str((x >> y) & 1), range(count - 1, -1, -1)) +) + + +def get_table_id(table): + return table[0] + + +def get_table_vector_length(table): + return table[1] + + +def get_table_vector_length_by_id(all_tables, id): + for table in all_tables: + if get_table_id(table) == id: + return get_table_vector_length(table) + return -1 + + +def get_table_entries(table): + return table[2] + + +def get_table_entries_by_id(all_tables, id): + for table in all_tables: + if get_table_id(table) == id: + return get_table_entries(table) + return -1 + + +def get_table_words(table): + return table[3] + + +def get_table_channel_type(table): + if len(table) < 5: + return -1 + return table[4] + + +def get_table_channel_id(table): + if len(table) < 5: + return -1 + return table[5] + + +def parse_dlrm_description(filename): + global smallest_table_bytewidth + global input_dim + f = open(filename, "r") + lines = f.readlines() + id = 0 + for line in lines: + if line[0] == "#": + continue + line_split = line.split(",") + line_split = [eval(i) for i in line_split] + input_dim += line_split[0] + if line_split[0] * element_bytewidth < smallest_table_bytewidth: + smallest_table_bytewidth = line_split[0] * element_bytewidth + words_per_entry = int(math.ceil(1.0 * line_split[0] / read_bytewidth)) + line_split.append(words_per_entry * line_split[1]) + line_split.insert(0, id) + id = id + 1 + table_info.append(line_split) + f.close() + + +def sort_tables(): + table_info.sort(key=lambda x: x[3], reverse=True) + + +def greedy_allocation(): + round_id = 1 + for table in table_info: + allocated = False + while not (allocated): + for ch in range(ddr_channels): + 
rem_words = ddr_channel_words - ddr_channels_used_words[ch] + if (ddr_channels_rounds[ch] < round_id) and ( + rem_words >= get_table_words(table) + ): + if ch in tables_per_ddr_channel: + tables_per_ddr_channel[ch].append(get_table_id(table)) + base_addr_per_ddr_channel[ch].append( + ddr_channels_used_words[ch] + ) + else: + tables_per_ddr_channel[ch] = [get_table_id(table)] + base_addr_per_ddr_channel[ch] = [ddr_channels_used_words[ch]] + ddr_channels_used_words[ch] += get_table_words(table) + ddr_channels_rounds[ch] += 1 + allocated = True + table.append(1) + table.append(ch) + + break + + if not (allocated): + for ch in range(hbm_channels): + rem_words = hbm_channel_words - hbm_channels_used_words[ch] + if (hbm_channels_rounds[ch] < round_id) and ( + rem_words >= get_table_words(table) + ): + if ch in tables_per_hbm_channel: + tables_per_hbm_channel[ch].append(get_table_id(table)) + base_addr_per_hbm_channel[ch].append( + hbm_channels_used_words[ch] + ) + else: + tables_per_hbm_channel[ch] = [get_table_id(table)] + base_addr_per_hbm_channel[ch] = [ + hbm_channels_used_words[ch] + ] + hbm_channels_used_words[ch] += get_table_words(table) + hbm_channels_rounds[ch] += 1 + allocated = True + table.append(0) + table.append(ch) + break + + if not (allocated): + round_id += 1 + + +def print_dlrm_description(): + print("Embedding Tables (sorted):") + print("+----+----+-----------+") + print("| # | V | Entries |") + print("|----|----|-----------|") + for row in table_info: + print("| {:>2} | {:>2} | {:>9} |".format(*row)) + print("+----+----+-----------+") + + +def print_allocation(): + ddr_total_mb = 0 + hbm_total_mb = 0 + print("\nDDR Channel Allocations:") + ddr_table = [ddr_channels_rounds, ddr_channels_used_words] + ddr_table = np.transpose(ddr_table).tolist() + print("+----+-----------+--------+----------+") + print("| R | Mem Words | % | Size(MB) |") + print("|----|-----------|--------|----------|") + for row in ddr_table: + row.append(1.0 * row[1] / 
ddr_channel_words * 100) + size_mb = 1.0 * row[1] * read_bytewidth / 1024 / 1024 + ddr_total_mb += size_mb + row.append(size_mb) + print("| {:>2} | {:>9} | {:5.2f}% | {:8.2f} |".format(*row)) + print("+----+-----------+--------+----------+") + print("\nHBM Channel Allocations:") + hbm_table = [hbm_channels_rounds, hbm_channels_used_words] + hbm_table = np.transpose(hbm_table).tolist() + print("+----+-----------+--------+----------+") + print("| R | Mem Words | % | Size(MB) |") + print("|----|-----------|--------|----------|") + for row in hbm_table: + row.append(1.0 * row[1] / hbm_channel_words * 100) + size_mb = 1.0 * row[1] * read_bytewidth / 1024 / 1024 + hbm_total_mb += size_mb + row.append(size_mb) + print("| {:>2} | {:>9} | {:5.2f}% | {:8.2f} |".format(*row)) + print("+----+-----------+--------+----------+") + print("Total DDR memory footprint = {:.2f} MB".format(ddr_total_mb)) + print("Total HBM memory footprint = {:.2f} MB".format(hbm_total_mb)) + print("Total memory footprint = {:.2f} MB".format(ddr_total_mb + hbm_total_mb)) + print("\n") + print("DDR Tables per channel:") + for ch in tables_per_ddr_channel: + print("{:>2} : ".format(ch), end="") + for i in range(len(tables_per_ddr_channel[ch])): + print( + "{:>3} ({:>9}) ({:>2})".format( + tables_per_ddr_channel[ch][i], + base_addr_per_ddr_channel[ch][i], + int( + get_table_vector_length_by_id( + table_info, tables_per_ddr_channel[ch][i] + ) + * element_bytewidth + / smallest_table_bytewidth + ), + ), + end="", + ) + print("") + print("\n") + print("HBM Tables per channel:") + for ch in tables_per_hbm_channel: + print("{:>2} : ".format(ch), end="") + for i in range(len(tables_per_hbm_channel[ch])): + print( + "{:>3} ({:>9}) ({:>2})".format( + tables_per_hbm_channel[ch][i], + base_addr_per_hbm_channel[ch][i], + int( + get_table_vector_length_by_id( + table_info, tables_per_hbm_channel[ch][i] + ) + * element_bytewidth + / smallest_table_bytewidth + ), + ), + end="", + ) + print("") + print("\n") + + +def 
generate_embedding_lookup_inputs(num_inputs): + f = open("embedding_indecies.in", "w") + f.write(str(len(table_info)) + " " + str(num_inputs) + "\n") + for i in range(num_inputs): + input_vec = [] + target_ch = [] + base_addr = [] + round_id = 0 + done = False + table_count = 0 + while not (done): + for ch in tables_per_ddr_channel: + if round_id < len(tables_per_ddr_channel[ch]): + table_id = tables_per_ddr_channel[ch][round_id] + limit = int(get_table_entries_by_id(table_info, table_id) / 2) + input_vec.append(random.randint(0, limit) * read_bytewidth) + target_ch.append(ch) + base_addr.append( + base_addr_per_ddr_channel[ch][round_id] * read_bytewidth + ) + vector_length = get_table_vector_length_by_id(table_info, table_id) + mem_addr = base_addr[-1] + input_vec[-1] + mem_contents_per_channel[ch][mem_addr] = [ + random.randint(-2, 2) for i in range(vector_length) + ] + table_count += 1 + for ch in tables_per_hbm_channel: + if round_id < len(tables_per_hbm_channel[ch]): + table_id = tables_per_hbm_channel[ch][round_id] + limit = int(get_table_entries_by_id(table_info, table_id) / 2) + input_vec.append(random.randint(0, limit) * read_bytewidth) + target_ch.append(ddr_channels + ch) + base_addr.append( + base_addr_per_hbm_channel[ch][round_id] * read_bytewidth + ) + vector_length = get_table_vector_length_by_id(table_info, table_id) + mem_addr = base_addr[-1] + input_vec[-1] + mem_contents_per_channel[ddr_channels + ch][mem_addr] = [ + random.randint(-2, 2) for i in range(vector_length) + ] + table_count += 1 + round_id += 1 + done = table_count == len(table_info) + test_input_data.append(input_vec) + test_input_base_addr.append(base_addr) + test_input_target_ch.append(target_ch) + for j in input_vec: + f.write(str(j) + " ") + f.write("\n") + for j in target_ch: + f.write(str(j) + " ") + f.write("\n") + for j in base_addr: + f.write(str(j) + " ") + f.write("\n") + f.close() + + +def generate_mem_channel_contents(): + # Prepare instruction MIFs directory + if not 
(os.path.exists("./embedding_tables")): + os.mkdir("embedding_tables") + else: + files = glob.glob("embedding_tables/*.dat") + for file in files: + os.remove(file) + + for c in range(ddr_channels + hbm_channels): + f = open("embedding_tables/channel_" + str(c) + ".dat", "w") + for addr in mem_contents_per_channel[c]: + content = mem_contents_per_channel[c][addr] + string_content = "" + byte_count = 0 + for e in content: + string_content = tobin(e, element_bytewidth * 8) + string_content + byte_count += element_bytewidth + for i in range(read_bytewidth - byte_count): + string_content = tobin(0, element_bytewidth * 8) + string_content + f.write(str(addr) + " " + string_content + "\n") + f.close() + + +def pop_one_hot(fifo_ids): + one_hot = "" + for i in range(ddr_channels + hbm_channels): + one_hot += "0" + for id in fifo_ids: + one_hot = one_hot[:id] + "1" + one_hot[id + 1 :] + return one_hot + + +def generate_feature_interaction_instructions(): + if not (os.path.exists("./instructions")): + os.mkdir("instructions") + else: + files = glob.glob("instructions/*.inst") + for file in files: + os.remove(file) + + global smallest_table_bytewidth + f = open("instructions/feature_interaction.inst", "w") + round_id = 0 + table_count = 0 + total_flush_count = 0 + flush_counters = np.zeros(ddr_channels + hbm_channels, dtype=int) + total_pushed_bytes = 0 + while table_count < len(table_info): + for c in tables_per_ddr_channel: + if round_id < len(tables_per_ddr_channel[c]): + vector_length = get_table_vector_length_by_id( + table_info, tables_per_ddr_channel[c][round_id] + ) + num_pops = int( + vector_length * element_bytewidth / smallest_table_bytewidth + ) + total_pushed_bytes += vector_length * element_bytewidth + for p in range(num_pops): + fifo_ids = [c] + for fc in range(len(flush_counters)): + if flush_counters[fc] != 0: + fifo_ids.append(fc) + flush_counters[fc] -= 1 + total_flush_count -= 1 + f.write(str(c + 1) + " " + pop_one_hot(fifo_ids) + "\n") + num_flushes = 
(read_bytewidth / smallest_table_bytewidth) - num_pops + total_flush_count += num_flushes + flush_counters[c] += num_flushes + table_count += 1 + + for ch in tables_per_hbm_channel: + c = ch + ddr_channels + if round_id < len(tables_per_hbm_channel[ch]): + vector_length = get_table_vector_length_by_id( + table_info, tables_per_hbm_channel[ch][round_id] + ) + num_pops = int( + vector_length * element_bytewidth / smallest_table_bytewidth + ) + total_pushed_bytes += vector_length * element_bytewidth + for p in range(num_pops): + fifo_ids = [c] + for fc in range(len(flush_counters)): + if flush_counters[fc] != 0: + fifo_ids.append(fc) + flush_counters[fc] -= 1 + total_flush_count -= 1 + f.write(str(c + 1) + " " + pop_one_hot(fifo_ids) + "\n") + num_flushes = (read_bytewidth / smallest_table_bytewidth) - num_pops + total_flush_count += num_flushes + flush_counters[c] += num_flushes + table_count += 1 + + round_id += 1 + + padded_input_dim = math.ceil(input_dim / native_dim / num_mvms[0]) + padded_input_dim = int(padded_input_dim * native_dim * num_mvms[0]) + total_vector_bytewidth = padded_input_dim * element_bytewidth + remaining_bytes = total_vector_bytewidth - total_pushed_bytes + assert remaining_bytes % smallest_table_bytewidth == 0 + padding_words = int(remaining_bytes / smallest_table_bytewidth) + while total_flush_count > 0: + fifo_ids = [] + for fc in range(len(flush_counters)): + if flush_counters[fc] != 0: + fifo_ids.append(fc) + flush_counters[fc] -= 1 + total_flush_count -= 1 + if padding_words > 0: + f.write( + str(ddr_channels + hbm_channels + 1) + + " " + + pop_one_hot(fifo_ids) + + "\n" + ) + padding_words -= 1 + else: + f.write("0 " + pop_one_hot(fifo_ids) + "\n") + + while padding_words > 0: + f.write(str(ddr_channels + hbm_channels + 1) + " " + pop_one_hot([]) + "\n") + padding_words -= 1 + + f.close() + + +def generate_custom_feature_interaction_instructions(): + global smallest_table_bytewidth + round_id = 0 + table_count = 0 + running_byte_count = 
0 + total_pushed_bytes = 0 + schedule = [] + schedule_step = [] + while table_count < len(table_info): + for ch in tables_per_ddr_channel: + if round_id < len(tables_per_ddr_channel[ch]): + vector_length = get_table_vector_length_by_id(table_info, tables_per_ddr_channel[ch][round_id]) + running_byte_count += vector_length * element_bytewidth + if (vector_length > native_dim): + for i in range(int(vector_length/native_dim)): + schedule_step.append(ch + 1) + schedule_step.append(i * native_dim) + schedule_step.append((i+1) * native_dim - 1) + if (i == int(vector_length/native_dim) - 1): + schedule_step.append(1) + schedule.append(schedule_step) + schedule_step = [] + running_byte_count = 0 + else: + schedule_step.append(0) + schedule.append(schedule_step) + schedule_step = [] + else: + schedule_step.append(ch + 1) + schedule_step.append(0) + schedule_step.append(vector_length-1) + schedule_step.append(1) + if (running_byte_count == native_dim * element_bytewidth): + schedule.append(schedule_step) + running_byte_count = 0 + schedule_step = [] + table_count += 1 + total_pushed_bytes += (vector_length * element_bytewidth) + + for c in tables_per_hbm_channel: + ch = ddr_channels + c + if round_id < len(tables_per_hbm_channel[c]): + vector_length = get_table_vector_length_by_id(table_info, tables_per_hbm_channel[c][round_id]) + running_byte_count += vector_length * element_bytewidth + if (vector_length > native_dim): + for i in range(int(vector_length/native_dim)): + schedule_step.append(ch + 1) + schedule_step.append(i * native_dim) + schedule_step.append((i+1) * native_dim - 1) + schedule_step.append(int(i == int(vector_length/native_dim))) + if (i == int(vector_length/native_dim) - 1): + schedule_step.append(1) + schedule.append(schedule_step) + schedule_step = [] + running_byte_count = 0 + else: + schedule_step.append(0) + schedule.append(schedule_step) + schedule_step = [] + else: + schedule_step.append(ch + 1) + schedule_step.append(0) + 
schedule_step.append(vector_length-1) + schedule_step.append(1) + if (running_byte_count == native_dim * element_bytewidth): + schedule.append(schedule_step) + running_byte_count = 0 + schedule_step = [] + table_count += 1 + total_pushed_bytes += (vector_length * element_bytewidth) + + round_id += 1 + + if running_byte_count > 0 and running_byte_count < native_dim * element_bytewidth: + remaining_bytes = (native_dim * element_bytewidth) - running_byte_count + schedule_step.append(0) + schedule_step.append(0) + schedule_step.append(int(remaining_bytes / element_bytewidth)-1) + schedule_step.append(0) + running_byte_count = 0 + schedule.append(schedule_step) + schedule_step = [] + total_pushed_bytes += remaining_bytes + + padded_input_dim = math.ceil(input_dim / native_dim / num_mvms[0]) + padded_input_dim = int(padded_input_dim * native_dim * num_mvms[0]) + total_vector_bytewidth = padded_input_dim * element_bytewidth + remaining_bytes = total_vector_bytewidth - total_pushed_bytes + assert remaining_bytes % read_bytewidth == 0 + padding_words = int(remaining_bytes / native_dim / element_bytewidth) + for i in range(padding_words): + schedule_step.append(0) + schedule_step.append(0) + schedule_step.append(native_dim-1) + schedule_step.append(0) + schedule.append(schedule_step) + schedule_step = [] + + if not (os.path.exists("./instructions")): + os.mkdir("instructions") + else: + files = glob.glob("instructions/*.inst") + for file in files: + os.remove(file) + f = open("instructions/feature_interaction.inst", "w") + for step in schedule: + for s in step: + f.write(str(s) + " ") + f.write("\n") + f.close() + #idx = 0 + #for s in schedule: + # print(str(idx) + ": " + str(s)) + # idx += 1 + +def generate_feature_interaction_outputs(): + f = open("feature_interaction.out", "w") + feature_interaction_vector_length = 0 + for table in table_info: + feature_interaction_vector_length += get_table_vector_length(table) + total_num_outputs = len(test_input_data) * int( + 
feature_interaction_vector_length * element_bytewidth / read_bytewidth + ) + f.write(str(total_num_outputs) + "\n") + for input_id in range(len(test_input_data)): + output_vec = [] + for idx_id in range(len(test_input_data[input_id])): + idx = test_input_data[input_id][idx_id] + base = test_input_base_addr[input_id][idx_id] + ch = test_input_target_ch[input_id][idx_id] + mem_content = mem_contents_per_channel[ch][base + idx] + for e in mem_content: + output_vec.append(e) + reshaped_output_vec = np.reshape( + output_vec, + ( + int(len(output_vec) / int((read_bytewidth / element_bytewidth))), + int(read_bytewidth / element_bytewidth), + ), + ) + for o in reshaped_output_vec: + for e in o: + f.write(str(e) + " ") + f.write("\n") + test_feature_interaction_outputs.append(output_vec) + f.close() + + +def generate_mlp_weights(): + # Generate random padded weight matrices + padded_weights = [] + for l in range(num_layers): + num_mvms_in = num_mvms[l] + if l == num_layers - 1: + num_mvms_out = num_mvms[0] + else: + num_mvms_out = num_mvms[l + 1] + if l == 0: + layer_input_dim = input_dim + else: + layer_input_dim = hidden_dims[l - 1] + padded_dimx = int( + math.ceil(layer_input_dim * 1.0 / native_dim / num_mvms_in) + * native_dim + * num_mvms_in + ) + padded_dimy = int( + math.ceil(hidden_dims[l] * 1.0 / native_dim / num_mvms_out) + * native_dim + * num_mvms_out + ) + padded_weights.append(np.zeros(shape=(padded_dimy, padded_dimx), dtype=int)) + for i in range(hidden_dims[l]): + sample_indecies = random.sample( + range(layer_input_dim), int(0.1 * layer_input_dim) + ) + for idx in sample_indecies: + padded_weights[l][i, idx] = np.random.randint(-2, 2) + # padded_weights[l][: hidden_dims[l], :layer_input_dim] = np.random.randint( + # -2, 2, size=(hidden_dims[l], layer_input_dim) + # ) + + # Prepare weight MIFs directory + if not (os.path.exists("./mvm_weights")): + os.mkdir("mvm_weights") + else: + files = glob.glob("mvm_weights/*.dat") + for file in files: + os.remove(file) 
+ + # Write weight MIFs + for l in range(num_layers): + layer_mvms = num_mvms[l] + mvm_idx = 0 + limx = int(padded_weights[l].shape[1] / native_dim) + limy = int(padded_weights[l].shape[0] / native_dim) + mifs = [] + for m in range(layer_mvms): + mifs.append([]) + for d in range(native_dim): + mifs[m].append( + open( + "mvm_weights/layer" + + str(l) + + "_mvm" + + str(m) + + "_dot" + + str(d) + + ".dat", + "w", + ) + ) + + for i in range(limx): + for j in range(limy): + for d in range(native_dim): + for e in range(native_dim): + mifs[mvm_idx][d].write( + str( + padded_weights[l][(j * native_dim) + d][ + (i * native_dim) + e + ] + ) + + " " + ) + mifs[mvm_idx][d].write("\n") + if mvm_idx == layer_mvms - 1: + mvm_idx = 0 + else: + mvm_idx = mvm_idx + 1 + + for mvm_mifs in mifs: + for mif in mvm_mifs: + mif.close() + return padded_weights + + +def generate_mvm_instructions(padded_weights): + # Generate instruction MIFs + # en, jump, reduce, accum, accum_en, release, raddr, last, dest_layer, dest_mvm + for l in range(num_layers): + layer_mvms = num_mvms[l] + limx = int(padded_weights[l].shape[1] / native_dim / layer_mvms) + limy = int(padded_weights[l].shape[0] / native_dim) + for m in range(layer_mvms): + inst_mif = open( + "instructions/layer" + str(l) + "_mvm" + str(m) + ".inst", "w" + ) + for i in range(limx): + for j in range(limy): + if (l == num_layers - 1) and (m == layer_mvms - 1): + dest_layer = 0 + dest_mvm = 0 + elif m == layer_mvms - 1: + dest_layer = l + 2 + dest_mvm = j % num_mvms[l + 1] + else: + dest_layer = l + 1 + dest_mvm = m + 1 + inst_mif.write("1 0 ") # en, jump + if m == 0 or i < limx - 1: + inst_mif.write("0 ") # reduce + else: + inst_mif.write("1 ") # reduce + if i == 0: + inst_mif.write(str(j) + " 0 ") # accum, accum_en + else: + inst_mif.write(str(j) + " 1 ") # accum, accum_en + if i == limx - 1: + inst_mif.write("1 ") # release + else: + inst_mif.write("0 ") # release + inst_mif.write(str(i * limy + j) + " ") # raddr + if j == limy - 1: + 
inst_mif.write("1 ") # last + else: + inst_mif.write("0 ") # last + inst_mif.write( + str(dest_layer) + " " + str(dest_mvm) + "\n" + ) # dest_layer, dest_mvm + inst_mif.write("1 1 0 0 0 0 0 0 0 0\n") + inst_mif.close() + + +def generate_mlp_outputs(padded_weights): + # Compute test outputs + padded_input_dim = int( + math.ceil(input_dim * 1.0 / native_dim / num_mvms[0]) * native_dim * num_mvms[0] + ) + padded_test_feature_interaction_outputs = np.zeros( + shape=(num_test_inputs, padded_input_dim), dtype=int + ) + padded_test_feature_interaction_outputs[ + :, :input_dim + ] = test_feature_interaction_outputs + test_inputs = np.transpose(padded_test_feature_interaction_outputs) + test_outputs = np.dot(padded_weights[0], test_inputs) + # test_outputs = np.maximum(test_outputs, np.zeros(shape=test_outputs.shape, dtype=int)) + for l in range(1, num_layers): + test_outputs = np.dot(padded_weights[l], test_outputs) + # test_outputs = np.maximum(test_outputs, np.zeros(shape=test_outputs.shape, dtype=int)) + test_outputs = np.transpose(test_outputs) + + # Generate test output MIFs + output_file = open("./mlp.out", "w") + output_file.write( + str(test_outputs.shape[0] * int(test_outputs.shape[1] / native_dim)) + "\n" + ) + for o in range(test_outputs.shape[0]): + for c in range(int(test_outputs.shape[1] / native_dim)): + for e in range(native_dim): + output_file.write(str(test_outputs[o][(c * native_dim) + e]) + " ") + output_file.write("\n") + output_file.close() + + +def generate_mvms_config(): + # Generate layer/MVM configuration + config_file = open("./mvms.config", "w") + config_file.write(str(num_layers) + " ") + for mvm_count in num_mvms: + config_file.write(str(mvm_count) + " ") + config_file.close() + + +def generate_dlrm_defines_hpp(): + dlrm_defines = open("../modules/dlrm_defines.hpp", "w") + dlrm_defines.write("#define BITWIDTH 16\n") + dlrm_defines.write("#define LANES " + str(native_dim) + "\n") + dlrm_defines.write("#define FIFO_SIZE 512\n") + 
dlrm_defines.write( + "#define COMPUTE_LATENCY " + str(int(math.log2(native_dim)) + 5) + "\n" + ) + if (native_dim == 16): + dlrm_defines.write("#define RF_MEM_DEPTH 1024\n") + else: + dlrm_defines.write("#define RF_MEM_DEPTH 512\n") + dlrm_defines.write("#define ACCUM_MEM_DEPTH 64\n") + dlrm_defines.write("#define INST_MEM_DEPTH 2048\n") + dlrm_defines.write("#define DOT_PRODUCTS LANES\n") + dlrm_defines.write("#define DATAW (BITWIDTH * LANES)\n") + dlrm_defines.write("#define TDATA_ELEMS 32\n") + dlrm_defines.write("#define TDATA_WIDTH 16\n") + dlrm_defines.close() + + +def generate_radsim_clocks_file(): + dlrm_clks = open("../dlrm.clks", "w") + dlrm_clks.write("embedding_lookup_inst 0 0\n") + dlrm_clks.write("feature_interaction_inst 0 0\n") + dlrm_clks.write("ext_mem_0 2 2\n") + dlrm_clks.write("ext_mem_1 2 2\n") + dlrm_clks.write("ext_mem_2 1 1\n") + dlrm_clks.write("ext_mem_3 1 1\n") + for l in range(len(num_mvms)): + for m in range(num_mvms[l]): + if hard_mvms: + dlrm_clks.write("layer" + str(l) + "_mvm" + str(m) + " 0 3\n") + else: + dlrm_clks.write("layer" + str(l) + "_mvm" + str(m) + " 0 0\n") + dlrm_clks.write("output_collector 0 0") + dlrm_clks.close() + + +if "-h" in sys.argv or "--help" in sys.argv: + print("python dlrm.py -l -n -m ") + exit(1) + +# Parse command line arguments +if "-n" in sys.argv: + if sys.argv.index("-n") + 1 >= len(sys.argv): + sys.exit(1) + num_test_inputs = int(sys.argv[sys.argv.index("-n") + 1]) + +if "-l" in sys.argv: + if sys.argv.index("-l") + 1 >= len(sys.argv): + sys.exit(1) + native_dim = int(sys.argv[sys.argv.index("-l") + 1]) + +if "-m" in sys.argv: + if sys.argv.index("-m") + 1 >= len(sys.argv): + sys.exit(1) + model_csv = sys.argv[sys.argv.index("-m") + 1] + +if "-a" in sys.argv: + hard_mvms = True + +parse_dlrm_description(model_csv) +sort_tables() +# print_dlrm_description() +greedy_allocation() +#print_allocation() +generate_embedding_lookup_inputs(num_test_inputs) +generate_mem_channel_contents() 
+#generate_feature_interaction_instructions() +generate_custom_feature_interaction_instructions() +generate_feature_interaction_outputs() +padded_weights = generate_mlp_weights() +generate_mvm_instructions(padded_weights) +generate_mlp_outputs(padded_weights) +generate_mvms_config() +generate_dlrm_defines_hpp() +#generate_radsim_clocks_file() \ No newline at end of file diff --git a/rad-sim/example-designs/dlrm_two_rad/compiler/plot.py b/rad-sim/example-designs/dlrm_two_rad/compiler/plot.py new file mode 100644 index 0000000..333ea3c --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/compiler/plot.py @@ -0,0 +1,131 @@ +import plotly.express as px +import math +from plotly.subplots import make_subplots +import plotly.graph_objects as go +import plotly.io as pio +import numpy as np +from numpy import loadtxt +import sys + +pio.orca.config.use_xvfb = False + +num_traces = 14 + +root_dir = sys.argv[1] + +vspacing = 0.25 +vspacing_step = vspacing / 5.0 + +traces_names = [ + "Embed. LU Req", + "Mem0 Resp", + "Mem1 Resp", + "Mem2 Resp", + "Mem3 Resp", + "Feat. Inter. 
Out", + "MVM0", + "MVM1", + "MVM2", + "MVM3", + "MVM4", + "MVM5", + "MVM6", + "MVM7", +] +traces_color = [ + "#003f5c", + "#2f4b7c", + "#665191", + "#a05195", + "#d45087", + "#f95d6a", + "#ffa600", + "#ffa600", + "#ffa600", + "#ffa600", + "#ffa600", + "#ffa600", + "#ffa600", + "#ffa600", +] + +traces_x = [] +traces_y = [] +for i in range(num_traces): + traces_x.append([]) + traces_y.append([]) + +trace_count = 0 +max_val = 0 +with open(root_dir+"sim/sim.trace") as traces_file: + for line in traces_file: + # Extract integer values of a trace + trace = line.strip().split(" ") + if "" in trace: + trace.remove("") + trace = [int(i) for i in trace] + trace_height = vspacing * ((num_traces - trace_count + 1)) + + for i in trace: + # Append values to corresponding trace list + traces_x[trace_count].append(i) + traces_y[trace_count].append(trace_height) + if i > max_val: + max_val = i + + # Update trace counter + if trace_count == num_traces - 1: + trace_count = 0 + else: + trace_count = trace_count + 1 + +fig = go.Figure() + +for i in range(len(traces_x)): + fig.add_trace( + go.Scatter( + x=traces_x[i], + y=traces_y[i], + mode="markers", + marker_symbol="circle", + marker_color=traces_color[i], + marker_size=10, + ), + ) + +tick_vals = [] +tick_text = traces_names +for i in range(num_traces + 1): + tick_vals.append(vspacing * (i + 1)) +tick_vals.reverse() + +fig.update_xaxes(showline=True, linewidth=2, linecolor="black", mirror=True) +fig.update_xaxes(showgrid=True, gridcolor="rgb(211,211,211)") +fig.update_xaxes( + tickfont=dict(family="Arial", color="black", size=25), + title="Simulation Cycles", + titlefont=dict(family="Arial", color="black", size=25), +) +fig.update_xaxes(tick0=0, ticks="inside") +fig.update_yaxes(showline=True, linewidth=2, linecolor="black", mirror=True) +fig.update_yaxes( + tickmode="array", + tickvals=tick_vals, + ticktext=tick_text, + tickfont=dict(family="Arial", color="black", size=25), +) +fig.update_yaxes(showgrid=False) + 
+fig.update_layout(xaxis_range=[0, max_val + 10]) +#fig.update_layout(xaxis_range=[0, 5000 + 10]) +fig.update_layout(plot_bgcolor="white") +fig.update_layout(showlegend=False) + +fig.update_layout(height=num_traces * 55) + +filename = sys.argv[2] + +#fig.write_image(filename+".pdf") +fig.write_html(filename+".html") + +#fig.show() diff --git a/rad-sim/example-designs/dlrm_two_rad/compiler/report.csv b/rad-sim/example-designs/dlrm_two_rad/compiler/report.csv new file mode 100644 index 0000000..2ed52a7 --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/compiler/report.csv @@ -0,0 +1,297 @@ +flit_width, vc_buf_size, mvm_lanes, model, num_inputs, cycles +131, 8, 16, small, 1, 537 +131, 8, 16, small, 256, 54470 +131, 8, 16, large, 1, +131, 8, 16, large, 256, +131, 8, 32, small, 1, 279 +131, 8, 32, small, 256, 17573 +131, 8, 32, large, 1, 358 +131, 8, 32, large, 256, 34754 +131, 12, 16, small, 1, 537 +131, 12, 16, small, 256, 54470 +131, 12, 16, large, 1, +131, 12, 16, large, 256, +131, 12, 32, small, 1, 276 +131, 12, 32, small, 256, 17451 +131, 12, 32, large, 1, 351 +131, 12, 32, large, 256, 34828 +131, 16, 16, small, 1, 545 +131, 16, 16, small, 256, 54470 +131, 16, 16, large, 1, +131, 16, 16, large, 256, +131, 16, 32, small, 1, 275 +131, 16, 32, small, 256, 17499 +131, 16, 32, large, 1, 367 +131, 16, 32, large, 256, 34828 +195, 8, 16, small, 1, 531 +195, 8, 16, small, 256, 54463 +195, 8, 16, large, 1, +195, 8, 16, large, 256, +195, 8, 32, small, 1, 226 +195, 8, 32, small, 256, 15046 +195, 8, 32, large, 1, 293 +195, 8, 32, large, 256, 34647 +195, 12, 16, small, 1, 531 +195, 12, 16, small, 256, 54471 +195, 12, 16, large, 1, +195, 12, 16, large, 256, +195, 12, 32, small, 1, 216 +195, 12, 32, small, 256, 14997 +195, 12, 32, large, 1, 296 +195, 12, 32, large, 256, 34678 +195, 16, 16, small, 1, 539 +195, 16, 16, small, 256, 54474 +195, 16, 16, large, 1, +195, 16, 16, large, 256, +195, 16, 32, small, 1, 223 +195, 16, 32, small, 256, 14977 +195, 16, 32, large, 1, 299 
+195, 16, 32, large, 256, 34650 +323, 8, 16, small, 1, 529 +323, 8, 16, small, 256, 54461 +323, 8, 16, large, 1, +323, 8, 16, large, 256, +323, 8, 32, small, 1, 239 +323, 8, 32, small, 256, 15010 +323, 8, 32, large, 1, 290 +323, 8, 32, large, 256, 34645 +323, 12, 16, small, 1, 537 +323, 12, 16, small, 256, 54469 +323, 12, 16, large, 1, +323, 12, 16, large, 256, +323, 12, 32, small, 1, 239 +323, 12, 32, small, 256, 15007 +323, 12, 32, large, 1, 290 +323, 12, 32, large, 256, 34645 +323, 16, 16, small, 1, 529 +323, 16, 16, small, 256, 54474 +323, 16, 16, large, 1, +323, 16, 16, large, 256, +323, 16, 32, small, 1, 218 +323, 16, 32, small, 256, 15023 +323, 16, 32, large, 1, 290 +323, 16, 32, large, 256, 34673 +flit_width, vc_buf_size, mvm_lanes, model, num_inputs, cycles +131, 4, 16, small, 1, flit_width, vc_buf_size, mvm_lanes, model, num_inputs, cycles +131, 8, 16, large, 1, +131, 8, 16, large, 256, +131, 12, 16, large, 1, flit_width, vc_buf_size, mvm_lanes, model, num_inputs, cycles +131, 8, 16, large, 1, flit_width, vc_buf_size, mvm_lanes, model, num_inputs, cycles +131, 8, 16, large, 1, +131, 8, 16, large, 256, flit_width, vc_buf_size, mvm_lanes, model, num_inputs, cycles +131, 8, 16, large, 1, 760 +131, 8, 16, large, 256, 79399 +131, 12, 16, large, 1, 760 +131, 12, 16, large, 256, 79406 +131, 16, 16, large, 1, 809 +131, 16, 16, large, 256, 79461 +195, 8, 16, large, 1, 804 +195, 8, 16, large, 256, 79317 +195, 12, 16, large, 1, 752 +195, 12, 16, large, 256, 79315 +195, 16, 16, large, 1, 792 +195, 16, 16, large, 256, 79192 +323, 8, 16, large, 1, 749 +323, 8, 16, large, 256, 79313 +323, 12, 16, large, 1, 800 +323, 12, 16, large, 256, 79326 +323, 16, 16, large, 1, 777 +323, 16, 16, large, 256, 79227 +flit_width, vc_buf_size, mvm_lanes, model, num_inputs, cycles +131, 8, 32, large, 1, 342 +131, 8, 32, large, 256, 22244 +131, 12, 32, large, 1, 332 +131, 12, 32, large, 256, 22236 +131, 16, 32, large, 1, 334 +131, 16, 32, large, 256, 22272 +195, 8, 32, large, 1, 286 +195, 
8, 32, large, 256, 20263 +195, 12, 32, large, 1, 274 +195, 12, 32, large, 256, 20315 +195, 16, 32, large, 1, 274 +195, 16, 32, large, 256, 20348 +323, 8, 32, large, 1, 275 +323, 8, 32, large, 256, 20347 +323, 12, 32, large, 1, 304 +323, 12, 32, large, 256, 20227 +323, 16, 32, large, 1, 275 +323, 16, 32, large, 256, 20279 +flit_width, vc_buf_size, mvm_lanes, model, num_inputs, cycles +131, 8, 16, large, 1, 787 +131, 8, 16, large, 256, 79270 +131, 8, 32, large, 1, 333 +131, 8, 32, large, 256, 22250 +131, 12, 16, large, 1, 810 +131, 12, 16, large, 256, 79393 +131, 12, 32, large, 1, 317 +131, 12, 32, large, 256, 22212 +131, 16, 16, large, 1, 760 +131, 16, 16, large, 256, 79483 +131, 16, 32, large, 1, 327 +131, 16, 32, large, 256, 22230 +195, 8, 16, large, 1, 802 +195, 8, 16, large, 256, 79476 +195, 8, 32, large, 1, 286 +195, 8, 32, large, 256, 20161 +195, 12, 16, large, 1, 792 +195, 12, 16, large, 256, 79510 +195, 12, 32, large, 1, 279 +195, 12, 32, large, 256, 20251 +195, 16, 16, large, 1, 792 +195, 16, 16, large, 256, 79212 +195, 16, 32, large, 1, 371 +195, 16, 32, large, 256, 20210 +323, 8, 16, large, 1, 750 +323, 8, 16, large, 256, 79233 +323, 8, 32, large, 1, 264 +323, 8, 32, large, 256, 20078 +323, 12, 16, large, 1, 800 +323, 12, 16, large, 256, 79329 +323, 12, 32, large, 1, 298 +323, 12, 32, large, 256, 20231 +323, 16, 16, large, 1, 749 +323, 16, 16, large, 256, 79452 +323, 16, 32, large, 1, 282 +323, 16, 32, large, 256, 20214 +flit_width, vc_buf_size, mvm_lanes, model, num_inputs, cycles +131, 8, 16, small, 1, 545 +131, 8, 16, small, 256, 54470 +131, 8, 32, small, 1, 285 +131, 8, 32, small, 256, 17428 +131, 12, 16, small, 1, 545 +131, 12, 16, small, 256, 54470 +131, 12, 32, small, 1, 276 +131, 12, 32, small, 256, 17470 +131, 16, 16, small, 1, 537 +131, 16, 16, small, 256, 54478 +131, 16, 32, small, 1, 281 +131, 16, 32, small, 256, 17415 +195, 8, 16, small, 1, 539 +195, 8, 16, small, 256, 54463 +195, 8, 32, small, 1, 242 +195, 8, 32, small, 256, 15007 +195, 12, 
16, small, 1, 539 +195, 12, 16, small, 256, 54471 +195, 12, 32, small, 1, 223 +195, 12, 32, small, 256, 15011 +195, 16, 16, small, 1, 539 +195, 16, 16, small, 256, 54471 +195, 16, 32, small, 1, 223 +195, 16, 32, small, 256, 14980 +323, 8, 16, small, 1, 539 +323, 8, 16, small, 256, 54461 +323, 8, 32, small, 1, 217 +323, 8, 32, small, 256, 15035 +323, 12, 16, small, 1, 539 +323, 12, 16, small, 256, 54469 +323, 12, 32, small, 1, 239 +323, 12, 32, small, 256, 15011 +323, 16, 16, small, 1, 529 +323, 16, 16, small, 256, 54461 +323, 16, 32, small, 1, 213 +323, 16, 32, small, 256, 15005 +flit_width, vc_buf_size, mvm_lanes, model, num_inputs, cycles +131, 8, 64, small, 1, flit_width, vc_buf_size, mvm_lanes, model, num_inputs, cycles +131, 8, 64, small, 1, flit_width, vc_buf_size, mvm_lanes, model, num_inputs, cycles +195, 8, 64, small, 1, 236 +195, 8, 64, small, 256, 8798 +195, 16, 64, small, 1, 201 +195, 16, 64, small, 256, 8860 +323, 8, 64, small, 1, 170 +323, 8, 64, small, 256, 6164 +323, 16, 64, small, 1, 206 +323, 16, 64, small, 256, 6409 +flit_width, vc_buf_size, mvm_lanes, model, num_inputs, cycles +195, 8, 64, large, 1, 241 +195, 8, 64, large, 256, 13805 +195, 12, 64, large, 1, 265 +195, 12, 64, large, 256, 13568 +195, 16, 64, large, 1, 206 +195, 16, 64, large, 256, 13864 +323, 8, 64, large, 1, 212 +323, 8, 64, large, 256, 13143 +323, 12, 64, large, 1, 215 +323, 12, 64, large, 256, 13907 +323, 16, 64, large, 1, 243 +323, 16, 64, large, 256, 13595 +flit_width, vc_buf_size, mvm_lanes, model, num_inputs, cycles +195, 8, 64, large, 1, 246 +195, 8, 64, large, 256, 13044 +195, 12, 64, large, 1, 243 +195, 12, 64, large, 256, 13910 +195, 16, 64, large, 1, 217 +195, 16, 64, large, 256, 13962 +323, 8, 64, large, 1, 181 +323, 8, 64, large, 256, 14123 +323, 12, 64, large, 1, 185 +323, 12, 64, large, 256, 13545 +323, 16, 64, large, 1, 211 +323, 16, 64, large, 256, 13639 +flit_width, vc_buf_size, mvm_lanes, model, num_inputs, cycles +131, 8, 64, large, 1, flit_width, vc_buf_size, 
mvm_lanes, model, num_inputs, cycles +195, 8, 16, small, 1, 602 +195, 8, 16, small, 256, 79008 +195, 8, 16, large, 1, 804 +195, 8, 16, large, 256, 79510 +195, 8, 32, small, 1, 244 +195, 8, 32, small, 256, 19938 +195, 8, 32, large, 1, 288 +195, 8, 32, large, 256, 20210 +195, 8, 64, small, 1, 207 +195, 8, 64, small, 256, 8895 +195, 8, 64, large, 1, 278 +195, 8, 64, large, 256, 14391 +195, 16, 16, small, 1, 584 +195, 16, 16, small, 256, 79015 +195, 16, 16, large, 1, 804 +195, 16, 16, large, 256, 79368 +195, 16, 32, small, 1, 239 +195, 16, 32, small, 256, 19911 +195, 16, 32, large, 1, 274 +195, 16, 32, large, 256, 20308 +195, 16, 64, small, 1, 204 +195, 16, 64, small, 256, 9086 +195, 16, 64, large, 1, 211 +195, 16, 64, large, 256, 13955 +323, 8, 16, small, 1, 594 +323, 8, 16, small, 256, 79048 +323, 8, 16, large, 1, 801 +323, 8, 16, large, 256, 79190 +323, 8, 32, small, 1, 230 +323, 8, 32, small, 256, 19942 +323, 8, 32, large, 1, 284 +323, 8, 32, large, 256, 20301 +323, 8, 64, small, 1, 175 +323, 8, 64, small, 256, 6568 +323, 8, 64, large, 1, 242 +323, 8, 64, large, 256, 13545 +323, 16, 16, small, 1, 594 +323, 16, 16, small, 256, 79007 +323, 16, 16, large, 1, 801 +323, 16, 16, large, 256, 79394 +323, 16, 32, small, 1, 235 +323, 16, 32, small, 256, 19979 +323, 16, 32, large, 1, 278 +323, 16, 32, large, 256, 20143 +323, 16, 64, small, 1, 175 +323, 16, 64, small, 256, 6735 +323, 16, 64, large, 1, 214 +323, 16, 64, large, 256, 13290 +flit_width, vc_buf_size, mvm_lanes, model, num_inputs, cycles +131, 8, 16, small, 1, 591 +131, 8, 16, small, 256, 79016 +131, 8, 16, large, 1, 787 +131, 8, 16, large, 256, 79268 +131, 8, 32, small, 1, 284 +131, 8, 32, small, 256, 20081 +131, 8, 32, large, 1, 360 +131, 8, 32, large, 256, 22256 +131, 16, 16, small, 1, 591 +131, 16, 16, small, 256, 79084 +131, 16, 16, large, 1, 809 +131, 16, 16, large, 256, 79483 +131, 16, 32, small, 1, 269 +131, 16, 32, small, 256, 20206 +131, 16, 32, large, 1, 333 +131, 16, 32, large, 256, 22210 +flit_width, 
vc_buf_size, mvm_lanes, model, num_inputs, cycles diff --git a/rad-sim/example-designs/dlrm_two_rad/compiler/run_tests.py b/rad-sim/example-designs/dlrm_two_rad/compiler/run_tests.py new file mode 100644 index 0000000..b6d033e --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/compiler/run_tests.py @@ -0,0 +1,127 @@ +import sys +import os +import subprocess +import glob +import shutil + +models = ['small', 'large'] +flit_widths = [131] +vc_buffer_sizes = [8, 16] +mvm_configs = [16, 32] +num_inputs = [1, 256] + +root_dir = '/home/andrew/repos/rad-flow-dev/' + +report_file = open('report.csv', 'a') +report_file.write('flit_width, vc_buf_size, mvm_lanes, model, num_inputs, cycles\n') + +# Go to the RAD flow root directory +os.chdir(root_dir) + +for fw in flit_widths: + for vc in vc_buffer_sizes: + # For every variation of the flit width and buffer size + # change the rad-flow.config file and run config script + if (fw == 131): + pw = 82 + elif (fw == 195): + pw = 146 + else: + pw = 274 + + config_file = open(root_dir+'rad-flow.config', 'r') + lines = config_file.readlines() + config_file.close() + for i in range(len(lines)): + if 'noc_payload_width' in lines[i]: + lines[i] = 'noc_payload_width = ['+str(pw)+']\n' + elif 'noc_vc_buffer_size' in lines[i]: + lines[i] = 'noc_vc_buffer_size = ['+str(vc)+']\n' + config_file = open(root_dir+'rad-flow.config', 'w') + config_file.writelines(lines) + config_file.close() + os.chdir(root_dir+'scripts/') + subprocess.run(['python', 'config.py']) + + for mvm in mvm_configs: + for model in models: + for inputs in num_inputs: + name = 'fw' + str(fw) + '_vc' + str(vc) + '_mvm' + str(mvm) + \ + '_' + str(model) + '_' + str(inputs) + report_file.write(str(fw) + ', ' + + str(vc) + ', ' + + str(mvm) + ', ' + + str(model) + ', ' + + str(inputs) + ', ') + report_file.flush() + print(name) + + print('Creating reports directory ... 
', end='', flush=True) + reports_path = root_dir + 'rad-sim/example-designs/dlrm/compiler/reports/' + name + if not os.path.exists(reports_path): + os.makedirs(reports_path) + print('Done') + + + # Run dlrm compiler script + print('Running DLRM compiler script ... ', end='', flush=True) + os.chdir(root_dir+'rad-sim/example-designs/dlrm/compiler/') + subprocess.run(['python', 'dlrm.py', + '-l', str(mvm), + '-n', str(inputs), + '-m', 'ab_'+model+'.csv']) + print('Done') + + # Build RAD-Sim + print('Building RAD-Sim ... ', end='', flush=True) + os.chdir(root_dir+'rad-sim/build/') + run_log = open(reports_path + '/radsim.log', 'w') + subprocess.run(['make'], stdout=run_log, stderr=run_log) + print('Done') + + # Run RAD-Sim + print('Running RAD-Sim ... ', end='', flush=True) + run_out = subprocess.run(['./sim/build/system'], + stdout=run_log, + stderr=run_log, timeout=300) + print('Done') + run_log.close() + + # Parse run log + read_run_log = open(reports_path + '/radsim.log', 'r') + lines = read_run_log.readlines() + flag = False + for line in lines: + if 'PASSED' in line: + print('Simulation Passed! ', end='') + flag = True + if 'Simulated' in line: + line = line.split() + cycles = line[1] + print(str(cycles) + ' cycles') + report_file.write(str(cycles) + '\n') + report_file.flush() + if not flag: + print('Something Wrong!') + report_file.write('\n') + report_file.flush() + read_run_log.close() + + + # Copy dramsim reports + print('Copying DRAMsim3 reports ... ', end='', flush=True) + files = glob.iglob(os.path.join(root_dir+'rad-sim/logs', '*.txt')) + for file in files: + if (os.path.isfile(file)): + shutil.copy2(file, reports_path + '/') + print('Done') + + # Plot and save as html + print('Plotting and saving trace ... 
', end='', flush=True) + os.chdir(root_dir+'rad-sim/example-designs/dlrm/compiler/') + subprocess.run(['python', root_dir+'rad-sim/example-designs/dlrm/compiler/plot.py', root_dir+'rad-sim/', reports_path]) + print('Done') + print('---------------------------') + + +report_file.close() \ No newline at end of file diff --git a/rad-sim/example-designs/dlrm_two_rad/compiler/run_tests.sh b/rad-sim/example-designs/dlrm_two_rad/compiler/run_tests.sh new file mode 100755 index 0000000..c2daf34 --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/compiler/run_tests.sh @@ -0,0 +1,52 @@ +#!/bin/sh + +RUN="mvm16-flitw$FLITW-vcbuf$VCBUF-$MODEL" +ROOT_DIR="/home/andrew/repos/rad-flow-dev/" + +cd $ROOT_DIR +for FLITW in 131 +do + for VCBUF in 8 + do + + if [ $FLITW -eq 131 ] + then + sed -i 's/noc_payload_width = \[[0-9]*\]/noc_payload_width = \[82\]/g' rad-flow.config + elif [ $FLITW -eq 195 ] + then + sed -i 's/noc_payload_width = \[[0-9]*\]/noc_payload_width = \[146\]/g' rad-flow.config + else + sed -i 's/noc_payload_width = \[[0-9]*\]/noc_payload_width = \[274\]/g' rad-flow.config + fi + + sed -i 's/noc_vc_buffer_size = \[[0-9]*\]/noc_vc_buffer_size = \['$VCBUF'\]/g' rad-flow.config + cd scripts + python config.py + + for MVM in 16 32 + do + for MODEL in small + do + for INPUTS in 1 256 + do + RUN="mvm$MVM-flitw$FLITW-vcbuf$VCBUF-$MODEL-$INPUTS" + echo "$RUN" + + cd $ROOT_DIR/rad-sim/example-designs/dlrm/compiler + python dlrm.py -l $MVM -n $INPUTS -m ab_$MODEL.csv + + cd $ROOT_DIR/rad-sim/build + make >> make.log + ./sim/build/system + + cd $ROOT_DIR/rad-sim/logs + mkdir $ROOT_DIR/rad-sim/example-designs/dlrm/compiler/reports/dramsim_logs/$RUN + cp *.txt $ROOT_DIR/rad-sim/example-designs/dlrm/compiler/reports/dramsim_logs/$RUN/ + + cd $ROOT_DIR/rad-sim/example-designs/dlrm/compiler + python plot.py $ROOT_DIR/rad-sim/ $ROOT_DIR/rad-sim/example-designs/dlrm/compiler/reports/plots/$RUN + done + done + done + done +done \ No newline at end of file diff --git 
a/rad-sim/example-designs/dlrm_two_rad/config.yml b/rad-sim/example-designs/dlrm_two_rad/config.yml new file mode 100644 index 0000000..0816b92 --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/config.yml @@ -0,0 +1,63 @@ +config rad1: + dram: + num_controllers: 4 + clk_periods: [3.32, 3.32, 2.0, 2.0] + queue_sizes: [64, 64, 64, 64] + config_files: ['DDR4_8Gb_x16_2400', 'DDR4_8Gb_x16_2400', 'HBM2_8Gb_x128', 'HBM2_8Gb_x128'] + + design: + name: 'dlrm_two_rad' + noc_placement: ['dlrm_two_rad.place'] + clk_periods: [5.0, 2.0, 3.32, 1.5] + +config anotherconfig: + dram: + num_controllers: 4 + clk_periods: [3.32, 3.32, 2.0, 2.0] + queue_sizes: [64, 64, 64, 64] + config_files: ['DDR4_8Gb_x16_2400', 'DDR4_8Gb_x16_2400', 'HBM2_8Gb_x128', 'HBM2_8Gb_x128'] + + design: + name: 'dlrm_two_rad' + noc_placement: ['dlrm_two_rad.place'] + clk_periods: [5.0, 2.0, 3.32, 1.5] + +noc: + type: ['2d'] + num_nocs: 1 + clk_period: [1.0] + payload_width: [82] + topology: ['mesh'] + dim_x: [10] + dim_y: [10] + routing_func: ['dim_order'] + vcs: [5] + vc_buffer_size: [16] + output_buffer_size: [8] + num_packet_types: [5] + router_uarch: ['iq'] + vc_allocator: ['islip'] + sw_allocator: ['islip'] + credit_delay: [1] + routing_delay: [1] + vc_alloc_delay: [1] + sw_alloc_delay: [1] + +noc_adapters: + clk_period: [1.25] + fifo_size: [16] + obuff_size: [2] + in_arbiter: ['fixed_rr'] + out_arbiter: ['priority_rr'] + vc_mapping: ['direct'] + +cluster: + sim_driver_period: 5.0 + telemetry_log_verbosity: 2 + telemetry_traces: ['Embedding LU', 'Mem0', 'Mem1', 'Mem2', 'Mem3', 'Feature Inter.', 'MVM first', 'MVM last'] + num_rads: 2 + cluster_configs: ['rad1', 'anotherconfig'] + cluster_topology: 'all-to-all' + inter_rad_latency: 2100 + inter_rad_bw: 102.4 + inter_rad_fifo_num_slots: 1000 \ No newline at end of file diff --git a/rad-sim/example-designs/dlrm_two_rad/dlrm_driver.cpp b/rad-sim/example-designs/dlrm_two_rad/dlrm_driver.cpp new file mode 100644 index 0000000..1943dee --- /dev/null +++ 
b/rad-sim/example-designs/dlrm_two_rad/dlrm_driver.cpp @@ -0,0 +1,227 @@ +#include + +bool ParseInputs(std::vector> &lookup_indecies, + std::vector> &target_channels, + std::vector> &base_addresses, + std::string &io_filename) { + std::ifstream io_file(io_filename); + if (!io_file) + return false; + + uint64_t num_indecies_per_input, index; + std::string line; + + // Get number of indecies per input + std::getline(io_file, line); + std::stringstream header_stream(line); + header_stream >> num_indecies_per_input; + + unsigned int line_num = 0; + while (std::getline(io_file, line)) { + std::stringstream line_stream(line); + if (line_num % 3 == 0) { + data_vector dvector(num_indecies_per_input); + for (unsigned int i = 0; i < num_indecies_per_input; i++) { + line_stream >> index; + dvector[i] = index; + } + lookup_indecies.push_back(dvector); + } else if (line_num % 3 == 1) { + data_vector dvector(num_indecies_per_input); + for (unsigned int i = 0; i < num_indecies_per_input; i++) { + line_stream >> index; + dvector[i] = index; + } + target_channels.push_back(dvector); + } else { + data_vector dvector(num_indecies_per_input); + for (unsigned int i = 0; i < num_indecies_per_input; i++) { + line_stream >> index; + dvector[i] = index; + } + base_addresses.push_back(dvector); + } + line_num++; + } + return true; +} + +bool ParseOutputs(std::vector> &fi_outputs, + std::string &io_filename, unsigned int &num_outputs) { + std::ifstream io_file(io_filename); + if (!io_file) + return false; + + int16_t element; + std::string line; + + std::getline(io_file, line); + std::stringstream line_stream(line); + line_stream >> num_outputs; + + while (std::getline(io_file, line)) { + std::stringstream line_stream(line); + std::vector tmp; + while (line_stream.rdbuf()->in_avail() != 0) { + line_stream >> element; + tmp.push_back(element); + } + fi_outputs.push_back(tmp); + } + return true; +} + +dlrm_driver::dlrm_driver(const sc_module_name &name, RADSimDesignContext* radsim_design_) : 
sc_module(name) { + this->radsim_design = radsim_design_; + + // Parse design configuration (number of layers & number of MVM per layer) + std::string design_root_dir = + radsim_config.GetStringKnobPerRad("radsim_user_design_root_dir", radsim_design->rad_id); + + std::string inputs_filename = + design_root_dir + "/compiler/embedding_indecies.in"; + ParseInputs(_lookup_indecies, _target_channels, _base_addresses, + inputs_filename); + std::cout << "Finished parsing inputs!" << std::endl; + + std::string feature_interaction_outputs_filename = + design_root_dir + "/compiler/feature_interaction.out"; + ParseOutputs(_feature_interaction_outputs, + feature_interaction_outputs_filename, + _num_feature_interaction_outputs); + + std::string mlp_outputs_filename = design_root_dir + "/compiler/mlp.out"; + ParseOutputs(_mlp_outputs, mlp_outputs_filename, _num_mlp_outputs); + + SC_METHOD(assign); + sensitive << collector_fifo_rdy; + SC_CTHREAD(source, clk.pos()); + SC_CTHREAD(sink, clk.pos()); +} + +dlrm_driver::~dlrm_driver() {} + +void dlrm_driver::assign() { collector_fifo_ren.write(collector_fifo_rdy); } + +void dlrm_driver::source() { + // Reset + rst.write(true); + lookup_indecies_valid.write(false); + wait(); + rst.write(false); + wait(); + + unsigned int idx = 0; + _start_cycle = + GetSimulationCycle(radsim_config.GetDoubleKnobShared("sim_driver_period")); + while (idx < _lookup_indecies.size()) { + lookup_indecies_data.write(_lookup_indecies[idx]); + lookup_indecies_target_channels.write(_target_channels[idx]); + lookup_indecies_base_addresses.write(_base_addresses[idx]); + lookup_indecies_valid.write(true); + + wait(); + + if (lookup_indecies_valid.read() && lookup_indecies_ready.read()) { + idx++; + } + } + lookup_indecies_valid.write(false); + std::cout << this->name() + << ": Finished sending all inputs to embedding lookup module!" 
+ << std::endl; + wait(); +} + +void print_progress_bar(unsigned int outputs_count, unsigned int total) { + unsigned int loading_bar_width = 50; + std::cout << "["; + float progress = 1.0 * outputs_count / total; + unsigned int pos = loading_bar_width * progress; + for (unsigned int i = 0; i < loading_bar_width; ++i) { + if (i < pos) + std::cout << "="; + else if (i == pos) + std::cout << ">"; + else + std::cout << " "; + } + if (outputs_count == total) { + std::cout << "] " << int(progress * 100.0) << " %\n"; + } else { + std::cout << "] " << int(progress * 100.0) << " %\r"; + } + std::cout.flush(); +} + +void dlrm_driver::sink() { + std::ofstream mismatching_outputs_file("mismatching.log"); + + unsigned int outputs_count = 0; + data_vector dut_output; + bool all_outputs_matching = true; + while (outputs_count < _num_mlp_outputs) { + dut_output = collector_fifo_rdata.read(); + if (collector_fifo_rdy.read() && dut_output.size() > 0) { + bool matching = true; + for (unsigned int e = 0; e < dut_output.size(); e++) { + matching &= (dut_output[e] == _mlp_outputs[outputs_count][e]); + } + if (!matching) { + std::cout << "Output " << outputs_count << " on rad " << radsim_design->rad_id << " does not match!\n"; + std::cout << "TRUE: [ "; + for (unsigned int e = 0; e < _mlp_outputs[outputs_count].size(); e++) { + std::cout << _mlp_outputs[outputs_count][e] << " "; + } + std::cout << "]\n"; + std::cout << "DUT : [ "; + for (unsigned int e = 0; e < dut_output.size(); e++) { + std::cout << dut_output[e] << " "; + } + std::cout << "]\n"; + std::cout << "-------------------------------\n"; + } + // else { + // std::cout << "Output " << outputs_count << " on rad " << radsim_design->rad_id << " does match :)\n"; + // std::cout << "TRUE: [ "; + // for (unsigned int e = 0; e < _mlp_outputs[outputs_count].size(); e++) { + // std::cout << _mlp_outputs[outputs_count][e] << " "; + // } + // std::cout << "]\n"; + // std::cout << "DUT : [ "; + // for (unsigned int e = 0; e < 
dut_output.size(); e++) { + // std::cout << dut_output[e] << " "; + // } + // std::cout << "]\n"; + // std::cout << "-------------------------------\n"; + // } + outputs_count++; + all_outputs_matching &= matching; + + print_progress_bar(outputs_count, _num_mlp_outputs); + //std::cout << "outputs_count " << outputs_count << " and _num_mlp_outputs " << _num_mlp_outputs << std::endl; + } + wait(); + } + std::cout << "Got " << outputs_count << " output(s)!\n"; + mismatching_outputs_file.flush(); + mismatching_outputs_file.close(); + + if (all_outputs_matching) { + std::cout << "Simulation PASSED! All outputs matching!" << std::endl; + } else { + std::cout << "Simulation FAILED! Some outputs are NOT matching!" << std::endl; + radsim_design->ReportDesignFailure(); + } + _end_cycle = + GetSimulationCycle(radsim_config.GetDoubleKnobShared("sim_driver_period")); + std::cout << "Simulated " << (_end_cycle - _start_cycle) << " cycle(s)" + << std::endl; + + for (unsigned int i = 0; i < 10; i++) { + wait(); + } + //sc_stop(); + this->radsim_design->set_rad_done(); //flag to replace sc_stop calls + return; +} \ No newline at end of file diff --git a/rad-sim/example-designs/dlrm_two_rad/dlrm_driver.hpp b/rad-sim/example-designs/dlrm_two_rad/dlrm_driver.hpp new file mode 100644 index 0000000..409152d --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/dlrm_driver.hpp @@ -0,0 +1,47 @@ +#pragma once + +#include +#include +#include +#include +#include +#include +#include +#include + +class dlrm_driver : public sc_module { +private: + std::vector> _lookup_indecies; + std::vector> _target_channels; + std::vector> _base_addresses; + std::vector> _feature_interaction_outputs; + std::vector> _mlp_outputs; + unsigned int _num_feature_interaction_outputs; + unsigned int _num_mlp_outputs; + unsigned int _start_cycle, _end_cycle; + RADSimDesignContext* radsim_design; + +public: + sc_in clk; + sc_out rst; + sc_out> lookup_indecies_data; + sc_out> lookup_indecies_target_channels; + 
sc_out> lookup_indecies_base_addresses; + sc_out lookup_indecies_valid; + sc_in lookup_indecies_ready; + + sc_in received_responses; + + sc_in collector_fifo_rdy; + sc_out collector_fifo_ren; + sc_in> collector_fifo_rdata; + + dlrm_driver(const sc_module_name &name, RADSimDesignContext* radsim_design_); + ~dlrm_driver(); + + void assign(); + void source(); + void sink(); + + SC_HAS_PROCESS(dlrm_driver); +}; \ No newline at end of file diff --git a/rad-sim/example-designs/dlrm_two_rad/dlrm_top.cpp b/rad-sim/example-designs/dlrm_two_rad/dlrm_top.cpp new file mode 100644 index 0000000..307b5cd --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/dlrm_top.cpp @@ -0,0 +1,156 @@ +#include + +dlrm_top::dlrm_top(const sc_module_name &name, RADSimDesignContext* radsim_design) : RADSimDesignTop(radsim_design) { + this->radsim_design = radsim_design; + unsigned int line_bitwidth = 512; + unsigned int element_bitwidth = 16; + std::vector mem_channels = {1, 1, 8, 8}; + unsigned int embedding_lookup_fifos_depth = 16; + unsigned int feature_interaction_fifos_depth = 64; + unsigned int num_mem_controllers = + radsim_config.GetIntKnobPerRad("dram_num_controllers", radsim_design->rad_id); + assert(num_mem_controllers == mem_channels.size()); + unsigned int total_mem_channels = 0; + for (auto &num_channels : mem_channels) { + total_mem_channels += num_channels; + } + + std::string module_name_str; + char module_name[25]; + + // Parse MVM configuration + std::string design_root_dir = + radsim_config.GetStringKnobPerRad("radsim_user_design_root_dir", radsim_design->rad_id); + std::string design_config_filename = + design_root_dir + "/compiler/mvms.config"; + + std::ifstream design_config_file(design_config_filename); + if (!design_config_file) { + std::cerr << "Cannot read MLP design configuration file!" 
<< std::endl; + exit(1); + } + std::string line; + std::getline(design_config_file, line); + std::stringstream line_stream(line); + unsigned int num_layers, tmp; + std::vector num_mvms; + line_stream >> num_layers; + num_mvms.resize(num_layers); + for (unsigned int layer_id = 0; layer_id < num_layers; layer_id++) { + line_stream >> tmp; + num_mvms[layer_id] = tmp; + } + + // Instantiate Embedding Lookup Module + module_name_str = "embedding_lookup_inst"; + std::strcpy(module_name, module_name_str.c_str()); + embedding_lookup_inst = new embedding_lookup( + module_name, line_bitwidth, mem_channels, embedding_lookup_fifos_depth, radsim_design); + embedding_lookup_inst->rst(rst); + embedding_lookup_inst->lookup_indecies_data(lookup_indecies_data); + embedding_lookup_inst->lookup_indecies_target_channels( + lookup_indecies_target_channels); + embedding_lookup_inst->lookup_indecies_base_addresses( + lookup_indecies_base_addresses); + embedding_lookup_inst->lookup_indecies_valid(lookup_indecies_valid); + embedding_lookup_inst->lookup_indecies_ready(lookup_indecies_ready); + + // Instantiate Feature Interaction Module + module_name_str = "feature_interaction_inst"; + std::strcpy(module_name, module_name_str.c_str()); + std::string feature_interaction_inst_file = + radsim_config.GetStringKnobPerRad("radsim_user_design_root_dir", radsim_design->rad_id) + + "/compiler/instructions/feature_interaction.inst"; + feature_interaction_inst = new custom_feature_interaction( + module_name, line_bitwidth, element_bitwidth, total_mem_channels, + feature_interaction_fifos_depth, num_mvms[0], + feature_interaction_inst_file, radsim_design); + feature_interaction_inst->rst(rst); + feature_interaction_inst->received_responses(received_responses); + + // Instantiate MVM Engines + + unsigned int axis_signal_count = 0; + mvms.resize(num_layers); + for (unsigned int l = 0; l < num_layers; l++) { + mvms[l].resize(num_mvms[l]); + for (unsigned int m = 0; m < num_mvms[l]; m++) { + module_name_str 
= "layer" + to_string(l) + "_mvm" + to_string(m); + std::strcpy(module_name, module_name_str.c_str()); + std::string inst_filename = design_root_dir + "/compiler/instructions/" + + module_name_str + ".inst"; + mvms[l][m] = new mvm(module_name, m, l, inst_filename, radsim_design); + mvms[l][m]->rst(rst); + axis_signal_count++; + } + } + + axis_sig.resize(axis_signal_count); + unsigned int idx = 0; + for (unsigned int l = 0; l < num_layers; l++) { + for (unsigned int m = 0; m < num_mvms[l]; m++) { + if (m == num_mvms[l] - 1 && l == num_layers - 1) { + axis_sig[idx].Connect(mvms[l][m]->tx_reduce_interface, + mvms[0][0]->rx_reduce_interface); + } else if (m == num_mvms[l] - 1) { + axis_sig[idx].Connect(mvms[l][m]->tx_reduce_interface, + mvms[l + 1][0]->rx_reduce_interface); + } else { + axis_sig[idx].Connect(mvms[l][m]->tx_reduce_interface, + mvms[l][m + 1]->rx_reduce_interface); + } + idx++; + } + } + + // Instantiate Output Collector + module_name_str = "output_collector"; + std::strcpy(module_name, module_name_str.c_str()); + output_collector = new collector(module_name, radsim_design); + output_collector->rst(rst); + output_collector->data_fifo_rdy(collector_fifo_rdy); + output_collector->data_fifo_ren(collector_fifo_ren); + output_collector->data_fifo_rdata(collector_fifo_rdata); + + ext_mem.resize(num_mem_controllers); + mem_clks.resize(num_mem_controllers); + unsigned int ch_id = 0; + std::string mem_content_init_prefix = + radsim_config.GetStringKnobPerRad("radsim_user_design_root_dir", radsim_design->rad_id) + + "/compiler/embedding_tables/channel_"; + for (unsigned int ctrl_id = 0; ctrl_id < num_mem_controllers; ctrl_id++) { + double mem_clk_period = + radsim_config.GetDoubleVectorKnobPerRad("dram_clk_periods", ctrl_id, radsim_design->rad_id); + module_name_str = "ext_mem_" + to_string(ctrl_id) + "_clk"; + std::strcpy(module_name, module_name_str.c_str()); + mem_clks[ctrl_id] = new sc_clock(module_name, mem_clk_period, SC_NS); + module_name_str = "ext_mem_" + 
to_string(ctrl_id); + std::strcpy(module_name, module_name_str.c_str()); + std::string mem_content_init = mem_content_init_prefix + to_string(ch_id); + ext_mem[ctrl_id] = + new mem_controller(module_name, ctrl_id, radsim_design, mem_content_init); + ext_mem[ctrl_id]->mem_clk(*mem_clks[ctrl_id]); + ext_mem[ctrl_id]->rst(rst); + ch_id += mem_channels[ctrl_id]; + } + + this->connectPortalReset(&rst); + + radsim_design->BuildDesignContext("dlrm_two_rad.place", "dlrm_two_rad.clks"); + radsim_design->CreateSystemNoCs(rst); + radsim_design->ConnectModulesToNoC(); +} + +dlrm_top::~dlrm_top() { + delete embedding_lookup_inst; + delete feature_interaction_inst; + for (auto &ctrlr : ext_mem) + delete ctrlr; + delete output_collector; + for (unsigned int l = 0; l < mvms.size(); l++) { + for (auto &mvm : mvms[l]) { + delete mvm; + } + } + +} \ No newline at end of file diff --git a/rad-sim/example-designs/dlrm_two_rad/dlrm_top.hpp b/rad-sim/example-designs/dlrm_two_rad/dlrm_top.hpp new file mode 100644 index 0000000..71d699e --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/dlrm_top.hpp @@ -0,0 +1,44 @@ +#pragma once + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +class dlrm_top : public RADSimDesignTop { +private: + embedding_lookup *embedding_lookup_inst; + custom_feature_interaction *feature_interaction_inst; + std::vector> mvms; + collector *output_collector; + std::vector ext_mem; + + std::vector axis_sig; + std::vector mem_clks; + RADSimDesignContext* radsim_design; + +public: + sc_in rst; + + sc_in> lookup_indecies_data; + sc_in> lookup_indecies_target_channels; + sc_in> lookup_indecies_base_addresses; + sc_in lookup_indecies_valid; + sc_out lookup_indecies_ready; + + sc_out received_responses; + + sc_out collector_fifo_rdy; + sc_in collector_fifo_ren; + sc_out> collector_fifo_rdata; + + dlrm_top(const sc_module_name &name, RADSimDesignContext* radsim_design); + ~dlrm_top(); +}; \ No newline at end 
of file diff --git a/rad-sim/example-designs/dlrm_two_rad/dlrm_two_rad.clks b/rad-sim/example-designs/dlrm_two_rad/dlrm_two_rad.clks new file mode 100644 index 0000000..d50f638 --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/dlrm_two_rad.clks @@ -0,0 +1,16 @@ +embedding_lookup_inst 0 0 +feature_interaction_inst 0 0 +ext_mem_0 0 2 +ext_mem_1 0 2 +ext_mem_2 0 1 +ext_mem_3 0 1 +layer0_mvm0 0 3 +layer0_mvm1 0 3 +layer0_mvm2 0 3 +layer0_mvm3 0 3 +layer1_mvm0 0 3 +layer1_mvm1 0 3 +layer2_mvm0 0 3 +layer2_mvm1 0 3 +output_collector 0 0 +portal_inst 0 0 \ No newline at end of file diff --git a/rad-sim/example-designs/dlrm_two_rad/dlrm_two_rad.place b/rad-sim/example-designs/dlrm_two_rad/dlrm_two_rad.place new file mode 100644 index 0000000..2e2e240 --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/dlrm_two_rad.place @@ -0,0 +1,68 @@ +ext_mem_0.mem_channel_0 0 1 aximm +ext_mem_1.mem_channel_0 0 71 aximm +ext_mem_2.mem_channel_0 0 2 aximm +ext_mem_2.mem_channel_1 0 3 aximm +ext_mem_2.mem_channel_2 0 4 aximm +ext_mem_2.mem_channel_3 0 5 aximm +ext_mem_2.mem_channel_4 0 6 aximm +ext_mem_2.mem_channel_5 0 7 aximm +ext_mem_2.mem_channel_6 0 8 aximm +ext_mem_2.mem_channel_7 0 9 aximm +ext_mem_3.mem_channel_0 0 72 aximm +ext_mem_3.mem_channel_1 0 73 aximm +ext_mem_3.mem_channel_2 0 74 aximm +ext_mem_3.mem_channel_3 0 75 aximm +ext_mem_3.mem_channel_4 0 76 aximm +ext_mem_3.mem_channel_5 0 77 aximm +ext_mem_3.mem_channel_6 0 78 aximm +ext_mem_3.mem_channel_7 0 79 aximm +embedding_lookup_inst.aximm_req_interface_0 0 11 aximm +embedding_lookup_inst.aximm_req_interface_1 0 61 aximm +embedding_lookup_inst.aximm_req_interface_2 0 12 aximm +embedding_lookup_inst.aximm_req_interface_3 0 13 aximm +embedding_lookup_inst.aximm_req_interface_4 0 14 aximm +embedding_lookup_inst.aximm_req_interface_5 0 15 aximm +embedding_lookup_inst.aximm_req_interface_6 0 16 aximm +embedding_lookup_inst.aximm_req_interface_7 0 17 aximm +embedding_lookup_inst.aximm_req_interface_8 0 18 aximm 
+embedding_lookup_inst.aximm_req_interface_9 0 19 aximm +embedding_lookup_inst.aximm_req_interface_10 0 62 aximm +embedding_lookup_inst.aximm_req_interface_11 0 63 aximm +embedding_lookup_inst.aximm_req_interface_12 0 64 aximm +embedding_lookup_inst.aximm_req_interface_13 0 65 aximm +embedding_lookup_inst.aximm_req_interface_14 0 66 aximm +embedding_lookup_inst.aximm_req_interface_15 0 67 aximm +embedding_lookup_inst.aximm_req_interface_16 0 68 aximm +embedding_lookup_inst.aximm_req_interface_17 0 69 aximm +feature_interaction_inst.aximm_interface_0 0 21 aximm +feature_interaction_inst.aximm_interface_1 0 51 aximm +feature_interaction_inst.aximm_interface_2 0 22 aximm +feature_interaction_inst.aximm_interface_3 0 23 aximm +feature_interaction_inst.aximm_interface_4 0 24 aximm +feature_interaction_inst.aximm_interface_5 0 25 aximm +feature_interaction_inst.aximm_interface_6 0 26 aximm +feature_interaction_inst.aximm_interface_7 0 27 aximm +feature_interaction_inst.aximm_interface_8 0 28 aximm +feature_interaction_inst.aximm_interface_9 0 29 aximm +feature_interaction_inst.aximm_interface_10 0 52 aximm +feature_interaction_inst.aximm_interface_11 0 53 aximm +feature_interaction_inst.aximm_interface_12 0 54 aximm +feature_interaction_inst.aximm_interface_13 0 55 aximm +feature_interaction_inst.aximm_interface_14 0 56 aximm +feature_interaction_inst.aximm_interface_15 0 57 aximm +feature_interaction_inst.aximm_interface_16 0 58 aximm +feature_interaction_inst.aximm_interface_17 0 59 aximm +feature_interaction_inst.axis_interface_0 0 41 axis +feature_interaction_inst.axis_interface_1 0 41 axis +feature_interaction_inst.axis_interface_2 0 41 axis +feature_interaction_inst.axis_interface_3 0 41 axis +layer0_mvm0 0 70 axis +layer0_mvm1 0 60 axis +layer0_mvm2 0 50 axis +layer0_mvm3 0 40 axis +layer1_mvm0 0 30 axis +layer1_mvm1 0 20 axis +layer2_mvm0 0 10 axis +layer2_mvm1 0 0 axis +output_collector 0 31 axis +portal_inst 0 32 axis \ No newline at end of file diff --git 
a/rad-sim/example-designs/dlrm_two_rad/dlrm_two_rad_system.cpp b/rad-sim/example-designs/dlrm_two_rad/dlrm_two_rad_system.cpp new file mode 100644 index 0000000..05087a2 --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/dlrm_two_rad_system.cpp @@ -0,0 +1,42 @@ +#include + +dlrm_two_rad_system::dlrm_two_rad_system(const sc_module_name &name, sc_clock *driver_clk_sig, RADSimDesignContext* radsim_design) + : sc_module(name) { + + // Instantiate driver + driver_inst = new dlrm_driver("driver", radsim_design); + driver_inst->clk(*driver_clk_sig); + driver_inst->rst(rst_sig); + driver_inst->lookup_indecies_data(lookup_indecies_data_sig); + driver_inst->lookup_indecies_target_channels( + lookup_indecies_target_channels_sig); + driver_inst->lookup_indecies_base_addresses( + lookup_indecies_base_addresses_sig); + driver_inst->lookup_indecies_valid(lookup_indecies_valid_sig); + driver_inst->lookup_indecies_ready(lookup_indecies_ready_sig); + driver_inst->received_responses(received_responses_sig); + driver_inst->collector_fifo_rdy(collector_fifo_rdy_sig); + driver_inst->collector_fifo_ren(collector_fifo_ren_sig); + driver_inst->collector_fifo_rdata(collector_fifo_rdata_sig); + + // Instantiate design top-level + dut_inst = new dlrm_top("dut", radsim_design); + dut_inst->rst(rst_sig); + dut_inst->lookup_indecies_data(lookup_indecies_data_sig); + dut_inst->lookup_indecies_target_channels( + lookup_indecies_target_channels_sig); + dut_inst->lookup_indecies_base_addresses(lookup_indecies_base_addresses_sig); + dut_inst->lookup_indecies_valid(lookup_indecies_valid_sig); + dut_inst->lookup_indecies_ready(lookup_indecies_ready_sig); + dut_inst->received_responses(received_responses_sig); + dut_inst->collector_fifo_rdy(collector_fifo_rdy_sig); + dut_inst->collector_fifo_ren(collector_fifo_ren_sig); + dut_inst->collector_fifo_rdata(collector_fifo_rdata_sig); + //add _top as dut instance for parent class RADSimDesignSystem + this->design_dut_inst = dut_inst; +} + 
+dlrm_two_rad_system::~dlrm_two_rad_system() { + delete driver_inst; + delete dut_inst; +} \ No newline at end of file diff --git a/rad-sim/example-designs/dlrm_two_rad/dlrm_two_rad_system.hpp b/rad-sim/example-designs/dlrm_two_rad/dlrm_two_rad_system.hpp new file mode 100644 index 0000000..d15b091 --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/dlrm_two_rad_system.hpp @@ -0,0 +1,30 @@ +#pragma once + +#include +#include +#include +#include +#include + +class dlrm_two_rad_system : public RADSimDesignSystem { //sc_module { +private: + sc_signal> lookup_indecies_data_sig; + sc_signal> lookup_indecies_target_channels_sig; + sc_signal> lookup_indecies_base_addresses_sig; + sc_signal lookup_indecies_valid_sig; + sc_signal lookup_indecies_ready_sig; + + sc_signal received_responses_sig; + + sc_signal collector_fifo_rdy_sig; + sc_signal collector_fifo_ren_sig; + sc_signal> collector_fifo_rdata_sig; + +public: + sc_signal rst_sig; + dlrm_driver *driver_inst; + dlrm_top *dut_inst; + + dlrm_two_rad_system(const sc_module_name &name, sc_clock *driver_clk_sig, RADSimDesignContext* radsim_design); + ~dlrm_two_rad_system(); +}; \ No newline at end of file diff --git a/rad-sim/example-designs/dlrm_two_rad/modules/afifo.cpp b/rad-sim/example-designs/dlrm_two_rad/modules/afifo.cpp new file mode 100644 index 0000000..98c6625 --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/modules/afifo.cpp @@ -0,0 +1,103 @@ +#include + +template +afifo::afifo(const sc_module_name &name, unsigned int depth, + unsigned int iwidth, unsigned int owidth, + unsigned int almost_full_size) + : sc_module(name), _staging_vector(owidth) { + + _wide_to_narrow = (iwidth > owidth); + _input_width = iwidth; + _output_width = owidth; + if (_wide_to_narrow) { + _width_ratio = (int)(_input_width / _output_width); + } else { + _width_ratio = (int)(_output_width / _input_width); + } + _capacity = depth; + _fifo_almost_full_size = almost_full_size; + _staging_counter = 0; + + SC_CTHREAD(Tick, 
clk.pos()); + reset_signal_is(rst, true); +} + +template afifo::~afifo() {} + +template void afifo::Tick() { + // Reset logic + while (!_mem.empty()) + _mem.pop(); + empty.write(true); + full.write(false); + almost_full.write(false); + wait(); + + // Sequential logic + while (true) { + // Pop logic + if (ren.read()) { + if (_mem.size() == 0) { + sim_log.log(error, "FIFO is underflowing!", this->name()); + } + _mem.pop(); + } + + // Push logic + data_vector wdata_vector; + if (wen.read()) { + if (_wide_to_narrow) { + if (_mem.size() > _capacity - _width_ratio) { + sim_log.log(error, + "FIFO is overflowing! Size = " + + std::to_string(_mem.size()), + this->name()); + } + wdata_vector = wdata.read(); + for (unsigned int i = 0; i < _width_ratio; i++) { + data_vector tmp(_output_width); + for (unsigned int j = 0; j < _output_width; j++) { + tmp[j] = wdata_vector[(i * _output_width) + j]; + } + _mem.push(tmp); + } + } else { + wdata_vector = wdata.read(); + for (unsigned int i = 0; i < _input_width; i++) { + _staging_vector[(_staging_counter * _input_width) + i] = + wdata_vector[i]; + } + if (_staging_counter == _width_ratio - 1) { + if (_mem.size() >= _capacity) { + sim_log.log(error, + "FIFO is overflowing! 
Size = " + + std::to_string(_mem.size()), + this->name()); + } + _staging_counter = 0; + _mem.push(_staging_vector); + } else { + _staging_counter++; + } + } + } + + empty.write(_mem.size() == 0); + if (_wide_to_narrow) + full.write(_mem.size() > _capacity - _width_ratio); + else + full.write(_mem.size() >= _capacity); + almost_full.write(_mem.size() > _capacity - 2 * _width_ratio); + data_vector tmp(_output_width); + if (_mem.size() > 0) { + tmp = _mem.front(); + } + rdata.write(tmp); + + // std::cout << this->name() << ": " << _mem.size() << " " + // << (_mem.size() >= _fifo_almost_full_size) << std::endl; + wait(); + } +} + +template class afifo; \ No newline at end of file diff --git a/rad-sim/example-designs/dlrm_two_rad/modules/afifo.hpp b/rad-sim/example-designs/dlrm_two_rad/modules/afifo.hpp new file mode 100644 index 0000000..6929cac --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/modules/afifo.hpp @@ -0,0 +1,38 @@ +#pragma once + +#include +#include +#include +#include +#include + +template class afifo : public sc_module { +private: + unsigned int _capacity; // FIFO capacity in narrower words + unsigned int _input_width; // Input width in vector elements + unsigned int _output_width; // Output width in vector elements + bool _wide_to_narrow; // Flag to specify asymmetry mode + unsigned int _width_ratio; // Ratio between input and output widths + unsigned int _fifo_almost_full_size; // Almost full size + std::queue> _mem; // Memory of the FIFO (queue of vectors) + data_vector _staging_vector; + unsigned int _staging_counter; + +public: + sc_in clk; + sc_in rst; + sc_in wen; + sc_in> wdata; + sc_in ren; + sc_out> rdata; + sc_out full; + sc_out almost_full; + sc_out empty; + + afifo(const sc_module_name &name, unsigned int depth, unsigned int iwidth, + unsigned int owidth, unsigned int almost_full_size); + ~afifo(); + + void Tick(); + SC_HAS_PROCESS(afifo); +}; \ No newline at end of file diff --git 
a/rad-sim/example-designs/dlrm_two_rad/modules/collector.cpp b/rad-sim/example-designs/dlrm_two_rad/modules/collector.cpp new file mode 100644 index 0000000..e71ba8e --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/modules/collector.cpp @@ -0,0 +1,65 @@ +#include + +collector::collector(const sc_module_name &name, RADSimDesignContext* radsim_design) + : RADSimModule(name, radsim_design), rst("rst"), data_fifo_rdy("data_fifo_rdy"), + data_fifo_ren("data_fifo_ren"), data_fifo_rdata("data_fifo_rdata") { + + module_name = name; + this->radsim_design = radsim_design; + + char fifo_name[25]; + std::string fifo_name_str; + fifo_name_str = "collector_data_fifo"; + std::strcpy(fifo_name, fifo_name_str.c_str()); + data_fifo = new fifo(fifo_name, FIFO_SIZE, LANES, FIFO_SIZE - 1, 0); + data_fifo->clk(clk); + data_fifo->rst(rst); + data_fifo->wen(data_fifo_wen_signal); + data_fifo->ren(data_fifo_ren); + data_fifo->wdata(data_fifo_wdata_signal); + data_fifo->full(data_fifo_full_signal); + data_fifo->almost_full(data_fifo_almost_full_signal); + data_fifo->empty(data_fifo_empty_signal); + data_fifo->almost_empty(data_fifo_almost_empty_signal); + data_fifo->rdata(data_fifo_rdata); + + SC_METHOD(Assign); + sensitive << rst << data_fifo_empty_signal << data_fifo_almost_full_signal + << rx_interface.tvalid << rx_interface.tdata << rx_interface.tready; + + this->RegisterModuleInfo(); +} + +collector::~collector() { delete data_fifo; } + +void collector::Assign() { + if (rst.read()) { + rx_interface.tready.write(false); + data_fifo_rdy.write(false); + } else if (radsim_design->rad_id == 1) { + rx_interface.tready.write(!data_fifo_almost_full_signal); + data_fifo_wen_signal.write(rx_interface.tvalid.read() && + rx_interface.tready.read()); + data_fifo_rdy.write(!data_fifo_empty_signal); + + data_vector tx_tdata(LANES); + sc_bv tx_tdata_bv = rx_interface.tdata.read(); + if (rx_interface.tvalid.read() && rx_interface.tready.read()) { + for (unsigned int lane_id = 0; lane_id < 
LANES; lane_id++) { + tx_tdata[lane_id] = + tx_tdata_bv.range((lane_id + 1) * BITWIDTH - 1, lane_id * BITWIDTH) + .to_int(); + } + data_fifo_wdata_signal.write(tx_tdata); + } + } +} + +void collector::RegisterModuleInfo() { + std::string port_name; + _num_noc_axis_slave_ports = 0; + _num_noc_axis_master_ports = 0; + + port_name = module_name + ".data_collect"; + RegisterAxisSlavePort(port_name, &rx_interface, DATAW, 0); +} \ No newline at end of file diff --git a/rad-sim/example-designs/dlrm_two_rad/modules/collector.hpp b/rad-sim/example-designs/dlrm_two_rad/modules/collector.hpp new file mode 100644 index 0000000..73f6f09 --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/modules/collector.hpp @@ -0,0 +1,36 @@ +#pragma once + +#include +#include +#include +#include +#include +#include +#include +#include + +class collector : public RADSimModule { +private: + std::string module_name; + + fifo *data_fifo; + sc_signal> data_fifo_wdata_signal; + sc_signal data_fifo_wen_signal, data_fifo_full_signal, + data_fifo_empty_signal, data_fifo_almost_full_signal, + data_fifo_almost_empty_signal; + +public: + RADSimDesignContext* radsim_design; + sc_in rst; + sc_out data_fifo_rdy; + sc_in data_fifo_ren; + sc_out> data_fifo_rdata; + axis_slave_port rx_interface; + + collector(const sc_module_name &name, RADSimDesignContext* radsim_design); + ~collector(); + + void Assign(); + SC_HAS_PROCESS(collector); + void RegisterModuleInfo(); +}; \ No newline at end of file diff --git a/rad-sim/example-designs/dlrm_two_rad/modules/custom_feature_interaction.cpp b/rad-sim/example-designs/dlrm_two_rad/modules/custom_feature_interaction.cpp new file mode 100644 index 0000000..b0f423e --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/modules/custom_feature_interaction.cpp @@ -0,0 +1,314 @@ +#include + +void ParseFeatureInteractionInstructions( + std::string &instructions_file, + std::vector &instructions, + std::string &responses_file, unsigned int &num_expected_responses) { + + 
std::ifstream resp_file(responses_file); + if (!resp_file) { + sim_log.log(error, "Cannot find feature interaction responses file!"); + } + std::string line; + std::getline(resp_file, line); + std::stringstream ls(line); + unsigned int lookups, num_inputs; + ls >> lookups >> num_inputs; + num_expected_responses = lookups * num_inputs; + resp_file.close(); + + std::ifstream inst_file(instructions_file); + if (!inst_file) { + sim_log.log(error, "Cannot find feature interaction instructions file!"); + } + + unsigned int fifo_id, start_element, end_element; + bool pop; + while (std::getline(inst_file, line)) { + custom_feature_interaction_inst instruction; + std::stringstream line_stream(line); + while (line_stream >> fifo_id >> start_element >> end_element >> pop) { + instruction.xbar_schedule.push_back(fifo_id); + instruction.start_element.push_back(start_element); + instruction.end_element.push_back(end_element); + instruction.pop_fifo.push_back(pop); + } + instructions.push_back(instruction); + } + inst_file.close(); +} + +custom_feature_interaction::custom_feature_interaction( + const sc_module_name &name, unsigned int dataw, + unsigned int element_bitwidth, unsigned int num_mem_channels, + unsigned int fifos_depth, unsigned int num_output_channels, + std::string &instructions_file, + RADSimDesignContext* radsim_design) + : RADSimModule(name, radsim_design) { + + this->radsim_design = radsim_design; + _fifos_depth = fifos_depth; + _num_received_responses = 0; + _num_mem_channels = num_mem_channels; + _dataw = dataw; + _bitwidth = element_bitwidth; + _num_input_elements = dataw / element_bitwidth; //512/16=32 + _num_output_elements = DATAW / element_bitwidth; + _num_output_channels = num_output_channels; + + aximm_interface.init(_num_mem_channels); + axis_interface.init(_num_output_channels); + + _input_fifos.resize(_num_mem_channels); + _ififo_full.init(_num_mem_channels); + _ififo_empty.init(_num_mem_channels); + + _output_fifos.resize(_num_output_channels); + 
_ofifo_full.init(_num_output_channels); + _ofifo_empty.init(_num_output_channels); + + std::string resp_filename = + radsim_config.GetStringKnobPerRad("radsim_user_design_root_dir", radsim_design->rad_id) + + "/compiler/embedding_indecies.in"; + ParseFeatureInteractionInstructions(instructions_file, _instructions, + resp_filename, _num_expected_responses); + + // Combinational logic and its sensitivity list + SC_METHOD(Assign); + sensitive << rst; + for (unsigned int ch_id = 0; ch_id < _num_mem_channels; ch_id++) { + sensitive << _ififo_full[ch_id]; + } + // Sequential logic and its clock/reset setup + SC_CTHREAD(Tick, clk.pos()); + reset_signal_is(rst, true); // Reset is active high + + // This function must be defined & called for any RAD-Sim module to register + // its info for automatically connecting to the NoC + this->RegisterModuleInfo(); + _debug_feature_interaction_out = new ofstream("dut_feature_interaction.out"); +} + +custom_feature_interaction::~custom_feature_interaction() { + delete _debug_feature_interaction_out; +} + +void custom_feature_interaction::Assign() { + if (rst) { + for (unsigned int ch_id = 0; ch_id < _num_mem_channels; ch_id++) { + aximm_interface[ch_id].bready.write(false); + aximm_interface[ch_id].rready.write(false); + } + } else if (radsim_design->rad_id == 0) { + // Set ready signals to accept read/write response from the AXI-MM NoC + for (unsigned int ch_id = 0; ch_id < _num_mem_channels; ch_id++) { + aximm_interface[ch_id].bready.write(false); + aximm_interface[ch_id].rready.write(!_ififo_full[ch_id].read()); + } + } +} + +void custom_feature_interaction::bv_to_data_vector( + sc_bv &bitvector, data_vector &datavector, + unsigned int num_elements) { + + unsigned int start_idx, end_idx; + for (unsigned int e = 0; e < num_elements; e++) { + start_idx = e * _bitwidth; + end_idx = (e + 1) * _bitwidth; + datavector[e] = bitvector.range(end_idx - 1, start_idx).to_int(); + } +} + +void custom_feature_interaction::data_vector_to_bv( + 
data_vector &datavector, sc_bv &bitvector, + unsigned int num_elements) { + + unsigned int start_idx, end_idx; + for (unsigned int e = 0; e < num_elements; e++) { + start_idx = e * _bitwidth; + end_idx = (e + 1) * _bitwidth; + bitvector.range(end_idx - 1, start_idx) = datavector[e]; + } +} + +bool are_ififos_ready(sc_vector> &ififo_empty, + custom_feature_interaction_inst &inst) { + + bool ready = true; + for (auto &f : inst.xbar_schedule) { + if (f == 0) + ready &= true; + else + ready &= !ififo_empty[f - 1].read(); + } + return ready; +} + +void custom_feature_interaction::Tick() { + if (radsim_design->rad_id == 0) { + // Reset ports + for (unsigned int ch_id = 0; ch_id < _num_mem_channels; ch_id++) { + aximm_interface[ch_id].arvalid.write(false); + aximm_interface[ch_id].awvalid.write(false); + aximm_interface[ch_id].wvalid.write(false); + } + received_responses.write(0); + // Reset signals + for (unsigned int ch_id = 0; ch_id < _num_mem_channels; ch_id++) { + _ififo_full[ch_id].write(false); + _ififo_empty[ch_id].write(true); + } + for (unsigned int ch_id = 0; ch_id < _num_output_channels; ch_id++) { + _ofifo_full[ch_id].write(false); + _ofifo_empty[ch_id].write(true); + axis_interface[ch_id].tvalid.write(false); + } + _dest_ofifo.write(0); + _pc.write(0); + wait(); + + int no_val_counter = 0; + bool got_all_mem_responses = false; + + // Always @ positive edge of the clock + while (true ) { //&& (radsim_design->rad_id == 0)) { + // Accept R responses from the NoC + for (unsigned int ch_id = 0; ch_id < _num_mem_channels; ch_id++) { + if (_input_fifos[ch_id].size() < _fifos_depth && + aximm_interface[ch_id].rvalid.read()) { + sc_bv rdata_bv = aximm_interface[ch_id].rdata.read(); + data_vector rdata(_num_input_elements); + bv_to_data_vector(rdata_bv, rdata, _num_input_elements); + _input_fifos[ch_id].push(rdata); + _num_received_responses++; + if (_num_received_responses == _num_expected_responses) { + std::cout << this->name() << ": Got all memory responses at 
cycle " + << GetSimulationCycle(5.0) << "!" << std::endl; + got_all_mem_responses = true; + } + } + } + + // Pop from input FIFOs + bool ififos_ready = + are_ififos_ready(_ififo_empty, _instructions[_pc.read()]); + if (ififos_ready && !_ofifo_full[_dest_ofifo.read()]) { + data_vector ofifo_data_vector(_num_output_elements); + custom_feature_interaction_inst instruction = _instructions[_pc.read()]; + unsigned int num_steps = instruction.xbar_schedule.size(); + unsigned int element_id = 0; + unsigned int fifo_id, start_idx, end_idx; + for (unsigned int step = 0; step < num_steps; step++) { + fifo_id = instruction.xbar_schedule[step]; + start_idx = instruction.start_element[step]; + end_idx = instruction.end_element[step]; + data_vector tmp(_num_input_elements); + if (fifo_id != 0) { + tmp = _input_fifos[fifo_id - 1].front(); + if (instruction.pop_fifo[step]) { + _input_fifos[fifo_id - 1].pop(); + } + } + for (unsigned int element = start_idx; element <= end_idx; element++) { + assert(element_id < ofifo_data_vector.size()); + ofifo_data_vector[element_id] = tmp[element]; + element_id++; + } + } + if (fifo_id != 0) { + *_debug_feature_interaction_out << ofifo_data_vector << "\n"; + _debug_feature_interaction_out->flush(); + } + _output_fifos[_dest_ofifo.read()].push(ofifo_data_vector); + + // Advance destination FIFO pointer + if (_dest_ofifo.read() == _num_output_channels - 1) { + _dest_ofifo.write(0); + } else { + _dest_ofifo.write(_dest_ofifo.read() + 1); + } + + // Advance Instructions Pointer + if (_pc.read() == _instructions.size() - 1) { + _pc.write(0); + } else { + _pc.write(_pc.read() + 1); + } + } + + // Interface with AXI-S NoC + bool non_empty_output_fifo = false; + for (unsigned int ch_id = 0; ch_id < _num_output_channels; ch_id++) { + if (axis_interface[ch_id].tready.read() && + axis_interface[ch_id].tvalid.read()) { + int curr_cycle = GetSimulationCycle(radsim_config.GetDoubleKnobShared("sim_driver_period")); + data_vector tx_tdata = 
_output_fifos[ch_id].front(); + //std::cout << "custom_feature_interaction @ cycle " << curr_cycle << ": tx_tdata sent " << tx_tdata << " from RAD " << radsim_design->rad_id << " with tdest field " << axis_interface[ch_id].tdest.read() << std::endl; + _output_fifos[ch_id].pop(); + } + + if ( (!_output_fifos[ch_id].empty()) ) { //&& (radsim_design->rad_id == 0) ) { + non_empty_output_fifo = true; + data_vector tx_tdata = _output_fifos[ch_id].front(); + //std::cout << "custom_feature_interaction: tx_tdata sent " << tx_tdata << " from RAD " << radsim_design->rad_id << std::endl; + sc_bv tx_tdata_bv; + data_vector_to_bv(tx_tdata, tx_tdata_bv, _num_output_elements); + axis_interface[ch_id].tvalid.write(true); + axis_interface[ch_id].tdata.write(tx_tdata_bv); + axis_interface[ch_id].tuser.write(3 << 13); + axis_interface[ch_id].tid.write(0); + std::string dest_name = + "layer0_mvm" + std::to_string(ch_id) + ".rx_interface"; + //std::cout << "radsim_design->GetPortDestinationID(dest_name) on RAD " << radsim_design->rad_id << ": " << radsim_design->GetPortDestinationID(dest_name) << std::endl; + sc_bv dest_id_concat; + DEST_RAD(dest_id_concat) = 1; //radsim_design->rad_id; + DEST_LOCAL_NODE(dest_id_concat) = radsim_design->GetPortDestinationID(dest_name); + DEST_REMOTE_NODE(dest_id_concat) = radsim_design->GetPortDestinationID(dest_name); + axis_interface[ch_id].tdest.write( + dest_id_concat); + //radsim_design->GetPortDestinationID(dest_name)); + no_val_counter = 0; + } else { + axis_interface[ch_id].tvalid.write(false); + no_val_counter++; + } + } + + // Set FIFO signals + for (unsigned int ch_id = 0; ch_id < _num_mem_channels; ch_id++) { + _ififo_empty[ch_id].write(_input_fifos[ch_id].empty()); + _ififo_full[ch_id].write(_input_fifos[ch_id].size() >= _fifos_depth - 4); + } + for (unsigned int ch_id = 0; ch_id < _num_output_channels; ch_id++) { + _ofifo_empty[ch_id].write(_output_fifos[ch_id].empty()); + _ofifo_full[ch_id].write(_output_fifos[ch_id].size() >= 
_fifos_depth - 2); + } + received_responses.write(_num_received_responses); + + if (non_empty_output_fifo && got_all_mem_responses) { + radsim_design->set_rad_done(); + } + + wait(); + } + } +} + +void custom_feature_interaction::RegisterModuleInfo() { + std::string port_name; + _num_noc_axis_slave_ports = 0; + _num_noc_axis_master_ports = 0; + _num_noc_aximm_slave_ports = 0; + _num_noc_aximm_master_ports = 0; + + for (unsigned int ch_id = 0; ch_id < _num_mem_channels; ch_id++) { + port_name = module_name + ".aximm_interface_" + std::to_string(ch_id); + RegisterAximmMasterPort(port_name, &aximm_interface[ch_id], _dataw); + } + + for (unsigned int ch_id = 0; ch_id < _num_output_channels; ch_id++) { + port_name = module_name + ".axis_interface_" + std::to_string(ch_id); + RegisterAxisMasterPort(port_name, &axis_interface[ch_id], DATAW, 0); + } +} diff --git a/rad-sim/example-designs/dlrm_two_rad/modules/custom_feature_interaction.hpp b/rad-sim/example-designs/dlrm_two_rad/modules/custom_feature_interaction.hpp new file mode 100644 index 0000000..bf481e0 --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/modules/custom_feature_interaction.hpp @@ -0,0 +1,77 @@ +#pragma once + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +struct custom_feature_interaction_inst { + std::vector xbar_schedule; + std::vector start_element; + std::vector end_element; + std::vector pop_fifo; +}; + +class custom_feature_interaction : public RADSimModule { +private: + unsigned int _fifos_depth; // Depth of input/output FIFOs + std::vector _instructions; // Instruction mem + sc_signal _pc; // Program counter + + std::vector>> _input_fifos; // Input FIFOs + sc_vector> _ififo_full; // Signals FIFOs full + sc_vector> _ififo_empty; // Signals iFIFOs empty + + std::vector>> _output_fifos; // Output FIFO + sc_vector> _ofifo_full; // Signals oFIFO full + sc_vector> _ofifo_empty; // Signals oFIFO empty + sc_signal 
_dest_ofifo; + + unsigned int _num_mem_channels; // No. of memory channels + unsigned int _dataw; // Data interface bitwidth + unsigned int _num_received_responses; + unsigned int _num_input_elements; + unsigned int _num_output_elements; + unsigned int _bitwidth; + unsigned int _num_output_channels; + unsigned int _num_expected_responses; + + ofstream *_debug_feature_interaction_out; + +public: + RADSimDesignContext* radsim_design; + sc_in rst; + // Interface to driver logic + sc_out received_responses; + // Interface to the NoC + sc_vector aximm_interface; + sc_vector axis_interface; + + custom_feature_interaction(const sc_module_name &name, unsigned int dataw, + unsigned int element_bitwidth, + unsigned int num_mem_channels, + unsigned int fifos_depth, + unsigned int num_output_channels, + std::string &instructions_file, + RADSimDesignContext* radsim_design); + ~custom_feature_interaction(); + + void Assign(); // Combinational logic process + void Tick(); // Sequential logic process + void bv_to_data_vector(sc_bv &bitvector, + data_vector &datavector, + unsigned int num_elements); + void data_vector_to_bv(data_vector &datavector, + sc_bv &bitvector, + unsigned int num_elements); + SC_HAS_PROCESS(custom_feature_interaction); + void RegisterModuleInfo(); +}; \ No newline at end of file diff --git a/rad-sim/example-designs/dlrm_two_rad/modules/dlrm_defines.hpp b/rad-sim/example-designs/dlrm_two_rad/modules/dlrm_defines.hpp new file mode 100644 index 0000000..139ca30 --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/modules/dlrm_defines.hpp @@ -0,0 +1,11 @@ +#define BITWIDTH 16 +#define LANES 32 +#define FIFO_SIZE 512 +#define COMPUTE_LATENCY 10 +#define RF_MEM_DEPTH 512 +#define ACCUM_MEM_DEPTH 64 +#define INST_MEM_DEPTH 2048 +#define DOT_PRODUCTS LANES +#define DATAW (BITWIDTH * LANES) +#define TDATA_ELEMS 32 +#define TDATA_WIDTH 16 diff --git a/rad-sim/example-designs/dlrm_two_rad/modules/embedding_lookup.cpp 
b/rad-sim/example-designs/dlrm_two_rad/modules/embedding_lookup.cpp new file mode 100644 index 0000000..4f678fa --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/modules/embedding_lookup.cpp @@ -0,0 +1,203 @@ +#include + +embedding_lookup::embedding_lookup( + const sc_module_name &name, unsigned int dataw, + std::vector &num_mem_channels_per_controller, + unsigned int fifo_depth, RADSimDesignContext* radsim_design) + : RADSimModule(name, radsim_design) { + + this->radsim_design = radsim_design; + + _total_num_channels = 0; + unsigned int ctrl_id = 0; + for (auto &num_channels : num_mem_channels_per_controller) { + _num_channels_per_ctrl.push_back(num_channels); + _total_num_channels += num_channels; + for (unsigned int ch_id = 0; ch_id < num_channels; ch_id++) { + std::string port_name = + "ext_mem_" + to_string(ctrl_id) + ".mem_channel_" + to_string(ch_id); + _dst_port_names.push_back(port_name); + } + ctrl_id++; + } + _lookup_indecies_fifo.resize(_total_num_channels); + _base_addresses_fifo.resize(_total_num_channels); + _fifo_depth = fifo_depth; + _fifo_full.init(_total_num_channels); + _id_count.init(_total_num_channels); + _num_received_responses = 0; + _dataw = dataw; + + aximm_req_interface.init(_total_num_channels); + + // Combinational logic and its sensitivity list + SC_METHOD(Assign); + sensitive << rst; + for (unsigned int ch_id = 0; ch_id < _total_num_channels; ch_id++) { + sensitive << _fifo_full[ch_id]; + } + // Sequential logic and its clock/reset setup + SC_CTHREAD(Tick, clk.pos()); + reset_signal_is(rst, true); // Reset is active high + + // This function must be defined & called for any RAD-Sim module to register + // its info for automatically connecting to the NoC + this->RegisterModuleInfo(); + _debug_sent_request_counter = 0; +} + +embedding_lookup::~embedding_lookup() {} + +void embedding_lookup::Assign() { + if (rst) { + lookup_indecies_ready.write(true); + for (unsigned int ch_id = 0; ch_id < _total_num_channels; ch_id++) { + 
aximm_req_interface[ch_id].bready.write(false); + aximm_req_interface[ch_id].rready.write(false); + } + } else if ((radsim_design->rad_id == 0)) { + bool all_fifos_not_full = true; + + // Always ready to accept read/write response from the AXI-MM NoC + // interface + for (unsigned int ch_id = 0; ch_id < _total_num_channels; ch_id++) { + aximm_req_interface[ch_id].bready.write(true); + aximm_req_interface[ch_id].rready.write(true); + all_fifos_not_full = all_fifos_not_full && !_fifo_full[ch_id].read(); + } + + // Ready to accept new lookup indecies from driver testbench as long as + // none of the FIFOs are full + lookup_indecies_ready.write(all_fifos_not_full); + } +} + +void embedding_lookup::Tick() { + if (radsim_design->rad_id == 0) { + // Reset logic + for (unsigned int ch_id = 0; ch_id < _total_num_channels; ch_id++) { + aximm_req_interface[ch_id].arvalid.write(false); + aximm_req_interface[ch_id].awvalid.write(false); + aximm_req_interface[ch_id].wvalid.write(false); + while (!_lookup_indecies_fifo[ch_id].empty()) { + _lookup_indecies_fifo[ch_id].pop(); + } + while (!_base_addresses_fifo[ch_id].empty()) { + _base_addresses_fifo[ch_id].pop(); + } + _fifo_full[ch_id].write(false); + _id_count[ch_id].write(0); + } + wait(); + + // Always @ positive edge of the clock + while (true) { //&& (radsim_design->rad_id == 0)) { + if (lookup_indecies_ready.read() && lookup_indecies_valid.read()) { + data_vector lookup_indecies = lookup_indecies_data.read(); + data_vector target_channels = + lookup_indecies_target_channels.read(); + data_vector base_addresses = + lookup_indecies_base_addresses.read(); + for (unsigned int i = 0; i < lookup_indecies.size(); i++) { + _lookup_indecies_fifo[target_channels[i]].push(lookup_indecies[i]); + _base_addresses_fifo[target_channels[i]].push(base_addresses[i]); + } + // std::cout << module_name << ": Received lookup indecies" << std::endl; + } + + // Set FIFO full signals + for (unsigned int ch_id = 0; ch_id < _total_num_channels; 
ch_id++) { + _fifo_full[ch_id].write(_lookup_indecies_fifo[ch_id].size() >= + _fifo_depth - 4); + } + + // Sending transactions to AXI-MM NoC + unsigned int ch_id = 0; + for (unsigned int ctrl_id = 0; ctrl_id < _num_channels_per_ctrl.size(); + ctrl_id++) { + for (unsigned int c = 0; c < _num_channels_per_ctrl[ctrl_id]; c++) { + if (!_lookup_indecies_fifo[ch_id].empty()) { + uint64_t lookup_index = _lookup_indecies_fifo[ch_id].front(); + uint64_t table_base_addr = _base_addresses_fifo[ch_id].front(); + + std::string dst_port_name = _dst_port_names[ch_id]; + uint64_t dst_addr = radsim_design->GetPortBaseAddress(dst_port_name) + + table_base_addr + lookup_index; + std::string src_port_name = + "feature_interaction_inst.aximm_interface_" + + std::to_string(ch_id); + uint64_t src_addr = radsim_design->GetPortBaseAddress(src_port_name); + + /*if (ctrl_id == 0) { + std::cout << "Base address: " << table_base_addr << std::endl; + std::cout << "Index: " << lookup_index << std::endl; + }*/ + + aximm_req_interface[ch_id].araddr.write(dst_addr); + aximm_req_interface[ch_id].arid.write(_id_count[ch_id].read()); + aximm_req_interface[ch_id].arlen.write(0); + aximm_req_interface[ch_id].arburst.write(0); + aximm_req_interface[ch_id].arsize.write(0); + aximm_req_interface[ch_id].aruser.write(src_addr); + aximm_req_interface[ch_id].arvalid.write(true); + aximm_req_interface[ch_id].awvalid.write(false); + aximm_req_interface[ch_id].wvalid.write(false); + } else { + aximm_req_interface[ch_id].arvalid.write(false); + aximm_req_interface[ch_id].awvalid.write(false); + aximm_req_interface[ch_id].wvalid.write(false); + } + + // Pop the FIFO if the transaction is accepted + if (aximm_req_interface[ch_id].arvalid.read() && + aximm_req_interface[ch_id].arready.read()) { + /*if (ctrl_id == 0) { + std::cout << "ELU sent address " + << aximm_req_interface[ch_id].araddr.read().to_uint64() + << std::endl; + cin.get(); + }*/ + _lookup_indecies_fifo[ch_id].pop(); + 
_base_addresses_fifo[ch_id].pop(); + _id_count[ch_id].write(_id_count[ch_id].read() + 1); + /*_debug_sent_request_counter++; + std::cout << module_name << ": Sent AR transaction " + << _debug_sent_request_counter << " @ channel " << ch_id + << "!" << std::endl;*/ + } + + // Receiving transactions from AXI-MM NoC + if (aximm_req_interface[ch_id].rvalid.read() && + aximm_req_interface[ch_id].rready.read()) { + /*std::cout << module_name << ": Received READ response " + << _num_received_responses << " (" + << aximm_req_interface[ch_id].rdata.read() << ")!" + << std::endl;*/ + _num_received_responses++; + } else if (aximm_req_interface[ch_id].bvalid.read() && + aximm_req_interface[ch_id].bready.read()) { + // std::cout << module_name << ": Received WRITE response!" << + // std::endl; + _num_received_responses++; + } + ch_id++; + } + } + wait(); + } + } +} + +void embedding_lookup::RegisterModuleInfo() { + std::string port_name; + _num_noc_axis_slave_ports = 0; + _num_noc_axis_master_ports = 0; + _num_noc_aximm_slave_ports = 0; + _num_noc_aximm_master_ports = 0; + + for (unsigned int ch_id = 0; ch_id < _total_num_channels; ch_id++) { + port_name = module_name + ".aximm_req_interface_" + std::to_string(ch_id); + // std::cout << "----" << port_name << std::endl; + RegisterAximmMasterPort(port_name, &aximm_req_interface[ch_id], _dataw); + } +} diff --git a/rad-sim/example-designs/dlrm_two_rad/modules/embedding_lookup.hpp b/rad-sim/example-designs/dlrm_two_rad/modules/embedding_lookup.hpp new file mode 100644 index 0000000..19e3313 --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/modules/embedding_lookup.hpp @@ -0,0 +1,51 @@ +#pragma once + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +class embedding_lookup : public RADSimModule { +private: + std::vector> + _lookup_indecies_fifo; // Lookup indecies FIFO per channel + std::vector> + _base_addresses_fifo; // Base addresses FIFO per channel + unsigned int _fifo_depth; // 
Depth of request FIFOs + sc_vector> _fifo_full; // Signals flagging FIFOs are full + sc_vector> _id_count; // Counters for transaction IDs + unsigned int _num_received_responses; // Counter for received responses + std::vector _num_channels_per_ctrl; // # channels / controller + unsigned int _total_num_channels; // Total number of memory channels + unsigned int _dataw; // Data interface bitwidth + std::vector _dst_port_names; // Mem controller port names + + unsigned int _debug_sent_request_counter; + +public: + RADSimDesignContext* radsim_design; + sc_in rst; + // Interface to driver logic + sc_in> lookup_indecies_data; + sc_in> lookup_indecies_target_channels; + sc_in> lookup_indecies_base_addresses; + sc_in lookup_indecies_valid; + sc_out lookup_indecies_ready; + // Interface to the NoC + sc_vector aximm_req_interface; + + embedding_lookup(const sc_module_name &name, unsigned int dataw, + std::vector &num_mem_channels_per_controller, + unsigned int fifo_depth, RADSimDesignContext* radsim_design); + ~embedding_lookup(); + + void Assign(); // Combinational logic process + void Tick(); // Sequential logic process + SC_HAS_PROCESS(embedding_lookup); + void RegisterModuleInfo(); +}; \ No newline at end of file diff --git a/rad-sim/example-designs/dlrm_two_rad/modules/feature_interaction.cpp b/rad-sim/example-designs/dlrm_two_rad/modules/feature_interaction.cpp new file mode 100644 index 0000000..1479095 --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/modules/feature_interaction.cpp @@ -0,0 +1,360 @@ +#include + +void ParseFeatureInteractionInstructions( + std::string &instructions_file, + std::vector &instructions, + std::string &responses_file, unsigned int &num_expected_responses) { + + std::ifstream resp_file(responses_file); + if (!resp_file) { + sim_log.log(error, "Cannot find feature interaction responses file!"); + } + std::string line; + std::getline(resp_file, line); + std::stringstream ls(line); + unsigned int lookups, num_inputs; + ls >> lookups
>> num_inputs; + num_expected_responses = lookups * num_inputs; + resp_file.close(); + + std::ifstream inst_file(instructions_file); + if (!inst_file) { + sim_log.log(error, "Cannot find feature interaction instructions file!"); + } + + unsigned int mux_select; + std::string pop_signals; + while (std::getline(inst_file, line)) { + feature_interaction_inst instruction; + std::stringstream line_stream(line); + line_stream >> mux_select; + instruction.mux_select = mux_select; + line_stream >> pop_signals; + uint8_t idx = 0; + for (char &c : pop_signals) { + if (c == '1') { + instruction.fifo_pops.push_back(true); + } else { + instruction.fifo_pops.push_back(false); + } + idx++; + } + instructions.push_back(instruction); + } + inst_file.close(); +} + +feature_interaction::feature_interaction(const sc_module_name &name, + unsigned int dataw, + unsigned int element_bitwidth, + unsigned int num_mem_channels, + unsigned int fifos_depth, + unsigned int num_output_channels, + std::string &instructions_file, + RADSimDesignContext* radsim_design) + : RADSimModule(name, radsim_design) { + this->radsim_design = radsim_design; + _fifos_depth = fifos_depth; + _afifo_width_ratio_in = 32 / 4; + _afifo_width_ratio_out = LANES / 4; + _num_received_responses = 0; + _num_mem_channels = num_mem_channels; + _dataw = dataw; + _bitwidth = element_bitwidth; + _num_elements_wide_in = dataw / element_bitwidth; + _num_elements_narrow = _num_elements_wide_in / _afifo_width_ratio_in; + _num_elements_wide_out = _num_elements_narrow * _afifo_width_ratio_out; + _num_output_channels = num_output_channels; + _staging_counter = 0; + _staging_data.resize(_num_elements_wide_out); + + aximm_interface.init(_num_mem_channels); + axis_interface.init(_num_output_channels); + + _input_fifos.resize(_num_mem_channels); + _ififo_full.init(_num_mem_channels); + _ififo_empty.init(_num_mem_channels); + + _output_fifos.resize(_num_output_channels); + _ofifo_full.init(_num_output_channels); + 
_ofifo_empty.init(_num_output_channels); + + std::string resp_filename = + radsim_config.GetStringKnobPerRad("radsim_user_design_root_dir", radsim_design->rad_id) + + "/compiler/embedding_indecies.in"; + ParseFeatureInteractionInstructions(instructions_file, _instructions, + resp_filename, _num_expected_responses); + + // Combinational logic and its sensitivity list + SC_METHOD(Assign); + sensitive << rst; + for (unsigned int ch_id = 0; ch_id < _num_mem_channels; ch_id++) { + sensitive << _ififo_full[ch_id]; + } + // Sequential logic and its clock/reset setup + SC_CTHREAD(Tick, clk.pos()); + reset_signal_is(rst, true); // Reset is active high + + // This function must be defined & called for any RAD-Sim module to register + // its info for automatically connecting to the NoC + this->RegisterModuleInfo(); + _debug_feature_interaction_out = new ofstream("dut_feature_interaction.out"); +} + +feature_interaction::~feature_interaction() { + delete _debug_feature_interaction_out; +} + +void feature_interaction::Assign() { + if (rst) { + for (unsigned int ch_id = 0; ch_id < _num_mem_channels; ch_id++) { + aximm_interface[ch_id].bready.write(false); + aximm_interface[ch_id].rready.write(false); + } + } else { + // Set ready signals to accept read/write response from the AXI-MM NoC + for (unsigned int ch_id = 0; ch_id < _num_mem_channels; ch_id++) { + aximm_interface[ch_id].bready.write(false); + aximm_interface[ch_id].rready.write(!_ififo_full[ch_id].read()); + } + } +} + +void feature_interaction::bv_to_data_vector(sc_bv &bitvector, + data_vector &datavector, + unsigned int num_elements) { + + unsigned int start_idx, end_idx; + for (unsigned int e = 0; e < num_elements; e++) { + start_idx = e * _bitwidth; + end_idx = (e + 1) * _bitwidth; + datavector[e] = bitvector.range(end_idx - 1, start_idx).to_int(); + } +} + +void feature_interaction::data_vector_to_bv(data_vector &datavector, + sc_bv &bitvector, + unsigned int num_elements) { + + unsigned int start_idx, end_idx; + 
for (unsigned int e = 0; e < num_elements; e++) { + start_idx = e * _bitwidth; + end_idx = (e + 1) * _bitwidth; + bitvector.range(end_idx - 1, start_idx) = datavector[e]; + } +} + +bool are_ififos_ready(sc_vector> &ififo_empty, + feature_interaction_inst &inst) { + + bool ready = true; + bool fifos_popped = (inst.mux_select > 0); + for (unsigned int ch_id = 0; ch_id < inst.fifo_pops.size(); ch_id++) { + if (inst.fifo_pops[ch_id]) { + fifos_popped = true; + ready &= !ififo_empty[ch_id].read(); + } + } + return (ready && fifos_popped); +} + +void feature_interaction::Tick() { + // Reset ports + for (unsigned int ch_id = 0; ch_id < _num_mem_channels; ch_id++) { + aximm_interface[ch_id].arvalid.write(false); + aximm_interface[ch_id].awvalid.write(false); + aximm_interface[ch_id].wvalid.write(false); + } + // feature_interaction_valid.write(false); + received_responses.write(0); + // Reset signals + for (unsigned int ch_id = 0; ch_id < _num_mem_channels; ch_id++) { + _ififo_full[ch_id].write(false); + _ififo_empty[ch_id].write(true); + } + for (unsigned int ch_id = 0; ch_id < _num_output_channels; ch_id++) { + _ofifo_full[ch_id].write(false); + _ofifo_empty[ch_id].write(true); + axis_interface[ch_id].tvalid.write(false); + } + _dest_ofifo.write(0); + _src_ofifo.write(0); + _pc.write(0); + wait(); + + // Always @ positive edge of the clock + while (true) { + // Accept R responses from the NoC + for (unsigned int ch_id = 0; ch_id < _num_mem_channels; ch_id++) { + if (_input_fifos[ch_id].size() < _fifos_depth && + aximm_interface[ch_id].rvalid.read()) { + sc_bv rdata_bv = aximm_interface[ch_id].rdata.read(); + data_vector rdata(_num_elements_wide_in); + bv_to_data_vector(rdata_bv, rdata, _num_elements_wide_in); + for (unsigned int c = 0; c < _afifo_width_ratio_in; c++) { + data_vector sliced_data(_num_elements_narrow); + for (unsigned int e = 0; e < sliced_data.size(); e++) { + sliced_data[e] = rdata[(c * sliced_data.size()) + e]; + } + 
_input_fifos[ch_id].push(sliced_data); + } + _num_received_responses++; + if (_num_received_responses == _num_expected_responses) { + std::cout << this->name() << ": Got all memory responses at cycle " + << GetSimulationCycle(5.0) << "!" << std::endl; + } + // std::cout << GetSimulationCycle(5.0) << " === " + // << "Pushed response to iFIFO " << rdata << std::endl; + } + } + + // Pop from input FIFOs to staging register + bool ififos_ready = + are_ififos_ready(_ififo_empty, _instructions[_pc.read()]); + if (ififos_ready && !_ofifo_full[_dest_ofifo.read()]) { + // Pick the right iFIFO (or zeros) to push to staging register + unsigned int mux_select = _instructions[_pc.read()].mux_select; + if (mux_select == _num_mem_channels + 1) { + for (unsigned int e = 0; e < _num_elements_narrow; e++) { + _staging_data[(_staging_counter * _num_elements_narrow) + e] = 0; + } + } else if (mux_select > 0) { + data_vector popped_data = _input_fifos[mux_select - 1].front(); + for (unsigned int e = 0; e < _num_elements_narrow; e++) { + _staging_data[(_staging_counter * _num_elements_narrow) + e] = + popped_data[e]; + } + } + + if (mux_select > 0) { + if (_staging_counter == _afifo_width_ratio_out - 1) { + _staging_counter = 0; + _output_fifos[_dest_ofifo.read()].push(_staging_data); + bool padding = true; + for (unsigned int i = 0; i < _staging_data.size(); i++) { + if (_staging_data[i] != 0) { + padding = false; + break; + } + } + if (!padding) { + *_debug_feature_interaction_out << _staging_data << "\n"; + } + _debug_feature_interaction_out->flush(); + if (_dest_ofifo.read() == _num_output_channels - 1) { + _dest_ofifo.write(0); + } else { + _dest_ofifo.write(_dest_ofifo.read() + 1); + } + } else { + _staging_counter++; + } + } + + // Pop selected iFIFOs + for (unsigned int ch_id = 0; ch_id < _num_mem_channels; ch_id++) { + if (_instructions[_pc.read()].fifo_pops[ch_id]) { + _input_fifos[ch_id].pop(); + } + } + + // Advance Instructions Pointer + if (_pc.read() == 
_instructions.size() - 1) { + _pc.write(0); + assert(_staging_counter == 0); + } else { + _pc.write(_pc.read() + 1); + } + } + + // Interface with AXI-S NoC + for (unsigned int ch_id = 0; ch_id < _num_output_channels; ch_id++) { + if (axis_interface[ch_id].tready.read() && + axis_interface[ch_id].tvalid.read()) { + _output_fifos[ch_id].pop(); + // std::cout << "FI sent out vector to MVM " << ch_id << " at cycle " + // << GetSimulationCycle(5.0) << std::endl; + } + + if (!_output_fifos[ch_id].empty()) { + data_vector tx_tdata = _output_fifos[ch_id].front(); + sc_bv tx_tdata_bv; + data_vector_to_bv(tx_tdata, tx_tdata_bv, _num_elements_wide_out); + axis_interface[ch_id].tvalid.write(true); + axis_interface[ch_id].tdata.write(tx_tdata_bv); + axis_interface[ch_id].tuser.write(3 << 13); + axis_interface[ch_id].tid.write(0); + std::string dest_name = + "layer0_mvm" + std::to_string(ch_id) + ".rx_interface"; + sc_bv dest_id_concat = radsim_design->GetPortDestinationID(dest_name); + DEST_RAD(dest_id_concat) = radsim_design->rad_id; + axis_interface[ch_id].tdest.write( + dest_id_concat); + //radsim_design->GetPortDestinationID(dest_name)); + } else { + axis_interface[ch_id].tvalid.write(false); + } + } + + // Interface with testbench + /*if (!_ofifo_empty[_src_ofifo.read()] && feature_interaction_ready.read()) + { feature_interaction_valid.write(true); + feature_interaction_odata.write(_output_fifos[_src_ofifo.read()].front()); + _output_fifos[_src_ofifo.read()].pop(); + if (_src_ofifo.read() == _num_output_channels - 1) { + _src_ofifo.write(0); + } else { + _src_ofifo.write(_src_ofifo.read() + 1); + } + } else { + feature_interaction_valid.write(false); + }*/ + + // Set FIFO signals + for (unsigned int ch_id = 0; ch_id < _num_mem_channels; ch_id++) { + _ififo_empty[ch_id].write(_input_fifos[ch_id].empty()); + _ififo_full[ch_id].write(_input_fifos[ch_id].size() >= + (_fifos_depth - _afifo_width_ratio_in)); + } + for (unsigned int ch_id = 0; ch_id < _num_output_channels; 
ch_id++) { + _ofifo_empty[ch_id].write(_output_fifos[ch_id].empty()); + _ofifo_full[ch_id].write(_output_fifos[ch_id].size() >= _fifos_depth - 2); + } + received_responses.write(_num_received_responses); + /*for (unsigned int ch_id = 0; ch_id < _num_mem_channels; ch_id++) { + std::cout << "iFIFO " << ch_id + << " occupancy = " << _input_fifos[ch_id].size() << std::endl; + } + for (unsigned int ch_id = 0; ch_id < _num_output_channels; ch_id++) { + std::cout << "oFIFO " << ch_id + << " occupancy = " << _output_fifos[ch_id].size() << std::endl; + } + for (unsigned int i = 0; i < _num_mem_channels; i++) { + std::cout << this->name() << " - " << i << ": " << _input_fifos[i].size() + << std::endl; + } + for (unsigned int i = 0; i < _num_output_channels; i++) { + std::cout << this->name() << " - " << i << ": " << _output_fifos[i].size() + << std::endl; + }*/ + wait(); + } +} + +void feature_interaction::RegisterModuleInfo() { + std::string port_name; + _num_noc_axis_slave_ports = 0; + _num_noc_axis_master_ports = 0; + _num_noc_aximm_slave_ports = 0; + _num_noc_aximm_master_ports = 0; + + for (unsigned int ch_id = 0; ch_id < _num_mem_channels; ch_id++) { + port_name = module_name + ".aximm_interface_" + std::to_string(ch_id); + RegisterAximmMasterPort(port_name, &aximm_interface[ch_id], _dataw); + } + + for (unsigned int ch_id = 0; ch_id < _num_output_channels; ch_id++) { + port_name = module_name + ".axis_interface_" + std::to_string(ch_id); + RegisterAxisMasterPort(port_name, &axis_interface[ch_id], DATAW, 0); + } +} diff --git a/rad-sim/example-designs/dlrm_two_rad/modules/feature_interaction.hpp b/rad-sim/example-designs/dlrm_two_rad/modules/feature_interaction.hpp new file mode 100644 index 0000000..88e1528 --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/modules/feature_interaction.hpp @@ -0,0 +1,79 @@ +#pragma once + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +struct 
feature_interaction_inst { + unsigned int mux_select; + std::vector fifo_pops; +}; + +class feature_interaction : public RADSimModule { +private: + unsigned int _fifos_depth; // Depth of input/output FIFOs + unsigned int _afifo_width_ratio_in; + unsigned int _afifo_width_ratio_out; + std::vector _instructions; // Instruction mem + sc_signal _pc; // Program counter + + std::vector>> _input_fifos; // Input FIFOs + sc_vector> _ififo_full; // Signals FIFOs full + sc_vector> _ififo_empty; // Signals iFIFOs empty + + std::vector>> _output_fifos; // Output FIFO + sc_vector> _ofifo_full; // Signals oFIFO full + sc_vector> _ofifo_empty; // Signals oFIFO empty + sc_signal _dest_ofifo, _src_ofifo; + data_vector _staging_data; + unsigned int _staging_counter; + + unsigned int _num_mem_channels; // No. of memory channels + unsigned int _dataw; // Data interface bitwidth + unsigned int _num_received_responses; + unsigned int _num_elements_wide_in; + unsigned int _num_elements_narrow; + unsigned int _num_elements_wide_out; + unsigned int _bitwidth; + unsigned int _num_output_channels; + unsigned int _num_expected_responses; + + ofstream *_debug_feature_interaction_out; + +public: + RADSimDesignContext* radsim_design; + sc_in rst; + // Interface to driver logic + sc_out received_responses; + // Interface to the NoC + sc_vector aximm_interface; + sc_vector axis_interface; + + feature_interaction(const sc_module_name &name, unsigned int dataw, + unsigned int element_bitwidth, + unsigned int num_mem_channels, unsigned int fifos_depth, + unsigned int num_output_channels, + std::string &instructions_file, + RADSimDesignContext* radsim_design); + ~feature_interaction(); + + void Assign(); // Combinational logic process + void Tick(); // Sequential logic process + void bv_to_data_vector(sc_bv &bitvector, + data_vector &datavector, + unsigned int num_elements); + void data_vector_to_bv(data_vector &datavector, + sc_bv &bitvector, + unsigned int num_elements); + 
SC_HAS_PROCESS(feature_interaction); + void RegisterModuleInfo(); +}; \ No newline at end of file diff --git a/rad-sim/example-designs/dlrm_two_rad/modules/fifo.cpp b/rad-sim/example-designs/dlrm_two_rad/modules/fifo.cpp new file mode 100644 index 0000000..a7aef00 --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/modules/fifo.cpp @@ -0,0 +1,86 @@ +#include + +template +fifo::fifo(const sc_module_name &name, unsigned int depth, + unsigned int width, unsigned int almost_full_size, + unsigned int almost_empty_size) + : sc_module(name), wen("wen"), wdata("wdata"), ren("ren"), rdata("rdata"), + full("full"), almost_full("almost_full"), empty("empty"), + almost_empty("almost_empty") { + + dwidth = width; + capacity = depth; + fifo_almost_full_size = almost_full_size; + fifo_almost_empty_size = almost_empty_size; + + // Set clock and reset signal for SC_CTHREAD + SC_CTHREAD(Tick, clk.pos()); + reset_signal_is(rst, true); +} + +template fifo::~fifo() {} + +template bool fifo::not_full() { + return mem.size() < capacity; +} + +template unsigned int fifo::occupancy() { + return mem.size(); +} + +template void fifo::Tick() { + // Reset logic + while (!mem.empty()) + mem.pop(); + empty.write(true); + almost_empty.write(true); + full.write(false); + almost_full.write(false); + wait(); + + // Sequential logic + while (true) { + // Pop from queue if read enable signal is triggered and there is data in + // the FIFO + if (ren.read()) { + if (mem.size() == 0) + sim_log.log(error, "FIFO is underflowing!", this->name()); + mem.pop(); + } + + // Push data into the FIFO if there is enough space + if (wen.read()) { + if (mem.size() == capacity) + sim_log.log(error, "FIFO is overflowing!", this->name()); + data_vector wdata_temp = wdata.read(); + std::vector temp(wdata_temp.size()); + for (unsigned int element_id = 0; element_id < wdata_temp.size(); + element_id++) + temp[element_id] = wdata_temp[element_id]; + mem.push(temp); + } + + // Update FIFO status signals + 
empty.write(mem.empty()); + almost_empty.write(mem.size() <= fifo_almost_empty_size); + full.write(mem.size() == capacity); + almost_full.write(mem.size() >= fifo_almost_full_size); + + // Set FIFO read data output to the top of the queue -- a vector of zeros is + // produced if the queue is empty + if (mem.size() == 0) { + std::vector temp(dwidth, 0); + rdata.write(data_vector(temp)); + } else { + std::vector temp = mem.front(); + rdata.write(data_vector(temp)); + } + + wait(); + } +} + +template class fifo; +template class fifo>; +template class fifo>; +template class fifo>; \ No newline at end of file diff --git a/rad-sim/example-designs/dlrm_two_rad/modules/fifo.hpp b/rad-sim/example-designs/dlrm_two_rad/modules/fifo.hpp new file mode 100644 index 0000000..857ba99 --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/modules/fifo.hpp @@ -0,0 +1,42 @@ +#pragma once + +#include +#include +#include +#include +#include + +// This class defines a vector FIFO module. This is a "peek" FIFO where the read +// data port always shows the top of the FIFO and the read enable signal is an +// acknowledgement signal (equivalent to pop in a software queue) +template class fifo : public sc_module { +private: + unsigned int capacity; // Depth of the FIFO + unsigned int dwidth; // Width of the FIFO in number of vector elements + unsigned int fifo_almost_empty_size, + fifo_almost_full_size; // Occupancy when FIFO is considered almost + // full/empty + std::queue> mem; // FIFO storage implemented as a C++ queue + +public: + sc_in clk; + sc_in rst; + sc_in wen; + sc_in> wdata; + sc_in ren; + sc_out> rdata; + sc_out full; + sc_out almost_full; + sc_out empty; + sc_out almost_empty; + + fifo(const sc_module_name &name, unsigned int depth, unsigned int width, + unsigned int almost_full_size, unsigned int almost_empty_size); + ~fifo(); + + bool not_full(); + unsigned int occupancy(); + + void Tick(); + SC_HAS_PROCESS(fifo); +}; \ No newline at end of file diff --git 
a/rad-sim/example-designs/dlrm_two_rad/modules/instructions.cpp b/rad-sim/example-designs/dlrm_two_rad/modules/instructions.cpp new file mode 100644 index 0000000..0ca819d --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/modules/instructions.cpp @@ -0,0 +1,63 @@ +#include + +mvm_inst::mvm_inst() + : en(false), jump(false), reduce(false), accum(0), accum_en(false), + release(false), raddr(0), last(false), dest_layer(-1), dest_mvm(0) {} + +bool mvm_inst::operator==(const mvm_inst &rhs) { + return (en == rhs.en) && (jump == rhs.jump) && (reduce == rhs.reduce) && + (accum == rhs.accum) && (accum_en == rhs.accum_en) && + (release == rhs.release) && (raddr == rhs.raddr) && + (last == rhs.last) && (dest_layer == rhs.dest_layer) && + (dest_mvm == rhs.dest_mvm); +} + +void mvm_inst::from_bv(const sc_bv &inst_bv) { + this->en = inst_bv.range(0, 0).to_uint(); + this->jump = inst_bv.range(1, 1).to_uint(); + this->reduce = inst_bv.range(2, 2).to_uint(); + this->accum = inst_bv.range(11, 3).to_uint(); + this->accum_en = inst_bv.range(12, 12).to_uint(); + this->release = inst_bv.range(13, 13).to_uint(); + this->raddr = inst_bv.range(22, 14).to_uint(); + this->last = inst_bv.range(23, 23).to_uint(); + this->dest_layer = inst_bv.range(28, 24).to_int(); + this->dest_mvm = inst_bv.range(33, 29).to_uint(); +} + +sc_bv mvm_inst::to_bv() { + sc_bv inst_bv; + inst_bv.range(0, 0) = this->en; + inst_bv.range(1, 1) = this->jump; + inst_bv.range(2, 2) = this->reduce; + inst_bv.range(11, 3) = this->accum; + inst_bv.range(12, 12) = this->accum_en; + inst_bv.range(13, 13) = this->release; + inst_bv.range(22, 14) = this->raddr; + inst_bv.range(23, 23) = this->last; + inst_bv.range(28, 24) = this->dest_layer; + inst_bv.range(33, 29) = this->dest_mvm; + return inst_bv; +} + +ostream &operator<<(ostream &o, const mvm_inst &inst) { + o << "{ en:" << inst.en << " jump:" << inst.jump << " reduce:" << inst.reduce + << " accum:" << inst.accum << " accum_en:" << inst.accum_en + << " release:" << 
inst.release << " raddr:" << inst.raddr + << " last:" << inst.last << " dest_layer:" << inst.dest_layer + << " dest_mvm:" << inst.dest_mvm << " }"; + return o; +} + +void sc_trace(sc_trace_file *f, const mvm_inst &inst, const std::string &s) { + sc_trace(f, inst.en, s + "_inst_en"); + sc_trace(f, inst.jump, s + "_inst_jump"); + sc_trace(f, inst.reduce, s + "_inst_reduce"); + sc_trace(f, inst.accum, s + "_inst_accum"); + sc_trace(f, inst.accum_en, s + "_inst_accum_en"); + sc_trace(f, inst.release, s + "_inst_release"); + sc_trace(f, inst.raddr, s + "_inst_raddr"); + sc_trace(f, inst.last, s + "_inst_last"); + sc_trace(f, inst.dest_layer, s + "_dest_layer"); + sc_trace(f, inst.dest_mvm, s + "_dest_mvm"); +} \ No newline at end of file diff --git a/rad-sim/example-designs/dlrm_two_rad/modules/instructions.hpp b/rad-sim/example-designs/dlrm_two_rad/modules/instructions.hpp new file mode 100644 index 0000000..b00149b --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/modules/instructions.hpp @@ -0,0 +1,25 @@ +#pragma once + +#include +#include + +class mvm_inst : public std::error_code { +public: + bool en; + bool jump; + bool reduce; + unsigned int accum; + bool accum_en; + bool release; + unsigned int raddr; + bool last; + sc_int<5> dest_layer; + sc_uint<5> dest_mvm; + + mvm_inst(); + bool operator==(const mvm_inst &rhs); + void from_bv(const sc_bv &inst_bv); + sc_bv to_bv(); + friend ostream &operator<<(ostream &o, const mvm_inst &inst); +}; +void sc_trace(sc_trace_file *f, const mvm_inst &inst, const std::string &s); \ No newline at end of file diff --git a/rad-sim/example-designs/dlrm_two_rad/modules/mvm.cpp b/rad-sim/example-designs/dlrm_two_rad/modules/mvm.cpp new file mode 100644 index 0000000..82dbac2 --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/modules/mvm.cpp @@ -0,0 +1,610 @@ +#include + +bool ParseInstructions(std::vector &inst_mem, + const std::string &inst_filename) { + std::ifstream inst_file(inst_filename); + if (!inst_file) + return 
false; + + std::string line; + unsigned int addr = 0; + while (std::getline(inst_file, line) && (addr < inst_mem.size())) { + std::stringstream line_stream(line); + mvm_inst inst; + unsigned int value; + line_stream >> value; + inst.en = value; + line_stream >> value; + inst.jump = value; + line_stream >> value; + inst.reduce = value; + line_stream >> value; + inst.accum = value; + line_stream >> value; + inst.accum_en = value; + line_stream >> value; + inst.release = value; + line_stream >> value; + inst.raddr = value; + line_stream >> value; + inst.last = value; + line_stream >> value; + inst.dest_layer = value; + line_stream >> value; + inst.dest_mvm = value; + inst_mem[addr] = inst; + addr++; + } + return true; +} + +mvm::mvm(const sc_module_name &name, unsigned int id_mvm, unsigned int id_layer, + const std::string &inst_filename, RADSimDesignContext* radsim_design) + : RADSimModule(name, radsim_design), matrix_mem_rdata("matrix_mem_rdata", DOT_PRODUCTS), + matrix_mem_wen("matrix_mem_wen", DOT_PRODUCTS), + ififo_pipeline("ififo_pipeline", RF_RD_LATENCY), + reduce_pipeline("reduce_pipeline", RF_RD_LATENCY), + result_pipeline("result_pipeline", COMPUTE_LATENCY), + valid_pipeline("valid_pipeline", COMPUTE_LATENCY + RF_RD_LATENCY), + release_pipeline("release_pipeline", RF_RD_LATENCY), + accum_en_pipeline("accum_en_pipeline", RF_RD_LATENCY), + accum_pipeline("accum_pipeline", RF_RD_LATENCY), + dest_layer_pipeline("dest_layer_pipeline", + COMPUTE_LATENCY + RF_RD_LATENCY), + dest_mvm_pipeline("mvm_layer_pipeline", COMPUTE_LATENCY + RF_RD_LATENCY), + tdata_vec(LANES), result(DOT_PRODUCTS), rst("rst") { + + this->radsim_design = radsim_design; + module_name = name; + mvm_id = id_mvm; + layer_id = id_layer; + + inst_memory.resize(INST_MEM_DEPTH); + if (!inst_filename.empty()) { + if (!ParseInstructions(inst_memory, inst_filename)) { + std::cerr << "Parsing instructions failed!" 
<< std::endl; + exit(1); + } + } + accum_memory.resize(ACCUM_MEM_DEPTH); + + char mem_name[25]; + std::string mem_name_str; + matrix_memory.resize(DOT_PRODUCTS); + std::string mvm_dir = + radsim_config.GetStringKnobPerRad("radsim_user_design_root_dir", radsim_design->rad_id); + std::string mem_init_file; + for (unsigned int dot_id = 0; dot_id < DOT_PRODUCTS; dot_id++) { + mem_init_file = mvm_dir + "/compiler/mvm_weights/layer" + + std::to_string(layer_id) + "_mvm" + std::to_string(mvm_id) + + "_dot" + std::to_string(dot_id) + ".dat"; + mem_name_str = + "mvm" + std::to_string(mvm_id) + "_matrix_mem" + std::to_string(dot_id); + std::strcpy(mem_name, mem_name_str.c_str()); + matrix_memory[dot_id] = new register_file( + mem_name, dot_id, RF_MEM_DEPTH, LANES, mem_init_file); + matrix_memory[dot_id]->clk(clk); + matrix_memory[dot_id]->rst(rst); + matrix_memory[dot_id]->raddr(matrix_mem_raddr); + matrix_memory[dot_id]->wdata(matrix_mem_wdata); + matrix_memory[dot_id]->waddr(matrix_mem_waddr); + matrix_memory[dot_id]->wen(matrix_mem_wen[dot_id]); + matrix_memory[dot_id]->clk_en(matrix_mem_clk_en); + matrix_memory[dot_id]->rdata(matrix_mem_rdata[dot_id]); + } + + char fifo_name[25]; + std::string fifo_name_str; + fifo_name_str = "mvm" + std::to_string(mvm_id) + "_ififo"; + std::strcpy(fifo_name, fifo_name_str.c_str()); + ififo = new fifo(fifo_name, FIFO_SIZE, LANES, FIFO_SIZE - 4, 0); + ififo->clk(clk); + ififo->rst(rst); + ififo->wen(ififo_wen_signal); + ififo->ren(ififo_ren_signal); + ififo->wdata(ififo_wdata_signal); + ififo->full(ififo_full_signal); + ififo->almost_full(ififo_almost_full_signal); + ififo->empty(ififo_empty_signal); + ififo->almost_empty(ififo_almost_empty_signal); + ififo->rdata(ififo_rdata_signal); + + fifo_name_str = "mvm" + std::to_string(mvm_id) + "_reduce_fifo"; + std::strcpy(fifo_name, fifo_name_str.c_str()); + reduce_fifo = + new fifo(fifo_name, FIFO_SIZE, LANES, FIFO_SIZE - 4, 0); + reduce_fifo->clk(clk); + reduce_fifo->rst(rst); + 
reduce_fifo->wen(reduce_fifo_wen_signal); + reduce_fifo->ren(reduce_fifo_ren_signal); + reduce_fifo->wdata(reduce_fifo_wdata_signal); + reduce_fifo->full(reduce_fifo_full_signal); + reduce_fifo->almost_full(reduce_fifo_almost_full_signal); + reduce_fifo->empty(reduce_fifo_empty_signal); + reduce_fifo->almost_empty(reduce_fifo_almost_empty_signal); + reduce_fifo->rdata(reduce_fifo_rdata_signal); + + fifo_name_str = "mvm" + std::to_string(mvm_id) + "_ofifo"; + std::strcpy(fifo_name, fifo_name_str.c_str()); + ofifo = new fifo(fifo_name, FIFO_SIZE, LANES, + FIFO_SIZE - COMPUTE_LATENCY - RF_RD_LATENCY - 4, 0); + ofifo->clk(clk); + ofifo->rst(rst); + ofifo->wen(ofifo_wen_signal); + ofifo->ren(ofifo_ren_signal); + ofifo->wdata(ofifo_wdata_signal); + ofifo->full(ofifo_full_signal); + ofifo->almost_full(ofifo_almost_full_signal); + ofifo->empty(ofifo_empty_signal); + ofifo->almost_empty(ofifo_almost_empty_signal); + ofifo->rdata(ofifo_rdata_signal); + + fifo_name_str = "mvm" + std::to_string(mvm_id) + "_dl_fifo"; + std::strcpy(fifo_name, fifo_name_str.c_str()); + dl_fifo = + new fifo>(fifo_name, FIFO_SIZE, 1, + FIFO_SIZE - COMPUTE_LATENCY - RF_RD_LATENCY - 4, 0); + dl_fifo->clk(clk); + dl_fifo->rst(rst); + dl_fifo->wen(dl_fifo_wen_signal); + dl_fifo->ren(dl_fifo_ren_signal); + dl_fifo->wdata(dl_fifo_wdata_signal); + dl_fifo->full(dl_fifo_full_signal); + dl_fifo->almost_full(dl_fifo_almost_full_signal); + dl_fifo->empty(dl_fifo_empty_signal); + dl_fifo->almost_empty(dl_fifo_almost_empty_signal); + dl_fifo->rdata(dl_fifo_rdata_signal); + + fifo_name_str = "mvm" + std::to_string(mvm_id) + "_dm_fifo"; + std::strcpy(fifo_name, fifo_name_str.c_str()); + dm_fifo = + new fifo>(fifo_name, FIFO_SIZE, 1, + FIFO_SIZE - COMPUTE_LATENCY - RF_RD_LATENCY - 4, 0); + dm_fifo->clk(clk); + dm_fifo->rst(rst); + dm_fifo->wen(dm_fifo_wen_signal); + dm_fifo->ren(dm_fifo_ren_signal); + dm_fifo->wdata(dm_fifo_wdata_signal); + dm_fifo->full(dm_fifo_full_signal); + 
dm_fifo->almost_full(dm_fifo_almost_full_signal); + dm_fifo->empty(dm_fifo_empty_signal); + dm_fifo->almost_empty(dm_fifo_almost_empty_signal); + dm_fifo->rdata(dm_fifo_rdata_signal); + + SC_METHOD(Assign); + sensitive << rst << ofifo_almost_full_signal << ofifo_rdata_signal + << tx_input_interface.tvalid << tx_input_interface.tready + << tx_reduce_interface.tvalid << tx_reduce_interface.tready + << rx_input_interface.tuser << rx_reduce_interface.tuser + << ififo_almost_full_signal << reduce_fifo_almost_full_signal + << result_pipeline[COMPUTE_LATENCY - 1] + << valid_pipeline[RF_RD_LATENCY + COMPUTE_LATENCY - 1] + << ififo_empty_signal << reduce_fifo_empty_signal << next_inst << pc + << dl_fifo_rdata_signal << dm_fifo_rdata_signal + << dest_layer_pipeline[RF_RD_LATENCY + COMPUTE_LATENCY - 1] + << dest_mvm_pipeline[RF_RD_LATENCY + COMPUTE_LATENCY - 1] + << dl_fifo_rdata_signal << dm_fifo_rdata_signal + << ofifo_empty_signal; + SC_CTHREAD(Tick, clk.pos()); + reset_signal_is(rst, true); + + this->RegisterModuleInfo(); +} + +mvm::~mvm() { delete ofifo; } + +int16_t dot(data_vector v1, data_vector v2) { + int16_t res = 0; + for (unsigned int element_id = 0; element_id < v1.size(); element_id++) { + res += (v1[element_id] * v2[element_id]); + } + return res; +} + +void mvm::Tick() { + if (radsim_design->rad_id == 1) { + // Reset logic + for (unsigned int lane_id = 0; lane_id < LANES; lane_id++) { + tdata_vec[lane_id] = 0; + } + pc.write(0); + mvm_inst rst_inst; + // next_inst.write(rst_inst); + wait(); + // Sequential logic + while (true && (radsim_design->rad_id == 1)) { + /*std::cout << this->name() << " iFIFO occ: " << ififo->occupancy() + << " rFIFO occ: " << reduce_fifo->occupancy() + << " oFIFO occ: " << ofifo->occupancy() + << " dlFIFO occ: " << dl_fifo->occupancy() + << " dmFIFO occ: " << dm_fifo->occupancy() + << " tvalid: " << tx_input_interface.tvalid.read() + << " tready: " << tx_input_interface.tready.read() << std::endl;*/ + // Instruction issue logic + // 
next_inst.write(inst_memory[pc.read()]); + + // Compute logic + if (dot_reduce_op) { + ififo_pipeline[0].write(ififo_rdata_signal.read()); + reduce_pipeline[0].write(reduce_fifo_rdata_signal.read()); + // std::cout << "Dot-Reduce op @ MVM (" << layer_id << ", " << mvm_id << + // ")" << std::endl; + valid_pipeline[0].write(true); + accum_pipeline[0].write(next_inst.read().accum); + accum_en_pipeline[0].write(next_inst.read().accum_en); + release_pipeline[0].write(next_inst.read().release); + dest_layer_pipeline[0].write(next_inst.read().dest_layer); + dest_mvm_pipeline[0].write(next_inst.read().dest_mvm); + pc.write(pc.read() + 1); + } else if (dot_op) { + // std::cout << "Dot op @ MVM (" << layer_id << ", " << mvm_id << ")" << + // std::endl; + data_vector zeros(LANES); + ififo_pipeline[0].write(ififo_rdata_signal.read()); + reduce_pipeline[0].write(zeros); + valid_pipeline[0].write(true); + accum_pipeline[0].write(next_inst.read().accum); + accum_en_pipeline[0].write(next_inst.read().accum_en); + release_pipeline[0].write(next_inst.read().release); + dest_layer_pipeline[0].write(next_inst.read().dest_layer); + dest_mvm_pipeline[0].write(next_inst.read().dest_mvm); + pc.write(pc.read() + 1); + // if (mvm_id == 0 && layer_id == 0 && pc.read() == 0) { + // std::cout << "MVMs started compute at cycle " << + // GetSimulationCycle(5.0) + // << std::endl; + //} + } else if (next_inst.read().en && next_inst.read().jump) { + valid_pipeline[0].write(false); + pc.write(next_inst.read().raddr); + // if (mvm_id == 1 && layer_id == 2) { + // std::cout << "MVMs finished compute at cycle " + // << GetSimulationCycle(5.0) << std::endl; + //} + } else { + valid_pipeline[0].write(false); + } + + if (valid_pipeline[RF_RD_LATENCY - 1].read()) { + data_vector reduce_vector = + reduce_pipeline[RF_RD_LATENCY - 1].read(); + unsigned int accum_addr = accum_pipeline[RF_RD_LATENCY - 1].read(); + bool accum_en = accum_en_pipeline[RF_RD_LATENCY - 1].read(); + data_vector accum_operand = 
accum_memory[accum_addr]; + + for (unsigned int dot_id = 0; dot_id < DOT_PRODUCTS; dot_id++) { + result[dot_id] = dot(ififo_pipeline[RF_RD_LATENCY - 1].read(), + matrix_mem_rdata[dot_id].read()); + result[dot_id] += reduce_vector[dot_id]; + if (accum_en) { + result[dot_id] += accum_operand[dot_id]; + } + } + accum_memory[accum_addr] = result; + result_pipeline[0].write(result); + // std::cout << "Result: " << result << std::endl; + } + + // Advance pipelines + for (unsigned int stage_id = 1; stage_id < RF_RD_LATENCY; stage_id++) { + ififo_pipeline[stage_id].write(ififo_pipeline[stage_id - 1].read()); + reduce_pipeline[stage_id].write(reduce_pipeline[stage_id - 1].read()); + valid_pipeline[stage_id].write(valid_pipeline[stage_id - 1].read()); + accum_pipeline[stage_id].write(accum_pipeline[stage_id - 1].read()); + accum_en_pipeline[stage_id].write(accum_en_pipeline[stage_id - 1].read()); + release_pipeline[stage_id].write(release_pipeline[stage_id - 1].read()); + dest_layer_pipeline[stage_id].write( + dest_layer_pipeline[stage_id - 1].read()); + dest_mvm_pipeline[stage_id].write(dest_mvm_pipeline[stage_id - 1].read()); + } + valid_pipeline[RF_RD_LATENCY].write( + valid_pipeline[RF_RD_LATENCY - 1].read() && + release_pipeline[RF_RD_LATENCY - 1].read()); + dest_layer_pipeline[RF_RD_LATENCY].write( + dest_layer_pipeline[RF_RD_LATENCY - 1].read()); + dest_mvm_pipeline[RF_RD_LATENCY].write( + dest_mvm_pipeline[RF_RD_LATENCY - 1].read()); + for (unsigned int stage_id = 1; stage_id < COMPUTE_LATENCY; stage_id++) { + result_pipeline[stage_id].write(result_pipeline[stage_id - 1].read()); + valid_pipeline[RF_RD_LATENCY + stage_id].write( + valid_pipeline[RF_RD_LATENCY + stage_id - 1].read()); + dest_layer_pipeline[RF_RD_LATENCY + stage_id].write( + dest_layer_pipeline[RF_RD_LATENCY + stage_id - 1].read()); + dest_mvm_pipeline[RF_RD_LATENCY + stage_id].write( + dest_mvm_pipeline[RF_RD_LATENCY + stage_id - 1].read()); + } + + if (rx_input_interface.tvalid.read() && 
rx_input_interface.tready.read()) { + sc_bv tdata = rx_input_interface.tdata.read(); + data_vector tdatavector(TDATA_ELEMS); + unsigned int start_idx, end_idx; + for (unsigned int e = 0; e < TDATA_ELEMS; e++) { + start_idx = e * TDATA_WIDTH; + end_idx = (e + 1) * TDATA_WIDTH; + tdatavector[e] = tdata.range(end_idx - 1, start_idx).to_int(); + } + + if (rx_input_interface.tuser.read().range(15, 13).to_uint() == 1) { + unsigned int waddr = + rx_input_interface.tuser.read().range(8, 0).to_uint(); + mvm_inst inst; + inst.from_bv(rx_input_interface.tdata.read()); + inst_memory[waddr] = inst; + ififo_wen_signal.write(false); + for (unsigned int dot_id = 0; dot_id < DOT_PRODUCTS; dot_id++) { + matrix_mem_wen[dot_id].write(false); + } + } else if (rx_input_interface.tuser.read().range(15, 13).to_uint() > 1) { + // Read rx tdata into a vector + for (unsigned int lane_id = 0; lane_id < LANES; lane_id++) { + tdata_vec[lane_id] = + tdata.range((lane_id + 1) * BITWIDTH - 1, lane_id * BITWIDTH) + .to_int(); + } + + // Push the data vector into the right FIFO/memory + if (rx_input_interface.tuser.read().range(15, 13).to_uint() == + 3) { // Input FIFO + ififo_wdata_signal.write(tdata_vec); + ififo_wen_signal.write(true); + for (unsigned int dot_id = 0; dot_id < DOT_PRODUCTS; dot_id++) { + matrix_mem_wen[dot_id].write(false); + } + // if (mvm_id == 0 && layer_id == 0) + // std::cout << tdata_vec << std::endl; + } else if (rx_input_interface.tuser.read().range(15, 13).to_uint() == + 4) { // Matrix memory + unsigned int waddr = + rx_input_interface.tuser.read().range(8, 0).to_uint(); + unsigned int wen_id = + rx_input_interface.tuser.read().range(12, 9).to_uint(); + matrix_mem_waddr.write(waddr); + matrix_mem_wdata.write(tdata_vec); + for (unsigned int dot_id = 0; dot_id < DOT_PRODUCTS; dot_id++) { + if (dot_id == wen_id) + matrix_mem_wen[wen_id].write(true); + else + matrix_mem_wen[dot_id].write(false); + } + ififo_wen_signal.write(false); + } else { + ififo_wen_signal.write(false); 
+ for (unsigned int dot_id = 0; dot_id < DOT_PRODUCTS; dot_id++) { + matrix_mem_wen[dot_id].write(false); + } + } + } else { + ififo_wen_signal.write(false); + for (unsigned int dot_id = 0; dot_id < DOT_PRODUCTS; dot_id++) { + matrix_mem_wen[dot_id].write(false); + } + } + } else { + ififo_wen_signal.write(false); + for (unsigned int dot_id = 0; dot_id < DOT_PRODUCTS; dot_id++) { + matrix_mem_wen[dot_id].write(false); + } + } + + if (rx_reduce_interface.tvalid.read() && + rx_reduce_interface.tready.read()) { + sc_bv tdata = rx_reduce_interface.tdata.read(); + assert(rx_reduce_interface.tuser.read().range(15, 13).to_uint() == 2); + + // Read rx tdata into a vector + for (unsigned int lane_id = 0; lane_id < LANES; lane_id++) { + tdata_vec[lane_id] = + tdata.range((lane_id + 1) * BITWIDTH - 1, lane_id * BITWIDTH) + .to_int(); + } + + reduce_fifo_wdata_signal.write(tdata_vec); + reduce_fifo_wen_signal.write(true); + // std::cout << this->name() << " Write to reduce FIFO" << std::endl; + // std::cout << tdata_vec << std::endl; + } else { + reduce_fifo_wen_signal.write(false); + } + wait(); + } + } +} + +void mvm::Assign() { + if (rst.read()) { + rx_input_interface.tready.write(false); + rx_reduce_interface.tready.write(false); + tx_input_interface.tdata.write(0); + tx_input_interface.tvalid.write(false); + tx_input_interface.tstrb.write((2 << AXIS_STRBW) - 1); + tx_input_interface.tkeep.write((2 << AXIS_KEEPW) - 1); + tx_input_interface.tlast.write(0); + tx_input_interface.tuser.write(0); + tx_reduce_interface.tdata.write(0); + tx_reduce_interface.tvalid.write(false); + tx_reduce_interface.tstrb.write((2 << AXIS_STRBW) - 1); + tx_reduce_interface.tkeep.write((2 << AXIS_KEEPW) - 1); + tx_reduce_interface.tlast.write(0); + tx_reduce_interface.tuser.write(0); + ififo_ren_signal.write(false); + reduce_fifo_ren_signal.write(false); + ofifo_wen_signal.write(false); + ofifo_ren_signal.write(false); + dl_fifo_ren_signal.write(false); + dm_fifo_ren_signal.write(false); + 
matrix_mem_clk_en.write(false); + matrix_mem_raddr.write(0); + dot_op.write(false); + dot_reduce_op.write(false); + } else if (radsim_design->rad_id == 1) { + if (rx_input_interface.tuser.read().range(15, 13).to_uint() == + 1) { // Inst memory + rx_input_interface.tready.write(true); + } else if (rx_input_interface.tuser.read().range(15, 13).to_uint() == + 3) { // Input FIFO + rx_input_interface.tready.write(!ififo_almost_full_signal.read()); + } else if (rx_input_interface.tuser.read().range(15, 13).to_uint() == + 4) { // Matrix memory + rx_input_interface.tready.write(true); + } else { + rx_input_interface.tready.write(false); + } + + // if (rx_reduce_interface.tuser.read().range(15, 13).to_uint() == + // 2) { // Reduction FIFO + rx_reduce_interface.tready.write(!reduce_fifo_almost_full_signal.read()); + //} else { + // rx_reduce_interface.tready.write(false); + //} + + matrix_mem_raddr.write(next_inst.read().raddr); + next_inst.write(inst_memory[pc.read()]); + + if (!ififo_empty_signal && !reduce_fifo_empty_signal && + !ofifo_almost_full_signal && next_inst.read().en && + !next_inst.read().jump && next_inst.read().reduce) { + ififo_ren_signal.write(next_inst.read().last); + reduce_fifo_ren_signal.write(true); + dot_reduce_op.write(true); + dot_op.write(false); + } else if (!ififo_empty_signal && !ofifo_almost_full_signal && + next_inst.read().en && !next_inst.read().jump && + !next_inst.read().reduce) { + ififo_ren_signal.write(next_inst.read().last); + reduce_fifo_ren_signal.write(false); + dot_op.write(true); + dot_reduce_op.write(false); + } else { + ififo_ren_signal.write(false); + reduce_fifo_ren_signal.write(false); + dot_op.write(false); + dot_reduce_op.write(false); + } + + ofifo_wen_signal.write( + valid_pipeline[COMPUTE_LATENCY + RF_RD_LATENCY - 1].read()); + ofifo_wdata_signal.write(result_pipeline[COMPUTE_LATENCY - 1].read()); + + data_vector> dest_layer(1); + dest_layer[0] = + dest_layer_pipeline[COMPUTE_LATENCY + RF_RD_LATENCY - 1].read(); + 
dl_fifo_wen_signal.write( + valid_pipeline[COMPUTE_LATENCY + RF_RD_LATENCY - 1].read()); + dl_fifo_wdata_signal.write(dest_layer); + + data_vector> dest_mvm(1); + dest_mvm[0] = dest_mvm_pipeline[COMPUTE_LATENCY + RF_RD_LATENCY - 1].read(); + dm_fifo_wen_signal.write( + valid_pipeline[COMPUTE_LATENCY + RF_RD_LATENCY - 1].read()); + dm_fifo_wdata_signal.write(dest_mvm); + + data_vector tx_tdata = ofifo_rdata_signal.read(); + data_vector> dest_layer_vec; + dest_layer_vec = dl_fifo_rdata_signal.read(); + int dest_layer_int = 0; + data_vector> dest_mvm_vec; + dest_mvm_vec = dm_fifo_rdata_signal.read(); + unsigned int dest_mvm_int = 0; + if (dest_layer_vec.size() > 0) { + dest_layer_int = dest_layer_vec[0].to_int(); + dest_mvm_int = dest_mvm_vec[0].to_uint(); + } + std::string dest_name; + unsigned int dest_id; + if (dest_layer_int == 0) { + dest_name = "output_collector.data_collect"; + } else { + dest_name = "layer" + std::to_string(dest_layer_int - 1) + "_mvm" + + std::to_string(dest_mvm_int) + ".rx_interface"; + } + dest_id = radsim_design->GetPortDestinationID(dest_name); + sc_bv dest_id_concat; + DEST_LOCAL_NODE(dest_id_concat) = dest_id; + DEST_REMOTE_NODE(dest_id_concat) = dest_id; + // if (radsim_design->rad_id == 1){ + // std::cout << "mvm.cpp on RAD " << radsim_design->rad_id << "'s dest_id: " << dest_id << " and DEST_RAD(dest_id): " << DEST_RAD(dest_id_concat) << std::endl; + // } + DEST_RAD(dest_id_concat) = radsim_design->rad_id; + unsigned int dest_interface; // which FIFO + unsigned int dest_interface_id; // added for separate ports + // If destination is the same layer, send to reduce FIFO + if ((unsigned int)dest_layer_int - 1 == layer_id) { + dest_interface = 2 << 13; + dest_interface_id = 1; + // if (tx_tdata.size() > 0 && !ofifo_empty_signal) + // std::cout << this->name() << " sending to interface 2 -- " << + // dest_layer_int << std::endl; + } + // If destination is a different layer, send to the input FIFO + else { + dest_interface = 3 << 13; + 
dest_interface_id = 0; + // if (tx_tdata.size() > 0 && !ofifo_empty_signal) + // std::cout << this->name() << " sending to interface 3 -- " << + // dest_layer_int << std::endl; + } + + if (tx_tdata.size() > 0 && !ofifo_empty_signal && dest_interface_id == 0) { + sc_bv tx_tdata_bv; + for (unsigned int lane_id = 0; lane_id < LANES; lane_id++) { + tx_tdata_bv.range((lane_id + 1) * BITWIDTH - 1, lane_id * BITWIDTH) = + tx_tdata[lane_id]; + } + tx_input_interface.tdata.write(tx_tdata_bv); + tx_input_interface.tvalid.write(true); + tx_input_interface.tuser.write(dest_interface); + tx_input_interface.tdest.write(dest_id_concat); //dest_id); + tx_input_interface.tid.write(dest_interface_id); + tx_reduce_interface.tvalid.write(false); + // if (mvm_id == 1 && layer_id == 2 && !ofifo_empty_signal) { + // std::cout << ">>>>>> " << tx_tdata << std::endl; + // } + // std::cout << "MVM (" << layer_id << "," << mvm_id << ") pushed data + // into the NoC with dest " << dest_id << "!" << std::endl; + } else if (tx_tdata.size() > 0 && !ofifo_empty_signal && + dest_interface_id == 1) { + sc_bv tx_tdata_bv; + for (unsigned int lane_id = 0; lane_id < LANES; lane_id++) { + tx_tdata_bv.range((lane_id + 1) * BITWIDTH - 1, lane_id * BITWIDTH) = + tx_tdata[lane_id]; + } + tx_reduce_interface.tdata.write(tx_tdata_bv); + tx_reduce_interface.tvalid.write(true); + tx_reduce_interface.tuser.write(dest_interface); + tx_reduce_interface.tdest.write(dest_id_concat); //dest_id); + tx_reduce_interface.tid.write(dest_interface_id); + tx_input_interface.tvalid.write(false); + } else { + tx_input_interface.tvalid.write(false); + tx_reduce_interface.tvalid.write(false); + } + + ofifo_ren_signal.write((tx_input_interface.tvalid.read() && + tx_input_interface.tready.read()) || + (tx_reduce_interface.tvalid.read() && + tx_reduce_interface.tready.read())); + dl_fifo_ren_signal.write((tx_input_interface.tvalid.read() && + tx_input_interface.tready.read()) || + (tx_reduce_interface.tvalid.read() && + 
tx_reduce_interface.tready.read())); + dm_fifo_ren_signal.write((tx_input_interface.tvalid.read() && + tx_input_interface.tready.read()) || + (tx_reduce_interface.tvalid.read() && + tx_reduce_interface.tready.read())); + } +} + +void mvm::RegisterModuleInfo() { + std::string port_name; + _num_noc_axis_slave_ports = 0; + _num_noc_axis_master_ports = 0; + + port_name = module_name + ".tx_interface"; + RegisterAxisMasterPort(port_name, &tx_input_interface, DATAW, 0); + + port_name = module_name + ".rx_interface"; + RegisterAxisSlavePort(port_name, &rx_input_interface, DATAW, 0); + + // port_name = module_name + ".rx_reduce_interface"; + // RegisterAxisSlavePort(port_name, &rx_reduce_interface, DATAW, 1); +} diff --git a/rad-sim/example-designs/dlrm_two_rad/modules/mvm.hpp b/rad-sim/example-designs/dlrm_two_rad/modules/mvm.hpp new file mode 100644 index 0000000..8c61837 --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/modules/mvm.hpp @@ -0,0 +1,91 @@ +#pragma once + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +class mvm : public RADSimModule { +private: + std::string module_name; + unsigned int mvm_id; + unsigned int layer_id; + + std::vector inst_memory; + sc_signal next_inst; + sc_signal pc; + + std::vector> accum_memory; + sc_signal> next_accum; + + std::vector *> matrix_memory; + sc_vector>> matrix_mem_rdata; + sc_signal> matrix_mem_wdata; + sc_vector> matrix_mem_wen; + sc_signal matrix_mem_raddr, matrix_mem_waddr; + sc_signal matrix_mem_clk_en; + + fifo *ififo; + sc_signal> ififo_wdata_signal, ififo_rdata_signal; + sc_signal ififo_wen_signal, ififo_ren_signal, ififo_full_signal, + ififo_empty_signal, ififo_almost_full_signal, ififo_almost_empty_signal; + + fifo *reduce_fifo; + sc_signal> reduce_fifo_wdata_signal, + reduce_fifo_rdata_signal; + sc_signal reduce_fifo_wen_signal, reduce_fifo_ren_signal, + reduce_fifo_full_signal, reduce_fifo_empty_signal, + reduce_fifo_almost_full_signal, 
reduce_fifo_almost_empty_signal; + + sc_vector>> ififo_pipeline, reduce_pipeline, + result_pipeline; + sc_vector> valid_pipeline, release_pipeline, + accum_en_pipeline; + sc_vector> accum_pipeline; + sc_vector>> dest_layer_pipeline; + sc_vector>> dest_mvm_pipeline; + + fifo *ofifo; + sc_signal> ofifo_wdata_signal, ofifo_rdata_signal; + sc_signal ofifo_wen_signal, ofifo_ren_signal, ofifo_full_signal, + ofifo_empty_signal, ofifo_almost_full_signal, ofifo_almost_empty_signal; + + fifo> *dl_fifo; + sc_signal>> dl_fifo_wdata_signal, dl_fifo_rdata_signal; + sc_signal dl_fifo_wen_signal, dl_fifo_ren_signal, dl_fifo_full_signal, + dl_fifo_empty_signal, dl_fifo_almost_full_signal, + dl_fifo_almost_empty_signal; + + fifo> *dm_fifo; + sc_signal>> dm_fifo_wdata_signal, dm_fifo_rdata_signal; + sc_signal dm_fifo_wen_signal, dm_fifo_ren_signal, dm_fifo_full_signal, + dm_fifo_empty_signal, dm_fifo_almost_full_signal, + dm_fifo_almost_empty_signal; + + data_vector tdata_vec; + data_vector result; + sc_signal dot_op, dot_reduce_op; + +public: + RADSimDesignContext* radsim_design; + sc_in rst; + axis_slave_port rx_input_interface; + axis_slave_port rx_reduce_interface; + axis_master_port tx_input_interface; + axis_master_port tx_reduce_interface; + + mvm(const sc_module_name &name, unsigned int id_mvm, unsigned int id_layer, + const std::string &inst_filename, RADSimDesignContext* radsim_design); + ~mvm(); + + void Assign(); + void Tick(); + SC_HAS_PROCESS(mvm); + void RegisterModuleInfo(); +}; \ No newline at end of file diff --git a/rad-sim/example-designs/dlrm_two_rad/modules/register_file.cpp b/rad-sim/example-designs/dlrm_two_rad/modules/register_file.cpp new file mode 100644 index 0000000..6b07972 --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/modules/register_file.cpp @@ -0,0 +1,103 @@ +#include + +template +register_file::register_file(const sc_module_name &name, unsigned int id, + unsigned int depth, unsigned int width, + std::string &init_file) + : 
sc_module(name), wdata_pipeline("wdata_pipeline", RF_WR_LATENCY), + waddr_pipeline("waddr_pipeline", RF_WR_LATENCY), + raddr_pipeline("raddr_pipeline", RF_RD_LATENCY), + wen_pipeline("wen_pipeline", RF_WR_LATENCY), rst("rst"), raddr("raddr"), + wdata("wdata"), waddr("waddr"), wen("wen"), clk_en("clk_en"), + rdata("rdata") { + + register_file_id = id; + num_elements_per_word = width; + mem.resize(depth); + for (unsigned int i = 0; i < depth; i++) { + mem[i].resize(width); + } + + if (!init_file.empty()) { + bool parse = + parse_register_file_contents_from_file(mem, init_file, width, depth); + if (!parse) { + cerr << "Error parsing contents of RF " << register_file_id << endl; + exit(1); + } + } + + SC_METHOD(Assign); + sensitive << wdata_pipeline[RF_WR_LATENCY - 1] + << wen_pipeline[RF_WR_LATENCY - 1] + << waddr_pipeline[RF_WR_LATENCY - 1] + << raddr_pipeline[RF_RD_LATENCY - 1]; + SC_CTHREAD(Tick, clk.pos()); + reset_signal_is(rst, true); +} + +template register_file::~register_file() { + for (unsigned int i = 0; i < mem.size(); i++) + mem[i].clear(); +} + +template void register_file::Assign() { + // Assign read data output + rdata.write( + data_vector(mem[raddr_pipeline[RF_RD_LATENCY - 1].read()])); + + // Assign write value + if (wen_pipeline[RF_WR_LATENCY - 1].read()) { + uint32_t addr = waddr_pipeline[RF_WR_LATENCY - 1].read(); + data_vector temp = wdata_pipeline[RF_WR_LATENCY - 1].read(); + if (temp.size() != 0) + for (unsigned int i = 0; i < num_elements_per_word; i++) + mem[addr][i] = temp[i]; + } +} + +template void register_file::Tick() { + for (unsigned int i = 0; i < RF_RD_LATENCY; i++) { + raddr_pipeline[i].write(0); + } + for (unsigned int i = 0; i < RF_WR_LATENCY; i++) { + waddr_pipeline[i].write(0); + wen_pipeline[i].write(false); + } + wait(); + + while (true) { + if (!clk_en.read()) { + // Address guards + if (raddr.read() >= mem.size()) { + cerr << "Read address (" << raddr.read() << ") out of bound at RF " + << register_file_id << endl; + 
exit(1); + } + if (waddr.read() >= mem.size()) { + cerr << "Write address (" << waddr.read() << ") out of bound at RF " + << register_file_id << endl; + exit(1); + } + + // Populate first stage of pipeline + raddr_pipeline[0].write(raddr.read()); + wdata_pipeline[0].write(data_vector(wdata)); + waddr_pipeline[0].write(waddr.read()); + wen_pipeline[0].write(wen.read()); + + // Advance the rest of the pipeline + for (unsigned int i = 1; i < RF_RD_LATENCY; i++) { + raddr_pipeline[i].write(raddr_pipeline[i - 1].read()); + } + for (unsigned int i = 1; i < RF_WR_LATENCY; i++) { + wdata_pipeline[i].write(wdata_pipeline[i - 1].read()); + waddr_pipeline[i].write(waddr_pipeline[i - 1].read()); + wen_pipeline[i].write(wen_pipeline[i - 1].read()); + } + } + wait(); + } +} + +template class register_file; \ No newline at end of file diff --git a/rad-sim/example-designs/dlrm_two_rad/modules/register_file.hpp b/rad-sim/example-designs/dlrm_two_rad/modules/register_file.hpp new file mode 100644 index 0000000..6915e4a --- /dev/null +++ b/rad-sim/example-designs/dlrm_two_rad/modules/register_file.hpp @@ -0,0 +1,71 @@ +#pragma once + +#include +#include +#include + +#define RF_RD_LATENCY 2 +#define RF_WR_LATENCY 2 + +template class register_file : public sc_module { +private: + unsigned int register_file_id; + unsigned int num_elements_per_word; + std::vector> mem; + + sc_vector>> wdata_pipeline; + sc_vector> waddr_pipeline; + sc_vector> raddr_pipeline; + sc_vector> wen_pipeline; + +public: + // Register file inputs + sc_in clk; + sc_in rst; + sc_in raddr; + sc_in> wdata; + sc_in waddr; + sc_in wen; + sc_in clk_en; + + // Register file outputs + sc_out> rdata; + + // Register file constructor and destructor + register_file(const sc_module_name &name, unsigned int id, unsigned int depth, + unsigned int width, std::string &init_file); + ~register_file(); + + void Tick(); + void Assign(); + + SC_HAS_PROCESS(register_file); +}; + +template +bool parse_register_file_contents_from_file( + std::vector> 
&mem, std::string &init_file,
+    unsigned int width, unsigned int depth) {
+
+  std::ifstream rf_content(init_file);
+
+  if (!rf_content)
+    return false;
+
+  std::string line;
+  uint32_t addr = 0;
+  while (std::getline(rf_content, line) && (addr < depth)) {
+    std::stringstream line_stream(line);
+    std::vector rf_word(width, 0);
+    dtype value;
+    unsigned int idx = 0;
+    while (idx < width) {
+      line_stream >> value;
+      rf_word[idx] = value;
+      idx++;
+    }
+    mem[addr] = rf_word;
+    addr++;
+  }
+  return true;
+}
\ No newline at end of file
diff --git a/rad-sim/example-designs/dlrm_two_rad/modules/sim_utils.cpp b/rad-sim/example-designs/dlrm_two_rad/modules/sim_utils.cpp
new file mode 100644
index 0000000..8adf301
--- /dev/null
+++ b/rad-sim/example-designs/dlrm_two_rad/modules/sim_utils.cpp
@@ -0,0 +1,271 @@
+#include
+#include
+
+template data_vector::data_vector() {}
+
+template data_vector::data_vector(unsigned int size) {
+  v.resize(size, 0);
+}
+
+template
+data_vector::data_vector(sc_vector> &iport) {
+  v.resize(iport.size());
+  for (unsigned int i = 0; i < iport.size(); i++)
+    v[i] = iport[i].read();
+}
+
+template
+data_vector::data_vector(std::vector &vec) {
+  v.resize(vec.size());
+  for (unsigned int i = 0; i < vec.size(); i++)
+    v[i] = vec[i];
+}
+
+template data_vector::data_vector(std::vector &vec) {
+  v.resize(vec.size());
+  for (unsigned int i = 0; i < vec.size(); i++)
+    v[i] = vec[i];
+}
+
+template
+bool data_vector::operator==(const data_vector &rhs) {
+  if (v.size() != rhs.v.size())
+    return false;
+  bool is_equal = true;
+  for (unsigned int i = 0; i < v.size(); i++)
+    is_equal &= (v[i] == rhs.v[i]);
+  return is_equal;
+}
+
+template
+bool data_vector::operator==(const std::vector &rhs) {
+  if (v.size() != rhs.size())
+    return false;
+  bool is_equal = true;
+  for (unsigned int i = 0; i < v.size(); i++)
+    is_equal &= (v[i] == rhs[i]);
+  return is_equal;
+}
+
+template dtype &data_vector::operator[](unsigned int idx) {
+  return v[idx];
+}
+
+template unsigned int data_vector::size() {
+  return v.size();
+}
+
+template void data_vector::resize(unsigned int size) {
+  return v.resize(size);
+}
+
+template
+ostream &operator<<(ostream &o, const data_vector &dvector) {
+  for (unsigned int i = 0; i < dvector.v.size(); i++)
+    o << dvector.v[i] << " ";
+  return o;
+}
+template ostream &
+operator<< (ostream &o, const data_vector &dvector);
+template ostream &operator<< (ostream &o,
+    const data_vector &dvector);
+template ostream &operator<< (ostream &o,
+    const data_vector &dvector);
+template ostream &
+operator<< >(ostream &o, const data_vector> &dvector);
+template ostream &operator<< >(ostream &o,
+    const data_vector> &dvector);
+template ostream &
+operator<< >(ostream &o, const data_vector> &dvector);
+
+template
+data_vector operator+(const data_vector &v1,
+    const data_vector &v2) {
+  assert(v1.v.size() == v2.v.size());
+  if (v1.v.size() != v2.v.size()) {
+    cerr << "Attempting to add two data vectors of different sizes! ";
+    cerr << "(" << v1.v.size() << "," << v2.v.size() << ")" << endl;
+    exit(1);
+  }
+
+  data_vector res = data_vector(v1.v.size());
+  for (unsigned int i = 0; i < v1.v.size(); i++)
+    res.v[i] = v1.v[i] + v2.v[i];
+  return res;
+}
+template data_vector operator+
+    (const data_vector &v1,
+    const data_vector &v2);
+template data_vector operator+
+    (const data_vector &v1,
+    const data_vector &v2);
+template data_vector operator+
+    (const data_vector &v1, const data_vector &v2);
+template data_vector> operator+
+    >(const data_vector> &v1,
+    const data_vector> &v2);
+template data_vector> operator+
+    >(const data_vector> &v1,
+    const data_vector> &v2);
+template data_vector> operator+
+    >(const data_vector> &v1,
+    const data_vector> &v2);
+
+template
+data_vector operator-(const data_vector &v1,
+    const data_vector &v2) {
+  if (v1.v.size() != v2.v.size()) {
+    cerr << "Attempting to subtract two data vectors of different sizes!";
+    cerr << "(" << v1.v.size() << "," << v2.v.size() << ")" << endl;
+    exit(1);
+  }
+
+  data_vector res = data_vector(v1.v.size());
+  for (unsigned int i = 0; i < v1.v.size(); i++)
+    res.v[i] = v1.v[i] - v2.v[i];
+  return res;
+}
+template data_vector operator-
+    (const data_vector &v1,
+    const data_vector &v2);
+template data_vector operator-
+    (const data_vector &v1,
+    const data_vector &v2);
+template data_vector operator-
+    (const data_vector &v1, const data_vector &v2);
+template data_vector> operator-
+    >(const data_vector> &v1,
+    const data_vector> &v2);
+template data_vector> operator-
+    >(const data_vector> &v1,
+    const data_vector> &v2);
+template data_vector> operator-
+    >(const data_vector> &v1,
+    const data_vector> &v2);
+
+template
+data_vector operator*(const data_vector &v1,
+    const data_vector &v2) {
+  if (v1.v.size() != v2.v.size()) {
+    cerr << "Attempting to multiply two data vectors of different sizes!";
+    cerr << "(" << v1.v.size() << "," << v2.v.size() << ")" << endl;
+    exit(1);
+  }
+
+  data_vector res = data_vector(v1.v.size());
+  for (unsigned int i = 0; i < v1.v.size(); i++)
+    res.v[i] = v1.v[i] * v2.v[i];
+  return res;
+}
+template data_vector operator*
+    (const data_vector &v1,
+    const data_vector &v2);
+template data_vector operator*
+    (const data_vector &v1,
+    const data_vector &v2);
+template data_vector operator*
+    (const data_vector &v1, const data_vector &v2);
+template data_vector> operator*
+    >(const data_vector> &v1,
+    const data_vector> &v2);
+template data_vector> operator*
+    >(const data_vector> &v1,
+    const data_vector> &v2);
+template data_vector> operator*
+    >(const data_vector> &v1,
+    const data_vector> &v2);
+
+template
+data_vector max(const data_vector &v1,
+    const data_vector &v2) {
+  if (v1.v.size() != v2.v.size()) {
+    cerr << "Attempting to max two data vectors of different sizes!";
+    cerr << "(" << v1.v.size() << "," << v2.v.size() << ")" << endl;
+    exit(1);
+  }
+
+  data_vector res = data_vector(v1.v.size());
+  for (unsigned int i = 0; i < v1.v.size(); i++)
+    res.v[i] = (v1.v[i] > v2.v[i]) ? v1.v[i] : v2.v[i];
+  return res;
+}
+template data_vector
+max(const data_vector &v1,
+    const data_vector &v2);
+template data_vector max(const data_vector &v1,
+    const data_vector &v2);
+template data_vector max(const data_vector &v1,
+    const data_vector &v2);
+template data_vector>
+max>(const data_vector> &v1,
+    const data_vector> &v2);
+template data_vector>
+max>(const data_vector> &v1,
+    const data_vector> &v2);
+template data_vector>
+max>(const data_vector> &v1,
+    const data_vector> &v2);
+
+void sc_trace(sc_trace_file *f, const data_vector &dvector,
+    const std::string &s) {
+  for (unsigned int i = 0; i < dvector.v.size(); i++)
+    sc_trace(f, dvector.v[i], s + "_v" + std::to_string(i));
+}
+void sc_trace(sc_trace_file *f, const data_vector &dvector,
+    const std::string &s) {
+  for (unsigned int i = 0; i < dvector.v.size(); i++)
+    sc_trace(f, dvector.v[i], s + "_v" + std::to_string(i));
+}
+void sc_trace(sc_trace_file *f, const data_vector &dvector,
+    const std::string &s) {
+  for (unsigned int i = 0; i < dvector.v.size(); i++)
+    sc_trace(f, dvector.v[i], s + "_v" + std::to_string(i));
+}
+void sc_trace(sc_trace_file *f, const data_vector> &dvector,
+    const std::string &s) {
+  for (unsigned int i = 0; i < dvector.v.size(); i++)
+    sc_trace(f, dvector.v[i], s + "_v" + std::to_string(i));
+}
+void sc_trace(sc_trace_file *f, const data_vector> &dvector,
+    const std::string &s) {
+  for (unsigned int i = 0; i < dvector.v.size(); i++)
+    sc_trace(f, dvector.v[i], s + "_v" + std::to_string(i));
+}
+void sc_trace(sc_trace_file *f, const data_vector> &dvector,
+    const std::string &s) {
+  for (unsigned int i = 0; i < dvector.v.size(); i++)
+    sc_trace(f, dvector.v[i], s + "_v" + std::to_string(i));
+}
+
+template
+void init_vector::init_sc_vector(sc_vector &vector,
+    unsigned int dim0) {
+  vector.init(dim0);
+}
+
+template class data_vector;
+template class data_vector;
+template class data_vector;
+
+template class init_vector>>;
+template class init_vector>>;
+template class init_vector>;
+
+template class init_vector>>;
+template class init_vector>>;
+template class init_vector>;
+
+template class init_vector>>;
+template class init_vector>>;
+template class init_vector>;
+
+template class data_vector>;
+template class data_vector>;
+template class data_vector>;
+
+template class init_vector>;
+template class init_vector>;
+template class init_vector>;
+template class init_vector>>>;
+template class init_vector>>>;
+template class init_vector>>>;
\ No newline at end of file
diff --git a/rad-sim/example-designs/dlrm_two_rad/modules/sim_utils.hpp b/rad-sim/example-designs/dlrm_two_rad/modules/sim_utils.hpp
new file mode 100644
index 0000000..3283e67
--- /dev/null
+++ b/rad-sim/example-designs/dlrm_two_rad/modules/sim_utils.hpp
@@ -0,0 +1,108 @@
+#pragma once
+
+#include
+#include
+#include
+#include
+
+template class data_vector;
+template
+data_vector max(const data_vector &v1, const data_vector &v2);
+template
+ostream &operator<<(ostream &o, const data_vector &dvector);
+template
+data_vector operator+(const data_vector &v1, const data_vector &v2);
+template
+data_vector operator-(const data_vector &v1, const data_vector &v2);
+template
+data_vector operator*(const data_vector &v1, const data_vector &v2);
+
+template class data_vector : public std::error_code {
+public:
+  std::vector v;
+
+  data_vector();
+  data_vector(unsigned int size);
+  data_vector(sc_vector> &iport);
+  data_vector(std::vector &vec);
+  data_vector(std::vector &vec);
+  bool operator==(const data_vector &rhs);
+  bool operator==(const std::vector &rhs);
+  dtype &operator[](unsigned int idx);
+  unsigned int size();
+  void resize(unsigned int size);
+  friend ostream &operator<< (ostream &o,
+      const data_vector &dvector);
+  friend data_vector operator+
+      (const data_vector &v1, const data_vector &v2);
+  friend data_vector operator-
+      (const data_vector &v1, const data_vector &v2);
+  friend data_vector operator*
+      (const data_vector &v1, const data_vector &v2);
+  friend data_vector max(const data_vector &v1,
+      const data_vector &v2);
+};
+void sc_trace(sc_trace_file *f, const data_vector &dvector,
+    const std::string &s);
+void sc_trace(sc_trace_file *f, const data_vector &dvector,
+    const std::string &s);
+void sc_trace(sc_trace_file *f, const data_vector &dvector,
+    const std::string &s);
+void sc_trace(sc_trace_file *f, const data_vector> &dvector,
+    const std::string &s);
+void sc_trace(sc_trace_file *f, const data_vector> &dvector,
+    const std::string &s);
+void sc_trace(sc_trace_file *f, const data_vector> &dvector,
+    const std::string &s);
+
+template struct init_vector {
+  static void init_sc_vector(sc_vector &vector, unsigned int dim0);
+};
+
+template <> struct std::iterator_traits {
+  typedef int difference_type;
+  typedef int value_type;
+  typedef int *pointer;
+  typedef int &reference;
+  typedef std::forward_iterator_tag iterator_category;
+};
+
+template <> struct std::iterator_traits {
+  typedef int difference_type;
+  typedef int value_type;
+  typedef int *pointer;
+  typedef int &reference;
+  typedef std::forward_iterator_tag iterator_category;
+};
+
+template <> struct std::iterator_traits {
+  typedef int difference_type;
+  typedef int value_type;
+  typedef int *pointer;
+  typedef int &reference;
+  typedef std::forward_iterator_tag iterator_category;
+};
+
+template <> struct std::iterator_traits> {
+  typedef int difference_type;
+  typedef int value_type;
+  typedef int *pointer;
+  typedef int &reference;
+  typedef std::forward_iterator_tag iterator_category;
+};
+
+template <> struct std::iterator_traits> {
+  typedef int difference_type;
+  typedef int value_type;
+  typedef int *pointer;
+  typedef int &reference;
+  typedef std::forward_iterator_tag iterator_category;
+};
+
+template <> struct std::iterator_traits> {
+  typedef int difference_type;
+  typedef int value_type;
+  typedef int *pointer;
+  typedef int &reference;
+  typedef std::forward_iterator_tag iterator_category;
+};
\ No newline at end of file
diff --git a/rad-sim/example-designs/dlrm_two_rad/sim_trace b/rad-sim/example-designs/dlrm_two_rad/sim_trace
new file mode 100644
index 0000000..cd55320
--- /dev/null
+++ b/rad-sim/example-designs/dlrm_two_rad/sim_trace
@@ -0,0 +1,3 @@
+12
+15 32 78 59 63 44 74 88 75 32 69 82
+92 32 54 97 88 65 23 44 17 24 39 55
\ No newline at end of file
diff --git a/rad-sim/example-designs/mlp/CMakeLists.txt b/rad-sim/example-designs/mlp/CMakeLists.txt
index ae17c23..a92a5e2 100644
--- a/rad-sim/example-designs/mlp/CMakeLists.txt
+++ b/rad-sim/example-designs/mlp/CMakeLists.txt
@@ -39,5 +39,5 @@ set(hdrfiles
 
 add_compile_options(-Wall -Wextra -pedantic)
 
-add_library(design STATIC ${srcfiles} ${hdrfiles})
-target_link_libraries(design PUBLIC SystemC::systemc booksim noc)
+add_library(mlp STATIC ${srcfiles} ${hdrfiles})
+target_link_libraries(mlp PUBLIC SystemC::systemc booksim noc)
diff --git a/rad-sim/example-designs/mlp/compiler/gen_testcase.py b/rad-sim/example-designs/mlp/compiler/gen_testcase.py
index 6b5f12e..b94db14 100644
--- a/rad-sim/example-designs/mlp/compiler/gen_testcase.py
+++ b/rad-sim/example-designs/mlp/compiler/gen_testcase.py
@@ -206,6 +206,12 @@
 placement_file.write('output_collector 0 ' + str(router_ids[idx]) + ' axis\n')
 idx = idx + 1
 clocks_file.write('output_collector 0 0\n')
+
+#WARNING: uncomment out if multi-rad design
+print('WARNING: if multi-rad mlp design, uncomment out lines 212-213 of gen_testcase.py')
+# placement_file.write('portal_inst 0 16 axis\n')
+# clocks_file.write('portal_inst 0 0\n')
+
 
 placement_file.close()
 clocks_file.close()
diff --git a/rad-sim/example-designs/mlp/config.yml b/rad-sim/example-designs/mlp/config.yml
index 83ad0a1..bf12c31 100644
--- a/rad-sim/example-designs/mlp/config.yml
+++ b/rad-sim/example-designs/mlp/config.yml
@@ -27,15 +27,19 @@ noc_adapters:
   out_arbiter: ['priority_rr']
   vc_mapping: ['direct']
 
-design:
-  name: 'mlp'
-  noc_placement: ['mlp.place']
-  clk_periods: [5.0]
+config rad1:
+  design:
+    name: 'mlp'
+    noc_placement: ['mlp.place']
+    clk_periods: [5.0]
+
+cluster:
+  sim_driver_period: 5.0
+  telemetry_log_verbosity: 2
+  telemetry_traces: []
+  num_rads: 1
+  cluster_configs: ['rad1']
 
-telemetry:
-  log_verbosity: 2
-  traces: []
-
 interfaces:
   max_axis_tdata_width: 512
   axis_tuser_width: 75
diff --git a/rad-sim/example-designs/mlp/mlp.place b/rad-sim/example-designs/mlp/mlp.place
index 21cd35d..db1f497 100644
--- a/rad-sim/example-designs/mlp/mlp.place
+++ b/rad-sim/example-designs/mlp/mlp.place
@@ -1,16 +1,16 @@
-layer0_mvm0 0 4 axis
-layer0_mvm1 0 12 axis
-layer0_mvm2 0 13 axis
-layer0_mvm3 0 15 axis
-layer1_mvm0 0 0 axis
-layer1_mvm1 0 8 axis
-layer1_mvm2 0 3 axis
-layer2_mvm0 0 11 axis
-layer2_mvm1 0 9 axis
-layer3_mvm0 0 7 axis
-layer3_mvm1 0 2 axis
-input_dispatcher0 0 14 axis
-input_dispatcher1 0 5 axis
-input_dispatcher2 0 6 axis
+layer0_mvm0 0 2 axis
+layer0_mvm1 0 5 axis
+layer0_mvm2 0 7 axis
+layer0_mvm3 0 0 axis
+layer1_mvm0 0 8 axis
+layer1_mvm1 0 4 axis
+layer1_mvm2 0 11 axis
+layer2_mvm0 0 6 axis
+layer2_mvm1 0 12 axis
+layer3_mvm0 0 10 axis
+layer3_mvm1 0 14 axis
+input_dispatcher0 0 15 axis
+input_dispatcher1 0 13 axis
+input_dispatcher2 0 9 axis
 input_dispatcher3 0 1 axis
-output_collector 0 10 axis
+output_collector 0 3 axis
diff --git a/rad-sim/example-designs/mlp/mlp_driver.cpp b/rad-sim/example-designs/mlp/mlp_driver.cpp
index 61f91be..9a9df9a 100644
--- a/rad-sim/example-designs/mlp/mlp_driver.cpp
+++ b/rad-sim/example-designs/mlp/mlp_driver.cpp
@@ -19,12 +19,13 @@ bool ParseIO(std::vector>& data_vec, std::string& io_filename)
   return true;
 }
 
-mlp_driver::mlp_driver(const sc_module_name& name) : sc_module(name) {
+mlp_driver::mlp_driver(const sc_module_name& name, RADSimDesignContext* radsim_design_) : sc_module(name) {
+  this->radsim_design = radsim_design_;
   start_cycle = 0;
   end_cycle = 0;
 
   // Parse design configuration (number of layers & number of MVM per layer)
-  std::string design_root_dir = radsim_config.GetStringKnob("radsim_user_design_root_dir");
+  std::string design_root_dir = radsim_config.GetStringKnobPerRad("radsim_user_design_root_dir", radsim_design->rad_id);
   std::string design_config_filename = design_root_dir + "/compiler/layer_mvm_config";
   std::ifstream design_config_file(design_config_filename);
   if(!design_config_file) {
@@ -82,7 +83,7 @@ void mlp_driver::source() {
     dispatcher_fifo_wen[dispatcher_id].write(false);
   wait();
   rst.write(false);
-  start_cycle = GetSimulationCycle(radsim_config.GetDoubleKnob("sim_driver_period"));
+  start_cycle = GetSimulationCycle(radsim_config.GetDoubleKnobShared("sim_driver_period"));
   wait();
 
   std::vector written_inputs(num_mvms[0], 0);
@@ -128,20 +129,23 @@ void mlp_driver::sink() {
   }
   if (mistake) {
     std::cout << "FAILURE - Some outputs NOT matching!" << std::endl;
-    radsim_design.ReportDesignFailure();
+    radsim_design->ReportDesignFailure();
   }
   else std::cout << "SUCCESS - All outputs are matching!" << std::endl;
 
-  end_cycle = GetSimulationCycle(radsim_config.GetDoubleKnob("sim_driver_period"));
+  end_cycle = GetSimulationCycle(radsim_config.GetDoubleKnobShared("sim_driver_period"));
   std::cout << "Simulation Cycles = " << end_cycle - start_cycle << std::endl;
   NoCTransactionTelemetry::DumpStatsToFile("stats.csv");
   NoCFlitTelemetry::DumpNoCFlitTracesToFile("flit_traces.csv");
 
+  std::cout << "mlp_driver.cpp radsim_design->rad_id: " << radsim_design->rad_id << std::endl;
   std::vector aggregate_bandwidths = NoCTransactionTelemetry::DumpTrafficFlows("traffic_flows",
-    end_cycle - start_cycle, radsim_design.GetNodeModuleNames());
+    end_cycle - start_cycle, radsim_design->GetNodeModuleNames(), radsim_design->rad_id);
   std::cout << "Aggregate NoC BW = " << aggregate_bandwidths[0] / 1000000000 << " Gbps" << std::endl;
 
-  sc_stop();
+  //sc_stop();
+  this->radsim_design->set_rad_done(); //flag to replace sc_stop calls
+  return;
 }
 
 void mlp_driver::assign() {
diff --git a/rad-sim/example-designs/mlp/mlp_driver.hpp b/rad-sim/example-designs/mlp/mlp_driver.hpp
index 0d3dc11..5a01e54 100644
--- a/rad-sim/example-designs/mlp/mlp_driver.hpp
+++ b/rad-sim/example-designs/mlp/mlp_driver.hpp
@@ -18,6 +18,7 @@ class mlp_driver : public sc_module {
   std::vector num_mvms;
   std::vector>> test_inputs;
   std::vector> golden_outputs;
+  RADSimDesignContext* radsim_design;
 
 public:
   sc_in clk;
@@ -31,7 +32,7 @@ class mlp_driver : public sc_module {
   sc_out collector_fifo_ren;
   sc_in>> collector_fifo_rdata;
 
-  mlp_driver(const sc_module_name& name);
+  mlp_driver(const sc_module_name& name, RADSimDesignContext* radsim_design_);
   ~mlp_driver();
 
   void source();
diff --git a/rad-sim/example-designs/mlp/mlp_system.cpp b/rad-sim/example-designs/mlp/mlp_system.cpp
index 47496c5..66be09f 100644
--- a/rad-sim/example-designs/mlp/mlp_system.cpp
+++ b/rad-sim/example-designs/mlp/mlp_system.cpp
@@ -1,10 +1,10 @@
 #include "mlp_system.hpp"
 
-mlp_system::mlp_system(const sc_module_name& name, sc_clock* driver_clk_sig) :
+mlp_system::mlp_system(const sc_module_name& name, sc_clock* driver_clk_sig, RADSimDesignContext* radsim_design) :
   sc_module(name) {
 
   // Parse design configuration (number of layers and number of MVMs per layer)
-  std::string design_root_dir = radsim_config.GetStringKnob("radsim_user_design_root_dir");
+  std::string design_root_dir = radsim_config.GetStringKnobPerRad("radsim_user_design_root_dir", radsim_design->rad_id);
   std::string design_config_filename = design_root_dir + "/compiler/layer_mvm_config";
   std::ifstream design_config_file(design_config_filename);
   if(!design_config_file) {
@@ -29,7 +29,7 @@ mlp_system::mlp_system(const sc_module_name& name, sc_clock* driver_clk_sig) :
   init_vector>>>::init_sc_vector(dispatcher_fifo_wdata_signal, num_mvms[0]);
 
   // Instantiate driver
-  mlp_driver_inst = new mlp_driver("mlp_driver");
+  mlp_driver_inst = new mlp_driver("mlp_driver", radsim_design);
   mlp_driver_inst->clk(*driver_clk_sig);
   mlp_driver_inst->rst(rst_sig);
   mlp_driver_inst->dispatcher_fifo_rdy(dispatcher_fifo_rdy_signal);
@@ -40,7 +40,7 @@ mlp_system::mlp_system(const sc_module_name& name, sc_clock* driver_clk_sig) :
   mlp_driver_inst->collector_fifo_rdata(collector_fifo_rdata_signal);
 
   // Instantiate design top-level
-  mlp_inst = new mlp_top("mlp_top");
+  mlp_inst = new mlp_top("mlp_top", radsim_design);
   mlp_inst->rst(rst_sig);
   mlp_inst->dispatcher_fifo_rdy(dispatcher_fifo_rdy_signal);
   mlp_inst->dispatcher_fifo_wen(dispatcher_fifo_wen_signal);
@@ -48,6 +48,9 @@ mlp_system::mlp_system(const sc_module_name& name, sc_clock* driver_clk_sig) :
   mlp_inst->collector_fifo_rdy(collector_fifo_rdy_signal);
   mlp_inst->collector_fifo_ren(collector_fifo_ren_signal);
   mlp_inst->collector_fifo_rdata(collector_fifo_rdata_signal);
+
+  //add _top as dut instance for parent class design_system
+  this->design_dut_inst = mlp_inst;
 }
 
 mlp_system::~mlp_system() {
diff --git a/rad-sim/example-designs/mlp/mlp_system.hpp b/rad-sim/example-designs/mlp/mlp_system.hpp
index a8dd8cd..50be2b0 100644
--- a/rad-sim/example-designs/mlp/mlp_system.hpp
+++ b/rad-sim/example-designs/mlp/mlp_system.hpp
@@ -6,8 +6,9 @@
 #include
 #include
 #include
+#include
 
-class mlp_system : public sc_module {
+class mlp_system : public RADSimDesignSystem {
 private:
   sc_vector> dispatcher_fifo_rdy_signal;
   sc_vector> dispatcher_fifo_wen_signal;
@@ -22,6 +23,6 @@ class mlp_system : public sc_module {
   mlp_driver* mlp_driver_inst;
   mlp_top* mlp_inst;
 
-  mlp_system(const sc_module_name& name, sc_clock* driver_clk_sig);
+  mlp_system(const sc_module_name& name, sc_clock* driver_clk_sig, RADSimDesignContext* radsim_design);
   ~mlp_system();
 };
\ No newline at end of file
diff --git a/rad-sim/example-designs/mlp/mlp_top.cpp b/rad-sim/example-designs/mlp/mlp_top.cpp
index 6d762c3..97316b8 100644
--- a/rad-sim/example-designs/mlp/mlp_top.cpp
+++ b/rad-sim/example-designs/mlp/mlp_top.cpp
@@ -1,9 +1,9 @@
 #include
 
-mlp_top::mlp_top(const sc_module_name &name) : sc_module(name) {
-
+mlp_top::mlp_top(const sc_module_name &name, RADSimDesignContext* radsim_design) : RADSimDesignTop(radsim_design) {
+  this->radsim_design = radsim_design;
   std::string design_root_dir =
-      radsim_config.GetStringKnob("radsim_user_design_root_dir");
+      radsim_config.GetStringKnobPerRad("radsim_user_design_root_dir", radsim_design->rad_id);
   std::string design_config_filename =
       design_root_dir + "/compiler/layer_mvm_config";
 
@@ -42,13 +42,13 @@ mlp_top::mlp_top(const sc_module_name &name) : sc_module(name) {
           design_root_dir + "/compiler/inst_mifs/" + module_name_str + ".mif";
       std::strcpy(module_name, module_name_str.c_str());
       matrix_vector_engines[layer_id][mvm_id] =
-          new mvm(module_name, mvm_id, layer_id, inst_filename);
+          new mvm(module_name, mvm_id, layer_id, inst_filename, radsim_design);
       matrix_vector_engines[layer_id][mvm_id]->rst(rst);
 
       if (layer_id == 0) {
         module_name_str = "input_dispatcher" + std::to_string(mvm_id);
         std::strcpy(module_name, module_name_str.c_str());
-        input_dispatchers[mvm_id] = new dispatcher(module_name, mvm_id);
+        input_dispatchers[mvm_id] = new dispatcher(module_name, mvm_id, radsim_design);
         input_dispatchers[mvm_id]->rst(rst);
         input_dispatchers[mvm_id]->data_fifo_rdy(dispatcher_fifo_rdy[mvm_id]);
         input_dispatchers[mvm_id]->data_fifo_wen(dispatcher_fifo_wen[mvm_id]);
@@ -60,15 +60,16 @@ mlp_top::mlp_top(const sc_module_name &name) : sc_module(name) {
 
   module_name_str = "output_collector";
   std::strcpy(module_name, module_name_str.c_str());
-  output_collector = new collector(module_name);
+  output_collector = new collector(module_name, radsim_design);
   output_collector->rst(rst);
   output_collector->data_fifo_rdy(collector_fifo_rdy);
   output_collector->data_fifo_ren(collector_fifo_ren);
   output_collector->data_fifo_rdata(collector_fifo_rdata);
 
-  radsim_design.BuildDesignContext("mlp.place", "mlp.clks");
-  radsim_design.CreateSystemNoCs(rst);
-  radsim_design.ConnectModulesToNoC();
+  this->connectPortalReset(&rst);
+  radsim_design->BuildDesignContext("mlp.place", "mlp.clks");
+  radsim_design->CreateSystemNoCs(rst);
+  radsim_design->ConnectModulesToNoC();
 }
 
 mlp_top::~mlp_top() {
diff --git a/rad-sim/example-designs/mlp/mlp_top.hpp b/rad-sim/example-designs/mlp/mlp_top.hpp
index ef92126..96f3835 100644
--- a/rad-sim/example-designs/mlp/mlp_top.hpp
+++ b/rad-sim/example-designs/mlp/mlp_top.hpp
@@ -8,13 +8,14 @@
 #include
 #include
 #include
+#include
 
-
-class mlp_top : public sc_module {
+class mlp_top : public RADSimDesignTop {
 private:
   std::vector> matrix_vector_engines;
   std::vector input_dispatchers;
   collector* output_collector;
+  RADSimDesignContext* radsim_design;
 
 public:
   sc_in rst;
@@ -27,7 +28,7 @@ class mlp_top : public sc_module {
   sc_in collector_fifo_ren;
   sc_out>> collector_fifo_rdata;
 
-  mlp_top(const sc_module_name& name);
+  mlp_top(const sc_module_name& name, RADSimDesignContext* radsim_design);
   ~mlp_top();
   void prepare_adapters_info();
 };
\ No newline at end of file
diff --git a/rad-sim/example-designs/mlp/modules/collector.cpp b/rad-sim/example-designs/mlp/modules/collector.cpp
index 410e7c2..5b499d9 100644
--- a/rad-sim/example-designs/mlp/modules/collector.cpp
+++ b/rad-sim/example-designs/mlp/modules/collector.cpp
@@ -1,9 +1,10 @@
 #include "collector.hpp"
 
-collector::collector(const sc_module_name &name)
-    : RADSimModule(name), rst("rst"), data_fifo_rdy("data_fifo_rdy"),
+collector::collector(const sc_module_name &name, RADSimDesignContext* radsim_design)
+    : RADSimModule(name, radsim_design), rst("rst"), data_fifo_rdy("data_fifo_rdy"),
       data_fifo_ren("data_fifo_ren"), data_fifo_rdata("data_fifo_rdata") {
 
+  this->radsim_design = radsim_design;
   module_name = name;
 
   char fifo_name[25];
diff --git a/rad-sim/example-designs/mlp/modules/collector.hpp b/rad-sim/example-designs/mlp/modules/collector.hpp
index 4efab5a..d3cf8fc 100644
--- a/rad-sim/example-designs/mlp/modules/collector.hpp
+++ b/rad-sim/example-designs/mlp/modules/collector.hpp
@@ -20,13 +20,14 @@ class collector : public RADSimModule {
       data_fifo_almost_full_signal, data_fifo_almost_empty_signal;
 
 public:
+  RADSimDesignContext* radsim_design;
   sc_in rst;
   sc_out data_fifo_rdy;
   sc_in data_fifo_ren;
   sc_out>> data_fifo_rdata;
   axis_slave_port rx_interface;
 
-  collector(const sc_module_name& name);
+  collector(const sc_module_name& name, RADSimDesignContext* radsim_design);
   ~collector();
 
   void Assign();
diff --git a/rad-sim/example-designs/mlp/modules/dispatcher.cpp b/rad-sim/example-designs/mlp/modules/dispatcher.cpp
index aa1e590..6f37642 100644
--- a/rad-sim/example-designs/mlp/modules/dispatcher.cpp
+++ b/rad-sim/example-designs/mlp/modules/dispatcher.cpp
@@ -1,9 +1,10 @@
 #include "dispatcher.hpp"
 
-dispatcher::dispatcher(const sc_module_name &name, unsigned int id)
-    : RADSimModule(name), rst("rst"), data_fifo_rdy("data_fifo_rdy"),
+dispatcher::dispatcher(const sc_module_name &name, unsigned int id, RADSimDesignContext* radsim_design)
+    : RADSimModule(name, radsim_design), rst("rst"), data_fifo_rdy("data_fifo_rdy"),
       data_fifo_wen("data_fifo_wen"), data_fifo_wdata("data_fifo_wdata") {
 
+  this->radsim_design = radsim_design;
   module_name = name;
   dispatcher_id = id;
 
@@ -52,7 +53,12 @@ void dispatcher::Assign() {
     tx_interface.tid.write(0);
     std::string dest_name =
         "layer0_mvm" + std::to_string(dispatcher_id) + ".rx_interface";
-    tx_interface.tdest.write(radsim_design.GetPortDestinationID(dest_name));
+    unsigned int dest_id = radsim_design->GetPortDestinationID(dest_name);
+    sc_bv dest_id_concat;
+    DEST_REMOTE_NODE(dest_id_concat) = 0; //bc staying on same RAD
+    DEST_LOCAL_NODE(dest_id_concat) = dest_id;
+    DEST_RAD(dest_id_concat) = radsim_design->rad_id;
+    tx_interface.tdest.write(dest_id_concat); //radsim_design->GetPortDestinationID(dest_name));
     // std::cout << "Dispatcher " << dispatcher_id << " pushed data into the
     // NoC with dest "
     // << radsim_design.GetPortDestinationID(dest_name) << "!" << std::endl;
diff --git a/rad-sim/example-designs/mlp/modules/dispatcher.hpp b/rad-sim/example-designs/mlp/modules/dispatcher.hpp
index 32aa957..2901390 100644
--- a/rad-sim/example-designs/mlp/modules/dispatcher.hpp
+++ b/rad-sim/example-designs/mlp/modules/dispatcher.hpp
@@ -21,13 +21,14 @@ class dispatcher : public RADSimModule {
       data_fifo_almost_full_signal, data_fifo_almost_empty_signal;
 
 public:
+  RADSimDesignContext* radsim_design;
   sc_in rst;
   sc_out data_fifo_rdy;
   sc_in data_fifo_wen;
   sc_in>> data_fifo_wdata;
   axis_master_port tx_interface;
 
-  dispatcher(const sc_module_name& name, unsigned int id);
+  dispatcher(const sc_module_name& name, unsigned int id, RADSimDesignContext* radsim_design);
   ~dispatcher();
 
   void Assign();
diff --git a/rad-sim/example-designs/mlp/modules/mvm.cpp b/rad-sim/example-designs/mlp/modules/mvm.cpp
index 29bfabf..c412915 100644
--- a/rad-sim/example-designs/mlp/modules/mvm.cpp
+++ b/rad-sim/example-designs/mlp/modules/mvm.cpp
@@ -39,8 +39,8 @@ bool ParseInstructions(std::vector &inst_mem,
 }
 
 mvm::mvm(const sc_module_name &name, unsigned int id_mvm, unsigned int id_layer,
-         const std::string &inst_filename)
-    : RADSimModule(name), matrix_mem_rdata("matrix_mem_rdata", DOT_PRODUCTS),
+         const std::string &inst_filename, RADSimDesignContext* radsim_design)
+    : RADSimModule(name, radsim_design), matrix_mem_rdata("matrix_mem_rdata", DOT_PRODUCTS),
       matrix_mem_wen("matrix_mem_wen", DOT_PRODUCTS),
       ififo_pipeline("ififo_pipeline", RF_RD_LATENCY),
       reduce_pipeline("reduce_pipeline", RF_RD_LATENCY),
@@ -54,6 +54,7 @@ mvm::mvm(const sc_module_name &name, unsigned int id_mvm, unsigned int id_layer,
       dest_mvm_pipeline("mvm_layer_pipeline", COMPUTE_LATENCY + RF_RD_LATENCY),
       tdata_vec(LANES), result(DOT_PRODUCTS), rst("rst") {
 
+  this->radsim_design = radsim_design;
   module_name = name;
   mvm_id = id_mvm;
   layer_id = id_layer;
@@ -71,7 +72,7 @@ mvm::mvm(const sc_module_name &name, unsigned int id_mvm, unsigned int id_layer,
   std::string mem_name_str;
   matrix_memory.resize(DOT_PRODUCTS);
   std::string mvm_dir =
-      radsim_config.GetStringKnob("radsim_user_design_root_dir");
+      radsim_config.GetStringKnobPerRad("radsim_user_design_root_dir", radsim_design->rad_id);
   std::string mem_init_file;
   for (unsigned int dot_id = 0; dot_id < DOT_PRODUCTS; dot_id++) {
     mem_init_file = mvm_dir + "/compiler/weight_mifs/layer" +
@@ -468,7 +469,7 @@ void mvm::Assign() {
       dest_name = "layer" + std::to_string(dest_layer_int - 1) + "_mvm" +
                   std::to_string(dest_mvm_int) + ".rx_interface";
     }
-    dest_id = radsim_design.GetPortDestinationID(dest_name);
+    dest_id = radsim_design->GetPortDestinationID(dest_name);
 
     unsigned int dest_interface;
     // If destination is the same layer, send to reduce FIFO
@@ -495,7 +496,11 @@ void mvm::Assign() {
     tx_interface.tdata.write(tx_tdata_bv);
     tx_interface.tvalid.write(!ofifo_empty_signal);
     tx_interface.tuser.write(dest_interface);
-    tx_interface.tdest.write(dest_id);
+    sc_bv dest_id_concat;
+    DEST_REMOTE_NODE(dest_id_concat) = 0; //bc staying on same RAD
+    DEST_LOCAL_NODE(dest_id_concat) = dest_id;
+    DEST_RAD(dest_id_concat) = radsim_design->rad_id;
+    tx_interface.tdest.write(dest_id_concat); //dest_id);
     /*if (dest_interface == 2 << 13 && !ofifo_empty_signal) {
       std::cout << "Sending to reduce FIFO" << std::endl;
       std::cout << tx_tdata << std::endl;
diff --git a/rad-sim/example-designs/mlp/modules/mvm.hpp b/rad-sim/example-designs/mlp/modules/mvm.hpp
index 3969936..63ca169 100644
--- a/rad-sim/example-designs/mlp/modules/mvm.hpp
+++ b/rad-sim/example-designs/mlp/modules/mvm.hpp
@@ -96,11 +96,12 @@ class mvm : public RADSimModule {
   sc_signal dot_op, dot_reduce_op;
 
 public:
+  RADSimDesignContext* radsim_design;
   sc_in rst;
   axis_slave_port rx_interface;
   axis_master_port tx_interface;
 
-  mvm(const sc_module_name& name, unsigned int id_mvm, unsigned int id_layer, const std::string& inst_filename);
+  mvm(const sc_module_name& name, unsigned int id_mvm, unsigned int id_layer, const std::string& inst_filename, RADSimDesignContext* radsim_design);
   ~mvm();
 
   void Assign();
diff --git a/rad-sim/example-designs/mlp_int8/CMakeLists.txt b/rad-sim/example-designs/mlp_int8/CMakeLists.txt
index 99593ca..a517787 100644
--- a/rad-sim/example-designs/mlp_int8/CMakeLists.txt
+++ b/rad-sim/example-designs/mlp_int8/CMakeLists.txt
@@ -54,5 +54,5 @@ set(hdrfiles
 
 add_compile_options(-Wall -Wextra -pedantic)
 
-add_library(design STATIC ${srcfiles} ${hdrfiles})
-target_link_libraries(design PUBLIC SystemC::systemc booksim noc rtl_designs)
+add_library(mlp_int8 STATIC ${srcfiles} ${hdrfiles})
+target_link_libraries(mlp_int8 PUBLIC SystemC::systemc booksim noc rtl_designs)
diff --git a/rad-sim/example-designs/mlp_int8/compiler/gen_testcase.py b/rad-sim/example-designs/mlp_int8/compiler/gen_testcase.py
index 9281263..ebcc060 100644
--- a/rad-sim/example-designs/mlp_int8/compiler/gen_testcase.py
+++ b/rad-sim/example-designs/mlp_int8/compiler/gen_testcase.py
@@ -76,6 +76,11 @@
 idx = idx + 1
 clocks_file.write('inst_loader 0 0\n')
 
+#WARNING: uncomment out if multi-rad design
+print('WARNING: if multi-rad mlp_int8 design, uncomment out lines 81-82 of gen_testcase.py')
+# placement_file.write('portal_inst 0 16 axis\n')
+# clocks_file.write('portal_inst 0 0\n')
+
 placement_file.close()
 clocks_file.close()
 
diff --git a/rad-sim/example-designs/mlp_int8/config.yml b/rad-sim/example-designs/mlp_int8/config.yml
index 45551bd..bbae812 100644
--- a/rad-sim/example-designs/mlp_int8/config.yml
+++ b/rad-sim/example-designs/mlp_int8/config.yml
@@ -27,11 +27,15 @@ noc_adapters:
   out_arbiter: ['priority_rr']
   vc_mapping: ['direct']
 
-design:
-  name: 'mlp_int8'
-  noc_placement: ['mlp.place']
-  clk_periods: [5.0]
+config rad1:
+  design:
+    name: 'mlp_int8'
+    noc_placement: ['mlp.place']
+    clk_periods: [5.0]
 
-telemetry:
-  log_verbosity: 2
-  traces: []
\ No newline at end of file
+cluster:
+  sim_driver_period: 5.0
+  telemetry_log_verbosity: 2
+  telemetry_traces: []
+  num_rads: 1
+  cluster_configs: ['rad1']
\ No newline at end of file
diff --git a/rad-sim/example-designs/mlp_int8/mlp.place b/rad-sim/example-designs/mlp_int8/mlp.place
index bcbf43b..0f28d14 100644
--- a/rad-sim/example-designs/mlp_int8/mlp.place
+++ b/rad-sim/example-designs/mlp_int8/mlp.place
@@ -1,16 +1,16 @@
-layer0_mvm0 0 3 axis
-layer0_mvm1 0 1 axis
-layer0_mvm2 0 0 axis
+layer0_mvm0 0 13 axis
+layer0_mvm1 0 6 axis
+layer0_mvm2 0 1 axis
 layer1_mvm0 0 2 axis
-layer1_mvm1 0 5 axis
-layer1_mvm2 0 6 axis
+layer1_mvm1 0 14 axis
+layer1_mvm2 0 0 axis
 layer2_mvm0 0 9 axis
-layer2_mvm1 0 11 axis
-layer3_mvm0 0 14 axis
-layer3_mvm1 0 10 axis
-input_dispatcher0 0 13 axis
-input_dispatcher1 0 4 axis
-input_dispatcher2 0 12 axis
-output_collector 0 7 axis
-weight_loader 0 8 axis
-inst_loader 0 15 axis
+layer2_mvm1 0 3 axis
+layer3_mvm0 0 4 axis
+layer3_mvm1 0 12 axis
+input_dispatcher0 0 15 axis
+input_dispatcher1 0 11 axis
+input_dispatcher2 0 7 axis
+output_collector 0 5 axis
+weight_loader 0 10 axis
+inst_loader 0 8 axis
diff --git a/rad-sim/example-designs/mlp_int8/mlp_driver.cpp b/rad-sim/example-designs/mlp_int8/mlp_driver.cpp
index 95130df..53f81d7 100644
--- a/rad-sim/example-designs/mlp_int8/mlp_driver.cpp
+++ b/rad-sim/example-designs/mlp_int8/mlp_driver.cpp
@@ -3,9 +3,9 @@
 bool ParseWeights(std::vector>& weights,
   std::vector& rf_ids, std::vector& rf_addrs,
   std::vector& layer_ids, std::vector& mvm_ids,
-  unsigned int num_layers, std::vector& num_mvms) {
+  unsigned int num_layers, std::vector& num_mvms, unsigned int _rad_id) {
 
-  std::string design_root_dir = radsim_config.GetStringKnob("radsim_user_design_root_dir");
+  std::string design_root_dir = radsim_config.GetStringKnobPerRad("radsim_user_design_root_dir", _rad_id);
   for (unsigned int l = 0; l < num_layers; l++) {
     for (unsigned int m = 0; m < num_mvms[l]; m++) {
       for (unsigned int d = 0; d < DPES; d++) {
@@ -43,8 +43,8 @@ bool ParseWeights(std::vector>& weights,
 
 bool ParseInstructions(std::vector &insts,
   std::vector& layer_ids, std::vector& mvm_ids,
-  unsigned int num_layers, std::vector& num_mvms) {
-  std::string design_root_dir = radsim_config.GetStringKnob("radsim_user_design_root_dir");
+  unsigned int num_layers, std::vector& num_mvms, unsigned int _rad_id) {
+  std::string design_root_dir = radsim_config.GetStringKnobPerRad("radsim_user_design_root_dir", _rad_id);
 
   for (unsigned int l = 0; l < num_layers; l++) {
     for (unsigned int m = 0; m < num_mvms[l]; m++) {
@@ -106,12 +106,13 @@ bool ParseIO(std::vector>& data_vec, std::string& io_filename)
   return true;
 }
 
-mlp_driver::mlp_driver(const sc_module_name& name) : sc_module(name) {
+mlp_driver::mlp_driver(const sc_module_name& name, RADSimDesignContext* radsim_design_) : sc_module(name) {
+  this->radsim_design = radsim_design_;
   start_cycle = 0;
   end_cycle = 0;
 
   // Parse design configuration (number of layers & number of MVM per layer)
-  std::string design_root_dir = radsim_config.GetStringKnob("radsim_user_design_root_dir");
+  std::string design_root_dir = radsim_config.GetStringKnobPerRad("radsim_user_design_root_dir", radsim_design->rad_id);
   std::string design_config_filename = design_root_dir + "/compiler/layer_mvm_config";
   std::ifstream design_config_file(design_config_filename);
   if (!design_config_file) {
@@ -144,11 +145,11 @@ mlp_driver::mlp_driver(const sc_module_name& name) : sc_module(name) {
 
   // Parse weights
   ParseWeights(weight_data, weight_rf_id, weight_rf_addr, weight_layer_id,
-    weight_mvm_id, num_layers, num_mvms_total);
+    weight_mvm_id, num_layers, num_mvms_total, radsim_design->rad_id);
   std::cout << "# Weight vectors = " << weight_data.size() << std::endl;
 
   // Parse instructions
-  ParseInstructions(inst_data, inst_layer_id, inst_mvm_id, num_layers, num_mvms_total);
+  ParseInstructions(inst_data, inst_layer_id, inst_mvm_id, num_layers, num_mvms_total, radsim_design->rad_id);
   std::cout << "# Instructions = " << inst_data.size() << std::endl;
 
   // Parse test inputs
@@ -258,7 +259,7 @@ void mlp_driver::source() {
     wait();
   }
 
-  start_cycle = GetSimulationCycle(radsim_config.GetDoubleKnob("sim_driver_period"));
+  start_cycle = GetSimulationCycle(radsim_config.GetDoubleKnobShared("sim_driver_period"));
   start_time = std::chrono::steady_clock::now();
   wait();
 
@@ -309,12 +310,12 @@ void mlp_driver::sink() {
   }
   if (mistake) {
     std::cout << "FAILURE - Some outputs NOT matching!" << std::endl;
-    radsim_design.ReportDesignFailure();
+    radsim_design->ReportDesignFailure();
   } else {
     std::cout << "SUCCESS - All outputs are matching!" << std::endl;
   }
 
-  end_cycle = GetSimulationCycle(radsim_config.GetDoubleKnob("sim_driver_period"));
+  end_cycle = GetSimulationCycle(radsim_config.GetDoubleKnobShared("sim_driver_period"));
   end_time = std::chrono::steady_clock::now();
   std::cout << "Simulation Cycles = " << end_cycle - start_cycle << std::endl;
   std::cout << "Simulation Time = " << std::chrono::duration_cast (end_time - start_time).count() << " ms" << std::endl;
@@ -322,10 +323,12 @@ void mlp_driver::sink() {
   NoCFlitTelemetry::DumpNoCFlitTracesToFile("flit_traces.csv");
 
   std::vector aggregate_bandwidths = NoCTransactionTelemetry::DumpTrafficFlows("traffic_flows",
-    end_cycle - start_cycle, radsim_design.GetNodeModuleNames());
+    end_cycle - start_cycle, radsim_design->GetNodeModuleNames(), radsim_design->rad_id);
   std::cout << "Aggregate NoC BW = " << aggregate_bandwidths[0] / 1000000000 << " Gbps" << std::endl;
 
-  sc_stop();
+  //sc_stop();
+  this->radsim_design->set_rad_done(); //flag to replace sc_stop calls
+  return;
 }
 
 void mlp_driver::assign() {
diff --git a/rad-sim/example-designs/mlp_int8/mlp_driver.hpp b/rad-sim/example-designs/mlp_int8/mlp_driver.hpp
index fd5b4d0..6490224 100644
--- a/rad-sim/example-designs/mlp_int8/mlp_driver.hpp
+++ b/rad-sim/example-designs/mlp_int8/mlp_driver.hpp
@@ -72,7 +72,9 @@ class mlp_driver : public sc_module {
   sc_out collector_fifo_ren;
   sc_in>> collector_fifo_rdata;
 
-  mlp_driver(const sc_module_name& name);
+  RADSimDesignContext* radsim_design;
+
+  mlp_driver(const sc_module_name& name,
RADSimDesignContext* radsim_design_); ~mlp_driver(); void source(); diff --git a/rad-sim/example-designs/mlp_int8/mlp_int8_system.cpp b/rad-sim/example-designs/mlp_int8/mlp_int8_system.cpp index bfd5286..7776e50 100644 --- a/rad-sim/example-designs/mlp_int8/mlp_int8_system.cpp +++ b/rad-sim/example-designs/mlp_int8/mlp_int8_system.cpp @@ -1,10 +1,10 @@ #include "mlp_int8_system.hpp" -mlp_int8_system::mlp_int8_system(const sc_module_name& name, sc_clock* driver_clk_sig) : +mlp_int8_system::mlp_int8_system(const sc_module_name& name, sc_clock* driver_clk_sig, RADSimDesignContext* radsim_design) : sc_module(name) { // Parse design configuration (number of layers and number of MVMs per layer) - std::string design_root_dir = radsim_config.GetStringKnob("radsim_user_design_root_dir"); + std::string design_root_dir = radsim_config.GetStringKnobPerRad("radsim_user_design_root_dir", radsim_design->rad_id); std::string design_config_filename = design_root_dir + "/compiler/layer_mvm_config"; std::ifstream design_config_file(design_config_filename); if (!design_config_file) { @@ -37,7 +37,7 @@ mlp_int8_system::mlp_int8_system(const sc_module_name& name, sc_clock* driver_cl init_vector>>>::init_sc_vector(dispatcher_fifo_wdata_signal, num_mvms_total[0]); // Instantiate driver - mlp_driver_inst = new mlp_driver("mlp_driver"); + mlp_driver_inst = new mlp_driver("mlp_driver", radsim_design); mlp_driver_inst->clk(*driver_clk_sig); mlp_driver_inst->rst(rst_sig); mlp_driver_inst->weight_loader_weight_fifo_rdy(weight_loader_weight_fifo_rdy_signal); @@ -72,7 +72,7 @@ mlp_int8_system::mlp_int8_system(const sc_module_name& name, sc_clock* driver_cl mlp_driver_inst->collector_fifo_rdata(collector_fifo_rdata_signal); // Instantiate design top-level - mlp_inst = new mlp_top("mlp_top"); + mlp_inst = new mlp_top("mlp_top", radsim_design); mlp_inst->rst(rst_sig); mlp_inst->weight_loader_weight_fifo_rdy(weight_loader_weight_fifo_rdy_signal); 
mlp_inst->weight_loader_weight_fifo_wen(weight_loader_weight_fifo_wen_signal); @@ -104,6 +104,9 @@ mlp_int8_system::mlp_int8_system(const sc_module_name& name, sc_clock* driver_cl mlp_inst->collector_fifo_rdy(collector_fifo_rdy_signal); mlp_inst->collector_fifo_ren(collector_fifo_ren_signal); mlp_inst->collector_fifo_rdata(collector_fifo_rdata_signal); + + //add _top as dut instance for parent class RADSimDesignSystem + this->design_dut_inst = mlp_inst; } mlp_int8_system::~mlp_int8_system() { diff --git a/rad-sim/example-designs/mlp_int8/mlp_int8_system.hpp b/rad-sim/example-designs/mlp_int8/mlp_int8_system.hpp index 5e020a9..469a4af 100644 --- a/rad-sim/example-designs/mlp_int8/mlp_int8_system.hpp +++ b/rad-sim/example-designs/mlp_int8/mlp_int8_system.hpp @@ -5,8 +5,9 @@ #include #include #include +#include -class mlp_int8_system : public sc_module { +class mlp_int8_system : public RADSimDesignSystem { private: std::vector num_mvms_sysc; std::vector num_mvms_rtl; @@ -51,6 +52,6 @@ class mlp_int8_system : public sc_module { mlp_driver* mlp_driver_inst; mlp_top* mlp_inst; - mlp_int8_system(const sc_module_name& name, sc_clock* driver_clk_sig); + mlp_int8_system(const sc_module_name& name, sc_clock* driver_clk_sig, RADSimDesignContext* radsim_design); ~mlp_int8_system(); }; \ No newline at end of file diff --git a/rad-sim/example-designs/mlp_int8/mlp_top.cpp b/rad-sim/example-designs/mlp_int8/mlp_top.cpp index 8b3bd16..27e8fd7 100644 --- a/rad-sim/example-designs/mlp_int8/mlp_top.cpp +++ b/rad-sim/example-designs/mlp_int8/mlp_top.cpp @@ -1,9 +1,10 @@ #include -mlp_top::mlp_top(const sc_module_name &name) : sc_module(name) { +mlp_top::mlp_top(const sc_module_name &name, RADSimDesignContext* radsim_design) : RADSimDesignTop(radsim_design) { + this->radsim_design = radsim_design; std::string design_root_dir = - radsim_config.GetStringKnob("radsim_user_design_root_dir"); + radsim_config.GetStringKnobPerRad("radsim_user_design_root_dir", radsim_design->rad_id); 
std::string design_config_filename = design_root_dir + "/compiler/layer_mvm_config"; @@ -50,7 +51,7 @@ mlp_top::mlp_top(const sc_module_name &name) : sc_module(name) { "layer" + std::to_string(layer_id) + "_mvm" + std::to_string(mvm_id); std::strcpy(module_name, module_name_str.c_str()); sysc_matrix_vector_engines[layer_id][mvm_id] = - new sysc_mvm(module_name, mvm_id, layer_id); + new sysc_mvm(module_name, mvm_id, layer_id, radsim_design); sysc_matrix_vector_engines[layer_id][mvm_id]->rst(rst); } for (unsigned int mvm_id = 0; mvm_id < num_mvms_rtl[layer_id]; mvm_id++) { @@ -58,7 +59,7 @@ mlp_top::mlp_top(const sc_module_name &name) : sc_module(name) { "layer" + std::to_string(layer_id) + "_mvm" + std::to_string(mvm_id + num_mvms_sysc[layer_id]); std::strcpy(module_name, module_name_str.c_str()); rtl_matrix_vector_engines[layer_id][mvm_id] = - new rtl_mvm(module_name); + new rtl_mvm(module_name, radsim_design); rtl_matrix_vector_engines[layer_id][mvm_id]->rst(rst); } } @@ -66,7 +67,7 @@ mlp_top::mlp_top(const sc_module_name &name) : sc_module(name) { for (unsigned int mvm_id = 0; mvm_id < num_mvms_total[0]; mvm_id++) { module_name_str = "input_dispatcher" + std::to_string(mvm_id); std::strcpy(module_name, module_name_str.c_str()); - input_dispatchers[mvm_id] = new dispatcher(module_name, mvm_id); + input_dispatchers[mvm_id] = new dispatcher(module_name, mvm_id, radsim_design); input_dispatchers[mvm_id]->rst(rst); input_dispatchers[mvm_id]->data_fifo_rdy(dispatcher_fifo_rdy[mvm_id]); input_dispatchers[mvm_id]->data_fifo_wen(dispatcher_fifo_wen[mvm_id]); @@ -76,7 +77,7 @@ mlp_top::mlp_top(const sc_module_name &name) : sc_module(name) { module_name_str = "output_collector"; std::strcpy(module_name, module_name_str.c_str()); - output_collector = new collector(module_name); + output_collector = new collector(module_name, radsim_design); output_collector->rst(rst); output_collector->data_fifo_rdy(collector_fifo_rdy); output_collector->data_fifo_ren(collector_fifo_ren); 
@@ -84,7 +85,7 @@ mlp_top::mlp_top(const sc_module_name &name) : sc_module(name) { module_name_str = "weight_loader"; std::strcpy(module_name, module_name_str.c_str()); - wloader = new weight_loader(module_name); + wloader = new weight_loader(module_name, radsim_design); wloader->rst(rst); wloader->weight_fifo_rdy(weight_loader_weight_fifo_rdy); wloader->weight_fifo_wen(weight_loader_weight_fifo_wen); @@ -104,7 +105,7 @@ mlp_top::mlp_top(const sc_module_name &name) : sc_module(name) { module_name_str = "inst_loader"; std::strcpy(module_name, module_name_str.c_str()); - iloader = new inst_loader(module_name); + iloader = new inst_loader(module_name, radsim_design); iloader->rst(rst); iloader->inst_fifo_rdy(inst_loader_inst_fifo_rdy); iloader->inst_fifo_wen(inst_loader_inst_fifo_wen); @@ -116,9 +117,10 @@ mlp_top::mlp_top(const sc_module_name &name) : sc_module(name) { iloader->mvm_id_fifo_wen(inst_loader_mvm_id_fifo_wen); iloader->mvm_id_fifo_wdata(inst_loader_mvm_id_fifo_wdata); - radsim_design.BuildDesignContext("mlp.place", "mlp.clks"); - radsim_design.CreateSystemNoCs(rst); - radsim_design.ConnectModulesToNoC(); + this->connectPortalReset(&rst); + radsim_design->BuildDesignContext("mlp.place", "mlp.clks"); + radsim_design->CreateSystemNoCs(rst); + radsim_design->ConnectModulesToNoC(); } mlp_top::~mlp_top() { diff --git a/rad-sim/example-designs/mlp_int8/mlp_top.hpp b/rad-sim/example-designs/mlp_int8/mlp_top.hpp index 00704d5..a41c0de 100644 --- a/rad-sim/example-designs/mlp_int8/mlp_top.hpp +++ b/rad-sim/example-designs/mlp_int8/mlp_top.hpp @@ -11,9 +11,10 @@ #include #include #include +#include -class mlp_top : public sc_module { +class mlp_top : public RADSimDesignTop { private: std::vector> rtl_matrix_vector_engines; std::vector> sysc_matrix_vector_engines; @@ -24,6 +25,7 @@ class mlp_top : public sc_module { collector* output_collector; weight_loader* wloader; inst_loader* iloader; + RADSimDesignContext* radsim_design; public: sc_in rst; @@ -62,7 +64,7 @@ 
class mlp_top : public sc_module { sc_in collector_fifo_ren; sc_out>> collector_fifo_rdata; - mlp_top(const sc_module_name& name); + mlp_top(const sc_module_name& name, RADSimDesignContext* radsim_design); ~mlp_top(); void prepare_adapters_info(); }; \ No newline at end of file diff --git a/rad-sim/example-designs/mlp_int8/modules/collector.cpp b/rad-sim/example-designs/mlp_int8/modules/collector.cpp index 266017d..c74f8ef 100644 --- a/rad-sim/example-designs/mlp_int8/modules/collector.cpp +++ b/rad-sim/example-designs/mlp_int8/modules/collector.cpp @@ -1,9 +1,10 @@ #include "collector.hpp" -collector::collector(const sc_module_name &name) - : RADSimModule(name), rst("rst"), data_fifo_rdy("data_fifo_rdy"), +collector::collector(const sc_module_name &name, RADSimDesignContext* radsim_design) + : RADSimModule(name, radsim_design), rst("rst"), data_fifo_rdy("data_fifo_rdy"), data_fifo_ren("data_fifo_ren"), data_fifo_rdata("data_fifo_rdata") { + this->radsim_design = radsim_design; module_name = name; char fifo_name[25]; diff --git a/rad-sim/example-designs/mlp_int8/modules/collector.hpp b/rad-sim/example-designs/mlp_int8/modules/collector.hpp index 91c402b..d83308b 100644 --- a/rad-sim/example-designs/mlp_int8/modules/collector.hpp +++ b/rad-sim/example-designs/mlp_int8/modules/collector.hpp @@ -19,13 +19,14 @@ class collector : public RADSimModule { data_fifo_almost_full_signal; public: + RADSimDesignContext* radsim_design; sc_in rst; sc_out data_fifo_rdy; sc_in data_fifo_ren; sc_out>> data_fifo_rdata; axis_slave_port rx_interface; - collector(const sc_module_name& name); + collector(const sc_module_name& name, RADSimDesignContext* radsim_design); ~collector(); void Assign(); diff --git a/rad-sim/example-designs/mlp_int8/modules/dispatcher.cpp b/rad-sim/example-designs/mlp_int8/modules/dispatcher.cpp index 90382a7..f685c93 100644 --- a/rad-sim/example-designs/mlp_int8/modules/dispatcher.cpp +++ b/rad-sim/example-designs/mlp_int8/modules/dispatcher.cpp @@ -1,9 +1,11 
@@ #include "dispatcher.hpp" -dispatcher::dispatcher(const sc_module_name &name, unsigned int id) - : RADSimModule(name), rst("rst"), data_fifo_rdy("data_fifo_rdy"), +dispatcher::dispatcher(const sc_module_name &name, unsigned int id, RADSimDesignContext* radsim_design) + : RADSimModule(name, radsim_design), rst("rst"), data_fifo_rdy("data_fifo_rdy"), data_fifo_wen("data_fifo_wen"), data_fifo_wdata("data_fifo_wdata") { + this->radsim_design = radsim_design; + module_name = name; dispatcher_id = id; @@ -50,7 +52,7 @@ void dispatcher::Assign() { tx_interface.tuser.write(2 << 9); tx_interface.tid.write(0); std::string dest_name = "layer0_mvm" + std::to_string(dispatcher_id) + ".axis_rx"; - tx_interface.tdest.write(radsim_design.GetPortDestinationID(dest_name)); + tx_interface.tdest.write(radsim_design->GetPortDestinationID(dest_name)); } else { tx_interface.tvalid.write(false); } diff --git a/rad-sim/example-designs/mlp_int8/modules/dispatcher.hpp b/rad-sim/example-designs/mlp_int8/modules/dispatcher.hpp index a706a9e..dcc1511 100644 --- a/rad-sim/example-designs/mlp_int8/modules/dispatcher.hpp +++ b/rad-sim/example-designs/mlp_int8/modules/dispatcher.hpp @@ -22,13 +22,14 @@ class dispatcher : public RADSimModule { data_fifo_almost_full_signal; public: + RADSimDesignContext* radsim_design; sc_in rst; sc_out data_fifo_rdy; sc_in data_fifo_wen; sc_in>> data_fifo_wdata; axis_master_port tx_interface; - dispatcher(const sc_module_name& name, unsigned int id); + dispatcher(const sc_module_name& name, unsigned int id, RADSimDesignContext* radsim_design); ~dispatcher(); void Assign(); diff --git a/rad-sim/example-designs/mlp_int8/modules/inst_loader.cpp b/rad-sim/example-designs/mlp_int8/modules/inst_loader.cpp index 91a88e4..2643641 100644 --- a/rad-sim/example-designs/mlp_int8/modules/inst_loader.cpp +++ b/rad-sim/example-designs/mlp_int8/modules/inst_loader.cpp @@ -1,7 +1,7 @@ #include "inst_loader.hpp" -inst_loader::inst_loader(const sc_module_name &name) - : 
RADSimModule(name), +inst_loader::inst_loader(const sc_module_name &name, RADSimDesignContext* radsim_design) + : RADSimModule(name, radsim_design), rst("rst"), inst_fifo_rdy("inst_fifo_rdy"), inst_fifo_wen("inst_fifo_wen"), @@ -13,6 +13,7 @@ inst_loader::inst_loader(const sc_module_name &name) mvm_id_fifo_wen("mvm_id_fifo_wen"), mvm_id_fifo_wdata("mvm_id_fifo_wdata") { + this->radsim_design = radsim_design; module_name = name; char fifo_name[25]; @@ -89,7 +90,7 @@ void inst_loader::Assign() { std::string dest_name = "layer" + std::to_string(layer_id_fifo_odata.read()) + "_mvm" + std::to_string(mvm_id_fifo_odata.read()) + ".axis_rx"; - tx_interface.tdest.write(radsim_design.GetPortDestinationID(dest_name)); + tx_interface.tdest.write(radsim_design->GetPortDestinationID(dest_name)); } tx_interface.tvalid.write(!inst_fifo_empty.read()); diff --git a/rad-sim/example-designs/mlp_int8/modules/inst_loader.hpp b/rad-sim/example-designs/mlp_int8/modules/inst_loader.hpp index 9bee446..8b0efd8 100644 --- a/rad-sim/example-designs/mlp_int8/modules/inst_loader.hpp +++ b/rad-sim/example-designs/mlp_int8/modules/inst_loader.hpp @@ -29,6 +29,7 @@ class inst_loader : public RADSimModule { sc_signal mvm_id_fifo_pop, mvm_id_fifo_full, mvm_id_fifo_empty, mvm_id_fifo_almost_full; public: + RADSimDesignContext* radsim_design; sc_in rst; sc_out inst_fifo_rdy; sc_in inst_fifo_wen; @@ -41,7 +42,7 @@ class inst_loader : public RADSimModule { sc_in mvm_id_fifo_wdata; axis_master_port tx_interface; - inst_loader(const sc_module_name& name); + inst_loader(const sc_module_name& name, RADSimDesignContext* radsim_design); ~inst_loader(); void Assign(); diff --git a/rad-sim/example-designs/mlp_int8/modules/rtl_mvm.cpp b/rad-sim/example-designs/mlp_int8/modules/rtl_mvm.cpp index 424c012..1e8e792 100644 --- a/rad-sim/example-designs/mlp_int8/modules/rtl_mvm.cpp +++ b/rad-sim/example-designs/mlp_int8/modules/rtl_mvm.cpp @@ -1,6 +1,7 @@ #include -rtl_mvm::rtl_mvm(const sc_module_name &name) : 
RADSimModule(name) { +rtl_mvm::rtl_mvm(const sc_module_name &name, RADSimDesignContext* radsim_design) : RADSimModule(name, radsim_design) { + this->radsim_design = radsim_design; char vrtl_mvm_name[25]; std::string vrtl_mvm_name_str = std::string(name); std::strcpy(vrtl_mvm_name, vrtl_mvm_name_str.c_str()); diff --git a/rad-sim/example-designs/mlp_int8/modules/rtl_mvm.hpp b/rad-sim/example-designs/mlp_int8/modules/rtl_mvm.hpp index 4fc26d8..bc4420a 100644 --- a/rad-sim/example-designs/mlp_int8/modules/rtl_mvm.hpp +++ b/rad-sim/example-designs/mlp_int8/modules/rtl_mvm.hpp @@ -14,12 +14,13 @@ class rtl_mvm : public RADSimModule { Vrtl_mvm* vrtl_mvm; public: + RADSimDesignContext* radsim_design; sc_in rst; axis_slave_port axis_rx; axis_master_port axis_tx; - rtl_mvm(const sc_module_name &name); + rtl_mvm(const sc_module_name &name, RADSimDesignContext* radsim_design); ~rtl_mvm(); SC_HAS_PROCESS(rtl_mvm); diff --git a/rad-sim/example-designs/mlp_int8/modules/sysc_mvm.cpp b/rad-sim/example-designs/mlp_int8/modules/sysc_mvm.cpp index 3674a83..f6589db 100644 --- a/rad-sim/example-designs/mlp_int8/modules/sysc_mvm.cpp +++ b/rad-sim/example-designs/mlp_int8/modules/sysc_mvm.cpp @@ -1,7 +1,7 @@ #include "sysc_mvm.hpp" -sysc_mvm::sysc_mvm(const sc_module_name &name, unsigned int id_mvm, unsigned int id_layer) - : RADSimModule(name), +sysc_mvm::sysc_mvm(const sc_module_name &name, unsigned int id_mvm, unsigned int id_layer, RADSimDesignContext* radsim_design) + : RADSimModule(name, radsim_design), rf_rdata("rf_rdata", DPES), rf_wdata("rf_wdata"), rf_wen("rf_wen", DPES), @@ -27,7 +27,7 @@ sysc_mvm::sysc_mvm(const sc_module_name &name, unsigned int id_mvm, unsigned int std::string datapath_name_str; rf.resize(DPES); datapath_inst.resize(DPES); - std::string mvm_dir = radsim_config.GetStringKnob("radsim_user_design_root_dir"); + std::string mvm_dir = radsim_config.GetStringKnobPerRad("radsim_user_design_root_dir", radsim_design->rad_id); std::string mem_init_file; // STAGE 1: 
Instruction FIFO, Input FIFO, and Reduction FIFO diff --git a/rad-sim/example-designs/mlp_int8/modules/sysc_mvm.hpp b/rad-sim/example-designs/mlp_int8/modules/sysc_mvm.hpp index 6b9c9fb..280bdf8 100644 --- a/rad-sim/example-designs/mlp_int8/modules/sysc_mvm.hpp +++ b/rad-sim/example-designs/mlp_int8/modules/sysc_mvm.hpp @@ -78,7 +78,7 @@ class sysc_mvm : public RADSimModule { axis_slave_port rx_interface; axis_master_port tx_interface; - sysc_mvm(const sc_module_name& name, unsigned int id_mvm, unsigned int id_layer); + sysc_mvm(const sc_module_name& name, unsigned int id_mvm, unsigned int id_layer, RADSimDesignContext* radsim_design); ~sysc_mvm(); void Assign(); diff --git a/rad-sim/example-designs/mlp_int8/modules/weight_loader.cpp b/rad-sim/example-designs/mlp_int8/modules/weight_loader.cpp index 831aaaa..fa0fa15 100644 --- a/rad-sim/example-designs/mlp_int8/modules/weight_loader.cpp +++ b/rad-sim/example-designs/mlp_int8/modules/weight_loader.cpp @@ -1,7 +1,7 @@ #include "weight_loader.hpp" -weight_loader::weight_loader(const sc_module_name &name) - : RADSimModule(name), +weight_loader::weight_loader(const sc_module_name &name, RADSimDesignContext* radsim_design) + : RADSimModule(name, radsim_design), rst("rst"), weight_fifo_rdy("weight_fifo_rdy"), weight_fifo_wen("weight_fifo_wen"), @@ -19,6 +19,7 @@ weight_loader::weight_loader(const sc_module_name &name) mvm_id_fifo_wen("mvm_id_fifo_wen"), mvm_id_fifo_wdata("mvm_id_fifo_wdata") { + this->radsim_design = radsim_design; module_name = name; char fifo_name[25]; @@ -136,7 +137,7 @@ void weight_loader::Assign() { std::string dest_name = "layer" + std::to_string(layer_id_fifo_odata.read()) + "_mvm" + std::to_string(mvm_id_fifo_odata.read()) + ".axis_rx"; - tx_interface.tdest.write(radsim_design.GetPortDestinationID(dest_name)); + tx_interface.tdest.write(radsim_design->GetPortDestinationID(dest_name)); } else { tx_interface.tvalid.write(false); } diff --git 
a/rad-sim/example-designs/mlp_int8/modules/weight_loader.hpp b/rad-sim/example-designs/mlp_int8/modules/weight_loader.hpp index 1aae3fd..b645251 100644 --- a/rad-sim/example-designs/mlp_int8/modules/weight_loader.hpp +++ b/rad-sim/example-designs/mlp_int8/modules/weight_loader.hpp @@ -36,6 +36,7 @@ class weight_loader : public RADSimModule { sc_signal mvm_id_fifo_pop, mvm_id_fifo_full, mvm_id_fifo_empty, mvm_id_fifo_almost_full; public: + RADSimDesignContext* radsim_design; sc_in rst; sc_out weight_fifo_rdy; sc_in weight_fifo_wen; @@ -54,7 +55,7 @@ class weight_loader : public RADSimModule { sc_in mvm_id_fifo_wdata; axis_master_port tx_interface; - weight_loader(const sc_module_name& name); + weight_loader(const sc_module_name& name, RADSimDesignContext* radsim_design); ~weight_loader(); void Assign(); diff --git a/rad-sim/example-designs/mult/.gitignore b/rad-sim/example-designs/mult/.gitignore new file mode 100644 index 0000000..e2d03be --- /dev/null +++ b/rad-sim/example-designs/mult/.gitignore @@ -0,0 +1,3 @@ +CMakeFiles/ +Makefile +CMakeCache.txt \ No newline at end of file diff --git a/rad-sim/example-designs/mult/CMakeLists.txt b/rad-sim/example-designs/mult/CMakeLists.txt new file mode 100644 index 0000000..a9cc639 --- /dev/null +++ b/rad-sim/example-designs/mult/CMakeLists.txt @@ -0,0 +1,35 @@ +cmake_minimum_required(VERSION 3.16) +find_package(SystemCLanguage CONFIG REQUIRED) + +include_directories( + ./ + modules + ../../sim + ../../sim/noc + ../../sim/noc/booksim + ../../sim/noc/booksim/networks + ../../sim/noc/booksim/routers +) + +set(srcfiles + modules/mult.cpp + modules/client.cpp + modules/fifo.cpp + mult_top.cpp + mult_driver.cpp + mult_system.cpp +) + +set(hdrfiles + modules/mult.hpp + modules/client.hpp + modules/fifo.hpp + mult_top.hpp + mult_driver.hpp + mult_system.hpp +) + +add_compile_options(-Wall -Wextra -pedantic) + +add_library(mult STATIC ${srcfiles} ${hdrfiles}) +target_link_libraries(mult PUBLIC SystemC::systemc booksim noc) \ No 
newline at end of file diff --git a/rad-sim/example-designs/mult/config.yml b/rad-sim/example-designs/mult/config.yml new file mode 100644 index 0000000..dd80ac8 --- /dev/null +++ b/rad-sim/example-designs/mult/config.yml @@ -0,0 +1,41 @@ +config rad1: + design: + name: 'mult' + noc_placement: ['mult.place'] + clk_periods: [5.0] + +noc: + type: ['2d'] + num_nocs: 1 + clk_period: [1.0] + payload_width: [166] + topology: ['mesh'] + dim_x: [4] + dim_y: [4] + routing_func: ['dim_order'] + vcs: [5] + vc_buffer_size: [8] + output_buffer_size: [8] + num_packet_types: [5] + router_uarch: ['iq'] + vc_allocator: ['islip'] + sw_allocator: ['islip'] + credit_delay: [1] + routing_delay: [1] + vc_alloc_delay: [1] + sw_alloc_delay: [1] + +noc_adapters: + clk_period: [1.25] + fifo_size: [16] + obuff_size: [2] + in_arbiter: ['fixed_rr'] + out_arbiter: ['priority_rr'] + vc_mapping: ['direct'] + +cluster: + sim_driver_period: 5.0 + telemetry_log_verbosity: 2 + telemetry_traces: [] + num_rads: 1 + cluster_configs: ['rad1'] \ No newline at end of file diff --git a/rad-sim/example-designs/mult/modules/client.cpp b/rad-sim/example-designs/mult/modules/client.cpp new file mode 100644 index 0000000..6c15013 --- /dev/null +++ b/rad-sim/example-designs/mult/modules/client.cpp @@ -0,0 +1,127 @@ +#include + +client::client(const sc_module_name &name, RADSimDesignContext* radsim_design) + : RADSimModule(name, radsim_design) { + + this->radsim_design = radsim_design; + + char fifo_name[25]; + std::string fifo_name_str; + fifo_name_str = "client_tdata_fifo"; + std::strcpy(fifo_name, fifo_name_str.c_str()); + client_tdata_fifo = new fifo>(fifo_name, FIFO_DEPTH, FIFO_DEPTH - 1, 0); + client_tdata_fifo->clk(clk); + client_tdata_fifo->rst(rst); + client_tdata_fifo->wen(client_tdata_fifo_wen_signal); + client_tdata_fifo->ren(client_tdata_fifo_ren_signal); + client_tdata_fifo->wdata(client_tdata); + client_tdata_fifo->full(client_tdata_fifo_full_signal); + 
client_tdata_fifo->almost_full(client_tdata_fifo_almost_full_signal); + client_tdata_fifo->empty(client_tdata_fifo_empty_signal); + client_tdata_fifo->almost_empty(client_tdata_fifo_almost_empty_signal); + client_tdata_fifo->rdata(client_tdata_fifo_rdata_signal); + + fifo_name_str = "client_tlast_fifo"; + std::strcpy(fifo_name, fifo_name_str.c_str()); + client_tlast_fifo = new fifo(fifo_name, FIFO_DEPTH, FIFO_DEPTH - 1, 0); + client_tlast_fifo->clk(clk); + client_tlast_fifo->rst(rst); + client_tlast_fifo->wen(client_tlast_fifo_wen_signal); + client_tlast_fifo->ren(client_tlast_fifo_ren_signal); + client_tlast_fifo->wdata(client_tlast); + client_tlast_fifo->full(client_tlast_fifo_full_signal); + client_tlast_fifo->almost_full(client_tlast_fifo_almost_full_signal); + client_tlast_fifo->empty(client_tlast_fifo_empty_signal); + client_tlast_fifo->almost_empty(client_tlast_fifo_almost_empty_signal); + client_tlast_fifo->rdata(client_tlast_fifo_rdata_signal); + + // Combinational logic and its sensitivity list + SC_METHOD(Assign); + sensitive << rst << client_ready << client_valid << client_tdata_fifo_almost_full_signal + << client_tdata_fifo_empty_signal << axis_client_interface.tready + << axis_client_interface.tvalid << client_tdata_fifo_rdata_signal + << client_tlast_fifo_rdata_signal; + // Sequential logic and its clock/reset setup + SC_CTHREAD(Tick, clk.pos()); + reset_signal_is(rst, true); // Reset is active high + + // This function must be defined & called for any RAD-Sim module to register + // its info for automatically connecting to the NoC + this->RegisterModuleInfo(); +} + +client::~client() { + delete client_tdata_fifo; + delete client_tlast_fifo; +} + +void client::Tick() { + wait(); + while (true) { + if (client_ready.read() && client_valid.read()) { + std::cout << this->name() << " @ cycle " + << GetSimulationCycle(radsim_config.GetDoubleKnobShared("sim_driver_period")) << ": " + << " Pushed request to FIFO!" 
<< std::endl; + } + + if (axis_client_interface.tvalid.read() && axis_client_interface.tready.read()) { + std::cout << this->name() << " @ cycle " + << GetSimulationCycle(radsim_config.GetDoubleKnobShared("sim_driver_period")) << ": " + << " Sent Transaction!" << std::endl; + } + wait(); + } +} + +void client::Assign() { + if (rst) { + client_tdata_fifo_wen_signal.write(false); + client_tlast_fifo_wen_signal.write(false); + client_ready.write(false); + axis_client_interface.tvalid.write(false); + } else { + if (!client_tdata_fifo_empty_signal.read()) { + sc_bv tdata = client_tdata_fifo_rdata_signal.read(); + bool tlast = client_tlast_fifo_rdata_signal.read(); + std::string src_port_name = module_name + ".axis_client_interface"; + std::string dst_port_name = "mult_inst.axis_mult_interface"; + uint64_t dst_addr = radsim_design->GetPortDestinationID(dst_port_name); + uint64_t src_addr = radsim_design->GetPortDestinationID(src_port_name); + sc_bv dest_id_concat; + DEST_REMOTE_NODE(dest_id_concat) = 0; //bc staying on same RAD + DEST_LOCAL_NODE(dest_id_concat) = dst_addr; + DEST_RAD(dest_id_concat) = radsim_design->rad_id; + axis_client_interface.tdest.write(dest_id_concat); + axis_client_interface.tid.write(0); + axis_client_interface.tstrb.write(0); + axis_client_interface.tkeep.write(0); + axis_client_interface.tuser.write(src_addr); + axis_client_interface.tlast.write(tlast); + axis_client_interface.tdata.write(tdata); + axis_client_interface.tvalid.write(true); + } else { + axis_client_interface.tvalid.write(false); + } + + client_ready.write(!client_tdata_fifo_almost_full_signal.read()); + + client_tdata_fifo_wen_signal.write(client_ready.read() && client_valid.read()); + client_tlast_fifo_wen_signal.write(client_ready.read() && client_valid.read()); + + client_tdata_fifo_ren_signal.write(axis_client_interface.tvalid.read() && + axis_client_interface.tready.read()); + client_tlast_fifo_ren_signal.write(axis_client_interface.tvalid.read() && + 
axis_client_interface.tready.read()); + } +} + +void client::RegisterModuleInfo() { + std::string port_name; + _num_noc_axis_slave_ports = 0; + _num_noc_axis_master_ports = 0; + _num_noc_aximm_slave_ports = 0; + _num_noc_aximm_master_ports = 0; + + port_name = module_name + ".axis_client_interface"; + RegisterAxisMasterPort(port_name, &axis_client_interface, DATAW, 0); +} diff --git a/rad-sim/example-designs/mult/modules/client.hpp b/rad-sim/example-designs/mult/modules/client.hpp new file mode 100644 index 0000000..2bc9865 --- /dev/null +++ b/rad-sim/example-designs/mult/modules/client.hpp @@ -0,0 +1,60 @@ +#pragma once + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define FIFO_DEPTH 16 + +class client : public RADSimModule { +private: + // FIFO to store numbers + fifo>* client_tdata_fifo; + fifo* client_tlast_fifo; + + // Data FIFO signals + sc_signal> client_tdata_fifo_rdata_signal; + sc_signal client_tdata_fifo_wen_signal; + sc_signal client_tdata_fifo_ren_signal; + sc_signal client_tdata_fifo_full_signal; + sc_signal client_tdata_fifo_empty_signal; + sc_signal client_tdata_fifo_almost_full_signal; + sc_signal client_tdata_fifo_almost_empty_signal; + + // Last FIFO signals + sc_signal client_tlast_fifo_rdata_signal; + sc_signal client_tlast_fifo_wen_signal; + sc_signal client_tlast_fifo_ren_signal; + sc_signal client_tlast_fifo_full_signal; + sc_signal client_tlast_fifo_empty_signal; + sc_signal client_tlast_fifo_almost_full_signal; + sc_signal client_tlast_fifo_almost_empty_signal; + + bool testbench_tlast; + +public: + sc_in rst; + // Interface to driver logic + sc_in> client_tdata; + sc_in client_tlast; + sc_in client_valid; + sc_out client_ready; + // Interface to the NoC + axis_master_port axis_client_interface; + RADSimDesignContext* radsim_design; + + client(const sc_module_name &name, RADSimDesignContext* radsim_design); + ~client(); + + void Assign(); // Combinational logic process + void 
Tick(); // Sequential logic process + SC_HAS_PROCESS(client); + void RegisterModuleInfo(); +}; \ No newline at end of file diff --git a/rad-sim/example-designs/mult/modules/fifo.cpp b/rad-sim/example-designs/mult/modules/fifo.cpp new file mode 100644 index 0000000..5ba051f --- /dev/null +++ b/rad-sim/example-designs/mult/modules/fifo.cpp @@ -0,0 +1,70 @@ +#include "fifo.hpp" + +template +fifo::fifo(const sc_module_name& name, unsigned int depth, unsigned int almost_full_size, + unsigned int almost_empty_size) + : sc_module(name), + wen("wen"), + wdata("wdata"), + ren("ren"), + rdata("rdata"), + full("full"), + almost_full("almost_full"), + empty("empty"), + almost_empty("almost_empty") { + + capacity = depth; + fifo_almost_full_size = almost_full_size; + fifo_almost_empty_size = almost_empty_size; + + // Set clock and reset signal for SC_CTHREAD + SC_CTHREAD(Tick, clk.pos()); + reset_signal_is(rst, true); +} + +template +fifo::~fifo() {} + +template +void fifo::Tick() { + // Reset logic + while (!mem.empty()) mem.pop(); + empty.write(true); + almost_empty.write(true); + full.write(false); + almost_full.write(false); + wait(); + + // Sequential logic + while (true) { + // Pop from queue if read enable signal is triggered and there is data in the FIFO + if (ren.read()) { + if (mem.size() == 0) sim_log.log(error, "FIFO is underflowing!", this->name()); + mem.pop(); + } + + // Push data into the FIFO if there is enough space + if (wen.read()) { + if (mem.size() == capacity) sim_log.log(error, "FIFO is overflowing!", this->name()); + mem.push(wdata.read()); + } + + // Update FIFO status signals + empty.write(mem.empty()); + almost_empty.write(mem.size() <= fifo_almost_empty_size); + full.write(mem.size() == capacity); + almost_full.write(mem.size() >= fifo_almost_full_size); + + // Set FIFO read data output to the top of the queue -- a vector of zeros is produced if the queue is empty + if (mem.size() == 0) { + rdata.write(0); + } else { + rdata.write(mem.front()); + } 
+ + wait(); + } +} + +template class fifo>; +template class fifo; \ No newline at end of file diff --git a/rad-sim/example-designs/mult/modules/fifo.hpp b/rad-sim/example-designs/mult/modules/fifo.hpp new file mode 100644 index 0000000..e057e06 --- /dev/null +++ b/rad-sim/example-designs/mult/modules/fifo.hpp @@ -0,0 +1,37 @@ +#pragma once + +#include +#include +#include +#include + +#define DATAW 128 + +// This class defines a FIFO module. This is a "peek" FIFO where the read data port always shows the top of the +// FIFO and the read enable signal is an acknowledgement signal (equivalent to pop in a software queue) +template +class fifo : public sc_module { + private: + unsigned int capacity; // Depth of the FIFO + unsigned int fifo_almost_empty_size, fifo_almost_full_size; // Occupancy when FIFO is considered almost full/empty + std::queue mem; // FIFO storage implemented as a C++ queue + + public: + sc_in clk; + sc_in rst; + sc_in wen; + sc_in wdata; + sc_in ren; + sc_out rdata; + sc_out full; + sc_out almost_full; + sc_out empty; + sc_out almost_empty; + + fifo(const sc_module_name& name, unsigned int depth, unsigned int almost_full_size, + unsigned int almost_empty_size); + ~fifo(); + + void Tick(); + SC_HAS_PROCESS(fifo); +}; \ No newline at end of file diff --git a/rad-sim/example-designs/mult/modules/mult.cpp b/rad-sim/example-designs/mult/modules/mult.cpp new file mode 100644 index 0000000..f32020a --- /dev/null +++ b/rad-sim/example-designs/mult/modules/mult.cpp @@ -0,0 +1,73 @@ +#include + +mult::mult(const sc_module_name &name, RADSimDesignContext* radsim_design) + : RADSimModule(name, radsim_design) { + + this->radsim_design = radsim_design; + + // Combinational logic and its sensitivity list + SC_METHOD(Assign); + sensitive << rst; + // Sequential logic and its clock/reset setup + SC_CTHREAD(Tick, clk.pos()); + reset_signal_is(rst, true); // Reset is active high + + // This function must be defined & called for any RAD-Sim module to register + // its 
info for automatically connecting to the NoC + this->RegisterModuleInfo(); +} + +mult::~mult() {} + +void mult::Assign() { + if (rst) { + mult_rolling_product = 1; + axis_mult_interface.tready.write(false); + } else { + // Always ready to accept the transaction + axis_mult_interface.tready.write(true); + } +} + +void mult::Tick() { + response_valid.write(0); + response.write(0); + wait(); + + int curr_cycle; + + // Always @ positive edge of the clock + while (true) { + curr_cycle = GetSimulationCycle(radsim_config.GetDoubleKnobShared("sim_driver_period")); + + // Receiving transaction from AXI-S interface + if (axis_mult_interface.tvalid.read() && + axis_mult_interface.tready.read()) { + uint64_t current_product = mult_rolling_product.to_uint64(); + mult_rolling_product = current_product * axis_mult_interface.tdata.read().to_uint64(); + t_finished.write(axis_mult_interface.tlast.read()); + std::cout << module_name << ": Got Transaction on cycle " << curr_cycle << "(user = " + << axis_mult_interface.tuser.read().to_uint64() << ") (factor = " + << axis_mult_interface.tdata.read().to_uint64() << ")!" 
+ << std::endl; + } + + // Output the rolling product once the last transfer has been received + if (t_finished.read()) { + response_valid.write(1); + response.write(mult_rolling_product); + } + wait(); + } +} + +void mult::RegisterModuleInfo() { + std::string port_name; + _num_noc_axis_slave_ports = 0; + _num_noc_axis_master_ports = 0; + _num_noc_aximm_slave_ports = 0; + _num_noc_aximm_master_ports = 0; + + port_name = module_name + ".axis_mult_interface"; + RegisterAxisSlavePort(port_name, &axis_mult_interface, DATAW, 0); +} \ No newline at end of file diff --git a/rad-sim/example-designs/mult/modules/mult.hpp b/rad-sim/example-designs/mult/modules/mult.hpp new file mode 100644 index 0000000..2c0a25c --- /dev/null +++ b/rad-sim/example-designs/mult/modules/mult.hpp @@ -0,0 +1,34 @@ +#pragma once + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +class mult : public RADSimModule { +private: + sc_bv mult_rolling_product; // Rolling product of the factors received so far + sc_signal t_finished; // Signal flagging that the transaction has terminated + +public: + RADSimDesignContext* radsim_design; + sc_in rst; + sc_out response_valid; + sc_out> response; + // Interface to the NoC + axis_slave_port axis_mult_interface; + + mult(const sc_module_name &name, RADSimDesignContext* radsim_design); + ~mult(); + + void Assign(); // Combinational logic process + void Tick(); // Sequential logic process + SC_HAS_PROCESS(mult); + void RegisterModuleInfo(); +}; \ No newline at end of file diff --git a/rad-sim/example-designs/mult/mult.clks b/rad-sim/example-designs/mult/mult.clks new file mode 100644 index 0000000..3e3efd6 --- /dev/null +++ b/rad-sim/example-designs/mult/mult.clks @@ -0,0 +1,2 @@ +mult_inst 0 0 +client_inst 0 0 \ No newline at end of file diff --git a/rad-sim/example-designs/mult/mult.place b/rad-sim/example-designs/mult/mult.place new file mode 100644 index 0000000..f7e1ad0 --- /dev/null +++ b/rad-sim/example-designs/mult/mult.place @@ -0,0 +1,2 @@ +mult_inst 0 0 axis
+client_inst 0 3 axis \ No newline at end of file diff --git a/rad-sim/example-designs/mult/mult_driver.cpp b/rad-sim/example-designs/mult/mult_driver.cpp new file mode 100644 index 0000000..2ed5dfa --- /dev/null +++ b/rad-sim/example-designs/mult/mult_driver.cpp @@ -0,0 +1,76 @@ +#include + +#define NUM_FACTORS 5 + +mult_driver::mult_driver(const sc_module_name &name, RADSimDesignContext* radsim_design_) + : sc_module(name) { + + this->radsim_design = radsim_design_; + + //for simulation cycle count + start_cycle = 0; + end_cycle = 0; + + // Random Seed + srand (time(NULL)); + actual_product = 1; + + // Generate random numbers to be multiplied together by the multiplier + std::cout << "Generating Random Numbers to be multiplied ..." << std::endl; + for (unsigned int i = 0; i < NUM_FACTORS; i++) { + unsigned int r_num = std::rand() % 10 + 1; + std::cout << r_num << " "; + numbers_to_send.push(r_num); + actual_product *= r_num; + } + std::cout << std::endl << "----------------------------------------" << std::endl; + + SC_CTHREAD(source, clk.pos()); + SC_CTHREAD(sink, clk.pos()); +} + +mult_driver::~mult_driver() {} + +void mult_driver::source() { + // Reset + rst.write(true); + client_valid.write(false); + wait(); + rst.write(false); + start_cycle = GetSimulationCycle(radsim_config.GetDoubleKnobShared("sim_driver_period")); + wait(); + + while (!numbers_to_send.empty()) { + client_tdata.write(numbers_to_send.front()); + client_tlast.write(numbers_to_send.size() <= 1); + client_valid.write(true); + + wait(); + + if (client_valid.read() && client_ready.read()) { + numbers_to_send.pop(); + } + } + client_valid.write(false); + //std::cout << "Finished sending all numbers to client module!" << std::endl; + wait(); +} + +void mult_driver::sink() { + + while (!(response_valid.read())) { + wait(); + } + std::cout << "Received " << response.read().to_uint64() << " product from the multiplier!" 
<< std::endl; + std::cout << "The actual product is " << actual_product << std::endl; + + if (response.read() != actual_product) std::cout << "FAILURE - Output is not matching!" << std::endl; + else std::cout << "SUCCESS - Output is matching!" << std::endl; + + end_cycle = GetSimulationCycle(radsim_config.GetDoubleKnobShared("sim_driver_period")); + std::cout << "Simulation Cycles for Just Mult Portion = " << end_cycle - start_cycle << std::endl; + + this->radsim_design->set_rad_done(); + return; + +} \ No newline at end of file diff --git a/rad-sim/example-designs/mult/mult_driver.hpp b/rad-sim/example-designs/mult/mult_driver.hpp new file mode 100644 index 0000000..5e060af --- /dev/null +++ b/rad-sim/example-designs/mult/mult_driver.hpp @@ -0,0 +1,35 @@ +#pragma once + +#include +#include +#include +#include +#include +#include +#include + +class mult_driver : public sc_module { +private: + int start_cycle, end_cycle; + std::queue numbers_to_send; + int actual_product; + RADSimDesignContext* radsim_design; + +public: + sc_in clk; + sc_out rst; + sc_out> client_tdata; + sc_out client_tlast; + sc_out client_valid; + sc_in client_ready; + sc_in> response; + sc_in response_valid; + + mult_driver(const sc_module_name &name, RADSimDesignContext* radsim_design_); + ~mult_driver(); + + void source(); + void sink(); + + SC_HAS_PROCESS(mult_driver); +}; \ No newline at end of file diff --git a/rad-sim/example-designs/mult/mult_system.cpp b/rad-sim/example-designs/mult/mult_system.cpp new file mode 100644 index 0000000..0984d6e --- /dev/null +++ b/rad-sim/example-designs/mult/mult_system.cpp @@ -0,0 +1,35 @@ +#include + +mult_system::mult_system(const sc_module_name &name, sc_clock *driver_clk_sig, RADSimDesignContext* radsim_design) + : sc_module(name) { + + // Instantiate driver + driver_inst = new mult_driver("driver", radsim_design); + driver_inst->clk(*driver_clk_sig); + driver_inst->rst(rst_sig); + driver_inst->client_tdata(client_tdata_sig); + 
driver_inst->client_tlast(client_tlast_sig); + driver_inst->client_valid(client_valid_sig); + driver_inst->client_ready(client_ready_sig); + driver_inst->response(response_sig); + driver_inst->response_valid(response_valid_sig); + + // Instantiate design top-level + dut_inst = new mult_top("dut", radsim_design); + dut_inst->rst(rst_sig); + dut_inst->client_tdata(client_tdata_sig); + dut_inst->client_tlast(client_tlast_sig); + dut_inst->client_valid(client_valid_sig); + dut_inst->client_ready(client_ready_sig); + dut_inst->response(response_sig); + dut_inst->response_valid(response_valid_sig); + + //add mult_top as dut instance for parent class RADSimDesignSystem + this->design_dut_inst = dut_inst; +} + +mult_system::~mult_system() { + delete driver_inst; + delete dut_inst; + delete sysclk; +} \ No newline at end of file diff --git a/rad-sim/example-designs/mult/mult_system.hpp b/rad-sim/example-designs/mult/mult_system.hpp new file mode 100644 index 0000000..f3026b3 --- /dev/null +++ b/rad-sim/example-designs/mult/mult_system.hpp @@ -0,0 +1,27 @@ +#pragma once + +#include +#include +#include +#include +#include + +class mult_system : public RADSimDesignSystem { +private: + sc_signal> client_tdata_sig; + sc_signal client_tlast_sig; + sc_signal client_valid_sig; + sc_signal client_ready_sig; + sc_signal> response_sig; + sc_signal response_valid_sig; + +public: + sc_signal rst_sig; + sc_clock *sysclk; + mult_driver *driver_inst; + mult_top *dut_inst; + + mult_system(const sc_module_name &name, + sc_clock *driver_clk_sig, RADSimDesignContext* radsim_design); + ~mult_system(); +}; \ No newline at end of file diff --git a/rad-sim/example-designs/mult/mult_top.cpp b/rad-sim/example-designs/mult/mult_top.cpp new file mode 100644 index 0000000..9115d07 --- /dev/null +++ b/rad-sim/example-designs/mult/mult_top.cpp @@ -0,0 +1,37 @@ +#include + +mult_top::mult_top(const sc_module_name &name, RADSimDesignContext* radsim_design) + : RADSimDesignTop(radsim_design) { + + 
this->radsim_design = radsim_design; + + std::string module_name_str; + char module_name[25]; + + module_name_str = "client_inst"; + std::strcpy(module_name, module_name_str.c_str()); + + client_inst = new client(module_name, radsim_design); + client_inst->rst(rst); + client_inst->client_tdata(client_tdata); + client_inst->client_tlast(client_tlast); + client_inst->client_valid(client_valid); + client_inst->client_ready(client_ready); + + module_name_str = "mult_inst"; + std::strcpy(module_name, module_name_str.c_str()); + mult_inst = new mult(module_name, radsim_design); + mult_inst->rst(rst); + mult_inst->response(response); + mult_inst->response_valid(response_valid); + + this->connectPortalReset(&rst); + radsim_design->BuildDesignContext("mult.place", "mult.clks"); + radsim_design->CreateSystemNoCs(rst); + radsim_design->ConnectModulesToNoC(); +} + + mult_top::~mult_top() { + delete mult_inst; + delete client_inst; +} \ No newline at end of file diff --git a/rad-sim/example-designs/mult/mult_top.hpp b/rad-sim/example-designs/mult/mult_top.hpp new file mode 100644 index 0000000..8d07678 --- /dev/null +++ b/rad-sim/example-designs/mult/mult_top.hpp @@ -0,0 +1,29 @@ +#pragma once + +#include +#include +#include +#include +#include +#include +#include + +class mult_top : public RADSimDesignTop { +private: + mult *mult_inst; + client *client_inst; + RADSimDesignContext* radsim_design; + +public: + sc_in rst; + // Client's interface + sc_in> client_tdata; + sc_in client_tlast; + sc_in client_valid; + sc_out client_ready; + sc_out> response; + sc_out response_valid; + + mult_top(const sc_module_name &name, RADSimDesignContext* radsim_design); + ~mult_top(); +}; \ No newline at end of file diff --git a/rad-sim/example-designs/npu/.gitignore b/rad-sim/example-designs/npu/.gitignore index aebfa9a..b602a05 100644 --- a/rad-sim/example-designs/npu/.gitignore +++ b/rad-sim/example-designs/npu/.gitignore @@ -10,3 +10,4 @@ compiler/dump/* compiler/__pycache__/* 
scripts/reports/*.rpt .vscode +Makefile \ No newline at end of file diff --git a/rad-sim/example-designs/npu/CMakeLists.txt b/rad-sim/example-designs/npu/CMakeLists.txt index 62102b6..16f65e3 100644 --- a/rad-sim/example-designs/npu/CMakeLists.txt +++ b/rad-sim/example-designs/npu/CMakeLists.txt @@ -65,5 +65,5 @@ set(hdrfiles add_compile_options(-Wall -Wextra -pedantic) -add_library(design STATIC ${srcfiles} ${hdrfiles}) -target_link_libraries(design PUBLIC SystemC::systemc booksim noc) +add_library(npu STATIC ${srcfiles} ${hdrfiles}) +target_link_libraries(npu PUBLIC SystemC::systemc booksim noc) diff --git a/rad-sim/example-designs/npu/compiler/add_mlp5_1536.py b/rad-sim/example-designs/npu/compiler/09_std_rnn_1536_8.py similarity index 68% rename from rad-sim/example-designs/npu/compiler/add_mlp5_1536.py rename to rad-sim/example-designs/npu/compiler/09_std_rnn_1536_8.py index c443d44..db29e4d 100644 --- a/rad-sim/example-designs/npu/compiler/add_mlp5_1536.py +++ b/rad-sim/example-designs/npu/compiler/09_std_rnn_1536_8.py @@ -14,19 +14,16 @@ # Define constants INPUT_SIZE = 1536 -DENSE_SIZE = 1536 +HIDDEN_UNITS = 1536 +TIME_STEPS = 8 # Define model architecture using Keras Sequential Model model = NPUSequential([ - layers.Dense(DENSE_SIZE, name="layer1"), - layers.Dense(DENSE_SIZE, name="layer2"), - layers.Dense(DENSE_SIZE, name="layer3"), - layers.Dense(DENSE_SIZE, name="layer4"), - layers.Dense(DENSE_SIZE, name="layer5"), + layers.SimpleRNN(HIDDEN_UNITS, name="layer1"), ]) # Random test inputs for different types of layers -test_input = tf.random.uniform(shape=[6, INPUT_SIZE], minval=-128, maxval=127) +test_input = tf.random.uniform(shape=[TIME_STEPS, 6, INPUT_SIZE], minval=-128, maxval=127) # Call model on example input y = model(test_input) diff --git a/rad-sim/example-designs/npu/config.yml b/rad-sim/example-designs/npu/config.yml index 6cc4c28..33d4bb8 100644 --- a/rad-sim/example-designs/npu/config.yml +++ b/rad-sim/example-designs/npu/config.yml @@ -27,11 
+27,15 @@ noc_adapters: out_arbiter: ['priority_rr'] vc_mapping: ['direct'] -design: - name: 'npu' - noc_placement: ['npu.place'] - clk_periods: [5.0, 2.5] +config rad1: + design: + name: 'npu' + noc_placement: ['npu.place'] + clk_periods: [5.0, 2.5] -telemetry: - log_verbosity: 2 - traces: [] \ No newline at end of file +cluster: + sim_driver_period: 5.0 + telemetry_log_verbosity: 2 + telemetry_traces: [] + num_rads: 1 + cluster_configs: ['rad1'] \ No newline at end of file diff --git a/rad-sim/example-designs/npu/modules/axis_fifo_adapters.cpp b/rad-sim/example-designs/npu/modules/axis_fifo_adapters.cpp index 310f67d..89f40bf 100644 --- a/rad-sim/example-designs/npu/modules/axis_fifo_adapters.cpp +++ b/rad-sim/example-designs/npu/modules/axis_fifo_adapters.cpp @@ -4,9 +4,10 @@ template axis_master_fifo_adapter::axis_master_fifo_adapter( const sc_module_name& _name, unsigned int _interface_type, unsigned int _interface_dataw, unsigned int _num_fifo, - unsigned int _element_bitwidth, std::string& _destination_port) + unsigned int _element_bitwidth, std::string& _destination_port, RADSimDesignContext* radsim_design) : sc_module(_name), fifo_rdy("fifo_rdy"), fifo_ren("fifo_ren"), fifo_rdata("fifo_rdata") { // Initialize member variables + this->radsim_design = radsim_design; interface_type = _interface_type; if (_interface_dataw > AXIS_MAX_DATAW) sim_log.log(error, "AXI-S datawidth exceeds maximum value!", this->name()); @@ -139,7 +140,7 @@ void axis_master_fifo_adapter::insert_payload_into_buffer() TUSER_FLAG(buffer_wdata) = payload_bitvector.get_bit(flag_idx); TUSER_ADDR(buffer_wdata) = payload_bitvector.range(addr_start_idx + VRF_ADDRW - 1, addr_start_idx); TUSER_VRFID(buffer_wdata) = payload_bitvector.range(vrf_id_start_idx + VRF_WB_SELW - 1, vrf_id_start_idx); - TDEST(buffer_wdata) = radsim_design.GetPortDestinationID(destination_port); + TDEST(buffer_wdata) = this->radsim_design->GetPortDestinationID(destination_port); } else if (interface_type == 
VEW_WRITEBACK_INTERFACE) { TUSER_FLAG(buffer_wdata) = payload_bitvector.get_bit(flag_idx); TUSER_ADDR(buffer_wdata) = payload_bitvector.range(addr_start_idx + VRF_ADDRW - 1, addr_start_idx); @@ -148,13 +149,13 @@ void axis_master_fifo_adapter::insert_payload_into_buffer() } else { TUSER_FLAG(buffer_wdata) = 0; TUSER_ADDR(buffer_wdata) = 0; - TDEST(buffer_wdata) = radsim_design.GetPortDestinationID(destination_port); + TDEST(buffer_wdata) = this->radsim_design->GetPortDestinationID(destination_port); } - TID(buffer_wdata) = radsim_design.GetPortInterfaceID(destination_port); + TID(buffer_wdata) = this->radsim_design->GetPortInterfaceID(destination_port); TLAST(buffer_wdata) = (transfer_id == transfers_per_axis_packet - 1); buffer.push(buffer_wdata); assert(buffer.size() <= buffer_capacity); - //std::cout << "Destination Port " << destination_port << " is at node " << radsim_design.GetPortDestinationID(destination_port) << " interface " << radsim_design.GetPortInterfaceID(destination_port) << std::endl; + //std::cout << "Destination Port " << destination_port << " is at node " << this->radsim_design->GetPortDestinationID(destination_port) << " interface " << this->radsim_design->GetPortInterfaceID(destination_port) << std::endl; } } @@ -258,9 +259,11 @@ axis_slave_fifo_adapter::axis_slave_fifo_adapter(const sc_mo unsigned int _interface_dataw, unsigned int _num_fifo, unsigned int _element_bitwidth, - unsigned int _num_element) + unsigned int _num_element, + RADSimDesignContext* radsim_design) : sc_module(_name), fifo_rdy("fifo_rdy"), fifo_ren("fifo_ren"), fifo_rdata("fifo_rdata") { // Initialize member variables + this->radsim_design = radsim_design; interface_type = _interface_type; if (_interface_dataw > AXIS_MAX_DATAW) sim_log.log(error, "AXI-S datawidth exceeds maximum value!", this->name()); diff --git a/rad-sim/example-designs/npu/modules/axis_fifo_adapters.hpp b/rad-sim/example-designs/npu/modules/axis_fifo_adapters.hpp index dca5b2e..43a9206 100644 --- 
a/rad-sim/example-designs/npu/modules/axis_fifo_adapters.hpp +++ b/rad-sim/example-designs/npu/modules/axis_fifo_adapters.hpp @@ -31,6 +31,7 @@ class axis_master_fifo_adapter : public sc_module { sc_signal buffer_occupancy; // Current occupancy of adapter buffer public: + RADSimDesignContext* radsim_design; sc_in clk; sc_in rst; sc_vector> fifo_rdy; @@ -39,7 +40,8 @@ class axis_master_fifo_adapter : public sc_module { axis_master_port axis_port; axis_master_fifo_adapter(const sc_module_name& _name, unsigned int _interface_type, unsigned int _interface_dataw, - unsigned int _num_fifo, unsigned int _element_bitwidth, std::string& _destination_port); + unsigned int _num_fifo, unsigned int _element_bitwidth, std::string& _destination_port, + RADSimDesignContext* radsim_design); ~axis_master_fifo_adapter(); bool buffer_full(); // Checks if adapter buffer is full @@ -73,6 +75,7 @@ class axis_slave_fifo_adapter : public sc_module { sc_signal transfer_count; // Count of received-so-far transfers (for a specific payload) public: + RADSimDesignContext* radsim_design; sc_in clk; sc_in rst; sc_vector> fifo_rdy; @@ -81,7 +84,8 @@ class axis_slave_fifo_adapter : public sc_module { axis_slave_port axis_port; axis_slave_fifo_adapter(const sc_module_name& _name, unsigned int _interface_type, unsigned int _interface_dataw, - unsigned int _num_fifo, unsigned int _element_bitwidth, unsigned int _num_element); + unsigned int _num_fifo, unsigned int _element_bitwidth, unsigned int _num_element, + RADSimDesignContext* radsim_design); ~axis_slave_fifo_adapter(); void sc_bitvector_to_data_fifo(); // Converts a SystemC bitvector to a data FIFO entry diff --git a/rad-sim/example-designs/npu/modules/axis_inst_dispatch.cpp b/rad-sim/example-designs/npu/modules/axis_inst_dispatch.cpp index 117a7e8..c5ccce2 100644 --- a/rad-sim/example-designs/npu/modules/axis_inst_dispatch.cpp +++ b/rad-sim/example-designs/npu/modules/axis_inst_dispatch.cpp @@ -1,7 +1,7 @@ #include 
-axis_inst_dispatch::axis_inst_dispatch(const sc_module_name& name, unsigned int thread_id) - : RADSimModule(name) { +axis_inst_dispatch::axis_inst_dispatch(const sc_module_name& name, unsigned int thread_id, RADSimDesignContext* radsim_design) + : RADSimModule(name, radsim_design) { // Create SystemC vectors with the required sizes -- macro-op interfaces are vectors of size 1 to match the template // definition used with data FIFOs which works with multiple cores init_vector::init_sc_vector(sector_mop_interface, SECTORS); @@ -69,7 +69,7 @@ axis_inst_dispatch::axis_inst_dispatch(const sc_module_name& name, unsigned int module_name_str = "sector_" + std::to_string(sector_id) + "_mop_axis_interface_" + std::to_string(_thread_id); std::strcpy(module_name, module_name_str.c_str()); sector_mop_axis_interface[sector_id] = new axis_master_fifo_adapter>( - module_name, INSTRUCTION_INTERFACE, DEC_INSTRUCTION_INTERFACE_DATAW, 1, MVU_MOP_BITWIDTH, dest_name); + module_name, INSTRUCTION_INTERFACE, DEC_INSTRUCTION_INTERFACE_DATAW, 1, MVU_MOP_BITWIDTH, dest_name, radsim_design); sector_mop_axis_interface[sector_id]->clk(clk); sector_mop_axis_interface[sector_id]->rst(rst); sector_mop_axis_interface[sector_id]->fifo_rdy(sector_mop_rdy_signal[sector_id]); @@ -83,7 +83,7 @@ axis_inst_dispatch::axis_inst_dispatch(const sc_module_name& name, unsigned int module_name_str = "evrf_" + std::to_string(sector_id) + "_mop_axis_interface_" + std::to_string(_thread_id); std::strcpy(module_name, module_name_str.c_str()); evrf_mop_axis_interface[sector_id] = new axis_master_fifo_adapter>( - module_name, INSTRUCTION_INTERFACE, DEC_INSTRUCTION_INTERFACE_DATAW, 1, EVRF_MOP_BITWIDTH, dest_name); + module_name, INSTRUCTION_INTERFACE, DEC_INSTRUCTION_INTERFACE_DATAW, 1, EVRF_MOP_BITWIDTH, dest_name, radsim_design); evrf_mop_axis_interface[sector_id]->clk(clk); evrf_mop_axis_interface[sector_id]->rst(rst); evrf_mop_axis_interface[sector_id]->fifo_rdy(evrf_mop_rdy_signal[sector_id]); @@ -97,7 +97,7 @@ 
axis_inst_dispatch::axis_inst_dispatch(const sc_module_name& name, unsigned int module_name_str = "mfu0_" + std::to_string(sector_id) + "_mop_axis_interface_" + std::to_string(_thread_id); std::strcpy(module_name, module_name_str.c_str()); mfu0_mop_axis_interface[sector_id] = new axis_master_fifo_adapter>( - module_name, INSTRUCTION_INTERFACE, DEC_INSTRUCTION_INTERFACE_DATAW, 1, MFU_MOP_BITWIDTH, dest_name); + module_name, INSTRUCTION_INTERFACE, DEC_INSTRUCTION_INTERFACE_DATAW, 1, MFU_MOP_BITWIDTH, dest_name, radsim_design); mfu0_mop_axis_interface[sector_id]->clk(clk); mfu0_mop_axis_interface[sector_id]->rst(rst); mfu0_mop_axis_interface[sector_id]->fifo_rdy(mfu0_mop_rdy_signal[sector_id]); @@ -111,7 +111,7 @@ axis_inst_dispatch::axis_inst_dispatch(const sc_module_name& name, unsigned int module_name_str = "mfu1_" + std::to_string(sector_id) + "_mop_axis_interface_" + std::to_string(_thread_id); std::strcpy(module_name, module_name_str.c_str()); mfu1_mop_axis_interface[sector_id] = new axis_master_fifo_adapter>( - module_name, INSTRUCTION_INTERFACE, DEC_INSTRUCTION_INTERFACE_DATAW, 1, MFU_MOP_BITWIDTH, dest_name); + module_name, INSTRUCTION_INTERFACE, DEC_INSTRUCTION_INTERFACE_DATAW, 1, MFU_MOP_BITWIDTH, dest_name, radsim_design); mfu1_mop_axis_interface[sector_id]->clk(clk); mfu1_mop_axis_interface[sector_id]->rst(rst); mfu1_mop_axis_interface[sector_id]->fifo_rdy(mfu1_mop_rdy_signal[sector_id]); @@ -125,7 +125,7 @@ axis_inst_dispatch::axis_inst_dispatch(const sc_module_name& name, unsigned int module_name_str = "ld_mop_axis_interface_" + std::to_string(_thread_id); std::strcpy(module_name, module_name_str.c_str()); ld_mop_axis_interface = new axis_master_fifo_adapter>( - module_name, INSTRUCTION_INTERFACE, DEC_INSTRUCTION_INTERFACE_DATAW, 1, LD_MOP_BITWIDTH, dest_name); + module_name, INSTRUCTION_INTERFACE, DEC_INSTRUCTION_INTERFACE_DATAW, 1, LD_MOP_BITWIDTH, dest_name, radsim_design); ld_mop_axis_interface->clk(clk); ld_mop_axis_interface->rst(rst); 
ld_mop_axis_interface->fifo_rdy(ld_mop_rdy_signal); diff --git a/rad-sim/example-designs/npu/modules/axis_inst_dispatch.hpp b/rad-sim/example-designs/npu/modules/axis_inst_dispatch.hpp index 4f95a66..fcd976a 100644 --- a/rad-sim/example-designs/npu/modules/axis_inst_dispatch.hpp +++ b/rad-sim/example-designs/npu/modules/axis_inst_dispatch.hpp @@ -49,7 +49,7 @@ class axis_inst_dispatch : public RADSimModule { sc_vector mfu1_mop_interface; axis_master_port ld_mop_interface; - axis_inst_dispatch(const sc_module_name& name, unsigned int thread_id); + axis_inst_dispatch(const sc_module_name& name, unsigned int thread_id, RADSimDesignContext* radsim_design); ~axis_inst_dispatch(); void RegisterModuleInfo(); }; diff --git a/rad-sim/example-designs/npu/modules/axis_mvu_sector.cpp b/rad-sim/example-designs/npu/modules/axis_mvu_sector.cpp index d95869d..67b6628 100644 --- a/rad-sim/example-designs/npu/modules/axis_mvu_sector.cpp +++ b/rad-sim/example-designs/npu/modules/axis_mvu_sector.cpp @@ -1,7 +1,7 @@ #include "axis_mvu_sector.hpp" -axis_mvu_sector::axis_mvu_sector(const sc_module_name& name, unsigned int sector_id) - : RADSimModule(name), +axis_mvu_sector::axis_mvu_sector(const sc_module_name& name, unsigned int sector_id, RADSimDesignContext* radsim_design) + : RADSimModule(name, radsim_design), ofifo_rdy_signal("ofifo_rdy_signal"), ofifo_ren_signal("ofifo_ren_signal"), ofifo_rdata_signal("ofifo_rdata_signal"), @@ -66,7 +66,7 @@ axis_mvu_sector::axis_mvu_sector(const sc_module_name& name, unsigned int sector inst_interface_name_str = "sector" + std::to_string(_sector_id) + "_inst_interface_" + std::to_string(thread_id); std::strcpy(inst_interface_name, inst_interface_name_str.c_str()); inst_axis_interface[thread_id] = new axis_slave_fifo_adapter>( - inst_interface_name, INSTRUCTION_INTERFACE, MVU_INSTRUCTION_INTERFACE_DATAW, 1, MVU_MOP_BITWIDTH, 1); + inst_interface_name, INSTRUCTION_INTERFACE, MVU_INSTRUCTION_INTERFACE_DATAW, 1, MVU_MOP_BITWIDTH, 1, radsim_design); 
inst_axis_interface[thread_id]->clk(clk); inst_axis_interface[thread_id]->rst(rst); inst_axis_interface[thread_id]->fifo_rdy(mop_rdy_signal[thread_id]); @@ -79,7 +79,7 @@ axis_mvu_sector::axis_mvu_sector(const sc_module_name& name, unsigned int sector std::strcpy(wb_interface_name, wb_interface_name_str.c_str()); wb_axis_interface[thread_id] = new axis_slave_fifo_adapter, sc_bv>( - wb_interface_name, MVU_WRITEBACK_INTERFACE, MVU_WRITEBACK_INTERFACE_DATAW, CORES, LOW_PRECISION, LANES); + wb_interface_name, MVU_WRITEBACK_INTERFACE, MVU_WRITEBACK_INTERFACE_DATAW, CORES, LOW_PRECISION, LANES, radsim_design); wb_axis_interface[thread_id]->clk(clk); wb_axis_interface[thread_id]->rst(rst); wb_axis_interface[thread_id]->fifo_rdy(wb_rdy_signal[thread_id]); @@ -97,7 +97,7 @@ axis_mvu_sector::axis_mvu_sector(const sc_module_name& name, unsigned int sector ofifo_axis_interface[thread_id] = new axis_master_fifo_adapter, sc_bv>( sector_ofifo_interface_name, FEEDFORWARD_INTERFACE, MVU_FEEDFORWARD_INTERFACE_DATAW, CORES, HIGH_PRECISION, - dest_name); + dest_name, radsim_design); ofifo_axis_interface[thread_id]->clk(clk); ofifo_axis_interface[thread_id]->rst(rst); ofifo_axis_interface[thread_id]->fifo_rdy(ofifo_rdy_signal[thread_id]); @@ -110,7 +110,7 @@ axis_mvu_sector::axis_mvu_sector(const sc_module_name& name, unsigned int sector char sector_name[NAME_LENGTH]; std::string sector_name_str = "sector" + std::to_string(_sector_id); std::strcpy(sector_name, sector_name_str.c_str()); - sector_module = new mvu_sector(sector_name, _sector_id); + sector_module = new mvu_sector(sector_name, _sector_id, radsim_design); sector_module->clk(clk); sector_module->rst(rst); sector_module->inst(uop_wdata_signal); diff --git a/rad-sim/example-designs/npu/modules/axis_mvu_sector.hpp b/rad-sim/example-designs/npu/modules/axis_mvu_sector.hpp index 92ea906..420ded9 100644 --- a/rad-sim/example-designs/npu/modules/axis_mvu_sector.hpp +++ b/rad-sim/example-designs/npu/modules/axis_mvu_sector.hpp @@ 
-47,7 +47,7 @@ class axis_mvu_sector : public RADSimModule { sc_vector>>> sector_chain_ofifo_rdata; sc_vector sector_ofifo_interface; - axis_mvu_sector(const sc_module_name& name, unsigned int sector_id); + axis_mvu_sector(const sc_module_name& name, unsigned int sector_id, RADSimDesignContext* radsim_design); ~axis_mvu_sector(); void RegisterModuleInfo(); }; diff --git a/rad-sim/example-designs/npu/modules/axis_mvu_sector_chain.cpp b/rad-sim/example-designs/npu/modules/axis_mvu_sector_chain.cpp index 849b6e6..ac5c608 100644 --- a/rad-sim/example-designs/npu/modules/axis_mvu_sector_chain.cpp +++ b/rad-sim/example-designs/npu/modules/axis_mvu_sector_chain.cpp @@ -1,7 +1,7 @@ #include "axis_mvu_sector_chain.hpp" -axis_mvu_sector_chain::axis_mvu_sector_chain(const sc_module_name& name, unsigned int sector_id) - : RADSimModule(name), +axis_mvu_sector_chain::axis_mvu_sector_chain(const sc_module_name& name, unsigned int sector_id, RADSimDesignContext* radsim_design) + : RADSimModule(name, radsim_design), ofifo_rdy_signal("ofifo_rdy_signal"), ofifo_ren_signal("ofifo_ren_signal"), ofifo_rdata_signal("ofifo_rdata_signal"), @@ -60,7 +60,7 @@ axis_mvu_sector_chain::axis_mvu_sector_chain(const sc_module_name& name, unsigne inst_interface_name_str = "sector" + std::to_string(_sector_id) + "_inst_interface_" + std::to_string(thread_id); std::strcpy(inst_interface_name, inst_interface_name_str.c_str()); inst_axis_interface[thread_id] = new axis_slave_fifo_adapter>( - inst_interface_name, INSTRUCTION_INTERFACE, MVU_INSTRUCTION_INTERFACE_DATAW, 1, MVU_MOP_BITWIDTH, 1); + inst_interface_name, INSTRUCTION_INTERFACE, MVU_INSTRUCTION_INTERFACE_DATAW, 1, MVU_MOP_BITWIDTH, 1, radsim_design); inst_axis_interface[thread_id]->clk(clk); inst_axis_interface[thread_id]->rst(rst); inst_axis_interface[thread_id]->fifo_rdy(mop_rdy_signal[thread_id]); @@ -78,7 +78,7 @@ axis_mvu_sector_chain::axis_mvu_sector_chain(const sc_module_name& name, unsigne ofifo_axis_interface[thread_id] = new 
axis_master_fifo_adapter, sc_bv>( sector_ofifo_interface_name, FEEDFORWARD_INTERFACE, MVU_FEEDFORWARD_INTERFACE_DATAW, CORES, HIGH_PRECISION, - dest_name); + dest_name, radsim_design); ofifo_axis_interface[thread_id]->clk(clk); ofifo_axis_interface[thread_id]->rst(rst); ofifo_axis_interface[thread_id]->fifo_rdy(ofifo_rdy_signal[thread_id]); @@ -91,7 +91,7 @@ axis_mvu_sector_chain::axis_mvu_sector_chain(const sc_module_name& name, unsigne char sector_name[NAME_LENGTH]; std::string sector_name_str = "sector" + std::to_string(_sector_id); std::strcpy(sector_name, sector_name_str.c_str()); - sector_module = new mvu_sector(sector_name, _sector_id); + sector_module = new mvu_sector(sector_name, _sector_id, radsim_design); sector_module->clk(clk); sector_module->rst(rst); sector_module->inst(uop_wdata_signal); diff --git a/rad-sim/example-designs/npu/modules/axis_mvu_sector_chain.hpp b/rad-sim/example-designs/npu/modules/axis_mvu_sector_chain.hpp index 98912b6..cf2aa6a 100644 --- a/rad-sim/example-designs/npu/modules/axis_mvu_sector_chain.hpp +++ b/rad-sim/example-designs/npu/modules/axis_mvu_sector_chain.hpp @@ -45,7 +45,7 @@ class axis_mvu_sector_chain : public RADSimModule { sc_vector>>> sector_chain_ofifo_rdata; sc_vector sector_ofifo_interface; - axis_mvu_sector_chain(const sc_module_name& name, unsigned int id); + axis_mvu_sector_chain(const sc_module_name& name, unsigned int id, RADSimDesignContext* radsim_design); ~axis_mvu_sector_chain(); void RegisterModuleInfo(); }; diff --git a/rad-sim/example-designs/npu/modules/axis_vector_elementwise.cpp b/rad-sim/example-designs/npu/modules/axis_vector_elementwise.cpp index 0522476..18d6e13 100644 --- a/rad-sim/example-designs/npu/modules/axis_vector_elementwise.cpp +++ b/rad-sim/example-designs/npu/modules/axis_vector_elementwise.cpp @@ -1,7 +1,7 @@ #include -axis_vector_elementwise::axis_vector_elementwise(const sc_module_name& name, unsigned int thread_id) - : RADSimModule(name), 
+axis_vector_elementwise::axis_vector_elementwise(const sc_module_name& name, unsigned int thread_id, RADSimDesignContext* radsim_design) + : RADSimModule(name, radsim_design), evrf_ififo_rdy_signal("evrf_ififo_rdy_signal"), evrf_ififo_ren_signal("evrf_ififo_ren_signal"), evrf_ififo_rdata_signal("evrf_ififo_rdata_signal"), @@ -136,7 +136,7 @@ axis_vector_elementwise::axis_vector_elementwise(const sc_module_name& name, uns module_name_str = "evrf_inst_axis_interface_" + std::to_string(sector_id); std::strcpy(module_name, module_name_str.c_str()); evrf_inst_axis_interfaces[sector_id] = new axis_slave_fifo_adapter>( - module_name, INSTRUCTION_INTERFACE, VEW_INSTRUCTION_INTERFACE_DATAW, 1, EVRF_MOP_BITWIDTH, 1); + module_name, INSTRUCTION_INTERFACE, VEW_INSTRUCTION_INTERFACE_DATAW, 1, EVRF_MOP_BITWIDTH, 1, radsim_design); evrf_inst_axis_interfaces[sector_id]->clk(clk); evrf_inst_axis_interfaces[sector_id]->rst(rst); evrf_inst_axis_interfaces[sector_id]->fifo_rdy(evrf_mop_rdy_signal[sector_id]); @@ -148,7 +148,7 @@ axis_vector_elementwise::axis_vector_elementwise(const sc_module_name& name, uns std::strcpy(module_name, module_name_str.c_str()); evrf_ififo_axis_interfaces[sector_id] = new axis_slave_fifo_adapter, sc_bv>( - module_name, FEEDFORWARD_INTERFACE, VEW_FEEDFORWARD_INTERFACE_DATAW, CORES, HIGH_PRECISION, DPES_PER_SECTOR); + module_name, FEEDFORWARD_INTERFACE, VEW_FEEDFORWARD_INTERFACE_DATAW, CORES, HIGH_PRECISION, DPES_PER_SECTOR, radsim_design); evrf_ififo_axis_interfaces[sector_id]->clk(clk); evrf_ififo_axis_interfaces[sector_id]->rst(rst); evrf_ififo_axis_interfaces[sector_id]->fifo_rdy(evrf_ififo_rdy_signal[sector_id]); @@ -189,7 +189,7 @@ axis_vector_elementwise::axis_vector_elementwise(const sc_module_name& name, uns module_name_str = "mfu0_inst_interface_" + std::to_string(sector_id); std::strcpy(module_name, module_name_str.c_str()); mfu0_inst_axis_interfaces[sector_id] = new axis_slave_fifo_adapter>( - module_name, INSTRUCTION_INTERFACE, 
VEW_INSTRUCTION_INTERFACE_DATAW, 1, MFU_MOP_BITWIDTH, 1); + module_name, INSTRUCTION_INTERFACE, VEW_INSTRUCTION_INTERFACE_DATAW, 1, MFU_MOP_BITWIDTH, 1, radsim_design); mfu0_inst_axis_interfaces[sector_id]->clk(clk); mfu0_inst_axis_interfaces[sector_id]->rst(rst); mfu0_inst_axis_interfaces[sector_id]->fifo_rdy(mfu0_mop_rdy_signal[sector_id]); @@ -230,7 +230,7 @@ axis_vector_elementwise::axis_vector_elementwise(const sc_module_name& name, uns module_name_str = "mfu1_inst_interface_" + std::to_string(sector_id); std::strcpy(module_name, module_name_str.c_str()); mfu1_inst_axis_interfaces[sector_id] = new axis_slave_fifo_adapter>( - module_name, INSTRUCTION_INTERFACE, VEW_INSTRUCTION_INTERFACE_DATAW, 1, MFU_MOP_BITWIDTH, 1); + module_name, INSTRUCTION_INTERFACE, VEW_INSTRUCTION_INTERFACE_DATAW, 1, MFU_MOP_BITWIDTH, 1, radsim_design); mfu1_inst_axis_interfaces[sector_id]->clk(clk); mfu1_inst_axis_interfaces[sector_id]->rst(rst); mfu1_inst_axis_interfaces[sector_id]->fifo_rdy(mfu1_mop_rdy_signal[sector_id]); @@ -272,7 +272,7 @@ axis_vector_elementwise::axis_vector_elementwise(const sc_module_name& name, uns ld_module->ext_output_fifo_rdata(ext_output_fifo_rdata); ld_inst_axis_interface = new axis_slave_fifo_adapter>( - "ld_inst_axis_interface", INSTRUCTION_INTERFACE, VEW_INSTRUCTION_INTERFACE_DATAW, 1, LD_MOP_BITWIDTH, 1); + "ld_inst_axis_interface", INSTRUCTION_INTERFACE, VEW_INSTRUCTION_INTERFACE_DATAW, 1, LD_MOP_BITWIDTH, 1, radsim_design); ld_inst_axis_interface->clk(clk); ld_inst_axis_interface->rst(rst); ld_inst_axis_interface->fifo_rdy(ld_mop_rdy_signal); @@ -283,7 +283,7 @@ axis_vector_elementwise::axis_vector_elementwise(const sc_module_name& name, uns // Create two write-back master AXI-streaming interfaces (send write-back data to different NPU modules) std::string dest_name = "axis_mvu_sector_0.sector_wb_interface_" + std::to_string(_thread_id); ld_wb0_axis_interface = new axis_master_fifo_adapter, sc_bv>( - "ld_wb0_axis_interface", MVU_WRITEBACK_INTERFACE, 
VEW_WB0_INTERFACE_DATAW, CORES, LOW_PRECISION, dest_name); + "ld_wb0_axis_interface", MVU_WRITEBACK_INTERFACE, VEW_WB0_INTERFACE_DATAW, CORES, LOW_PRECISION, dest_name, radsim_design); ld_wb0_axis_interface->clk(clk); ld_wb0_axis_interface->rst(rst); ld_wb0_axis_interface->fifo_rdy(ld_wb0_rdy_signal); diff --git a/rad-sim/example-designs/npu/modules/axis_vector_elementwise.hpp b/rad-sim/example-designs/npu/modules/axis_vector_elementwise.hpp index bf96c20..86d815c 100644 --- a/rad-sim/example-designs/npu/modules/axis_vector_elementwise.hpp +++ b/rad-sim/example-designs/npu/modules/axis_vector_elementwise.hpp @@ -88,7 +88,7 @@ class axis_vector_elementwise : public RADSimModule { sc_vector> ext_output_fifo_ren; sc_vector>> ext_output_fifo_rdata; - axis_vector_elementwise(const sc_module_name& name, unsigned int thread_id); + axis_vector_elementwise(const sc_module_name& name, unsigned int thread_id, RADSimDesignContext* radsim_design); ~axis_vector_elementwise(); void RegisterModuleInfo(); diff --git a/rad-sim/example-designs/npu/modules/mvu_sector.cpp b/rad-sim/example-designs/npu/modules/mvu_sector.cpp index 31a1500..1a8815d 100644 --- a/rad-sim/example-designs/npu/modules/mvu_sector.cpp +++ b/rad-sim/example-designs/npu/modules/mvu_sector.cpp @@ -1,6 +1,6 @@ #include "mvu_sector.hpp" -mvu_sector::mvu_sector(const sc_module_name& name, unsigned int id) +mvu_sector::mvu_sector(const sc_module_name& name, unsigned int id, RADSimDesignContext* radsim_design) : sc_module(name), inst_valid_pipeline("inst_valid_pipeline", SECTOR_INST_TO_DPES_PIPELINE), inst_pipeline("inst_pipeline", SECTOR_INST_PIPELINE), @@ -173,7 +173,7 @@ mvu_sector::mvu_sector(const sc_module_name& name, unsigned int id) mrfs.resize(DPES_PER_SECTOR); std::string mrf_filename, mrf_path; - std::string npu_dir = radsim_config.GetStringKnob("radsim_user_design_root_dir"); + std::string npu_dir = radsim_config.GetStringKnobPerRad("radsim_user_design_root_dir", radsim_design->rad_id); for (unsigned int 
dpe_id = 0; dpe_id < DPES_PER_SECTOR; dpe_id++) { mrfs[dpe_id].resize(TILES); for (unsigned int tile_id = 0; tile_id < TILES; tile_id++) { diff --git a/rad-sim/example-designs/npu/modules/mvu_sector.hpp b/rad-sim/example-designs/npu/modules/mvu_sector.hpp index afb732c..46234b0 100644 --- a/rad-sim/example-designs/npu/modules/mvu_sector.hpp +++ b/rad-sim/example-designs/npu/modules/mvu_sector.hpp @@ -11,6 +11,7 @@ #include #include #include +#include class mvu_sector : public sc_module { private: @@ -92,7 +93,7 @@ class mvu_sector : public sc_module { sc_vector>> sector_ofifo_ren; sc_vector>>> sector_ofifo_rdata; - mvu_sector(const sc_module_name& name, unsigned int id); + mvu_sector(const sc_module_name& name, unsigned int id, RADSimDesignContext* radsim_design); ~mvu_sector(); void Tick(); diff --git a/rad-sim/example-designs/npu/npu_driver.cpp b/rad-sim/example-designs/npu/npu_driver.cpp index 1dc1393..e51292d 100644 --- a/rad-sim/example-designs/npu/npu_driver.cpp +++ b/rad-sim/example-designs/npu/npu_driver.cpp @@ -1,6 +1,6 @@ #include -npu_driver::npu_driver(const sc_module_name &name) +npu_driver::npu_driver(const sc_module_name &name, RADSimDesignContext* radsim_design_) : sc_module(name), rst("rst"), inst_wdata("inst_wdata"), @@ -19,6 +19,8 @@ npu_driver::npu_driver(const sc_module_name &name) ofifo_ren("ofifo_ren"), ofifo_rdata("ofifo_rdata") { + this->radsim_design = radsim_design_; + init_vector>::init_sc_vector(ififo_rdy, THREADS, CORES); init_vector>::init_sc_vector(ififo_wen, THREADS, CORES); init_vector>>::init_sc_vector(ififo_wdata, THREADS, CORES); @@ -41,7 +43,7 @@ npu_driver::~npu_driver() {} void npu_driver::source() { bool parse_flag; - std::string npu_dir = radsim_config.GetStringKnob("radsim_user_design_root_dir"); + std::string npu_dir = radsim_config.GetStringKnobPerRad("radsim_user_design_root_dir", radsim_design->rad_id); // Parse NPU instructions std::string inst_filename = "/register_files/instructions.txt"; @@ -97,7 +99,7 @@ void 
npu_driver::source() { // Trigger NPU start signal start.write(true); wait(); - start_cycle = GetSimulationCycle(radsim_config.GetDoubleKnob("max_period")); + start_cycle = GetSimulationCycle(radsim_config.GetDoubleKnobShared("sim_driver_period")); start.write(false); wait(); @@ -176,10 +178,10 @@ void npu_driver::sink() { } wait(); } - end_cycle = GetSimulationCycle(radsim_config.GetDoubleKnob("max_period")); + end_cycle = GetSimulationCycle(radsim_config.GetDoubleKnobShared("sim_driver_period")); std::ofstream report; - std::string npu_dir = radsim_config.GetStringKnob("radsim_user_design_root_dir"); + std::string npu_dir = radsim_config.GetStringKnobPerRad("radsim_user_design_root_dir", radsim_design->rad_id); std::string report_filename = "/sim_done"; std::string report_path = npu_dir + report_filename; report.open(report_path); @@ -196,7 +198,9 @@ void npu_driver::sink() { sim_trace_probe.dump_traces(); - sc_stop(); + //sc_stop(); + this->radsim_design->set_rad_done(); //flag to replace sc_stop calls + return; //NoCTransactionTelemetry::DumpStatsToFile("/Users/andrew/PhD/dev/rad-sim-opt-npu-multithread-hard-c2/stats.csv"); } diff --git a/rad-sim/example-designs/npu/npu_driver.hpp b/rad-sim/example-designs/npu/npu_driver.hpp index ad5c462..ffd6f18 100644 --- a/rad-sim/example-designs/npu/npu_driver.hpp +++ b/rad-sim/example-designs/npu/npu_driver.hpp @@ -18,6 +18,7 @@ class npu_driver : public sc_module { std::vector> npu_outputs; public: + RADSimDesignContext* radsim_design; sc_in clk; sc_out rst; sc_out inst_wdata; @@ -36,7 +37,7 @@ class npu_driver : public sc_module { sc_vector>> ofifo_ren; sc_vector>>> ofifo_rdata; - npu_driver(const sc_module_name& name); + npu_driver(const sc_module_name& name, RADSimDesignContext* radsim_design_); ~npu_driver(); void source(); diff --git a/rad-sim/example-designs/npu/npu_system.cpp b/rad-sim/example-designs/npu/npu_system.cpp index 3b9c325..564c6ab 100644 --- a/rad-sim/example-designs/npu/npu_system.cpp +++ 
b/rad-sim/example-designs/npu/npu_system.cpp @@ -1,6 +1,6 @@ #include -npu_system::npu_system(const sc_module_name &name, sc_clock* driver_clk_sig) +npu_system::npu_system(const sc_module_name &name, sc_clock* driver_clk_sig, RADSimDesignContext* radsim_design) : sc_module(name), inst_wdata("inst_wdata"), inst_waddr("inst_waddr"), @@ -26,7 +26,7 @@ npu_system::npu_system(const sc_module_name &name, sc_clock* driver_clk_sig) init_vector>::init_sc_vector(ofifo_ren, THREADS, CORES); init_vector>>::init_sc_vector(ofifo_rdata, THREADS, CORES); - npu_driver_inst = new npu_driver("npu_driver_inst"); + npu_driver_inst = new npu_driver("npu_driver_inst", radsim_design); npu_driver_inst->clk(*driver_clk_sig); npu_driver_inst->rst(rst_sig); npu_driver_inst->inst_wdata(inst_wdata); @@ -45,7 +45,7 @@ npu_system::npu_system(const sc_module_name &name, sc_clock* driver_clk_sig) npu_driver_inst->ofifo_ren(ofifo_ren); npu_driver_inst->ofifo_rdata(ofifo_rdata); - npu_inst = new npu_top("npu_inst"); + npu_inst = new npu_top("npu_inst", radsim_design); npu_inst->rst(rst_sig); npu_inst->inst_wdata(inst_wdata); npu_inst->inst_waddr(inst_waddr); @@ -62,6 +62,9 @@ npu_system::npu_system(const sc_module_name &name, sc_clock* driver_clk_sig) npu_inst->ofifo_rdy(ofifo_rdy); npu_inst->ofifo_ren(ofifo_ren); npu_inst->ofifo_rdata(ofifo_rdata); + + //add _top as dut instance for parent class RADSimDesignSystem + this->design_dut_inst = npu_inst; } npu_system::~npu_system() { diff --git a/rad-sim/example-designs/npu/npu_system.hpp b/rad-sim/example-designs/npu/npu_system.hpp index 1581ea3..1e18097 100644 --- a/rad-sim/example-designs/npu/npu_system.hpp +++ b/rad-sim/example-designs/npu/npu_system.hpp @@ -9,8 +9,9 @@ #include #include #include +#include -class npu_system : public sc_module { +class npu_system : public RADSimDesignSystem { private: public: sc_signal inst_wdata; @@ -33,6 +34,6 @@ class npu_system : public sc_module { npu_driver* npu_driver_inst; npu_top* npu_inst; - npu_system(const 
sc_module_name& name, sc_clock* driver_clk_sig); + npu_system(const sc_module_name& name, sc_clock* driver_clk_sig, RADSimDesignContext* radsim_design); ~npu_system(); }; \ No newline at end of file diff --git a/rad-sim/example-designs/npu/npu_top.cpp b/rad-sim/example-designs/npu/npu_top.cpp index 442ef53..4f3fcdd 100644 --- a/rad-sim/example-designs/npu/npu_top.cpp +++ b/rad-sim/example-designs/npu/npu_top.cpp @@ -1,6 +1,6 @@ #include -npu_top::npu_top(const sc_module_name &name) : sc_module(name), +npu_top::npu_top(const sc_module_name &name, RADSimDesignContext* radsim_design) : RADSimDesignTop(radsim_design), sector_chain_fifo_rdy_signals("sector_chain_fifo_rdy_signals"), sector_chain_fifo_ren_signals("sector_chain_fifo_ren_signals"), sector_chain_fifo_rdata_signals("sector_chain_fifo_rdata_signals"), @@ -31,7 +31,7 @@ npu_top::npu_top(const sc_module_name &name) : sc_module(name), char module_name[NAME_LENGTH]; std::string module_name_str; - first_mvu_sector = new axis_mvu_sector("axis_mvu_sector_0", 0); + first_mvu_sector = new axis_mvu_sector("axis_mvu_sector_0", 0, radsim_design); first_mvu_sector->rst(rst); first_mvu_sector->mrf_waddr(mrf_waddr); first_mvu_sector->mrf_wdata(mrf_wdata); @@ -43,7 +43,7 @@ npu_top::npu_top(const sc_module_name &name) : sc_module(name), for (unsigned int sector_id = 1; sector_id < SECTORS; sector_id++) { module_name_str = "axis_mvu_sector_" + std::to_string(sector_id); std::strcpy(module_name, module_name_str.c_str()); - mvu_sectors[sector_id - 1] = new axis_mvu_sector_chain(module_name, sector_id); + mvu_sectors[sector_id - 1] = new axis_mvu_sector_chain(module_name, sector_id, radsim_design); mvu_sectors[sector_id - 1]->rst(rst); mvu_sectors[sector_id - 1]->mrf_waddr(mrf_waddr); mvu_sectors[sector_id - 1]->mrf_wdata(mrf_wdata); @@ -59,7 +59,7 @@ npu_top::npu_top(const sc_module_name &name) : sc_module(name), for (unsigned int thread_id = 0; thread_id < THREADS; thread_id++) { module_name_str = "axis_inst_dispatcher_" + 
std::to_string(thread_id); std::strcpy(module_name, module_name_str.c_str()); - inst_dispatcher[thread_id] = new axis_inst_dispatch(module_name, thread_id); + inst_dispatcher[thread_id] = new axis_inst_dispatch(module_name, thread_id, radsim_design); inst_dispatcher[thread_id]->rst(rst); inst_dispatcher[thread_id]->start_pc(start_pc); inst_dispatcher[thread_id]->end_pc(end_pc); @@ -70,7 +70,7 @@ npu_top::npu_top(const sc_module_name &name) : sc_module(name), module_name_str = "axis_vector_elementwise_" + std::to_string(thread_id); std::strcpy(module_name, module_name_str.c_str()); - vector_elementwise_blocks[thread_id] = new axis_vector_elementwise(module_name, thread_id); + vector_elementwise_blocks[thread_id] = new axis_vector_elementwise(module_name, thread_id, radsim_design); vector_elementwise_blocks[thread_id]->rst(rst); vector_elementwise_blocks[thread_id]->ext_input_fifo_rdy(ififo_rdy[thread_id]); vector_elementwise_blocks[thread_id]->ext_input_fifo_wen(ififo_wen[thread_id]); @@ -80,9 +80,10 @@ npu_top::npu_top(const sc_module_name &name) : sc_module(name), vector_elementwise_blocks[thread_id]->ext_output_fifo_rdata(ofifo_rdata[thread_id]); } - radsim_design.BuildDesignContext("npu.place", "npu.clks"); - radsim_design.CreateSystemNoCs(rst); - radsim_design.ConnectModulesToNoC(); + this->connectPortalReset(&rst); + radsim_design->BuildDesignContext("npu.place", "npu.clks"); + radsim_design->CreateSystemNoCs(rst); + radsim_design->ConnectModulesToNoC(); } npu_top::~npu_top() { diff --git a/rad-sim/example-designs/npu/npu_top.hpp b/rad-sim/example-designs/npu/npu_top.hpp index 0c0d8ee..02b30f8 100644 --- a/rad-sim/example-designs/npu/npu_top.hpp +++ b/rad-sim/example-designs/npu/npu_top.hpp @@ -12,9 +12,10 @@ #include #include #include +#include -class npu_top : public sc_module { +class npu_top : public RADSimDesignTop { private: sc_vector>>> sector_chain_fifo_rdy_signals; sc_vector>>> sector_chain_fifo_ren_signals; @@ -46,7 +47,7 @@ class npu_top : public 
sc_module { sc_vector>> ofifo_ren; sc_vector>>> ofifo_rdata; - npu_top(const sc_module_name& name); + npu_top(const sc_module_name& name, RADSimDesignContext* radsim_design); ~npu_top(); void prepare_adapters_info(); }; \ No newline at end of file diff --git a/rad-sim/sim/.gitignore b/rad-sim/sim/.gitignore index e7ea263..a7b904a 100644 --- a/rad-sim/sim/.gitignore +++ b/rad-sim/sim/.gitignore @@ -4,4 +4,5 @@ *.txt *.csv *.xml -*.log \ No newline at end of file +*.log +*.trace \ No newline at end of file diff --git a/rad-sim/sim/CMakeLists.txt b/rad-sim/sim/CMakeLists.txt index b7e4bfb..0b78659 100644 --- a/rad-sim/sim/CMakeLists.txt +++ b/rad-sim/sim/CMakeLists.txt @@ -18,9 +18,14 @@ include_directories( dram dram/DRAMsim3 dram/DRAMsim3/src - ../example-designs/${DESIGN} - ../example-designs/${DESIGN}/modules -) + ) + +FOREACH(DESIGN_NAME ${DESIGN_NAMES}) + include_directories( + ../example-designs/${DESIGN_NAME} + ../example-designs/${DESIGN_NAME}/modules + ) +ENDFOREACH() find_package(verilator CONFIG) if (verilator_FOUND) @@ -33,6 +38,9 @@ set(srcfiles radsim_module.cpp radsim_telemetry.cpp radsim_utils.cpp + radsim_cluster.cpp + radsim_inter_rad.cpp + portal.cpp ) set(hdrfiles @@ -42,17 +50,22 @@ set(hdrfiles radsim_module.hpp radsim_telemetry.hpp radsim_utils.hpp + radsim_cluster.hpp + radsim_inter_rad.hpp + design_top.hpp + design_system.hpp + portal.hpp ) add_compile_options(-Wall -Wextra -pedantic) set(CMAKE_CXX_FLAGS_DEBUG "-g") -set(CMAKE_CXX_FLAGS_RELEASE "-O3") +set(CMAKE_CXX_FLAGS_RELEASE "-O3 -g") add_library(radsim STATIC ${srcfiles} ${hdrfiles}) -target_link_libraries(radsim PUBLIC SystemC::systemc booksim noc dram design) +target_link_libraries(radsim PUBLIC SystemC::systemc booksim noc dram ${DESIGN_NAMES}) add_executable(system main.cpp ${srcfiles} ${hdrfiles}) -target_link_libraries(system PUBLIC radsim SystemC::systemc booksim noc dram design) +target_link_libraries(system PUBLIC radsim SystemC::systemc booksim noc dram ${DESIGN_NAMES}) 
add_custom_target(run COMMAND system WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR} diff --git a/rad-sim/sim/design_context.cpp b/rad-sim/sim/design_context.cpp index 1ae3aaf..1ae1be7 100644 --- a/rad-sim/sim/design_context.cpp +++ b/rad-sim/sim/design_context.cpp @@ -1,14 +1,14 @@ #include -RADSimDesignContext::RADSimDesignContext() { - std::string radsim_knobs_filename = "/sim/radsim_knobs"; - std::string radsim_knobs_filepath = RADSIM_ROOT_DIR + radsim_knobs_filename; - ParseRADSimKnobs(radsim_knobs_filepath); +RADSimDesignContext::RADSimDesignContext(unsigned int rad_id_) { + + //assign its rad id + rad_id = rad_id_; // Create NoC clocks std::string clk_name; std::vector noc_period = - radsim_config.GetDoubleVectorKnob("noc_clk_period"); + radsim_config.GetDoubleVectorKnobPerRad("noc_clk_period", rad_id); _noc_clks.resize(noc_period.size()); for (unsigned int clk_id = 0; clk_id < _noc_clks.size(); clk_id++) { clk_name = "noc_clk" + std::to_string(clk_id); @@ -18,7 +18,7 @@ RADSimDesignContext::RADSimDesignContext() { // Create adapter clocks std::vector adapter_period = - radsim_config.GetDoubleVectorKnob("noc_adapters_clk_period"); + radsim_config.GetDoubleVectorKnobPerRad("noc_adapters_clk_period", rad_id); _adapter_clks.resize(adapter_period.size()); for (unsigned int clk_id = 0; clk_id < _adapter_clks.size(); clk_id++) { clk_name = "adapter_clk" + std::to_string(clk_id); @@ -28,7 +28,7 @@ RADSimDesignContext::RADSimDesignContext() { // Create module clocks std::vector module_period = - radsim_config.GetDoubleVectorKnob("design_clk_periods"); + radsim_config.GetDoubleVectorKnobPerRad("design_clk_periods", rad_id); _module_clks.resize(module_period.size()); for (unsigned int clk_id = 0; clk_id < _module_clks.size(); clk_id++) { clk_name = "module_clk" + std::to_string(clk_id); @@ -36,12 +36,13 @@ RADSimDesignContext::RADSimDesignContext() { new sc_clock(clk_name.c_str(), module_period[clk_id], SC_NS); } - int num_nocs = 
radsim_config.GetIntKnob("noc_num_nocs"); + int num_nocs = radsim_config.GetIntKnobPerRad("noc_num_nocs", rad_id); _node_module_names.resize(num_nocs); for (int noc_id = 0; noc_id < num_nocs; noc_id++) { - int num_nodes = radsim_config.GetIntVectorKnob("noc_num_nodes", noc_id); + int num_nodes = radsim_config.GetIntVectorKnobPerRad("noc_num_nodes", noc_id, rad_id); _node_module_names[noc_id].resize(num_nodes); } + rad_done = false; //initially this RAD is not done its simulation design } RADSimDesignContext::~RADSimDesignContext() {} @@ -69,11 +70,11 @@ std::string GetModuleNameFromPortName(std::string &port_name) { return module_name; } -uint64_t DeterminedBaseAddress(int noc_id, int node_id) { - int num_nocs = radsim_config.GetIntKnob("noc_num_nocs"); +uint64_t DetermineBaseAddress(int noc_id, int node_id, int rad_id) { + int num_nocs = radsim_config.GetIntKnobPerRad("noc_num_nocs", rad_id); int max_num_nodes = 0; for (int noc_id = 0; noc_id < num_nocs; noc_id++) { - int num_nodes = radsim_config.GetIntVectorKnob("noc_num_nodes", noc_id); + int num_nodes = radsim_config.GetIntVectorKnobPerRad("noc_num_nodes", noc_id, rad_id); if (num_nodes > max_num_nodes) { max_num_nodes = num_nodes; } @@ -86,12 +87,12 @@ uint64_t DeterminedBaseAddress(int noc_id, int node_id) { return base_addr; } -void RADSimDesignContext::ParseNoCPlacement( - const std::string &placement_filename) { +void RADSimDesignContext::ParseNoCPlacement(const std::string &placement_filename) { std::string placement_filepath = - radsim_config.GetStringKnob("radsim_user_design_root_dir") + "/" + + radsim_config.GetStringKnobPerRad("radsim_user_design_root_dir", rad_id) + "/" + placement_filename; std::ifstream placement_file(placement_filepath); + std::cout << "placement_filepath: " << placement_filepath << std::endl; std::string line; while (std::getline(placement_file, line)) { @@ -153,7 +154,7 @@ void RADSimDesignContext::ParseNoCPlacement( // Set base address information 
_aximm_port_base_addresses[port_name] = - DeterminedBaseAddress(port_noc_placement, port_node_placement); + DetermineBaseAddress(port_noc_placement, port_node_placement, rad_id); } } else { std::string module_name, port_name, port_noc_placement_str, @@ -257,7 +258,7 @@ void RADSimDesignContext::ParseNoCPlacement( } // Set base address information _aximm_port_base_addresses[port_name] = - DeterminedBaseAddress(port_noc_placement, port_node_placement); + DetermineBaseAddress(port_noc_placement, port_node_placement, rad_id); } for (unsigned int port_id = 0; @@ -287,7 +288,7 @@ void RADSimDesignContext::ParseNoCPlacement( } // Set base address information _aximm_port_base_addresses[port_name] = - DeterminedBaseAddress(port_noc_placement, port_node_placement); + DetermineBaseAddress(port_noc_placement, port_node_placement, rad_id); } } _node_module_names[port_noc_placement][port_node_placement].insert( @@ -298,7 +299,7 @@ void RADSimDesignContext::ParseNoCPlacement( void RADSimDesignContext::ParseClockSettings(const std::string &clks_filename) { std::string clks_filepath = - radsim_config.GetStringKnob("radsim_user_design_root_dir") + "/" + + radsim_config.GetStringKnobPerRad("radsim_user_design_root_dir", rad_id) + "/" + clks_filename; std::ifstream clks_file(clks_filepath); @@ -327,9 +328,8 @@ void RADSimDesignContext::RegisterModule(std::string module_name, _design_modules[module_name] = module_ptr; } -void RADSimDesignContext::BuildDesignContext( - const std::string &placement_filename, const std::string &clks_filename) { - unsigned int num_nocs = radsim_config.GetIntKnob("noc_num_nocs"); +void RADSimDesignContext::BuildDesignContext(const std::string &placement_filename, const std::string &clks_filename) { + unsigned int num_nocs = radsim_config.GetIntKnobPerRad("noc_num_nocs", rad_id); _node_id_is_aximm.resize(num_nocs); _node_id_ports_list.resize(num_nocs); _noc_axis_slave_adapter_info.resize(num_nocs); @@ -464,16 +464,17 @@ void 
RADSimDesignContext::BuildDesignContext( } void RADSimDesignContext::CreateSystemNoCs(sc_in &rst) { - unsigned int num_nocs = radsim_config.GetIntKnob("noc_num_nocs"); + unsigned int num_nocs = radsim_config.GetIntKnobPerRad("noc_num_nocs", rad_id); for (unsigned int noc_id = 0; noc_id < num_nocs; noc_id++) { std::string noc_name_str = "radsim_noc_" + std::to_string(noc_id); const char *noc_name = noc_name_str.c_str(); radsim_noc *noc_inst = - new radsim_noc(noc_name, noc_id, _adapter_clks, _module_clks, + new radsim_noc(noc_name, rad_id, portal_slave_name, noc_id, _adapter_clks, _module_clks, _noc_axis_master_adapter_info[noc_id], _noc_axis_slave_adapter_info[noc_id], _noc_aximm_master_adapter_info[noc_id], - _noc_aximm_slave_adapter_info[noc_id]); + _noc_aximm_slave_adapter_info[noc_id], + this); noc_inst->noc_clk(*_noc_clks[noc_id]); noc_inst->rst(rst); @@ -488,26 +489,31 @@ void RADSimDesignContext::ConnectModulesToNoC() { for (auto module_it = _design_modules.begin(); module_it != _design_modules.end(); module_it++) { RADSimModule *module_ptr = module_it->second; - // std::cout << "MODULE " << module_ptr->name() << std::endl; + //std::cout << "MODULE " << module_ptr->name() << std::endl; // Connect AXI-S Slave ports of the module - // std::cout << "AXI-S slave ports: " << std::endl; + //std::cout << "AXI-S slave ports: " << std::endl; + //std::cout << module_ptr->_axis_slave_ports.begin()->first<< std::endl; for (auto slave_port_it = module_ptr->_axis_slave_ports.begin(); slave_port_it != module_ptr->_axis_slave_ports.end(); slave_port_it++) { + //std::cout << "here" << std::endl; std::string port_name = slave_port_it->first; // std::cout << port_name << ", "; if (_port_placement.find(port_name) == _port_placement.end()) sim_log.log(error, "Port " + port_name + " has no NoC placement defined!"); unsigned int noc_id = std::get<0>(_port_placement[port_name]); + //std::cout << _noc_axis_master_ports[noc_id][port_name] << std::endl; 
_axis_signals[axis_signal_id].Connect( *(_noc_axis_master_ports[noc_id][port_name]), *(slave_port_it->second)); + //std::cout << "here3" << std::endl; axis_signal_id++; + //std::cout << axis_signal_id << std::endl; } // Connect AXI-S Master ports of the module - // std::cout << "\nAXI-S master ports: "; + //std::cout << "\nAXI-S master ports: "; for (auto master_port_it = module_ptr->_axis_master_ports.begin(); master_port_it != module_ptr->_axis_master_ports.end(); master_port_it++) { @@ -523,7 +529,7 @@ void RADSimDesignContext::ConnectModulesToNoC() { } // Connect AXI-MM Slave ports of the module - // std::cout << "\nAXI-MM slave ports: "; + //std::cout << "\nAXI-MM slave ports: "; for (auto slave_port_it = module_ptr->_aximm_slave_ports.begin(); slave_port_it != module_ptr->_aximm_slave_ports.end(); slave_port_it++) { @@ -539,7 +545,7 @@ void RADSimDesignContext::ConnectModulesToNoC() { } // Connect AXI-MM Master ports of the module - // std::cout << "\nAXI-MM master ports: "; + //std::cout << "\nAXI-MM master ports: "; for (auto master_port_it = module_ptr->_aximm_master_ports.begin(); master_port_it != module_ptr->_aximm_master_ports.end(); master_port_it++) { @@ -553,7 +559,7 @@ void RADSimDesignContext::ConnectModulesToNoC() { *(_noc_aximm_slave_ports[noc_id][port_name])); aximm_signal_id++; } - // std::cout << "\n"; + //std::cout << "\n"; } } @@ -679,9 +685,9 @@ void RADSimDesignContext::DumpDesignContext() { cin.get(); } -std::vector>> & +std::vector>> RADSimDesignContext::GetNodeModuleNames() { - return _node_module_names; + return std::ref(_node_module_names); } uint64_t RADSimDesignContext::GetPortBaseAddress(std::string &port_name) { @@ -696,4 +702,26 @@ int RADSimDesignContext::GetSimExitCode() { void RADSimDesignContext::ReportDesignFailure() { _sim_exit_code = 1; +} + +//returns whether the rad is done simulation. needed because rad_done is private member. 
+bool +RADSimDesignContext::is_rad_done() { + return this->rad_done; +} + +void +RADSimDesignContext::set_rad_done() { + this->rad_done = true; +} + +void +RADSimDesignContext::AssignPortalSlaveName(std::string name) { + //std::cout << "design_context assigned portal name: " << name << std::endl; + this->portal_slave_name = name; +} + +unsigned int +RADSimDesignContext::GetPortalSlaveID () { + return GetPortDestinationID(portal_slave_name); } \ No newline at end of file diff --git a/rad-sim/sim/design_context.hpp b/rad-sim/sim/design_context.hpp index cf9ead2..73aa7fd 100644 --- a/rad-sim/sim/design_context.hpp +++ b/rad-sim/sim/design_context.hpp @@ -54,8 +54,13 @@ class RADSimDesignContext { _noc_aximm_master_ports; std::vector _aximm_signals; + //flag to indicate if this device done + bool rad_done; + public: - RADSimDesignContext(); + unsigned int rad_id; //unique ID of this RAD + std::string portal_slave_name; //when a portal module is created for the RAD, its name is stored here to lookup its node ID on the NoC + RADSimDesignContext(unsigned int rad_id_); ~RADSimDesignContext(); void ParseNoCPlacement(const std::string &placement_filename); void ParseClockSettings(const std::string &clks_filename); @@ -96,11 +101,13 @@ class RADSimDesignContext { unsigned int GetPortDestinationID(std::string &port_name); unsigned int GetPortInterfaceID(std::string &port_name); void DumpDesignContext(); - std::vector>> &GetNodeModuleNames(); + std::vector>> GetNodeModuleNames(); uint64_t GetPortBaseAddress(std::string &port_name); int GetSimExitCode(); void ReportDesignFailure(); -}; - -extern RADSimDesignContext radsim_design; \ No newline at end of file + bool is_rad_done(); + void set_rad_done(); + void AssignPortalSlaveName(std::string name); + unsigned int GetPortalSlaveID (); +}; \ No newline at end of file diff --git a/rad-sim/sim/design_system.hpp b/rad-sim/sim/design_system.hpp new file mode 100644 index 0000000..f830292 --- /dev/null +++ 
b/rad-sim/sim/design_system.hpp @@ -0,0 +1,9 @@ +#pragma once + +#include +#include + +class RADSimDesignSystem : virtual public sc_module { + public: + RADSimDesignTop* design_dut_inst; +}; \ No newline at end of file diff --git a/rad-sim/sim/design_top.hpp b/rad-sim/sim/design_top.hpp new file mode 100644 index 0000000..5bd9754 --- /dev/null +++ b/rad-sim/sim/design_top.hpp @@ -0,0 +1,44 @@ +#pragma once + +#include +#include +#include + +//Parent class for top-level classes of example designs. +//The class constructor creates a portal module and connects it to the ports on the design. +//Communication between RADs relies on the portal module as a connection between the device's NoC and the inter-RAD network. +//The definition of SINGLE_RAD is used to prevent creation of the portal module for simulations of only one RAD. +class RADSimDesignTop : virtual public sc_module { + public: + #ifndef SINGLE_RAD + axis_slave_port design_top_portal_axis_slave; + axis_master_port design_top_portal_axis_master; + portal* portal_inst; + #endif + RADSimDesignTop(RADSimDesignContext* radsim_design) { + #ifndef SINGLE_RAD + //create portal module + std::string module_name_str = "portal_inst"; + char module_name[25]; + std::strcpy(module_name, module_name_str.c_str()); + portal_inst = new portal(module_name, radsim_design); + + //connect master to master instead, to expose to top + portal_inst->portal_axis_master.ConnectToPort(this->design_top_portal_axis_master); + portal_inst->portal_axis_slave.ConnectToPort(this->design_top_portal_axis_slave); //top drives portal bc top receives slave inputs + + //connect reset signal + // portal_inst->rst(rst); + #endif + } + ~RADSimDesignTop() { + #ifndef SINGLE_RAD + delete portal_inst; + #endif + } + void connectPortalReset(sc_in* rst) { + #ifndef SINGLE_RAD + this->portal_inst->rst(*rst); + #endif + } +}; \ No newline at end of file diff --git a/rad-sim/sim/dram/DRAMsim3/Makefile b/rad-sim/sim/dram/DRAMsim3/Makefile deleted file mode 
100644 index 130f7b5..0000000 --- a/rad-sim/sim/dram/DRAMsim3/Makefile +++ /dev/null @@ -1,43 +0,0 @@ -# ONLY use this makefile if you do NOT have a cmake 3.0+ version - -CC=gcc -CXX=g++ - -FMT_LIB_DIR=ext/fmt/include -INI_LIB_DIR=ext/headers -JSON_LIB_DIR=ext/headers -ARGS_LIB_DIR=ext/headers - -INC=-Isrc/ -I$(FMT_LIB_DIR) -I$(INI_LIB_DIR) -I$(ARGS_LIB_DIR) -I$(JSON_LIB_DIR) -CXXFLAGS=-Wall -O3 -fPIC -std=c++11 $(INC) -DFMT_HEADER_ONLY=1 - -LIB_NAME=libdramsim3.so -EXE_NAME=dramsim3main.out - -SRCS = src/bankstate.cc src/channel_state.cc src/command_queue.cc src/common.cc \ - src/configuration.cc src/controller.cc src/dram_system.cc src/hmc.cc \ - src/memory_system.cc src/refresh.cc src/simple_stats.cc src/timing.cc - -EXE_SRCS = src/cpu.cc src/main.cc - -OBJECTS = $(addsuffix .o, $(basename $(SRCS))) -EXE_OBJS = $(addsuffix .o, $(basename $(EXE_SRCS))) -EXE_OBJS := $(EXE_OBJS) $(OBJECTS) - - -all: $(LIB_NAME) $(EXE_NAME) - -$(EXE_NAME): $(EXE_OBJS) - $(CXX) $(CXXFLAGS) -o $@ $^ - -$(LIB_NAME): $(OBJECTS) - $(CXX) -g -shared -Wl,-soname,$@ -o $@ $^ - -%.o : %.cc - $(CXX) $(CXXFLAGS) -o $@ -c $< - -%.o : %.c - $(CC) -fPIC -O2 -o $@ -c $< - -clean: - -rm -f $(EXE_OBJS) $(LIB_NAME) $(EXE_NAME) \ No newline at end of file diff --git a/rad-sim/sim/dram/mem_controller.cpp b/rad-sim/sim/dram/mem_controller.cpp index 45ab503..efe1f4b 100644 --- a/rad-sim/sim/dram/mem_controller.cpp +++ b/rad-sim/sim/dram/mem_controller.cpp @@ -31,16 +31,18 @@ void mem_controller::InitializeMemoryContents(std::string &init_filename) { } mem_controller::mem_controller(const sc_module_name &name, unsigned int dram_id, - std::string init_filename) - : RADSimModule(name), mem_clk("mem_clk"), rst("rst") { + RADSimDesignContext* radsim_design, std::string init_filename) + : RADSimModule(name, radsim_design), mem_clk("mem_clk"), rst("rst") { std::string config_file = - radsim_config.GetStringKnob("radsim_root_dir") + + radsim_config.GetStringKnobShared("radsim_root_dir") + 
"/sim/dram/DRAMsim3/configs/" + - radsim_config.GetStringVectorKnob("dram_config_files", dram_id) + ".ini"; + radsim_config.GetStringVectorKnobPerRad("dram_config_files", dram_id, radsim_design->rad_id) + ".ini"; + + //std::cout << "mem_controller::mem_controller() config_file: " << config_file << std::endl; std::string output_dir = - radsim_config.GetStringKnob("radsim_root_dir") + "/logs"; + radsim_config.GetStringKnobShared("radsim_root_dir") + "/logs"; _dramsim = new dramsim3::MemorySystem( config_file, output_dir, @@ -49,6 +51,7 @@ mem_controller::mem_controller(const sc_module_name &name, unsigned int dram_id, dram_id); _mem_id = dram_id; _num_channels = _dramsim->GetChannels(); + //std::cout << "mem_controller.cpp mem_controller() _num_channels: " << _num_channels << std::endl; mem_channels.init(_num_channels); _memory_channel_bitwidth = _dramsim->GetBusBits(); @@ -56,7 +59,7 @@ mem_controller::mem_controller(const sc_module_name &name, unsigned int dram_id, _dramsim->GetBusBits() * _dramsim->GetBurstLength(); _memory_clk_period_ns = _dramsim->GetTCK(); _controller_clk_period_ns = - radsim_config.GetDoubleVectorKnob("dram_clk_periods", dram_id); + radsim_config.GetDoubleVectorKnobPerRad("dram_clk_periods", dram_id, radsim_design->rad_id); double bitwidth_ratio = 1.0 * _controller_channel_bitwidth / _memory_channel_bitwidth; double clk_period_ratio = @@ -90,9 +93,9 @@ mem_controller::mem_controller(const sc_module_name &name, unsigned int dram_id, _output_write_queue_occupancy.init(_num_channels); _output_read_queue_occupancy.init(_num_channels); _input_queue_size = - radsim_config.GetIntVectorKnob("dram_queue_sizes", dram_id); + radsim_config.GetIntVectorKnobPerRad("dram_queue_sizes", dram_id, radsim_design->rad_id); _output_queue_size = - radsim_config.GetIntVectorKnob("dram_queue_sizes", dram_id); + radsim_config.GetIntVectorKnobPerRad("dram_queue_sizes", dram_id, radsim_design->rad_id); _num_ranks = _dramsim->GetRanks(); _num_bank_groups = 
_dramsim->GetBankGroups(); @@ -625,6 +628,7 @@ void mem_controller::RegisterModuleInfo() { for (unsigned int ch_id = 0; ch_id < _num_channels; ch_id++) { port_name = module_name + ".mem_channel_" + std::to_string(ch_id); + //std::cout << "mem_controller::RegisterModuleInfo() port_name: " << port_name << std::endl; RegisterAximmSlavePort(port_name, &mem_channels[ch_id], _addressable_size_bytes * 8); } diff --git a/rad-sim/sim/dram/mem_controller.hpp b/rad-sim/sim/dram/mem_controller.hpp index 7495ece..a8f4fd3 100644 --- a/rad-sim/sim/dram/mem_controller.hpp +++ b/rad-sim/sim/dram/mem_controller.hpp @@ -82,8 +82,10 @@ class mem_controller : public RADSimModule { sc_in mem_clk; sc_in rst; sc_vector mem_channels; + RADSimDesignContext* radsim_design; mem_controller(const sc_module_name &name, unsigned int dram_id, + RADSimDesignContext* radsim_design, std::string init_filename = ""); ~mem_controller(); diff --git a/rad-sim/sim/dram/mem_controller_test.cpp b/rad-sim/sim/dram/mem_controller_test.cpp index 62dd6c5..cd4645a 100644 --- a/rad-sim/sim/dram/mem_controller_test.cpp +++ b/rad-sim/sim/dram/mem_controller_test.cpp @@ -4,9 +4,11 @@ mem_controller_test::mem_controller_test( const sc_module_name &name, unsigned int num_cmds, unsigned int test_mode, unsigned int burst_size, unsigned int num_channels, unsigned int mem_capacity_mb, unsigned int num_used_channels, - unsigned int addressable_word_size_bytes, double clk_period) + unsigned int addressable_word_size_bytes, double clk_period, RADSimDesignContext* radsim_design) : sc_module(name) { + this -> radsim_design = radsim_design; + tx_interface.init(num_channels); _burst_size = burst_size; @@ -316,7 +318,7 @@ void mem_controller_test::assign() { } } -mem_controller_system::mem_controller_system(const sc_module_name &name) +mem_controller_system::mem_controller_system(const sc_module_name &name, RADSimDesignContext* radsim_design) : sc_module(name) { double clk_period = 2.0; double mem_clk_period = 1.0; @@ -328,7 
+330,7 @@ mem_controller_system::mem_controller_system(const sc_module_name &name) clk_sig = new sc_clock("clk0", clk_period, SC_NS); mem_clk_sig = new sc_clock("mem_clk", mem_clk_period, SC_NS); - dut_inst = new mem_controller("mem_controller", 0); + dut_inst = new mem_controller("mem_controller", 0, radsim_design); dut_inst->clk(*clk_sig); dut_inst->mem_clk(*mem_clk_sig); dut_inst->rst(rst_sig); @@ -339,7 +341,7 @@ mem_controller_system::mem_controller_system(const sc_module_name &name) test_inst = new mem_controller_test( "mem_controller_test", total_cmds, mode, burst_size, num_channels, dut_inst->GetMemCapacity(), num_used_channels, - dut_inst->GetAddressableWordSize(), clk_period); + dut_inst->GetAddressableWordSize(), clk_period, radsim_design); test_inst->clk(*clk_sig); test_inst->rst(rst_sig); diff --git a/rad-sim/sim/dram/mem_controller_test.hpp b/rad-sim/sim/dram/mem_controller_test.hpp index f2d3bdc..ab8cffb 100644 --- a/rad-sim/sim/dram/mem_controller_test.hpp +++ b/rad-sim/sim/dram/mem_controller_test.hpp @@ -28,13 +28,15 @@ class mem_controller_test : public sc_module { sc_in clk; sc_out rst; sc_vector tx_interface; + RADSimDesignContext* radsim_design; mem_controller_test(const sc_module_name &name, unsigned int num_cmds, unsigned int test_mode, unsigned int burst_size, unsigned int num_channels, unsigned int mem_capacity_mb, unsigned int num_used_channels, unsigned int addressable_word_size_bytes, - double clk_peiod); + double clk_peiod, + RADSimDesignContext* radsim_design); ~mem_controller_test(); void aw_source(); @@ -57,6 +59,6 @@ class mem_controller_system : public sc_module { sc_clock *clk_sig; sc_clock *mem_clk_sig; - mem_controller_system(const sc_module_name &name); + mem_controller_system(const sc_module_name &name, RADSimDesignContext* radsim_design); ~mem_controller_system(); }; \ No newline at end of file diff --git a/rad-sim/sim/main.cpp b/rad-sim/sim/main.cpp index a946870..3384f44 100644 --- a/rad-sim/sim/main.cpp +++ 
b/rad-sim/sim/main.cpp @@ -4,31 +4,63 @@ #include #include #include +#include +#include -#include +#include +#define NUM_RADS 2 RADSimConfig radsim_config; -RADSimDesignContext radsim_design; std::ostream *gWatchOut; SimLog sim_log; SimTraceRecording sim_trace_probe; int sc_main(int argc, char *argv[]) { + std::string radsim_knobs_filename = "/sim/radsim_knobs"; + std::string radsim_knobs_filepath = RADSIM_ROOT_DIR + radsim_knobs_filename; + radsim_config.ResizeAll(NUM_RADS); + ParseRADSimKnobs(radsim_knobs_filepath); + + RADSimCluster* cluster = new RADSimCluster(NUM_RADS); + gWatchOut = &cout; - int log_verbosity = radsim_config.GetIntKnob("telemetry_log_verbosity"); + int log_verbosity = radsim_config.GetIntKnobShared("telemetry_log_verbosity"); sim_log.SetLogSettings(log_verbosity, "sim.log"); - int num_traces = radsim_config.GetIntKnob("telemetry_num_traces"); + int num_traces = radsim_config.GetIntKnobShared("telemetry_num_traces"); sim_trace_probe.SetTraceRecordingSettings("sim.trace", num_traces); - sc_clock *driver_clk_sig = new sc_clock( - "node_clk0", radsim_config.GetDoubleKnob("sim_driver_period"), SC_NS); + sc_clock *driver_clk_sig0 = new sc_clock( + "node_clk0", radsim_config.GetDoubleKnobShared("sim_driver_period"), SC_NS); + dlrm_two_rad_system *system0 = new dlrm_two_rad_system("dlrm_two_rad_system", driver_clk_sig0, cluster->all_rads[0]); + cluster->StoreSystem(system0); + sc_clock *driver_clk_sig1 = new sc_clock( + "node_clk0", radsim_config.GetDoubleKnobShared("sim_driver_period"), SC_NS); + dlrm_two_rad_system *system1 = new dlrm_two_rad_system("dlrm_two_rad_system", driver_clk_sig1, cluster->all_rads[1]); + cluster->StoreSystem(system1); + + sc_clock *inter_rad_clk_sig = new sc_clock( + "node_clk0", radsim_config.GetDoubleKnobShared("sim_driver_period"), SC_NS); + RADSimInterRad* blackbox = new RADSimInterRad("inter_rad_box", inter_rad_clk_sig, cluster); + + blackbox->ConnectClusterInterfaces(0); + blackbox->ConnectClusterInterfaces(1); + + 
int start_cycle = GetSimulationCycle(radsim_config.GetDoubleKnobShared("sim_driver_period")); + while (cluster->AllRADsNotDone()) { + sc_start(1, SC_NS); + } + int end_cycle = GetSimulationCycle(radsim_config.GetDoubleKnobShared("sim_driver_period")); + sc_stop(); + std::cout << "Simulation Cycles from main.cpp = " << end_cycle - start_cycle << std::endl; - dlrm_system *system = new dlrm_system("dlrm_system", driver_clk_sig); - sc_start(); + delete system0; + delete driver_clk_sig0; + delete system1; + delete driver_clk_sig1; + delete blackbox; + delete inter_rad_clk_sig; - delete system; - delete driver_clk_sig; sc_flit scf; scf.FreeAllFlits(); Flit *f = Flit::New(); @@ -38,5 +70,5 @@ int sc_main(int argc, char *argv[]) { sim_trace_probe.dump_traces(); (void)argc; (void)argv; - return radsim_design.GetSimExitCode(); + return cluster->all_rads[0]->GetSimExitCode(); } diff --git a/rad-sim/sim/noc/.gitignore b/rad-sim/sim/noc/.gitignore new file mode 100644 index 0000000..7ef2347 --- /dev/null +++ b/rad-sim/sim/noc/.gitignore @@ -0,0 +1 @@ +noc*_rad*_config diff --git a/rad-sim/sim/noc/aximm_master_adapter.cpp b/rad-sim/sim/noc/aximm_master_adapter.cpp index 259abc2..200ef3a 100644 --- a/rad-sim/sim/noc/aximm_master_adapter.cpp +++ b/rad-sim/sim/noc/aximm_master_adapter.cpp @@ -1,7 +1,7 @@ #include "aximm_master_adapter.hpp" aximm_master_adapter::aximm_master_adapter( - const sc_module_name &name, int node_id, int network_id, + const sc_module_name &name, unsigned int rad_id, int node_id, int network_id, BookSimConfig *noc_config, Network *noc, BufferState *buffer_state, tRoutingFunction routing_func, bool lookahead_routing, bool wait_for_tail_credit, map *ejected_flits, @@ -9,11 +9,12 @@ aximm_master_adapter::aximm_master_adapter( : sc_module(name) { // Initialize basic adapter member variables + _rad_id = rad_id; _node_id = node_id; _network_id = network_id; _node_period = node_period; _adapter_period = adapter_period; - _noc_period = 
radsim_config.GetDoubleVectorKnob("noc_clk_period", _network_id); + _noc_period = radsim_config.GetDoubleVectorKnobPerRad("noc_clk_period", _network_id, _rad_id); _interface_dataw = interface_dataw; _noc_config = noc_config; @@ -26,7 +27,7 @@ aximm_master_adapter::aximm_master_adapter( // Initialize request interface (AR, AW, W) member variables _ejection_afifo_depth = - radsim_config.GetIntVectorKnob("noc_adapters_fifo_size", _network_id); + radsim_config.GetIntVectorKnobPerRad("noc_adapters_fifo_size", _network_id, _rad_id); _ejection_afifos.resize(AXI_NUM_REQ_TYPES); _ejection_afifo_push_counter.init(AXI_NUM_REQ_TYPES); _ejection_afifo_pop_counter.init(AXI_NUM_REQ_TYPES); @@ -45,7 +46,7 @@ aximm_master_adapter::aximm_master_adapter( // Initialize response interface (B, R) member variables _injection_afifo_depth = - radsim_config.GetIntVectorKnob("noc_adapters_fifo_size", _network_id); + radsim_config.GetIntVectorKnobPerRad("noc_adapters_fifo_size", _network_id, _rad_id); _axi_transaction_width = AXI4_USERW; if ((AXI4_ADDRW + AXI4_CTRLW) > (_interface_dataw + AXI4_RESPW + 1)) { _axi_transaction_width += (AXI4_ADDRW + AXI4_CTRLW); @@ -593,9 +594,9 @@ void aximm_master_adapter::InputInjection() { booksim_flit->subnetwork = 0; booksim_flit->src = _node_id; booksim_flit->ctime = GetSimulationCycle( - radsim_config.GetDoubleVectorKnob("noc_clk_period", _network_id)); + radsim_config.GetDoubleVectorKnobPerRad("noc_clk_period", _network_id, _rad_id)); booksim_flit->itime = GetSimulationCycle( - radsim_config.GetDoubleVectorKnob("noc_clk_period", _network_id)); + radsim_config.GetDoubleVectorKnobPerRad("noc_clk_period", _network_id, _rad_id)); booksim_flit->cl = 0; booksim_flit->head = _to_be_injected_flit._head; booksim_flit->tail = _to_be_injected_flit._tail; diff --git a/rad-sim/sim/noc/aximm_master_adapter.hpp b/rad-sim/sim/noc/aximm_master_adapter.hpp index aa775e3..6e7547a 100644 --- a/rad-sim/sim/noc/aximm_master_adapter.hpp +++ 
b/rad-sim/sim/noc/aximm_master_adapter.hpp @@ -16,6 +16,7 @@ class aximm_master_adapter : public sc_module { private: // The ID and reconfigurable width of the node this adapter is connected to + unsigned int _rad_id; int _node_id; int _network_id; int _interface_dataw; @@ -84,7 +85,7 @@ class aximm_master_adapter : public sc_module { sc_in rst; aximm_master_port aximm_interface; - aximm_master_adapter(const sc_module_name &name, int node_id, int network_id, + aximm_master_adapter(const sc_module_name &name, unsigned int rad_id, int node_id, int network_id, BookSimConfig *noc_config, Network *noc, BufferState *buffer_state, tRoutingFunction routing_func, bool lookahead_routing, bool wait_for_tail_credit, diff --git a/rad-sim/sim/noc/aximm_slave_adapter.cpp b/rad-sim/sim/noc/aximm_slave_adapter.cpp index 82508de..03738c8 100644 --- a/rad-sim/sim/noc/aximm_slave_adapter.cpp +++ b/rad-sim/sim/noc/aximm_slave_adapter.cpp @@ -3,7 +3,7 @@ std::unordered_map>> stats; aximm_slave_adapter::aximm_slave_adapter( - const sc_module_name &name, int node_id, int network_id, + const sc_module_name &name, unsigned int rad_id, int node_id, int network_id, BookSimConfig *noc_config, Network *noc, BufferState *buffer_state, tRoutingFunction routing_func, bool lookahead_routing, bool wait_for_tail_credit, map *ejected_flits, @@ -11,11 +11,12 @@ aximm_slave_adapter::aximm_slave_adapter( : sc_module(name) { // Initialize basic adapter member variables + _rad_id = rad_id; _node_id = node_id; _network_id = network_id; _node_period = node_period; _adapter_period = adapter_period; - _noc_period = radsim_config.GetDoubleVectorKnob("noc_clk_period", _network_id); + _noc_period = radsim_config.GetDoubleVectorKnobPerRad("noc_clk_period", _network_id, _rad_id); _interface_dataw = interface_dataw; _noc_config = noc_config; @@ -28,7 +29,7 @@ aximm_slave_adapter::aximm_slave_adapter( // Initialize request interface (AR, AW, W) member variables _injection_afifo_depth = - 
radsim_config.GetIntVectorKnob("noc_adapters_fifo_size", _network_id); + radsim_config.GetIntVectorKnobPerRad("noc_adapters_fifo_size", _network_id, _rad_id); _axi_transaction_width = AXI4_USERW; if ((AXI4_ADDRW + AXI4_CTRLW) > (_interface_dataw + AXI4_RESPW + 1)) { @@ -61,7 +62,7 @@ aximm_slave_adapter::aximm_slave_adapter( // Initialize response interface (B, R) member variables _ejected_booksim_flit = nullptr; _ejection_afifo_depth = - radsim_config.GetIntVectorKnob("noc_adapters_fifo_size", _network_id); + radsim_config.GetIntVectorKnobPerRad("noc_adapters_fifo_size", _network_id, _rad_id); _ejection_afifos.resize(AXI_NUM_RSP_TYPES); _ejection_afifo_push_counter.init(AXI_NUM_RSP_TYPES); _ejection_afifo_pop_counter.init(AXI_NUM_RSP_TYPES); @@ -402,9 +403,9 @@ void aximm_slave_adapter::InputInjection() { booksim_flit->subnetwork = 0; booksim_flit->src = _node_id; booksim_flit->ctime = GetSimulationCycle( - radsim_config.GetDoubleVectorKnob("noc_clk_period", _network_id)); + radsim_config.GetDoubleVectorKnobPerRad("noc_clk_period", _network_id, _rad_id)); booksim_flit->itime = GetSimulationCycle( - radsim_config.GetDoubleVectorKnob("noc_clk_period", _network_id)); + radsim_config.GetDoubleVectorKnobPerRad("noc_clk_period", _network_id, _rad_id)); booksim_flit->cl = 0; booksim_flit->head = _to_be_injected_flit._head; booksim_flit->tail = _to_be_injected_flit._tail; diff --git a/rad-sim/sim/noc/aximm_slave_adapter.hpp b/rad-sim/sim/noc/aximm_slave_adapter.hpp index 56346f9..7d05d8d 100644 --- a/rad-sim/sim/noc/aximm_slave_adapter.hpp +++ b/rad-sim/sim/noc/aximm_slave_adapter.hpp @@ -51,6 +51,7 @@ class aximm_slave_adapter : public sc_module { private: // The node ID, network ID and data width of the node this adapter is // connected to + unsigned int _rad_id; int _node_id; int _network_id; int _interface_dataw; @@ -165,7 +166,7 @@ class aximm_slave_adapter : public sc_module { // AXI-MM Master Port aximm_slave_port aximm_interface; - aximm_slave_adapter(const 
sc_module_name &name, int node_id, int network_id, + aximm_slave_adapter(const sc_module_name &name, unsigned int rad_id, int node_id, int network_id, BookSimConfig *noc_config, Network *noc, BufferState *buffer_state, tRoutingFunction routing_func, bool lookahead_routing, bool wait_for_tail_credit, diff --git a/rad-sim/sim/noc/axis_master_adapter.cpp b/rad-sim/sim/noc/axis_master_adapter.cpp index 2ca1551..26bd400 100644 --- a/rad-sim/sim/noc/axis_master_adapter.cpp +++ b/rad-sim/sim/noc/axis_master_adapter.cpp @@ -1,13 +1,14 @@ #include "axis_master_adapter.hpp" axis_master_adapter::axis_master_adapter( - const sc_module_name &name, int node_id, int network_id, + const sc_module_name &name, unsigned int rad_id, int node_id, int network_id, std::vector &interface_types, std::vector &interface_dataw, BookSimConfig *noc_config, Network *noc, BufferState *buffer_state, tRoutingFunction routing_func, bool lookahead_routing, bool wait_for_tail_credit, map *ejected_flits) : sc_module(name) { + _rad_id = rad_id; _node_id = node_id; _network_id = network_id; _num_axis_interfaces = interface_types.size(); @@ -20,7 +21,7 @@ axis_master_adapter::axis_master_adapter( _num_flits[interface_id] = (int)ceil(payload_dataw * 1.0 / NOC_LINKS_PAYLOAD_WIDTH); } - _num_vcs = radsim_config.GetIntVectorKnob("noc_vcs", _network_id); + _num_vcs = radsim_config.GetIntVectorKnobPerRad("noc_vcs", _network_id, _rad_id); axis_interfaces.init(_num_axis_interfaces); _noc_config = noc_config; @@ -33,7 +34,7 @@ axis_master_adapter::axis_master_adapter( _ejected_booksim_flit = nullptr; _ejection_afifo_depth = - radsim_config.GetIntVectorKnob("noc_adapters_fifo_size", _network_id); + radsim_config.GetIntVectorKnobPerRad("noc_adapters_fifo_size", _network_id, _rad_id); _ejection_afifos.resize(_num_vcs); _ejection_afifo_push_counter.init(_num_vcs); _ejection_afifo_pop_counter.init(_num_vcs); @@ -44,7 +45,7 @@ axis_master_adapter::axis_master_adapter( _output_afifos.resize(_num_axis_interfaces); 
_output_packet_ready.resize(_num_axis_interfaces); _output_afifo_depth = - radsim_config.GetIntVectorKnob("noc_adapters_obuff_size", _network_id); + radsim_config.GetIntVectorKnobPerRad("noc_adapters_obuff_size", _network_id, _rad_id); _constructed_packet = sc_packet(); _output_chunk.resize(_num_axis_interfaces); diff --git a/rad-sim/sim/noc/axis_master_adapter.hpp b/rad-sim/sim/noc/axis_master_adapter.hpp index 13b4126..38d81ba 100644 --- a/rad-sim/sim/noc/axis_master_adapter.hpp +++ b/rad-sim/sim/noc/axis_master_adapter.hpp @@ -16,6 +16,7 @@ class axis_master_adapter : public sc_module { private: + unsigned int _rad_id; unsigned int _node_id; unsigned int _network_id; unsigned int _num_axis_interfaces; @@ -52,7 +53,7 @@ class axis_master_adapter : public sc_module { sc_in rst; sc_vector axis_interfaces; - axis_master_adapter(const sc_module_name &name, int node_id, int network_id, + axis_master_adapter(const sc_module_name &name, unsigned int rad_id, int node_id, int network_id, std::vector &interface_types, std::vector &interface_dataw, BookSimConfig *noc_config, Network *noc, diff --git a/rad-sim/sim/noc/axis_slave_adapter.cpp b/rad-sim/sim/noc/axis_slave_adapter.cpp index ca6d75e..7055e41 100644 --- a/rad-sim/sim/noc/axis_slave_adapter.cpp +++ b/rad-sim/sim/noc/axis_slave_adapter.cpp @@ -1,7 +1,7 @@ #include axis_slave_adapter::axis_slave_adapter( - const sc_module_name &name, int node_id, int network_id, + const sc_module_name &name, unsigned int rad_id, int node_id, int network_id, std::vector &interface_types, std::vector &interface_dataw, double node_period, double adapter_period, BookSimConfig *noc_config, Network *noc, @@ -11,12 +11,13 @@ axis_slave_adapter::axis_slave_adapter( axis_interfaces.init(interface_types.size()); // Node properties - _rad_id = 0; // TO-DO-MR: set appropriate RAD ID through constructor + _rad_id = rad_id; + _portal_id = 0; //default 0, user must set correct value using AssignPortalSlaveID() _node_id = node_id; _network_id = 
network_id; _node_period = node_period; _adapter_period = adapter_period; - _noc_period = radsim_config.GetDoubleVectorKnob("noc_clk_period", _network_id); + _noc_period = radsim_config.GetDoubleVectorKnobPerRad("noc_clk_period", _network_id, _rad_id); _num_axis_interfaces = interface_types.size(); _interface_types = interface_types; _interface_dataw = interface_dataw; @@ -50,7 +51,7 @@ axis_slave_adapter::axis_slave_adapter( _input_axis_transactions_afifo_depth = 2; _injection_afifo_depth = - radsim_config.GetIntVectorKnob("noc_adapters_fifo_size", _network_id); + radsim_config.GetIntVectorKnobPerRad("noc_adapters_fifo_size", _network_id, _rad_id); _injection_flit_ready = false; SC_METHOD(InputReady); @@ -229,19 +230,20 @@ void axis_slave_adapter::InputInjection() { booksim_flit->tail = _to_be_injected_flit._tail; booksim_flit->type = _to_be_injected_flit._type; - // TO-DO-MR BEGIN - if (DEST_RAD(_to_be_injected_flit._dest) == _rad_id) { + //If the destination of the incoming transaction is a module on the same RAD as the sender module, the destination field is set appropriately. + //Else if on a different RAD, the destination field is set to the portal module node used to communicate with other RADs. 
+ if (DEST_RAD(_to_be_injected_flit._dest) == _rad_id) { //not crossing to other RAD sc_bv booksim_flit_dest = DEST_LOCAL_NODE(_to_be_injected_flit._dest); booksim_flit->dest = GetInputDestinationNode(booksim_flit_dest); - booksim_flit->dest_rad = DEST_RAD(_to_be_injected_flit._dest).to_uint(); - booksim_flit->dest_remote = DEST_REMOTE_NODE(_to_be_injected_flit._dest).to_uint(); + booksim_flit->dest_rad = DEST_RAD(_to_be_injected_flit._dest).to_int(); + booksim_flit->dest_remote = DEST_REMOTE_NODE(_to_be_injected_flit._dest).to_int(); } else { - sc_bv booksim_flit_dest = 0; // TO-DO-MR: set to portal node ID + //std::cout << "_portal_id in axis_slave_adapter.cpp: " << _portal_id << std::endl; + sc_bv booksim_flit_dest = _portal_id; booksim_flit->dest = GetInputDestinationNode(booksim_flit_dest); - booksim_flit->dest_rad = DEST_RAD(_to_be_injected_flit._dest).to_uint(); - booksim_flit->dest_remote = DEST_REMOTE_NODE(_to_be_injected_flit._dest).to_uint(); + booksim_flit->dest_rad = DEST_RAD(_to_be_injected_flit._dest).to_int(); + booksim_flit->dest_remote = DEST_REMOTE_NODE(_to_be_injected_flit._dest).to_int(); } - // TO-DO-MR END booksim_flit->dest_interface = _to_be_injected_flit._dest_interface.to_uint(); @@ -277,4 +279,12 @@ void axis_slave_adapter::InputInjection() { } wait(); } +} + +//For the current NoC, store the node ID of the portal module that RAD-Sim adds for multi-RAD designs. +//This is used for inter-rad communication. 
+void +axis_slave_adapter::AssignPortalSlaveID(int id) { + _portal_id = id; + //std::cout << "set portal_id of RAD "<< _rad_id << " in axis_slave_adapter.cpp to: " << _portal_id << std::endl; } \ No newline at end of file diff --git a/rad-sim/sim/noc/axis_slave_adapter.hpp b/rad-sim/sim/noc/axis_slave_adapter.hpp index 2acd2a5..f0934cf 100644 --- a/rad-sim/sim/noc/axis_slave_adapter.hpp +++ b/rad-sim/sim/noc/axis_slave_adapter.hpp @@ -17,7 +17,8 @@ class axis_slave_adapter : public sc_module { private: - unsigned int _rad_id; // TO-DO-MR: RAD ID of this adapter (for multi-RAD systems) + unsigned int _rad_id; //RAD ID of this adapter (for multi-RAD systems) + unsigned int _portal_id; //Node ID of portal module for this RAD (for communication between RADs in multi-RAD systems) unsigned int _node_id; // Node ID of this adapter double _node_period, _adapter_period, _noc_period; unsigned int _network_id; @@ -62,7 +63,7 @@ class axis_slave_adapter : public sc_module { sc_in rst; sc_vector axis_interfaces; - axis_slave_adapter(const sc_module_name &name, int node_id, int network_id, + axis_slave_adapter(const sc_module_name &name, unsigned int rad_id, int node_id, int network_id, std::vector &interface_types, std::vector &interface_dataw, double node_period, double adapter_period, @@ -77,4 +78,5 @@ class axis_slave_adapter : public sc_module { void InputPacketization(); void InputInjection(); SC_HAS_PROCESS(axis_slave_adapter); + void AssignPortalSlaveID(int id); }; \ No newline at end of file diff --git a/rad-sim/sim/noc/radsim_noc.cpp b/rad-sim/sim/noc/radsim_noc.cpp index f1dff96..4bd974f 100644 --- a/rad-sim/sim/noc/radsim_noc.cpp +++ b/rad-sim/sim/noc/radsim_noc.cpp @@ -1,22 +1,25 @@ #include #include -radsim_noc::radsim_noc(const sc_module_name &name, int noc_id, +radsim_noc::radsim_noc(const sc_module_name &name, unsigned int rad_id, std::string portal_slave_name, int noc_id, std::vector &adapter_clks, std::vector &module_clks, std::vector 
&axis_master_adapter_info, std::vector &axis_slave_adapter_info, std::vector &aximm_master_adapter_info, - std::vector &aximm_slave_adapter_info) + std::vector &aximm_slave_adapter_info, + RADSimDesignContext* radsim_design) : sc_module(name), noc_clk("noc_clk"), rst("rst") { - + _rad_id = rad_id; + _portal_slave_name = portal_slave_name; _noc_id = noc_id; - _num_noc_nodes = radsim_config.GetIntVectorKnob("noc_num_nodes", _noc_id); + _num_noc_nodes = radsim_config.GetIntVectorKnobPerRad("noc_num_nodes", _noc_id, _rad_id); // Parse config file, initialize routing data structures and create Booksim // NoC - std::string config_filename = radsim_config.GetStringKnob("radsim_root_dir") + + std::string config_filename = radsim_config.GetStringKnobShared("radsim_root_dir") + "/sim/noc/noc" + std::to_string(noc_id) + + "_rad" + std::to_string(rad_id) + "_config"; _config.ParseFile(config_filename); InitializeRoutingMap(_config); @@ -56,12 +59,12 @@ radsim_noc::radsim_noc(const sc_module_name &name, int noc_id, // Create NoC AXI-S Master adapters _num_axis_master_endpoints = - radsim_design.GetNumNoCMasterAdapters(_noc_id, false); + radsim_design->GetNumNoCMasterAdapters(_noc_id, false); noc_axis_master_ports.init(_num_axis_master_endpoints); for (unsigned int adapter_id = 0; adapter_id < _num_axis_master_endpoints; adapter_id++) { unsigned int num_adapter_ports = - radsim_design.GetNumAxisMasterAdapterPorts(_noc_id, adapter_id); + radsim_design->GetNumAxisMasterAdapterPorts(_noc_id, adapter_id); noc_axis_master_ports[adapter_id].init(num_adapter_ports); // Prepare adapter information @@ -75,7 +78,7 @@ radsim_noc::radsim_noc(const sc_module_name &name, int noc_id, // Create adapter axis_master_adapter *master_adapter = new axis_master_adapter( - adapter_name, axis_master_adapter_info[adapter_id]._node_id, _noc_id, + adapter_name, _rad_id, axis_master_adapter_info[adapter_id]._node_id, _noc_id, adapter_port_types, axis_master_adapter_info[adapter_id]._port_dataw, &_config, 
_booksim_noc, _buffer_state[axis_master_adapter_info[adapter_id]._node_id], @@ -92,9 +95,10 @@ radsim_noc::radsim_noc(const sc_module_name &name, int noc_id, for (unsigned int port_id = 0; port_id < num_adapter_ports; port_id++) { std::string port_name = axis_master_adapter_info[adapter_id]._port_names[port_id]; + //std::cout << "axis_master_adapter_info radsim_noc.cpp port_name is: " << port_name << std::endl; master_adapter->axis_interfaces[port_id].ConnectToPort( noc_axis_master_ports[adapter_id][port_id]); - radsim_design.RegisterNoCMasterPort( + radsim_design->RegisterNoCMasterPort( _noc_id, port_name, &noc_axis_master_ports[adapter_id][port_id]); } @@ -103,12 +107,12 @@ radsim_noc::radsim_noc(const sc_module_name &name, int noc_id, // Create NoC AXI-S Slave adapters _num_axis_slave_endpoints = - radsim_design.GetNumNoCSlaveAdapters(_noc_id, false); + radsim_design->GetNumNoCSlaveAdapters(_noc_id, false); noc_axis_slave_ports.init(_num_axis_slave_endpoints); for (unsigned int adapter_id = 0; adapter_id < _num_axis_slave_endpoints; adapter_id++) { unsigned int num_adapter_ports = - radsim_design.GetNumAxisSlaveAdapterPorts(_noc_id, adapter_id); + radsim_design->GetNumAxisSlaveAdapterPorts(_noc_id, adapter_id); noc_axis_slave_ports[adapter_id].init(num_adapter_ports); // Prepare adapter information @@ -119,14 +123,14 @@ radsim_noc::radsim_noc(const sc_module_name &name, int noc_id, for (auto it = axis_slave_adapter_info[adapter_id]._port_types.begin(); it != axis_slave_adapter_info[adapter_id]._port_types.end(); it++) adapter_port_types.push_back(static_cast(*it)); - double adapter_module_period = radsim_config.GetDoubleVectorKnob( - "design_clk_periods", axis_slave_adapter_info[adapter_id]._module_clk_idx); - double adapter_period = radsim_config.GetDoubleVectorKnob( - "noc_adapters_clk_period", axis_slave_adapter_info[adapter_id]._adapter_clk_idx); + double adapter_module_period = radsim_config.GetDoubleVectorKnobPerRad( + "design_clk_periods", 
axis_slave_adapter_info[adapter_id]._module_clk_idx, _rad_id); + double adapter_period = radsim_config.GetDoubleVectorKnobPerRad( + "noc_adapters_clk_period", axis_slave_adapter_info[adapter_id]._adapter_clk_idx, _rad_id); // Create adapter axis_slave_adapter *slave_adapter = new axis_slave_adapter( - adapter_name, axis_slave_adapter_info[adapter_id]._node_id, _noc_id, + adapter_name, _rad_id, axis_slave_adapter_info[adapter_id]._node_id, _noc_id, adapter_port_types, axis_slave_adapter_info[adapter_id]._port_dataw, adapter_module_period, adapter_period, &_config, _booksim_noc, _buffer_state[axis_slave_adapter_info[adapter_id]._node_id], @@ -142,9 +146,10 @@ radsim_noc::radsim_noc(const sc_module_name &name, int noc_id, for (unsigned int port_id = 0; port_id < num_adapter_ports; port_id++) { std::string port_name = axis_slave_adapter_info[adapter_id]._port_names[port_id]; + //std::cout << "axis_slave_adapter radsim_noc.cpp port_name is: " << port_name << std::endl; slave_adapter->axis_interfaces[port_id].ConnectToPort( noc_axis_slave_ports[adapter_id][port_id]); - radsim_design.RegisterNoCSlavePort( + radsim_design->RegisterNoCSlavePort( _noc_id, port_name, &noc_axis_slave_ports[adapter_id][port_id]); } @@ -153,7 +158,7 @@ radsim_noc::radsim_noc(const sc_module_name &name, int noc_id, // Create NoC AXI-MM Master adapters _num_aximm_slave_endpoints = - radsim_design.GetNumNoCMasterAdapters(_noc_id, true); + radsim_design->GetNumNoCMasterAdapters(_noc_id, true); noc_aximm_master_ports.init(_num_aximm_slave_endpoints); for (unsigned int adapter_id = 0; adapter_id < _num_aximm_slave_endpoints; adapter_id++) { @@ -162,15 +167,15 @@ radsim_noc::radsim_noc(const sc_module_name &name, int noc_id, std::string adapter_name_str = "aximm_master_adapter_" + std::to_string(adapter_id); const char *adapter_name = adapter_name_str.c_str(); - double adapter_module_period = radsim_config.GetDoubleVectorKnob( - "design_clk_periods", 
aximm_master_adapter_info[adapter_id]._module_clk_idx); - double adapter_period = radsim_config.GetDoubleVectorKnob( + double adapter_module_period = radsim_config.GetDoubleVectorKnobPerRad( + "design_clk_periods", aximm_master_adapter_info[adapter_id]._module_clk_idx, _rad_id); + double adapter_period = radsim_config.GetDoubleVectorKnobPerRad( "noc_adapters_clk_period", - aximm_master_adapter_info[adapter_id]._adapter_clk_idx); + aximm_master_adapter_info[adapter_id]._adapter_clk_idx, _rad_id); // Create adapter aximm_master_adapter *master_adapter = new aximm_master_adapter( - adapter_name, aximm_master_adapter_info[adapter_id]._node_id, _noc_id, + adapter_name, _rad_id, aximm_master_adapter_info[adapter_id]._node_id, _noc_id, &_config, _booksim_noc, _buffer_state[aximm_master_adapter_info[adapter_id]._node_id], _routing_func, _lookahead_routing, _wait_for_tail_credit, @@ -188,7 +193,7 @@ radsim_noc::radsim_noc(const sc_module_name &name, int noc_id, noc_aximm_master_ports[adapter_id]); std::string port_name = aximm_master_adapter_info[adapter_id]._port_names[0]; - radsim_design.RegisterNoCMasterPort(_noc_id, port_name, + radsim_design->RegisterNoCMasterPort(_noc_id, port_name, &noc_aximm_master_ports[adapter_id]); _aximm_master_adapters.push_back(master_adapter); @@ -196,7 +201,7 @@ radsim_noc::radsim_noc(const sc_module_name &name, int noc_id, // Create NoC AXI-MM Slave adapters _num_aximm_master_endpoints = - radsim_design.GetNumNoCSlaveAdapters(_noc_id, true); + radsim_design->GetNumNoCSlaveAdapters(_noc_id, true); noc_aximm_slave_ports.init(_num_aximm_master_endpoints); for (unsigned int adapter_id = 0; adapter_id < _num_aximm_master_endpoints; adapter_id++) { @@ -205,15 +210,15 @@ radsim_noc::radsim_noc(const sc_module_name &name, int noc_id, std::string adapter_name_str = "aximm_slave_adapter_" + std::to_string(adapter_id); const char *adapter_name = adapter_name_str.c_str(); - double adapter_module_period = radsim_config.GetDoubleVectorKnob( - 
"design_clk_periods", aximm_slave_adapter_info[adapter_id]._module_clk_idx); - double adapter_period = radsim_config.GetDoubleVectorKnob( + double adapter_module_period = radsim_config.GetDoubleVectorKnobPerRad( + "design_clk_periods", aximm_slave_adapter_info[adapter_id]._module_clk_idx, _rad_id); + double adapter_period = radsim_config.GetDoubleVectorKnobPerRad( "noc_adapters_clk_period", - aximm_slave_adapter_info[adapter_id]._adapter_clk_idx); + aximm_slave_adapter_info[adapter_id]._adapter_clk_idx, _rad_id); // Create adapter aximm_slave_adapter *slave_adapter = new aximm_slave_adapter( - adapter_name, aximm_slave_adapter_info[adapter_id]._node_id, _noc_id, + adapter_name, _rad_id, aximm_slave_adapter_info[adapter_id]._node_id, _noc_id, &_config, _booksim_noc, _buffer_state[aximm_slave_adapter_info[adapter_id]._node_id], _routing_func, _lookahead_routing, _wait_for_tail_credit, @@ -230,12 +235,22 @@ radsim_noc::radsim_noc(const sc_module_name &name, int noc_id, slave_adapter->aximm_interface.ConnectToPort( noc_aximm_slave_ports[adapter_id]); std::string port_name = aximm_slave_adapter_info[adapter_id]._port_names[0]; - radsim_design.RegisterNoCSlavePort(_noc_id, port_name, + radsim_design->RegisterNoCSlavePort(_noc_id, port_name, &noc_aximm_slave_ports[adapter_id]); _aximm_slave_adapters.push_back(slave_adapter); } + #ifndef SINGLE_RAD + //set portal ID to use in axis_slave_adapter for NoC versus inter_rad + unsigned int PortalSlaveID = radsim_design->GetPortalSlaveID(); + //std::cout << "Set portal slave ids in radsim_noc.cpp to: " << PortalSlaveID << std::endl; + for (int i = 0; i < _axis_slave_adapters.size(); i++) { + _axis_slave_adapters[i]->AssignPortalSlaveID(PortalSlaveID); + } + #endif + //std::cout << "DONE AXIS SLAVE ADAPTER CREATION " << std::endl; + SC_CTHREAD(Tick, noc_clk.pos()); reset_signal_is(rst, true); } diff --git a/rad-sim/sim/noc/radsim_noc.hpp b/rad-sim/sim/noc/radsim_noc.hpp index 9e8b0dc..b6328b9 100644 --- 
a/rad-sim/sim/noc/radsim_noc.hpp +++ b/rad-sim/sim/noc/radsim_noc.hpp @@ -15,9 +15,13 @@ #include #include +class RADSimDesignContext; + // NoC SystemC wrapper around all Booksim-related datastructures class radsim_noc : public sc_module { private: + int _rad_id; + std::string _portal_slave_name; int _noc_id; int _num_noc_nodes; BookSimConfig _config; // Booksim NoC configuration @@ -44,13 +48,14 @@ class radsim_noc : public sc_module { sc_vector noc_aximm_master_ports; sc_vector noc_aximm_slave_ports; - radsim_noc(const sc_module_name &name, int noc_id, + radsim_noc(const sc_module_name &name, unsigned int rad_id, std::string portal_slave_name, int noc_id, std::vector &adapter_clks, std::vector &module_clks, std::vector &axis_master_adapter_info, std::vector &axis_slave_adapter_info, std::vector &aximm_master_adapter_info, - std::vector &aximm_slave_adapter_info); + std::vector &aximm_slave_adapter_info, + RADSimDesignContext* radsim_design); ~radsim_noc(); Network *GetNetwork(); diff --git a/rad-sim/sim/portal.cpp b/rad-sim/sim/portal.cpp new file mode 100644 index 0000000..71eea98 --- /dev/null +++ b/rad-sim/sim/portal.cpp @@ -0,0 +1,159 @@ +#include + +portal::portal(const sc_module_name &name, RADSimDesignContext* radsim_design) + : RADSimModule(name, radsim_design) { + + this->radsim_design = radsim_design; + + //combinational logic + SC_METHOD(Assign); + sensitive << rst; // << axis_portal_master_interface.tready; //TODO: add back if inter-rad eventually supports backpressure in this direction + //sequential logic + SC_CTHREAD(Tick, clk.pos()); + // This function must be defined & called for any RAD-Sim module to register + // its info for automatically connecting to the NoC + reset_signal_is(rst, true); // Reset is active high + this->RegisterModuleInfo(); //can comment out if not connecting to NoC +} + + +portal::~portal() {} + +void portal::Assign() { //combinational logic + if (rst) { + portal_axis_slave.tready.write(false); + 
axis_portal_slave_interface.tready.write(false); + } + else { + //Always ready to accept from NoC because we have FIFO buffers in both directions + //Not exerting back-pressure + portal_axis_slave.tready.write(true); + axis_portal_slave_interface.tready.write(true); + } +} + +sc_bv data_to_buffer = 0; +void portal::Tick() { //sequential logic + portal_axis_master.tvalid.write(false); + wait(); + //Always @ positive edge of clock + while (true) { + + //Accepting incoming NoC transaction + if (axis_portal_slave_interface.tvalid.read() && + axis_portal_slave_interface.tready.read()) { + data_to_buffer = axis_portal_slave_interface.tdata.read(); + portal_axis_fields curr_transaction = { + axis_portal_slave_interface.tvalid.read(), + axis_portal_slave_interface.tready.read(), + axis_portal_slave_interface.tdata.read(), + axis_portal_slave_interface.tstrb.read(), + axis_portal_slave_interface.tkeep.read(), + axis_portal_slave_interface.tlast.read(), + axis_portal_slave_interface.tid.read(), + axis_portal_slave_interface.tdest.read(), + axis_portal_slave_interface.tuser.read() + }; + + portal_axis_fifo_noc_incoming.push(curr_transaction); + } + + //Sending outgoing inter-rad data + //warning: must do this before next if-else block so that we pop before reading front. otherwise we get outdated value on second turn. 
+ //we see valid as high the clock cycle AFTER we set it as high in the if-else below + if (portal_axis_master.tvalid.read() && portal_axis_master.tready.read()) { // && test_ready_toggle) { + //pop out of fifo + if (!portal_axis_fifo_noc_incoming.empty()) { + int curr_cycle = GetSimulationCycle(radsim_config.GetDoubleKnobShared("sim_driver_period")); + portal_axis_fifo_noc_incoming.pop(); + + if (portal_axis_master.tlast.read()) { + std::cout << "dlrm design portal.cpp sent last data via inter_rad at cycle " << curr_cycle << std::endl; + } + } + else { //should never reach here because valid should be false if fifo is empty + std::cout << "reached here but why? portal_axis_fifo_noc_incoming.size(): " << portal_axis_fifo_noc_incoming.size() << std::endl; + } + } + //Prep for sending outgoing inter-rad data + if ((portal_axis_fifo_noc_incoming.size() > 0) ) { //&& test_ready_toggle) { + portal_axis_fields curr_transaction = portal_axis_fifo_noc_incoming.front(); + portal_axis_master.tdata.write(curr_transaction.tdata); + portal_axis_master.tdest.write(curr_transaction.tdest); + portal_axis_master.tuser.write(curr_transaction.tuser); + portal_axis_master.tvalid.write(true); + portal_axis_master.tlast.write(curr_transaction.tlast); + } + else { + portal_axis_master.tdata.write(0); + portal_axis_master.tvalid.write(false); + } + + //Accepting incoming inter-rad data and then sending to correct module on RAD over NoC + if (portal_axis_slave.tvalid.read() && + portal_axis_slave.tready.read()) { + portal_axis_fields curr_transaction = { + portal_axis_slave.tvalid.read(), + portal_axis_slave.tready.read(), + portal_axis_slave.tdata.read(), + portal_axis_slave.tstrb.read(), + portal_axis_slave.tkeep.read(), + portal_axis_slave.tlast.read(), + portal_axis_slave.tid.read(), + portal_axis_slave.tdest.read(), + portal_axis_slave.tuser.read() //tuser field + }; + + portal_axis_fifo_noc_outgoing.push(curr_transaction); + } + + //Sending outgoing NoC data + //warning: must do this 
before next if-else block so that we pop before reading front. otherwise we get outdated value on second turn. + //we see valid as high the clock cycle AFTER we set it as high in the if-else below + if (axis_portal_master_interface.tvalid.read() && axis_portal_master_interface.tready.read()) { // && test_ready_toggle) { + //pop out of fifo + if (!portal_axis_fifo_noc_outgoing.empty()) { + int curr_cycle = GetSimulationCycle(radsim_config.GetDoubleKnobShared("sim_driver_period")); + portal_axis_fifo_noc_outgoing.pop(); + if (axis_portal_master_interface.tlast.read()) { + std::cout << "dlrm design portal.cpp sent last data via NoC at cycle " << curr_cycle << std::endl; + } + } + else { //should never reach here because valid should be false if fifo is empty + std::cout << "reached here but why? portal_axis_fifo_noc_outgoing.size(): " << portal_axis_fifo_noc_outgoing.size() << std::endl; + } + } + //Prep for sending outgoing NoC data + if ((portal_axis_fifo_noc_outgoing.size() > 0) ) { //&& test_ready_toggle) { + portal_axis_fields curr_transaction = portal_axis_fifo_noc_outgoing.front(); + axis_portal_master_interface.tdata.write(curr_transaction.tdata); + axis_portal_master_interface.tdest.write(curr_transaction.tdest); + axis_portal_master_interface.tuser.write(curr_transaction.tuser); + axis_portal_master_interface.tvalid.write(true); + axis_portal_master_interface.tlast.write(curr_transaction.tlast); + } + else { + axis_portal_master_interface.tdata.write(0); + axis_portal_master_interface.tvalid.write(false); + } + + + wait(); + } +} + +void portal::RegisterModuleInfo() { + std::string port_name; + _num_noc_axis_slave_ports = 0; + _num_noc_axis_master_ports = 0; + _num_noc_aximm_slave_ports = 0; + _num_noc_aximm_master_ports = 0; + + port_name = module_name + ".axis_portal_slave_interface"; + RegisterAxisSlavePort(port_name, &axis_portal_slave_interface, AXIS_MAX_DATAW, 0); + radsim_design->AssignPortalSlaveName(port_name); //other modules will send to this 
slave interface + + port_name = module_name + ".axis_portal_master_interface"; + RegisterAxisMasterPort(port_name, &axis_portal_master_interface, AXIS_MAX_DATAW, 0); + +} diff --git a/rad-sim/sim/portal.hpp b/rad-sim/sim/portal.hpp new file mode 100644 index 0000000..a45f2ba --- /dev/null +++ b/rad-sim/sim/portal.hpp @@ -0,0 +1,49 @@ +#pragma once + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +struct portal_axis_fields { + bool tvalid; + bool tready; + sc_bv tdata; + sc_bv tstrb; + sc_bv tkeep; + bool tlast; + sc_bv tid; + sc_bv tdest; + sc_bv tuser; + }; + + +//The portal class is a module that connects from a device's NoC to the outer inter-RAD network. +class portal : public RADSimModule { + private: + std::queue portal_axis_fifo_noc_incoming; + std::queue portal_axis_fifo_noc_outgoing; + + public: + RADSimDesignContext* radsim_design; + sc_in rst { "rst" }; + //axis ports for external access to inter_rad + axis_master_port portal_axis_master; + axis_slave_port portal_axis_slave; + //Interfaces to the NoC + axis_slave_port axis_portal_slave_interface; + axis_master_port axis_portal_master_interface; + + portal(const sc_module_name &name, RADSimDesignContext* radsim_design); + ~portal(); + + void Assign(); // Combinational logic process + void Tick(); // Sequential logic process + SC_HAS_PROCESS(portal); + void RegisterModuleInfo(); +}; \ No newline at end of file diff --git a/rad-sim/sim/radsim_cluster.cpp b/rad-sim/sim/radsim_cluster.cpp new file mode 100644 index 0000000..74a2288 --- /dev/null +++ b/rad-sim/sim/radsim_cluster.cpp @@ -0,0 +1,51 @@ +#include + +RADSimCluster::RADSimCluster(int num_rads) { + this->num_rads = num_rads; + for (int rad_id = 0; rad_id < num_rads; rad_id++) { + RADSimDesignContext* new_rad = new RADSimDesignContext(rad_id); //pass in unique RAD ID + all_rads.push_back(new_rad); + } + //TODO: use configuration parameters to change topology and connectivity models + inter_rad_topo = 
ALL_TO_ALL; + inter_rad_conn_model = NETWORK; +} + +RADSimCluster::~RADSimCluster() { + for (int rad_id = 0; rad_id < num_rads; rad_id++) { + delete all_rads[rad_id]; //free the RADs allocated + } +} + +RADSimDesignContext* +RADSimCluster::CreateNewRAD(unsigned int rad_id) { + RADSimDesignContext* new_rad = new RADSimDesignContext(rad_id); + num_rads++; + all_rads.push_back(new_rad); + return new_rad; +} + +void +RADSimCluster::SetTopo(inter_rad_topo_type inter_rad_topo) { + this->inter_rad_topo = inter_rad_topo; +} + +void +RADSimCluster::SetConnModel(inter_rad_conn_model_type inter_rad_conn_model) { + this->inter_rad_conn_model = inter_rad_conn_model; +} + +bool +RADSimCluster::AllRADsNotDone() { + for (int rad_id = 0; rad_id < num_rads; rad_id++) { + if (!(all_rads[rad_id]->is_rad_done())) { + return true; + } + } + return false; +} + +void +RADSimCluster::StoreSystem(RADSimDesignSystem* system) { + all_systems.push_back(system); +} \ No newline at end of file diff --git a/rad-sim/sim/radsim_cluster.hpp b/rad-sim/sim/radsim_cluster.hpp new file mode 100644 index 0000000..d4f4456 --- /dev/null +++ b/rad-sim/sim/radsim_cluster.hpp @@ -0,0 +1,39 @@ +#pragma once + +#include +#include +#include +#include +#include +#include + +//Represents a cluster of one or multiple RAD devices +//Stores pointers to objects representing the RADs and the designs on each RAD +//Contains support for future development of new topologies for inter-RAD connections +class RADSimCluster { + private: + public: + int num_rads; + std::vector all_rads; + std::vector all_systems; + + enum inter_rad_topo_type { + ALL_TO_ALL = 0, + SWITCH = 1, + RING = 2 + }; + enum inter_rad_conn_model_type { + WIRE = 0, //Direct wire-based connection between RADs. This option has been deprecated. + NETWORK = 1 //Current approach using bandwidth and latency constraints from the user for inter-RAD communication. 
+ }; + inter_rad_topo_type inter_rad_topo; + inter_rad_conn_model_type inter_rad_conn_model; + + RADSimCluster(int num_rads); + ~RADSimCluster(); + RADSimDesignContext* CreateNewRAD(unsigned int rad_id); //returns ptr to the newly added RAD + void SetTopo(inter_rad_topo_type inter_rad_topo); + void SetConnModel(inter_rad_conn_model_type inter_rad_conn_model); + bool AllRADsNotDone(); + void StoreSystem(RADSimDesignSystem* system); +}; \ No newline at end of file diff --git a/rad-sim/sim/radsim_config.cpp b/rad-sim/sim/radsim_config.cpp index f0e5bb1..7d6ec00 100644 --- a/rad-sim/sim/radsim_config.cpp +++ b/rad-sim/sim/radsim_config.cpp @@ -3,160 +3,340 @@ RADSimConfig::RADSimConfig() {} RADSimConfig::~RADSimConfig() {} +// template +// void insert_with_resize(std::vector<> ) { + +// } + +void RADSimConfig::ResizeAll(int num_rads) { + _int_knobs_per_rad.resize(num_rads); + _double_knobs_per_rad.resize(num_rads); + _string_knobs_per_rad.resize(num_rads); + _int_vector_knobs_per_rad.resize(num_rads); + _double_vector_knobs_per_rad.resize(num_rads); + _string_vector_knobs_per_rad.resize(num_rads); +} + + // Adds a new integer configuration knob -void RADSimConfig::AddIntKnob(const std::string &key, int val) { - _int_knobs[key] = val; +void RADSimConfig::AddIntKnobShared(const std::string &key, int val) { + _int_knobs_shared[key] = val; } // Adds a new double configuration knob -void RADSimConfig::AddDoubleKnob(const std::string &key, double val) { - _double_knobs[key] = val; +void RADSimConfig::AddDoubleKnobShared(const std::string &key, double val) { + _double_knobs_shared[key] = val; } // Adds a new string configuration knob -void RADSimConfig::AddStringKnob(const std::string &key, std::string &val) { - _string_knobs[key] = val; +void RADSimConfig::AddStringKnobShared(const std::string &key, std::string &val) { + _string_knobs_shared[key] = val; } // Adds a new integer vector configuration knob -void RADSimConfig::AddIntVectorKnob(const std::string &key, +void 
RADSimConfig::AddIntVectorKnobShared(const std::string &key, std::vector &val) { - _int_vector_knobs[key] = val; + _int_vector_knobs_shared[key] = val; } // Adds a new double vector configuration knob -void RADSimConfig::AddDoubleVectorKnob(const std::string &key, +void RADSimConfig::AddDoubleVectorKnobShared(const std::string &key, std::vector &val) { - _double_vector_knobs[key] = val; + _double_vector_knobs_shared[key] = val; } // Adds a new string vector configuration knob -void RADSimConfig::AddStringVectorKnob(const std::string &key, +void RADSimConfig::AddStringVectorKnobShared(const std::string &key, std::vector &val) { - _string_vector_knobs[key] = val; + _string_vector_knobs_shared[key] = val; +} + +//RAD-specific functions +// Adds a new integer configuration knob +void RADSimConfig::AddIntKnobPerRad(const std::string &key, int val, unsigned int rad_id) { + _int_knobs_per_rad[rad_id][key] = val; +} + +// Adds a new double configuration knob +void RADSimConfig::AddDoubleKnobPerRad(const std::string &key, double val, unsigned int rad_id) { + _double_knobs_per_rad[rad_id][key] = val; +} + +// Adds a new string configuration knob +void RADSimConfig::AddStringKnobPerRad(const std::string &key, std::string &val, unsigned int rad_id) { + _string_knobs_per_rad[rad_id][key] = val; +} + +// Adds a new integer vector configuration knob +void RADSimConfig::AddIntVectorKnobPerRad(const std::string &key, + std::vector &val, unsigned int rad_id) { + _int_vector_knobs_per_rad[rad_id][key] = val; +} + +// Adds a new double vector configuration knob +void RADSimConfig::AddDoubleVectorKnobPerRad(const std::string &key, + std::vector &val, unsigned int rad_id) { + _double_vector_knobs_per_rad[rad_id][key] = val; +} + +// Adds a new string vector configuration knob +void RADSimConfig::AddStringVectorKnobPerRad(const std::string &key, + std::vector &val, unsigned int rad_id) { + _string_vector_knobs_per_rad[rad_id][key] = val; } // Gets the value of an integer configuration 
knob -int RADSimConfig::GetIntKnob(const std::string &key) { - if (_int_knobs.find(key) == _int_knobs.end()) { - std::cerr << "GetIntKnob: Cannot find configuration parameter \"" << key +int RADSimConfig::GetIntKnobShared(const std::string &key) { + if (_int_knobs_shared.find(key) == _int_knobs_shared.end()) { + std::cerr << "GetIntKnobShared: Cannot find configuration parameter \"" << key << "\"" << std::endl; exit(1); } - return _int_knobs[key]; + return _int_knobs_shared[key]; } // Gets the value of a double configuration knob -double RADSimConfig::GetDoubleKnob(const std::string &key) { - if (_double_knobs.find(key) == _double_knobs.end()) { - std::cerr << "GetDoubleKnob: Cannot find configuration parameter \"" << key +double RADSimConfig::GetDoubleKnobShared(const std::string &key) { + if (_double_knobs_shared.find(key) == _double_knobs_shared.end()) { + std::cerr << "GetDoubleKnobShared: Cannot find configuration parameter \"" << key << "\"" << std::endl; exit(1); } - return _double_knobs[key]; + return _double_knobs_shared[key]; } // Gets the value of a string configuration knob -std::string RADSimConfig::GetStringKnob(const std::string &key) { - if (_string_knobs.find(key) == _string_knobs.end()) { - std::cerr << "GetStringKnob: Cannot find configuration parameter \"" << key +std::string RADSimConfig::GetStringKnobShared(const std::string &key) { + if (_string_knobs_shared.find(key) == _string_knobs_shared.end()) { + std::cerr << "GetStringKnobShared: Cannot find configuration parameter \"" << key << "\"" << std::endl; exit(1); } - return _string_knobs[key]; + return _string_knobs_shared[key]; } // Gets the value of an integer vector configuration knob -int RADSimConfig::GetIntVectorKnob(const std::string &key, unsigned int idx) { - if (_int_vector_knobs.find(key) == _int_vector_knobs.end()) { - std::cerr << "GetIntVectorKnob: Cannot find configuration parameter \"" +int RADSimConfig::GetIntVectorKnobShared(const std::string &key, unsigned int idx) { + if 
(_int_vector_knobs_shared.find(key) == _int_vector_knobs_shared.end()) { + std::cerr << "GetIntVectorKnobShared: Cannot find configuration parameter \"" << key << "\"" << std::endl; exit(1); } - return _int_vector_knobs[key][idx]; + return _int_vector_knobs_shared[key][idx]; } // Gets the value of a double vector configuration knob -double RADSimConfig::GetDoubleVectorKnob(const std::string &key, +double RADSimConfig::GetDoubleVectorKnobShared(const std::string &key, unsigned int idx) { - if (_double_vector_knobs.find(key) == _double_vector_knobs.end()) { - std::cerr << "GetDoubleVectorKnob: Cannot find configuration parameter \"" + if (_double_vector_knobs_shared.find(key) == _double_vector_knobs_shared.end()) { + std::cerr << "GetDoubleVectorKnobShared: Cannot find configuration parameter \"" << key << "\"" << std::endl; exit(1); } - return _double_vector_knobs[key][idx]; + return _double_vector_knobs_shared[key][idx]; } // Gets the value of a string vector configuration knob -std::string RADSimConfig::GetStringVectorKnob(const std::string &key, +std::string RADSimConfig::GetStringVectorKnobShared(const std::string &key, unsigned int idx) { - if (_string_vector_knobs.find(key) == _string_vector_knobs.end()) { - std::cerr << "GetStringVectorKnob: Cannot find configuration parameter \"" + if (_string_vector_knobs_shared.find(key) == _string_vector_knobs_shared.end()) { + std::cerr << "GetStringVectorKnobShared: Cannot find configuration parameter \"" << key << "\"" << std::endl; exit(1); } - return _string_vector_knobs[key][idx]; + return _string_vector_knobs_shared[key][idx]; } // Gets the value of an integer vector configuration knob -std::vector &RADSimConfig::GetIntVectorKnob(const std::string &key) { - if (_int_vector_knobs.find(key) == _int_vector_knobs.end()) { - std::cerr << "GetIntVectorKnob: Cannot find configuration parameter \"" +std::vector &RADSimConfig::GetIntVectorKnobShared(const std::string &key) { + if (_int_vector_knobs_shared.find(key) == 
_int_vector_knobs_shared.end()) { + std::cerr << "GetIntVectorKnobShared: Cannot find configuration parameter \"" << key << "\"" << std::endl; exit(1); } - return _int_vector_knobs[key]; + return _int_vector_knobs_shared[key]; } // Gets the value of a double vector configuration knob -std::vector &RADSimConfig::GetDoubleVectorKnob(const std::string &key) { - if (_double_vector_knobs.find(key) == _double_vector_knobs.end()) { - std::cerr << "GetDoubleVectorKnob: Cannot find configuration parameter \"" +std::vector &RADSimConfig::GetDoubleVectorKnobShared(const std::string &key) { + if (_double_vector_knobs_shared.find(key) == _double_vector_knobs_shared.end()) { + std::cerr << "GetDoubleVectorKnobShared: Cannot find configuration parameter \"" << key << "\"" << std::endl; exit(1); } - return _double_vector_knobs[key]; + return _double_vector_knobs_shared[key]; } // Gets the value of a string vector configuration knob std::vector & -RADSimConfig::GetStringVectorKnob(const std::string &key) { - if (_string_vector_knobs.find(key) == _string_vector_knobs.end()) { - std::cerr << "GetStringVectorKnob: Cannot find configuration parameter \"" +RADSimConfig::GetStringVectorKnobShared(const std::string &key) { + if (_string_vector_knobs_shared.find(key) == _string_vector_knobs_shared.end()) { + std::cerr << "GetStringVectorKnobShared: Cannot find configuration parameter \"" << key << "\"" << std::endl; exit(1); } - return _string_vector_knobs[key]; + return _string_vector_knobs_shared[key]; } +//Retrieve values of RAD-specific knobs +// Gets the value of an integer configuration knob +int RADSimConfig::GetIntKnobPerRad(const std::string &key, unsigned int rad_id) { + if (_int_knobs_per_rad[rad_id].find(key) == _int_knobs_per_rad[rad_id].end()) { + std::cerr << "GetIntKnobPerRAD: Cannot find configuration parameter \"" << key + << "\"" << std::endl; + exit(1); + } + return _int_knobs_per_rad[rad_id][key]; +} + +// Gets the value of a double configuration knob +double 
RADSimConfig::GetDoubleKnobPerRad(const std::string &key, unsigned int rad_id) { + if (_double_knobs_per_rad[rad_id].find(key) == _double_knobs_per_rad[rad_id].end()) { + std::cerr << "GetDoubleKnobPerRAD: Cannot find configuration parameter \"" << key + << "\"" << std::endl; + exit(1); + } + return _double_knobs_per_rad[rad_id][key]; +} + +// Gets the value of a string configuration knob +std::string RADSimConfig::GetStringKnobPerRad(const std::string &key, unsigned int rad_id) { + if (_string_knobs_per_rad[rad_id].find(key) == _string_knobs_per_rad[rad_id].end()) { + std::cerr << "GetStringKnobPerRAD: Cannot find configuration parameter \"" << key + << "\"" << std::endl; + exit(1); + } + return _string_knobs_per_rad[rad_id][key]; +} + +// Gets the value of an integer vector configuration knob +int RADSimConfig::GetIntVectorKnobPerRad(const std::string &key, unsigned int idx, unsigned int rad_id) { + if (_int_vector_knobs_per_rad[rad_id].find(key) == _int_vector_knobs_per_rad[rad_id].end()) { + std::cerr << "GetIntVectorKnobPerRAD: Cannot find configuration parameter \"" + << key << "\"" << std::endl; + exit(1); + } + return _int_vector_knobs_per_rad[rad_id][key][idx]; +} + +// Gets the value of a double vector configuration knob +double RADSimConfig::GetDoubleVectorKnobPerRad(const std::string &key, + unsigned int idx, unsigned int rad_id) { + if (_double_vector_knobs_per_rad[rad_id].find(key) == _double_vector_knobs_per_rad[rad_id].end()) { + std::cerr << "GetDoubleVectorKnobPerRAD: Cannot find configuration parameter \"" + << key << "\"" << std::endl; + exit(1); + } + return _double_vector_knobs_per_rad[rad_id][key][idx]; +} + +// Gets the value of a string vector configuration knob +std::string RADSimConfig::GetStringVectorKnobPerRad(const std::string &key, + unsigned int idx, unsigned int rad_id) { + if (_string_vector_knobs_per_rad[rad_id].find(key) == _string_vector_knobs_per_rad[rad_id].end()) { + std::cerr << "GetStringVectorKnobPerRAD: Cannot find 
configuration parameter \"" + << key << "\"" << std::endl; + exit(1); + } + // if (key == "dram_config_files" ) { + // std::cout << "radsim_config.cpp: dram_config_files: " << std::endl; + // for (unsigned int i = 0; i < _string_vector_knobs_per_rad[rad_id][key].size(); i++) { + // std::cout << _string_vector_knobs_per_rad[rad_id][key][i] << std::endl; + // } + // } + return _string_vector_knobs_per_rad[rad_id][key][idx]; +} + +// Gets the value of an integer vector configuration knob +std::vector &RADSimConfig::GetIntVectorKnobPerRad(const std::string &key, unsigned int rad_id) { + if (_int_vector_knobs_per_rad[rad_id].find(key) == _int_vector_knobs_per_rad[rad_id].end()) { + std::cerr << "GetIntVectorKnobPerRAD: Cannot find configuration parameter \"" + << key << "\"" << std::endl; + exit(1); + } + return _int_vector_knobs_per_rad[rad_id][key]; +} + +// Gets the value of a double vector configuration knob +std::vector &RADSimConfig::GetDoubleVectorKnobPerRad(const std::string &key, unsigned int rad_id) { + if (_double_vector_knobs_per_rad[rad_id].find(key) == _double_vector_knobs_per_rad[rad_id].end()) { + std::cerr << "GetDoubleVectorKnobPerRAD: Cannot find configuration parameter \"" + << key << "\"" << std::endl; + exit(1); + } + return _double_vector_knobs_per_rad[rad_id][key]; +} + +// Gets the value of a string vector configuration knob +std::vector & +RADSimConfig::GetStringVectorKnobPerRad(const std::string &key, unsigned int rad_id) { + if (_string_vector_knobs_per_rad[rad_id].find(key) == _string_vector_knobs_per_rad[rad_id].end()) { + std::cerr << "GetStringVectorKnobPerRAD: Cannot find configuration parameter \"" + << key << "\"" << std::endl; + exit(1); + } + return _string_vector_knobs_per_rad[rad_id][key]; +} + +// Check if an integer configuration knob is defined +bool RADSimConfig::HasIntKnobShared(const std::string &key) { + return (_int_knobs_shared.find(key) != _int_knobs_shared.end()); +} + +// Check if a double configuration knob is defined 
+bool RADSimConfig::HasDoubleKnobShared(const std::string &key) { + return (_double_knobs_shared.find(key) != _double_knobs_shared.end()); +} + +// Check if a string configuration knob is defined +bool RADSimConfig::HasStringKnobShared(const std::string &key) { + return (_string_knobs_shared.find(key) != _string_knobs_shared.end()); +} + +// Check if an integer vector configuration knob is defined +bool RADSimConfig::HasIntVectorKnobShared(const std::string &key) { + return (_int_vector_knobs_shared.find(key) != _int_vector_knobs_shared.end()); +} + +// Check if a double vector configuration knob is defined +bool RADSimConfig::HasDoubleVectorKnobShared(const std::string &key) { + return (_double_vector_knobs_shared.find(key) != _double_vector_knobs_shared.end()); +} + +// Check if a string vector configuration knob is defined +bool RADSimConfig::HasStringVectorKnobShared(const std::string &key) { + return (_string_vector_knobs_shared.find(key) != _string_vector_knobs_shared.end()); +} + +//per-RAD functions to check if has certain knob defined for that RAD // Check if an integer configuration knob is defined -bool RADSimConfig::HasIntKnob(const std::string &key) { - return (_int_knobs.find(key) != _int_knobs.end()); +bool RADSimConfig::HasIntKnobPerRad(const std::string &key, unsigned int rad_id) { + return (_int_knobs_per_rad[rad_id].find(key) != _int_knobs_per_rad[rad_id].end()); } // Check if a double configuration knob is defined -bool RADSimConfig::HasDoubleKnob(const std::string &key) { - return (_double_knobs.find(key) != _double_knobs.end()); +bool RADSimConfig::HasDoubleKnobPerRad(const std::string &key, unsigned int rad_id) { + return (_double_knobs_per_rad[rad_id].find(key) != _double_knobs_per_rad[rad_id].end()); } // Check if a string configuration knob is defined -bool RADSimConfig::HasStringKnob(const std::string &key) { - return (_string_knobs.find(key) != _string_knobs.end()); +bool RADSimConfig::HasStringKnobPerRad(const std::string &key, unsigned 
int rad_id) { + return (_string_knobs_per_rad[rad_id].find(key) != _string_knobs_per_rad[rad_id].end()); } // Check if an integer vector configuration knob is defined -bool RADSimConfig::HasIntVectorKnob(const std::string &key) { - return (_int_vector_knobs.find(key) != _int_vector_knobs.end()); +bool RADSimConfig::HasIntVectorKnobPerRad(const std::string &key, unsigned int rad_id) { + return (_int_vector_knobs_per_rad[rad_id].find(key) != _int_vector_knobs_per_rad[rad_id].end()); } // Check if a double vector configuration knob is defined -bool RADSimConfig::HasDoubleVectorKnob(const std::string &key) { - return (_double_vector_knobs.find(key) != _double_vector_knobs.end()); +bool RADSimConfig::HasDoubleVectorKnobPerRad(const std::string &key, unsigned int rad_id) { + return (_double_vector_knobs_per_rad[rad_id].find(key) != _double_vector_knobs_per_rad[rad_id].end()); } // Check if a string vector configuration knob is defined -bool RADSimConfig::HasStringVectorKnob(const std::string &key) { - return (_string_vector_knobs.find(key) != _string_vector_knobs.end()); +bool RADSimConfig::HasStringVectorKnobPerRad(const std::string &key, unsigned int rad_id) { + return (_string_vector_knobs_per_rad[rad_id].find(key) != _string_vector_knobs_per_rad[rad_id].end()); } // Parse RADSim knobs from file into RADSimConfig data structures @@ -176,22 +356,40 @@ void ParseRADSimKnobs(const std::string &knobs_filename) { // Based on parameter name, parse a single or a vector of values of int, // double or string data types - if ((param == "radsim_root_dir") || - (param == "radsim_user_design_root_dir") || - (param == "design_name")) { + if ( (param == "radsim_root_dir") || (param == "cluster_topology") ){ //TODO: support other topologies, for now this param does not do anything actively std::string value; std::getline(ss, value, ' '); - radsim_config.AddStringKnob(param, value); - } else if ((param == "noc_num_nocs") || (param == "telemetry_log_verbosity") || - (param == 
"dram_num_controllers")) { + radsim_config.AddStringKnobShared(param, value); + } else if ((param == "radsim_user_design_root_dir") || (param == "design_name")) { //example: design_name 2 dlrm + std::string rad_id_str; + std::getline(ss, rad_id_str, ' '); + unsigned int rad_id = std::stoi(rad_id_str); + std::string value; + std::getline(ss, value, ' '); + radsim_config.AddStringKnobPerRad(param, value, rad_id); + } else if ( (param == "telemetry_log_verbosity") || (param == "num_rads") || (param == "inter_rad_latency_cycles") || (param == "inter_rad_fifo_num_slots") + || (param == "inter_rad_bw_accept_cycles") || (param == "inter_rad_bw_total_cycles") ) { std::string value_str; std::getline(ss, value_str, ' '); int value = std::stoi(value_str); - radsim_config.AddIntKnob(param, value); + radsim_config.AddIntKnobShared(param, value); + } else if ((param == "noc_num_nocs") || (param == "dram_num_controllers")) { + std::string rad_id_str; + std::getline(ss, rad_id_str, ' '); + unsigned int rad_id = std::stoi(rad_id_str); + std::string value_str; + std::getline(ss, value_str, ' '); + int value = std::stoi(value_str); + radsim_config.AddIntKnobPerRad(param, value, rad_id); } else if ((param == "noc_payload_width") || (param == "noc_vcs") || (param == "noc_num_nodes") || (param == "noc_adapters_fifo_size") || (param == "noc_adapters_obuff_size") || (param == "dram_queue_sizes")) { + //get rad-id + std::string rad_id_str; + std::getline(ss, rad_id_str, ' '); + unsigned int rad_id = std::stoi(rad_id_str); + //get values std::vector value; std::string value_element_str; int value_element; @@ -199,7 +397,7 @@ void ParseRADSimKnobs(const std::string &knobs_filename) { value_element = std::stoi(value_element_str); value.push_back(value_element); } - radsim_config.AddIntVectorKnob(param, value); + radsim_config.AddIntVectorKnobPerRad(param, value, rad_id); } else if ((param == "sim_driver_period")) { std::string value_element_str; std::getline(ss, value_element_str, ' '); @@ 
-207,10 +405,15 @@ void ParseRADSimKnobs(const std::string &knobs_filename) { if (value > max_period) { max_period = value; } - radsim_config.AddDoubleKnob(param, value); + radsim_config.AddDoubleKnobShared(param, value); } else if ((param == "noc_clk_period") || (param == "noc_adapters_clk_period") || (param == "design_clk_periods") || (param == "dram_clk_periods")) { + //get rad-id + std::string rad_id_str; + std::getline(ss, rad_id_str, ' '); + unsigned int rad_id = std::stoi(rad_id_str); + //get values std::vector value; std::string value_element_str; double value_element; @@ -221,25 +424,40 @@ void ParseRADSimKnobs(const std::string &knobs_filename) { } value.push_back(value_element); } - radsim_config.AddDoubleVectorKnob(param, value); + radsim_config.AddDoubleVectorKnobPerRad(param, value, rad_id); + } else if (param == "telemetry_traces") { + std::vector value; + std::string value_element; + while (getline(ss, value_element, ' ')) { + value.push_back(value_element); + } + radsim_config.AddStringVectorKnobShared(param, value); + radsim_config.AddIntKnobShared("telemetry_num_traces", value.size()); } else if ((param == "design_noc_placement") || (param == "noc_adapters_in_arbiter") || (param == "noc_adapters_out_arbiter") || - (param == "noc_adapters_vc_mapping") || (param == "telemetry_traces") || + (param == "noc_adapters_vc_mapping") || (param == "dram_config_files")) { + //get rad-id + std::string rad_id_str; + std::getline(ss, rad_id_str, ' '); + unsigned int rad_id = std::stoi(rad_id_str); + //get values std::vector value; std::string value_element; while (getline(ss, value_element, ' ')) { + if (param == "dram_config_files") { + //std::cout << "radsim_config.cpp: " << value_element << std::endl; + } value.push_back(value_element); } - radsim_config.AddStringVectorKnob(param, value); - if (param == "telemetry_traces") { - radsim_config.AddIntKnob("telemetry_num_traces", value.size()); - } - } else { + radsim_config.AddStringVectorKnobPerRad(param, 
value, rad_id); + } else if (param == "cluster_configs") { + continue; //go to next iteration, not using this knob for anything currently. was used to generate the radsim_knobs file. + } + else { std::cerr << "Undefined RADSim knob \"" << param << "\"" << std::endl; exit(1); } } - radsim_config.AddDoubleKnob("max_period", max_period); } \ No newline at end of file diff --git a/rad-sim/sim/radsim_config.hpp b/rad-sim/sim/radsim_config.hpp index 55038fa..a24bb3b 100644 --- a/rad-sim/sim/radsim_config.hpp +++ b/rad-sim/sim/radsim_config.hpp @@ -12,36 +12,72 @@ class RADSimConfig { public: // Simulation configuration parameters are stored in pairs of knob name and value - std::unordered_map _int_knobs; - std::unordered_map _double_knobs; - std::unordered_map _string_knobs; - std::unordered_map> _int_vector_knobs; - std::unordered_map> _double_vector_knobs; - std::unordered_map> _string_vector_knobs; + std::unordered_map _int_knobs_shared; + std::unordered_map _double_knobs_shared; + std::unordered_map _string_knobs_shared; + std::unordered_map> _int_vector_knobs_shared; + std::unordered_map> _double_vector_knobs_shared; + std::unordered_map> _string_vector_knobs_shared; + + //for rad-specific parameters + std::vector> _int_knobs_per_rad; + std::vector> _double_knobs_per_rad; + std::vector> _string_knobs_per_rad; + std::vector>> _int_vector_knobs_per_rad; + std::vector>> _double_vector_knobs_per_rad; + std::vector>> _string_vector_knobs_per_rad; RADSimConfig(); ~RADSimConfig(); - void AddIntKnob(const std::string& key, int val); - void AddDoubleKnob(const std::string& key, double val); - void AddStringKnob(const std::string& key, std::string& val); - void AddIntVectorKnob(const std::string& key, std::vector& val); - void AddDoubleVectorKnob(const std::string& key, std::vector& val); - void AddStringVectorKnob(const std::string& key, std::vector& val); - int GetIntKnob(const std::string& key); - double GetDoubleKnob(const std::string& key); - std::string 
GetStringKnob(const std::string& key); - int GetIntVectorKnob(const std::string& key, unsigned int idx); - double GetDoubleVectorKnob(const std::string& key, unsigned int idx); - std::string GetStringVectorKnob(const std::string& key, unsigned int idx); - std::vector& GetIntVectorKnob(const std::string& key); - std::vector& GetDoubleVectorKnob(const std::string& key); - std::vector& GetStringVectorKnob(const std::string& key); - bool HasIntKnob(const std::string& key); - bool HasDoubleKnob(const std::string& key); - bool HasStringKnob(const std::string& key); - bool HasIntVectorKnob(const std::string& key); - bool HasDoubleVectorKnob(const std::string& key); - bool HasStringVectorKnob(const std::string& key); + void ResizeAll(int num_rads); //resizing instance of class based on # of RADs + //general parameters shared across all RADs + void AddIntKnobShared(const std::string& key, int val); + void AddDoubleKnobShared(const std::string& key, double val); + void AddStringKnobShared(const std::string& key, std::string& val); + void AddIntVectorKnobShared(const std::string& key, std::vector& val); + void AddDoubleVectorKnobShared(const std::string& key, std::vector& val); + void AddStringVectorKnobShared(const std::string& key, std::vector& val); + //rad-specific parameters + void AddIntKnobPerRad(const std::string& key, int val, unsigned int rad_id); + void AddDoubleKnobPerRad(const std::string& key, double val, unsigned int rad_id); + void AddStringKnobPerRad(const std::string& key, std::string& val, unsigned int rad_id); + void AddIntVectorKnobPerRad(const std::string& key, std::vector& val, unsigned int rad_id); + void AddDoubleVectorKnobPerRad(const std::string& key, std::vector& val, unsigned int rad_id); + void AddStringVectorKnobPerRad(const std::string& key, std::vector& val, unsigned int rad_id); + //general parameters shared across all RADs + int GetIntKnobShared(const std::string& key); + double GetDoubleKnobShared(const std::string& key); + std::string 
GetStringKnobShared(const std::string& key); + int GetIntVectorKnobShared(const std::string& key, unsigned int idx); + double GetDoubleVectorKnobShared(const std::string& key, unsigned int idx); + std::string GetStringVectorKnobShared(const std::string& key, unsigned int idx); + std::vector& GetIntVectorKnobShared(const std::string& key); + std::vector& GetDoubleVectorKnobShared(const std::string& key); + std::vector& GetStringVectorKnobShared(const std::string& key); + //rad-specific parameters + int GetIntKnobPerRad(const std::string& key, unsigned int rad_id); + double GetDoubleKnobPerRad(const std::string& key, unsigned int rad_id); + std::string GetStringKnobPerRad(const std::string& key, unsigned int rad_id); + int GetIntVectorKnobPerRad(const std::string& key, unsigned int idx, unsigned int rad_id); + double GetDoubleVectorKnobPerRad(const std::string& key, unsigned int idx, unsigned int rad_id); + std::string GetStringVectorKnobPerRad(const std::string& key, unsigned int idx, unsigned int rad_id); + std::vector& GetIntVectorKnobPerRad(const std::string& key, unsigned int rad_id); + std::vector& GetDoubleVectorKnobPerRad(const std::string& key, unsigned int rad_id); + std::vector& GetStringVectorKnobPerRad(const std::string& key, unsigned int rad_id); + //specify if shared knob + bool HasIntKnobShared(const std::string& key); + bool HasDoubleKnobShared(const std::string& key); + bool HasStringKnobShared(const std::string& key); + bool HasIntVectorKnobShared(const std::string& key); + bool HasDoubleVectorKnobShared(const std::string& key); + bool HasStringVectorKnobShared(const std::string& key); + //rad-specific knobs + bool HasIntKnobPerRad(const std::string& key, unsigned int rad_id); + bool HasDoubleKnobPerRad(const std::string& key, unsigned int rad_id); + bool HasStringKnobPerRad(const std::string& key, unsigned int rad_id); + bool HasIntVectorKnobPerRad(const std::string& key, unsigned int rad_id); + bool HasDoubleVectorKnobPerRad(const std::string& 
key, unsigned int rad_id); + bool HasStringVectorKnobPerRad(const std::string& key, unsigned int rad_id); }; void ParseRADSimKnobs(const std::string& knobs_filename); diff --git a/rad-sim/sim/radsim_defines.hpp b/rad-sim/sim/radsim_defines.hpp index 25b3226..470ad96 100644 --- a/rad-sim/sim/radsim_defines.hpp +++ b/rad-sim/sim/radsim_defines.hpp @@ -1,7 +1,7 @@ #pragma once // clang-format off -#define RADSIM_ROOT_DIR "/home/andrew/rad-flow/rad-sim" +#define RADSIM_ROOT_DIR "/home/bassiabn/rad-sim/rad-flow/rad-sim" // NoC-related Parameters #define NOC_LINKS_PAYLOAD_WIDTH 82 diff --git a/rad-sim/sim/radsim_inter_rad.cpp b/rad-sim/sim/radsim_inter_rad.cpp new file mode 100644 index 0000000..5d623e9 --- /dev/null +++ b/rad-sim/sim/radsim_inter_rad.cpp @@ -0,0 +1,186 @@ +#include + +std::ostream& operator<<(std::ostream& os, const axis_fields& I) { + return os; //needed to create sc_fifo of custom struct type +} + +RADSimInterRad::RADSimInterRad(const sc_module_name &name, sc_clock *inter_rad_clk, RADSimCluster* cluster) : sc_module(name) { + std::cout << "RADSimInterRad AXIS_MAX_DATAW " << AXIS_MAX_DATAW << std::endl; + this->cluster = cluster; + this->clk(*inter_rad_clk); + num_rads = cluster->num_rads; + all_signals.init(num_rads + 1); + + fifos_latency_counters.resize(num_rads); + int inter_rad_fifo_num_slots = radsim_config.GetIntKnobShared("inter_rad_fifo_num_slots"); + for (int v = 0; v < num_rads; v++) { //width of vector = num of rads bc want fifo per rad + sc_fifo* new_fifo_ptr = new sc_fifo(inter_rad_fifo_num_slots); + fifos.push_back(new_fifo_ptr); + //adding to axi vectors + axis_signal* new_axis_signal = new axis_signal; + all_axis_master_signals.push_back(new_axis_signal); + new_axis_signal = new axis_signal; //second signal (one for master, one for slave) + all_axis_slave_signals.push_back(new_axis_signal); + axis_slave_port* new_axis_slave_port = new axis_slave_port; + all_axis_slave_ports.push_back(new_axis_slave_port); + axis_master_port* 
new_axis_master_port = new axis_master_port; + all_axis_master_ports.push_back(new_axis_master_port); + std::cout << "RADSimInterRad: " << v << std::endl; + } + SC_CTHREAD(writeFifo, clk.pos()); + SC_CTHREAD(readFifo, clk.pos()); + +} + +RADSimInterRad::~RADSimInterRad() { + int v; + for (v = 0; v < num_rads; v++) { + delete fifos[v]; + delete all_axis_master_signals[v]; + delete all_axis_slave_signals[v]; + delete all_axis_slave_ports[v]; + delete all_axis_master_ports[v]; + } +} + +//Connect the axi slave interface of each portal module to its corresponding RADSimInterRad axi master interface, and vice versa +void +RADSimInterRad::ConnectClusterInterfaces(int rad_id) { + #ifndef SINGLE_RAD + all_axis_master_signals[rad_id]->Connect(*(all_axis_master_ports[rad_id]), cluster->all_systems[rad_id]->design_dut_inst->design_top_portal_axis_slave); + all_axis_slave_signals[rad_id]->Connect(cluster->all_systems[rad_id]->design_dut_inst->design_top_portal_axis_master, *(all_axis_slave_ports[rad_id])); + #endif +} + +bool wrote_yet = false; +int write_fifo_packet_drop_count = 0; +void +RADSimInterRad::writeFifo() { + /* + Always @ positive edge of the clock + Writes into fifo from axi interface + Iterates over all_axis_slave_ports entries, + Checks if the data is valid and only write into fifo then + Writes into the fifo corresponding to the dest. 
+ TODO: use tdest instead of tuser + TODO: automating adding all fields to curr_transaction + */ + + //initial setup + for (int i = 0; i < num_rads; i++) { + all_axis_slave_signals[i]->tready.write(true); + } + int bw_counter = 0; //counter used for bandwidth constraint on inter-rad network + wait(); + + //every clock cycle + while (true) { + //get current cycle for experiments + //int curr_cycle = GetSimulationCycle(radsim_config.GetDoubleKnobShared("sim_driver_period")); + + //iterate thru all RADs + for (int i = 0; i < num_rads; i++) { + //from bandwidth constraint, we calculated how many cycles at a time we can accept data into the inter-rad network + //when bw_counter exceeds those cycles, we can no longer accept more data + //we assert backpressure accordingly to prevent accepting of data into the inter-rad. + if (bw_counter < radsim_config.GetIntKnobShared("inter_rad_bw_accept_cycles")) { + all_axis_slave_ports[i]->tready.write(true); + } + else { + all_axis_slave_ports[i]->tready.write(false); + } + + //Obtain relevant axi-s fields + struct axis_fields curr_transaction; + curr_transaction.tdata = all_axis_slave_ports[i]->tdata.read(); + curr_transaction.tuser = all_axis_slave_ports[i]->tuser.read(); + curr_transaction.tvalid = all_axis_slave_ports[i]->tvalid.read(); + curr_transaction.tlast = all_axis_slave_ports[i]->tlast.read(); + + //Update the dest field because data will now move from an initial RAD to a different RAD + //Once we reach the destination RAD, what was previously a remote NoC node destination is now local + //DEST_LOCAL_NODE is now DEST_REMOTE_NODE, and DEST_REMOTE_NODE can be reset to 0 + sc_bv concat_dest_swap = all_axis_slave_ports[i]->tdest.read(); + DEST_LOCAL_NODE(concat_dest_swap) = DEST_REMOTE_NODE(all_axis_slave_ports[i]->tdest.read()); + DEST_REMOTE_NODE(concat_dest_swap) = 0; + curr_transaction.tdest = concat_dest_swap; + + if (curr_transaction.tvalid && all_axis_slave_ports[i]->tready.read()) { + unsigned int dest_rad = 
DEST_RAD(curr_transaction.tdest).to_uint64(); + if (this->fifos[dest_rad]->nb_write(curr_transaction) != false) { //there was an available slot to write to + fifos_latency_counters[dest_rad].push_back(0); //for latency counters + } + else { + std::cout << "WRITE FIFO FULL: packet dropped at inter_rad: could not write into internal fifo. Packets dropped count: " << write_fifo_packet_drop_count << std::endl; + write_fifo_packet_drop_count++; + } + } + } + + //Update bandwidth cycle counter. Reset to 0 if we have accepted data for enough cycles to meet our bandwidth limit. + //Else, increment the cycle counter. + if (bw_counter >= (radsim_config.GetIntKnobShared("inter_rad_bw_total_cycles") - 1)) { + bw_counter = 0; + } + else { + bw_counter++; + } + wait(); + } +} + +void +RADSimInterRad::readFifo() { + /* + Always @ positive edge of the clock + Read from fifo slot + Iterates thru all fifos + Matches the dest index of fifo to the dest rad + */ + int target_delay = radsim_config.GetIntKnobShared("inter_rad_latency_cycles"); + while (true) { + + //get current cycle for experiments + //int curr_cycle = GetSimulationCycle(radsim_config.GetDoubleKnobShared("sim_driver_period")); + + for (int i = 0; i < num_rads; i++) { //iterate through all rad's fifos + + //increment delay on all counters + for (unsigned int j = 0; j < fifos_latency_counters[i].size(); j++) { + fifos_latency_counters[i][j]++; + } + + //Try reading from front of fifo. sc_fifo does not support peek so this is required instead to obtain dest. + //TODO: replace sc_fifo with something else e.g., std::queue that can support peeks + //IMPORTANT: currently does not accept backpressure. 
Portal module must create a buffer for backpressure on the RAD's NoC + if ( (this->fifos[i]->num_available() != 0) && (fifos_latency_counters[i][0] >= target_delay) ){ //check that fifo is not empty + fifos_latency_counters[i].erase(fifos_latency_counters[i].begin()); //to reset counter, remove first elem + struct axis_fields read_from_fifo; + this->fifos[i]->nb_read(read_from_fifo); + sc_bv val = read_from_fifo.tdata; + int dest_device = (DEST_RAD(read_from_fifo.tdest)).to_uint64(); + + if (read_from_fifo.tvalid) { + all_axis_master_signals[dest_device]->tdata.write(val); + all_axis_master_signals[dest_device]->tvalid.write(read_from_fifo.tvalid); + all_axis_master_signals[dest_device]->tlast.write(read_from_fifo.tlast); + all_axis_master_signals[dest_device]->tdest.write(read_from_fifo.tdest); + all_axis_master_signals[dest_device]->tuser.write(read_from_fifo.tuser); + } + else { + //no data to be written to any RAD's portal module + all_axis_master_signals[i]->tvalid.write(false); + } + + } + else { + //no data to be written to any RAD's portal module + all_axis_master_signals[i]->tvalid.write(false); + } + + } + + wait(); + + } +} diff --git a/rad-sim/sim/radsim_inter_rad.hpp b/rad-sim/sim/radsim_inter_rad.hpp new file mode 100644 index 0000000..9a1c2c6 --- /dev/null +++ b/rad-sim/sim/radsim_inter_rad.hpp @@ -0,0 +1,59 @@ +#pragma once + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +struct axis_fields { + bool tvalid; + bool tready; + sc_bv tdata; + sc_bv tstrb; + sc_bv tkeep; + bool tlast; + sc_bv tid; + sc_bv tdest; + sc_bv tuser; + //needed to create sc_fifo of custom struct type + friend std::ostream& operator<<(std::ostream& os, const axis_fields& I); +}; + + +//Network for communication between RADs +class RADSimInterRad : public sc_module { + private: + RADSimCluster* cluster; + sc_vector>> all_signals{"all_signals"}; + int bw_limit = 0; + public: + int num_rads; + sc_in clk; + 
std::vector*> fifos;
+
+    //for axi interfaces
+    std::vector all_axis_master_signals;
+    std::vector all_axis_slave_signals;
+    std::vector all_axis_slave_ports;
+    std::vector all_axis_master_ports;
+
+    //for rising edge detection on each interface
+    std::vector prev_valid;
+
+    //using vector of vectors bc dynamic sizing based on num_rads, allows pushback and erase, and faster incrementing (we increment more often than erase from front) than std::deque
+    std::vector> fifos_latency_counters;
+
+    RADSimInterRad(const sc_module_name &name, sc_clock *inter_rad_clk, RADSimCluster* cluster);
+    ~RADSimInterRad();
+    void ConnectClusterInterfaces(int rad_id);
+    void writeFifo();
+    void readFifo();
+    SC_HAS_PROCESS(RADSimInterRad);
+};
diff --git a/rad-sim/sim/radsim_knobs b/rad-sim/sim/radsim_knobs
index 5a6b30b..8c9e682 100644
--- a/rad-sim/sim/radsim_knobs
+++ b/rad-sim/sim/radsim_knobs
@@ -1,23 +1,49 @@
-radsim_root_dir /home/andrew/rad-flow/rad-sim
-design_name dlrm
-noc_num_nocs 1
-noc_clk_period 1.0
-noc_vcs 5
-noc_payload_width 82
-noc_num_nodes 100
-design_noc_placement dlrm.place
-noc_adapters_clk_period 1.25
-noc_adapters_fifo_size 16
-noc_adapters_obuff_size 2
-noc_adapters_in_arbiter fixed_rr
-noc_adapters_out_arbiter priority_rr
-noc_adapters_vc_mapping direct
-design_clk_periods 5.0 2.0 3.32 1.5
+design_name 0 dlrm_two_rad
+noc_num_nocs 0 1
+noc_clk_period 0 1.0
+noc_vcs 0 5
+noc_payload_width 0 82
+noc_num_nodes 0 100
+design_noc_placement 0 dlrm_two_rad.place
+noc_adapters_clk_period 0 1.25
+noc_adapters_fifo_size 0 16
+noc_adapters_obuff_size 0 2
+noc_adapters_in_arbiter 0 fixed_rr
+noc_adapters_out_arbiter 0 priority_rr
+noc_adapters_vc_mapping 0 direct
+design_clk_periods 0 5.0 2.0 3.32 1.5
+dram_num_controllers 0 4
+dram_clk_periods 0 3.32 3.32 2.0 2.0
+dram_queue_sizes 0 64 64 64 64
+dram_config_files 0 DDR4_8Gb_x16_2400 DDR4_8Gb_x16_2400 HBM2_8Gb_x128 HBM2_8Gb_x128
+radsim_user_design_root_dir 0
/home/bassiabn/rad-sim/rad-flow/rad-sim/example-designs/dlrm_two_rad +design_name 1 dlrm_two_rad +noc_num_nocs 1 1 +noc_clk_period 1 1.0 +noc_vcs 1 5 +noc_payload_width 1 82 +noc_num_nodes 1 100 +design_noc_placement 1 dlrm_two_rad.place +noc_adapters_clk_period 1 1.25 +noc_adapters_fifo_size 1 16 +noc_adapters_obuff_size 1 2 +noc_adapters_in_arbiter 1 fixed_rr +noc_adapters_out_arbiter 1 priority_rr +noc_adapters_vc_mapping 1 direct +design_clk_periods 1 5.0 2.0 3.32 1.5 +dram_num_controllers 1 4 +dram_clk_periods 1 3.32 3.32 2.0 2.0 +dram_queue_sizes 1 64 64 64 64 +dram_config_files 1 DDR4_8Gb_x16_2400 DDR4_8Gb_x16_2400 HBM2_8Gb_x128 HBM2_8Gb_x128 +radsim_user_design_root_dir 1 /home/bassiabn/rad-sim/rad-flow/rad-sim/example-designs/dlrm_two_rad +radsim_root_dir /home/bassiabn/rad-sim/rad-flow/rad-sim sim_driver_period 5.0 telemetry_log_verbosity 2 telemetry_traces Embedding LU Mem0 Mem1 Mem2 Mem3 Feature Inter. MVM first MVM last -dram_num_controllers 4 -dram_clk_periods 3.32 3.32 2.0 2.0 -dram_queue_sizes 64 64 64 64 -dram_config_files DDR4_8Gb_x16_2400 DDR4_8Gb_x16_2400 HBM2_8Gb_x128 HBM2_8Gb_x128 -radsim_user_design_root_dir /home/andrew/rad-flow/rad-sim/example-designs/dlrm +num_rads 2 +cluster_configs rad1 anotherconfig +cluster_topology all-to-all +inter_rad_latency_cycles 420 +inter_rad_bw_accept_cycles 1 +inter_rad_bw_total_cycles 1 +inter_rad_fifo_num_slots 1000 diff --git a/rad-sim/sim/radsim_module.cpp b/rad-sim/sim/radsim_module.cpp index f5fa763..73a1e0d 100644 --- a/rad-sim/sim/radsim_module.cpp +++ b/rad-sim/sim/radsim_module.cpp @@ -1,10 +1,10 @@ #include #include -RADSimModule::RADSimModule(const sc_module_name &name) : sc_module(name) { +RADSimModule::RADSimModule(const sc_module_name &name, RADSimDesignContext* radsim_design) : sc_module(name) { module_name = name; std::string name_str(static_cast(name)); - radsim_design.RegisterModule(name_str, this); + radsim_design->RegisterModule(name_str, this); _num_noc_axis_slave_ports = 0; 
_num_noc_axis_master_ports = 0; _num_noc_aximm_slave_ports = 0; @@ -17,12 +17,14 @@ void RADSimModule::RegisterAxisSlavePort(std::string &port_name, axis_slave_port *port_ptr, unsigned int port_dataw, unsigned int port_type) { + //std::cout << "Adding AxisSlavePort named: " << port_name << endl; _ordered_axis_slave_ports.push_back(port_name); _axis_slave_ports[port_name] = port_ptr; _ports_dataw[port_name] = port_dataw; _ports_types[port_name] = port_type; _ports_is_aximm[port_name] = false; _num_noc_axis_slave_ports++; + //std::cout << "Added AxisSlavePort named: " << _axis_slave_ports[port_name] << endl; } void RADSimModule::RegisterAxisMasterPort(std::string &port_name, @@ -40,6 +42,7 @@ void RADSimModule::RegisterAxisMasterPort(std::string &port_name, void RADSimModule::RegisterAximmSlavePort(std::string &port_name, aximm_slave_port *port_ptr, unsigned int port_dataw) { + //std::cout << "radsim_module.cpp RegisterAximmSlavePort() port_name: " << port_name << std::endl; _ordered_aximm_slave_ports.push_back(port_name); _aximm_slave_ports[port_name] = port_ptr; _ports_dataw[port_name] = port_dataw; @@ -51,6 +54,7 @@ void RADSimModule::RegisterAximmSlavePort(std::string &port_name, void RADSimModule::RegisterAximmMasterPort(std::string &port_name, aximm_master_port *port_ptr, unsigned int port_dataw) { + //std::cout << "radsim_module.cpp RegisterAximmMasterPort() port_name: " << port_name << std::endl; _ordered_aximm_master_ports.push_back(port_name); _aximm_master_ports[port_name] = port_ptr; _ports_dataw[port_name] = port_dataw; diff --git a/rad-sim/sim/radsim_module.hpp b/rad-sim/sim/radsim_module.hpp index 99251a5..df00a1f 100644 --- a/rad-sim/sim/radsim_module.hpp +++ b/rad-sim/sim/radsim_module.hpp @@ -6,6 +6,8 @@ #include #include +class RADSimDesignContext; + class RADSimModule : public sc_module { public: std::string module_name; @@ -25,7 +27,7 @@ class RADSimModule : public sc_module { sc_in clk; - RADSimModule(const sc_module_name &name); + 
RADSimModule(const sc_module_name &name, RADSimDesignContext* radsim_design); ~RADSimModule(); virtual void RegisterModuleInfo() = 0; void RegisterAxisSlavePort(std::string &port_name, axis_slave_port *port_ptr, diff --git a/rad-sim/sim/radsim_telemetry.cpp b/rad-sim/sim/radsim_telemetry.cpp index b529d67..a8a5c5e 100644 --- a/rad-sim/sim/radsim_telemetry.cpp +++ b/rad-sim/sim/radsim_telemetry.cpp @@ -10,7 +10,7 @@ NoCTransactionTelemetry::~NoCTransactionTelemetry() {} int NoCTransactionTelemetry::RecordTransactionInitiation(int src, int dest, int type, int dataw, int network_id) { NoCTransactionTrace entry; entry.src_node = src; - entry.dest_node = dest; + entry.dest_node = ((1 << AXIS_DEST_FIELDW) - 1) & dest; //extract only local NoC node. ignore any remote NoC node set for inter-rad network. entry.transaction_type = type; entry.dataw = dataw; entry.network_id = network_id; @@ -64,13 +64,13 @@ void NoCTransactionTelemetry::DumpStatsToFile(const std::string& filename) { } std::vector NoCTransactionTelemetry::DumpTrafficFlows(const std::string& filename, unsigned int cycle_count, - std::vector>>& node_module_names) { - double sim_driver_period = radsim_config.GetDoubleKnob("sim_driver_period") / 1000000000.0; - unsigned int num_nocs = radsim_config.GetIntKnob("noc_num_nocs"); + std::vector>> node_module_names, unsigned int rad_id) { + double sim_driver_period = radsim_config.GetDoubleKnobShared("sim_driver_period") / 1000000000.0; + unsigned int num_nocs = radsim_config.GetIntKnobPerRad("noc_num_nocs", rad_id); std::vector>> traffic_bits(num_nocs); std::vector>> traffic_num_hops(num_nocs); for (unsigned int noc_id = 0; noc_id < num_nocs; noc_id++) { - unsigned int num_nodes = radsim_config.GetIntVectorKnob("noc_num_nodes", noc_id); + unsigned int num_nodes = radsim_config.GetIntVectorKnobPerRad("noc_num_nodes", noc_id, rad_id); traffic_bits[noc_id].resize(num_nodes); traffic_num_hops[noc_id].resize(num_nodes); } @@ -90,13 +90,13 @@ std::vector 
NoCTransactionTelemetry::DumpTrafficFlows(const std::string& double aggregate_bandwidth = 0.0; std::ofstream traffic_file(filename + "_noc" + std::to_string(noc_id) + ".xml", std::ofstream::out); traffic_file << "" << endl; - unsigned int num_nodes = radsim_config.GetIntVectorKnob("noc_num_nodes", noc_id); + unsigned int num_nodes = radsim_config.GetIntVectorKnobPerRad("noc_num_nodes", noc_id, rad_id); for (unsigned int src_id = 0; src_id < num_nodes; src_id++) { if (traffic_bits[noc_id][src_id].size() > 0) { for (auto& flow : traffic_bits[noc_id][src_id]) { traffic_file << "\t #include +#include #include #include @@ -58,7 +59,7 @@ class NoCTransactionTelemetry { static void UpdateHops(int id, int num_hops); static void DumpStatsToFile(const std::string& filename); static std::vector DumpTrafficFlows(const std::string& filename, unsigned int cycle_count, - std::vector>>& node_module_names); + std::vector>> node_module_names, unsigned int rad_id); }; // Class for recording and storing flit traces diff --git a/rad-sim/sim/radsim_utils.cpp b/rad-sim/sim/radsim_utils.cpp index ef265ee..497ed21 100644 --- a/rad-sim/sim/radsim_utils.cpp +++ b/rad-sim/sim/radsim_utils.cpp @@ -7,7 +7,7 @@ int GetSimulationCycle(double period) { } int GetSimulationCycle() { - double period = radsim_config.GetDoubleKnob("max_period"); + double period = radsim_config.GetDoubleKnobShared("sim_driver_period"); sc_time t = sc_time_stamp(); int cycle = (int)ceil(t.value() / period / 1000); return cycle; diff --git a/rad-sim/sim/sim.log b/rad-sim/sim/sim.log deleted file mode 100644 index e69de29..0000000 diff --git a/rad-sim/sim/sim.trace b/rad-sim/sim/sim.trace deleted file mode 100644 index 1901ba9..0000000 --- a/rad-sim/sim/sim.trace +++ /dev/null @@ -1,12 +0,0 @@ - - - - - - - - - - - - diff --git a/rad-sim/test/add_test.sh b/rad-sim/test/add_test.sh new file mode 100755 index 0000000..beb6184 --- /dev/null +++ b/rad-sim/test/add_test.sh @@ -0,0 +1,9 @@ +#!/bin/bash +test_path=$( cd 
"$(dirname "${BASH_SOURCE[0]}")" ; pwd -P ) +cd $test_path + +cp -f ../example-designs/add/config.yml ../config.yml + +(cd ../; python config.py add) + +(cd ../build; make run) \ No newline at end of file diff --git a/rad-sim/test/dlrm_test.sh b/rad-sim/test/dlrm_test.sh index eeaf378..8cb4e44 100755 --- a/rad-sim/test/dlrm_test.sh +++ b/rad-sim/test/dlrm_test.sh @@ -2,6 +2,8 @@ test_path=$( cd "$(dirname "${BASH_SOURCE[0]}")" ; pwd -P ) cd $test_path +cp -f ../example-designs/dlrm/config.yml ../config.yml + (cd ../; python config.py dlrm) (cd ../example-designs/dlrm/compiler; python dlrm.py) diff --git a/rad-sim/test/dlrm_two_rad_test.sh b/rad-sim/test/dlrm_two_rad_test.sh new file mode 100755 index 0000000..d0a9d40 --- /dev/null +++ b/rad-sim/test/dlrm_two_rad_test.sh @@ -0,0 +1,10 @@ +#!/bin/bash +test_path=$( cd "$(dirname "${BASH_SOURCE[0]}")" ; pwd -P ) +cd $test_path + +cp -f ../example-designs/dlrm_two_rad/config.yml ../config.yml + +(cd ../; python config.py dlrm_two_rad) + +(cd ../example-designs/dlrm_two_rad/compiler; python dlrm.py) +(cd ../build; make run) diff --git a/rad-sim/test/mlp_int8_test.sh b/rad-sim/test/mlp_int8_test.sh index a2b8209..605592e 100755 --- a/rad-sim/test/mlp_int8_test.sh +++ b/rad-sim/test/mlp_int8_test.sh @@ -2,6 +2,8 @@ test_path=$( cd "$(dirname "${BASH_SOURCE[0]}")" ; pwd -P ) cd $test_path +cp -f ../example-designs/mlp_int8/config.yml ../config.yml + (cd ../; python config.py mlp_int8) # python gen_testcase.py {} {} diff --git a/rad-sim/test/mlp_test.sh b/rad-sim/test/mlp_test.sh index d770f03..d63d622 100755 --- a/rad-sim/test/mlp_test.sh +++ b/rad-sim/test/mlp_test.sh @@ -2,6 +2,8 @@ test_path=$( cd "$(dirname "${BASH_SOURCE[0]}")" ; pwd -P ) cd $test_path +cp -f ../example-designs/mlp/config.yml ../config.yml + (cd ../; python config.py mlp) # python gen_testcase.py {} {} diff --git a/rad-sim/test/mult_test.sh b/rad-sim/test/mult_test.sh new file mode 100755 index 0000000..af10b0e --- /dev/null +++ 
b/rad-sim/test/mult_test.sh
@@ -0,0 +1,9 @@
+#!/bin/bash
+test_path=$( cd "$(dirname "${BASH_SOURCE[0]}")" ; pwd -P )
+cd $test_path
+
+cp -f ../example-designs/mult/config.yml ../config.yml
+
+(cd ../; python config.py mult)
+
+(cd ../build; make run)
diff --git a/rad-sim/test/npu_test.sh b/rad-sim/test/npu_test.sh
index 220e243..86ad5f2 100755
--- a/rad-sim/test/npu_test.sh
+++ b/rad-sim/test/npu_test.sh
@@ -22,6 +22,8 @@ done
 test_path=$( cd "$(dirname "${BASH_SOURCE[0]}")" ; pwd -P )
 cd $test_path
 
+cp -f ../example-designs/npu/config.yml ../config.yml
+
 (cd ../; python config.py npu)
 
 (cd ../example-designs/npu/compiler; chmod 777 perf_sim.sh)
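
Review note: the test scripts touched in this diff all follow the same sequence: copy the design's `config.yml` to the RAD-Sim root, run `config.py` for that design, optionally run the design's compiler step, then `make run` in the build directory. A minimal sketch of that shared sequence is given here as a single helper. The function name `radsim_test_cmds` and its optional second argument are hypothetical and not part of this change; the helper only prints the commands, so it can be inspected without a RAD-Sim checkout.

```shell
#!/bin/bash
# Hypothetical helper (not part of this diff): print the command sequence
# that the per-design test scripts run, for a given design name.
# Usage: radsim_test_cmds <design> [compiler_script]
radsim_test_cmds() {
  local design="$1"
  local compiler_script="${2:-}"

  # Copy the design's config to the RAD-Sim root, as each test script does.
  echo "cp -f ../example-designs/${design}/config.yml ../config.yml"
  # Generate the simulator configuration for this design.
  echo "(cd ../; python config.py ${design})"
  # Some designs (e.g. dlrm_two_rad) run a compiler step before simulating.
  if [ -n "${compiler_script}" ]; then
    echo "(cd ../example-designs/${design}/compiler; python ${compiler_script})"
  fi
  # Build and run the simulation.
  echo "(cd ../build; make run)"
}

# Example: a design without a compiler step, then one with it.
radsim_test_cmds mult
radsim_test_cmds dlrm_two_rad dlrm.py
```

Printing rather than executing is a choice made only for this sketch; the scripts in the diff run the commands directly from the `test` directory.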