CarbonSpot

This repository presents CarbonSpot, an analytical design space exploration (DSE) framework that evaluates both the performance and the carbon footprint of AI accelerators. CarbonSpot bridges the gap between performance-oriented and carbon-oriented DSE frameworks, enabling architecture comparison between the most performant hardware choice and the most carbon-friendly hardware choice within a single framework.

The performance simulator is inherited from our previous frameworks ZigZag-IMC and ZigZag, whose analytical accelerator performance models have been validated against chip results.

If you find this repository useful to your work, please consider citing our paper in your study:

```bibtex
@ARTICLE{11142328,
  author={Sun, Jiacong and Yi, Xiaoling and Symons, Arne and Gielen, Georges and Eeckhout, Lieven and Verhelst, Marian},
  journal={IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems},
  title={An Analytical Model for Performance-Carbon Co-Optimization of Edge AI Accelerators},
  year={2025},
  volume={},
  number={},
  pages={1-1},
  keywords={Costs;Carbon;Carbon dioxide;Hardware;Computational modeling;AI accelerators;Fabrication;Estimation;Electricity;Analytical models;AI accelerators;carbon cost estimation;performance modeling;edge computing;in-memory computing},
  doi={10.1109/TCAD.2025.3602746}}
```

Motivation

The size of AI models has increased drastically in recent years, together with model accuracy and capability. We plot this trend in the figure below (data collected from here). At the same time, this growth demands powerful hardware and significant energy consumption, drawing the community's attention to the environmental impact -- or, more specifically, the equivalent carbon footprint.

The trend of increasing model size and accuracy

Framework Capability

CarbonSpot is capable of:

  • Finding the optimal mapping for user-defined accelerator architectures, including digital TPU-like, analog In-Memory-Computing, and digital In-Memory-Computing architectures.
  • Reporting the energy, latency, and carbon footprint for a given architecture and workload.
  • Comparing the performance-optimal and the carbon-optimal architectures, each with its respective optimal mapping.
  • Evaluating the carbon footprint under the Continuous-Active (CA) and Periodically-Active (PA) scenarios.
  • Reporting the cost breakdown for energy, latency, and carbon footprint.

MLPerf-Tiny and MLPerf-Mobile workloads are placed under the folder ./zigzag/inputs/examples/workload/. The timing requirements for each workload are summarized below.

If you find this table useful to your work, please consider citing our paper in your study!

| Workload Suite | Network Name | Use Case | Targeted Dataset | Workload Size (MB) | Frames/Second Requirement | Paper Reference |
|---|---|---|---|---|---|---|
| MLPerf-Tiny | DS-CNN | Keyword Spotting | Speech Commands | 0.06 | 10 | [1] |
| MLPerf-Tiny | MobileNet-V1 | Visual Wake Words | Visual Wake Words Dataset | 0.9 | 0.75 | [2] |
| MLPerf-Tiny | ResNet8 | Binary Image Classification | CIFAR-10 | 0.3 | 25 | [3], [4] |
| MLPerf-Tiny | AutoEncoder | Anomaly Detection | ToyADMOS | 1.0 | 1 | [5] |
| MLPerf-Mobile | MobileNetV3 | Image Classification | ImageNet | 15.6 | 25* | [6] |
| MLPerf-Mobile | SSD MobileNetV2 | Object Detection | COCO | 64.4 | 25* | [7] |
| MLPerf-Mobile | DeepLab MobileNetV2 | Semantic Segmentation | ADE20K Training Set | 8.7 | 25* | [8] |
| MLPerf-Mobile | MobileBert | Language Understanding | NA | 96 | 25* | [9] |

*: The 25 FPS requirement is borrowed from the ResNet8 setting, as this information is missing in the referenced paper.
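The frames-per-second requirements above translate directly into a per-inference latency budget, which is the timing constraint the PA scenario operates under. A minimal sketch of that conversion (the dictionary keys below are illustrative names, not the framework's workload identifiers):

```python
# Convert each workload's frames-per-second requirement into a
# per-inference latency budget. FPS values are copied from the table above.
fps_requirements = {
    "ds_cnn": 10,        # Keyword Spotting
    "mobilenet_v1": 0.75,  # Visual Wake Words
    "resnet8": 25,       # Binary Image Classification
    "autoencoder": 1,    # Anomaly Detection
    "mobilenet_v3": 25,  # * borrowed from the ResNet8 setting
}

def latency_budget_ms(fps: float) -> float:
    """Per-inference time budget in milliseconds for a given FPS requirement."""
    return 1000.0 / fps

for name, fps in fps_requirements.items():
    print(f"{name}: {latency_budget_ms(fps):.1f} ms per inference")
```

For example, the 25 FPS requirement leaves a 40 ms budget per inference, while the 0.75 FPS Visual Wake Words workload allows over a second per frame.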

Getting Started

To get started, you can install all required packages directly through pip using the requirements.txt file:

```shell
pip install -r requirements.txt
```

The main script is expr.py, which can:

Evaluate the carbon footprint of prior works in literature

The function experiment_1_literature_trend() outputs the equivalent carbon footprint of chips reported in prior works. The following graph can be generated.

Figure 1: Carbon cost (y axis) versus energy efficiency (x axis) of AI accelerators from the literature in 16-28 nm CMOS technology when applied on the (a) MLPerf-Tiny and (b) MLPerf-Mobile benchmarks. The pie charts show the carbon breakdown into operational (green) and embodied (red) carbon costs for each design. TOP/s/W has been normalized to INT8. The most performant and most carbon-efficient designs are circled out.

The following papers were used to generate the data points in the figure. Due to page limitations, we were unable to include these citations in the paper. We sincerely express our gratitude to these amazing works.

[1] Chih, Yu-Der, et al. "16.4 An 89TOPS/W and 16.3 TOPS/mm² all-digital SRAM-based full-precision compute-in-memory macro in 22nm for machine-learning edge applications." 2021 IEEE International Solid-State Circuits Conference (ISSCC). Vol. 64. IEEE, 2021.

[2] Tu, Fengbin, et al. "A 28nm 15.59 µJ/token full-digital bitline-transpose CIM-based sparse transformer accelerator with pipeline/parallel reconfigurable modes." 2022 IEEE International Solid-State Circuits Conference (ISSCC). Vol. 65. IEEE, 2022.

[3] Lee, Chia-Fu, et al. "A 12nm 121-TOPS/W 41.6-TOPS/mm2 all digital full precision SRAM-based compute-in-memory with configurable bit-width for AI edge applications." 2022 IEEE Symposium on VLSI Technology and Circuits (VLSI Technology and Circuits). IEEE, 2022.

[4] Wang, Dewei, et al. "DIMC: 2219TOPS/W 2569F2/b digital in-memory computing macro in 28nm based on approximate arithmetic hardware." 2022 IEEE international solid-state circuits conference (ISSCC). Vol. 65. IEEE, 2022.

[5] Jiang, Weijie, et al. "A 16nm 128kB high-density fully digital In Memory Compute macro with reverse SRAM pre-charge achieving 0.36 TOPs/mm², 256 kB/mm² and 23.8 TOPs/W." ESSCIRC 2023-IEEE 49th European Solid State Circuits Conference (ESSCIRC). IEEE, 2023.

[6] Jia, Hongyang, et al. "15.1 a programmable neural-network inference accelerator based on scalable in-memory computing." 2021 IEEE International Solid-State Circuits Conference (ISSCC). Vol. 64. IEEE, 2021.

[7] Liu, Shiwei, et al. "16.2 A 28nm 53.8 TOPS/W 8b sparse transformer accelerator with in-memory butterfly zero skipper for unstructured-pruned NN and CIM-based local-attention-reusable engine." 2023 IEEE International Solid-State Circuits Conference (ISSCC). IEEE, 2023.

[8] Zhao, Yuanzhe, et al. "A double-mode sparse compute-in-memory macro with reconfigurable single and dual layer computation." 2023 IEEE Custom Integrated Circuits Conference (CICC). IEEE, 2023.

[9] Lin, Chuan-Tung, et al. "iMCU: A 102-μJ, 61-ms Digital In-Memory Computing-based Microcontroller Unit for Edge TinyML." 2023 IEEE Custom Integrated Circuits Conference (CICC). IEEE, 2023.

[10] Lee, Jinseok, et al. "Fully row/column-parallel in-memory computing SRAM macro employing capacitor-based mixed-signal computation with 5-b inputs." 2021 Symposium on VLSI Circuits. IEEE, 2021.

[11] Yin, Shihui, et al. "PIMCA: A 3.4-Mb programmable in-memory computing accelerator in 28nm for on-chip DNN inference." 2021 Symposium on VLSI Technology. IEEE, 2021.

[12] Zhu, Haozhe, et al. "COMB-MCM: Computing-on-memory-boundary NN processor with bipolar bitwise sparsity optimization for scalable multi-chiplet-module edge machine learning." 2022 IEEE International Solid-State Circuits Conference (ISSCC). Vol. 65. IEEE, 2022.

[13] Song, Jiahao, et al. "A 28 nm 16 kb bit-scalable charge-domain transpose 6T SRAM in-memory computing macro." IEEE Transactions on Circuits and Systems I: Regular Papers 70.5 (2023): 1835-1845.

Simulate and estimate the performance and carbon footprint of user-defined accelerators

The function zigzag_similation_and_result_storage() simulates and evaluates the performance and carbon footprint for given architectures. An example demonstration is shown below.

An example demonstration of CarbonSpot output

Figure 2: An example demonstration of CarbonSpot output. The figures show the carbon footprint under the PA scenario (y axis) and the CA scenario (x axis) across different architecture solutions. Different colors denote different SRAM sizes. (Enable active_plot=True to see which point corresponds to which architecture solution.)

Performance Model Description

Note that in the CarbonSpot paper we only show simulation results for digital TPU-like and digital In-Memory-Computing architectures. In fact, the framework supports all propagation-based digital architectures with arbitrary data-stationarity dataflows, as well as analog In-Memory-Computing architectures.

Our SRAM-based In-Memory-Computing performance model is borrowed from ZigZag-IMC, which supports both analog (AIMC) and digital (DIMC) In-Memory-Computing architectures. A summary of the hardware settings for these chips is provided in the following table.

| Source | Label | Bi / Bo / Bcycle | Macro Size | #cell_group | nb_of_macros |
|---|---|---|---|---|---|
| [1] | AIMC1 | 7 / 2 / 7 | 1024×512 | 1 | 1 |
| [2] | AIMC2 | 8 / 8 / 2 | 16×12 | 32 | 1 |
| [3] | AIMC3 | 8 / 8 / 1 | 64×256 | 1 | 8 |
| [4] | DIMC1 | 8 / 8 / 2 | 32×6 | 1 | 64 |
| [5] | DIMC2 | 8 / 8 / 1 | 32×1 | 16 | 2 |
| [6] | DIMC3 | 8 / 8 / 2 | 128×8 | 8 | 8 |
| [7] | DIMC4 | 8 / 8 / 1 | 128×8 | 2 | 4 |

Bi/Bo/Bcycle: input precision/weight precision/number of bits processed per cycle per input. #cell_group: the number of cells sharing one entry to computation logic.
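As a worked example of the Bcycle column: under the definitions above, streaming one full-precision input into a macro takes ⌈Bi / Bcycle⌉ cycles. This derivation is our reading of the table's definitions, not a formula stated by the framework:

```python
import math

# (B_i, B_cycle) pairs copied from the table above: input precision and
# bits processed per cycle per input.
macros = {
    "AIMC1": (7, 7),
    "AIMC2": (8, 2),
    "AIMC3": (8, 1),
    "DIMC1": (8, 2),
    "DIMC4": (8, 1),
}

def cycles_per_input(b_i: int, b_cycle: int) -> int:
    """Cycles needed to stream one B_i-bit input at B_cycle bits per cycle."""
    return math.ceil(b_i / b_cycle)

for label, (b_i, b_cycle) in macros.items():
    print(f"{label}: {cycles_per_input(b_i, b_cycle)} cycle(s) per input")
```

So AIMC1 consumes a full 7-bit input in a single cycle, while the bit-serial designs with Bcycle = 1 need 8 cycles per 8-bit input.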

The validation details can be found here.

Our digital performance model is based on ZigZag. The validation details can be found here.

Carbon Model Description

The carbon model is built upon ACT. To estimate the carbon footprint of prior works, we adapt its equations and derive the following (used in expr.py):

For Continuous-Active (CA) scenario:

$Carbon/operation = \frac{k_1}{TOP/s/W} + \frac{k_2}{TOP/s/mm^2} + package\ cost$

For the Periodically-Active (PA) scenario:

$Carbon/operation = \frac{k_1}{TOP/s/W} + \frac{k_2 \cdot T_{c} \cdot TOP/s}{TOP/s/mm^2 \cdot parallelism} + package\ cost$

where $k_1$ is the operational carbon intensity ($\frac{301}{3.6\times 10^{18}}\ g\,CO_2/pJ$ as a global average), $k_2$ is the embodied carbon intensity ($8.709 \cdot \frac{1}{Yield \cdot lifetime(year)}\ g\,CO_2/mm^2/ps$), and $T_c$ is the response time constraint under the PA scenario.
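The two equations above can be sketched directly in Python. This is a minimal illustration of the equation structure, not the framework's implementation: the yield and lifetime defaults are placeholder assumptions, and consistent unit bookkeeping (pJ, ps, mm²) is left to the caller:

```python
# Sketch of the CA and PA carbon-per-operation equations above.
# k1 default: 301 / 3.6e18 g CO2 per pJ (global-average grid intensity).
# k2 default: 8.709 / (yield * lifetime_years) g CO2 per mm^2 per ps;
# yield_=0.95 and lifetime_years=3 are illustrative placeholders only.

def carbon_per_op_ca(tops_per_w, tops_per_mm2, package_cost,
                     k1=301 / 3.6e18, k2=None, yield_=0.95, lifetime_years=3):
    """Carbon per operation under the Continuous-Active (CA) scenario."""
    if k2 is None:
        k2 = 8.709 / (yield_ * lifetime_years)
    return k1 / tops_per_w + k2 / tops_per_mm2 + package_cost

def carbon_per_op_pa(tops_per_w, tops_per_mm2, tops, t_c, parallelism,
                     package_cost, k1=301 / 3.6e18, k2=None,
                     yield_=0.95, lifetime_years=3):
    """Carbon per operation under the Periodically-Active (PA) scenario."""
    if k2 is None:
        k2 = 8.709 / (yield_ * lifetime_years)
    # Embodied term scales with the response-time constraint T_c and the
    # achieved throughput, amortized over area efficiency and parallelism.
    embodied = (k2 * t_c * tops) / (tops_per_mm2 * parallelism)
    return k1 / tops_per_w + embodied + package_cost
```

Note how the embodied term differs between the scenarios: under CA it depends only on area efficiency, while under PA it grows with the response-time constraint $T_c$, so a looser latency budget with idle silicon increases the per-operation embodied carbon.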
