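DMA (Direct Memory Access) frees the CPU from heavy data-transfer work by decoupling data movement from the CPU.
Most RoCC accelerators today fetch their data through the cache. Some extend the cache path, for example by reading from the L2 cache, and some add ping-pong buffering or data prefetching, but few fetch data with DMA.
The RocketChip generator does not include a DMA engine, so data can only be fetched through the cache.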
My classmate @zwk recently designed a DMA module while learning Chisel. For details, see his blog post: "DMA Design Based on TileLink" (基于TileLink的DMA设计)
You can also look at the DMA modules in existing accelerator designs, such as IceNet and the Gemmini accelerator in Chipyard.
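A rough picture of how DMA fetches data on RocketChip:
A DMA read works like this: a request to read, say, an array of 4096 64-bit elements is split into multiple cache-block requests to DRAM. Currently each cache block has at most 8 beats of 64 bits each.
Each cache-block request takes about 10-15 cycles to be answered.
Once DRAM responds to a cache-block request, the block's 8 × 64 bits arrive in order over the following 8 clock cycles.
Reading the next cache block requires issuing another request to DRAM and repeating the process above.
If the cache-block request latency stays the same, widening the DRAM-to-DMA path means more data is fetched in the same amount of time. For example, widening from 64 to 128 bits should give about a 2x improvement, assuming the number of beats per block is unchanged.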
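The numbers above can be put into a small back-of-the-envelope model. This is only a sketch: the function name `dma_read_cycles` and its parameters are hypothetical, the latency of 12 cycles is an assumed midpoint of the 10-15 cycle range, and the model assumes blocks are requested strictly one at a time with no overlapping outstanding requests.

```python
def dma_read_cycles(total_bits, beat_bits=64, beats_per_block=8, request_latency=12):
    """Estimate cycles to read total_bits when cache blocks are fetched one after another.

    Assumption: each block costs request_latency cycles of waiting plus one
    cycle per beat, and the next request is only issued after the block arrives.
    """
    block_bits = beat_bits * beats_per_block
    num_blocks = (total_bits + block_bits - 1) // block_bits  # round up
    return num_blocks * (request_latency + beats_per_block)


total_bits = 4096 * 64  # the 4096 x 64-bit array from the example

# Baseline: 64-bit beats, 8 beats per block (512-bit blocks)
base = dma_read_cycles(total_bits)

# Case A: 128-bit beats, still 8 beats per block (block grows to 1024 bits)
wide_same_beats = dma_read_cycles(total_bits, beat_bits=128, beats_per_block=8)

# Case B: 128-bit beats, block stays 512 bits, so only 4 beats per block
wide_same_block = dma_read_cycles(total_bits, beat_bits=128, beats_per_block=4)

print(base, wide_same_beats, wide_same_block)  # 10240 5120 8192
print(base / wide_same_beats, base / wide_same_block)  # 2.0 1.25
```

Under these assumptions the 2x gain appears only when the wider path carries twice as much data per block request (case A). If the block size stays at 512 bits, the fixed request latency dominates and the gain is closer to 1.25x (case B).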