This repository has been archived by the owner on Dec 1, 2018. It is now read-only.

Use URAMs on F1 #230

Open
shadjis opened this issue Oct 4, 2017 · 6 comments

@shadjis
Contributor

shadjis commented Oct 4, 2017

The F1 has UltraRAMs (URAMs), which can be used for larger SRAMs. However, an SRAM must be explicitly assigned to URAM using the following Verilog attribute syntax:

(* ram_style = "ultra" *) reg [DWIDTH-1:0] mem [0:WORDS-1];

One way to do this is to have an analysis pass which:

  1. gets a list of all SRAMs,
  2. sorts the list by size, and
  3. keeps track of the largest 800 SRAMs (there are 800 URAMs on the F1)

For example, the pass can store the size of the 800th-largest SRAM, and code generation can then use a different template for any SRAM at least that large.
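The threshold idea above can be sketched roughly as follows. This is only an illustration, not the compiler's actual pass: the `Sram` record, the function names, and the choice to measure "size" in total bits are all assumptions made for the example.

```python
from dataclasses import dataclass

URAM_COUNT = 800  # number of URAMs on the F1, as stated in this issue


@dataclass(frozen=True)
class Sram:
    """Hypothetical record for one SRAM seen by the analysis pass."""
    name: str
    width: int  # bits per word
    words: int  # number of words

    @property
    def bits(self) -> int:
        # Assumption: "size" means total storage in bits.
        return self.width * self.words


def uram_threshold(srams, budget=URAM_COUNT):
    """Size in bits of the budget-th largest SRAM.

    Returns 0 when there are fewer SRAMs than the budget, so that
    every SRAM qualifies.
    """
    sizes = sorted((s.bits for s in srams), reverse=True)
    if len(sizes) < budget:
        return 0
    return sizes[budget - 1]


def pick_template(sram, threshold):
    """Choose a code generation template name (names are hypothetical).

    Note: SRAMs that tie with the threshold size all qualify, so ties
    can push the selected count above the budget.
    """
    return "uram" if sram.bits >= threshold else "default"
```

Code generation would then call `pick_template` once per SRAM and emit the `ram_style = "ultra"` attribute only for the `"uram"` case.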

@dkoeplin
Collaborator

dkoeplin commented Oct 4, 2017

A single SRAM may take more than 1 URAM, but other than that yep this should work.

Are there any downsides to using a URAM over an SRAM (e.g. not dual ported, higher latency, etc.)?

@raghup17
Contributor

raghup17 commented Oct 4, 2017

Latency should be the same (1 cycle), and URAMs are dual ported. However, URAM ports are 72 bits wide, twice the width of BRAM ports, and cannot be configured to operate at a narrower width. In other words, unlike with BRAMs, we do not get more depth from a URAM by using a narrower width. This means that without fixing #231, URAM usage will be quite inefficient for narrower data types.
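To put a number on the inefficiency, here is a small sketch of how much of each 72-bit row is actually used when one data word occupies one row (that is, without any packing). The function name is made up for this example.

```python
URAM_WIDTH_BITS = 72  # fixed URAM port width, per the comment above


def uram_utilization(data_width: int) -> float:
    """Fraction of each 72-bit URAM row that holds data when every
    word takes a full row (no packing of multiple words per row)."""
    if not 1 <= data_width <= URAM_WIDTH_BITS:
        raise ValueError("data_width must be between 1 and 72 bits")
    return data_width / URAM_WIDTH_BITS
```

Under this assumption a 32-bit word uses 32/72 of each row (about 44%), and an 8-bit word uses 8/72 (about 11%).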

@shadjis
Contributor Author

shadjis commented Oct 4, 2017

Also David, the case of one SRAM needing more than one URAM may be a bit complicated, since URAMs can be cascaded to implement bigger memories. However, enabling cascading reduces the number of URAMs available:
https://github.com/aws/aws-fpga/blob/master/hdk/cl/examples/cl_uram_example/README.md#implementation-options

If cascading is not enabled and a memory deeper than 4096 words is given a URAM directive, I'm not sure whether the tools will:

  1. still use multiple URAMs but non-dedicated routing (e.g. may impact timing and routability),
  2. use block rams instead, or
  3. fail

If it is case 1 or 2, this should be OK, but if it is case 3 we might want to omit SRAMs deeper than 4096 words from the URAM list. For now, I think we can just assume 1 SRAM per URAM and handle the more complicated case later. Also, as Raghu said, 4096 is the depth without packing multiple words into the 72-bit word width (#231), so with packing it might actually be 8k or 16k words. That may be larger than anything we ever need, so cascading may not be necessary.
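The 8k and 16k figures follow from simple arithmetic, sketched below. It assumes that packing (#231) places whole words side by side in a 72-bit row and never splits a word across rows; that is an assumption for illustration, not a description of how #231 is implemented.

```python
URAM_WIDTH_BITS = 72   # fixed URAM row width
URAM_DEPTH_ROWS = 4096  # rows in a single, non-cascaded URAM


def packed_depth(data_width: int) -> int:
    """Words that fit in one URAM if whole words are packed side by
    side in each 72-bit row (hypothetical packing scheme)."""
    if not 1 <= data_width <= URAM_WIDTH_BITS:
        raise ValueError("data_width must be between 1 and 72 bits")
    words_per_row = URAM_WIDTH_BITS // data_width
    return URAM_DEPTH_ROWS * words_per_row
```

With these assumptions, 32-bit words pack two per row for 8192 words, and 16-bit words pack four per row for 16384 words. Widths above 36 bits gain nothing, since only one word fits per row.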

@dkoeplin
Collaborator

dkoeplin commented Oct 4, 2017

Ah interesting, thanks for pointing this out. In that case, if I see a bank larger than 4096 words I won't include it in the URAM candidate list for now. This doesn't happen extremely often in practice, so this simple solution should work ok for now.
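A minimal sketch of that filter, assuming each bank is described by a `(name, width_bits, words)` tuple and that candidates are ranked by total bits; the function name and tuple layout are hypothetical and not taken from the compiler.

```python
MAX_URAM_WORDS = 4096  # depth of one URAM without cascading or packing
URAM_COUNT = 800       # URAMs available on the F1, per this issue


def uram_candidates(banks, max_words=MAX_URAM_WORDS, budget=URAM_COUNT):
    """Return names of banks to map to URAM, largest first.

    banks: iterable of (name, width_bits, words) tuples.
    Banks deeper than max_words are skipped entirely, so each
    selected bank is assumed to fit in a single URAM.
    """
    eligible = [b for b in banks if b[2] <= max_words]
    # Rank by total bits so the biggest memories get URAMs first.
    eligible.sort(key=lambda b: b[1] * b[2], reverse=True)
    return [name for name, _, _ in eligible[:budget]]
```

Skipped banks simply fall back to whatever the default SRAM template produces.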

@mattfel1
Member

mattfel1 commented Oct 5, 2017

Do we have the metadata yet that tells me if I should uramify a memory?

@dkoeplin
Collaborator

dkoeplin commented Oct 5, 2017

Not yet - will be adding it today
