[Feature request] further documentation on the J1 core versions #74
Comments
Well, j1b is 32-bit, whereas j1a is 16-bit. But apart from that:

The j1a is able to run on SRAMs having only pseudo dual ports, whereas the original J1 was designed to run on real dual-port SRAMs, which aren't available on all FPGAs. The J1[b] is all about one-basic-Forth-instruction-per-clock, and so needs true dual-port SRAM: the first port is always available each clock to read the next instruction, and the second port may optionally do a read or a write to RAM in the same cycle, so loads and stores never stall instruction fetch.

Pseudo dual-port SRAMs can read from one address whilst writing to a different address, but each of the two ports is dedicated to only read or only write. With true dual-port SRAM, each of the ports can read OR write. It's possible to 'emulate' a true dual-port SRAM with only pseudo dual-port blocks, but it costs performance, since you need to clock the SRAM twice as fast to do that (see the first sketch below).

ice40-architecture FPGAs only have pseudo dual-port embedded RAM blocks, and the j1a was written to run on an ice40hx1k chip. So to accommodate memory access apart from just reading the next instruction, the j1a core has an 'alternate' mode, which is done by setting the top bit of the program counter, pc[12]. You can see this in the instruction decode in j1a.v (second sketch below). Of course, this then means there is no need for the memory-fetch ALU opcode (N = [T]) of the original J1. Instead that opcode slot is used for a 'minus' op in the j1a, which would otherwise require several instructions (negate, then add). The other difference can be seen if you compare the instruction encodings of j1.v and j1a.v; for example, the 'return' bit sits at a different position (bit 12 in the original, bit 7 in the j1a).

Minor aside for if you can read C code, but are just coming to grips with Verilog: a term like pc[12] is a single-bit select, roughly (pc >> 12) & 1 in C terms, and {a, b} is bit concatenation.
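For concreteness, here's a minimal Verilog sketch of a pseudo dual-port RAM, the kind the iCE40 BRAMs provide. The module and signal names are illustrative, not taken from ram.v:

```verilog
// Minimal pseudo-dual-port RAM sketch (names illustrative, not from ram.v).
// Port A may only read, port B may only write: this is what iCE40 BRAMs offer.
// A true dual-port RAM would instead let EACH port read or write every cycle.
module pdp_ram #(
  parameter AWIDTH = 12,
  parameter DWIDTH = 16
) (
  input                   clk,
  input  [AWIDTH-1:0]     raddr,   // read-only port (always free for instruction fetch)
  output reg [DWIDTH-1:0] rdata,
  input                   we,      // write-only port (data stores)
  input  [AWIDTH-1:0]     waddr,
  input  [DWIDTH-1:0]     wdata
);
  reg [DWIDTH-1:0] mem [0:(1<<AWIDTH)-1];

  always @(posedge clk) begin
    rdata <= mem[raddr];           // a read can proceed...
    if (we)
      mem[waddr] <= wdata;         // ...while a write hits a different address
  end
endmodule
```

Emulating true dual-port behaviour out of a block like this means time-slicing it at twice the core clock, which is where the performance cost mentioned above comes from.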
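And a heavily simplified, illustrative-only sketch of the 'alternate mode' idea. The real decode in j1a.v is structured differently; the names below (insn, rst0, st0N, pcN) just follow its style. The gist: a memory read is performed by jumping to the target address with pc[12] set, and the word that comes back on the instruction port is then data, not an opcode:

```verilog
// Illustrative sketch of the j1a 'alternate fetch' trick, NOT the actual decode.
// With pc[12] set, the word on the instruction port is RAM data: push it like a
// literal and 'return' to the instruction after the fetch.
module fetch_mode_sketch (
  input      [12:0] pc,      // pc[12] = alternate-mode flag, pc[11:0] = address
  input      [15:0] insn,    // word returned by the single instruction port
  input      [15:0] rst0,    // top of return stack (the saved return address)
  output reg [15:0] st0N,    // next top-of-data-stack
  output reg [12:0] pcN      // next program counter
);
  always @* begin
    if (pc[12]) begin
      st0N = insn;           // fetched word is pushed as if it were a literal
      pcN  = rst0[12:0];     // ...and we return to the caller
    end else begin
      st0N = 16'h0000;       // normal instruction decode would go here
      pcN  = pc + 13'd1;
    end
  end
endmodule
```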
pc[12] isn't actually used to address the SRAM: the SRAM is generated by j1a/mkrom.py so that the initial contents can be set at FPGA compile time, meaning the FPGA also bootstraps the core at configure time. (This isn't so necessary now that the icestorm tools can replace SRAM contents without a recompile, but they couldn't do that back when j1a was written, and it's a neat way to make the FPGA configuration logic do your SoC core's bootstrapping too.) Which is to say that pc[12] is a mode flag, not an address bit: the highest RAM fetch address bit the design uses is pc[11]. It's a little confusing IMHO, but 'din' in ram.v is the connection flowing data from the RAM to the core, and vice versa for 'dout'.

Another thing which makes the J1 very fast: note that the top of stack is an ordinary register, not a RAM location, so ALU operands are available every cycle without touching stack memory. It makes one realize that the stacks themselves are just small RAMs with pointers; only the top element needs to live in a register. Stack movements are just encoded as two-bit signed integers in the ALU opcode format, one for the return stack and one for the data stack, although only +1, 0 and -1 are actually used (see the sketch below). You could in principle have opcodes that replace any number of stack items; you'd just rearrange the core so that the top few logical stack items are also registers, like the existing top-of-stack.

This makes the J1 design pretty interesting for custom FPGA SoC use, IMHO. Of course, in practice I've found it much easier to extend the I/O section (in icestorm/j1a.v) to allow just hooking up 'accelerator' units, added to the design on an as-needed basis.

The only 'deep' core modding I did was the j4a, which is kinda 4x j(1/4)a in a sense: it has 4x the context, and 'looks' like a 1/4-speed j1a to the code... until you put the other 'cores' to work (they're logical only; the ALU, SRAM and IO are all shared). Mainly it just has funky 'stack' modules, with a little bit of pipelining and tuning. It's probably got a bug somewhere, but it has mostly worked out pretty well for me. It lets me run multiple dumb spin-loop bit-bang IO routines to control/talk to different chips at different rhythms without any interlocks or glitches. It's capped at 4 'threads', but this is heaps for a simple thing like a PID controller. A nice consequence is that you can have a spin-loop based app running and still talk to swapforth over RS232 to get/set variables in SRAM without any timing changes. You can even actively hack / rewrite code for different jobs without upsetting the ones that are already running at all. Having no DRAM, no wait cycles, no bubbles and only an 'emergency' interrupt system (to recover crashed cores) is incredibly freeing when you're writing a real-time controller. Kind of like having an RTOS in hardware, only better, since the timing is FPGA-state-machine rock-solid, and interlocks are impossible.

Anyway, the code for the J1 cores is so short and beautiful that 'documenting' them is probably more about learning to read Verilog than anything else. Better to have a single source of truth and all that.

One interesting observation: there are other parts of the instruction space which are 'available'. J1 uses a 4-bit field to select one of 16 ALU ops, but that could easily be extended to one of 32 ops, since that thirteenth bit is already 'free'...
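The stack-movement decode mentioned above is pleasantly cheap in hardware: sign-extend the two-bit field and add it to the pointer. A sketch, with illustrative widths and names (j1.v does essentially this with its own signals):

```verilog
// Sketch of the two-bit signed stack-pointer update (illustrative names/widths).
// dd is the data-stack delta field from the ALU opcode; the return-stack delta
// is decoded identically with its own field and pointer.
module stack_delta_sketch (
  input  [1:0] dd,      // two-bit signed delta: +1, 0, -1 (and -2 encodable)
  input  [3:0] dsp,     // current data-stack pointer
  output [3:0] dspN     // next data-stack pointer
);
  // Sign-extend the 2-bit field to the pointer width, then add.
  assign dspN = dsp + {{2{dd[1]}}, dd};
endmodule
```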
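On the j4a, the next comment's phrase 'barrel processor' is exactly right. Here's a hypothetical sketch of the scheduling idea, not the j4a source: a free-running two-bit counter picks which of the four logical cores owns the shared ALU/SRAM/IO each cycle, so every thread sees deterministic quarter-rate timing with no interlocks:

```verilog
// Hypothetical barrel-scheduler sketch; illustrates the idea, not the j4a code.
module barrel_sketch (
  input        clk,
  input        resetq,     // active-low reset
  output [1:0] thread      // which logical core owns the shared datapath now
);
  reg [1:0] ctr;
  always @(posedge clk or negedge resetq)
    if (!resetq) ctr <= 2'b00;
    else         ctr <= ctr + 2'b01;   // 0,1,2,3,0,... round-robin, no interlocks
  assign thread = ctr;
  // Each per-thread register (pc, top-of-stack, stack pointers) becomes a small
  // 4-entry bank indexed by 'thread'; the ALU, SRAM and IO stay single and shared.
endmodule
```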
This intermediate-level documentation was enormously helpful. I do read and write Verilog, but it takes a lot of work to extract this information from the code. In particular, what the j4a does baffled me for months; now I get it: it is a barrel processor. You may want to put that sentence near the top. Indeed this whole document could happily go in the README. I was also a bit confused as to what pseudo dual-port RAM does. Thanks again.
I'm trying to wrap my head around how the J1 core evolved over time and which versions of the core are featured in which repositories/folders. My current understanding is that, starting from the original J1 core, two new versions called J1a and J1b were created. Since the J1 repository was updated as well, I figure the changes were backported?
Sadly I've hardly any knowledge of Verilog/VHDL and therefore a hard time reading the .v files. I'd appreciate it if someone could point out the key differences to me. E.g. this blog post mentions that the return bit was moved from bit 12 to bit 7... things like that.
Is there a paper describing the new versions of the core, like there is for the original?