Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make instr_dict.yaml more useful #123

Open
3 tasks
jmerdich opened this issue May 29, 2022 · 3 comments
Open
3 tasks

Make instr_dict.yaml more useful #123

jmerdich opened this issue May 29, 2022 · 3 comments

Comments

@jmerdich
Copy link

While the individual code generators are useful, it'd be nice to have a machine-readable version-- sometimes the default generators just aren't a fit! instr_dict.yaml almost gets there, but it only includes the instructions and none of the other useful sideband data like csrs or field positions.

To that end, I'd like to propose the following:

  • Move the current contents of the yaml file (ie, the raw mapping of instructions) to a subfield called eg. 'instructions'. This is a breaking change, but as far as I can tell, no open source code directly uses instr_dict.yaml.
  • Add a format version number to it to track changes
  • Start adding other data to get everything from the constants file automatically copied to the yaml file

I'm willing to put in effort here, but is this something the community is interested in?

@aswaterman
Copy link
Member

cc @neelgala

@SpriteOvO
Copy link

SpriteOvO commented Jun 9, 2022

+1, I agree that instr_dict needs some improvements.

The current parse.py generates a instr_dict in yaml format. A snippet looks like this:

amomaxu_d:
  encoding: 11100------------011-----0101111
  extension:
  - rv64_a
  mask: '0xf800707f'
  match: '0xe000302f'
  variable_fields:
  - rd
  - rs1
  - rs2
  - aq
  - rl
amomaxu_w:
  encoding: 11100------------010-----0101111
  extension:
  - rv_a
  mask: '0xf800707f'
  match: '0xe000202f'
  variable_fields:
  - rd
  - rs1
  - rs2
  - aq
  - rl
amomin_d:
  encoding: 10000------------011-----0101111
  extension:
  - rv64_a
  mask: '0xf800707f'
  match: '0x8000302f'
  variable_fields:
  - rd
  - rs1
  - rs2
  - aq
  - rl
amomin_w:
  encoding: 10000------------010-----0101111
  extension:
  - rv_a
  mask: '0xf800707f'
  match: '0x8000202f'
  variable_fields:
  - rd
  - rs1
  - rs2
  - aq
  - rl

We can see two problems here:

  • . in the instruction name and extension are escaped to _.
  • Not compact. variable_fields is probably better as a single line.

Although these can be solved backward-compatible (I'm not sure if it satisfies the yaml standard spec):

"amomaxu.d":
  encoding: 11100------------011-----0101111
  extension: ["rv64.a"]
  mask: '0xf800707f'
  match: '0xe000302f'
  variable_fields: [rd, rs1, rs2, aq, rl]
"amomaxu.w":
  encoding: 11100------------010-----0101111
  extension: ["rv.a"]
  mask: '0xf800707f'
  match: '0xe000202f'
  variable_fields: [rd, rs1, rs2, aq, rl]
"amomin.d":
  encoding: 10000------------011-----0101111
  extension: ["rv64.a"]
  mask: '0xf800707f'
  match: '0x8000302f'
  variable_fields: [rd, rs1, rs2, aq, rl]
"amomin.w":
  encoding: 10000------------010-----0101111
  extension: ["rv.a"]
  mask: '0xf800707f'
  match: '0x8000202f'
  variable_fields: [rd, rs1, rs2, aq, rl]

I believe instr_dict is generated for machine-readable purposes, so IMO json is probably better than yaml here:

{
  "amomaxu.d": {
    "encoding": "11100------------011-----0101111",
    "extension": ["rv64.a"],
    "mask": "0xf800707f",
    "match": "0xe000302f",
    "variable_fields": ["rd", "rs1", "rs2", "aq", "rl"]
  },
  "amomaxu.w": {
    "encoding": "11100------------010-----0101111",
    "extension": ["rv.a"],
    "mask": "0xf800707f",
    "match": "0xe000202f",
    "variable_fields": ["rd", "rs1", "rs2", "aq", "rl"]
  },
  "amomin.d": {
    "encoding": "10000------------011-----0101111",
    "extension": ["rv64.a"],
    "mask": "0xf800707f",
    "match": "0x8000302f",
    "variable_fields": ["rd", "rs1", "rs2", "aq", "rl"]
  },
  "amomin.w": {
    "encoding": "10000------------010-----0101111",
    "extension": ["rv.a"],
    "mask": "0xf800707f",
    "match": "0x8000202f",
    "variable_fields": ["rd", "rs1", "rs2", "aq", "rl"]
  }
}

@neelgala
Copy link
Collaborator

neelgala commented Jun 9, 2022

Hi.. sorry for the delay in getting to this.

@jmerdich thanks for those inputs - I am in agreement with all your suggestions and if you could submit a PR I can help review it. Since the concern is with just dumping the yaml contents I think it shouldn't break anything in this repo. Other projects consuming the instr_dict.yaml from this repo will obvioulsy have to make changes - so if someone has an issue with this they should speak now.

Regarding versioning - I would suggest a very basic light weight calendar versioning done by the CI would be best.

@SpriteOvO regarding compactness : I have used pyyaml and ruamel and unfortunately neither dumps the lists in a single-line while keeping everything else coherent.

I don't see a real issue with the replacing . with _ because the spec never uses _ in their instruction definition. So that probably wouldn't be a critical issue for now.

I don't mind switching to json considering there is probably greater support in multiple langauges to parse json than yaml.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants