
@jennijuju jennijuju commented Aug 15, 2025

add metadata schema for downstream integration

blocked by #131

@FilOzzy FilOzzy added this to FS Aug 15, 2025
@github-project-automation github-project-automation bot moved this to 📌 Triage in FS Aug 15, 2025
Comment on lines +8 to +18
"version": {
"type": "string",
"pattern": "^\\d+\\.\\d+\\.\\d+$",
"description": "Schema version (semantic versioning)",
"example": "1.0.0"
},
"created_at": {
"type": "string",
"format": "date-time",
"description": "ISO 8601 creation timestamp"
},

what would these be useful for?

Comment on lines +19 to +24
"ttl": {
"type": "integer",
"minimum": 0,
"default": 0,
"description": "Time-to-live in seconds (0 = no expiry)"
},

who's consuming this, and what is the user signalling with this?

"description": "Schema version (semantic versioning)",
"example": "1.0.0"
},
"created_at": {

snake_case is inconsistent here: createdAt if we really need this, or with_cdn if we want snake_case

"default": false,
"description": "Get FilCDN add-on"
},
"IndexIPFSandPublish": {

needs to be shorter, also case needs to be consistent; just indexIPFS might be enough for this

"description": "Index IPFS CIDs and publish to IPNI"
}
},
"additionalProperties": false

Suggested change
- "additionalProperties": false
+ "additionalProperties": true

we should be able to let this be free-form for users, I imagine the only reason we're publishing a schema here is to advertise some standards that we encourage or know get used
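
A minimal sketch of what an open schema means for a consumer, assuming metadata is a flat string-to-string map. `KNOWN_KEYS` and `unknown_keys` are hypothetical names used only for illustration, and the key list is not an agreed standard. Unknown keys are reported for information and never rejected, which mirrors `"additionalProperties": true`.

```python
# Hypothetical advisory list of advertised keys; not an agreed standard.
KNOWN_KEYS = {"indexIPFS", "ttl"}


def unknown_keys(metadata: dict) -> list:
    """Return the keys that are not in the advisory list.

    Unknown keys are allowed; the only hard rule assumed here is that
    every key and every value is a string.
    """
    for key, value in metadata.items():
        if not isinstance(key, str) or not isinstance(value, str):
            raise TypeError("metadata keys and values must be strings")
    return sorted(key for key in metadata if key not in KNOWN_KEYS)


# A free-form key passes through; it is only listed as "not advertised".
extras = unknown_keys({"indexIPFS": "", "project": "demo"})
```

With `"additionalProperties": false`, the `project` key in this example would instead have to be rejected.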

Comment on lines +31 to +32
"type": "boolean",
"default": false,

I'm not sure how to signal this in jsonschema but I was hoping that these boolean fields would just be omitted for false, and present but empty for true, so we don't waste bytes - key exists, value is ""


oh, also, this is a string key string value thing, so I'm not sure we can even signal data types other than strings with this anyway
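
The presence-as-true convention described in these two comments could be sketched as follows, assuming a string key / string value map. The helper names `encode_flags` and `is_set` are hypothetical, and `withCDN` is an assumed key spelling that does not appear in the diff.

```python
def encode_flags(flags: dict) -> dict:
    """Keep only the true flags, each stored as key -> empty string.

    False flags are omitted entirely, so they cost no bytes.
    """
    return {name: "" for name, enabled in flags.items() if enabled}


def is_set(metadata: dict, name: str) -> bool:
    """A flag counts as true when its key exists, whatever the value."""
    return name in metadata


# "withCDN" is a hypothetical key name used for illustration.
metadata = encode_flags({"indexIPFS": True, "withCDN": False})
```

Under this convention the happy default is an empty map, and a reader checks key presence rather than parsing "true" or "false" strings.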


rvagg commented Aug 18, 2025

Couple of thoughts:

  • I think maybe just a "Metadata Registry" section on the readme might be enough for this - a simple table there where we, and anyone else, can add known keys to it that get used for data sets and for pieces (separate tables). It's not a strict list, the contract or our software doesn't check for you, it's just a list for anyone that wants to signal a piece of metadata that might exist and what they use it for.
  • If we have confidence that certain fields should always be present, or that there are some core standards, then maybe we should consider the approach in "A new Service Registry Contract for Filecoin Synapse" (#142) of having a bytes field with an abi encoded struct, followed by a free-form key/value metadata list - then we can stick in ones we know we'll likely have for the future, and expand that list as keys get standardised by some process, and leave the string[] k/v thing for the unstandardised ones. Then it gets to be more gas and storage efficient; although it isn't as friendly for inspection or subgraphs, so there's a tradeoff.
  • Mostly I think we should opt for things not being present in here. The happy default path should be for zero entries in here, you only put something in there when you want to signal opt-in behaviour (hence my comment about booleans - "false" should just be that it's not present, and "true" can be that the key is present but value is empty).
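
The split suggested in the second bullet, standardised fields in one place and everything else in a free-form list, might look roughly like this. It is only a sketch: `STANDARD_KEYS` and `split_metadata` are hypothetical, the choice of standardised key is an assumption, and no ABI encoding is performed here.

```python
# Hypothetical set of keys that some process has standardised.
STANDARD_KEYS = ("indexIPFS",)


def split_metadata(metadata: dict) -> tuple:
    """Partition metadata into standardised fields and free-form pairs.

    The first part stands in for what would go into the encoded struct;
    the second is the leftover key/value list, sorted for stable output.
    """
    standard = {k: v for k, v in metadata.items() if k in STANDARD_KEYS}
    free_form = sorted(
        (k, v) for k, v in metadata.items() if k not in STANDARD_KEYS
    )
    return standard, free_form


standard, free_form = split_metadata({"indexIPFS": "", "project": "demo"})
```

Moving a key from the free-form list into the struct then only requires adding it to the standardised set, at the cost of the struct being harder to inspect.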

@rjan90 rjan90 moved this from 📌 Triage to ⌨️ In Progress in FS Aug 18, 2025

rjan90 commented Aug 20, 2025

@rvagg With the latest changes you pushed in #131, will this PR be redundant? Or are there still items here that we want to land?


rvagg commented Aug 21, 2025

No, we should do something with this; it's more of a TODO than a solid proposal. IMO this ends up more like the multicodec registry - just a table with descriptions of "known metadata".
