Simplify abstract data model and specify one concrete representation #887

Merged
msporny merged 9 commits into main from msporny-concrete-dm
Apr 24, 2025

Conversation

msporny
Member

@msporny msporny commented Mar 29, 2025

This PR is an attempt to address issue #855 by simplifying the abstract data model down to INFRA and specifying a single concrete JSON representation.

Note: We are currently in maintenance mode, which means that we cannot make any class 4 changes (change normative statements in ways that are not backwards compatible). That has hampered a more straightforward PR and required some normative acrobatics given that I couldn't remove previous normative statements. Please keep this in mind as you review as some wording is painfully contrived due to our maintenance mode status with this specification.



@msporny msporny added the class 2 Changes that do not functionally affect interpretation of the document label Mar 29, 2025
@msporny msporny requested a review from mccown as a code owner March 29, 2025 22:07
@iherman
Member

iherman commented Mar 30, 2025

What does the term "Generalized JSON-LD Processor" mean? We have the notion of "JSON-LD Processor" defined by the JSON-LD specification; what do we generalize about it?

@msporny
Member Author

msporny commented Mar 30, 2025

What does the term "Generalized JSON-LD Processor" mean? We have the notion of "JSON-LD Processor" defined by the JSON-LD specification; what do we generalize about it?

I'm re-using language that we agreed to in the VCWG: https://w3c.github.io/vc-data-model/#dfn-general-json-ld-processing

This is language that we're hoping to eventually incorporate into the JSON-LD v1.2 specification so we don't have to keep repeating ourselves in vc-data-model, vc-data-integrity, CID, and DID specs. I thought about linking to VCDM v2.0, should we do that instead? The only concern there is that "type specific credential processing" (for a VC) becomes "DID Method specific DID Document processing" (for a CID/DID document).

We really need to get the CID document into the DIDWG so we can make these changes/alignment more easily in the next revision.

@iherman
Member

iherman commented Apr 1, 2025

@msporny, reacting to #887 (comment): I have re-read the changes and I still do not really understand what the notion of "Generalized JSON-LD Processing" brings to the table, as opposed to the usage of a JSON-LD Processor, which is an existing term in the JSON-LD spec. In the VCDM, which you point to, the term is used in opposition to "Type specific credential processing", which is not relevant here. Personally, I would propose to use the JSON-LD terminology and that is that.

This is language that we're hoping to eventually incorporate into the JSON-LD v1.2 specification

Nobody knows whether this will happen or not, we do not even have a charter for JSON-LD 1.2. We cannot rely on that. And, as I said above, it may not be necessary in the first place.

We really need to get the CID document into the DIDWG so we can make these changes/alignment more easily in the next revision.

I do not understand what you mean by that.

@pchampin
Contributor

This was discussed during the did meeting on 03 April 2025.

DID Core PR Processing

<ottomorac> w3c/did#887


ottomorac: You've gotten some feedback from Ted in there, and from Ivan -- just be aware of the PR.

manu: 887 is a pretty big change.
… It's not normative. We're still trying to align with the DID spec.
… We're trying to eliminate the abstract data model and follow what the control spec is doing, which is establish a processing algorithm.
… It also makes the media type clear. There's now just one media type application/did. And you can also use JSON-LD on it to process it.
… Basically, what this does: the base specification we depend on is INFRA. It's debatable whether INFRA is an abstract data model; I assert that it is, others assert that it isn't.
… This is the data model for the web and DOM environment, and it has a concrete mapping to JSON.
… You map INFRA to JSON and that's it.
… We used to have two representations in the spec, after years of debate. After we had some implementation experience, people asked why we had two.
… What I've done here is that we have one JSON representation. The PR includes the @context value and deletes the entire JSON-LD representation.
… We preserved the rules because we can't make breaking changes. We just need other folks to look and see if they agree.
… The other thing that I couldn't remove is that there could be other representations. You can have a document and a YAML representation for it, as long as you can round-trip to JSON losslessly.
… Any representation is legitimate as long as it can round trip to the concrete data model.
… I don't think it makes any normative changes so we're within our charter.
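
As a rough illustration of the round-trip condition described above, the following Python sketch checks whether a document survives a trip through another encoding and back to JSON unchanged. Everything specific here is an assumption made for the example: the DID document values are illustrative, `round_trips_losslessly` is a hypothetical helper, and Python's `plistlib` stands in for an alternate representation only because it ships with the standard library. None of it is taken from the PR.

```python
import json
import plistlib

# Illustrative DID document; the identifier, key material, and @context
# value are example values assumed for this sketch.
did_document = {
    "@context": "https://www.w3.org/ns/did/v1",
    "id": "did:example:123456789abcdefghi",
    "authentication": [
        {
            "id": "did:example:123456789abcdefghi#keys-1",
            "type": "Ed25519VerificationKey2020",
            "controller": "did:example:123456789abcdefghi",
            "publicKeyMultibase": "zH3C2AVvLMv6gmMNam3uVAjZpfkcJCwDwnZn6z3wXmqPV",
        }
    ],
}


def round_trips_losslessly(document: dict) -> bool:
    """Return True if the document is unchanged after a trip through
    the alternate encoding and back (hypothetical helper)."""
    # Serialize with sorted keys so key order does not affect the comparison.
    baseline = json.dumps(document, sort_keys=True)
    # Encode into the stand-in representation, then decode it again.
    decoded = plistlib.loads(plistlib.dumps(document))
    return json.dumps(decoded, sort_keys=True) == baseline


print(round_trips_losslessly(did_document))
```

A representation that cannot express some JSON value (for example, `plistlib` rejects `null`) would fail this kind of check for documents that use that value.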

ottomorac: Yes, that explanation makes sense.

ottomorac: Seems like we're on the right path forward.

manu: I'm probably going to leave it until the call next week to give extra review time.


index.html Outdated
Comment on lines 1608 to 1928
also compatible with processors that perform JSON-LD processing. Developers can
use any other [=representation=], such as CBOR or YAML, that is capable of
expressing the <a href="#data-model">data model</a>. The following sections
Contributor

Suggested change
- also compatible with processors that perform JSON-LD processing. Developers can
- use any other [=representation=], such as CBOR or YAML, that is capable of
- expressing the <a href="#data-model">data model</a>. The following sections
+ also compatible with processors that perform JSON-LD processing. The following sections

Single representation right?

Member Author

@msporny msporny Apr 24, 2025

Yes, I've updated the language to say "single" representation and remove the YAML/CBOR statement in e354ce8.

index.html Outdated

<section>
<h3>Production</h3>
<h2>Generalized JSON-LD Processors</h2>
Contributor

As discussed during today's meeting, I find the term "generalized JSON-LD processors" confusing in this context. It seems to imply that those are processors more general than standard JSON-LD processors, which (IIUC) is not what is intended.

From @msporny's explanations during today's meeting, I think that "general JSON-LD processors" (a wording that is used twice in the paragraph!) or "generic JSON-LD processors" would better convey the intended meaning.

Suggested change
- <h2>Generalized JSON-LD Processors</h2>
+ <h2>General JSON-LD Processors</h2>

Member

I would have a preference to use the term "Generic" (if we really want to qualify it). That term suggests using the processor as defined (presumably in a standard) whereas the term "General" might mean that there is also a "Specialized" JSON-LD Processor. And that is not the case; there are JSON processors that do not try to interpret @context but just consider it as a bona fide JSON key, but that does not qualify to refer to them as JSON-LD processors in the first place.

Contributor

I like "Generic"

Member

"General" implies contrasting "Special".
"Generic" implies contrasting "Specific".
"Generalized" implies contrasting "Specialized".

Member Author

I ended up dropping "Generic/General/Generalized" and just said "standard JSON-LD Processing algorithms" or "JSON-LD Processing using standards-compliant libraries" since it felt cleaner. Hope that works for everyone.

index.html Outdated
[=representation=] [=production=] rules as defined in [[[#json]]].
[[JSON-LD11|JSON-LD]] is a JSON-based format used to serialize
<a href="http://www.w3.org/TR/ld-glossary/#linked-data">Linked Data</a>. Some
implementations are expected to process [=DID documents=] using generalized
Contributor

Suggested change
- implementations are expected to process [=DID documents=] using generalized
+ implementations are expected to process [=DID documents=] using general

per my suggestion above

Member

And, to be in sync with myself, I would use "generic" :-)

Member Author

I ended up dropping "Generic/General/Generalized" and just said "standard JSON-LD Processing algorithms" or "JSON-LD Processing using standards-compliant libraries" since it felt cleaner.

@w3cbot

w3cbot commented Apr 17, 2025

This was discussed during the #did meeting on 17 April 2025.

Simplify abstract data model and specify one concrete representation

<ottomorac> w3c/did#887

ottomorac: this one is -- so, we said we'd try to eliminate the abstract data model, and follow the CID spec (establishing the algorithm)

manu: good summary, I think I said that I'd leave this open til this week, and if we didn't get more feedback, we'll merge it in
… so, I'll merge it by the end of the weekend
… I think it's fine as is, I don't think it'll blow anything up
… just to be clear, there are not supposed to be any normative changes (rather, no change in functionality at all in DID implementations)

manu: Markus, I tried to re-do the image (the diagram)
… and tried using Inkscape, and no luck rendering it
… which app did you use?

markus_sabadello: I used LibreOffice Draw I think?
… I have the source files for all of these, I can send you source files. Or I could update them

manu: source files would be good, we should check it into version control

markus_sabadello: yeah, I ran into some problems as well

manu: yeah, just raise a PR to check in the source files

JoeAndrieu: there's a line in there that implies CBOR and YAML are valid representations, and I think that's not quite accurate anymore

manu: ok, sounds good
… that sounds fine. I was trying to imply -- if somebody had say a YAML representation, as long as they could convert to a concrete representation (JSON doc), and then send it to you, it'd be fine

JoeAndrieu: if you wanted to include something like "Internal representations could use other formats", I could support that. As long as we're clear that the thing going over the wire needs to be JSON

manu: I see ok

manu: I'm trying to head off comments on "CBOR is illegal" etc.

JoeAndrieu: well that's the point though - it's not a legal over-the-wire representation
… if you want to call out internal, that's fine
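
The internal versus over-the-wire distinction in this exchange can be pictured with a short Python sketch. It is only an illustration under assumptions: the `InternalDidDocument` class, its fields, and the `@context` value are hypothetical, and the `application/did` media type is the one mentioned in the 03 April discussion of this PR. The point is simply that an implementation may hold whatever structure it likes internally, while what it emits is JSON.

```python
import json
from dataclasses import dataclass, field

# Media type as mentioned earlier in this thread.
WIRE_MEDIA_TYPE = "application/did"

# Assumed @context value, used here purely for illustration.
ASSUMED_CONTEXT = "https://www.w3.org/ns/did/v1"


@dataclass
class InternalDidDocument:
    """Hypothetical in-memory form; it never leaves the process as-is."""

    did: str
    controllers: list = field(default_factory=list)

    def to_wire(self) -> tuple:
        """Convert the internal form to (media type, JSON bytes)."""
        body = {"@context": ASSUMED_CONTEXT, "id": self.did}
        if self.controllers:
            body["controller"] = list(self.controllers)
        return WIRE_MEDIA_TYPE, json.dumps(body).encode("utf-8")


media_type, payload = InternalDidDocument("did:example:123").to_wire()
print(media_type, payload.decode("utf-8"))
```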

pchampin: I thought I'd put a comment on the PR, but looks like not
… I want to recall Ivan's concern about 'generalized JSON-LD processing'
… I looked at the CID document
… and the source document actually distinguishes normal JSON processing from generalized JSON-LD processing
… the 'generalized' part seems very odd to me
… maybe I should try to propose a fix

manu: yeah, this is where this PR is difficult
… since we can't make Class 4 changes, I can't quite say what we want
… we can't get rid of representation changes, since there are normative statements that involve them
… so we can't remove those
… so I had to keep representations language around. we're trying to get it down to just _one_ JSON representation
… and the generalized processing thing is just trying to say -- don't break JSON-LD processors
… you can go field by field JSON processing. but your output shouldn't break JSON-LD
… that's the consensus we were able to get
… for people who just want to look at the JSON, look at the type, do some processing
… so it's fine if you do that, but whatever the outcome is, semantically, it still has to match up with how JSON-LD would've processed it, otherwise, there's a bug
… and that's the language that we reached consensus on with Google and others, who are generally against polyglot processing
… so that's why -- we're trying to not change anything normative.
… with a JSON mechanism, but is 100% compat with JSON-LD
… which would result in no Class 4 changes
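
To make the "plain JSON, but compatible with JSON-LD" idea concrete, here is a minimal Python sketch of field-by-field processing without a JSON-LD library. It rests on assumptions not stated in this thread: the function name `read_did_document` is hypothetical, the expected `@context` URL is an example value, and checking the first `@context` entry is just one plausible way to keep the plain-JSON reading of each property aligned with what a JSON-LD processor would conclude.

```python
import json

# Assumed context URL for this sketch; a real implementation would use
# whatever value the specification it targets requires.
EXPECTED_CONTEXT = "https://www.w3.org/ns/did/v1"


def read_did_document(raw: str) -> dict:
    """Process a DID document as plain JSON (hypothetical helper).

    The @context check is what lets the code treat property names as
    fixed strings: if the context is not the expected one, the terms
    might mean something else under JSON-LD, so the document is rejected.
    """
    doc = json.loads(raw)
    if not isinstance(doc, dict):
        raise ValueError("DID document must be a JSON object")

    context = doc.get("@context")
    contexts = context if isinstance(context, list) else [context]
    if not contexts or contexts[0] != EXPECTED_CONTEXT:
        raise ValueError("unexpected or missing @context")

    if not isinstance(doc.get("id"), str):
        raise ValueError("DID document must have a string id")

    # Field-by-field access, no JSON-LD expansion involved.
    return {
        "id": doc["id"],
        "authentication": doc.get("authentication", []),
    }


example = json.dumps({"@context": [EXPECTED_CONTEXT], "id": "did:example:123"})
print(read_did_document(example))
```

The design choice is to fail closed: a document whose context the code does not recognize is rejected rather than interpreted, since a silent mismatch with JSON-LD semantics is the bug the discussion above warns about.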

ivan: to me, the term 'generalized JSON-LD processing' at that point doesn't bring anything
… that I wouldn't have just by saying 'JSON-LD processing'
… the latter has a clear meaning, via the JSON-LD standard
… the current language uses the term 'generalized JSON-LD proc' _in contrast_ with the type-specific thing
… I don't quite know how to put it clearly, but..
… somehow, I don't know what to do with this term
… I'd simply remove the 'generalized' term

<pchampin> in fact, the text says twice "generalIZED json-ld processing" and twice "general json-ld processing"; I prefer the second wording

<Zakim> manu, you wanted to note that Dave has strong feelings about this. :)

manu: Dave Longley had some pretty strong feelings about this, I'd hesitate to change it
… and I do think Dave has got a point. there are the algorithms in the JSON-LD spec and API spec.
… and that's what we're calling 'generalized JSON-LD processing'. meaning, you can take a generalized JSON-LD processor, you process it, and get a result
… because you can ALSO do non-generalized JSON-LD processing, you can do type-specific
… so, it's ok to not use a JSON-LD library to process the document
… so, one class is - 'generalized JSON-LD' is you're using a 100% standards-compliant JSON-LD library to do processing
… and the OTHER approach is you don't have a JSON-LD processor, you can do whatever manual processing, but as long as the outcome is semantically the same, that's important

ivan: ok, let's move it to the PR discussion
… I'm not entirely convinced

manu: if there's a better language we can use.. hopefully this is a future JSON-LD WG discussion
… there are people that deeply hate JSON-LD libraries, and they're insisting that it's the only way to get proper output,
… and there's another group saying -- no, you can just treat it as JSON, etc
… so, this needs better language.
… and the best language so far was from the VC WG, which Google said they'd be ok with, etc

ottomorac: thanks, we'll come back to this.


@msporny msporny force-pushed the msporny-concrete-dm branch from c464f6a to 251840e Compare April 24, 2025 13:34
@msporny
Member Author

msporny commented Apr 24, 2025

Editorial, multiple reviews, changes requested and made, no objections, merging.

@msporny msporny merged commit 4d46385 into main Apr 24, 2025
1 of 2 checks passed
@msporny msporny deleted the msporny-concrete-dm branch April 24, 2025 13:50
Labels
class 2 Changes that do not functionally affect interpretation of the document
8 participants