Relaxing the restriction to allow POST to create new resources, regardless of their mime types #62

Open
csarven opened this issue Dec 20, 2015 · 9 comments

Comments

@csarven
Member

csarven commented Dec 20, 2015

Trying to HTTP POST with Content-Type text/html and the data being HTML+RDFa, returns HTTP 415.

e.g:

curl -v -X POST -H'Content-Type: text/html; charset=utf-8' -H'Link: <http://www.w3.org/ns/ldp#Resource>; rel="type"' -H"Slug: x.html" https://example.org/ --data '<html></html>'

I've also tried it with ldp:NonRDFSource but still HTTP 415.

https://example.org/ is an ldp:BasicContainer.

Is Gold's implementation related to this (from the LDP Rec):

"6.2.1 LDP servers can support representations beyond those necessary to conform to this specification. These could be other RDF formats, like N3 or NTriples, but non-RDF formats like HTML [HTML401] and JSON [RFC4627] would likely be common. HTTP content negotiation ([RFC7231] Section 3.4 - Content Negotiation) is used to select the format."

I think it is getting tripped over this:

gold/server.go

Line 272 in 40e5ee4

if dataMime != "multipart/form-data" && !dataHasParser && req.Method != "PUT" && req.Method != "HEAD" && req.Method != "OPTIONS" {

and because of not listing text/html as one of the mimeParsers:

gold/mime.go

Line 14 in 40e5ee4

var mimeParser = map[string]string{
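To make the suspected cause concrete, here is a minimal sketch of how the quoted condition behaves for a request that carries a body. The map contents are an assumption pieced together from the media types mentioned in this thread, not a copy of mime.go, and `rejected` is a hypothetical helper name. That a true result leads to HTTP 415 is the reading proposed in this issue, not something confirmed here.

```go
package main

import "fmt"

// mimeParser is an illustrative stand-in for gold's map of media types
// that have a parser. The entries are assumed, not copied from mime.go.
var mimeParser = map[string]string{
	"text/turtle":               "turtle",
	"application/ld+json":       "jsonld",
	"application/json":          "internal",
	"application/sparql-update": "internal",
}

// rejected mirrors the condition quoted from server.go. Under the reading
// given in this issue, true means the request would be answered with 415.
func rejected(method, dataMime string) bool {
	_, dataHasParser := mimeParser[dataMime]
	return dataMime != "multipart/form-data" && !dataHasParser &&
		method != "PUT" && method != "HEAD" && method != "OPTIONS"
}

func main() {
	fmt.Println(rejected("POST", "text/html"))   // true
	fmt.Println(rejected("POST", "text/turtle")) // false
	fmt.Println(rejected("PUT", "text/html"))    // false
}
```

Under these assumptions a POST with text/html is the only one of the three cases that is refused, which matches the behaviour described above: PUT works, POST does not.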

HTML+RDFa is an ldp:RDFSource. Of course, whether the HTML includes RDF(a) or not (HTML could also include Turtle or JSON-LD in <script>) can't be known without looking at the content, so there is some reliance on the text/html or application/xhtml+xml media types. Besides, the spec points at HTML 4.01 (which is rather archaic, but that may need to be mentioned in an erratum).

@csarven changed the title from "Relaxing the restriction to alllow POST to create new resources, regardless of their mime types" to "Relaxing the restriction to allow POST to create new resources, regardless of their mime types" on Dec 20, 2015
@deiu
Contributor

deiu commented Dec 21, 2015

I agree this is needed. I'll try to fix it asap.

@deiu
Contributor

deiu commented Dec 24, 2015

In fact, giving it some thought, I think I will leave it like this for now, at least. If you want to POST new (non-RDF) resources to a container, you can still send a request with multipart/form-data. It's what most file "upload" scripts use these days. This functionality is currently supported in gold.

Here's an example using curl:

curl -F upload=@/foo/bar/picture.jpg https://example.org/container/
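For clients that are not shell scripts, the same kind of request can be sketched in Go with the standard library. This is only an illustration of the multipart/form-data workaround: the field name `upload` and the container URL come from the curl example, while `newUploadRequest` and the file name `x.html` are hypothetical. How gold names or types the stored resource is not shown here.

```go
package main

import (
	"bytes"
	"fmt"
	"mime/multipart"
	"net/http"
)

// newUploadRequest builds a multipart/form-data POST carrying one file part.
// It only constructs the request and does not send it.
func newUploadRequest(url, field, filename string, content []byte) (*http.Request, error) {
	var body bytes.Buffer
	w := multipart.NewWriter(&body)

	// CreateFormFile labels the part itself as application/octet-stream.
	part, err := w.CreateFormFile(field, filename)
	if err != nil {
		return nil, err
	}
	if _, err := part.Write(content); err != nil {
		return nil, err
	}
	// Close writes the terminating boundary and must precede sending.
	if err := w.Close(); err != nil {
		return nil, err
	}

	req, err := http.NewRequest(http.MethodPost, url, &body)
	if err != nil {
		return nil, err
	}
	// The outer Content-Type carries the generated boundary parameter.
	req.Header.Set("Content-Type", w.FormDataContentType())
	return req, nil
}

func main() {
	req, err := newUploadRequest("https://example.org/container/", "upload",
		"x.html", []byte("<html></html>"))
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.Header.Get("Content-Type"))
}
```

Note that the request's Content-Type is multipart/form-data, so the media type of the document itself is no longer what the server-side check sees.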

@deiu closed this as completed on Dec 24, 2015
@csarven
Member Author

csarven commented Dec 24, 2015

The example I gave above was about an RDFSource, not NonRDFSource. I didn't quite understand the reason behind "leave it like this for now", when the original issue is about removing the unnecessary restrictions/limitations.

There is not much point in "upload"ing an HTML+RDFa via multipart/form-data if I can just as well do a PUT (today, that's what I'm doing). What I want to be able to do is have the option to POST an HTML+RDFa document on the same grounds as Turtle (which is another RDFSource). This is useful.

Is there any reason for different treatment here for RDFa and Turtle? Moreover, JSON-LD, JSON, SPARQL-Update get a pass. If gold doesn't want to support RDFa, that's fair enough, but I'd like to know. The POST gives me the ability to use a Slug if I need to or just go with whatever the server decides for me.

@deiu
Contributor

deiu commented Dec 24, 2015

The example you gave above had Content-Type: text/html in it, which is not a recognized mime type for an RDFSource, at least in the LDP world. At least that's what made me think it wasn't an RDFSource document.

What do you mean by "Moreover, JSON-LD, JSON, SPARQL-Update get a pass"? Gold only supports the text/turtle mime type for LDP-style POST.

Yes, the reason is that RDFa is HTML+RDFa markup. AFAIK, and please correct me if I'm mistaken, RDFa is commonly serialized as text/html, and stored in .html documents (which are also treated as text/html by most servers). This makes it semantically different from Turtle, and arguably different from JSON* (which in itself is a data format).

@timbl
Member

timbl commented Dec 24, 2015

@deiu But why not just allow POST to add any resource? At the moment I, like @csarven, don't see why the ability to be posted into a directory shouldn't be given to any resource: HTML, PNG, MPEG, whatever.

@csarven
Member Author

csarven commented Dec 24, 2015

Looking at:

More specifically:

"A snapshot of the state can be expressed as an RDF graph. For example, any web document that has an RDF-bearing representation may be considered an RDF source."

I think that qualifies for any RDF representation e.g., Turtle, RDFa. We can re-serialize RDFa to Turtle or any other RDF graph just the same.

An RDFa document may have Content-Type: text/html or application/xhtml+xml. It can be stored under any filename (with any file extension). Moreover, an HTML+RDFa document doesn't have to be stored in a file per se. The graph/information can originate in an RDF (triple/quad) store, which gets mapped to some HTML before passing to the client. This is in fact a very common practice. What you see in HTML+RDFa is just one serialization (among many) of the RDF graph.

I don't understand what you mean by "data format", or what qualifies as "data" and what doesn't. An HTML+RDFa contains triple statements no different than the others.

I've pointed out JSON-LD, JSON, SPARQL Update because of them being listed in mimeParsers:

gold/mime.go

Line 14 in 40e5ee4

var mimeParser = map[string]string{

which are the exceptions checked in this line:

gold/server.go

Line 272 in 40e5ee4

if dataMime != "multipart/form-data" && !dataHasParser && req.Method != "PUT" && req.Method != "HEAD" && req.Method != "OPTIONS" {
At least that's my reading of it.

@deiu
Contributor

deiu commented Dec 24, 2015

"The graph/information can originate in an RDF (triple/quad) store, which gets mapped to some HTML before passing to the client. This is in fact a very common practice." Sure, but gold does not use a triple store; it uses the file system. An HTML document with RDFa markup is ultimately an HTML document, and gold will treat it as such. A client may decide to parse it as RDFa if it wants, but the document will be served as text/html. A client may also decide to store it on a server running gold, but in doing so it will set the Content-Type header to text/html, to match the mime type of the actual document.

What I'm trying to say is that there are no plans to have gold treat RDFa documents as anything else other than HTML documents.

@deiu
Contributor

deiu commented Dec 24, 2015

@timbl @csarven I will need to check that it doesn't create any conflicts with the LDP part. There might be an easy fix, but then all resources will be treated equally, i.e. gold would no longer differentiate between RDFSource and NonRDFSource.

@deiu reopened this on Dec 24, 2015
@csarven
Member Author

csarven commented Dec 24, 2015

Thanks @deiu. I think this raises a gray area which people have to figure out how to deal with. This is due to what LDP says about supporting other representations:

http://www.w3.org/TR/ldp/#ldp-http-other-representations

And the problematic sentence is:

HTTP content negotiation is used to select the format.

I think what the LDP spec actually had in mind was "vanilla" HTML, not HTML+RDFa. Rightfully so: judging only by the Content-Type, "vanilla" HTML is a NonRDFSource (for all intents and purposes). Since HTML+RDFa is an RDFSource, the spec's intention doesn't hold up. It doesn't actually cover that scenario, since there is a conflict.

As far as I know, we are only typing resources with ldp:Resource, and not with something more specific, i.e., RDFSource or NonRDFSource. So I think the simplest and most immediate change would be to at least let text/html and application/xhtml+xml go through the POST. We don't have to open the gates completely and lose some control over how those two Source types are differentiated. If some new serialization/format comes up, we can deal with that later and add it to the list of exceptions. In fact, an XML family member like SVG may be accompanied by RDF/XML while still using image/svg+xml, even though we can dereference an SVG resource and get an RDF graph out of the box. I don't know about all media types, e.g., PNG, MPEG, but if, for example, they may be accompanied by an RDF graph (and recognized as such), they are probably good enough candidates to be (alternatively) treated as an RDFSource and not only a NonRDFSource.
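A rough sketch of that narrower change, under stated assumptions: `rdfaCarriers`, `hasParser` and `postAllowed` are hypothetical names, the parser set is a stand-in for whatever gold actually registers, and stripping parameters such as `charset` with `mime.ParseMediaType` is an assumption about how the Content-Type would be normalized. It is not gold's code.

```go
package main

import (
	"fmt"
	"mime"
)

// rdfaCarriers lists the two media types proposed above as extra exceptions.
var rdfaCarriers = map[string]bool{
	"text/html":             true,
	"application/xhtml+xml": true,
}

// hasParser is a stand-in for the media types the server can already parse.
var hasParser = map[string]bool{
	"text/turtle": true,
}

// postAllowed reports whether a POST body with this Content-Type header
// would be accepted under the proposed, narrower relaxation.
func postAllowed(contentType string) bool {
	// Drop parameters such as "; charset=utf-8" before comparing.
	mediaType, _, err := mime.ParseMediaType(contentType)
	if err != nil {
		return false
	}
	return mediaType == "multipart/form-data" ||
		hasParser[mediaType] || rdfaCarriers[mediaType]
}

func main() {
	fmt.Println(postAllowed("text/html; charset=utf-8")) // true
	fmt.Println(postAllowed("application/xhtml+xml"))    // true
	fmt.Println(postAllowed("image/png"))                // false
}
```

Keeping the exceptions in a separate set means other media types, such as image/svg+xml, could be added later without opening POST to everything.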


3 participants