views: FAIR signposting level 1 support (HTTP Link headers) #2938

ptamarit · 2024-12-10T15:07:00Z

❤️ Thank you for your contribution!

Description

Fixes FAIR signposting level 1 support (HTTP Link headers & link rel item) #2937
Depends on views: FAIR signposting level 1 support & remove linkset link to itself invenio-rdm-records#1908
See description in linked ticket

Checklist

Ticks in all boxes and 🟢 on all GitHub actions status checks are required to merge:

I'm aware of the code of conduct.
I've created logical separate commits and followed the commit message format.
I've added relevant test cases.
I've added relevant documentation.
I've marked translation strings.
I've identified the copyright holder(s) and updated copyright headers for touched files (>15 lines contributions).
I've NOT included third-party code (copy/pasted source code or new dependencies).
- If you have added third-party code (copy/pasted or new dependencies), please reach out to an architect.

Frontend

I've followed the CSS/JS and React guidelines.
I've followed the web accessibility guidelines.
I've followed the user interface guidelines.

Reminder

By using GitHub, you have already agreed to GitHub’s Terms of Service, including that:

You license your contribution under the same terms as the current repository’s license.
You agree that you have the right to license your contribution under the current repository’s license.

ptamarit · 2024-12-10T15:08:32Z

invenio_app_rdm/records_ui/views/decorators.py

+def _get_signposting_authors(record):
+    authors = []
+    # Limit authors to the first 10.
+    for creator in islice(record["metadata"]["creators"], 0, 10):
+        for identifier in creator["person_or_org"].get("identifiers", []):
+            if identifier["scheme"] == "orcid":
+                authors.append(
+                    _get_header(
+                        "author", "https://orcid.org/" + identifier["identifier"]
+                    )
+                )
+    return authors


Lars suggested that we might choose to not include authors at all since the list might be long and the full list can be found in the linkset.

Maybe we could apply some sensible limit? E.g. if there are fewer than 50 authors, include them; otherwise don't include any and have people rely on the explicit authors linkset?

~~[ ] Include authors if up to 50, otherwise do not include.~~

I'm now relying on invenio_rdm_records/resources/serializers/signposting/schema.py's serialize_author which serializes all the authors.
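
For reference, the limit discussed above (include author links only when a record has fewer than 50 creators) could be sketched roughly as follows. This is an illustration of the dropped proposal, not the merged code: `author_links` and `MAX_AUTHORS` are hypothetical names, and the record layout is assumed to mirror the diff above.

```python
MAX_AUTHORS = 50  # hypothetical threshold taken from the discussion


def author_links(record, max_authors=MAX_AUTHORS):
    """Return ORCID URLs for creators, or [] when there are too many creators."""
    creators = record["metadata"]["creators"]
    if len(creators) >= max_authors:
        # Too many authors: clients are expected to use the linkset instead.
        return []
    links = []
    for creator in creators:
        for identifier in creator["person_or_org"].get("identifiers", []):
            if identifier["scheme"] == "orcid":
                links.append("https://orcid.org/" + identifier["identifier"])
    return links
```

The all-or-nothing behaviour avoids advertising a truncated author list, which a client could mistake for the complete one.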

ptamarit · 2024-12-10T15:09:18Z

invenio_app_rdm/records_ui/views/decorators.py

+                    _get_header(
+                        "author", "https://orcid.org/" + identifier["identifier"]
+                    )
+                )


Should we support other schemes like ROR, etc?

It might be that the safer option would be to use something like idutils.to_url(identifier, scheme) which will consistently produce a link.

~~[ ] Use idutils.to_url for authors.~~

I'm now relying on invenio_rdm_records/resources/serializers/signposting/schema.py's serialize_author which picks the first linkable ID.
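
The "first linkable ID" idea can be sketched in isolation like this. It is only an approximation under stated assumptions: `URL_PREFIXES` and `first_linkable_url` are hypothetical, and the prefix table stands in for what a call such as `idutils.to_url(identifier, scheme)` would provide in practice.

```python
# Hypothetical scheme-to-URL-prefix table; a real implementation would
# more likely delegate to something like idutils.to_url(identifier, scheme).
URL_PREFIXES = {
    "orcid": "https://orcid.org/",
    "ror": "https://ror.org/",
}


def first_linkable_url(person_or_org):
    """Return a URL for the first identifier with a known scheme, else None."""
    for identifier in person_or_org.get("identifiers", []):
        prefix = URL_PREFIXES.get(identifier["scheme"])
        if prefix:
            return prefix + identifier["identifier"]
    return None
```

Identifiers whose scheme is not in the table are skipped rather than raising, so a creator with only non-linkable IDs simply produces no author link.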

ptamarit · 2024-12-10T15:11:44Z

invenio_app_rdm/records_ui/views/decorators.py

+        # then try to get the optional `link` from the custom license.
+        url = right.get("props", {}).get("url") or right.get("link")
+        if url:
+            licenses.append(_get_header("license", url))


The FAIR Signposting docs recommend using an SPDX license identifier (e.g. https://spdx.org/licenses/CC0-1.0).
However, in Zenodo we store URLs like https://creativecommons.org/publicdomain/zero/1.0/legalcode and not spdx.org URLs.

If props["scheme"] == "spdx", I think we can safely generate the URL as https://spdx.org/licenses/{right["id"]}. We might also have custom (or otherwise non-SPDX) licenses, in which case just using url as done here would be OK.

Unfortunately our IDs are lower-cased (e.g. antlr-pd-fallback) while the SPDX URLs are mixed-case and case-sensitive (e.g. https://spdx.org/licenses/ANTLR-PD-fallback.html).

Ouch, I tried in the browser and copy-pasting URLs for some reason kept the original case... Ok, this is a bummer, I think we'll have to add the original spdx ID with the exact case as a props.spdx_id field or similar...

I think it would be fine to shelve this and just use the url; it depends on whether we want to spend more time re-importing SPDX and updating the existing license vocabulary (funnily, the dump we have is more than a year old).
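
If the case-preserving field were added later, the selection logic might look roughly like this. It is a sketch only: `license_url` is a hypothetical helper, and `props["spdx_id"]` is the field proposed in this thread, which does not exist in the current vocabulary.

```python
def license_url(right):
    """Pick a URL for a license entry, preferring an SPDX URL when possible."""
    props = right.get("props", {})
    # `spdx_id` is the hypothetical case-preserving field proposed above.
    spdx_id = props.get("spdx_id")
    if props.get("scheme") == "spdx" and spdx_id:
        return f"https://spdx.org/licenses/{spdx_id}"
    # Fall back to the stored URL, then to the custom license `link`.
    return props.get("url") or right.get("link")
```

Without `spdx_id` the function behaves like the line under review, so it could be introduced before the vocabulary is re-imported.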

ptamarit · 2024-12-10T15:13:48Z

invenio_app_rdm/records_ui/views/decorators.py

+
+def _get_signposting_linkset(pid_value):
+    api_url = record_url_for(_app="api", pid_value=pid_value)
+    return _get_header("linkset", api_url, "application/linkset+json")


Note: this is required for level 2 support and was already added in a previous pull request.
Here we only include a link of type "application/linkset+json", but the docs also require a link of type "application/linkset".
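
Emitting both media types could be sketched as below. `format_link` and `linkset_links` are hypothetical helpers (the PR's own `_get_header` is not shown in this thread), the output format copies the one asserted in the tests of this PR, and serving both types from the same API URL is an assumption.

```python
def format_link(rel, target, media_type=None):
    """Format one Link header value in the style used by the tests in this PR."""
    parts = [f"<{target}>", f'rel="{rel}"']
    if media_type:
        parts.append(f'type="{media_type}"')
    return " ; ".join(parts)


def linkset_links(api_url):
    """Return linkset links for both media types (assumes one shared URL)."""
    return [
        format_link("linkset", api_url, "application/linkset+json"),
        format_link("linkset", api_url, "application/linkset"),
    ]
```

The two values would then be joined with the other signposting links into a single `Link` header.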

ptamarit · 2024-12-10T15:16:04Z

invenio_app_rdm/records_ui/views/decorators.py

+        ],
+        resource_type["id"],
+    )
+    url_schema_org = props.get("schema.org")


Not sure if there's a better way to do this lookup.
I followed what's done in invenio_rdm_records/resources/serializers/signposting/schema.py.

Perhaps just check that it is cached so we don't query db on every landing page request

From what I see these are indeed cached here, which is also mentioned in get_vocabulary_props.
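
The caching concern can be illustrated in isolation. This sketch does not reproduce `get_vocabulary_props`; `lookup_props`, `FAKE_DB` and the `calls` counter are hypothetical stand-ins that only show why repeated landing page requests need not hit the database.

```python
from functools import lru_cache

# Hypothetical stand-in for the vocabulary table.
FAKE_DB = {"image-photo": {"schema.org": "https://schema.org/Photograph"}}
calls = {"db": 0}  # counts simulated database reads


@lru_cache(maxsize=128)
def lookup_props(resource_type_id):
    """Fetch props once per id; later calls are served from the cache."""
    calls["db"] += 1
    # Return an immutable value so callers cannot mutate the cached entry.
    return tuple(sorted(FAKE_DB.get(resource_type_id, {}).items()))


lookup_props("image-photo")
lookup_props("image-photo")  # served from the cache, no second read
```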

ptamarit · 2024-12-10T15:18:03Z

invenio_app_rdm/records_ui/views/decorators.py

This is quite a lot of methods added to decorators.py, should it be moved to a signposting-specific file?

👍 Agree, I thought there was already some signposting-related directory.

~~[ ] Move the code to a signposting-related file or directory.~~

There is much less code in decorators.py now that I rely on invenio_rdm_records/resources/serializers/signposting/schema.py.

slint

LGTM, some minor comments only

slint · 2024-12-11T16:14:36Z

invenio_app_rdm/records_ui/views/decorators.py

+                    _get_header(
+                        "author", "https://orcid.org/" + identifier["identifier"]
+                    )
+                )


It might be that the safer option would be to use something like idutils.to_url(identifier, scheme) which will consistently produce a link.

slint · 2024-12-11T16:15:40Z

invenio_app_rdm/records_ui/views/decorators.py

+def _get_signposting_authors(record):
+    authors = []
+    # Limit authors to the first 10.
+    for creator in islice(record["metadata"]["creators"], 0, 10):
+        for identifier in creator["person_or_org"].get("identifiers", []):
+            if identifier["scheme"] == "orcid":
+                authors.append(
+                    _get_header(
+                        "author", "https://orcid.org/" + identifier["identifier"]
+                    )
+                )
+    return authors


Maybe we could apply some sensible limit? E.g. if there are fewer than 50 authors, include them; otherwise don't include any and have people rely on the explicit authors linkset?

slint · 2024-12-11T16:18:21Z

invenio_app_rdm/records_ui/views/decorators.py

+        # then try to get the optional `link` from the custom license.
+        url = right.get("props", {}).get("url") or right.get("link")
+        if url:
+            licenses.append(_get_header("license", url))


If props["scheme"] == "spdx", I think we can safely generate the URL as https://spdx.org/licenses/{right["id"]}. We might also have custom (or otherwise non-SPDX) licenses, in which case just using url as done here would be OK.

slint · 2024-12-11T16:22:56Z

tests/ui/test_signposting_ui.py

+    api_url = f"https://127.0.0.1:5000/api/records/{record_with_file.id}"
+    filename = "article.txt"
+
+    res = client.head(f"/records/{record_with_file.id}")


question/comment: I think the HEAD implementation for Flask/Invenio is that we just treat it as a GET request and skip the body of the response. In that case, we're not saving anything in terms of computation/performance (if that was the original goal of just testing the HEAD response only).

IMHO, it's ok to keep as is, since none of the logic done for generating the header links is that much more complex or adds that big of an overhead compared to the rest of the GET response.

The Link header should be included in both GET and HEAD responses, as the FAIR Signposting docs state:

In addition to being available via HTTP GET requests, the HTTP header that contains Link is accessible via the HTTP HEAD request, which only returns transaction metadata not a resource representation. As such machine agents can obtain a map for their journey by issuing a HTTP HEAD even against resources that have access restrictions. All the while saving bandwidth and hence energy.

Modify the tests to assert not only HEAD, but also GET.
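
The shape of such a check can be sketched with the standard library alone. In the real test suite this would use `client.head(...)` and `client.get(...)`; here `app`, `call` and `LINK` are hypothetical, and the toy WSGI app only mimics the "HEAD is GET without a body" behaviour described above.

```python
from wsgiref.util import setup_testing_defaults

LINK = '<https://example.org/api/records/1> ; rel="linkset" ; type="application/linkset+json"'


def app(environ, start_response):
    """Toy WSGI app: same headers for GET and HEAD, body only for GET."""
    start_response("200 OK", [("Content-Type", "text/html"), ("Link", LINK)])
    if environ["REQUEST_METHOD"] == "HEAD":
        return [b""]
    return [b"<html>landing page</html>"]


def call(method):
    """Invoke the app directly and return (status, headers, body)."""
    environ = {}
    setup_testing_defaults(environ)
    environ["REQUEST_METHOD"] = method
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


# The same Link header must be present for both methods.
for method in ("HEAD", "GET"):
    status, headers, _ = call(method)
    assert status == "200 OK"
    assert headers["Link"] == LINK
```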

ptamarit · 2024-12-16T10:25:42Z

invenio_app_rdm/records_ui/views/decorators.py

+def add_signposting_content_resources(f):
+    """Add signposting links to the content resources view's response headers."""

    @wraps(f)
    def view(*args, **kwargs):
        response = make_response(f(*args, **kwargs))

        # Relies on other decorators having operated before it
        pid_value = kwargs["pid_value"]
-        signposting_link = record_url_for(_app="api", pid_value=pid_value)

-        response.headers["Link"] = (
-            f'<{signposting_link}> ; rel="linkset" ; type="application/linkset+json"'  # fmt: skip
-        )
+        signposting_headers = [
+            _get_signposting_collection(pid_value),
+            _get_signposting_linkset(pid_value),
+        ]
+
+        response.headers["Link"] = " , ".join(signposting_headers)
+
+        return response
+
+    return view
+
+
+def add_signposting_metadata_resources(f):
+    """Add signposting links to the metadata resources view's response headers."""
+
+    @wraps(f)
+    def view(*args, **kwargs):
+        response = make_response(f(*args, **kwargs))
+
+        # Relies on other decorators having operated before it
+        pid_value = kwargs["pid_value"]
+
+        signposting_headers = [
+            _get_signposting_describes(pid_value),
+            _get_signposting_linkset(pid_value),
+        ]
+
+        response.headers["Link"] = " , ".join(signposting_headers)
+


Note that, unlike the Landing Page, the Content Resources and Metadata Resources do not rely on invenio_rdm_records.resources.serializers.signposting because:

ContentResourceSchema and MetadataResourceSchema expect the record to be passed via context={"record_dict"} which makes it more difficult to reuse here.

The logic is pretty simple to add only the collection, describes and linkset headers, so re-implementing it here is not that bad.

ptamarit · 2024-12-20T09:25:27Z

tests/ui/test_signposting_ui.py

+        # The test record does not have a license.
+        '<https://schema.org/Photograph> ; rel="type"',
+        '<https://schema.org/AboutPage> ; rel="type"',
+        f'<{api_url}> ; rel="linkset" ; type="application/linkset+json"',


The logic for the landing page is implemented in FAIRSignpostingProfileLvl1Serializer in invenio-rdm-records and is already tested there (see inveniosoftware/invenio-rdm-records#1908).
It still makes sense to at least issue the HTTP call to the endpoint here, to make sure that the decorator is working properly, but maybe the assertion should be less detailed to avoid having to adapt this test every time we modify the other module?
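
One way to loosen the assertion is to check only which link relations are present, leaving exact targets to the tests in invenio-rdm-records. A minimal sketch, assuming the header format shown in the diff above; `LINK_RE` and `rels_in` are hypothetical, and the regex is not a complete Link header parser.

```python
import re

# Matches entries such as: <https://example.org> ; rel="type"
LINK_RE = re.compile(r'<([^>]*)>\s*;\s*rel="([^"]*)"')


def rels_in(link_header):
    """Return the set of rel values found in a Link header value."""
    return {rel for _target, rel in LINK_RE.findall(link_header)}


header = " , ".join([
    '<https://schema.org/AboutPage> ; rel="type"',
    '<https://example.org/api/records/1> ; rel="linkset" ; type="application/linkset+json"',
])
# Subset check: extra relations added upstream do not break the test.
assert {"type", "linkset"} <= rels_in(header)
```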

views: FAIR signposting level 1 support

0042190

views: FAIR signposting level 1 support (handle disabled files)

15672de

ptamarit force-pushed the 2937-fair-signposting-level-1 branch from 0929cec to 15672de Compare December 10, 2024 15:55

ptamarit changed the title ~~views: FAIR signposting level 1 support~~ views: FAIR signposting level 1 support (HTTP Link headers) Dec 10, 2024

slint approved these changes Dec 11, 2024

tests: FAIR signposting level 1 support (check HEAD and GET)

17401b6

ptamarit marked this pull request as draft December 13, 2024 07:53

views: FAIR signposting level 1 support (relying on FAIRSignpostingPr…

2e533ed

…ofileLvl1Serializer)

ptamarit marked this pull request as ready for review December 13, 2024 16:22

ptamarit mentioned this pull request Dec 16, 2024

views: FAIR signposting level 1 support & remove linkset link to itself inveniosoftware/invenio-rdm-records#1908

Open

10 tasks
