webdav: add metalink support
Motivation:

Metalink (see RFC 5854) is a standard XML-based format for describing
how to download a bunch of files.  In addition to providing the URLs, it
includes some file-level metadata; for example, file size and checksums.
There are several programs available that support metalink (e.g.,
aria2).  Metalink/HTTP (see RFC 6249) is a standard approach for
embedding metalink metadata into HTTP response headers.  Metalink/HTTP
also describes how to link a resource to a corresponding metalink
description.

A directory is often used to group together related content and
sometimes people would like to download all of that related content;
i.e., download all files in a directory.

Therefore, providing a metalink description of the contents of a
directory would allow a client (that supports the format) to download
all files from that directory.

Modification:

Refactor content-negotiation support to make it more modular.

Update the GET response of DcacheDirectoryResource to support multiple
formats; triggered by content-negotiation or by a query parameter in the
URL.
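
The selection order described above (query parameter first, then the Accept
header, then the default) can be sketched as follows.  This is only an
illustration under stated assumptions: the class ResponseTypeSketch and its
methods are hypothetical, the parsing is deliberately simplified, and it is
not the Requests.selectResponseType implementation from the patch.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: choose a response media type from an optional "type"
// query parameter, falling back to the Accept header and then to a default.
final class ResponseTypeSketch {
    static final String HTML = "text/html";
    static final String METALINK = "application/metalink4+xml";

    // The default type is listed first so that wildcards such as "*/*" select it.
    static final List<String> SUPPORTED = List.of(HTML, METALINK);
    static final Map<String, String> BY_QUERY_VALUE = Map.of("metalink", METALINK);

    static String select(String typeParam, String accept) {
        String fromQuery = typeParam == null ? null : BY_QUERY_VALUE.get(typeParam);
        if (fromQuery != null) {
            return fromQuery;   // a recognised query parameter wins
        }
        if (accept == null) {
            return HTML;        // no client preference: use the default
        }
        return Arrays.stream(accept.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                // Highest q-value first; the sort is stable, so ties keep client order.
                .sorted(Comparator.comparingDouble(ResponseTypeSketch::qValueOf).reversed())
                // Drop all parameters (including q) before matching.
                .map(s -> s.split(";", 2)[0].trim())
                .flatMap(range -> SUPPORTED.stream().filter(t -> matches(range, t)))
                .findFirst()
                .orElse(HTML);
    }

    // Returns the q-value of one Accept element; 1.0 if absent or malformed.
    static double qValueOf(String element) {
        String[] parts = element.split(";");
        for (int i = 1; i < parts.length; i++) {
            String p = parts[i].trim();
            if (p.startsWith("q=")) {
                try {
                    return Double.parseDouble(p.substring(2));
                } catch (NumberFormatException e) {
                    return 1.0;
                }
            }
        }
        return 1.0;
    }

    // Matches an exact type, a "type/*" range, or the "*/*" wildcard.
    static boolean matches(String range, String type) {
        if (range.equals("*/*")) {
            return true;
        }
        if (range.endsWith("/*")) {
            return type.startsWith(range.substring(0, range.length() - 1));
        }
        return range.equalsIgnoreCase(type);
    }
}
```

In this sketch an unrecognised query-parameter value is ignored and the Accept
header is consulted instead, which mirrors the fallback in getContentType.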

Add the ability to render a directory listing into an XML document that
follows the metalink format.
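
As a rough illustration of what such a rendering might look like, the sketch
below writes a minimal RFC 5854 document using the JDK's StAX writer.  The
class MetalinkSketch, its render method and the (name, size) input map are
assumptions made for this example; the patch's own _factory.metalink code is
not shown here and may differ.

```java
import java.io.StringWriter;
import java.util.Map;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical sketch: render a directory's immediate children as metalink XML.
final class MetalinkSketch {
    // XML namespace defined by RFC 5854.
    static final String NS = "urn:ietf:params:xml:ns:metalink";

    // directoryUrl is assumed to end with '/'; files maps file name to size in bytes.
    // Simplification: file names are appended as-is, without percent-encoding.
    static String render(String directoryUrl, Map<String, Long> files) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xml.writeStartDocument();
            xml.writeStartElement("metalink");
            xml.writeDefaultNamespace(NS);
            for (Map.Entry<String, Long> file : files.entrySet()) {
                xml.writeStartElement("file");
                xml.writeAttribute("name", file.getKey());

                xml.writeStartElement("size");
                xml.writeCharacters(Long.toString(file.getValue()));
                xml.writeEndElement();

                xml.writeStartElement("url");
                xml.writeCharacters(directoryUrl + file.getKey());
                xml.writeEndElement();

                xml.writeEndElement(); // closes <file>
            }
            xml.writeEndElement();     // closes <metalink>
            xml.writeEndDocument();
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Failed to write metalink description", e);
        }
        return out.toString();
    }
}
```

A real description would typically also carry checksums (the hash element),
which this sketch omits.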

NB. This patch provides a valid, working proof-of-concept
implementation. There are (deliberately) some limitations; in
particular, it

  a. includes only the immediate children of the directory: there is no
     recursion,

  b. simply includes an entry for each file; in effect, assuming that all
     files are either public or the client is able to authenticate.

The patch also updates the HTML-based directory GET and HEAD responses
so they include an HTTP "Link" response header that identifies that
directory's corresponding metalink description, as described by RFC
6249.
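
The header value follows the pattern used in the getContentType change.  A
standalone sketch, in which the class LinkHeaderSketch is hypothetical and the
directory URL is assumed to carry no query string of its own:

```java
// Hypothetical sketch: build the Link header value that points a directory's
// HTML page at its metalink description.
final class LinkHeaderSketch {
    static final String METALINK_TYPE = "application/metalink4+xml";

    // Assumes directoryUrl has no existing query string.
    static String metalinkLink(String directoryUrl) {
        String metalinkUrl = directoryUrl + "?type=metalink";
        return String.format("<%s>; rel=describedby; type=\"%s\"", metalinkUrl, METALINK_TYPE);
    }
}
```

For "https://example.org/my-directory/" this yields
<https://example.org/my-directory/?type=metalink>; rel=describedby; type="application/metalink4+xml".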

Result:

The WebDAV endpoint now provides a metalink description of a directory's
(immediate) contents, simplifying the process of downloading files from
a directory.  The description is available either through HTTP content
negotiation or by including a query parameter in the URL.  The HTML page
(describing a directory) also includes a link to the metalink
description.

Target: master
Requires-notes: yes
Requires-book: no
Patch: https://rb.dcache.org/r/14078/
Acked-by: Tigran Mkrtchyan
paulmillar committed Sep 6, 2023
1 parent 707000c commit 1f79ce5
Showing 5 changed files with 486 additions and 96 deletions.
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
 /*
 * dCache - http://www.dcache.org/
 *
-* Copyright (C) 2021 Deutsches Elektronen-Synchrotron
+* Copyright (C) 2021-2023 Deutsches Elektronen-Synchrotron
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU Affero General Public License as
@@ -20,11 +20,8 @@

 import static com.google.common.base.Preconditions.checkArgument;
 import static com.google.common.base.Preconditions.checkState;
-import static java.util.Comparator.comparingDouble;
 import static java.util.Objects.requireNonNull;

-import com.google.common.base.Splitter;
-import com.google.common.collect.Multimaps;
 import com.google.common.net.MediaType;
 import io.milton.http.HrefStatus;
 import io.milton.http.Range;
Expand Down Expand Up @@ -56,15 +53,6 @@ public class AcceptAwareResponseHandler implements WebDavResponseHandler, Buffer

private static final Logger LOGGER = LoggerFactory.getLogger(AcceptAwareResponseHandler.class);

/**
* Describes which handler to prefer if the client's Accept request header highest q-value
* supported selects multiple handlers, of which none are the default handler. This exists
* mostly to provide consistent behaviour: the exact choice (probably) doesn't matter too much.
*/
private static final Comparator<MediaType> PREFERRING_SHORTER_NAMES =
Comparator.<MediaType>comparingInt(m -> m.toString().length())
.thenComparing(Object::toString);

private final Map<MediaType, WebDavResponseHandler> handlers = new HashMap<>();
private MediaType defaultType;
private WebDavResponseHandler defaultHandler;
Expand Down Expand Up @@ -260,84 +248,7 @@ public String generateEtag(Resource r) {

private WebDavResponseHandler selectHandler(Request request) {
String accept = request.getRequestHeader(Request.Header.ACCEPT);

if (accept == null) {
LOGGER.debug("Client did not specify Accept header,"
+ " responding with default MIME-Type \"{}\"", defaultType);
return defaultHandler;
}

LOGGER.debug("Client indicated response preference as \"Accept: {}\"", accept);
var acceptMimeTypes = Splitter.on(',').omitEmptyStrings().trimResults().splitToList(accept);
Comparator<MediaType> preferDefaultType = (MediaType m1, MediaType m2)
-> m1.equals(defaultType) ? -1 : m2.equals(defaultType) ? 1 : 0;

try {
var responseType = acceptMimeTypes.stream()
.map(MediaType::parse)
.sorted(comparingDouble(AcceptAwareResponseHandler::qValueOf).reversed())
.map(AcceptAwareResponseHandler::dropQParameter)
.flatMap(acceptType -> handlers.keySet().stream()
.filter(m -> m.is(acceptType))
.sorted(preferDefaultType.thenComparing(PREFERRING_SHORTER_NAMES)))
.findFirst();

responseType.ifPresent(m -> LOGGER.debug("Responding with MIME-Type \"{}\"", m));

return responseType.map(handlers::get).orElseGet(() -> {
LOGGER.debug("Responding with default MIME-Type \"{}\"", defaultType);
return defaultHandler;
});
} catch (IllegalArgumentException e) {
// Client supplied an invalid media type. Oh well, let's use a default.
LOGGER.debug("Client supplied invalid Accept header \"{}\": {}",
accept, e.getMessage());
return defaultHandler;
}
}

/**
* Filter out the 'q' value from the MIME-Type, if one is present. This is needed because the
* MIME-Type matching requires the server supports all parameters the client supplied, which
* includes the 'q' value. As examples: {@literal "Accept: text/plain" matches
* "text/plain;charset=UTF_8" "Accept: text/plain;charset=UTF_8" matches
* "text/plain;charset=UTF_8" "Accept: text/plain;q=0.5" does NOT match
* "text/plain;charset=UTF_8" } as there is no {@literal q} parameter in the right-hand-side.
* <p>
* Stripping off the q value allows {@literal Accept: text/plain;q=0.5} (matched as {@literal
* text/plain}) to match {@literal text/plain;charset=UTF_8}.
*/
private static MediaType dropQParameter(MediaType acceptType) {
var params = acceptType.parameters();

MediaType typeWithoutQ;
if (params.get("q").isEmpty()) {
LOGGER.debug("MIME-Type \"{}\" has no q-value", acceptType);
typeWithoutQ = acceptType;
} else {
var paramsWithoutQ = Multimaps.filterKeys(params, k -> !k.equals("q"));
typeWithoutQ = acceptType.withParameters(paramsWithoutQ);
LOGGER.debug("Stripping q-value from MIME-Type \"{}\" --> \"{}\"",
acceptType, typeWithoutQ);
}

return typeWithoutQ;
}

private static float qValueOf(MediaType m) {
List<String> qValues = m.parameters().get("q");

if (qValues.isEmpty()) {
return 1.0f;
}

String lastQValue = qValues.get(qValues.size() - 1);
try {
return Float.parseFloat(lastQValue);
} catch (NumberFormatException e) {
LOGGER.debug("MIME-Type \"{}\" has invalid q value: {}", m,
lastQValue);
return 1.0f;
}
var type = Requests.selectResponseType(accept, handlers.keySet(), defaultType);
return handlers.get(type);
}
}
Original file line number Diff line number Diff line change
@@ -5,6 +5,7 @@
 import static org.dcache.namespace.FileAttribute.STORAGEINFO;

 import com.google.common.collect.ImmutableSet;
+import com.google.common.net.MediaType;
 import diskCacheV111.services.space.Space;
 import diskCacheV111.services.space.SpaceException;
 import diskCacheV111.util.CacheException;
@@ -20,6 +21,7 @@
 import io.milton.http.LockToken;
 import io.milton.http.Range;
 import io.milton.http.Request;
+import io.milton.http.Response;
 import io.milton.http.exceptions.BadRequestException;
 import io.milton.http.exceptions.ConflictException;
 import io.milton.http.exceptions.NotAuthorizedException;
@@ -39,6 +41,7 @@
 import java.io.OutputStreamWriter;
 import java.io.UnsupportedEncodingException;
 import java.io.Writer;
+import java.net.URI;
 import java.net.URISyntaxException;
 import java.util.ArrayList;
 import java.util.Collections;
@@ -47,6 +50,7 @@
 import java.util.Optional;
 import javax.xml.namespace.QName;
 import org.dcache.space.ReservationCaches;
+import javax.xml.stream.XMLStreamException;
 import org.dcache.vehicles.FileAttributes;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
@@ -60,6 +64,13 @@ public class DcacheDirectoryResource
 MakeCollectionableResource, LockingCollectionResource,
 MultiNamespaceCustomPropertyResource {

+/**
+* An EntityWriter provides the entity (i.e., the contents) of a GET request.
+*/
+private interface EntityWriter {
+public void writeEntity(Writer writer) throws InterruptedException, CacheException, IOException;
+};
+
 private static final Logger LOGGER = LoggerFactory.getLogger(DcacheDirectoryResource.class);

 private static final String DAV_NAMESPACE_URI = "DAV:";
@@ -73,6 +84,16 @@ public class DcacheDirectoryResource
 private static final PropertyMetaData READONLY_LONG = new PropertyMetaData(READ_ONLY,
 Long.class);

+private static final MediaType DEFAULT_ENTITY_TYPE = MediaType.HTML_UTF_8;
+private static final MediaType METALINK_ENTITY_TYPE = MediaType.create("application", "metalink4+xml");
+
+private final Map<MediaType,EntityWriter> supportedMediaTypes = Map.of(
+DEFAULT_ENTITY_TYPE, this::htmlEntity,
+METALINK_ENTITY_TYPE, this::metalinkEntity);
+
+private final Map<String,MediaType> supportedResponseMediaTypes = Map.of(
+"metalink", METALINK_ENTITY_TYPE);
+
 private final boolean _allAttributes;

 public DcacheDirectoryResource(DcacheResourceFactory factory,
Expand Down Expand Up @@ -157,9 +178,10 @@ public void sendContent(OutputStream out, Range range,
throws IOException, NotAuthorizedException {
try {
Writer writer = new OutputStreamWriter(out, UTF_8);
if (!_factory.deliverClient(_path, writer)) {
_factory.list(_path, writer);
}
MediaType type = MediaType.parse(contentType);
EntityWriter entityWriter = Optional.ofNullable(supportedMediaTypes.get(type))
.orElseThrow();
entityWriter.writeEntity(writer);
writer.flush();
} catch (PermissionDeniedCacheException e) {
throw WebDavExceptions.permissionDenied(this);
@@ -171,14 +193,65 @@ public void sendContent(OutputStream out, Range range,
 }
 }

+private void htmlEntity(Writer writer) throws IOException, InterruptedException,
+CacheException {
+if (_factory.deliverClient(_path, writer)) {
+return;
+}
+
+_factory.list(_path, writer);
+}
+
+private void metalinkEntity(Writer writer) throws IOException,
+InterruptedException, CacheException {
+Request request = HttpManager.request();
+// NB. Milton ensures directory URLs end with a '/' by issuing a redirection if not.
+URI uri = URI.create(request.getAbsoluteUrl());
+try {
+_factory.metalink(_path, writer, uri);
+} catch (XMLStreamException e) {
+throw new WebDavException("Failed to write metalink description: " + e, this);
+}
+}
+
 @Override
 public Long getMaxAgeSeconds(Auth auth) {
 return null;
 }

 @Override
 public String getContentType(String accepts) {
-return "text/html; charset=utf-8";
+Request request = HttpManager.request();
+Map<String,String> params = request.getParams();
+MediaType type = Optional.ofNullable(params)
+.map(p -> p.get("type"))
+.flatMap(Optional::ofNullable)
+.map(supportedResponseMediaTypes::get)
+.flatMap(Optional::ofNullable)
+.orElseGet(() -> Requests.selectResponseType(accepts,
+supportedMediaTypes.keySet(), DEFAULT_ENTITY_TYPE));
+
+// We must set the "Link" HTTP response header here, as we want it to appear for both HEAD
+// and GET requests, and (not unreasonably) Milton doesn't call sendContent for HEAD requests.
+if (type.equals(DEFAULT_ENTITY_TYPE)) {
+/* There is a slight subtlety here. A GET request that targets a directory with a
+* non-trailing-slash URL (e.g., "https://example.org/my-directory") triggers Milton to
+* issue a redirection to the corresponding trailing-slash URL
+* (e.g., "https://example.org/my-directory/"). This redirection does not happen for
+* HEAD requests. Therefore, metalinkUrl may be a non-trailing-slash URL for HEAD
+* requests, while this cannot happen for GET requests.
+*
+* A non-trailing-slash metalinkUrl value is not a problem as a corresponding GET
+* request will trigger a similar redirection (to the equivalent trailing-slash URL)
+* while preserving the query parameter.
+*/
+String metalinkUrl = HttpManager.request().getAbsoluteUrl() + "?type=metalink";
+String linkValue = String.format("<%s>; rel=describedby; type=\"%s\"", metalinkUrl,
+METALINK_ENTITY_TYPE);
+HttpManager.response().setNonStandardHeader("Link", linkValue);
+}
+
+return type.toString();
 }

 @Override