SOLR-17351: Decompose filestore "get file" API #3047

gerlowskija · 2025-01-20T15:07:31Z

https://issues.apache.org/jira/browse/SOLR-17351

Description

Solr's filestore "get file" API actually covers 4 or 5 distinct operations. Depending on provided query parameters, it can:

- return filestore entry metadata (when meta=true is specified)
- instruct the receiving Solr node to pull a file from another node's filestore and cache it locally (when getFrom=someOtherNode is specified)
- instruct the receiving Solr node to push its cached copy of a file out to all other Solr nodes (when sync=true is specified)
- return raw file contents
- return JSON-ified file contents

This makes the code for this endpoint somewhat complex. It also makes it hard to model and parse the response on the client side.

Solution

This PR splits up the "get file" endpoint into a number of different APIs. Specifically:

- metadata-fetching has been moved out to its own endpoint, GET /api/cluster/filestore/metadata/some/path.txt
- Filestore commands such as pushing/pulling files are now available at: POST /api/cluster/filestore/commands
- Support for "JSON-ified" file data has been dropped in this PR (it will be retained, but deprecated, in the eventual 9.x backport)

These divisions allow us to generate SolrRequest/SolrResponse classes representing these APIs, meaning that SolrJ users no longer need to use GenericSolrRequest/GenericSolrResponse.
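
As a rough illustration of the split described above, the sketch below builds plain HTTP requests for the metadata and file endpoints using the JDK's `java.net.http` classes rather than the generated SolrJ classes. The class and method names are hypothetical, the host and port are assumed to be a local node on the default port, and serving raw file contents via GET under `/files` is an assumption based on the paths mentioned in this PR. The commands endpoint is left out because its shape changed during review.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, not part of Solr or SolrJ.
class FilestoreEndpointSketch {
  // Assumed base URL: a local Solr node on the default port.
  static final String BASE = "http://localhost:8983/api/cluster/filestore";

  /** Builds a GET request for the metadata of a filestore entry. */
  static HttpRequest metadataRequest(String filePath) {
    return HttpRequest.newBuilder(URI.create(BASE + "/metadata" + filePath)).GET().build();
  }

  /** Builds a GET request for the file itself (assumed to return the raw contents). */
  static HttpRequest fileRequest(String filePath) {
    return HttpRequest.newBuilder(URI.create(BASE + "/files" + filePath)).GET().build();
  }

  public static void main(String[] args) {
    // Only prints the request targets; nothing is sent over the network.
    System.out.println(metadataRequest("/some/path.txt").uri());
    System.out.println(fileRequest("/some/path.txt").uri());
  }
}
```

Because each operation now has its own path, a client can tell from the URL alone which response shape to expect, which is what makes typed request/response classes feasible.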

Tests

Some rewriting in TestDistribFileStore. Existing package and filestore tests continue to pass.

Checklist

Please review the following and check all that apply:

I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
I have created a Jira issue and added the issue ID to my pull request title.
I have given Solr maintainers access to contribute to my PR branch. (optional but recommended, not available for branches on forks living under an organisation)
I have developed this patch against the main branch.
I have run ./gradlew check.
I have added tests for my changes.
I have added documentation for the Reference Guide

Tests pass

Identical to the pre-existing `/cluster/files/a/b.txt?meta=true`, but in its own endpoint. Does NOT remove the existing endpoint or switch tests over to use it, but we will want to do this eventually on `main`.

Identical to the pre-existing `/node/files/a/b.txt`, but in its own endpoint. Does NOT remove the existing endpoint or switch tests over to use it. Those can be handled in a separate commit to enable this one to be easily backported later to 9.x if desired. This also doesn't enable codegen for the new API, though I think the blockers are sufficiently cleared up now to enable that if desired.

Still no modifications to use the API internally, or removal of the old endpoint, in order to ease backporting.

This should probably be pulled into a separate PR/branch, and modified to address all relevant endpoints and other loose ends of SOLR-17562.

TestDistribFileStore fails with the changes in this commit. The changes themselves are fine, but it fails because of a pre-existing *ahem* feature in SolrClient where SolrParams are encoded as form-params instead of query-params in some circumstances. I could probably hack around this for the moment (e.g. by using a builder setter to ensure 'getFrom' is sent as a true query param). But I've decided to halt progress on this branch and break some other pieces out into their own PRs, while I send some questions out to the community on this front.

Ensures v2 POST query-params aren't put in a 'form' body.
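
To make the query-param versus form-param distinction concrete, here is a minimal sketch using the JDK's `java.net.http` classes (not SolrJ, whose internals are not shown in this PR). The class and method names are hypothetical, and the endpoint URL assumes a local node plus the commands path from the PR description. The same `getFrom` value is sent either in the URL or in a form-encoded body. An endpoint that only reads the query string would not see the second form.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration, not SolrClient code.
class QueryVsFormParamsSketch {
  // Assumed target: local node, commands path as given in the PR description.
  static final String ENDPOINT = "http://localhost:8983/api/cluster/filestore/commands";

  static String encode(String name, String value) {
    return URLEncoder.encode(name, StandardCharsets.UTF_8)
        + "="
        + URLEncoder.encode(value, StandardCharsets.UTF_8);
  }

  /** The parameter travels in the URL's query string; the POST body is empty. */
  static HttpRequest asQueryParam(String name, String value) {
    return HttpRequest.newBuilder(URI.create(ENDPOINT + "?" + encode(name, value)))
        .POST(HttpRequest.BodyPublishers.noBody())
        .build();
  }

  /** The parameter travels in a form-encoded body; the URL has no query string. */
  static HttpRequest asFormParam(String name, String value) {
    return HttpRequest.newBuilder(URI.create(ENDPOINT))
        .header("Content-Type", "application/x-www-form-urlencoded")
        .POST(HttpRequest.BodyPublishers.ofString(encode(name, value)))
        .build();
  }

  public static void main(String[] args) {
    // Nothing is sent; this only shows where the parameter ends up.
    System.out.println("query-param URI: " + asQueryParam("getFrom", "someOtherNode").uri());
    System.out.println("form-param URI:  " + asFormParam("getFrom", "someOtherNode").uri());
  }
}
```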

gerlowskija · 2025-01-20T15:09:31Z

Still TODO: ref-guide updates, additional testing.

epugh · 2025-01-20T15:39:02Z

Does this impact #3031 ? Since I am using cc.getFileStore I don't think so, but wanted to check.

gerlowskija · 2025-01-20T21:28:19Z

That PR uses the Java interface, and not the HTTP APIs directly, so it should be fine afaict.

epugh · 2025-01-21T18:58:48Z

solr/api/src/java/org/apache/solr/client/api/endpoint/ClusterFileStoreApis.java

+  @POST
+  @Operation(
+      summary =
+          "Pushes a file to other nodes, or pulls a file from other nodes in the Solr cluster.",


Does the fact that one thing does a push or a pull suggest an issue to fix? https://en.wikipedia.org/wiki/List_of_Doctor_Dolittle_characters#Pushmi-Pullyu

"Syncs a file via either pushing or pulling across the nodes in the Solr cluster." ???


epugh · 2025-01-21T18:59:46Z

solr/api/src/java/org/apache/solr/client/api/endpoint/ClusterFileStoreApis.java

+          String getFrom,
+      @Parameter(
+              description =
+                  "If true, triggers syncing for this file across all nodes in the filestore")


maybe its own endpoint? Or, do we somehow eliminate the need for this?



epugh · 2025-01-21T19:25:13Z

solr/core/src/java/org/apache/solr/filestore/ClusterFileStore.java

@@ -194,6 +301,48 @@ public SolrJerseyResponse deleteFile(String filePath, Boolean localDelete) {
    return response;
  }

+  @Override
+  @PermissionName(PermissionNameProvider.Name.FILESTORE_WRITE_PERM)
+  public SolrJerseyResponse executeFileStoreCommand(String path, String getFrom, Boolean sync) {


aren't we trying to get rid of the "command" pattern in our new APIs?


epugh · 2025-01-21T19:30:12Z

solr/core/src/java/org/apache/solr/filestore/NodeFileStore.java

-      } catch (IOException e) {
-        throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "Error getting file ", e);
-      }
+      ClusterFileStore.syncToAllNodes(fileStore, path);


I've been a bit confused between NodeFileStore, ClusterFileStore, DistribFileStore ;-).


Alright, the most recent iteration removes NodeFileStore/NodeFileStoreApi. There's still some room for confusion between ClusterFileStoreApi, ClusterFileStore, and DistribFileStore, but overall this PR leaves things better than it found it in that regard 👍

epugh · 2025-01-21T19:31:08Z

solr/core/src/java/org/apache/solr/filestore/NodeFileStore.java

-              .collect(Collectors.toList());
-      directoryListingResponse.files = Collections.singletonMap(path, directoryContents);
-      return directoryListingResponse;
+    if (type == FileStore.FileType.NOFILE


this feels like overly finicky logic.. why is it so hard to decide when to get metadata? What are we missing that doesn't just make this simple? A thing has a meta thing. That should be the only way.


epugh · 2025-01-21T19:31:42Z

solr/core/src/java/org/apache/solr/filestore/NodeFileStore.java

-                  throw new SolrException(
-                      SolrException.ErrorCode.SERVER_ERROR, "Error reading file " + pathCopy);
+    // User wants to get the "raw" file
+    // TODO Should we be trying to json-ify otherwise "raw" files in this way?  It seems like a


What is the use case FOR this? Do we have one right now?

Not that I know of. I'm going to nuke it based on the discussion here, and that might flush out something I don't understand.

(We'll need to make sure we're only deprecating it in the 9.x backport, but that's not a big deal...)

There's no discussion of it either in the PR or the JIRA that introduced it (unless I'm missing something?). My guess is that it was an attempt at convenience?

In any case; I think this entire file can now go. See my comment above for more details.

epugh · 2025-01-21T19:32:39Z

solr/core/src/java/org/apache/solr/packagemanager/PackageManager.java

@@ -173,7 +173,7 @@ public void uninstall(String packageName, String version)
        String.format(Locale.ROOT, "/package/%s/%s/%s", packageName, version, "manifest.json"));
    for (String filePath : filesToDelete) {
      DistribFileStore.deleteZKFileEntry(zkClient, filePath);
-      String path = "/api/cluster/files" + filePath;
+      String path = "/api/cluster/filestore/files" + filePath;


i like the more verbose path I think... Can a filestore have anything other than files? If not, what about just "/api/cluster/filestore"

Can a filestore have anything other than files?

It can: "metadata" and "commands" are both under the "filestore" path as siblings to "files".

epugh · 2025-01-21T19:33:34Z

solr/solr-ref-guide/modules/configuration-guide/pages/package-manager-internals.adoc

@@ -115,7 +115,7 @@ openssl dgst -sha1 -sign my_key.pem runtimelibs.jar | openssl enc -base64 | sed
 +
 [source, bash]
 ----
-curl --data-binary @runtimelibs.jar -X PUT  http://localhost:8983/api/cluster/files/mypkg/1.0/myplugins.jar?sig=<signature-of-jar>
+curl --data-binary @runtimelibs.jar -X PUT  http://localhost:8983/api/cluster/filestore/files/mypkg/1.0/myplugins.jar?sig=<signature-of-jar>


maybe drop the "files"?

(See other comments - "files" is our way to distinguish between "metadata" and "commands", which are both under the "/filestore" path.)


dsmiley · 2025-01-21T22:39:58Z

solr/core/src/java/org/apache/solr/filestore/DistribFileStore.java

+      ByteBuffer filedata = null;
+      try {
+        final var fileRequest = new FileStoreApi.GetFile(path);
+        final var client = coreContainer.getSolrClientCache().getHttpSolrClient(baseUrl);


Please don't use the SolrClientCache for anything other than streaming expressions. I'm stamping out such usages here: https://issues.apache.org/jira/browse/SOLR-17630
I'm particularly surprised to see you removed a use of the new requestWithBaseUrl.


gerlowskija

Replies to some review comments. Still working my way through review feedback.

gerlowskija · 2025-01-24T19:35:21Z

solr/api/src/java/org/apache/solr/client/api/endpoint/ClusterFileStoreApis.java

+  @POST
+  @Operation(
+      summary =
+          "Pushes a file to other nodes, or pulls a file from other nodes in the Solr cluster.",


(This ends up being moot since I take your "split up the endpoint" suggestion on L116)

gerlowskija · 2025-01-24T19:40:03Z

solr/api/src/java/org/apache/solr/client/api/endpoint/ClusterFileStoreApis.java

+          String getFrom,
+      @Parameter(
+              description =
+                  "If true, triggers syncing for this file across all nodes in the filestore")


Huh, I guess splitting these into separate endpoints is a little more in keeping with the currently documented convention. Done 👍

do we somehow eliminate the need for this?

In terms of eliminating the need entirely: I'm open to that if you have ideas? As I understand the current filestore impl, these two commands are essentially internal APIs that Solr relies on to ensure filestore entries are (eventually) present on all nodes. So we'd need some other way of doing that?

Relatedly, it'd be really nice if we had some way to flag the "internal-ness" of these APIs. We want the code-generation to cover them, so that solr-core itself has nice classes to use. But ideally something could signal to end-users that they should never be using these APIs themselves. A different package name? Some sort of Javadoc tag? Maybe that's worth its own JIRA ticket to spur some brainstorming...


gerlowskija · 2025-01-24T20:16:37Z

solr/core/src/java/org/apache/solr/filestore/ClusterFileStore.java

@@ -194,6 +301,48 @@ public SolrJerseyResponse deleteFile(String filePath, Boolean localDelete) {
    return response;
  }

+  @Override
+  @PermissionName(PermissionNameProvider.Name.FILESTORE_WRITE_PERM)
+  public SolrJerseyResponse executeFileStoreCommand(String path, String getFrom, Boolean sync) {


We're definitely trying to avoid it wherever possible. But it's not always possible unfortunately. We have a documented convention for handling these (hopefully rare) cases, which is pretty close to what this PR does.

Open to coming up with a different pattern for these cases if you have a suggestion for what they might look like?


gerlowskija · 2025-01-24T21:29:35Z

solr/core/src/java/org/apache/solr/filestore/DistribFileStore.java

+      ByteBuffer filedata = null;
+      try {
+        final var fileRequest = new FileStoreApi.GetFile(path);
+        final var client = coreContainer.getSolrClientCache().getHttpSolrClient(baseUrl);


Reverted to requestWithBaseUrl.

I'm a little confused about SolrClientCache going away, or only being suitable for streaming expressions though. (Perhaps I'm forgetting some context I previously had on this?) Anyway, I followed up with some questions on SOLR-17630 and we can continue that aspect of discussion there...

gerlowskija · 2025-01-24T21:30:21Z

solr/core/src/java/org/apache/solr/filestore/DistribFileStore.java

@@ -382,16 +380,14 @@ private void distribute(FileInfo info) {
          // trying to avoid the thundering herd problem when there are a very large no:of nodes


gerlowskija · 2025-01-24T21:33:48Z

solr/core/src/java/org/apache/solr/filestore/DistribFileStore.java

@@ -507,7 +503,8 @@ public void delete(String path) {

    final var solrParams = new ModifiableSolrParams();
    solrParams.add("localDelete", "true");
-    final var solrRequest = new GenericSolrRequest(DELETE, "/cluster/files" + path, solrParams);
+    final var solrRequest =
+        new GenericSolrRequest(DELETE, "/cluster/filestore/files" + path, solrParams);


Huh - I wonder why I didn't change this at the time? Anyway, fixed.

gerlowskija · 2025-01-26T13:24:09Z

solr/core/src/java/org/apache/solr/filestore/NodeFileStore.java

-      } catch (IOException e) {
-        throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "Error getting file ", e);
-      }
+      ClusterFileStore.syncToAllNodes(fileStore, path);


Yeah, the name conflicts are unfortunate.

On the bright side - I think we can probably make "NodeFileStore" and "NodeFileStoreApi" go away as part of this PR. (Because with these API changes, all of our filestore endpoints will be under the "/cluster/filestore" path).

That would just leave:

ClusterFileStoreApi - annotated interface defining our filestore APIs

ClusterFileStore - implementations of those APIs

DistribFileStore - internal implementation of various filestore operations (syncing, deleting, creating, etc.)

gerlowskija · 2025-01-26T13:32:55Z

solr/core/src/java/org/apache/solr/filestore/NodeFileStore.java

-              .collect(Collectors.toList());
-      directoryListingResponse.files = Collections.singletonMap(path, directoryContents);
-      return directoryListingResponse;
+    if (type == FileStore.FileType.NOFILE


Agreed. I think the reason this is finicky is that we cram a bunch of different cases into the "metadata" POJO. It's trying to represent too much. It gets used not just for file-metadata, but also directory listings, representing 404s, etc.

A good refactor IMO would be to:

create a class DirectoryMetadata extends Metadata, to use in the directory case.

Have the 404 case throw new SolrException(ErrorCode.NOT_FOUND, "...");

That said - this code all pre-exists this PR, and I'm leery to wade into refactors for scope reasons.
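
For whoever picks that refactor up later, a minimal sketch of the two steps might look like the following. Everything here is a stand-in: `Metadata`, `DirectoryMetadata`, `NotFoundException` (in place of `SolrException` with `ErrorCode.NOT_FOUND`), and the `describe` method are hypothetical, and the `FILE`/`DIRECTORY` enum values are assumed alongside the `NOFILE` value that appears in the diff.

```java
import java.util.List;

// Hypothetical sketch of the suggested refactor; not the actual Solr classes.
class MetadataRefactorSketch {
  /** Stand-in for the file-metadata POJO, now only describing a single file. */
  static class Metadata {
    final String path;

    Metadata(String path) {
      this.path = path;
    }
  }

  /** The directory case gets its own subtype instead of overloading Metadata. */
  static class DirectoryMetadata extends Metadata {
    final List<String> children;

    DirectoryMetadata(String path, List<String> children) {
      super(path);
      this.children = List.copyOf(children);
    }
  }

  /** Stand-in for throwing SolrException(ErrorCode.NOT_FOUND, "..."). */
  static class NotFoundException extends RuntimeException {
    NotFoundException(String message) {
      super(message);
    }
  }

  // NOFILE appears in the diff; FILE and DIRECTORY are assumed here.
  enum FileType { FILE, DIRECTORY, NOFILE }

  /** Each case maps to exactly one outcome, so callers no longer inspect a catch-all POJO. */
  static Metadata describe(String path, FileType type, List<String> children) {
    switch (type) {
      case FILE:
        return new Metadata(path);
      case DIRECTORY:
        return new DirectoryMetadata(path, children);
      default:
        throw new NotFoundException("No filestore entry at " + path);
    }
  }

  public static void main(String[] args) {
    Metadata dir = describe("/mypkg/1.0", FileType.DIRECTORY, List.of("myplugins.jar"));
    System.out.println(dir instanceof DirectoryMetadata);
  }
}
```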

epugh · 2025-01-26T22:23:11Z

I enjoyed catching up on the comments on this PR!

gerlowskija · 2025-02-05T16:30:15Z

Alright - I think I've addressed the feedback so far? If I've missed anything, let me know. I've brought it up to date with 'main' and will aim to merge in the next few days pending any objections?

This reverts commit 2ab1243. Hoss reported a test failure due to some ObjectReleaseTracker violations. The InputStream handling did evolve a bit in the course of review, so it's possible that caused some issue. Undoing this change while I investigate...

This PR splits up the "get file" endpoint into a number of different APIs. Specifically:

- metadata-fetching has been moved out to its own endpoint, GET /api/cluster/filestore/metadata/some/path.txt
- Filestore commands such as pushing/pulling files are now available at: POST /api/cluster/filestore/commands
- Support for "JSON-ified" file data has been dropped in this PR (it will be retained, but deprecated, in the eventual 9.x backport)

These divisions allow us to generate SolrRequest/SolrResponse classes representing these APIs, meaning that SolrJ users no longer need to use GenericSolrRequest/GenericSolrResponse.

(This commit apes an earlier commit which offered similar functionality but caused a few test failures. These have now been fixed.)

gerlowskija added 10 commits January 4, 2025 08:53

Tweak path on /api/cluster/files APIs

070cd64

Tests pass

Create API for file/data metadata fetching

11253ac

Identical to the pre-existing `/cluster/files/a/b.txt?meta=true`, but in its own endpoint. Does NOT remove the existing endpoint or switch tests over to use it, but we will want to do this eventually on `main`.

Create new filestore getFrom/sync API

909e7a2

Still no modifications to use the API internally, or removal of the old endpoint, in order to ease backporting.

Fix test failures

cbbd5dd

Allow generation of 'InputStream' responses

aa30575

This should probably be pulled into a separate PR/branch, and modified to address all relevant endpoints and other loose ends of SOLR-17562.

WIP: Initial attempt to switch over DistribFileStore

4d249d6

Merge branch 'main' into SOLR-17351-break-up-getfile-api-main

0ed3367

Generate 'setQueryParams' method

5f77816

Ensures v2 POST query-params aren't put in a 'form' body.

github-actions bot added documentation Improvements or additions to documentation client:solrj tests cat:api cat:packagemanager labels Jan 20, 2025

gerlowskija changed the title ~~Solr 17351 break up getfile api main~~ SOLR-17351: Decompose filestore "get file" API Jan 20, 2025

epugh reviewed Jan 21, 2025


dsmiley reviewed Jan 21, 2025


Address review comments, rd 1

be73a5b

gerlowskija commented Jan 26, 2025


gerlowskija added 6 commits February 4, 2025 11:53

Remove 'NodeFileStore' and switch over existing usage

8c0527c

Small tweaks for tests

9192739

Remove NOCOMMIT

c8b099c

Fix check

3123276

Merge branch 'main' into SOLR-17351-break-up-getfile-api-main

3a1c12d

CHANGES.txt entry

bfe2abf

gerlowskija merged commit 2ab1243 into apache:main Feb 14, 2025
4 checks passed

gerlowskija deleted the SOLR-17351-break-up-getfile-api-main branch February 14, 2025 16:55
