Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to use resumptionToken in REST #4225

Open
wants to merge 2 commits into
base: dev
Choose a base branch
from

Conversation

LuomaJuha
Copy link
Contributor

@LuomaJuha LuomaJuha commented Jan 31, 2025

Adds an option to request a resumption token when using search in REST.

uses parameter resumptionToken=<true|tokenid> for deciding if the request is new or to use older resumption token. Excludes the possibility to return facets from the result.

Moved resumption token handling to a trait.

Copy link
Member

@demiankatz demiankatz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @LuomaJuha! I haven't had a chance to try this hands-on yet, but here are some thoughts based on my first read-through.

*
* @return void
*/
public function setResumptionService(OaiResumptionServiceInterface $resumptionService): void
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might make sense to change the constructor of \VuFind\OAI\Server to call setResumptionService and use the trait's property instead of creating a separate local property through constructor property promotion. I don't think it's necessary to make this change, but it feels a little more consistent to me, even though it adds some verbosity.

): OaiResumptionEntityInterface {
$params['cursor'] = $cursor;
$params['cursorMark'] = $cursorMark;
$expire = time() + 24 * 60 * 60;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we make the token lifetime a parameter of the function with a default value instead of hard-coding it, for future flexibility?

*
* @param string $token The resumption token to look up
*
* @return array|false Parameters associated with token or false if invalid or expired
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would ?array (replacing false with null below) be a slightly cleaner return type? The functionality would be essentially the same, though we would need to change the check on the return value in the OAI Server class.

$this->recordFormatter = $rf;
$this->facetFormatter = $ff;
foreach ($rf->getRecordFields() as $fieldName => $fieldSpec) {
if (!empty($fieldSpec['vufind.default'])) {
$this->defaultRecordFields[] = $fieldName;
}
}

$this->facetConfig = $this->getConfig('facets');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs to be made more flexible for compatibility with Search2ApiController. I think you should be able to get the name of the facets configuration from the search options class instead of hard-coding it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Seems like this lack of flexibility was an existing bug that got highlighted by the refactoring, rather than a new problem here... but either way it's worth fixing!)

@@ -139,21 +168,23 @@ public function __construct(
FacetFormatter $ff
) {
parent::__construct($sm);
$this->setResumptionService($this->getDbService(\VuFind\Db\Service\OaiResumptionServiceInterface::class));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessary, but it might be nice to introduce constructor property promotion to simplify the initialization of the RecordFormatter and FacetFormatter. Since we're adding new properties and constructor behavior, it might just be nice to clean up some unnecessary redundancy while we're in the area.

@@ -900,7 +900,8 @@ searchAccessPermission = access.api.Search

; This is the maximum number of results that can be returned in a single response:
maxLimit = 100

; This is the maximum number of results that can be returned with resumption token in a single response:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we specify the default value (which currently appears to be 1000) here? And is it possible that limit is too high? Even if the performance of the cursor improves speed, don't we still need to worry about potentially running out of memory if the index contains large records?

$response = $this->doDefaultSearch($request);
}
} catch (Exception $e) {
return $this->output([], self::STATUS_ERROR, 400, $e->getMessage());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a very broad exception catch, is there any danger of leaking sensitive information here if something unexpected goes wrong? Would it be better to catch more specific exceptions for messages we want to pass through, and to use a more generic message (perhaps accompanied by debug logging) for unexpected exceptions?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This also seems to be a pre-existing issue... maybe too big a problem to tackle in this PR, but at least something to consider for the future).

if (
isset($request['limit'])
&& (!ctype_digit($request['limit'])
isset($request['limit']) && (!ctype_digit($request['limit'])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The indentation changes here make this harder for me to read -- I think it might be good to keep isset($request['limit']) on its own line, and move the && (!ctype_digit($request['limit']) down to the next line so that all of the OR clauses are together. By breaking up the nested OR parentheses into multiple lines, this implies an association that isn't correct, and it confused me. If doing this would make the second line too long, let me know if you'd like to brainstorm other options... one possibility might be to define $limit before the validation, since that would let us simplify all the $request references in the check, and I don't see that it would break the logic in any way.

} else {
$params->setLimit(0);
$limit = $request['limit'] ??= 20;
$request['sort'] ??= self::DEFAULT_SORT;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason not to use the default sort from the search options object, and/or to make this configurable somehow? Using a constant seems like the least flexible option.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants