Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OLD EES-5593 Public API wildcard data set versioning #5559

Closed
wants to merge 38 commits into from

Conversation

mmoalin
Copy link
Collaborator

@mmoalin mmoalin commented Jan 27, 2025

Please go to the most recent relevant PR: : #5577
OLD:
This PR adds datasets wildcard versioning support to the following API endpoints:

[get]
data-sets/{dataSetId}/meta
data-sets/{dataSetId}/query
data-sets/{dataSetId}/csv
data-sets/{dataSetId}/versions/{dataSetVersion}
data-sets/{dataSetId}/versions/{dataSetVersion}/changes

[post]
data-sets/{dataSetId}/query

For example:

  • dataSetVersion=2.* should return the latest minor version in the v2 series
  • dataSetVersion=2.1 should return v2.1
  • dataSetVersion=2.0 should return v2.0
  • dataSetVersion=2 should return v2.0

Thank you in advance; also thank you @mark, @lee & @duncan for your initial feedbacks over our calls!

@mmoalin mmoalin changed the title Ees 5593 wildcard ds versions Ees 5593 Public API wildcard data set versions Jan 27, 2025
@mmoalin mmoalin changed the title Ees 5593 Public API wildcard data set versions EES-5593 Public API wildcard data set versions Jan 27, 2025
@mmoalin mmoalin changed the title EES-5593 Public API wildcard data set versions EES-5593 Public API wildcard data set versioning Jan 27, 2025
@mmoalin mmoalin marked this pull request as ready for review January 28, 2025 09:27
@tomjonesdev
Copy link
Collaborator

I'll leave the proper review to someone who's familiar with the public API, but just a general comment: before this gets merged in, could you tidy up the commits with a reset please?

[InlineData("v1.*.*")]
[InlineData("v1.*")]
[InlineData("v1.2.*")]
public void ValidWildCardVersion_SuccessfullyParsed(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to assert what the value is that was parsed. I realise that we're essentially testing the library here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the changes so that they are more clear to understand. This tests the validation but not the object returned from 'TryParseWildcard' intentionally.
This is because 'TryParseWildcard' returns a 'SemVersionRange' (which contains a start and end range) - this doesn't contain a specific version that we can test against versionString, it contains the attributes 'start' and 'end' which are useful in other contexts but not in ours here.
Have added a comment & renamed the test cases to make this more clear.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. The reason we've wrapped SemVersionRange.TryParse is to pass in the parameter OptionalMinorPatch - This isn't being tested though. I can remove it and no tests fail.

[InlineData("v1.2.*", 1, 2, 0)]
[InlineData("v1.1.*", 1, 1, 1)]
[InlineData("v1.*", 1, 3, 0)]
public async Task TestDataSetVersions_ReturnsExpectedVersion(string versionString,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of extra edge cases:

  • Higher version than is available - what should that return? e,.g. v6.* Not found I'm assuming
  • In-between version: remove 3.0.0 from your available and query for v3.0.0. Show that it returns not found
  • Lower: I would remove 0.0.* from your available dataset versions and then query for 0.0.* to see what happens. Should be not found.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cheers Lee, I've added 3 extra edge cases. I've tested the last edge case locally and got the expected result but i havent removed the 0.0.1 because i want to use it in the other test.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sure you can test it with 1.0.1 instead for example. By not including a test for a lower version, there's no knowing what it will do or if it changes. I doubt for example in reality that we have dataset versions 0.0.1 so if the lowest version is 1.0.0 or 0.6.0, what happens if we request 0.0.1?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries thats done

if (!VersionUtils.TryParseWildcard(wildcardedVersion, out var semVersionRange))
return new NotFoundResult();

var parts = wildcardedVersion.Trim('v').Split('.');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you don't need to manually parse the string again here, you have the semVersionRange object which has done the work for you.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately because the semVersionRange is not a SemVersion (this can provide the single parsed version), it returns a range which has a start and end attribute. I've found the semVersionRange object doesn't provide a useful property for me to use. I'd be happy to discuss some more over a call or to provide a demo.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, have the TryParseWilcard return a different representation of a version that does do what you want. Create your own WildcardVersion if needs be if there isn't something appropriate in the Semver library and you can put the parse on that, and then it can expose the Major, Minor and Patch as uint? values for example.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a great idea had a go at that, thanks again Lee

@leeoades
Copy link
Collaborator

I'll leave the proper review to someone who's familiar with the public API, but just a general comment: before this gets merged in, could you tidy up the commits with a reset please?

Hey @tomjonesdev - is there an option to squash merge the PR? I'm assuming that's what you mean by the reset?

if (!VersionUtils.TryParseWildcard(wildcardedVersion, out var semVersionRange))
return new NotFoundResult();

var parts = wildcardedVersion.Trim('v').Split('.');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, have the TryParseWilcard return a different representation of a version that does do what you want. Create your own WildcardVersion if needs be if there isn't something appropriate in the Semver library and you can put the parse on that, and then it can expose the Major, Minor and Patch as uint? values for example.


namespace GovUk.Education.ExploreEducationStatistics.Public.Data.Api.Tests.UnitTests;

public class WildCardDatasetVersionTests
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, interesting arrangement. OK, I'd not noticed that. 👍

Comment on lines 59 to 69

### Automatically receiving minor updates

New data set versions can automatically be retreived from the API without having to specify the new data set version numbers explicitly. This can be used by
supplying endpoints with the parameter `dataSetVersion` an astrick `*` wildcard character in the place of the minor or major version.

For example:

- `dataSetVersion=*` or `dataSetVersion=v*` will return the latest major and minor version for the given data set.
- `dataSetVersion=2.*.*`, `dataSetVersion=2.*` or `dataSetVersion=v2.*` will return the latest minor version of the v2 series
- `dataSetVersion=2.3.*` or `dataSetVersion=v2.3.*` will return the latest patch version of the v2.3 series.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice first draft of this 😄

On reflection, we should probably go into a bit more detail about the dataSetVersion parameter generally.

Would suggest pasting in the following:

### Requesting data set versions

There are several API endpoints that accept a `dataSetVersion` query parameter e.g. the 
[Get a data set version](/reference-v1/endpoints/GetDataSetVersion/index.html) endpoint.
The `dataSetVersion` parameter can be supplied with a version number (e.g. 1.0, 1.1, 2.0, 
etc) that controls the data set version that should be used in the request.

For example:

- `dataSetVersion=1.1` uses version 1.1
- `dataSetVersion=1.0` uses version 1.0
- `dataSetVersion=1` uses version 1.0

### Automatically receiving minor version updates

When using the `dataSetVersion` parameter, minor data set version updates can automatically
be retrieved from the API without having to change the version number explicitly. This can 
be done by including an asterisk `*` wildcard character in place of a major or minor version.

For example: 

- `dataSetVersion=*` uses the latest major and minor version
- `dataSetVersion=2.*` uses the latest minor version of the v2 series

Note I've removed any mention of patch versions as end users aren't supposed to directly know these exist.

Currently, we only have patch versions as an internal concept. They've not really been implemented yet but exist at the DB level as we may need them later for types of 'data replacements' (one to explain another day).

To an end user, they should only be concerned with specifying major and minor versions. They should automatically receive the latest patch version (if there is one).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, that's very helpful. thank you Nick!


Assert.NotNull(wildcardVersion);
Assert.Equal((uint?)major, wildcardVersion.VersionMajor);
Assert.Equal((uint?)major, wildcardVersion.VersionMajor);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: duplicated line

{
versionString = versionString.Trim();
var parts = versionString.Trim('v').Trim().Split('.');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a question: do we want to support leading blank space before the v? If we do, then I don't think this does.

public static bool TryParseWildcard(string versionString,
[NotNullWhen(true)] out WildcardVersion? version)
}
public record WildcardVersion(uint? VersionMajor, uint? VersionMinor, uint? VersionPatch)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgot to mention yesterday - my bad. I think we can call these properties Major, Minor and Patch. I don't think we need a Version prefix on the name.

[InlineData("v*", null, null, null)]
[InlineData("v1.*.*", 1, null, null)]
[InlineData("v1.*", 1, null, null)]
[InlineData("v1.2.*", 1, 2, null)]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed - this currently supports whitespace around the numbers e.g. v1 . 2 . 3 which we probably don't want.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good spot, covered this now in the tests and the logic.

}

version = null;
if (parts.Length > 3 || !parts.All(part => part.Is("*") || uint.TryParse(part, System.Globalization.NumberStyles.None, null, out _)))
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Me and Lee had a slack conversation about not parsing the uint multiple times in this function, he has recommended I make more efficient use of this so just on this now.

@mmoalin mmoalin changed the title EES-5593 Public API wildcard data set versioning OLD EES-5593 Public API wildcard data set versioning Jan 31, 2025
@mmoalin mmoalin closed this Jan 31, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants