feat: add "GET /files" controller #21
base: main
Conversation
Signed-off-by: Pratiksha Sankhe <[email protected]>
Reviewer's Guide by Sourcery

This pull request implements a new endpoint to retrieve a list of all files stored in the MinIO bucket. The changes include adding a new API route, implementing the corresponding controller function, and creating a test for the new endpoint.

Sequence diagram for the `/list_files` endpoint:

```mermaid
sequenceDiagram
    actor User
    participant API as Cloud Storage API
    participant Controller as Controller
    participant MinIO as MinIO Storage
    User->>API: GET /list_files
    API->>Controller: Handle request
    Controller->>MinIO: list_objects(bucket_name)
    MinIO-->>Controller: Return list of files
    Controller-->>API: Return JSON with file list
    API-->>User: 200 OK, JSON with file list
```
Class diagram for the updated controller:

```mermaid
classDiagram
    class Controller {
        +home()
        +list_files()
    }
    class MinIOClient {
        +list_objects(bucket_name)
    }
    Controller --> MinIOClient: uses
    note for list_files "Handles GET requests to '/list_files' endpoint"
```
Hey @psankhe28 - I've reviewed your changes - here's some feedback:
Overall Comments:
- Consider implementing pagination for the file list to handle large numbers of files more efficiently.
- The error handling could be more granular. Instead of catching all S3Errors and returning a 500 status code, consider handling different types of S3Errors separately and returning appropriate status codes.
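To illustrate the pagination suggestion, here is a minimal, hedged sketch of offset/limit paging over a file listing. The function and parameter names (`paginate_files`, `limit`, `offset`) are illustrative and not part of the PR; a real implementation would read them from query parameters and apply them when listing objects.

```python
# Hypothetical sketch: offset/limit pagination over a list of file names.
# Names (paginate_files, limit, offset) are illustrative, not from the PR.

def paginate_files(files, limit=10, offset=0):
    """Return one page of results plus metadata the client needs to page on."""
    page = files[offset:offset + limit]
    return {
        "files": page,
        "total": len(files),
        "limit": limit,
        "offset": offset,
    }

result = paginate_files([f"file-{i}.txt" for i in range(25)], limit=10, offset=20)
print(result["files"])  # the last, partial page: file-20.txt .. file-24.txt
```

Returning `total`, `limit`, and `offset` alongside the page lets clients compute how many further requests are needed.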
Here's what I looked at during the review
- 🟡 General issues: 2 issues found
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟢 Complexity: all looks good
- 🟢 Documentation: all looks good
Help me be more useful! Please click 👍 or 👎 on each comment to tell me if it was helpful.
Codecov Report

All modified and coverable lines are covered by tests ✅

```
@@           Coverage Diff            @@
##              main      #21   +/-  ##
=========================================
  Coverage   100.00%  100.00%
=========================================
  Files            4        4
  Lines           30       45      +15
=========================================
+ Hits            30       45      +15
```

View full report in Codecov by Sentry.
Signed-off-by: Pratiksha Sankhe <[email protected]>
Hey @psankhe28 - I've reviewed your changes - here's some feedback:
Overall Comments:
- Consider improving error handling in the `list_files` function. It currently only catches `S3Error`, but there might be other exceptions that could occur.
- The test for the new endpoint (`test_get_files`) only checks the status code. Consider mocking the MinIO client and testing the actual file list returned for more thorough coverage.
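A hedged sketch of what a more thorough test could look like: the MinIO client is replaced by a mock so the test can assert on the returned file names rather than only the status code. The test and fixture names are illustrative; a real test would patch the client on the Flask app and call the endpoint through the test client.

```python
# Sketch: mock the MinIO client and assert on the file list it yields.
# Names (test_get_files_returns_names, "test-bucket") are illustrative.
from types import SimpleNamespace
from unittest.mock import MagicMock

def test_get_files_returns_names():
    fake_objects = [SimpleNamespace(object_name=n) for n in ("a.txt", "b.txt")]
    client = MagicMock()
    client.list_objects.return_value = fake_objects

    # Same transformation the PR's controller performs:
    files = [obj.object_name for obj in client.list_objects("test-bucket")]

    client.list_objects.assert_called_once_with("test-bucket")
    assert files == ["a.txt", "b.txt"]

test_get_files_returns_names()
```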
Here's what I looked at during the review
- 🟢 General issues: all looks good
- 🟢 Security: all looks good
- 🟡 Testing: 2 issues found
- 🟢 Complexity: all looks good
- 🟢 Documentation: all looks good
Signed-off-by: Pratiksha Sankhe <[email protected]>
Thanks @psankhe28. It's not far off, but the PR shows that you could improve your knowledge of FOCA, REST, OpenAPI etc. Please have a look at some online resources and a few other FOCA-based services and understand what's happening and why (my comments should give you pointers). I think it's worth it, for future PRs and a better understanding of backend programming :)
```yaml
schema:
  type: array
  items:
    type: string
```
You want to return more than just a string: rather, a list of objects following a model that you define. At the very least, when uploading, you should create a resource ID, maybe a UUID4 or something similar. Please look at other services and see how they do that.

Actually, it would have made more sense to start with implementing `POST /files`, as that would have clarified this point better. REST semantics expect a `POST` request to create a resource, that resource to have a certain ID, and that ID to be guaranteed to be unique (which a filename is not guaranteed to be).

You could also consider including other basic info: the file size; some sort of hash like MD5 (this could be configurable; a hash could then be used to keep duplicate files from being registered, if we want that; we can discuss it); the MIME type of the file, if available (or whatever info you get from the file's magic string, if present); possibly a description, if one is provided during upload (and there is a field for it) - or whatever else DRS needs to create a `POST /objects`.

But given that you started with `GET /files`, let's keep the model simple for now and only return a unique identifier and a filename. We can then discuss additional properties when you implement `POST /files`.
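A minimal sketch of the simple model described above: a server-generated unique ID plus the filename. The class and method names (`FileResource`, `to_dict`) are illustrative, not part of the PR.

```python
# Sketch of a minimal file resource model: unique ID + filename.
# Names (FileResource, to_dict) are illustrative.
import uuid
from dataclasses import dataclass, field

@dataclass
class FileResource:
    filename: str
    id: str = field(default_factory=lambda: str(uuid.uuid4()))

    def to_dict(self):
        return {"id": self.id, "filename": self.filename}

res = FileResource(filename="report.csv")
print(res.to_dict())  # e.g. {'id': '1b9d...', 'filename': 'report.csv'}
```

Generating the ID server-side at upload time guarantees uniqueness even when two uploads share a filename.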
```python
except S3Error as err:
    return jsonify({"error": str(err)}), 500
```
Best to raise an application-specific, custom error. Describe it in `cloud_storage_handler.exceptions` and make sure to also include it in the `exceptions` dictionary of that module. Have a look at HTTP status codes to see if you find a more appropriate status code than 500. If so, use that; otherwise keep 500. In any case, use the dictionary to map it to 500.
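A hedged sketch of what `cloud_storage_handler.exceptions` might then contain: a custom error class plus an entry in the module's `exceptions` dictionary. The exact dictionary shape FOCA expects should be checked against the FOCA documentation; the class name `StorageUnavailable` and the key names here are assumptions for illustration.

```python
# Hypothetical sketch of cloud_storage_handler.exceptions (names assumed).
class StorageUnavailable(Exception):
    """Raised when the underlying S3/MinIO backend cannot be reached."""

# Dictionary mapping exception classes to response metadata; the exact
# shape FOCA expects should be verified against its docs.
exceptions = {
    Exception: {"title": "Internal Server Error", "status": 500},
    StorageUnavailable: {
        "title": "Storage Unavailable",
        "detail": "The storage backend could not be reached; try again later.",
        "status": 500,
    },
}
```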
```python
minio_config = current_app.config.foca.custom.minio
bucket_name = minio_config.bucket_name
minio_client = current_app.config.foca.custom.minio.client.client
objects = minio_client.list_objects(bucket_name)
files = [obj.object_name for obj in objects]
return jsonify({"files": files}), 200
```
Consider implementing a specific class for this controller, in a dedicated module. There will be more code added at a later point, and we don't want these functions to get too complex.
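One possible shape for such a dedicated class, as a hedged sketch; the module path and names (`FilesController`, `list_files`) are illustrative, not prescribed by the review.

```python
# Sketch of moving the listing logic into a dedicated class, e.g. in a
# module like cloud_storage_handler/api/files_controller.py (name assumed).
class FilesController:
    def __init__(self, minio_client, bucket_name):
        self.client = minio_client
        self.bucket_name = bucket_name

    def list_files(self):
        """Return the names of all objects in the configured bucket."""
        objects = self.client.list_objects(self.bucket_name)
        return [obj.object_name for obj in objects]
```

The Flask route handler then stays a thin wrapper that instantiates (or looks up) this class and wraps its result in `jsonify`, so future endpoints can grow inside the class without bloating the handlers.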
```python
minio_config = current_app.config.foca.custom.minio
bucket_name = minio_config.bucket_name
minio_client = current_app.config.foca.custom.minio.client.client
objects = minio_client.list_objects(bucket_name)
```
This is actually the only line that could raise an `S3Error`, if I'm not mistaken. All the other lines could also raise errors, but not `S3Error`s.

Please have a look at other FOCA implementations and understand when and when not to define errors. You only want to treat errors that you reasonably expect (and that you actually want to handle in some way). But make sure that you make use of FOCA's built-in error handling. If your error handling doesn't add anything to the existing error handling (which maps all known/defined errors to the error message and code defined in the `exceptions` dictionary in `cloud_storage_handler.exceptions`, and raises a default 500 error for any errors it doesn't know about), then leave it out.
In this case, you can probably assume that the config is valid, because it was actually validated at some point. If it were not valid (e.g., attributes were missing), that would really be an unexpected error pointing to some deeper problem that the user shouldn't know about. In such cases, no specific error handling is needed; it is truly an `InternalServerError`. Admins would then need to investigate, and they would look at the logs and see the actual error trace. That's exactly what we want to happen - so there is no need for, and no additional value from, handling such errors specifically.

You need to think hard in each case about whether there is something that we want to let the user/client know. Most importantly, will there be a point in retrying or not? An S3 error could be due to a temporary glitch in the S3 service, so it might actually be worth retrying. But a `BadRequest` would always be expected to fail, so clients should not retry.
Here, I do think that you actually handle errors quite right: `S3Error` is also the only error I'd catch, but you should only put the `try`/`except` around the one line that can actually raise such an error. And you want to define a custom error and use FOCA to map that error to a 500 (or a more appropriate) error response with an informative error message.

Everything else you can rightly ignore, at least for now.
Make sure to learn about FOCA, read on the different HTTP status codes, REST philosophy and error handling, especially in web services.
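The narrowing described above could look roughly like this. `S3Error` normally comes from the `minio` package; a stand-in class is defined here so the sketch is self-contained, and `StorageUnavailable` is a hypothetical custom error that FOCA would map to an appropriate response via the `exceptions` dictionary.

```python
# Sketch of narrowed error handling: try/except only around the one call
# that can raise S3Error. S3Error is a stand-in for minio.error.S3Error;
# StorageUnavailable is a hypothetical custom error (name assumed).
class S3Error(Exception):
    pass

class StorageUnavailable(Exception):
    pass

def list_file_names(minio_client, bucket_name):
    try:
        # The only call that can actually raise an S3Error:
        objects = minio_client.list_objects(bucket_name)
    except S3Error as err:
        raise StorageUnavailable(str(err)) from err
    # Deliberately outside the try block - these lines cannot raise S3Error.
    return [obj.object_name for obj in objects]
```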
Summary by Sourcery
Add a new endpoint to list all files in the MinIO bucket and include an integration test to ensure its functionality.
New Features:
- Add a `/list_files` endpoint that returns the names of all files stored in the MinIO bucket.

Tests:
- Add an integration test for the new endpoint.