Bean Validation errors calling file redetect api endpoint #8821
Comments
Ah, whoops! I just saw that this is quite possibly the same issue as #7527. I'll leave this up until it gets looked at, but if I should move my issue there, let me know.
I found a solution, which was to fork pyDataverse and manually add the MIME type to the file posts. I may create a pyDataverse PR to allow these to be passed in. I'm going to leave this up for now because the issues I had related to #7527 might be helpful, but feel free to close or delete it if that makes sense. Edit: I've created a fork that allows passing the MIME type to Dataverse. This is needed for uploading to old installations: https://github.com/OdumInstitute/pyDataverse/tree/mime_type_upload
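For anyone reading along, here is a rough sketch of what "adding the MIME type to the file post" can look like on the client side. It is not the code from the fork above; `build_file_part` is a hypothetical helper, and the extension-based guess with an `application/octet-stream` fallback is my own assumption. It only prepares the filename and content type that a multipart upload would carry.

```python
import mimetypes
from pathlib import Path


def build_file_part(path, mime_type=None):
    """Return (filename, mime_type) for a multipart file upload.

    Hypothetical helper, not part of pyDataverse. If no MIME type is
    given, guess one from the file extension and fall back to a
    generic binary type when the extension is unknown.
    """
    name = Path(path).name
    if mime_type is None:
        guessed, _encoding = mimetypes.guess_type(name)
        mime_type = guessed or "application/octet-stream"
    return name, mime_type


# Usage sketch with the `requests` library (not executed here):
#
#   name, mime = build_file_part("notes.txt")
#   with open("notes.txt", "rb") as fh:
#       files = {"file": (name, fh, mime)}  # (filename, fileobj, content_type)
#       requests.post(upload_url, headers=headers, files=files)
#
# `upload_url` and `headers` are placeholders for the installation's
# add-file endpoint and API token header.
```

Passing the content type explicitly in the multipart tuple means the server does not have to infer it from a missing or empty header, which is the situation the older installations mishandle.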
I did miss this issue back in June (on account of being on vacation, most likely). I also made a PR into pyDataverse, adding a way to explicitly supply the MIME type as an argument on file upload (gdcc/pyDataverse#142), in parallel with the fix for handling upload calls without type headers on the Dataverse side (#8392). It was never merged; I couldn't tell from your comments in gdcc/pyDataverse#118 whether you saw and/or tried it. (I haven't looked yet, but it sounds like your pyDataverse fork does the same thing.)
(The info below may not be of interest or practical value to you, since you seem to have worked around it anyway, but it should be useful for us, the dev team, in the context of the overall cleanup of the redetect functionality.) The message at the top of the log - The error in the attached server log does look like a real constraint violation. It is almost certainly the result of the redetect code trying to save the DataFile with the contentType set to null. I need to take a closer look at the code, but it appears that the method simply needs a null check at the end, before trying to save the new type in the database. I will make a PR if that's the case. (There are other things in the log, like some index errors further down in the stack trace, but those are a result of not being able to save the datafile in the db.) [Edit: #8835 already contains this null check!] Interestingly, @donsizemore and I were looking at a virtually identical stack trace a few months ago, thrown by an attempt to recalculate the md5 of an old Odum file. Don traced it to a newline character in the mimetype of that file, which violated
During development of CORE2, I've been using pyDataverse to handle our Dataverse interactions.
One aspect of this is uploading files. We ran into #8344, which causes the MIME type not to be set. Because we want to support older installations, I'm shooting for a solution that doesn't require the fix pushed by @landreev (though I'm glad it exists!).
The solution I've tried is to call the redetect endpoint to get the correct file type. This works and no errors are returned in the response... BUT there are concerning messages now appearing in our logs. Note that this is on our S3-based test Dataverse running 5.3: file_redetect_error.log
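For context, this is roughly the request the redetect call boils down to, written without pyDataverse so the pieces are visible. It is a sketch, assuming the native API path `/api/files/{id}/redetect`, the `dryRun` query parameter, and the `X-Dataverse-key` header; `build_redetect_request` is a hypothetical helper name, and the example server URL and token are placeholders.

```python
from urllib.parse import urlencode


def build_redetect_request(base_url, file_id, api_token, dry_run=True):
    """Build the URL and headers for a file-type redetect call.

    Hypothetical helper. With dry_run=True the request asks the server
    to report the detected type without saving it.
    """
    query = urlencode({"dryRun": str(dry_run).lower()})
    url = f"{base_url.rstrip('/')}/api/files/{file_id}/redetect?{query}"
    headers = {"X-Dataverse-key": api_token}
    return url, headers


# Usage sketch with the `requests` library (not executed here):
#
#   url, headers = build_redetect_request(
#       "https://dataverse.example.org", 42, "my-api-token", dry_run=True
#   )
#   response = requests.post(url, headers=headers)
#   print(response.status_code, response.text)
```

Running with `dry_run=True` first is a cheap way to see what type the server would assign before letting it write anything to the database.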
I'm curious if anyone over at IQSS has insight as to what might be causing this? Maybe this is a pyDataverse issue but it seems like the calls are pretty straightforward. We are concerned specifically that all these warnings indicate that something is corrupting the metadata in our database.
Thanks much!
P.S. In case it helps, here are the responses from a few of our calls to the pyDataverse upload_datafile and redetect_file_type functions: