False validation error: invalid checksum #468
Comments
Looks like it's reading the manifest incorrectly, skipping over the first few bytes. I wonder if this is a bug in the tar stream library. The actual file md5 and the manifest md5 match. Both are "f89314b6c6daf78cdbb8129c59b0c672". However, the error message reports the manifest md5 as "c6daf78cdbb8129c59b0c672", which omits the first 8 characters ("f89314b6"), i.e. the first 8 bytes of the manifest file.
The bag in question has the following payload, amounting to 946 MB:
f89314b6c6daf78cdbb8129c59b0c672 data/0056.mets.xml
The md5 manifest is added at the end of the bagging process, which means it's preceded in the tar archive by the files in this list. Is there something off in the tar headers that makes the reader start reading the md5 manifest at byte 8 instead of byte zero?
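One way to check the tar headers independently of DART is to read the archive with a different tar implementation and compare. The sketch below is only a hedged illustration, not DART's validator: it uses Python's standard `tarfile` module, and the function name `check_md5_manifest` is made up for this example. It assumes an uncompressed tar with a single top-level bag directory, a payload under `data/`, and a `manifest-md5.txt` at the bag root. It records where each member's header and data begin, then recomputes the payload digests and compares them with the manifest as read from byte zero.

```python
import hashlib
import tarfile


def check_md5_manifest(tar_file):
    """Return (offsets, problems) for a tarred bag.

    tar_file is a seekable binary file object holding an uncompressed tar.
    offsets maps each bag-relative path to (header_offset, data_offset, size).
    problems is a list of human-readable mismatch descriptions.
    """
    offsets, digests, manifest_text = {}, {}, None
    with tarfile.open(fileobj=tar_file, mode="r:") as tar:
        for member in tar:
            if not member.isfile():
                continue
            # Assumption: the first path component is the bag directory.
            rel = member.name.split("/", 1)[-1]
            offsets[rel] = (member.offset, member.offset_data, member.size)
            stream = tar.extractfile(member)
            if rel == "manifest-md5.txt":
                manifest_text = stream.read().decode("utf-8")
            elif rel.startswith("data/"):
                md5 = hashlib.md5()
                # Hash in 1 MiB chunks so large payload files fit in memory.
                for chunk in iter(lambda: stream.read(1 << 20), b""):
                    md5.update(chunk)
                digests[rel] = md5.hexdigest()

    if manifest_text is None:
        return offsets, ["manifest-md5.txt not found in archive"]

    problems, listed = [], set()
    for line in manifest_text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        digest, path = parts[0], parts[1].strip()
        listed.add(path)
        if path not in digests:
            problems.append(f"{path} listed in manifest but not in payload")
        elif digests[path] != digest:
            problems.append(
                f"{path}: manifest says {digest}, file digest is {digests[path]}"
            )
    for path in sorted(set(digests) - listed):
        problems.append(f"{path} not found in manifest-md5.txt")
    return offsets, problems
```

Usage would be along the lines of `with open("bag.tar", "rb") as f: offsets, problems = check_md5_manifest(f)`, where `bag.tar` is a placeholder name. If this reports no problems for a bag that DART rejects, the archive itself is probably intact, and the fault is more likely in how the manifest entry is read back.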
From Greg at PTSEM: I tried using DART on a different computer to package and upload the same files. That worked. My laptop has version 2.0.11.1795 whereas my desktop, where the failure occurred, has 2.0.11.1925. That difference may be irrelevant -- just letting you know.
From Greg: I've uploaded dozens of objects successfully using DART at the command line, but I encountered a second error message -- which is similar but not identical to the one we corresponded about yesterday:
error: validate/completed - Operation completed with errors. Payload file data/01103.mets.xml not found in manifest-md5.txt
Actually that file is listed on the first line of manifest-md5.txt. So whereas yesterday the problem seemed to be skipping the first several bytes of manifest-md5.txt, today it seems to be skipping the first line. I've attached the manifest-md5.txt file and the log lines pertaining to this object. If you'd like me to try anything in particular, let me know. Otherwise I'll try it with DART on my laptop computer, as I did yesterday successfully.
manifest-md5.txt:
ff9df139371d90c0c28b73a6eda6f78d data/01103.mets.xml
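Both error messages are consistent with the start of manifest-md5.txt being lost before parsing. The sketch below is only an illustration under that assumption and does not reproduce DART's actual parser (`parse_manifest` is a hypothetical helper). It shows that dropping the first 8 bytes of the 0056 manifest line yields the exact truncated digest from the first error, and that dropping the whole first line leaves the payload file unlisted, as in the second error. Only the first manifest line quoted in each report is used here.

```python
def parse_manifest(text):
    """Parse 'digest path' lines into a {path: digest} dict."""
    entries = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            entries[parts[1].strip()] = parts[0]
    return entries


# First report: the reader starts at byte 8 instead of byte 0.
manifest_0056 = "f89314b6c6daf78cdbb8129c59b0c672 data/0056.mets.xml\n"
truncated = parse_manifest(manifest_0056[8:])
print(truncated["data/0056.mets.xml"])  # c6daf78cdbb8129c59b0c672

# Second report: the whole first line is skipped.
manifest_01103 = "ff9df139371d90c0c28b73a6eda6f78d data/01103.mets.xml\n"
first_newline = manifest_01103.index("\n") + 1
skipped = parse_manifest(manifest_01103[first_newline:])
print("data/01103.mets.xml" in skipped)  # False
```

If this is what is happening, the two failures would differ only in how many leading bytes were lost, which points toward a read-offset or buffering problem rather than a bad manifest.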
Specs on the two DART versions. Note that the validation fails on the desktop machine with the newer version of Node, but it works on the laptop with the older version.
Desktop - (fails)
Laptop - (succeeds)
This error occurs in two versions of DART on two different Macs: DART v2.0.11.1795 with Node.js 12.13.0 and DART v2.0.11.1925 with Node.js 12.18.3.
The bags in which these errors occur do not contain files over 8GB, so this issue is not related to the tar-stream library's occasional corruption of tarballs containing files >8GB.
From PTSEM:
I recently uploaded a few dozen objects to our production repo using DART at the command line. Most completed successfully, but for one of them, DART returned an error message:
error: validate/completed - Operation completed with errors. Bad md5 digest for 'data/0056.mets.xml': manifest says 'c6daf78cdbb8129c59b0c672', file digest is 'f89314b6c6daf78cdbb8129c59b0c672'.
The odd thing about this is that the manifest doesn't actually say "c6da..." at all. It has "f893...":
f89314b6c6daf78cdbb8129c59b0c672 data/0056.mets.xml
I expanded the tar file and ran md5 on that file and got this:
MD5 (.dart/bags/ptsem.edu.theocom.0056/data/0056.mets.xml) = f89314b6c6daf78cdbb8129c59b0c672
I can't tell where "c6da..." is coming from. Any ideas?