Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Checksum error checks fixes #215 #216

Merged
merged 3 commits into from
Nov 26, 2023
Merged

Checksum error checks fixes #215 #216

merged 3 commits into from
Nov 26, 2023

Conversation

spbooth
Copy link
Contributor

@spbooth spbooth commented Jul 31, 2023

suggested fix for issue #215

Copy link
Member

@msalle msalle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't looked at the changes in detail, so I might have missed a few bits. Most important part is of course whether it fixes the issue. In any case I would clean up the code a bit which will make it easier to read. See my inline comments.

gass/copy/source/globus_gass_copy_glob.c Outdated Show resolved Hide resolved
gass/copy/source/globus_gass_copy_glob.c Show resolved Hide resolved
gass/copy/source/globus_gass_copy_glob.c Outdated Show resolved Hide resolved
Copy link
Member

@ellert ellert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes introduced are not consistent in its use of tabs and spaces.
Before the changes the file used (mostly) spaces only.

The PR has accumulated quite a lot of commits, where some changes things already changed in earlier commits. The PR could benefit from rearranging/merging the commits into logical units (or even merging ito one single commit).

@ellert
Copy link
Member

ellert commented Oct 30, 2023

@spbooth : I have updated the PR after reviewing it.

There are two changes that are not only white space corrections.
a) I moved the close(fd) statement up into the if(this is a file) conditional, since it is opened inside it.
b) Since the mdctx now lives much longer and there are goto errors while it exists I added its destruction to the error branch and adjusted some of the goto labels.

Let me know if I have misunderstood the intentions of the PR.

@fscheiner
Copy link
Member

@msalle: Are you OK with the changes made?

Copy link
Member

@msalle msalle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi,
there is still one comment open from me, about the use of length versus read_left.
I think that needs to be resolved/checked before we can merge it in. I'm pretty sure my suggested change is the correct one. I think other than that it looks ok to me.

gass/copy/source/globus_gass_copy_glob.c Show resolved Hide resolved
gass/copy/source/globus_gass_copy_glob.c Show resolved Hide resolved
gass/copy/source/globus_gass_copy_glob.c Outdated Show resolved Hide resolved
@ellert
Copy link
Member

ellert commented Nov 23, 2023

Can we merge this?

@msalle
Copy link
Member

msalle commented Nov 23, 2023

Can we merge this?

@ellert What are your thoughts on my comment https://github.com/gridcf/gct/pull/216/files#r1290396767

@ellert
Copy link
Member

ellert commented Nov 25, 2023

Can we merge this?

@ellert What are your thoughts on my comment https://github.com/gridcf/gct/pull/216/files#r1290396767

I already gave a long reply to this in the discussion above. As far as I can see the code is correct as it is.

@msalle
Copy link
Member

msalle commented Nov 25, 2023

Can we merge this?

@ellert What are your thoughts on my comment https://github.com/gridcf/gct/pull/216/files#r1290396767

I already gave a long reply to this in the discussion above. As far as I can see the code is correct as it is.

Not sure where you mean?
Just to clarify, my question is specifically about the lines 2165-2169 in globus_gass_copy_glob.c in Stephen's fork
Note that read_left is set to length in line 2128.
In the 5 lines in the while loop, n keeps being subtracted from read_left but the check is whether length is bigger equal 0 which doesn't seem to make sense unless I completely miss the reasoning? I would expect a check such as if (read_left >= n)
The lines are there in the original code too, see lines 2152-2156 there but since we are reordering and in any case fixing a bug, I think we should check this part is correct.

@ellert
Copy link
Member

ellert commented Nov 25, 2023

Can we merge this?

@ellert What are your thoughts on my comment https://github.com/gridcf/gct/pull/216/files#r1290396767

I already gave a long reply to this in the discussion above. As far as I can see the code is correct as it is.

Not sure where you mean? Just to clarify, my question is specifically about the lines 2165-2169 in globus_gass_copy_glob.c in Stephen's fork Note that read_left is set to length in line 2128. In the 5 lines in the while loop, n keeps being subtracted from read_left but the check is whether length is bigger equal 0 which doesn't seem to make sense unless I completely miss the reasoning? I would expect a check such as if (read_left >= n) The lines are there in the original code too, see lines 2152-2156 there but since we are reordering and in any case fixing a bug, I think we should check this part is correct.

Here I quote my reply again. The check is supposed to check if the original value of length was negative since this is the condition that determines if the bytes_left counter is used or not.

Either length is negative, and the routine reads all data until the end of the file, in this case the bytes_left counter is not used, and count stays constant. Or length is non-negative and the routine keeps track of the number of bytes read, i.e. the bytes_left counter is used, and count is adjusted to never exceed bytes_left.

The check input >= 0 determines whether bytes_left and count should be adjusted after each call to read() or not.

When the bytes_left counter is used, count is never greater than bytes_left. The n returned from read() is never greater than count, so bytes_left will never become negative when decreased by n, so count is never negative.

When the bytes_left counter is not used, count is never changed, so count is never negative.

@msalle
Copy link
Member

msalle commented Nov 26, 2023

Can we merge this?

@ellert What are your thoughts on my comment https://github.com/gridcf/gct/pull/216/files#r1290396767

I already gave a long reply to this in the discussion above. As far as I can see the code is correct as it is.

Not sure where you mean? Just to clarify, my question is specifically about the lines 2165-2169 in globus_gass_copy_glob.c in Stephen's fork Note that read_left is set to length in line 2128. In the 5 lines in the while loop, n keeps being subtracted from read_left but the check is whether length is bigger equal 0 which doesn't seem to make sense unless I completely miss the reasoning? I would expect a check such as if (read_left >= n) The lines are there in the original code too, see lines 2152-2156 there but since we are reordering and in any case fixing a bug, I think we should check this part is correct.

Here I quote my reply again. The check is supposed to check if the original value of length was negative since this is the condition that determines if the bytes_left counter is used or not.

Either length is negative, and the routine reads all data until the end of the file, in this case the bytes_left counter is not used, and count stays constant. Or length is non-negative and the routine keeps track of the number of bytes read, i.e. the bytes_left counter is used, and count is adjusted to never exceed bytes_left.

The check input >= 0 determines whether bytes_left and count should be adjusted after each call to read() or not.

When the bytes_left counter is used, count is never greater than bytes_left. The n returned from read() is never greater than count, so bytes_left will never become negative when decreased by n, so count is never negative.

When the bytes_left counter is not used, count is never changed, so count is never negative.

Ah, I understand, thanks!
I must say it's not the most clear code (I would personally have moved the block initialising read_left and count down to just before the while loop, and moved the if (length>=0) outside the while loop and combine with the other if (length>=0), but better leave it as is for now.

ellert and others added 3 commits November 26, 2023 11:44
The routine globus_l_gass_copy_cksm_file() in globus_gass_copy_glob.c
has a file read loop:

while ((n = read(fd, buf, count)) > 0)

If an IO error occurs in the read call it will return -1 and the loop
will terminate early generating an incorrect digest for the file but
no error report until the subsequent file transfer fails its checksum
test.

On a related note if you check for errors properly you discover that a
recursive globus-url-copy attempts to calculate checksums on
directory-urls using this routine (which always generates a error in
the read call).
@fscheiner
Copy link
Member

fscheiner commented Nov 26, 2023

Merging this now. Luckily no CI tests timed out on Travis this time.

@fscheiner fscheiner merged commit 0e792cd into gridcf:master Nov 26, 2023
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants