Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The downloaded file is different from the uploaded one #3623

Closed
1 task done
ImoutoChan opened this issue Jan 27, 2025 · 7 comments
Closed
1 task done

The downloaded file is different from the uploaded one #3623

ImoutoChan opened this issue Jan 27, 2025 · 7 comments
Assignees
Labels
bug This issue is a bug. response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days.

Comments

@ImoutoChan
Copy link

Describe the bug

Our tests are failing after updating from version 3.7.410.1 to 3.7.412.4 because the stream returned by GetObjectAsync differs from the one provided to PutObjectAsync.

Uploading

var request = new PutObjectRequest
{
    BucketName = bucketName,
    Key = key,
    InputStream = inputStream
};

await _client.PutObjectAsync(request, ct);

Download

var request = new GetObjectRequest
{
    BucketName = bucketName,
    Key = key
};

var response = await _client.GetObjectAsync(request, ct);
return response.ResponseStream;

Before the update, response.ResponseStream and inputStream were identical (verified via MD5 hashes). After the update, they are different.

Regression Issue

  • Select this option if this issue appears to be a regression.

Expected Behavior

The response stream from GetObjectAsync returns the exact bytes provided in the inputStream.

Current Behavior

The streams are different.

Reproduction Steps

Above

Possible Solution

No response

Additional Information/Context

No response

AWS .NET SDK and/or Package version used

3.7.410.1
3.7.412.4

Targeted .NET Platform

net 9.0

Operating System and version

win10

@ImoutoChan ImoutoChan added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Jan 27, 2025
@github-actions github-actions bot added the potential-regression Marking this issue as a potential regression to be checked by team member label Jan 27, 2025
@dscpinheiro
Copy link
Contributor

Can you share how you're verifying the MD5 hashes? We made a change recently to automatically calculate checksums (#3610), but we didn't modify the SDK's MD5 implementation at all.

Also, are you using actual S3 or a 3rd-party implementation?

@dscpinheiro
Copy link
Contributor

For example, I just tested with the latest version of the S3 package and this returned the correct MD5 hash (CryptoUtilFactory is available in the Amazon.Util namespace):

var s3Client = new AmazonS3Client(RegionEndpoint.USWest2);
var inputStream = new FileStream("test-file.pdf", FileMode.Open);
var inputMD5 = Convert.ToBase64String(CryptoUtilFactory.CryptoInstance.ComputeMD5Hash(inputStream));

var putResponse = await s3Client.PutObjectAsync(new PutObjectRequest
{
    BucketName = "dspin-test",
    Key = "test.pdf",
    InputStream = inputStream
});

var getResponse = await s3Client.GetObjectAsync(new GetObjectRequest
{
    BucketName = "dspin-test",
    Key = "test.pdf"
});

var outputMD5 = Convert.ToBase64String(CryptoUtilFactory.CryptoInstance.ComputeMD5Hash(getResponse.ResponseStream));
Console.WriteLine(inputMD5 == outputMD5); // Outputs "True"

@ImoutoChan
Copy link
Author

I'm calculating the MD5 hashes myself in tests:

public static string GetMd5Hash(this byte[] bytes)
{
    var computed = MD5.HashData(bytes);
    return BitConverter.ToString(computed).Replace("-", "").ToLower();
}

public static string GetMd5Hash(this Stream stream)
{
    using var md5 = MD5.Create();
    var computed = md5.ComputeHash(stream);
    return BitConverter.ToString(computed).Replace("-", "").ToLower();
}

Also, are you using actual S3 or a 3rd-party implementation?

Yes, I'm using a third-party implementation

@dscpinheiro dscpinheiro removed needs-triage This issue or PR still needs to be triaged. potential-regression Marking this issue as a potential regression to be checked by team member labels Jan 27, 2025
@dscpinheiro
Copy link
Contributor

Got it. As mentioned in the announcement I linked earlier, the change was made intentionally but there's a config option you can set to prevent the SDK from calculating checksums (for PutObject calls):

var s3Config = new AmazonS3Config
{
    ServiceURL = "your-custom-service-url",
    RequestChecksumCalculation = RequestChecksumCalculation.WHEN_REQUIRED,
};
var s3 = new AmazonS3Client(s3Config);
await s3.PutObjectAsync(...);

Can you try that and let me know if it works? If possible, could you also share which 3rd-party implementation you're using?

@dscpinheiro dscpinheiro added the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Jan 27, 2025
@dscpinheiro dscpinheiro self-assigned this Jan 27, 2025
@ImoutoChan
Copy link
Author

Thank you, that option fixed the behavior!

We’re using a local cloud provider: https://s3.ru-1.storage.selcloud.ru/, which claims full compatibility with the S3 API. However, I suspect they have not yet implemented the latest feature.

@dscpinheiro
Copy link
Contributor

Glad it worked for you! And yes, unfortunately after we made this change (which was already available - not by default - for S3 since 2022 - https://aws.amazon.com/blogs/aws/new-additional-checksum-algorithms-for-amazon-s3/), some 3rd-party implementations stopped working.

The config options can be used but obviously the ideal solution would be for the 3rd-party implementations to support this new behavior as it's in place for all AWS SDKs - not only .NET (this documentation page from S3 explains how the data integrity works in more detail: https://docs.aws.amazon.com/AmazonS3/latest/userguide/checking-object-integrity.html)

Copy link

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug This issue is a bug. response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days.
Projects
None yet
Development

No branches or pull requests

2 participants