When report_sproxyd.py runs listbucket(), the request has a 30 second timeout. When this timeout fires and bucketd does not respond, listbucket() does not handle the resulting exception, and run() breaks out of its while loop because there is no 500 or 404 error response. The thread then appears to move on to the next bucket, with no handling to ensure each bucket gets listed completely.
Error:
  File "/home/s3/report_sproxyd.py", line 176, in run
    session, key, versionid)
  File "/home/s3/report_sproxyd.py", line 94, in listbucket
    r = session.get(url, timeout=30, verify=False)
<SNIP>
urllib3.exceptions.ReadTimeoutError: HTTPConnectionPool(host='127.0.0.1', port=9000): Read timed out. (read timeout=30)
The block for line 176 is:
def run(self):
    """main function to be passed to the Threading class"""
    total_size = 0
    files = 0
    payload = True
    key = ""
    versionid = ""
    while payload:
        while 1:
            session = requests.Session()
            session.headers.update({"x-scal-request-uids": "utapi-reindex-buckets"})
            error, payload = self.listbucket(
                session, key, versionid)
            if error == 500:
                time.sleep(15)
            else:
                break
        if error == 404:
            break
        key, skeys, versionid = self.retkeys(payload)
        self.skeyg += skeys
    return (self.userid, self.bucket, self.skeyg)
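One way to keep the thread from silently skipping a bucket would be to wrap the listbucket() call in a retry helper that treats a read timeout like the transient 500 case. The sketch below is a hypothetical mitigation, not the script's actual code; the retry limit and backoff values are illustrative:

```python
import time
import requests

# Hypothetical retry wrapper; MAX_RETRIES and the backoff are illustrative.
MAX_RETRIES = 5

def listbucket_with_retry(listbucket, session, key, versionid,
                          retries=MAX_RETRIES, backoff=15):
    """Call listbucket(), retrying on timeouts and HTTP 500 instead of
    letting the exception escape the thread and skip the bucket."""
    for _ in range(retries):
        try:
            error, payload = listbucket(session, key, versionid)
        except requests.exceptions.RequestException:
            # Read timeout from bucketd: back off and retry this bucket.
            time.sleep(backoff)
            continue
        if error == 500:
            time.sleep(backoff)
            continue
        return error, payload
    raise RuntimeError("bucketd did not respond after %d attempts" % retries)
```

Raising after exhausting the retries makes the failure loud, so a bucket is either listed completely or reported as failed, never half-listed.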
The block for line 94 is:
def listbucket(self, session=None, marker="", versionmarker=""):
    """function to list the contents of the bucket"""
    m = marker.encode('utf8')
    mark = urllib.parse.quote(m)
    params = "%s?listingType=Basic&maxKeys=1000&gt=%s" % (
        self.bucket, mark)
    url = "%s/default/bucket/%s" % (self._bucketd, params)
    r = session.get(url, timeout=30, verify=False)
    if r.status_code == 200:
        r.encoding = 'utf-8'
        payload = json.loads(r.text)
        return (r.status_code, payload)
    else:
        return (r.status_code, "")
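Alternatively, the timeout could be handled inside listbucket() itself by mapping it onto a retryable pseudo-status, so the existing error == 500 branch in run() picks it up. This is a minimal sketch of that idea, not the script's current behavior; the 500 sentinel is an assumption chosen to reuse the caller's existing retry path:

```python
import json
import requests

def listbucket_safe(session, url):
    """GET a bucketd listing URL; a read timeout is reported as a
    transient 500 so the caller retries instead of skipping the bucket."""
    try:
        r = session.get(url, timeout=30, verify=False)
    except requests.exceptions.Timeout:
        # bucketd did not answer in time: signal "retry me" to the caller.
        return (500, "")
    if r.status_code == 200:
        r.encoding = 'utf-8'
        return (r.status_code, json.loads(r.text))
    return (r.status_code, "")
```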
The issue can be observed by comparing the content of the keys.txt output file and the total number of buckets in s3api:
s3api reports 83 buckets, while the report-sproxyd-keys:basic output contains only 56.
Empty S3 buckets appear to be one trigger: listing the keys of an empty bucket fails.
Any bucket that is not empty but hits the 30 second bucketd timeout would lead to:
The listkeys.csv data containing all keys from all nodes, while
the report-sproxyd-keys:basic keys.txt data would not contain keys for buckets that time out during listbucket().
Resulting in:
The P0/P1 scripts perform a left anti join, finding objects in listkeys.csv that have no associated key in bucketd. The P4 script will then consider every key from a bucket where the bucketd timeout occurred to be an S3 orphan and attempt to delete it.
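The danger of that left anti join can be shown with a small set-difference sketch. The key names here are hypothetical sample data; the point is that a key missing from the bucketd listing only because of a timeout is indistinguishable from a true orphan:

```python
# Keys found on disk by the listkeys scan (hypothetical sample data).
listkeys = {"sproxyd-key-1", "sproxyd-key-2", "sproxyd-key-3"}

# Keys recovered from bucketd; the bucket that timed out contributed
# nothing, so "sproxyd-key-3" is absent even though its object exists.
bucketd_keys = {"sproxyd-key-1", "sproxyd-key-2"}

# Left anti join: present in listkeys but not in bucketd.
orphans = listkeys - bucketd_keys
print(sorted(orphans))  # the live key is falsely flagged as an orphan
```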