HADOOP-13611. S3A FS deleteOnExit to skip the exists check. #1924

steveloughran

  • new override method S3AFileSystem.deleteOnExit() which skips the exists
    check.
  • FileSystem.processDeleteOnExit() skips exists checks too; relies on delete()
    to do its work.
  • make sure all the delete/cancel/process deleteOnExit operations consistently
    qualify paths, especially the list of paths to delete
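A minimal sketch of the intent, not the patch itself: `DeleteOnExitSketch`, its in-memory `store`, and the `existsProbes` counter are all hypothetical stand-ins rather than the real `FileSystem`/`S3AFileSystem` API. It shows registration that never probes the store, a single qualified form for every entry in the delete list, and a process step that relies on `delete()` returning false for a missing path.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical stand-in for a filesystem; not the real Hadoop FileSystem API. */
class DeleteOnExitSketch {
  private final String root;                                 // e.g. "s3a://bucket"
  private final Set<String> store = new HashSet<>();         // paths that "exist"
  private final Set<String> deleteOnExit = new TreeSet<>();  // qualified paths only
  int existsProbes = 0;                                      // simulated HEAD requests

  DeleteOnExitSketch(String root) {
    this.root = root;
  }

  /** Qualify a path against the root so every entry has exactly one form. */
  String qualify(String path) {
    if (path.startsWith(root)) {
      return path;
    }
    return root + (path.startsWith("/") ? path : "/" + path);
  }

  void create(String path) {
    store.add(qualify(path));
  }

  /** Existence probe; counted so the demo can show it is never needed. */
  boolean exists(String path) {
    existsProbes++;
    return store.contains(qualify(path));
  }

  /** Register a path without calling exists(), so no probe and no 404. */
  boolean deleteOnExit(String path) {
    synchronized (deleteOnExit) {
      deleteOnExit.add(qualify(path));
    }
    return true;
  }

  /** Cancel uses the same qualification as registration. */
  boolean cancelDeleteOnExit(String path) {
    synchronized (deleteOnExit) {
      return deleteOnExit.remove(qualify(path));
    }
  }

  /** Returns false for a missing path instead of failing. */
  boolean delete(String path) {
    return store.remove(qualify(path));
  }

  /** Delete every registered path, relying on delete() rather than exists(). */
  List<String> processDeleteOnExit() {
    List<String> deleted = new ArrayList<>();
    synchronized (deleteOnExit) {
      for (String path : deleteOnExit) {
        if (delete(path)) {
          deleted.add(path);
        }
      }
      deleteOnExit.clear();
    }
    return deleted;
  }

  public static void main(String[] args) {
    DeleteOnExitSketch fs = new DeleteOnExitSketch("s3a://bucket");
    fs.create("/a");
    fs.deleteOnExit("/a");
    fs.deleteOnExit("s3a://bucket/a");  // same entry once qualified
    fs.deleteOnExit("/missing");        // registered without any probe
    System.out.println("deleted: " + fs.processDeleteOnExit());
    System.out.println("exists probes: " + fs.existsProbes);
  }
}
```

Qualifying on every entry point is what lets cancel and process agree on which path was registered, whichever form the caller passed in.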

Change-Id: Icd99615f7e6adbb7a4c6e1cfdff44edc21956961
@steveloughran

testing s3 london w/ s3guard.

There are no explicit tests of deleteOnExit, but lots of implicit ones as things are always being scheduled for deletion. I suppose I should do one though...

@steveloughran

Note: I tried to allow FS instances to support parallel deletes for extra performance, but it all gets too complex, especially as the delete process is synchronized on the list of paths to delete; it's easy to cause problems on something which shouldn't be a critical path (unless something uses it in production a lot). What is key: registering a path to delete must never create a 404.

@bgaborg self-requested a review March 30, 2020 16:42
@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| +0 🆗 | reexec | 1m 14s | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 1s | No case conflicting files found. |
| +1 💚 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 ❌ | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| | | | _ trunk Compile Tests _ |
| +0 🆗 | mvndep | 0m 22s | Maven dependency ordering for branch |
| +1 💚 | mvninstall | 21m 5s | trunk passed |
| +1 💚 | compile | 19m 28s | trunk passed |
| +1 💚 | checkstyle | 3m 23s | trunk passed |
| +1 💚 | mvnsite | 2m 28s | trunk passed |
| +1 💚 | shadedclient | 23m 8s | branch has no errors when building and testing our client artifacts. |
| +1 💚 | javadoc | 1m 26s | trunk passed |
| +0 🆗 | spotbugs | 1m 8s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 💚 | findbugs | 3m 11s | trunk passed |
| | | | _ Patch Compile Tests _ |
| +0 🆗 | mvndep | 0m 22s | Maven dependency ordering for patch |
| +1 💚 | mvninstall | 1m 24s | the patch passed |
| +1 💚 | compile | 17m 20s | the patch passed |
| +1 💚 | javac | 17m 20s | the patch passed |
| +1 💚 | checkstyle | 2m 47s | the patch passed |
| +1 💚 | mvnsite | 2m 7s | the patch passed |
| +1 💚 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 💚 | shadedclient | 15m 30s | patch has no errors when building and testing our client artifacts. |
| +1 💚 | javadoc | 1m 27s | the patch passed |
| +1 💚 | findbugs | 3m 27s | the patch passed |
| | | | _ Other Tests _ |
| -1 ❌ | unit | 10m 9s | hadoop-common in the patch passed. |
| +1 💚 | unit | 1m 15s | hadoop-aws in the patch passed. |
| +1 💚 | asflicense | 0m 46s | The patch does not generate ASF License warnings. |
| | | 131m 16s | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.fs.viewfs.TestChRootedFileSystem |
| | hadoop.fs.TestHarFileSystem |
| | hadoop.fs.TestFileSystemCaching |
| | hadoop.fs.TestFilterFileSystem |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=19.03.8 Server=19.03.8 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1924/1/artifact/out/Dockerfile |
| GITHUB PR | #1924 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux ab2c176bdab5 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 960c9eb |
| Default Java | 1.8.0_242 |
| unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-1924/1/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt |
| Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1924/1/testReport/ |
| Max. process+thread count | 2605 (vs. ulimit of 5500) |
| modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: . |
| Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1924/1/console |
| versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
| Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |

This message was automatically generated.

@steveloughran

And of course I didn't read the javadocs in FileSystem that I added last month, so I should have expected the stack traces.

[INFO]
[ERROR] Failures:
[ERROR] TestFileSystemCaching.testCancelDeleteOnExit:313
[ERROR] TestFileSystemCaching.testDeleteOnExit:259
[ERROR] TestFileSystemCaching.testDeleteOnExitFNF:278
Argument(s) are different! Wanted:
fileSystem.getFileStatus(/a);
-> at org.apache.hadoop.fs.TestFileSystemCaching.testDeleteOnExitFNF(TestFileSystemCaching.java:278)
Actual invocations have different arguments:
fileSystem.makeQualified(/a);
-> at org.apache.hadoop.fs.FilterFileSystem.makeQualified(FilterFileSystem.java:124)
fileSystem.getFileStatus(null);
-> at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:460)

[ERROR] TestFileSystemCaching.testDeleteOnExitRemoved:296
[ERROR] TestFilterFileSystem.testFilterFileSystem:170 1 methods were not overridden correctly - see log
[ERROR] TestHarFileSystem.testInheritedMethodsImplemented:393 1 methods were not overridden correctly - see log
[ERROR] Errors:
[ERROR] TestChRootedFileSystem.testDeleteOnExitPathHandling:346 ? NullPointer
[INFO]
[ERROR] Tests run: 4401, Failures: 6, Errors: 1, Skipped: 255
[INFO]

Hmm. I think I might just rewind all changes to FileSystem and have S3A build up its own list of files to delete, then delete them at teardown. That avoids complexity and would let us submit the delete operations to the thread pool.
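A rough sketch of what that alternative could look like, purely as an illustration: `TeardownDeleteSketch`, its in-memory `store`, and the pool size of 4 are assumptions and not S3A code. The filesystem keeps its own set of pending paths, registers them without any existence probe, and at `close()` submits one delete per path to a thread pool and waits for them all.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical filesystem that owns its delete-on-exit list; not the S3A API. */
class TeardownDeleteSketch implements AutoCloseable {
  private final Set<String> store = ConcurrentHashMap.newKeySet();
  private final Set<String> pendingDeletes = ConcurrentHashMap.newKeySet();
  private final ExecutorService pool = Executors.newFixedThreadPool(4); // assumed size

  void create(String path) {
    store.add(path);
  }

  boolean isPresent(String path) {
    return store.contains(path);
  }

  /** Registration only records the path; nothing is probed. */
  void deleteOnExit(String path) {
    pendingDeletes.add(path);
  }

  /** Returns false if the path was already absent. */
  boolean delete(String path) {
    return store.remove(path);
  }

  /** At teardown, submit one delete per path to the pool and wait for all. */
  @Override
  public void close() {
    List<String> paths = new ArrayList<>(pendingDeletes);
    pendingDeletes.clear();
    List<Future<Boolean>> results = new ArrayList<>();
    for (String path : paths) {
      Callable<Boolean> task = () -> delete(path);
      results.add(pool.submit(task));
    }
    try {
      for (Future<Boolean> result : results) {
        result.get();  // a missing path just yields false
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted during teardown deletes", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("teardown delete failed", e.getCause());
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    TeardownDeleteSketch fs = new TeardownDeleteSketch();
    fs.create("/a");
    fs.create("/b");
    fs.deleteOnExit("/a");
    fs.deleteOnExit("/missing");
    fs.close();
    System.out.println("/a present: " + fs.isPresent("/a"));
    System.out.println("/b present: " + fs.isPresent("/b"));
  }
}
```

Copying the pending set before submitting means the deletes run without holding a lock on the list, which is the part that got too complex when attempted inside FileSystem itself.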

@steveloughran changed the title from "HADOOP-16877. S3A FS deleteOnExit to skip the exists check." to "HADOOP-13611. S3A FS deleteOnExit to skip the exists check." on Apr 8, 2020
@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| +0 🆗 | reexec | 1m 8s | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | No case conflicting files found. |
| +1 💚 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 ❌ | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| | | | _ trunk Compile Tests _ |
| +0 🆗 | mvndep | 0m 25s | Maven dependency ordering for branch |
| +1 💚 | mvninstall | 22m 17s | trunk passed |
| +1 💚 | compile | 23m 58s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 💚 | compile | 19m 1s | trunk passed with JDK Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
| +1 💚 | checkstyle | 3m 5s | trunk passed |
| +1 💚 | mvnsite | 2m 43s | trunk passed |
| +1 💚 | shadedclient | 22m 30s | branch has no errors when building and testing our client artifacts. |
| -1 ❌ | javadoc | 0m 37s | hadoop-common in trunk failed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1. |
| -1 ❌ | javadoc | 0m 38s | hadoop-aws in trunk failed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1. |
| +1 💚 | javadoc | 1m 33s | trunk passed with JDK Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
| +0 🆗 | spotbugs | 1m 21s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 💚 | findbugs | 4m 9s | trunk passed |
| | | | _ Patch Compile Tests _ |
| +0 🆗 | mvndep | 0m 26s | Maven dependency ordering for patch |
| +1 💚 | mvninstall | 1m 38s | the patch passed |
| +1 💚 | compile | 22m 16s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 💚 | javac | 22m 16s | the patch passed |
| +1 💚 | compile | 19m 39s | the patch passed with JDK Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
| +1 💚 | javac | 19m 39s | the patch passed |
| +1 💚 | checkstyle | 3m 8s | the patch passed |
| -1 ❌ | mvnsite | 0m 36s | hadoop-aws in the patch failed. |
| +1 💚 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 💚 | shadedclient | 16m 36s | patch has no errors when building and testing our client artifacts. |
| -1 ❌ | javadoc | 0m 38s | hadoop-common in the patch failed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1. |
| -1 ❌ | javadoc | 0m 37s | hadoop-aws in the patch failed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1. |
| +1 💚 | javadoc | 1m 34s | the patch passed with JDK Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
| -1 ❌ | findbugs | 0m 31s | hadoop-aws in the patch failed. |
| | | | _ Other Tests _ |
| -1 ❌ | unit | 10m 55s | hadoop-common in the patch passed. |
| -1 ❌ | unit | 0m 31s | hadoop-aws in the patch failed. |
| +1 💚 | asflicense | 0m 44s | The patch does not generate ASF License warnings. |
| | | 184m 21s | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.fs.viewfs.TestChRootedFileSystem |
| | hadoop.fs.TestHarFileSystem |
| | hadoop.fs.TestFileSystemCaching |
| | hadoop.fs.TestFilterFileSystem |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1924/3/artifact/out/Dockerfile |
| GITHUB PR | #1924 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 70a60c3a7151 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / e756fe3 |
| Default Java | Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
| javadoc | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1924/3/artifact/out/branch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.txt |
| javadoc | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1924/3/artifact/out/branch-javadoc-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.txt |
| mvnsite | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1924/3/artifact/out/patch-mvnsite-hadoop-tools_hadoop-aws.txt |
| javadoc | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1924/3/artifact/out/patch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.txt |
| javadoc | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1924/3/artifact/out/patch-javadoc-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1.txt |
| findbugs | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1924/3/artifact/out/patch-findbugs-hadoop-tools_hadoop-aws.txt |
| unit | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1924/3/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt |
| unit | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1924/3/artifact/out/patch-unit-hadoop-tools_hadoop-aws.txt |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1924/3/testReport/ |
| Max. process+thread count | 1768 (vs. ulimit of 5500) |
| modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: . |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-1924/3/console |
| versions | git=2.17.1 maven=3.6.0 findbugs=4.0.6 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |

This message was automatically generated.

@steveloughran added the "fs/s3" (changes related to hadoop-aws; submitter must declare test endpoint) and "work in progress" (PRs still Work in Progress; reviews not expected but still welcome) labels Oct 2, 2020
@steveloughran marked this pull request as draft October 2, 2020 15:07