Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Java exception escaping Hash#[]= in addressable #3166

Closed
nirvdrum opened this issue Jul 18, 2023 · 12 comments · Fixed by sporkmonger/addressable#515
Closed

Java exception escaping Hash#[]= in addressable #3166

nirvdrum opened this issue Jul 18, 2023 · 12 comments · Fixed by sporkmonger/addressable#515
Labels
third-party-issue An issue which should be fixed upstream

Comments

@nirvdrum
Copy link
Collaborator

nirvdrum commented Jul 18, 2023

I've run into a case where a Java exception is escaping Hash#[]= during our application deployment phase. It occurred while running a Rake task to upload precompiled assets to a cloud storage location. Unfortunately, the gem in question is closed source so I can't share the code that induced it. Moreover, I've only seen the failure once so I can't easily reproduce it. We may have to debug by inspection.

truffleruby: an internal exception escaped out of the interpreter,
please report it to https://github.com/oracle/truffleruby/issues.

...

Ruby Thread id=101 from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/<private-asset-uploader>/asset_uploader.rb:183 terminated with internal error: (java.lang.RuntimeException)
	from org.truffleruby.core.thread.ThreadManager.printInternalError(ThreadManager.java:347)
	from org.truffleruby.core.thread.ThreadManager.threadMain(ThreadManager.java:336)
	from org.truffleruby.core.thread.ThreadManager.lambda$initialize$3(ThreadManager.java:308)
	from java.lang.Thread.run(Thread.java:833)
	from com.oracle.truffle.polyglot.PolyglotThread.access$001(PolyglotThread.java:53)
	from com.oracle.truffle.polyglot.PolyglotThread$1.execute(PolyglotThread.java:100)
	from com.oracle.truffle.polyglot.PolyglotThread$ThreadSpawnRootNode.executeImpl(PolyglotThread.java:134)
	from com.oracle.truffle.polyglot.PolyglotThread$ThreadSpawnRootNode.execute(PolyglotThread.java:125)
/artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/<private-asset-uploader>/asset_uploader.rb:196:in `join'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/<private-asset-uploader>/asset_uploader.rb:196:in `join'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/<private-asset-uploader>/asset_uploader.rb:196:in `each'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/<private-asset-uploader>/asset_uploader.rb:196:in `consume_in_parallel'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/<private-asset-uploader>/asset_uploader.rb:71:in `block in run'
	from /usr/local/ruby/graalvm/languages/ruby/lib/mri/benchmark.rb:311:in `realtime'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/activesupport-7.0.6/lib/active_support/core_ext/benchmark.rb:14:in `ms'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/<private-asset-uploader>/asset_uploader.rb:86:in `statsd_time'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/<private-asset-uploader>/asset_uploader.rb:70:in `run'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/shopify-cloud-2.23.0/lib/tasks/upload_assets.rake:6:in `block (2 levels) in <top (required)>'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/rake-13.0.6/lib/rake/task.rb:281:in `block in execute'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/rake-13.0.6/lib/rake/task.rb:281:in `each'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/rake-13.0.6/lib/rake/task.rb:281:in `execute'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/bugsnag-6.25.2/lib/bugsnag/integrations/rake.rb:20:in `execute'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/rake-13.0.6/lib/rake/task.rb:219:in `block in invoke_with_call_chain'
	from /usr/local/ruby/graalvm/languages/ruby/lib/mri/monitor.rb:218:in `mon_synchronize'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/rake-13.0.6/lib/rake/task.rb:199:in `invoke_with_call_chain'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/rake-13.0.6/lib/rake/task.rb:188:in `invoke'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/rake-13.0.6/lib/rake/application.rb:160:in `invoke_task'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/rake-13.0.6/lib/rake/application.rb:116:in `block (2 levels) in top_level'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/rake-13.0.6/lib/rake/application.rb:116:in `each'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/rake-13.0.6/lib/rake/application.rb:116:in `block in top_level'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/rake-13.0.6/lib/rake/application.rb:125:in `run_with_threads'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/rake-13.0.6/lib/rake/application.rb:110:in `top_level'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/rake-13.0.6/lib/rake/application.rb:83:in `block in run'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/rake-13.0.6/lib/rake/application.rb:186:in `standard_exception_handling'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/rake-13.0.6/lib/rake/application.rb:80:in `run'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/rake-13.0.6/exe/rake:27:in `<top (required)>'
	from <internal:core> core/kernel.rb:381:in `load'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/bin/rake:25:in `<top (required)>'
	from <internal:core> core/kernel.rb:381:in `load'
	from /app/.local/share/gem/truffleruby/3.1.3.23.0.0/gems/bundler-2.4.13/lib/bundler/cli/exec.rb:58:in `kernel_load'
	from /app/.local/share/gem/truffleruby/3.1.3.23.0.0/gems/bundler-2.4.13/lib/bundler/cli/exec.rb:23:in `run'
	from /app/.local/share/gem/truffleruby/3.1.3.23.0.0/gems/bundler-2.4.13/lib/bundler/cli.rb:492:in `exec'
	from /app/.local/share/gem/truffleruby/3.1.3.23.0.0/gems/bundler-2.4.13/lib/bundler/vendor/thor/lib/thor/command.rb:27:in `run'
	from /app/.local/share/gem/truffleruby/3.1.3.23.0.0/gems/bundler-2.4.13/lib/bundler/vendor/thor/lib/thor/invocation.rb:127:in `invoke_command'
	from /app/.local/share/gem/truffleruby/3.1.3.23.0.0/gems/bundler-2.4.13/lib/bundler/vendor/thor/lib/thor.rb:392:in `dispatch'
	from /app/.local/share/gem/truffleruby/3.1.3.23.0.0/gems/bundler-2.4.13/lib/bundler/cli.rb:34:in `dispatch'
	from /app/.local/share/gem/truffleruby/3.1.3.23.0.0/gems/bundler-2.4.13/lib/bundler/vendor/thor/lib/thor/base.rb:485:in `start'
	from /app/.local/share/gem/truffleruby/3.1.3.23.0.0/gems/bundler-2.4.13/lib/bundler/cli.rb:28:in `start'
	from /app/.local/share/gem/truffleruby/3.1.3.23.0.0/gems/bundler-2.4.13/exe/bundle:45:in `block in <top (required)>'
	from /app/.local/share/gem/truffleruby/3.1.3.23.0.0/gems/bundler-2.4.13/lib/bundler/friendly_errors.rb:117:in `with_friendly_errors'
	from /app/.local/share/gem/truffleruby/3.1.3.23.0.0/gems/bundler-2.4.13/exe/bundle:33:in `<top (required)>'
	from <internal:core> core/kernel.rb:381:in `load'
	from /usr/local/ruby/graalvm/languages/ruby/bin/bundle:44:in `<main>'
Caused by:
java.lang.Object[] cannot be cast to org.truffleruby.core.hash.library.EmptyHashStore (java.lang.ClassCastException)
	from org.truffleruby.core.hash.library.EmptyHashStoreGen$HashStoreLibraryExports$Cached.set(EmptyHashStoreGen.java:108)
	from org.truffleruby.core.hash.HashNodes$SetIndexNode.set(HashNodes.java:241)
	from org.truffleruby.core.hash.HashNodesFactory$SetIndexNodeFactory$SetIndexNodeGen.executeAndSpecialize(HashNodesFactory.java:1195)
	from org.truffleruby.core.hash.HashNodesFactory$SetIndexNodeFactory$SetIndexNodeGen.execute(HashNodesFactory.java:1146)
	from org.truffleruby.language.RubyCoreMethodRootNode.execute(RubyCoreMethodRootNode.java:58)
/artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/addressable-2.8.4/lib/addressable/uri.rb:354:in `[]='
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/addressable-2.8.4/lib/addressable/uri.rb:354:in `block in <class:URI>'
	from <internal:core> core/hash.rb:186:in `default'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/addressable-2.8.4/lib/addressable/uri.rb:425:in `[]'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/addressable-2.8.4/lib/addressable/uri.rb:425:in `block in encode_component'
	from <internal:core> core/truffle/string_operations.rb:32:in `block in gsub_match_and_replace'
	from <internal:core> core/truffle/string_operations.rb:141:in `block in gsub_internal_yield_matches'
	from <internal:core> core/truffle/string_operations.rb:137:in `each'
	from <internal:core> core/truffle/string_operations.rb:137:in `gsub_internal_yield_matches'
	from <internal:core> core/truffle/string_operations.rb:30:in `gsub_match_and_replace'
	from <internal:core> core/string.rb:851:in `gsub'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/addressable-2.8.4/lib/addressable/uri.rb:424:in `encode_component'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/addressable-2.8.4/lib/addressable/template.rb:820:in `block in transform_capture'
	from <internal:core> core/enumerable.rb:532:in `inject'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/addressable-2.8.4/lib/addressable/template.rb:756:in `transform_capture'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/addressable-2.8.4/lib/addressable/template.rb:595:in `block in expand'
	from <internal:core> core/truffle/string_operations.rb:32:in `block in gsub_match_and_replace'
	from <internal:core> core/truffle/string_operations.rb:141:in `block in gsub_internal_yield_matches'
	from <internal:core> core/truffle/string_operations.rb:137:in `each'
	from <internal:core> core/truffle/string_operations.rb:137:in `gsub_internal_yield_matches'
	from <internal:core> core/truffle/string_operations.rb:30:in `gsub_match_and_replace'
	from <internal:core> core/string.rb:865:in `gsub!'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/addressable-2.8.4/lib/addressable/template.rb:594:in `expand'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/google-apis-core-0.11.0/lib/google/apis/core/http_command.rb:168:in `prepare!'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/google-apis-core-0.11.0/lib/google/apis/core/api_command.rb:85:in `prepare!'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/google-apis-core-0.11.0/lib/google/apis/core/http_command.rb:102:in `execute'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/google-apis-core-0.11.0/lib/google/apis/core/base_service.rb:418:in `execute_or_queue_command'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/google-apis-storage_v1-0.19.0/lib/google/apis/storage_v1/service.rb:1773:in `get_object'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/google-cloud-storage-1.44.0/lib/google/cloud/storage/service.rb:443:in `block in get_file'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/google-cloud-storage-1.44.0/lib/google/cloud/storage/service.rb:911:in `execute'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/google-cloud-storage-1.44.0/lib/google/cloud/storage/service.rb:442:in `get_file'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/google-cloud-storage-1.44.0/lib/google/cloud/storage/bucket.rb:1374:in `file'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/<private-asset-uploader>/asset_uploader.rb:97:in `block in load_file_silent'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/activesupport-7.0.6/lib/active_support/logger_silence.rb:18:in `block in silence'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/activesupport-7.0.6/lib/active_support/logger_thread_safe_level.rb:43:in `log_at'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/activesupport-7.0.6/lib/active_support/logger_silence.rb:18:in `silence'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/<private-asset-uploader>/asset_uploader.rb:96:in `load_file_silent'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/<private-asset-uploader>/asset_uploader.rb:105:in `upload'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/<private-asset-uploader>/asset_uploader.rb:74:in `block (2 levels) in run'
	from /artifacts/bundle/truffleruby/3.1.3.23.0.0/gems/<private-asset-uploader>/asset_uploader.rb:188:in `block (2 levels) in consume_in_parallel'
@eregon
Copy link
Member

eregon commented Jul 18, 2023

This looks like concurrent mutation of a Hash.
From the Ruby stacktrace below it should be possible to get an idea in which code/gem it happens.

It's often worthwhile to fix in the gem because it can fix related concurrency issues.

We also plan to make Hash thread-safe in general, like CRuby.

@eregon
Copy link
Member

eregon commented Jul 18, 2023

https://github.com/sporkmonger/addressable/blob/addressable-2.8.4/lib/addressable/uri.rb#L354
So concurrent access to SEQUENCE_UPCASED_PERCENT_ENCODING_TABLE could cause this.
I found https://github.com/sporkmonger/addressable/blob/addressable-2.8.4/lib/addressable/uri.rb#L425
And this method very likely can be called from multiple threads.

A simple solution would be to use a Concurrent::Map from concurrent-ruby instead of raw Hash there.
And the same for SEQUENCE_ENCODING_TABLE just above.
(as a note, this is the correct pattern: ruby-concurrency/concurrent-ruby#970)

@eregon
Copy link
Member

eregon commented Jul 18, 2023

Upstream issue: sporkmonger/addressable#514

@dentarg
Copy link

dentarg commented Jul 18, 2023

This looks like concurrent mutation of a Hash

Who is doing that, not addressable, right?

We also plan to make Hash thread-safe in general, like CRuby

Would that solve sporkmonger/addressable#514?

@nirvdrum
Copy link
Collaborator Author

This looks like concurrent mutation of a Hash

Who is doing that, not addressable, right?

I guess it can be hard to say where this ought to be fixed. If it's done in addressable, it can be fixed in one spot. If not done in addressable, each caller would be responsible for wrapping addressable calls in a Mutex. If that's the path taken, it'd be helpful to add a note that addressable is not thread-safe so that clients of the library know what their responsibilities are.

I'm in favor of addressing it in addressable because it can be done far more granularly. Only the calls that need special attention can be updated, whereas delegating to a client means no call can be deemed safe without docs explicitly saying so. As for whether to rely on concurrent-ruby, I have less of an opinion. It certainly would address the problem and likely in the most performant way, but it is another dependency.

@dentarg
Copy link

dentarg commented Jul 18, 2023

Sorry, I didn't fully realise what was going on. This should be addressed (hehe) in addressable. Too bad CRuby didn't want to address this, seems like it will continue to bite people running non-CRuby...

@byroot
Copy link

byroot commented Jul 19, 2023

If I may, I think the problem here is beyond the unsynchronized access.

How many possible entries is there in these hashes?

Either they are small enough and should be generated upfront, or either they are too big, and they shouldn't be lazily generated like this as it's pretty much a memory leak.

(sorry it's a quick drive by comment, I haven't fully dug in the implementation yet).

@dentarg
Copy link

dentarg commented Jul 19, 2023

@byroot the hashes was introduced by sporkmonger/addressable#329 as caches to decrease object allocations, so yes, sounds to me that they could possibly grow indefinitely. Should that change be reverted instead?

@byroot
Copy link

byroot commented Jul 19, 2023

IMHO yes.

Now I understand that doing string escaping in Ruby can be quite costly, so it's a bit of a balancing act. But to me such an unbounded global cache is very smelly.

I'm a bit busy today, but I'll try to find some time to see if we can optimize this escaping enough that the cache wouldn't be necessary.

Or alternatively, a precomputed, fixed sized hash could be used for the common values (e.g. common ASCII characters) and anything outside of that could not be cached.

NB: All this is a drive by comment, I didn't have time to dig much into the implementation yet.

@byroot
Copy link

byroot commented Jul 19, 2023

(I just realized I'm commenting on Truffle instead of the issues in Addressable, sorry I haven't finished my coffee).

@eregon eregon changed the title Java exception escaping Hash#[]= Java exception escaping Hash#[]= in addressable Jul 22, 2023
casperisfine pushed a commit to casperisfine/addressable that referenced this issue Jul 31, 2023
Fix: sporkmonger#514
Fix: oracle/truffleruby#3166

These hashes lazily memoize percent encoded characters, this is an
issue on GVL-less Ruby implementations as it can cause concurrent
access.

But these are actually quite wastful as the key is always a single byte
so rather than use string keys are lazily memoize these, we can precompute
two static arrays of 255 elements once and for all.
casperisfine pushed a commit to casperisfine/addressable that referenced this issue Jul 31, 2023
Fix: sporkmonger#514
Fix: oracle/truffleruby#3166

These hashes lazily memoize percent encoded characters, this is an
issue on GVL-less Ruby implementations as it can cause concurrent
access.

But these are actually quite wastful as the key is always a single byte
so rather than use string keys are lazily memoize these, we can precompute
two static arrays of 255 elements once and for all.
casperisfine pushed a commit to casperisfine/addressable that referenced this issue Jul 31, 2023
Fix: sporkmonger#514
Fix: oracle/truffleruby#3166

These hashes lazily memoize percent encoded characters, this is an
issue on GVL-less Ruby implementations as it can cause concurrent
access.

But these are actually quite wastful as the key is always a single byte
so rather than use string keys are lazily memoize these, we can precompute
two static arrays of 255 elements once and for all.
casperisfine pushed a commit to casperisfine/addressable that referenced this issue Jul 31, 2023
Fix: sporkmonger#514
Fix: oracle/truffleruby#3166

These hashes lazily memoize percent encoded characters, this is an
issue on GVL-less Ruby implementations as it can cause concurrent
access.

But these are actually quite wastful as the key is always a single byte
so rather than use string keys are lazily memoize these, we can precompute
two static arrays of 255 elements once and for all.
dentarg pushed a commit to sporkmonger/addressable that referenced this issue Jul 31, 2023
Fix: #514
Fix: oracle/truffleruby#3166

These hashes lazily memoize percent encoded characters, this is an
issue on GVL-less Ruby implementations as it can cause concurrent
access.

But these are actually quite wasteful as the key is always a single byte
so rather than use string keys are lazily memoize these, we can precompute
two static arrays of 255 elements once and for all.

Co-authored-by: Jean Boussier <[email protected]>
@eregon eregon added the third-party-issue An issue which should be fixed upstream label Jul 31, 2023
@eregon
Copy link
Member

eregon commented Jul 31, 2023

Fixed in sporkmonger/addressable#515 by @byroot

@eregon eregon closed this as completed Jul 31, 2023
@dentarg
Copy link

dentarg commented Aug 3, 2023

Fixed in sporkmonger/addressable#515

Released with Addressable 2.8.5

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
third-party-issue An issue which should be fixed upstream
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants