Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize feature loading when require is called with absolute .rb path #3276

Merged

Conversation

rwstauner
Copy link
Collaborator

@rwstauner rwstauner commented Sep 25, 2023

If require is called with an absolute path like /path/to/thing.rb we can try that as is first rather than trying .rb.rb and .rb.so first.

After enabling bootsnap's LoadPathCache on TruffleRuby we noticed that it was still doing a lot of ENOENT stat calls and found that it was trying odd-looking suffixes first:

$ bin/jt ruby --ruby.log-feature-location -e 'require "#{ENV["PWD"]}/tmp/rws.rb"' 2>&1 | grep rws.rb
  'require "#{ENV["PWD"]}/tmp/rws.rb"'
[ruby] INFO: starting search from -e:1 for feature /rws-data/repos/truffleruby-ws/truffleruby/tmp/rws.rb...
[ruby] INFO: trying /rws-data/repos/truffleruby-ws/truffleruby/tmp/rws.rb.rb...
[ruby] INFO: trying /rws-data/repos/truffleruby-ws/truffleruby/tmp/rws.rb.so...
[ruby] INFO: trying /rws-data/repos/truffleruby-ws/truffleruby/tmp/rws.rb...
[ruby] INFO: found in /rws-data/repos/truffleruby-ws/truffleruby/tmp/rws.rb
found rws.rb

With this change it only has to try one file (instead of three):

$ bin/jt ruby --ruby.log-feature-location -e 'require "#{ENV["PWD"]}/tmp/rws.rb"' 2>&1 | grep rws.rb
  'require "#{ENV["PWD"]}/tmp/rws.rb"'
[ruby] INFO: starting search from -e:1 for feature /rws-data/repos/truffleruby-ws/truffleruby/tmp/rws.rb...
[ruby] INFO: trying /rws-data/repos/truffleruby-ws/truffleruby/tmp/rws.rb...
[ruby] INFO: found in /rws-data/repos/truffleruby-ws/truffleruby/tmp/rws.rb
found rws.rb

I considered making this check "if the path has an extension", but since the next condition looks specifically for .so and does something extra I thought leaving it as .rb might be safest.

I tried to dig through the history of the file for extra context on the rest of this function.
The only one that stood to me was 8d10af1 but I don't fully understand the scenario there.

I also wasn't sure if there are tests that were worth adding for this... have any pointers?

What do you think? Does this make sense? Is there other opportunity for improvement to this function?

@oracle-contributor-agreement oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Sep 25, 2023
Copy link
Member

@eregon eregon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this makes a lot of sense.
Of course for user-written require calls it's quite rare they specify the .rb at the end, but for Bootsnap that's actually the common case.

Comment on lines 410 to 415
if (!triedWithoutExtension) {
final String withoutExtension = findFeatureWithExactPath(path);
if (withoutExtension != null) {
return withoutExtension;
}
}
Copy link
Member

@eregon eregon Sep 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this case is actually needed semantically, i.e. can require ever load anything which is not a .rb or a .so/bundle/dll?
It seems not:

$ echo 'p :a' > a
$ ruby -I. -e 'require "a"'
cannot load such file -- a (LoadError)

$ mv a a.rb
$ ruby -I. -e 'require "a"'
:a

There are specs for this kind of stuff in spec/ruby/core/kernel/shared/require.rb.
It would be good to add one for this case if there is not already one.

Given we already had this bug before, and there is no tag in spec/tags/core/kernel/require_tags.txt that sounds like it it seems there is no spec for this already, so could you add one, and then remove this part?

Copy link
Member

@eregon eregon Sep 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah but my example above uses $LOAD_PATH and findFeatureWithAndWithoutExtension() is not used for $LOAD_PATH search.
However the behavior is as I said above:

$ ruby -v --disable-gems -I. -e 'require "#{Dir.pwd}/a"'
ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux]
-e:1:in `require': cannot load such file -- /home/eregon/tmp/loaddir/a (LoadError)

$ ruby -v --disable-gems -I. -e 'require "#{Dir.pwd}/a"'
truffleruby 24.0.0-dev-54e96355, like ruby 3.2.2, GraalVM CE Native [x86_64-linux]
:a

i.e. there we are not compatible with CRuby, and this shows we don't need to try the path as-is if it doesn't end in .rb, .so or .bundle.
But we still need to check if it ends in .bundle.
So how about merging the first 2 ifs, or the opposite, make it a if .rb, if .so, if macOS && .bundle?

Copy link
Member

@eregon eregon Sep 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW I also noticed that files like a.rb.rb don't get loaded on CRuby by require "#{Dir.pwd}/a.rb", only by require "#{Dir.pwd}/a.rb.rb". So we could also structure this like:

if hasExtension(path)
  on macOS replace .so by .bundle first (translateIfNativePath does that)
  try as-is
else
  try + .rb
  try + .CEXT_EXTENSION
end

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, that makes sense to me!

I was curious if there was a way to test that the java code wasn't trying extra calls (like "a.rb.rb").

We have a spec for absolute paths with .rb
I added one for absolute path with .so on darwin
and 2 for absolute paths without extensions (rb and C).

Are we missing anything?

Copy link
Collaborator Author

@rwstauner rwstauner Sep 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like it doesn't like trying to load C-named files that are just text files:

Kernel.require (path resolution) loads c-extension file when passed absolute path without extension when no .rb is present ERROR
RuntimeError: /rws-data/repos/truffleruby-ws/truffleruby/spec/ruby/fixtures/code/a/load_fixture.so: file too short

I'm not surprised, but also not sure how these files have been used before.
Do we want to keep the test and inspect the error message (which will assert that we attempted to load the right file)? Or is there another way to write that test?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like CRuby doesn't pass those specs: https://github.com/oracle/truffleruby/actions/runs/6364893890/job/17281491472?pr=3276, maybe there is different behavior for require and load :/

Probably we should move the specs which differ behavior for both to :kernel_require instead of :kernel_require_basic.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given both CRuby and TruffleRuby error with a /file too short/ exception in that case it seems reasonable to check that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was curious if there was a way to test that the java code wasn't trying extra calls (like "a.rb.rb").

I'm not aware of one, that seems difficult to test. We could add a benchmark perharps but that would likely be noisy based on disk access, etc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved the specs around and fixed them and did some trivial optimization, now all should be green :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks for your help!

@rwstauner rwstauner force-pushed the rwstauner/load-absolute-rb-first branch from a95fbb8 to 0026103 Compare September 30, 2023 18:39
@eregon eregon force-pushed the rwstauner/load-absolute-rb-first branch from 0026103 to 89836bd Compare October 2, 2023 11:28
@eregon eregon added the in-ci The PR is being tested in CI. Do not push new commits. label Oct 2, 2023
@eregon eregon added this to the 24.0.0 Release (March 19, 2024) milestone Oct 2, 2023
@eregon eregon force-pushed the rwstauner/load-absolute-rb-first branch from 89836bd to 86305fa Compare October 2, 2023 11:43
@eregon eregon added the shopify label Oct 2, 2023
@eregon eregon self-assigned this Oct 2, 2023
If require is called with an absolute path like `/path/to/thing.rb`
we can try that as is first rather than trying `.rb.rb` and `.rb.so` first.

Add require specs for absolute paths:
- with no extension (RB ext and C ext)
- .so becomes .bundle on darwin
@eregon eregon force-pushed the rwstauner/load-absolute-rb-first branch from 86305fa to 721877f Compare October 2, 2023 13:52
@graalvmbot graalvmbot merged commit b2de494 into oracle:master Oct 2, 2023
14 checks passed
@rwstauner rwstauner deleted the rwstauner/load-absolute-rb-first branch October 3, 2023 03:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
in-ci The PR is being tested in CI. Do not push new commits. OCA Verified All contributors have signed the Oracle Contributor Agreement. shopify
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants