Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only query the requested path in the file API #1868

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

mvanderkamp
Copy link

This is to follow up on discussion #1866

Currently this is still just the first step as discussed, but I figured a PR is a better place to look at and discuss actual code than a discussion post.

I basically just extracted the relevant lines from the body of TreeInfo. I did also leave in a call to TreeInfo for the if empty(path) case though, since it seemed to me like the commit could still be either a regular hash or a stage number in that case, which get the ftime in different ways.

return [-1, '000000', '', '', -1]
endif
let newftime = getftime(fugitive#Find('.git/index', dir))
let [info, filename] = split(lines[0], "\t")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filename here might be the input path, or it might look like path/child, because the original input was a directory name. In the latter case, we need to instead return [newftime, '040000', 'tree', '', 0]. Basically, we're replacing this functionality:

while filename =~# '/'
let filename = substitute(filename, '/[^/]*$', '', '')
let s:indexes[a:dir][1][stage][filename] = [newftime, '040000', 'tree', '', 0]
endwhile

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! And it looks like it giving a directory can lead to lots of results as all files under that directory are listed. Hmm. For now I'll try the assumption that if ls-files returns more than one line for a path, that path represents a directory. I can't get ls-files to report on an empty directory though so it might work, though I don't know enough yet to be confident that this assumption holds.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't stage an empty directory. But you can stage a directory containing exactly one file. So "more than one line" is insufficient.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ach, sorry! Somewhere in there while tinkering with the output of ls-files and ls-tree I somehow convinced myself that the directory would also be listed by ls-files as well as its contents. I'll take another crack at it.

let [_, ftime] = s:TreeInfo(dir, commit)
let entry = [ftime, '040000', 'tree', '', -1]
elseif commit =~# '^:\=[0-3]$'
let [lines, exec_error] = s:LinesError([dir, 'ls-files', '--stage', '--', path])
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let result = fugitive#Execute(['--literal-pathspecs', 'ls-files', '--stage', '--', path])
let line = result.stdout[0]
if result.exit_status || empty(line)

The --literal-pathspecs is a necessary change. The rest is just modernization.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@mvanderkamp
Copy link
Author

Okay I've pushed a new attempt that just reuses the while-loop approach from TreeInfo since that was already doing the right thing. I thought about avoidng the while-loop by doing something with comparing the filename returned by ls-files against the requested path, but I figured the loop is actually fairly insignificant if run on just one line so may as well use it since it already does the right thing.

Repository owner deleted a comment Oct 19, 2021
Repository owner deleted a comment Oct 19, 2021
Repository owner deleted a comment Oct 19, 2021
@tpope
Copy link
Owner

tpope commented Oct 19, 2021

You're building a whole data structure just to pluck one value out of it and throw the rest away. This is not a good approach, least of all from a code readability perspective. You should need little more than if filename ==# path.

@mvanderkamp
Copy link
Author

Good point, thanks! I was overthinking things.

@mvanderkamp
Copy link
Author

This is going to return the wrong information during merge conflicts, since it will always grab the first line, regardless of which stage was actually requested. Is there a way to only request one stage? I can't find it, and if not, we'll have to loop over the lines. I might take a stab at adding some caching back in at the same time.

@tpope
Copy link
Owner

tpope commented Oct 24, 2021

Good catch. I don't see anything in git-ls-files(1) about selecting a particular stage. But iterating seems straightforward.

@mvanderkamp
Copy link
Author

Okay, how does that look? Is the caching fine? In the original dicussion you suggested a more granular caching approach would be appropriate. I took that to mean using the repo and path as the cache key, giving a flatter structure overall.

It's a bit awkward having to fill in all four stages in the cache for tree objects- is there a better way to handle that?

@tpope
Copy link
Owner

tpope commented Oct 31, 2021

Okay, how does that look? Is the caching fine? In the original dicussion you suggested a more granular caching approach would be appropriate. I took that to mean using the repo and path as the cache key, giving a flatter structure overall.

At least throw a // in the key. Don't create a bogus path like /repo/.git/dir/file that looks real. There is also potential for collisions here.

How is performance without caching? Caching was basically mandatory back when we used system(), but now that we're on jobs I would be open to ditching it.

It's a bit awkward having to fill in all four stages in the cache for tree objects- is there a better way to handle that?

Supporting 0 only is probably good enough. This whole fake tree object thing is a crude workaround for a broken abstraction anyways.

@mvanderkamp
Copy link
Author

mvanderkamp commented Nov 1, 2021

Fetching an uncached entry:

count  total (s)   self (s)  function
    1   0.007091   0.000342  <SNR>130_IndexInfo()

Cached:

count  total (s)   self (s)  function
    1   0.001120   0.000068  <SNR>130_IndexInfo()

So roughly 6x to 7x faster. I don't have the experience to know what's generally acceptable for plugins. Personally I think 7 extra milliseconds isn't a big deal, so I'd honestly be fine with ditching the caching after all given the increased risk of bugs.

@tpope
Copy link
Owner

tpope commented Nov 1, 2021

Remind me, why doesn't it apply to :Gdiffsplit outside of a merge conflict? Edit: Oh, you edited that away, I take it you changed your mind.

@mvanderkamp
Copy link
Author

mvanderkamp commented Nov 2, 2021

Sorry I deleted that part of the message. I wrote it down and then realized it sounded weird, tested it, found that actually it was just a bug in the my caching implementation, then somehow forgot to delete that part of the post.

@tpope
Copy link
Owner

tpope commented Nov 5, 2021

7ms is nigh imperceptible on its own, but that's going to be multiplied by number of checks, which in turn will be multiplied by levels of nesting. This can get out of hand quick. Even with caching this could amount to a pretty big setback for smaller repos.

The Projectionist stuff isn't super high value and can even be confusing in some cases. I'm starting to get tempted to change it to opt-in rather than opt-out.

@mvanderkamp
Copy link
Author

Sorry for taking a while to get back to this! That makes sense that it wouldn't just be 7ms, I forgot that this is going to be hit repeatedly. That makes it sound like caching will be crucial to minimize the cost.

Here's what I've changed:

  • I used the // separator as you suggested for the cache key, and also added a cache:// prefix to really make it clear this isn't a normal path.
  • Dropped support for stages 1-3 for tree objects.
  • Fixed the bug that prevented this from working correctly on index files outside of merge conflicts.

Turning this functionality off by default seems reasonable to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants