Fix dropped watches #47482

MarcCelani-at · 2022-01-17T22:28:40Z

On unix systems, fs.watch places a watch on an inode, not a file path. If a given file is assigned a new inode, the watch is no longer valid and any changes to that file will not generate a watch for TypeScript to consume. This is quite common when using standard tools like vim and git.

This changes the watch handling code to always recreate the watch any time a rename event occurs and the file still exists. In this case, it is very likely that the inode changed (at least in my experience), so its a simple solution to the problem and I would guess it has low overhead. This patch eliminates the most common cause of this bug. There are still race conditions after this PR if someone is editing a file in the background at the exact same time that tsc is setting the watch for the first time. To fix that, setting a watch should be done as follows:

stat the file
create the watch with fs.watch
stat the file and make sure that the inode has not changed. If it hasn't, the watch is active.

IMO the TypeScript team should seriously consider just using chokidar. In addition to avoiding bugs like these (both the one fixed here and the one not fixed and described above), it will also provide much better performance on MacOS because it uses the native fsevents system calls.

jakebailey · 2022-01-19T05:18:22Z

Just to comment on the quote below (and not the PR, which itself seems fine; someone else on the team may be better to review it than me):

IMO the TypeScript team should seriously consider just using chokidar

FWIW it's unfortunately more complicated than that; tsc/tsserver make use of a number of different watching methods for different types of files and directories (which makes it difficult to consider switching to something else with different tradeoffs), and adding a third party watcher would introduce a dependency to the typescript package (currently, it has none).

Also of interest may be #43790 (and its linked issues), and that VS Code is actually removing all of its watchers in favor of @parcel/watcher.

andrewbranch · 2022-01-20T23:12:02Z

I would guess it has low overhead

Have you compared the time between the git checkout and the end of updateGraph in TS Server logs before and after?

MarcCelani-at · 2022-01-21T00:22:35Z

I should have been more clear...Re-adding the watch and processing files we are supposed to process comes with cost. We should not expect fixing this bug comes at zero cost.

I think what you are more interested in is what is the overhead of resetting the watch. Setting a watch on every file in our mono repo (over 10k files) takes less than a second, so I don't think we are talking about anything that will have serious performance implications here.

That said, this solution can potentially be optimized by stating the file and checking the inode rather than blindly resetting the watch on every rename event. That's a bit more of a complicated change and I doubt it's worth it given the low cost of setting a watch and that stating the file will introduce new cost as well.

sheetalkamat

This is incorrect change as this is switching the watcher to "Polling mode" watch instead of relying on file system events which is what it is supposed to do. Polling watch is always going to be accurate but it is CPU intensive and hence less desirable for directories and is only used when there is error..
Note that rename event happens whenever a file is added or removed to the folder

If you share your logs, i will be able to investigate the issue and propose the next steps

Eg when watching temp folder, adding file temp/something.js will invoke rename event

MarcCelani-at · 2022-03-25T22:20:59Z

@sheetalkamat I'm confused by your comment. When I think of poll, I imagine periodically rechecking the state of the file system rather than waiting for a callback from the OS. Here, we are waiting on a callback from the OS, but we are resetting the watch when the callback fires because it is very likely to be invalid on Unix OS.

When I use polling on our large monorepo, there is a noticeable overhead to it. I see no measurable overhead to resetting the watch when a file changes.

MarcCelani-at · 2022-03-25T22:25:50Z

I'm not sure what else I can provide here to help you debug. I detailed two very reliable repros on MacOS in the issue description and comments. The one in the comments involves only using VSCode and vim in a separate terminal. But happy to go over some logs over VC if that will move this forward. We are having to maintain this patch across TS upgrades which is a pain.

MarcCelani-at · 2022-03-25T22:36:32Z

Also, one final clarification - the rename events occur (at least on macOS) for all sorts of reasons, including when the file changes. It does not only fire for adding/removing a file from the directory.

jakebailey · 2022-03-28T17:39:28Z

I'm relatively new to the watch code, but I wanted to write out my understanding of the original code and this PR in case it clears things up (so do correct me if I'm wrong).

It seems like watchPresentFileSystemEntry will call fs.watch and only fall back to watchPresentFileSystemEntryWithFsWatchFile when needed. That function is documented in some comments as being based on polling (the function name is super long so I'm just going to call it "the polling watcher" from now on).

The original big long condition in callbackChangingToMissingFileSystemEntry is true when the there's a rename event and the file no longer exists, which then leads to the watcher being turned into a polling watcher that waits until an event fires, then recreates the watcher with watchPresentFileSystemEntry again.

The PR changes the condition to, on a rename, if it doesn't exist, do the same thing as above, but if it does exist, the watcher is recreated assuming that a rename event is likely to mean that the folder points somewhere else on disk via a different inode (as documented in the PR description), but is done so by calling that polling watcher creation function.

That latter point I think is the contentious one; I'd think that if we know (or very likely know) that the watcher is now bad, that it would need to be recreated via watchPresentFileSystemEntry in order to have it potentially pick the non-polling watcher again (as it would have originally).

But, it also feels like it should be expensive to be recreating directory watches (which are potentially recursive) over and over again. Maybe that's why the PR uses polling in the first place, because those are effectively "free" given they defer work until some later poll as opposed to asking the system to do some bookkeeping on FS events?

Maybe ideally, it'd be nice if there was a cheap way to just get the inode number of a directory and compare it without doing the work to recreate a watcher. But, this feels like something that node's watcher should really be handling, as the API is based on paths. It sure would be a surprise for watchers to just totally stop working depending on the details of what some program does to rename or replace a directory/file with other contents or something.

sheetalkamat · 2022-03-28T19:21:46Z

The PR changes the condition to, on a rename, if it doesn't exist, do the same thing as above, but if it does exist, the watcher is recreated assuming that a rename event is likely to mean that the folder points somewhere else on disk via a different inode (as documented in the PR description), but is done so by calling that polling watcher creation function.

I think issue here is the number of times the watch will be recreated (if not using polling). Using polling watch on any change in directory is not ideal either. We already have option to allow watchDirectory to use Polling https://www.typescriptlang.org/tsconfig#watch-watchDirectory so i don't think this makes optimal change for general use.

I will need to debug and experiment a little to see what better approach will be here.

MarcCelani-at · 2022-03-30T23:44:20Z

ahh ok I think I understand the problem. It's this line:

watcher = watchPresentFileSystemEntryWithFsWatchFile();

Should that just be

watcher = watchPresentFileSystemEntry();

?

I can test this to make sure that the repro steps I outlined in the attached Issue are fixed and update this PR if we think that would resolve the concern here.

jakebailey · 2022-03-30T23:48:41Z

I think that addresses my commentary on the original "it's now going to poll which is wrong" comment, but as mentioned above, recreating a non-polling watcher may get expensive. It'd be worth testing (as it's not a huge change), but I can't claim that's "the thing" that would make the PR correct, compared to some other mechanism to address this problem.

MarcCelani-at · 2022-03-30T23:59:43Z

Well, given that inode changes require a new watch, I see three possible paths:

Depend fully on recursive watches. This is theoretically most scalable and efficient if it is trustworthy, but we don't seem to trust it on all systems or haven't written the code to do it by default.
Stat the file on rename and check for inode changes. If renames typically correspond with inode changes, this stat is only adding more cost.
Change the watch on any rename (the intention of this PR). Some watch recreations may be wasteful but we avoid unnecessary file stats.

I'm happy to do either 2 or 3 here.

sheetalkamat · 2022-03-31T19:31:22Z

Approach 3 is definitely not a generic solution we are looking for.. Just like how we have heuristics for checking directory is still present or not, we would use some kind of heuristics but that needs some experimenting and looking at the events to come up with a plan.. I am going to spend some time on that early next week and will get back to you.

MarcCelani-at · 2022-04-11T19:32:53Z

I have completed testing the change to use watcher = watchPresentFileSystemEntry();. Interestingly, I have found this to be more reliable than the previous commit (contrary to the belief that "Polling watch is always going to be accurate"). I have a very reliable repro in my monorepo that causes incorrect diagnostics with upstream VSCode as well as with watcher = watchPresentFileSystemEntryWithFsWatchFile();, but does not repro with watchPresentFileSystemEntry. I wonder if the polling watch code is prone to race conditions when files are changing rapidly (as is the case in a git checkout). At any rate, I know @sheetalkamat is still deciding on a long term solution here, but I will update the PR to use watcher = watchPresentFileSystemEntry() in case anyone wants to apply this patch to typescript to resolve race conditions.

sheetalkamat · 2022-05-06T19:33:53Z

Closing in favor of #48997

Fix dropped watches

f5f8740

typescript-bot added the For Uncommitted Bug PR for untriaged, rejected, closed or missing bug label Jan 17, 2022

DanielRosenwasser requested review from amcasey, andrewbranch, jakebailey and RyanCavanaugh January 19, 2022 23:09

RyanCavanaugh assigned andrewbranch Jan 20, 2022

andrewbranch mentioned this pull request Jan 21, 2022

Persistent incorrect semantic diagnostic after large rebase or checkout: File is not listed within the file list of project #47274

Closed

andrewbranch requested a review from sheetalkamat March 17, 2022 22:56

typescript-bot added For Backlog Bug PRs that fix a backlog bug and removed For Uncommitted Bug PR for untriaged, rejected, closed or missing bug labels Mar 17, 2022

sheetalkamat requested changes Mar 24, 2022

View reviewed changes

MarcCelani-at requested a review from sheetalkamat March 27, 2022 01:50

watchPresentFileSystemEntry

b61738b

sheetalkamat mentioned this pull request May 6, 2022

Always recreate the file watcher when rename event occurs #48997

Merged

sheetalkamat closed this May 6, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix dropped watches #47482

Fix dropped watches #47482

MarcCelani-at commented Jan 17, 2022

jakebailey commented Jan 19, 2022

andrewbranch commented Jan 20, 2022

MarcCelani-at commented Jan 21, 2022

sheetalkamat left a comment •

edited

Loading

MarcCelani-at commented Mar 25, 2022

MarcCelani-at commented Mar 25, 2022 •

edited

Loading

MarcCelani-at commented Mar 25, 2022

jakebailey commented Mar 28, 2022 •

edited

Loading

sheetalkamat commented Mar 28, 2022

MarcCelani-at commented Mar 30, 2022

jakebailey commented Mar 30, 2022

MarcCelani-at commented Mar 30, 2022

sheetalkamat commented Mar 31, 2022

MarcCelani-at commented Apr 11, 2022

sheetalkamat commented May 6, 2022

Fix dropped watches #47482

Fix dropped watches #47482

Conversation

MarcCelani-at commented Jan 17, 2022

jakebailey commented Jan 19, 2022

andrewbranch commented Jan 20, 2022

MarcCelani-at commented Jan 21, 2022

sheetalkamat left a comment • edited Loading

Choose a reason for hiding this comment

MarcCelani-at commented Mar 25, 2022

MarcCelani-at commented Mar 25, 2022 • edited Loading

MarcCelani-at commented Mar 25, 2022

jakebailey commented Mar 28, 2022 • edited Loading

sheetalkamat commented Mar 28, 2022

MarcCelani-at commented Mar 30, 2022

jakebailey commented Mar 30, 2022

MarcCelani-at commented Mar 30, 2022

sheetalkamat commented Mar 31, 2022

MarcCelani-at commented Apr 11, 2022

sheetalkamat commented May 6, 2022

sheetalkamat left a comment •

edited

Loading

MarcCelani-at commented Mar 25, 2022 •

edited

Loading

jakebailey commented Mar 28, 2022 •

edited

Loading