☂️ Search: improve Zoekt indexing #58133
Comments
meta: Nice. Seeing this makes me also want to start using tracking issues for sprints of work. |
Here are some profiling results for CPU
Takeaways:
Memory allocations
Takeaways:
Peak memory usage |
Here's a profile of memory usage after fixing some obvious issues, taken right after we finished building the 10th shard (out of ~20). Peak memory usage
Takeaways:
|
If you want to experiment with removing go-git, or at least avoiding it for the heavy lifting, you can see a few experiments I did in sourcegraph/zoekt#424. This was me a while ago experimenting with ideas for getting data off gitserver more efficiently for searching/indexing. |
Documenting the results of profiling universal-ctags versus scip-ctags.
Peak memory usage
Processing time
Takeaway: currently, the main benefit of scip-ctags is its superior symbol quality, not its resource usage. |
There is definitely more we can do here, but I'm closing this out as a "completed" round of work. Highlights:
Will file follow-up issues about better observability in case of OOMs and about trying GOMEMLIMIT. |
Here's a rough formula for calculating the peak memory usage of Zoekt indexserver:
Total: ~400MB * (num_threads) + 2GB |
Zoekt can sometimes fail to index large repos because of timeouts or memory issues. This can result in missing or out-of-date search results. There’s also little visibility into the indexing process: we don't report progress or surface errors clearly, and we don't have good observability tools for debugging problems. This issue tracks a round of improvements we want to make to search indexing.
Indexing performance
Indexing observability
Squash bugs
/cc @sourcegraph/search-platform