Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performances of offsets references. #2481

Open
wants to merge 1 commit into
base: dev
Choose a base branch
from

Conversation

DarkaMaul
Copy link
Contributor

@DarkaMaul DarkaMaul commented Jun 11, 2024

The problem

When trying to fix the unused-import detector problem, I noticed that the detector was kind of slow.
The profiler showed that running it on the contracts-bedrock took about 22 seconds :

         105222726 function calls (104934133 primitive calls) in 22.206 seconds

   Ordered by: internal time
   List reduced from 7611 to 10 due to restriction <10>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    17924    3.495    0.000   11.096    0.001 slither_core.py:204(_compute_offsets_from_thing)
20003325/20003323    1.677    0.000    2.606    0.000 {built-in method builtins.isinstance}
  7061912    1.621    0.000    2.640    0.000 source_mapping.py:111(__eq__)
 10785792    1.302    0.000    1.888    0.000 source_mapping.py:101(__hash__)
     6064    0.988    0.000    1.585    0.000 data_dependency.py:390(transitive_close_dependencies)
14421547/14420087    0.780    0.000    0.781    0.000 {built-in method builtins.hash}
  7521219    0.754    0.000    1.072    0.000 naming.py:34(__eq__)
   221780    0.642    0.000    0.780    0.000 compilation_unit.py:227(filename_lookup)
  4616924    0.494    0.000    0.929    0.000 <frozen abc>:117(__instancecheck__)
  4616924    0.433    0.000    0.435    0.000 {built-in method _abc._abc_instancecheck}

The main culprit here is the __compute_offsets_from_thing method.

I noticed that the data is stored at each offset for every mapping in slither_core:

        self._offset_to_objects: Optional[Dict[Filename, Dict[int, Set[SourceMapping]]]] = None
        self._offset_to_references: Optional[Dict[Filename, Dict[int, Set[Source]]]] = None
        self._offset_to_implementations: Optional[Dict[Filename, Dict[int, Set[Source]]]] = None
        self._offset_to_definitions: Optional[Dict[Filename, Dict[int, Set[Source]]]] = None

It was also running all checks within the loop, even if they did not depend on the loop offset.

The solution

  • I deduplicated the results in the offset_to_* mappings and created a new mapping offset_to_min_offset that contains the referenced min_offset. This one is only from int to Set[int], so it's reasonably compact and fast.
  • I factorized the logic to retrieve informations from these mappings in a new method _get_offset
  • I simplified the SourceMapping equality function to reduce its complexity

The results

         52232380 function calls (51943791 primitive calls) in 14.138 seconds

   Ordered by: internal time
   List reduced from 7612 to 10 due to restriction <10>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
9610218/9610216    1.084    0.000    1.851    0.000 {built-in method builtins.isinstance}
     6064    1.030    0.000    1.647    0.000 data_dependency.py:390(transitive_close_dependencies)
   221780    0.674    0.000    0.819    0.000 compilation_unit.py:227(filename_lookup)
    17924    0.580    0.000    1.685    0.000 slither_core.py:199(_compute_offsets_from_thing)
   108183    0.421    0.000    1.257    0.000 crytic_compile.py:265(filename_lookup)
  3734101    0.406    0.000    0.767    0.000 <frozen abc>:117(__instancecheck__)
  3734101    0.359    0.000    0.361    0.000 {built-in method _abc._abc_instancecheck}
    11612    0.355    0.000    0.674    0.000 data_dependency.py:491(convert_to_non_ssa)
    12265    0.304    0.000    0.315    0.000 node.py:117(__init__)
    34918    0.257    0.000    0.533    0.000 ssa.py:367(is_used_later)

We are going down from 22 seconds to about 14s so about 35% performance improvement.

Summary by CodeRabbit

  • New Features

    • Enhanced offset mapping capabilities in the core functionality.
  • Refactor

    • Improved internal logic for handling offsets and references.
  • Bug Fixes

    • Fixed equality checks in source mapping to use relative filenames for more accurate comparisons.

Copy link

coderabbitai bot commented Jun 11, 2024

Walkthrough

Walkthrough

The recent changes enhance the SlitherCore class by introducing a new _offset_to_min_offset attribute and a generic _get_offset method. These modifications streamline the handling of offset mappings and improve the functionality of methods like offset_to_references, offset_to_implementations, and offset_to_definitions. Additionally, the __eq__ method in source_mapping.py now compares filenames using their relative property for more accurate equality checks.

Changes

File(s) Change Summary
slither/core/slither_core.py Added TypeVar to imports, introduced _offset_to_min_offset attribute, modified offset handling logic, added _get_offset method, and updated several methods to use _get_offset.
slither/core/source_mapping/source_mapping.py Modified the __eq__ method to compare filenames using the relative property and removed other attribute comparisons.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant SlitherCore
    participant SourceMapping

    User->>SlitherCore: Call offset_to_objects()
    SlitherCore->>SlitherCore: _get_offset()
    SlitherCore-->>User: Return Set[SourceMapping]

    User->>SlitherCore: Call offset_to_references()
    SlitherCore->>SlitherCore: _get_offset()
    SlitherCore-->>User: Return Set[SourceMapping]

    User->>SlitherCore: Call offset_to_implementations()
    SlitherCore->>SlitherCore: _get_offset()
    SlitherCore-->>User: Return Set[SourceMapping]

    User->>SlitherCore: Call offset_to_definitions()
    SlitherCore->>SlitherCore: _get_offset()
    SlitherCore-->>User: Return Set[SourceMapping]
Loading

This diagram visualizes the interaction between the user and the SlitherCore class, highlighting how the _get_offset method is utilized in various offset-related methods.


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share
Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 02df0dc and ecca8da.

Files selected for processing (2)
  • slither/core/slither_core.py (4 hunks)
  • slither/core/source_mapping/source_mapping.py (1 hunks)
Additional context used
Ruff
slither/core/source_mapping/source_mapping.py

168-168: Ambiguous variable name: l (E741)


170-170: Ambiguous variable name: l (E741)

slither/core/slither_core.py

516-517: Use a single if statement instead of nested if statements (SIM102)

Additional comments not posted (1)
slither/core/slither_core.py (1)

91-91: The addition of new attributes and methods for handling offset mappings in the SlitherCore class is well-implemented and should contribute to the performance improvements as described in the PR.

Also applies to: 204-255, 313-365

Verification successful

The new attributes and methods for handling offset mappings in the SlitherCore class are properly integrated into the existing codebase. The search results show that _get_offset and _offset_to_min_offset are used in various parts of the slither_core.py file, indicating proper usage and integration.

  • slither/core/slither_core.py: Multiple lines show the usage of _offset_to_min_offset and _get_offset.
Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify integration of new offset handling methods with existing functionalities.

# Test: Search for usage of new methods and attributes. Expect: Proper integration without issues.
rg --type python $'_get_offset|_offset_to_min_offset'

Length of output: 85



Script:

#!/bin/bash
# Description: Verify integration of new offset handling methods with existing functionalities.

# Test: Search for usage of new methods and attributes. Expect: Proper integration without issues.
rg '_get_offset|_offset_to_min_offset'

Length of output: 1214

@@ -112,12 +112,8 @@ def __eq__(self, other: Any) -> bool:
try:
return (
self.start == other.start
and self.length == other.length
and self.filename == other.filename
and self.filename.relative == other.filename.relative
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The simplification of the __eq__ method in the Source class looks good and aligns with the PR's goal to improve performance.

Would you like me to help create unit tests to ensure this new behavior works as expected?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant