Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ruby: Query for database calls in a loop #18304

Merged
merged 12 commits into from
Feb 17, 2025
Merged

Conversation

yoff
Copy link
Contributor

@yoff yoff commented Dec 17, 2024

Add query for hoisting Rails ActiveRecord calls.
It uses ActiveRecordModelFinderCall to identify relevant calls and this PR is making that class public. Please let me know if that is not acceptable.

Possible Improvements;

  • We could consider Associations.
    Associations are lazy-loading by default, so something like
    in a loop over article do
    article.book
    if you have 1000 articles it will do a 1000 calls to book.
    If you already did article includes book, there should be no problem.
  • We could also consider instances of ActiveRecordInstanceMethodCall, for instance
    calls to pluck. (An earlier version added exactly calls to pluck which added exactly 1 result on gh/gh; currently the query has 195 results, so the complication did not seem worth it.)

The query comes with a .qhelp and autofixes have been validated on a small set (~15) of alerts.

This query is intended for CCR, so I am not sure if we want a change note.

@yoff yoff force-pushed the ruby/performance-queries branch from aa8751c to 5d3a541 Compare January 14, 2025 16:27
@yoff yoff force-pushed the ruby/performance-queries branch 2 times, most recently from 489bd2a to 8e36d3f Compare February 5, 2025 15:32
Copy link
Contributor

github-actions bot commented Feb 5, 2025

QHelp previews:

ruby/ql/src/queries/performance/DatabaseQueryInLoop.qhelp

Database query in a loop

When a database query operation, for example a call to a query method in the Rails `ActiveRecord::Relation` class, is executed in a loop, this can lead to a performance issue known as an "n+1 query problem". The database query will be executed in each iteration of the loop. Performance can usually be improved by performing a single database query outside of a loop, which retrieves all the required objects in a single operation.

Recommendation

If possible, pull the database query out of the loop and rewrite it to retrieve all the required objects. This replaces multiple database operations with a single one.

Example

The following (suboptimal) example code queries the User object in each iteration of the loop:

repo_names_by_owner.map do |owner_slug, repo_names|
    owner_id, owner_type = User.where(login: owner_slug).pluck(:id, :type).first
    owner_type = owner_type == "User" ? "USER" : "ORGANIZATION"
    rel_conditions = { owner_id: owner_id, name: repo_names }

    nwo_rel = nwo_rel.or(RepositorySecurityCenterConfig.where(rel_conditions)) unless neg
    nwo_rel = nwo_rel.and(RepositorySecurityCenterConfig.where.not(rel_conditions)) if neg
  end

To improve the performance, we instead query the User object once outside the loop, gathering all necessary information in a single query:

# Preload User data
user_data = User.where(login: repo_names_by_owner.keys).pluck(:login, :id, :type).to_h do |login, id, type|
  [login, { id: id, type: type == "User" ? "USER" : "ORGANIZATION" }]
end

repo_names_by_owner.each do |owner_slug, repo_names|
  owner_info = user_data[owner_slug]
  owner_id = owner_info[:id]
  owner_type = owner_info[:type]
  rel_conditions = { owner_id: owner_id, name: repo_names }

  nwo_rel = nwo_rel.or(RepositorySecurityCenterConfig.where(rel_conditions)) unless neg
  nwo_rel = nwo_rel.and(RepositorySecurityCenterConfig.where.not(rel_conditions)) if neg
end

This does not take assicoations into account.
It uses ActiveRecordModelFinderCall to identify relevant calls.
This class has therefor been made public.
@yoff yoff force-pushed the ruby/performance-queries branch from 8369852 to 5feb401 Compare February 5, 2025 15:48
@yoff yoff marked this pull request as ready for review February 5, 2025 16:03
@yoff yoff requested a review from a team as a code owner February 5, 2025 16:03
@yoff yoff added the no-change-note-required This PR does not need a change note label Feb 5, 2025
Copy link
Collaborator

@adityasharad adityasharad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for putting this together - the general direction looks reasonable and I like the simple statement of the problem. Suggestions mainly involve documenting the logic, adding tests, and clarifying the help text for both readers and autofix.

I also recommend a PR title and description stating exactly what you are adding, so we can easily find and understand it later.
I agree you don't need a change note yet.

ruby/ql/src/queries/performance/CouldBeHoisted.ql Outdated Show resolved Hide resolved
* @tags performance
*/

// Possible Improvements;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's file internal issues for both ideas.

ruby/ql/src/queries/performance/CouldBeHoisted.ql Outdated Show resolved Hide resolved
ruby/ql/src/queries/performance/CouldBeHoisted.ql Outdated Show resolved Hide resolved
ruby/ql/src/queries/performance/CouldBeHoisted.ql Outdated Show resolved Hide resolved
ruby/ql/src/queries/performance/CouldBeHoisted.ql Outdated Show resolved Hide resolved
ruby/ql/src/queries/performance/CouldBeHoisted.ql Outdated Show resolved Hide resolved
ruby/ql/src/queries/performance/CouldBeHoisted.qhelp Outdated Show resolved Hide resolved
ruby/ql/src/queries/performance/CouldBeHoisted.ql Outdated Show resolved Hide resolved
ruby/ql/src/queries/performance/CouldBeHoisted.ql Outdated Show resolved Hide resolved
@yoff yoff requested a review from adityasharad February 7, 2025 22:47
@adityasharad adityasharad changed the title Ruby/performance queries Ruby: Query for database calls in a loop Feb 7, 2025
adityasharad
adityasharad previously approved these changes Feb 7, 2025
Copy link
Collaborator

@adityasharad adityasharad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, ready to merge when DCA is happy.


class DatabaseQueryInLoopTest
def test
### These are bad
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we please use InlineExpectationsTestQuery.ql instead, like e.g. https://github.com/github/codeql/blob/main/ruby/ql/test/query-tests/security/cwe-022/PathInjection.qlref?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooh, yes!

/** Gets the name of a built-in method that involves a loop operation. */
string getALoopMethodName() {
result in [
"each", "reverse_each", "map", "map!", "foreach", "flat_map", "in_batches", "one?", "all?",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should in_batches be excluded since it seems to signal some reflection on looping.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do this later.

@yoff yoff requested a review from hvitved February 11, 2025 21:33
Copy link
Collaborator

@adityasharad adityasharad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hvitved are you happy with test changes (Qlucie run seemed to pass, although I was surprised we didn't have to change any of the inline comments)?

@yoff
Copy link
Contributor Author

yoff commented Feb 17, 2025

@hvitved are you happy with test changes (Qlucie run seemed to pass, although I was surprised we didn't have to change any of the inline comments)?

Inline comments are added. The existing comments do not start with # $ so they are only read by humans :-)

@yoff yoff merged commit 4b53e1c into github:main Feb 17, 2025
23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation no-change-note-required This PR does not need a change note Ruby
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants