Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slow modeladmin views because of PagePermissionHelper.get_valid_parent_pages #19

Open
haplo opened this issue May 22, 2019 · 4 comments

Comments

@haplo
Copy link

haplo commented May 22, 2019

Issue Summary

I have been profiling the Wagtail admin views because my users experience some very slow load times (10-30 seconds). I implemented some big optimizations already (see wagtail/wagtail#5311 and wagtail/wagtail#5314), and when I profiled the index view I found out that a lot of the time goes into PagePermissionHelper.get_valid_parent_pages because it builds a massive query with hundreds or thousands of OR operators.

See the profiling for WMAView.dispatch:
profiler_get_valid_parent_pages

I propose a solution in the next message.

Steps to Reproduce

  1. Create a group with add, edit, publish permissions over hundreds of pages of the same content type.
  2. Add an user to the group.
  3. Log in as the user and go to that content type index page.
  4. Page load time should be higher than ideal.

I have confirmed that this issue can be reproduced as described on a fresh Wagtail project: no, I profiled over a local Wagtail 2.4 with a snapshot of production data.

Technical details

  • Python version: 3.7
  • Django version: 2.0.3
  • Wagtail version: 2.4
  • Browser version: Firefox 60.6.1esr (64-bit) on Fedora 29 (not relevant to this issue)
@haplo
Copy link
Author

haplo commented May 22, 2019

So the problem with PagePermissionHelper.get_valid_parent_pages lies in generating a queryset with hundreds or thousands of OR operators. There are currently three uses for this method:

  1. To see if there is at least one valid parent page (calling .exists() over the queryset)
  2. To get the total number of valid parent pages (calling .count() over the queryset)
  3. To actually get the valid parent pages for use in the page chooser view.

I propose getting rid of the queryset and instead generating and returning an iterator. That way code currently using exists can just check if there is at least one element, code using count() can just do len() and the page chooser can cast the iterator into a list if necessary.

If someone can give me the OK for this change I will work on the implementation right away.

@ababic
Copy link

ababic commented May 23, 2019

Hi @haplo,

Thank you for reporting, and for looking so deeply into the performance here. Anything you can do to help improve things would be greatly appreciated, thanks :)

I thought it might be helpful to first provide a little more information about the two main causes for get_valid_parent_pages() inefficiency (in case it's of any help to you or anyone else that stumbles across this issue):

  1. The number of potential parent pages is large and varied in type. This can usually be drastically improved by setting the parent_page_types attribute on the model being listed. Typically, ModelAdmin classes are used for things like events, articles, or other pages that typically live under a small number of listing pages, so setting parent_page_types usually makes a lot of sense (and will also result in a much more useful set of choices in the ChooseParentView when adding a new item)

  2. The user has many separate 'add' permissions in different parts of the page tree. Sometimes this is unavoidable or difficult to change in an established project, but this can also be helped by having separate groups for users with really wide-ranging permissions, rather than adding them to lots of groups with very specific permissions in different parts of the tree. (If efficiency in get_valid_parent_pages() can be improved anywhere, I'd suspect it would be here!)

@haplo
Copy link
Author

haplo commented May 29, 2019

Thank you for your response @ababic.

  1. The parent_page_types are set and limited, so I don't think that's the cause for the slowness. This is also supported by the profiling, as most of the time is gone in __or__ queryset chaining, which is done later in get_valid_parent_pages().

  2. It's true that some of my users have a lot of add permissions on pages and the situation could probably be simplified. I'm going to profile again after removing the permissions, see how big an impact that makes.

On an unrelated note, would you get someone to review wagtail/wagtail#5311 and wagtail/wagtail#5314? Those optimizations are ready to go and would be great to have them in the next release.

@ababic
Copy link

ababic commented Jun 1, 2019

@haplo. Just wanted to let you know: I raised the matter of getting your PRs reviewed in the last core team meeting, and someone will get to them soon. If you do manage to find a way to optimise this part of modeladmin, I'll glady review that for you personally.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants