Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add meta and query parameter to fetch patterns matching a list of allowed blocks #540

Merged
merged 5 commits into from
Jan 2, 2023

Conversation

ryelle
Copy link
Contributor

@ryelle ryelle commented Nov 16, 2022

See WordPress/gutenberg#45237

Another really important issue is that if we make any request to the PD, the eligible patterns to be inserted are subject to the allowedBlocks in the editor or a block. What this means is that we could fetch 10 patterns for example where none of them could be inserted.

This PR aims to add an API parameter allowed_blocks to the local endpoint[1] so that sites with restrictions can pass through the list of valid blocks, and the pattern directory will return only patterns with blocks on the list.

To do this, a new meta value is used to track the blocks in a pattern, wpop_contains_block_types. There's also a migration script which can be run on a sandbox to update all existing patterns, a hook to update the meta when a pattern is updated, and the meta is set on translated patterns when those are updated.

cc @ntsekouras


[1] Once merged, it should be available in api.w.org, but will need another Gutenberg PR before it can be used in the editor.

How to test the changes in this Pull Request:

Test the migration script

  1. You can run the script on your docker wp-env using the cli container.
  2. To dry-run over all posts, use:
    yarn wp-env run cli "php wp-content/plugins/pattern-directory/bin/update-contains-block-types.php --all --verbose"
  3. To apply the changes, use:
    yarn wp-env run cli "php wp-content/plugins/pattern-directory/bin/update-contains-block-types.php --all --verbose --apply"
  4. You can also run on just one post by using --post=[id] instead of --all
  5. The updated meta should be visible in the API, or via wp-cli yarn wp-env run cli "wp post meta list [ID]"
  6. Spot check some posts that the meta value reflects the post content

Test the pattern update hook

  1. Create or update a pattern, change the block types used
  2. Check the post meta
  3. The meta value should exist, and reflect the blocks used

Test the API endpoint

This requires some patterns having the meta set, which is why it's last.

Make a request to your site using a list of block types, ex:

http://localhost:8888/wp-json/wp/v2/wporg-pattern?locale=en_US&allowed_blocks[]=core/cover&allowed_blocks[]=core/heading&allowed_blocks[]=core/spacer&allowed_blocks[]=core/paragraph

This requests only patterns with core/cover, core/heading, core/spacer, core/paragraph. Any pattern with other blocks should not be returned in the list.

@ryelle ryelle added [Component] Pattern Directory API The pattern API on WordPress.org, and/or the CPT endpoint [Component] Pattern Directory The backend of the pattern directory: submission, management, etc labels Nov 16, 2022
@ryelle ryelle requested a review from tellyworth November 16, 2022 21:10
@ryelle ryelle self-assigned this Nov 16, 2022
Copy link

@ntsekouras ntsekouras left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for working on this @ryelle! It might be the last big missing piece for the integration with GB, so I'm really excited!

In the GB PR I have added support for this. How can we test this though and land this PR?

$block_names = wp_list_pluck( $all_blocks, 'blockName' );
$block_names = array_filter( $block_names );
$block_names = array_unique( $block_names );
sort( $block_names );

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it will help performance sorting the block names?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly this is mostly for devex, so it's easier to check the value against the expected value.

$all_blocks = _flatten_blocks( $blocks );

// Get the list of block names and convert it to a single string.
$block_names = wp_list_pluck( $all_blocks, 'blockName' );

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to remove the core/ prefix from the blocks and the reason is that we can end up with a quite big GET request that could exceed the limit. If every core block is allowed it seems we have 60 of them.

If in the future we add support in pattern directory for third party blocks, we can revisit and I don't think a solution would be that hard.

For now in my GB PR, I'm stripping the core/ prefix: WordPress/gutenberg@0b3d6e0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If every core block is allowed it seems we have 60 of them.

If every block is allowed, you could just not send the parameter, and we won't need to do any filtering.

If in the future we add support in pattern directory for third party blocks

Yeah, this is trying to be future-friendly, so the full block name is necessary in the meta (otherwise we'll need another script to update it later…).

we can end up with a quite big GET request that could exceed the limit.

I think we're safe for now with just core blocks - are you running into the limit in your PR?

Copy link

@ntsekouras ntsekouras Nov 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If every block is allowed, you could just not send the parameter, and we won't need to do any filtering.

That would be more convoluted as we would need to have operations in sync with the Blocks API(like parent, ancestor, etc..), so for now let's not do that. Even with having the above though, he problem remains if for example we have just one disallowed block.

I think we're safe for now with just core blocks - are you running into the limit in your PR?

With the prefix the length of allowed_blocks is 962 characters. It's quite long but I think we will be in the limit for now at least. Let's try that.

Yeah, this is trying to be future-friendly, so the full block name is necessary in the meta (otherwise we'll need another script to update it later…).

We could just modify the script to add the prefix in non-core blocks, but as I said above we can try with the prefix for now and let's see if we need something different in the future.

I have updated the GB PR to send the prefix.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be more convoluted as we would need to have operations in sync with the Blocks API(like parent, ancestor, etc..), so for now let's not do that.

Hm... I just realized that there are some nuances in this. What I'm sending you right now doesn't include the blocks for example that have a specific parent(ex List item and List), but your parsing of the content will probably have the List item as well, correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

your parsing of the content will probably have the List item as well, correct?

Yes, it collapses down all the blocks used in a pattern, for example: core/column,core/columns,core/group,core/heading,core/list,core/list-item,core/paragraph,core/separator,core/social-link,core/social-links

This comment was marked as outdated.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added the logic to include the parent/ancestor relationship and if all available core blocks are allowed to be inserted, I'm sending the all value as a string.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If every core block is allowed it seems we have 60 of them.

This is going to affect the cacheability of the API significantly if every API request sends a list of their installed blocks that we then filter by. Due to the Pattern APIs using WordPress to serve the API requests with nginx caching in-front, this could be a longer-term performance issue.

We should require/suggest strongly, at a minimum, that the allowed_blocks parameter should be sorted consistently to avoid two sites with different block orders getting different cache keys.

I can't see that clients could reasonably send a "I don't support this block" flag though, as it doesn't know what it doesn't know about.

Copy link
Member

@dd32 dd32 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As noted above, this is going to have some potentially significant performance bottlenecks.

At a high-level, this feels like something that potentially should be being filtered client-side, noting that would play havoc with the pagination.

The other aspect of this is that currently a Pattern that uses a WordPress 6.2 block might be returned to a 6.0 site, it would make sense to have the API differ in response in that case, but the client should still validate the pattern can be used by it.

$args['meta_query']['allowed_blocks'] = array(
'key' => 'wpop_contains_block_types',
'compare' => 'REGEXP',
'value' => '^((' . implode( '|', $allowed_blocks ) . '),?)+$',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't going to be amazing performance wise, but I'm not sure I can see a better way to achieve this, other than filtering post-query.

$all_blocks = _flatten_blocks( $blocks );

// Get the list of block names and convert it to a single string.
$block_names = wp_list_pluck( $all_blocks, 'blockName' );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If every core block is allowed it seems we have 60 of them.

This is going to affect the cacheability of the API significantly if every API request sends a list of their installed blocks that we then filter by. Due to the Pattern APIs using WordPress to serve the API requests with nginx caching in-front, this could be a longer-term performance issue.

We should require/suggest strongly, at a minimum, that the allowed_blocks parameter should be sorted consistently to avoid two sites with different block orders getting different cache keys.

I can't see that clients could reasonably send a "I don't support this block" flag though, as it doesn't know what it doesn't know about.

@ryelle
Copy link
Contributor Author

ryelle commented Nov 21, 2022

At a high-level, this feels like something that potentially should be being filtered client-side, noting that would play havoc with the pagination.

That's the issue— the editor could request X patterns, but if a site only supports a few blocks, once they're filtered out it will look like nothing's possible. The editor could continue to request additional pages of patterns to fill out the original query, but then it's running multiple API requests, and I think the experience of that would be slower than this approach.

The other aspect of this is that currently a Pattern that uses a WordPress 6.2 block might be returned to a 6.0 site, it would make sense to have the API differ in response in that case, but the client should still validate the pattern can be used by it.

Having the client validate blocks was suggested as part of the solution for compatibility issues here, the other part is manually setting a meta value on patterns for a minimum wordpress version (#536).

@dd32
Copy link
Member

dd32 commented Nov 22, 2022

At a high-level, this feels like something that potentially should be being filtered client-side, noting that would play havoc with the pagination.

That's the issue— the editor could request X patterns, but if a site only supports a few blocks, once they're filtered out it will look like nothing's possible. The editor could continue to request additional pages of patterns to fill out the original query, but then it's running multiple API requests, and I think the experience of that would be slower than this approach.

Yeah, it's a balancing act of where to have the performance impact.
Multiple cached responses would likely be more efficient than a single uncached response though.
The other aspect to consider is how many people does this actually affect? Multiple API requests for 0.5% of users is better than the other 99.5% getting a 50% cache hit rate or something. I don't personally see it being an issue if only 1 or 2 results can be used/displayed by the client before the next API request is fired.
There's also a limit in characters in a GET request - the path + query must be less than ~2,000 characters by spec, sure, that's 200+ blocks depending upon the implementation, but it's a limit that may eventually be run into. (I believe ~8k chars is the limit for api.wordpress.org before a HTTP 414 Request too large is hit, but you still have to consider Proxy servers/etc)

The other option is for Gutenberg to only include that parameter when it knows that a core block has been disabled. That could be done by having a static list of blocks that "should" exist in that Gutenberg version, and instead sending a exclude_blocks parameter.

@ntsekouras
Copy link

The other aspect to consider is how many people does this actually affect? Multiple API requests for 0.5% of users is better than the other 99.5% getting a 50% cache hit rate or something. I don't personally see it being an issue if only 1 or 2 results can be used/displayed by the client before the next API request is fired.

I actually had a conversation with @mcsf about this yesterday and he was asking the same thing and suggesting this. Besides the pagination handling, that in my mind seems 'chaotic' 😄 but could be done, I don't think we have this info. I'd say that most people have all blocks activated, but 🤷 . If that's the usual case though to support all blocks, I'm not sending the param in the request, so no problems there..

All the patterns operations are expensive(parsing, rendering) and adding the recursive validation and eligibility for insertion checks might take a lot of time - we might end up needing to implement and test these though..

The other option is for Gutenberg to only include that parameter when it knows that a core block has been disabled. That could be done by having a static list of blocks that "should" exist in that Gutenberg version, and instead sending a exclude_blocks parameter.

The workaround for this would be to support both include and exclude params and only send the one that contains less blocks. For example if only a couple of blocks are disabled, send them.

I think what we are missing right now is actually test the current approach against WordPress/gutenberg#45237 and see the performance impact. How can we do that?

@tellyworth
Copy link

tellyworth commented Nov 28, 2022

By way of challenging an assumption: is the client side definitely doing the right thing by not displaying disabled blocks to the user at all? Would it be a better experience all round if it included them in the list, but showed them greyed out/disabled/similar? That way the user won't go hunting through x pages for a block they know exists but can't find, and there's an opportunity for the UI to explain to the user why that block isn't there.

That would obviously simplify the server side by eliminating the need for a query parameter at all.

@ntsekouras
Copy link

is the client side definitely doing the right thing by not displaying disabled blocks to the user at all?

Yes, the Gutenberg side shouldn't change, as it would add more noise and be way more confusing to show any unavailable blocks and explain why. The current alternative we might explore is doing all the work client side and handle pagination and possible multi-requests there. We should test the current approach first though, since it's implemented in both projects - but not merged to neither yet.

@tellyworth
Copy link

Thanks Nik.

Let's merge this so we can test it out. If performance turns out to be an issue we can go with client-side handling.

@ryelle ryelle merged commit 7b94374 into trunk Jan 2, 2023
@ryelle ryelle deleted the update/endpoint-contains-blocks branch January 2, 2023 21:03
@ryelle
Copy link
Contributor Author

ryelle commented Jan 2, 2023

@ntsekouras This has been merged, so you can now query api.wordpress.org/patterns with allowed_blocks, ex:

https://api.wordpress.org/patterns/1.0/?allowed_blocks[]=core/heading&allowed_blocks[]=core/gallery

@ntsekouras
Copy link

Thanks @ryelle! 🎉

I'm testing the GB PR and it seems can fetch patterns with more blocks, if we also provide the search param, ex:

https://api.wordpress.org/patterns/1.0/?search=image&allowed_blocks[]=core/heading&allowed_blocks[]=core/gallery

Any insights about this?

@ryelle
Copy link
Contributor Author

ryelle commented Jan 9, 2023

Huh… I'm guessing it has to do with the fact that we query ElasticSearch for search results, and it probably doesn't know to pay attention to the allowed_blocks. Tracking this in #547.

ryelle added a commit that referenced this pull request Jan 9, 2023
If the `allowed_blocks` meta filter is set, we should also pass that through to the ElasticSearch filters.

See #540. Fixes #547.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Component] Pattern Directory API The pattern API on WordPress.org, and/or the CPT endpoint [Component] Pattern Directory The backend of the pattern directory: submission, management, etc
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants