Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Try excluding social media from search results #692

Closed
wants to merge 6 commits into from

Conversation

JamieC24
Copy link
Contributor

This pull request is an intended solution to the Trello card about removing social media results from the site search

This code looks for strings containing the names of social media (e.g twitter, facebook, youtube, linkedin, instagram) and sets a is_social variable to true if so.
This then allows us to only display result cards when the is_social variable is false.
It also allows us to exclude social media results from the final reported number of search results.

As SOLR isn't enabled on staging and cannot be tested locally, this has not been tested on any environment. As such, anything you can spot that might blow up if it gets pushed to production would be welcomely reported!

Copy link
Contributor

@sean-dunwoody sean-dunwoody left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @JamieC24 ,

Thanks for submitting this (and the description that outlines the background to this).

A couple of things:

  1. We should ideally really make these changes at the Solr level. Have you talked with Gareth or Simon about potentially changing the index so that it doesn't index these items?
  2. If we have to implement it this way, it would be better to make these changes in the controller rather than the view, that way we don't have to run these checks twice. I believe the controller for this code is found here:
    public function results(Request $request): View

$is_social = true;
}
if($is_social === true) {
$number = $number -= 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a mixture of using the short and long ways of doing this. Either it should be:

$number = $number - 1;

Or:

$number -= 1;

I'm not 100% sure if the code would work correctly as it currently is (although I think it should do). Regardless of that, it should be changed to either of the above options.

See:
https://www.php.net/manual/en/language.operators.assignment.php#language.operators.assignment.arithmetic

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point! Good spot!

@JamieC24
Copy link
Contributor Author

Hi @JamieC24 ,

Thanks for submitting this (and the description that outlines the background to this).

A couple of things:

1. We should ideally really make these changes at the Solr level. Have you talked with Gareth or Simon about potentially changing the index so that it doesn't index these items?

2. If we have to implement it this way, it would be better to make these changes in the controller rather than the view, that way we don't have to run these checks twice. I believe the controller for this code is found here: https://github.com/FitzwilliamMuseum/fitz-main-website/blob/c373ff87a98912741eceffcaa58c598522c7e836/app/Http/Controllers/highlightsController.php#L128

Thanks for your notes on this!

  1. Simon did mention this would be better as a change on the Solr level, but he also said it would be quite a delicate process, and that a change like this would probably be best in the short term.
  2. I've just pushed a revised version of this commit that does the check in the highlights controller and prunes the result from the $records array if its url contains any reference to social media

@JamieC24 JamieC24 requested a review from sean-dunwoody June 19, 2024 14:12
@portableant
Copy link
Member

portableant commented Jun 19, 2024

If you call me I'll tell you how to achieve this more easily and remove the indexed content from solr or you can just follow this,. It's not very difficult to remove these data.

In the solr schema, contentType key has several social formats - twitter, sidecar and instagram - you can exclude these rather than doing str matching. To adjust the index there's several ways.

  1. You can remove the social content from the index via solr payload in the solr console (all documented in the mkdocs repo) - eg
{'delete': {'query': 'contentType:twitter'}}
  1. If you don't have access to the solr console on your IP, then write a call that can be run via tinker or cron - see

https://github.com/FitzwilliamMuseum/fitz-web-docs/blob/main/docs/websites/main-website/solr/search-index/force-updating-the-index.md for a list of these - try using shopifyrefresh as a template for doing that eg

https://github.com/FitzwilliamMuseum/fitz-main-website/blob/master/app/Http/Controllers/solrImportController.php#L519

To stop the indexing of the twitter data (instagram should have stopped ages ago and is commented out), remove this line or comment out:

https://github.com/FitzwilliamMuseum/fitz-main-website/blob/master/app/Console/Kernel.php#L68

These data were originally indexed to enable social content to be surfaced for exhibition pages and to dump these data to csv for monitoring.

@JamieC24 JamieC24 closed this Jun 20, 2024
@JamieC24 JamieC24 deleted the hotfix/search-socials branch June 20, 2024 09:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants