Make it easier to search for old posts #104

Closed
punchagan opened this issue Nov 14, 2015 · 13 comments

@punchagan
Member

punchagan commented Nov 14, 2015

Add full text search. Initial work here

@punchagan changed the title from "Make it easier to search for old posts by authors" to "Make it easier to search for old posts" on Nov 14, 2015
@punchagan punchagan mentioned this issue Nov 14, 2015
@qwo
Member

qwo commented Feb 2, 2017

@punchagan
From Zulip:

> I'm wondering if blaggregator entries are searchable in some way, e.g. "search for all the blaggregator aggregated posts that have 'html'"

Would full-text search on just the title work in this case, or are you looking to add Elasticsearch for full-body content search? Does Blaggregator store all previous posts ever trafficked on the feed?

@davidbalbert
Member

If you haven't considered this already, we've used Postgres's full text search functionality and really like it. I looked over the old Blaggregator search branch, and see you're using elasticsearch, so it might be a moot point.

We use elasticsearch on Community and Postgres's full text search on the new directory on recurse.com, and I have been wishing that we used Postgres for both. This is mostly because we know Postgres much better, and having elasticsearch means we now have two data stores to maintain, upgrade, and understand.

Anyway, take that for what it's worth :).

@punchagan
Member Author

@davidbalbert Yes, I'm actually in favor of moving away from Elastic (to Postgres's full text search).

@stanzheng The old code is throwaway code. I think we should give Postgres's full text search a spin for this. The content of posts wasn't being stored in Blaggregator until recently, and we could try to backfill the content using a separate script. Though, for a start we could have full text search just on the title, and add search over the post content in a separate step.

@davidbalbert
Member

👍

@qwo
Member

qwo commented Feb 7, 2017

So I was able to work on this a bit today.

I got Postgres full text search working at the route

localhost:8000/search/?q=Google

which shows a view like the one below. I was wondering whether it is best to keep the search.html view or just enhance postlist.html so that search doesn't have a separate UX feel. Here is the original search.html view.

I still have to add an input box somewhere in the top bar to submit the HTTP GET query, but otherwise it would look very similar.

cc @punchagan

[Screenshot: 2017-02-06, 8:33:39 pm]

@punchagan
Member Author

Thanks for starting to look into this, @stanzheng.

I would vote for having a new search.html which includes postlist.html, just like the profile.html and new.html templates do.

@qwo
Member

qwo commented Feb 7, 2017

Sounds good! I also noticed that a preview of the search context might be nice for the user, which might be a deviation from the main view.

For my query "space" I got 5 hits that contain the word, but I don't think they're in any order other than how recently they were inserted.
[Screenshot: 2017-02-06, 8:46:11 pm]

Do you also have any advice about which type of search we should use? I'm just looking at the page linked below 😄
Right now I'm using the basic post_list = Post.objects.filter(content__search=query), but some of the other tools (SearchQuery, SearchRank, etc.) might be useful, even if possibly overkill.
https://docs.djangoproject.com/en/1.10/ref/contrib/postgres/search/
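
For reference, here is a minimal sketch of how the results could be ordered by relevance rather than insertion order, using SearchVector, SearchQuery and SearchRank from the page linked above. It assumes a Django project backed by PostgreSQL with 'django.contrib.postgres' in INSTALLED_APPS, and a Post model with title and content fields; the helper name ranked_post_search is hypothetical and is not taken from the Blaggregator code.

```python
def ranked_post_search(post_model, query):
    """Return posts matching `query`, most relevant first.

    Assumption: `post_model` is a Django model with `title` and `content`
    text fields, stored in PostgreSQL.
    """
    # Imported lazily so this module can be loaded without a configured
    # Django project; the imports only run when a search is performed.
    from django.contrib.postgres.search import (
        SearchQuery,
        SearchRank,
        SearchVector,
    )

    # Search both fields at once instead of only `content`.
    vector = SearchVector("title", "content")
    search_query = SearchQuery(query)
    return (
        post_model.objects
        .annotate(search=vector, rank=SearchRank(vector, search_query))
        .filter(search=search_query)   # keep only rows that match
        .order_by("-rank")             # best matches first
    )
```

In a view this might be called as post_list = ranked_post_search(Post, query). Whether the extra ranking is worth it over the basic content__search lookup is exactly the open question here.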

@qwo
Member

qwo commented Feb 7, 2017

> Though, for a start we could have full text search just on the title, and add search over the post content in a separate step.

Oh, actually it can do this now! If that's cool, I could send a PR.

I was thinking about the post content and how to weight it vs title or context!
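
To make the weighting question concrete, here is a toy, pure-Python sketch of scoring title hits higher than content hits. It is not how Postgres ranks results; it only illustrates the idea. The function names, the dict-based posts and the sample data are all hypothetical. The default weights of 1.0 and 0.4 are borrowed from Postgres's default A and B weight values, purely as a starting point.

```python
import re


def weighted_score(query, title, content, title_weight=1.0, content_weight=0.4):
    """Count query-term hits, weighting title hits more than content hits."""
    terms = re.findall(r"\w+", query.lower())
    title_words = re.findall(r"\w+", title.lower())
    content_words = re.findall(r"\w+", content.lower())
    score = 0.0
    for term in terms:
        score += title_weight * title_words.count(term)
        score += content_weight * content_words.count(term)
    return score


def rank_posts(query, posts, **weights):
    """Return matching posts (dicts with 'title' and 'content'), best first."""
    scored = [
        (weighted_score(query, post["title"], post["content"], **weights), post)
        for post in posts
    ]
    matches = [(score, post) for score, post in scored if score > 0]
    # The sort is stable, so tied posts keep their incoming order
    # (for example, recency).
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [post for _, post in matches]


# Hypothetical sample data, only for illustration.
posts = [
    {"title": "Weekly update", "content": "I rented a desk in a coworking space"},
    {"title": "Space invaders in Python", "content": "a small game"},
    {"title": "Notes on HTML", "content": "tags and attributes"},
]
titles = [post["title"] for post in rank_posts("space", posts)]
```

With the default weights the title match is listed before the content-only match; passing title_weight=0.1, content_weight=1.0 flips that order, which is the kind of thing that could be tuned by trying a few values.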

@qwo
Member

qwo commented Feb 7, 2017

Hmm, it looks like postlist.html has "new" hard-coded in the pagination footer; I'll look into refactoring that so it can be reused by both the search.html and new.html pages.

https://github.com/recursecenter/blaggregator/blob/master/home/templates/home/postlist.html#L25-L35

@punchagan
Member Author

> Hmm it looks like postlist.html has new hard coded in the paginated footer; I'll look into refactoring that so it can be reusable for both the search.html and new.html page

The pagination could be split out into a smaller template file, while you are at it.
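
One way to keep a shared pagination footer from hard-coding a single page is to have the view pass in the base path and any extra query parameters. Below is a small sketch of such a link builder. The helper name page_url, the /new/ path and the page parameter name are assumptions for illustration and are not taken from the Blaggregator code; only /search/?q=... appears in this thread.

```python
from urllib.parse import urlencode


def page_url(base_path, page, **params):
    """Build a pagination link that works for any list view.

    Extra keyword arguments (such as the search term `q`) are carried
    over to every page link; empty values are dropped.
    """
    query = {key: value for key, value in params.items() if value not in (None, "")}
    query["page"] = page
    return f"{base_path}?{urlencode(query)}"


new_link = page_url("/new/", 2)
search_link = page_url("/search/", 3, q="full text")
```

Here new_link is "/new/?page=2" and search_link is "/search/?q=full+text&page=3", so the same footer template could render links for both views.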

@punchagan
Member Author

punchagan commented Feb 7, 2017

> > Though, for a start we could have full text search just on the title, and add search over the post content in a separate step.
>
> Oh actually it can do this now! if this was cool, I could send a PR.

Go ahead and send a PR! I was asking for a separate step for this, since there are currently a lot of posts in the DB without full content. But we can always add it after adding the search functionality.

> I was thinking about the post content and how to weight it vs title or context!

We can play around with a few things, and see what works best.

@punchagan
Member Author

> I'm using basic post_list = Post.objects.filter(content__search=query) right now but some of the SearchQuery SearchRank etc .. might be overkill but useful.

I don't really have any advice, but basic is good for a start. Once we have the basic search in place (at least as a PR), we can try out a few things and see what works best.

@punchagan
Member Author

Closing this for now, as the search feature doesn't seem to get used all that much.
