Indicate for Queries that they are not valid without user added tokens #217

westei · 2018-02-26T10:20:32Z

With #200, all queries are added so that the client is notified that each query is configured.

So we need a new way of telling the client that a query is currently not useful (e.g. because no tokens are assigned).

User-generated tokens can still make those queries active.


ruKurz · 2018-03-05T10:07:59Z

mrsimpson · 2018-03-08T11:26:33Z

I have used the new message-search for some time now and it works as expected. However, the extracted tokens are often too meaningless (e.g. only adverbs/adjectives like "gerne"). Often no tokens are extracted at all, rendering the query useless. At the same time, the more-like-this-based query gave quite decent results, and I have to say I miss it.

I came to the conclusion that both query providers have different constraints that make them useful. In this case, the message-search is useful once an interesting term, a noun, or a user-defined token has been provided. The MLT-search is useful as long as no user-defined token is available.
Applying those constraints and respecting them in the UI by hiding useless queries could be a reasonable approach: as long as there are no keywords, related conversations are shown. As soon as the user specifies one to search for, they are hidden and the search is visible.
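
A minimal sketch of that visibility rule, to make the proposal concrete. The provider names `message-search` and `mlt-search` and the function signature are assumptions for illustration only, not Smarti's actual API.

```python
def visible_queries(extracted_tokens, user_tokens):
    """Return the (hypothetical) names of the query providers worth showing.

    extracted_tokens: tokens found automatically in the conversation
    user_tokens:      tokens the user added by hand
    """
    if user_tokens:
        # A user-defined keyword exists: hide related conversations,
        # show the token-based search.
        return ["message-search"]
    # No user keyword yet: extracted tokens alone (e.g. "gerne") are not
    # trusted, so only the similarity-based query is shown.
    return ["mlt-search"]
```

With this rule an extracted but meaningless token never switches the UI over; only an explicit user action does.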

WDYT?

westei · 2018-03-09T08:32:41Z

I agree with you. The challenge is that MLT returned a conversation as its result, whereas the ConversationSearch now returns individual messages.

For the MLT provider to fit together with the token-based ConversationSearch, it has to be reworked so that it also returns messages as results. To do that, one has to test whether the MLT approach works with the potentially short texts of messages. If not, we will have to come up with something else there as well.

mrsimpson · 2018-03-09T10:51:07Z

@westei what about analyzing similarities of a "window" of messages? I don't know how to implement this inside Solr without permuting too much and blowing up the index.

westei · 2018-03-09T11:26:54Z

For sure this would increase the index size, but I do not think this is a problem. The bigger problem is that you will get overlapping segments of conversations as results (e.g. a result c1#m[3-8] and another result c1#m[5-10]). One needs to collect those and generate the response accordingly.

For the Conversation Search we use Solr Grouping to get the Results grouped by conversation and the Response merges overlapping sections (based on the context configuration).
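
The merging of overlapping sections can be sketched as a plain interval merge per conversation. This is an illustration only: the tuple layout `(conversation_id, first_message, last_message)` is an assumption, and treating directly adjacent sections as mergeable is a choice made here, not something stated above.

```python
from collections import defaultdict


def merge_segments(hits):
    """Merge overlapping message ranges per conversation.

    hits: iterable of (conversation_id, first_msg_idx, last_msg_idx),
          with both indices inclusive (hypothetical layout).
    Returns {conversation_id: [(first, last), ...]} sorted by position.
    """
    by_conversation = defaultdict(list)
    for conversation_id, first, last in hits:
        by_conversation[conversation_id].append((first, last))

    merged = {}
    for conversation_id, spans in by_conversation.items():
        spans.sort()
        out = [list(spans[0])]
        for first, last in spans[1:]:
            if first <= out[-1][1] + 1:
                # Overlapping or directly adjacent: extend the current section.
                out[-1][1] = max(out[-1][1], last)
            else:
                out.append([first, last])
        merged[conversation_id] = [tuple(span) for span in out]
    return merged
```

For the example above, `c1#m[3-8]` and `c1#m[5-10]` collapse into a single section `c1#m[3-10]`.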

For Solr MLT one cannot use grouping, so implementing this would be much harder. In addition, without grouping one cannot tell Solr to include at most 3 results per conversation (otherwise, if I request e.g. 10 results, I could get all 10 from the same conversation).

Maybe one can combine MLT with FieldCollapsing to get the desired behaviour.
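
To illustrate what the per-conversation limit has to achieve, here is a sketch that applies it after the fact to an already ranked result list. It is not Solr grouping or field collapsing, just a post-filter with the same effect; the `conversation_id` key and the default limits are assumptions.

```python
def cap_per_conversation(ranked_hits, max_per_conversation=3, rows=10):
    """Keep at most `max_per_conversation` hits per conversation.

    ranked_hits: hits in ranking order, each a dict with a
                 'conversation_id' key (hypothetical shape).
    Stops once `rows` hits have been collected.
    """
    counts = {}
    kept = []
    for hit in ranked_hits:
        conversation_id = hit["conversation_id"]
        if counts.get(conversation_id, 0) >= max_per_conversation:
            continue  # this conversation already fills its quota
        counts[conversation_id] = counts.get(conversation_id, 0) + 1
        kept.append(hit)
        if len(kept) == rows:
            break
    return kept
```

The drawback of post-filtering is that more rows have to be fetched than are finally shown, which is why a server-side mechanism is preferable.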

mrsimpson · 2018-03-12T10:35:58Z

jup, understood. I imagine that a window of the messages issued by the author could be a good base for the more-like-this analysis

ruKurz · 2018-03-12T10:42:35Z

@westei Could you please suggest how to solve this problem, so that we can discuss how to proceed and implement a better user experience?

ruKurz · 2018-03-12T11:38:41Z

Challenge: Combining the conversation-search query builder with the conversation-mlt query builder raises the question how to create a comprehensible user experience.

Suggestion

Server-side: Only use the conversation-mlt when no tokens have been extracted and no user tokens have been added.
UI-side: Adjust the user interface to present the conversation-mlt results in the same way as the conversation-search results. (Do not make any UX difference, and hide from the user which query builder was used. The user gets related conversations, independent of the query builder.)

@janrudolph Do you agree?

westei · 2018-04-02T18:08:57Z

After some experiments and testing I have come to the conclusion that the best technical solution is to:

- calculate a textual context for every message
  - this context will be indexed in a field configured to be used by Solr MLT
- on every related conversation request, make a Solr MLT request (with all the filters) with interestingTerms=details but without selecting any terms
  - this allows reconstructing the query that Solr /mlt would internally use to select related conversations (e.g. for the context "Java und Solr wozu das ganze?" the interesting terms would be "interestingTerms":["text:wozu",1.0, "text:test",1.1805785,"text:solr",1.2813209])
- provide this information in the Related Query Response, so the widget can decide to do a similar-conversation query (by using those parameters) or not (by excluding them)
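
The reconstruction step can be sketched as follows: the flat `interestingTerms` list shown above alternates between a field-qualified term and its boost, so it can be paired up and turned into a boosted query string. The function names are hypothetical, and the sketch deliberately skips query escaping of the terms.

```python
def parse_interesting_terms(flat):
    """Turn ["text:wozu", 1.0, "text:test", 1.18, ...] into
    [(field, term, boost), ...]."""
    if len(flat) % 2:
        raise ValueError("expected alternating term/boost entries")
    terms = []
    for qualified, boost in zip(flat[0::2], flat[1::2]):
        # Split only on the first ':' so terms containing ':' stay intact.
        field, _, term = qualified.partition(":")
        terms.append((field, term, float(boost)))
    return terms


def build_boosted_query(terms):
    """Build a space-separated list of field:term^boost clauses."""
    return " ".join(f"{field}:{term}^{boost}" for field, term, boost in terms)


flat = ["text:wozu", 1.0, "text:test", 1.1805785, "text:solr", 1.2813209]
print(build_boosted_query(parse_interesting_terms(flat)))
# prints: text:wozu^1.0 text:test^1.1805785 text:solr^1.2813209
```

The resulting string is an ordinary query, so grouping and context handling that already work for normal queries apply to it unchanged.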

This has the huge advantage that the MLT query is only used to retrieve interesting terms, while a normal Solr query is used for retrieving the results. All the special functionality for correctly retrieving related conversations, including contextual messages, already works for normal Solr queries, so there is no need to duplicate this functionality for Solr MLT queries.

In UI Terms:

This would make the Similarity based search a feature of the related conversation search (the exact thing requested by @mrsimpson).

This also allows combining queries for tokens with similarity-based constraints. This could be useful if one wants to search for a custom token that is relatively common in the dataset, as the context would rank results containing the custom token in a similar context at the top of the result list.

For that, "Similarity" would need to be a switch that can be activated/deactivated by the user (similar to the filters discussed in #228). I would suggest enabling "Similarity" if no user-added token is present and deactivating it as soon as the user adds a custom token or pins an extracted token.

mrsimpson · 2018-04-02T19:40:32Z

@westei I didn’t fully get the Solr implementation details, but the gist. And it sounds as if this made best use of the technology involved 👍

…onversation Search Query Builder

* Indexing now stores an MLT context for every message. This includes the text of the surrounding messages based on content length, time difference, and min/max message counts
* Implementation of the similarity feature
  * The Related Conversation QueryBuilder performs a Solr MLT query on the conversation to get interesting terms
  * those are normalised so that the maximum boost is `1.0`
  * Query params for similarity search are built and added to the Query (field: `similarityQuery`)
  * The widget can send this as `q.alt`: in this case this query is used if no query is present
  * The widget can also combine this with real tokens. In this case the query params need to be appended to the other query parameters
    * in this case the real query params should use an additional boost factor (in the range of `5 - 10`)

NOTE: this increases the conversation index version from `5` to `6`, so Smarti should trigger a full reindex on startup (for embedded setups). If this does not work for some reason, the re-indexing needs to be triggered manually by deleting the current conversation index. For remote Solr servers the schema needs to be updated and the index data needs to be deleted to force a re-index on startup.
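
The normalisation and the extra boost for real tokens can be illustrated with a short sketch. It is a simplification under stated assumptions: terms are plain `(term, boost)` pairs, the combined result is a single query string rather than separate query parameters, and the default `token_boost` of `5.0` is just the lower end of the range mentioned above.

```python
def normalise_boosts(terms):
    """Scale boosts so that the largest one becomes 1.0.

    terms: list of (term, boost) pairs with positive boosts.
    """
    if not terms:
        return []
    top = max(boost for _, boost in terms)
    if top <= 0:
        raise ValueError("boosts must be positive")
    return [(term, boost / top) for term, boost in terms]


def combine(token_terms, similarity_terms, token_boost=5.0):
    """Join real tokens (strongly boosted) with normalised similarity terms."""
    parts = [f"{term}^{token_boost}" for term in token_terms]
    parts += [f"{term}^{boost:.3f}" for term, boost in similarity_terms]
    return " ".join(parts)
```

Because similarity boosts never exceed `1.0` after normalisation, a token boost of 5 or more keeps explicit tokens dominant in the ranking, with the similarity terms mainly ordering results within that set.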

westei · 2018-04-04T09:23:25Z

Server-Side implementation is ready in the #217-similarity-feature-for-related-conversation-search branch.

As the implementation changes the same files as #228, I used the branch of that issue as a starting point. So the pull request #234 should be merged with 0.7.0 first.

westei · 2018-04-04T09:31:39Z

@Peym4n the Related Conversation Query now has a new field similarityQuery that contains the parameters for similarity queries.

The widget should send those parameters as value for the q.alt parameter. This has the effect that similarity search is used in cases where no tokens are present

In addition the Widget should only consider custom and maybe pinned Tokens for the conversation search ("Expertengespräche"). This will make similarity the default behaviour.
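
A sketch of how the widget-side parameter selection could look. The token shape (`value`, `origin`, `pinned`) and the assumption that `similarityQuery` arrives as a ready-to-send string are hypothetical; only `q.alt` and its fallback behaviour are taken from the description above.

```python
def build_search_params(tokens, similarity_query):
    """Build request parameters for the related conversation search.

    tokens: list of dicts with 'value', 'origin' ('user' or 'extracted')
            and 'pinned' keys (hypothetical shape).
    similarity_query: value of the similarityQuery field, assumed to be
            a string here.
    """
    # Only custom (user) tokens and pinned tokens take part in the query.
    active = [t["value"] for t in tokens
              if t["origin"] == "user" or t.get("pinned")]
    params = {}
    if active:
        params["q"] = " ".join(active)
    if similarity_query:
        # q.alt is only used when no q is present, which makes
        # similarity the default behaviour.
        params["q.alt"] = similarity_query
    return params
```

Since unpinned extracted tokens are filtered out, a conversation with only automatically extracted tokens sends no `q` and therefore falls back to the similarity query.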

As an alternative we could add a [Similarity] button that can be enabled/disabled (similar to optional filters). If enabled, the similarity query would be used together with pinned and custom tokens. If disabled, all shown tokens would be used without similarity. Similarity would still be used as the default if no token is extracted.


Peym4n · 2018-04-04T13:55:53Z

The widget part is also implemented.
Now only user tokens and pinned tokens will be used for related conversation search and when none of them exists (even when unpinned tokens exist) the similarity query is used.

There is an escaping bug on the server side which @westei will fix.
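
For context, query escaping of a term means putting a backslash in front of every character that the Lucene query syntax treats as special. The sketch below shows the idea in Python; it is an illustration, not the server-side fix itself, and the function name is hypothetical.

```python
import re

# Characters with a special meaning in the Lucene query syntax.
_SPECIAL = re.compile(r'([\\+\-!():^\[\]"{}~*?|&/])')


def escape_query_term(term):
    """Prefix every special character in `term` with a backslash."""
    return _SPECIAL.sub(r"\\\1", term)


print(escape_query_term("c++"))   # prints: c\+\+
print(escape_query_term("a:b"))   # prints: a\:b
```

Only the value placed into the query should be escaped; a value shown to the user should stay unescaped.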


* WordDelimiter: the original is now also kept at query time so that searches like `c++` actually match the indexed token `c++`
* added a PatternReplaceFilterFactory to remove quotes, `,` and `;` on both sides, and other trailing punctuation marks, from terms
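
The effect of the pattern replacement can be approximated outside of Solr with two regular expressions. This is only an illustration of the intended behaviour: the exact patterns in the schema are not shown here, so the character sets below (quotes, `,`, `;` on both sides and `.`, `!`, `?`, `:` at the end) are assumptions.

```python
import re

# Quotes, commas and semicolons at either end of a term (assumed set).
_EDGES = re.compile(r"""^["',;]+|["',;]+$""")
# Other punctuation marks at the end of a term (assumed set).
_TRAILING = re.compile(r"[.!?:]+$")


def clean_term(term):
    """Strip surrounding quotes/commas/semicolons and trailing punctuation."""
    previous = None
    while previous != term:
        # Repeat until nothing changes, so mixed runs like `'.` are handled.
        previous = term
        term = _EDGES.sub("", term)
        term = _TRAILING.sub("", term)
    return term
```

Characters inside a term are left alone, so `c++` survives unchanged and can match the preserved original token.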


westei self-assigned this Feb 26, 2018

westei added the enhancement label Feb 26, 2018

westei mentioned this issue Mar 5, 2018

Smarti widget: Provide "can not help"-information #213

Closed

mrsimpson added the bug label Mar 5, 2018

mrsimpson added this to the v0.7.1 milestone Mar 5, 2018

westei added the ready label Mar 19, 2018

westei added in progress and removed ready labels Apr 2, 2018

westei assigned Peym4n Apr 4, 2018

westei added a commit that referenced this issue Apr 4, 2018

#217: forgot to add the ConversationContextUtils class

ab0d4b6

Peym4n pushed a commit that referenced this issue Apr 4, 2018

Merge remote-tracking branch 'remotes/origin/#228-advanced_filters' i…

c681842

…nto #217-similarity-feature-for-related-conversation-search

westei added a commit that referenced this issue Apr 4, 2018

#217: similarity terms need to be query escaped

389d06e

westei mentioned this issue Apr 4, 2018

Similarity feature for related conversation search (#217) #235

Merged

ghost added in review and removed in progress labels Apr 4, 2018

westei added in progress and removed in review labels Apr 4, 2018

Peym4n pushed a commit that referenced this issue Apr 4, 2018

#217: Implement similarity query in the widget

2ff6132

westei added a commit that referenced this issue Apr 5, 2018

Merge branch 'release_0.7.0' of github.com:redlink-gmbh/smarti into #217

cfc3633

-similarity-feature-for-related-conversation-search

westei added a commit that referenced this issue Apr 5, 2018

Merge branch '#217-similarity-feature-for-related-conversation-search…

dd089cb

…' of github.com:redlink-gmbh/smarti into #217-similarity-feature-for-related-conversation-search

westei added a commit that referenced this issue Apr 5, 2018

Merge pull request #235 from redlink-gmbh/#217-similarity-feature-for…

86a592d

…-related-conversation-search Similarity feature for related conversation search (#217)

westei added a commit that referenced this issue Apr 5, 2018

#217: fixes a small bug that the displayValue was Lucene Query escaped

4936369

westei mentioned this issue Apr 5, 2018

Improve UI message, if no results are found #220

Closed

Peym4n added in review and removed in progress labels Apr 5, 2018

ja-fra closed this as completed Apr 20, 2018

ghost removed the in review label Apr 20, 2018
