Feature/ogc 508 replace elastic search by postgres text search #918

Tschuppi81 · 2023-07-13T11:52:40Z

INITIAL FEEDBACK TO TEXT SEARCH IN POSTGRES - WORK IN PROGRESS

core: Replace elastic search by postgres full text search

With these changes, elastic search (es) remains in place and in production, while the postgres full text search is introduced alongside it and is reachable via the 'search_postgres' keyword in the url.
es url: http://localhost:8080/onegov_agency/bs/search?q=Arnold
psql url: http://localhost:8080/onegov_agency/bs/search_postgres?q=Arnold

hint: onegov-core upgrade

Main changes:

- Re-indexing: ../onegov-cloud/src/onegov/search/cli.py
- Upgrade script (uses the same code as re-indexing): ../onegov-cloud/src/onegov/search/upgrade.py
- Extension of mixin class Searchable: ../onegov-cloud/src/onegov/search/mixins.py
- Adjusting all searchable models, adding the fts index in postgres

Open Items:

- unit tests are not reworked yet
- language configuration
- improve ranking

Checklist

- I have performed a self-review of my code
- I considered adding a reviewer
- I have added an upgrade hint such as data migration commands to be run
- I made changes/features for both org and town6
- I have tested my code thoroughly by hand
  - I have tested database upgrades
- I have added tests for my changes/features

Daverball

This will probably break a lot of things (anything that uses directory_id in a query), and it also does not do what you want it to do. The reason this previously failed for search indexing is simply that to_tsvector is not implemented for UUID, so the value first needs to be converted to TEXT. However, I'm not sure if the format is the same between Postgres and Python (i.e. whether the resulting string contains dashes if you do a cast).
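A quick check of the Python side of the format question, as a minimal sketch: str() on a uuid.UUID gives the canonical lowercase form with dashes, while .hex drops them. To my knowledge a Postgres cast from UUID to TEXT also yields the dashed form, but that should be verified against the actual server. The UUID value below is made up for illustration.

```python
import uuid

# Made-up UUID, only for illustration.
directory_id = uuid.UUID('0f8fad5b-d9cb-469f-a165-70867728950e')

# Canonical text form: lowercase hex grouped 8-4-4-4-12 with dashes.
as_text = str(directory_id)

# Compact form: the same 32 hex digits without dashes.
as_hex = directory_id.hex

print(as_text)
print(as_hex)
```

Whichever form ends up in the index, the search query has to be normalised to the same form, otherwise a lookup by id will not match.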

Type coercion is going to be tricky in general, and doing it automatically might not be possible, although you may be able to do something with CASE and pg_typeof instead of the coalesce you've been doing to handle NULL.
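One possible shape for the coercion, sketched under assumptions: instead of deciding the cast inside Postgres with CASE and pg_typeof, the caller states per column whether a cast to TEXT is needed, and the expression keeps coalesce for NULL handling. The function name, the column names and the 'simple' configuration are hypothetical choices, not taken from this branch, and the column names are assumed to be trusted identifiers.

```python
def tsvector_expression(columns, language='simple'):
    """Build a to_tsvector() SQL fragment from (column, needs_cast) pairs.

    Column names are interpolated as-is, so they must be trusted
    identifiers (e.g. taken from the model definition), never user input.
    """
    parts = []
    for name, needs_cast in columns:
        # Non-text columns such as UUID are cast to TEXT explicitly.
        value = f'CAST({name} AS TEXT)' if needs_cast else name
        # coalesce keeps a single NULL column from turning the whole
        # concatenated document into NULL.
        parts.append(f"coalesce({value}, '')")
    document = " || ' ' || ".join(parts)
    return f"to_tsvector('{language}', {document})"


# Hypothetical columns: a text title and a UUID directory_id.
print(tsvector_expression([('title', False), ('directory_id', True)]))
```

This only covers types with a meaningful TEXT cast; arrays, JSON and polymorphic tables would still need their own handling.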

to_tsvector definitely works for strings and JSON, although in the case of JSON I'm not sure what will end up in the search index: just the values, or also the keys. I don't remember if it works for array and hash, but you will need to figure out a way to make all the various cases work correctly.

Also, removing all those hybrid_property declarations is not what you want to do to make this work. Instead, you will need to figure out a way to generate an equivalent SQL expression for what each hybrid_property returns, so it can end up in the search index. In some cases this might not be possible, and we may need a computed column to calculate the value in-app and store this value in the database.

Another route would be to forgo the automatically updated index and instead generate and update it in-app by subscribing to the relevant ORM events, as we have been doing previously for elastic search. This would mean that in-app we pull out all the string values that should end up in the index and then update the tsvector column based on those values.
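The "pull out all the string values" step could look roughly like this. It is a sketch only: the helper names and the sample properties are hypothetical, and hooking it into the ORM events and writing the tsvector column is left out.

```python
def collect_strings(value):
    """Yield every non-empty string found in a nested structure."""
    if isinstance(value, str):
        if value.strip():
            yield value
    elif isinstance(value, dict):
        # Only the values are collected, not the keys.
        for item in value.values():
            yield from collect_strings(item)
    elif isinstance(value, (list, tuple)):
        for item in value:
            yield from collect_strings(item)
    # Anything else (None, numbers, ...) is ignored in this sketch.


def search_document(properties):
    """Join all collected strings into one text blob for indexing."""
    return ' '.join(collect_strings(properties))


# Hypothetical search properties of one model instance.
properties = {
    'title': 'Arnold',
    'tags': ['board', 'finance'],
    'note': None,
    'address': {'city': 'Basel'},
}
print(search_document(properties))
```

Because the values are resolved in Python, hybrid properties and joined attributes need no SQL equivalent; the cost is that the index is only as fresh as the last event that triggered an update.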

Tschuppi81 · 2023-11-27T17:06:39Z

@Daverball In the above commit I am fighting with more complex hybrid expressions. Most of them gather data from other tables, and I could not figure out how to properly join the corresponding tables. Currently the re-indexing fails in CourseEvent.
Note: This commit is to be reverted; it just shows the join problem.

Daverball · 2023-11-27T18:29:25Z

@Tschuppi81 This is going to be tricky in general and probably a bad idea in terms of performance anyway. You are going to be better off adding caches for the joined attributes and using @observes methods to keep the caches up to date.

Or we could give up on calculating the search index online, since we're still quite a ways off, and just generate a dictionary offline of all the search values, which can then be turned into a TSVECTOR directly if interpreted as JSONB. While somewhat inefficient, this would still be quicker than what we had before with elasticsearch, since we save ourselves from having to talk with the elasticsearch server. It would also get rid of the issues related to polymorphism, since SQLAlchemy is aware of what polymorphic identity we're talking about, while in postgres we would have to write CASE statements to distinguish the various versions in the same table.
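A minimal sketch of the dictionary route, under stated assumptions: the search values are serialised to JSON in Python and handed to to_tsvector as JSONB. As far as the Postgres documentation describes it, to_tsvector on json/jsonb indexes the string values and not the keys. The fts_idx column name, the id column, the :doc/:id placeholder style and the 'simple' configuration are hypothetical, and the table name and language must be trusted values since they are interpolated.

```python
import json


def build_fts_update(table, row_id, search_values, language='simple'):
    """Return an UPDATE statement and its parameters for one row."""
    # None values carry nothing to index, so they are dropped.
    cleaned = {k: v for k, v in search_values.items() if v is not None}
    statement = (
        f"UPDATE {table} "
        f"SET fts_idx = to_tsvector('{language}', CAST(:doc AS JSONB)) "
        f"WHERE id = :id"
    )
    # sort_keys gives a stable payload, which helps when comparing runs.
    params = {'doc': json.dumps(cleaned, sort_keys=True), 'id': row_id}
    return statement, params


statement, params = build_fts_update(
    'users', 7, {'username': 'arnold', 'realname': None}
)
print(statement)
print(params)
```

The statement itself stays identical for every polymorphic variant of a table; only the dictionary built in Python differs.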

Tschuppi81 and others added 30 commits April 17, 2023 05:12

Adding psql index columns and re-index functionality for users and attendees

759e4e3

Disable ES, make user search working using postgresql

0c4159a

Adding fts index functions to create, drop index as well as db upgrade for users

5e89675

Adding fts index functions to create, drop index as well as db upgrade for tickets

4ad864e

Switch from ES app to psql app

8fcf1e9

Switch from ES app to psql app

b208736

Initial move from es to psql search for town6

41f3111

Core: Fix a lot bugbear warnings.

942141c

TYPE: Feature

Org: Prioritize Events in search, and sort chronologically.

db272dc

TYPE: Feature LINK: OGC-908

Core: Enable flake8 bugbear.

128e988

Enables bugbear in pre-commit and CI linting, also introduces a garbage collector friendly LRU cache variant. TYPE: Feature LINK: OGC-1052

Release release-2023.15

80c09c6

Town6: Small Fixes

85b7f7e

TYPE: Bugfix

Feriennet: New banners and logo

de479b1

TYPE: Feature LINK: PRO-1173

Town6: Make color inversion on icon links possible

0d5fdcd

TYPE: Feature LINK: OGC-764

Feriennet: E-mail notifications on registration for activity

60b8a6a

The attendee receives a notification on registration or cancellation of their participation. TYPE: Feature LINK: PRO-1126

Org: Add more options to "further information" on directories

a729ec6

TYPE: Feature LINK: OGC-928

Town6: External event url

a77bf28

TYPE: Feature LINK: OGC-746

Feriennet: Invoice Items payment with dates

1ef6587

TYPE: Feature LINK: PRO-1167

Release release-2023.16

6d32013

Town6: Fixes news not being displayed if it's the first item.

a710d57

Fix root-level page interpretation bug for news, which was mistakenly being treated as falsy (index 0). TYPE: Bugfix LINK: OGC-863

Town6: Remove "Onegov Cloud Team" in mail-footer

81b1563

TYPE: Feature LINK: OGC-1167

Feriennet: Edit email text

8ae0716

TYPE: Feature LINK: PRO-1126

Release release-2023.17

a582271

Town6: Make image preview visible

0995031

TYPE: Feature

Feriennet: Make form more robust if field is missing

29e4b5e

TYPE: Bugfix LINK: pro-1116

Release release-2023.18

48e39a5

Ballot: Fixes file constraints.

cefd094

TYPE: Bugfix LINK: OGC-1073

Release release-2023.19

fb6f106

Org: Small Adjustments

44513c9

Slightly larger page text Version number in footer Hover effect on Navigation TYPE: Feature

Tschuppi81 commented Jul 31, 2023

src/onegov/search/mixins.py (outdated, resolved)

Daverball reviewed Jul 31, 2023

src/onegov/search/mixins.py (outdated, resolved)

Tschuppi81 added 2 commits August 3, 2023 06:21

Adding additional test

b9c9286

Resolve merge conflicts

f87d524

msom removed their request for review August 14, 2023 09:15

Tschuppi81 added 2 commits August 17, 2023 11:28

Rework search mixin to use expressions

dfb7ee9

Extend upgrade context by has_index function

0589cf9

Daverball reviewed Aug 17, 2023

src/onegov/search/mixins.py (outdated, resolved)

Tschuppi81 added 2 commits August 17, 2023 14:37

Improve collecting index properties

1773dd4

Rework index creation taking into account the property type

86cebc2

Daverball reviewed Aug 22, 2023

src/onegov/search/integration.py (outdated, resolved)

Tschuppi81 added 5 commits September 12, 2023 07:17

Remove dependency to onegov.org

a312b45

Resolve merge conflicts

9aac0d0

Switch hybrid to regular properties to make re-index work

21e87ff

Via property the directory id with type uuid can now be indexed

c81a188

Add search score

59a478c

Daverball reviewed Sep 13, 2023

Tschuppi81 added 9 commits November 9, 2023 10:55

Revert "Switch hybrid to regular properties to make re-index work"

400865e

This reverts commit 21e87ff.

Revert "Via property the directory id with type uuid can now be indexed"

308a7de

This reverts commit c81a188.

Merge master

2bd99c3

Adds hybrid properties for search index properties

64a9655

Removing filter keywords as we also do not index the event tags

90bc8a9

Merge branch 'master' into feature/ogc-508-replace-elastic-search-by-postgres-text-search

2b311e9

Adds hybrid properties for search index properties

ccc6c0c

Adds directory id to fts index

eae5dcd

DEV ONLY Issue to join data from other table in hybrid expression in CourseEvent

e6fcdb1

Tschuppi81 added 2 commits November 28, 2023 08:31

Revert "DEV ONLY Issue to join data from other table in hybrid expression in CourseEvent"

ccbf117

This reverts commit e6fcdb1.

Fix hybrid properties and add expressions for Pages

9e454e3
