This repository has been archived by the owner on Aug 30, 2022. It is now read-only.

External Solr - decoupling and synchronization #222

Open
mrsimpson opened this issue Mar 12, 2018 · 4 comments

Comments

@mrsimpson
Collaborator

Expected Behaviour

When using an external Solr, there should be a mechanism to handle unavailability of the Solr server.

Actual Behaviour

Updates get lost

@ruKurz
Collaborator

ruKurz commented Mar 12, 2018

Currently Smarti supports Solr sync: if MongoDB and Solr get out of sync, a delta sync (based on timestamps) is executed. When switching over to an external Solr cloud, the current sync mechanism must be disabled, because multiple Smarti instances connected to the same Solr would run it concurrently. In the worst case, all of those Smarti instances request a synchronization of the same conversations at the same time. (This will work, but it indexes the same conversations multiple times, which is not efficient.)

To solve this issue, the Smarti cluster has to make sure that the synchronization is triggered only once.

@westei could you please confirm that what I've understood and written here is correct?
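
For illustration, the timestamp-based delta sync described above can be sketched as follows. This is only a minimal sketch: the class `DeltaSyncSketch`, the `Conversation` stand-in, and the `changedSince` method are hypothetical names, not Smarti's actual classes, and the real implementation would query MongoDB rather than an in-memory list.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a timestamp-based delta sync; not Smarti's actual code. */
final class DeltaSyncSketch {

    /** Minimal stand-in for a conversation stored in MongoDB. */
    static final class Conversation {
        final String id;
        final Instant lastModified;

        Conversation(String id, Instant lastModified) {
            this.id = id;
            this.lastModified = lastModified;
        }
    }

    /** Returns the conversations modified strictly after the last successful sync. */
    static List<Conversation> changedSince(List<Conversation> stored, Instant lastSync) {
        List<Conversation> delta = new ArrayList<>();
        for (Conversation c : stored) {
            if (c.lastModified.isAfter(lastSync)) {
                delta.add(c); // only these would need to be re-indexed in Solr
            }
        }
        return delta;
    }
}
```

If several instances run this selection at the same time against the same Solr, each of them ends up indexing the same delta, which is the duplicated work described above.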

@westei
Member

westei commented May 14, 2018

Yes. Cloud sync is not needed when we use an external Solr (as any Smarti instance will directly index the updates to the shared Solr).

If a sync of the index with the state in the MongoDB is necessary this SHOULD only occur from a single Smarti instance to avoid performing indexing twice.

@westei
Member

westei commented Mar 29, 2019

This is related to #306

Regarding the re-sync being executed concurrently by multiple Smarti instances: for another project we implemented execution tokens that are synced via Mongo. To avoid concurrent re-syncs, we would need to bring similar functionality to Smarti.
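
As a rough illustration of the execution-token idea, the sketch below hands out a token with a lease so that only one instance runs a task at a time. Everything in it is an assumption made for illustration: the class `ExecutionTokenSketch`, its methods, and the lease handling are hypothetical, and an in-memory map stands in for the shared MongoDB collection that the real tokens would live in.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/**
 * Hypothetical sketch of an execution token with a lease. The map stands in
 * for a shared MongoDB collection; nothing here is Smarti's actual API.
 */
final class ExecutionTokenSketch {

    /** Records which instance holds the token and until when. */
    static final class Token {
        final String owner;
        final Instant expiresAt;

        Token(String owner, Instant expiresAt) {
            this.owner = owner;
            this.expiresAt = expiresAt;
        }
    }

    private final ConcurrentMap<String, Token> store = new ConcurrentHashMap<>();

    /** Tries to take the token for a task; succeeds if it is free or its lease has expired. */
    boolean tryAcquire(String task, String instanceId, Instant now, Duration lease) {
        Token candidate = new Token(instanceId, now.plus(lease));
        // compute() is atomic per key, mimicking a conditional upsert in the shared store.
        Token winner = store.compute(task, (key, current) ->
                (current == null || !current.expiresAt.isAfter(now)) ? candidate : current);
        return winner == candidate;
    }

    /** Releases the token, but only if the given instance is the current holder. */
    void release(String task, String instanceId) {
        store.computeIfPresent(task, (key, current) ->
                current.owner.equals(instanceId) ? null : current);
    }
}
```

The lease is there so that a crashed instance cannot block the re-sync forever: once the lease expires, another instance can take over. The sketch is deliberately not re-entrant, so a holder that calls `tryAcquire` again before expiry is refused as well.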

@westei
Member

westei commented Apr 4, 2019

Summary of the changes to fix this issue (see #306 for implementation work):

  • Added a shared lock mechanism to ensure that only a single Smarti instance performs the sync. See this pull request for more details on the implementation
  • The conversation index can now run with embedded mode on or off. The state is automatically detected based on the type of the Solr client injected into the ConversationIndexer. While this should be fine in typical cases, there might be configurations where one needs to override the default using smarti.index.conversation.embedded=true/false (e.g. when using standalone Solr servers that are private to each Smarti instance).
    • in embedded mode Smarti still supports the old behaviour (with cloud sync smarti.index.conversation.syncDelay=15 seconds)
    • with embedded=false Smarti deactivates cloud sync. Instead each Smarti instance will index updates it performs to conversations in the shared Solr server. This sync operation uses a shared lock over all Smarti instances. This ensures that the sync operation is only performed once at a time.
    • to cope with service interruptions of the Solr in embedded=false mode a synchronisation Cron is used (default: smarti.index.conversation.syncCron=16 0 2 * * *). This cron ensures that the Solr index is brought in sync from time to time. It will reindex all conversations with changes since the last sync.
  • During startup, an initial sync with the Solr index is performed (in both modes) unless smarti.index.rebuildOnStartup=true; in that case the whole index is rebuilt. A full rebuild is also triggered if the conversation index version was changed as part of a Smarti release.
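
As an illustration, the settings listed above could be combined for a shared external Solr roughly as follows. The property names and the cron value are the ones quoted in the summary; where they are set (e.g. an application.properties file) is an assumption. If the cron value follows Spring's six-field format (second, minute, hour, day of month, month, day of week), which is also an assumption here, it would fire daily at 02:00:16.

```properties
# Override the auto-detected mode: false means shared (external) Solr, cloud sync disabled
smarti.index.conversation.embedded=false
# Periodic re-sync to recover from Solr outages (default quoted in the summary above)
smarti.index.conversation.syncCron=16 0 2 * * *
# Leave at false to perform the initial delta sync on startup; true rebuilds the whole index
smarti.index.rebuildOnStartup=false
```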

NOTE: The ConversationIndexSync service is still optional. If it is not available, all sync operations are disabled and a full rebuild of the conversation index is performed on startup. A WARN message is logged during startup if this service is not available.

westei added a commit that referenced this issue Apr 4, 2019
documentation for #222