Elasticsearch 7.* and 8.* integration. OpenSearch integration. #469

Open
wants to merge 7 commits into
base: main

ivanmrsulja
Member

@ivanmrsulja ivanmrsulja commented Jun 10, 2024

What does this pull request do?

Updates the current Elasticsearch (ES) 6.x integration to work with ES 7.x and 8.x, and with OpenSearch.

What's new?

Changes to ResponseParser and to the ES "notes on the first draft" documentation.

Example:

  • Changed src/main/java/edu/cornell/mannlib/vitro/webapp/searchengine/elasticsearch/ResponseParser.java to be in line with current ES API
  • Updated src/main/java/edu/cornell/mannlib/vitro/webapp/searchengine/elasticsearch/Elasticsearch_notes_on_the_first_draft.md with new mapping
  • Updated example.applicationSetup.n3 to show ES setup example
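
For reviewers less familiar with the ES API differences: one well-known change between ES 6.x and 7.x/8.x is that `hits.total` in a search response went from a plain number to an object of the form `{"value": N, "relation": "eq"}` (OpenSearch uses the object form as well). The sketch below shows one way a parser could accept both shapes. It is not the code from this PR: the class `HitsTotal` and the method `totalHits` are hypothetical names, and it assumes the response has already been parsed into nested `Map` objects.

```java
import java.util.Map;

/** Hypothetical helper, not the PR's ResponseParser: reads hits.total in either shape. */
class HitsTotal {

    /**
     * @param hits the already-parsed "hits" object of a search response (assumption)
     * @return the total hit count, or 0 if it is absent or has an unexpected shape
     */
    static long totalHits(Map<String, Object> hits) {
        Object total = hits.get("total");
        if (total instanceof Number) {
            // ES 6.x shape: "total": 42
            return ((Number) total).longValue();
        }
        if (total instanceof Map) {
            // ES 7.x/8.x shape: "total": {"value": 42, "relation": "eq"}
            Object value = ((Map<?, ?>) total).get("value");
            if (value instanceof Number) {
                return ((Number) value).longValue();
            }
        }
        return 0L;
    }

    public static void main(String[] args) {
        Map<String, Object> legacy = Map.of("total", 42);
        Map<String, Object> current = Map.of("total", Map.of("value", 42, "relation", "eq"));
        System.out.println(totalHits(legacy));
        System.out.println(totalHits(current));
    }
}
```

Accepting both shapes keeps a parser usable against an older 6.x index during migration; a parser that only targets 7.x and later could drop the `Number` branch.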

How should this be tested?

Initial setup

  • Install elasticsearch/opensearch somewhere.
  • Create a search index with the appropriate mapping (see below).
  • Check out VIVO and this branch of Vitro (see below), and do the usual installation procedure.
  • Modify {vitro_home}/config/applicationSetup.n3 to use this driver (see below).
  • Modify the vitro.local.searchengine.url configuration property to contain the ES/OS index base URL.
  • Modify the vitro.local.searchengine.username configuration property to contain the ES/OS basic auth username.
  • Modify the vitro.local.searchengine.password configuration property to contain the ES/OS basic auth password.
  • Start elasticsearch/opensearch
  • Start VIVO
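
Before starting VIVO, it can help to confirm that the URL and credentials from the three configuration properties are accepted by the search engine. The sketch below builds a basic-auth `GET {index base URL}/_mapping` request with the JDK's `java.net.http` API. It is a minimal illustration rather than anything from this PR: the class `SearchEngineCheck`, its method names, and the values `http://localhost:9200/vivo`, `elastic`, and `changeme` are placeholders, and it assumes the URL property holds the index base URL as described in the list above.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical connectivity-check helper; names and values are illustrative only. */
class SearchEngineCheck {

    /** Builds the value of an HTTP Basic "Authorization" header. */
    static String basicAuthHeader(String username, String password) {
        String credentials = username + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    /** Builds (but does not send) a GET request for the index mapping. */
    static HttpRequest mappingRequest(String indexBaseUrl, String username, String password) {
        // Drop a trailing slash so the path does not end up with "//".
        String base = indexBaseUrl.endsWith("/")
                ? indexBaseUrl.substring(0, indexBaseUrl.length() - 1)
                : indexBaseUrl;
        return HttpRequest.newBuilder(URI.create(base + "/_mapping"))
                .header("Authorization", basicAuthHeader(username, password))
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Placeholder values standing in for the three vitro.local.searchengine.* properties.
        HttpRequest request = mappingRequest("http://localhost:9200/vivo", "elastic", "changeme");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To actually run the check, pass the request to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` and inspect the status code: 200 suggests the index exists and the credentials work, 401 points at the username or password, and 404 at a missing index.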

A mapping for the search index

curl -X PUT "localhost:9200/vivo?pretty" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "default": {
            "type": "english"
          }
        }
      }
    }
  },
  "mappings": {
    "dynamic_templates": [
      {
        "field_sort_template": {
          "match": "*_label_sort",
          "mapping": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword"
              }
            },
            "fielddata": true
          }
        }
      },
      {
        "field_ss_template": {
          "match": "*_ss",
          "mapping": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            },
            "fielddata": true
          }
        }
      }
    ],
    "properties": { 
      "ALLTEXT": { 
        "type": "text",
        "analyzer": "english"
      }, 
      "ALLTEXTUNSTEMMED": { 
        "type": "text",
        "analyzer": "standard"
      }, 
      "DocId": {
        "type": "keyword"  
      }, 
      "classgroup": {
        "type": "keyword"  
      }, 
      "type": {
        "type": "keyword"  
      }, 
      "mostSpecificTypeURIs": {
        "type": "keyword"  
      }, 
      "indexedTime": { 
        "type": "long" 
      },
      "nameRaw": { 
        "type": "keyword" 
      },
      "URI": { 
        "type": "keyword" 
      },
      "THUMBNAIL": { 
        "type": "integer" 
      },
      "THUMBNAIL_URL": { 
        "type": "keyword" 
      },
      "nameLowercaseSingleValued": {
        "type": "text",
        "analyzer": "standard",
        "fielddata": true
      },
      "BETA" : {
        "type" : "float"
      }
    }
  }
}
'

Modify applicationSetup.n3

  • Change this (it is already changed in this PR):
# ----------------------------
#
# Search engine module: 
#    The Solr-based implementation is the only standard option, but it can be
#    wrapped in an "instrumented" wrapper, which provides additional logging 
#    and more rigorous life-cycle checking.
#

:instrumentedSearchEngineWrapper 
    a   <java:edu.cornell.mannlib.vitro.webapp.searchengine.InstrumentedSearchEngineWrapper> , 
        <java:edu.cornell.mannlib.vitro.webapp.modules.searchEngine.SearchEngine> ;
    :wraps :solrSearchEngine .

  • To this:
# ----------------------------
#
# Search engine module: 
#    The Solr-based implementation is the only standard option, but it can be
#    wrapped in an "instrumented" wrapper, which provides additional logging 
#    and more rigorous life-cycle checking.
#

:instrumentedSearchEngineWrapper 
    a   <java:edu.cornell.mannlib.vitro.webapp.searchengine.InstrumentedSearchEngineWrapper> , 
        <java:edu.cornell.mannlib.vitro.webapp.modules.searchEngine.SearchEngine> ;
    :wraps :elasticSearchEngine .

:elasticSearchEngine
    a   <java:edu.cornell.mannlib.vitro.webapp.searchengine.elasticsearch.ElasticSearchEngine> ,
        <java:edu.cornell.mannlib.vitro.webapp.modules.searchEngine.SearchEngine> .

Your setup should now be complete 😃! After this, perform the common manual tests that are done for every new release.

Interested parties

@chenejac

Reviewers' expertise

Candidates for reviewing this PR should have expertise in some of the following:

  1. Java
  2. Elasticsearch 7.* or 8.*

@ivanmrsulja ivanmrsulja marked this pull request as draft June 11, 2024 07:11
@chenejac chenejac marked this pull request as ready for review June 18, 2024 13:54
@ivanmrsulja ivanmrsulja changed the title Small mapping update and response parsing fix. Elasticsearch 7.* and 8.* integration. Jun 24, 2024
@ivanmrsulja ivanmrsulja changed the title Elasticsearch 7.* and 8.* integration. Elasticsearch 7.* and 8.* integration. OpenSearch integration. Jul 5, 2024