Vald Similarity Search using chiVe Dataset

This example demonstrates text similarity search with the chiVe dataset.

It uses a Vald cluster as the search engine and Jupyter Notebook to run the example.

Requirements

NOTE: We recommend completing "Get Started" before running the notebook.

This example requires a running Vald cluster.

The remaining requirements are installed when the example is executed.

How it works

Vald Installation

  1. Prepare Kubernetes Cluster

    # verify
    kubectl cluster-info
  2. Add Vald charts to Helm repo

    helm repo add vald https://vald.vdaas.org/charts
  3. Deploy the Vald cluster

    helm install vald vald/vald --values path/to/helm/values.yaml

    NOTE: When using the chiVe dataset, use sample-values.yaml or set the following values in your own values file.

    # edit path/to/helm/values.yaml
    agent:
      ngt:
        dimension: 300
        distance_type: cos
  4. Verify

    kubectl get pods
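
The two values in the NOTE above describe the vectors the cluster will store: the chiVe configuration uses 300 components per vector, and `cos` selects a cosine-based distance. As a rough illustration of what those settings mean for the data sent to Vald, the sketch below rejects vectors of the wrong length and computes a cosine distance in plain Python. `EXPECTED_DIMENSION` and `cosine_distance` are hypothetical names for this illustration, and the distance is assumed here to be 1 minus cosine similarity; this is not NGT's actual implementation.

```python
import math

# Assumption: mirrors `agent.ngt.dimension` from the values.yaml snippet above.
EXPECTED_DIMENSION = 300


def cosine_distance(a, b, dimension=EXPECTED_DIMENSION):
    """Return 1 - cosine similarity of two vectors of the given dimension."""
    if len(a) != dimension or len(b) != dimension:
        raise ValueError(
            "expected %d-dimensional vectors, got %d and %d"
            % (dimension, len(a), len(b))
        )
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy vectors only; real inputs would be chiVe embeddings.
    same = [1.0] * EXPECTED_DIMENSION
    opposite = [-1.0] * EXPECTED_DIMENSION
    print(cosine_distance(same, same))
    print(cosine_distance(same, opposite))
```

A vector whose length differs from the configured dimension raises `ValueError` in this sketch, which is the kind of mismatch the NOTE is meant to prevent.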

Run Jupyter Notebook on Docker

  1. Download the dataset

    Before running the Docker image, download the chiVe dataset in Magnitude format.

    curl "https://sudachi.s3-ap-northeast-1.amazonaws.com/chive/chive-1.2-mc90.magnitude" -o "chive-1.2-mc90.magnitude"
  2. Verify the endpoint of the Vald cluster

    The notebook sends requests to the Vald cluster, so it needs the cluster endpoint. Check which endpoint applies to your setup.

    • If Kubernetes Ingress is enabled, use the Ingress host and port.

      kubectl get ingress
    • If Ingress is disabled, expose the endpoint with kubectl port-forward.

      # port-forward (the endpoint will be {host ip}:8081)
      kubectl port-forward svc/vald-lb-gateway 8081:8081
  3. Run Jupyter Notebook on Docker

    # Use the python-3.7.6 image because Magnitude does not support newer Python versions (as of 2022-06).
    docker run --user root -it -v "$(pwd)":/home/jovyan/work -p 8888:8888 -e UB_UID=root -e GRANT_SUDO=yes jupyter/datascience-notebook:python-3.7.6
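
Before opening the notebook, it can help to confirm that the endpoint from step 2 accepts TCP connections. The sketch below uses only the Python standard library. `is_reachable` is a hypothetical helper, and the defaults (`localhost`, port 8081) are assumptions based on the port-forward example; replace them with your host IP or Ingress host and port. A successful connection only shows that the port is open, not that the Vald gateway is healthy.

```python
import socket
import sys


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Assumption: defaults match the port-forward example (port 8081).
    host = sys.argv[1] if len(sys.argv) > 1 else "localhost"
    port = int(sys.argv[2]) if len(sys.argv) > 2 else 8081
    state = "reachable" if is_reachable(host, port) else "not reachable"
    print("%s:%d is %s" % (host, port, state))
```

When run as a script, the host and port can be passed as the first and second command-line arguments.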

Execute example

  1. Open Jupyter Notebook in your browser, using port 8888 published by the docker run command

  2. Select the notebook and execute the example

Cleanup

  1. Stop the Docker container with Ctrl-C.

  2. Delete the Vald cluster with helm uninstall vald, or by the method that matches how you deployed it.