This example demonstrates text similarity search with the chiVe dataset.
It uses a Vald cluster as the search engine and Jupyter Notebook to run the example.
NOTE: We recommend completing "Get Started" before running the notebook.
Running this example requires a Vald cluster.
The following requirements will be installed when executing the example.
-
Prepare Kubernetes Cluster
# verify
kubectl cluster-info
-
Add Vald charts to Helm repo
helm repo add vald https://vald.vdaas.org/charts
-
Deploy the Vald cluster
helm install vald vald/vald --values path/to/helm/values.yaml
NOTE: When using the chiVe dataset, use sample-values.yaml or set the following values.
# edit path/to/helm/values.yaml
agent:
  ngt:
    dimension: 300
    distance_type: cos
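As a quick sanity check of the two settings above, the following sketch greps a values file for them. check_ngt_values is a hypothetical helper name, not part of Vald or Helm, and the check is a crude text match that assumes unquoted values on their own lines. It is not a real YAML parser and does not confirm that the keys sit under agent.ngt.

```shell
# check_ngt_values FILE
# Hypothetical helper: returns 0 only if FILE contains both
# "dimension: 300" and "distance_type: cos" as whole lines
# (leading/trailing whitespace allowed). It does not verify nesting.
check_ngt_values() {
    values_file="$1"
    grep -Eq '^[[:space:]]*dimension:[[:space:]]*300[[:space:]]*$' "$values_file" &&
        grep -Eq '^[[:space:]]*distance_type:[[:space:]]*cos[[:space:]]*$' "$values_file"
}
```

For example, check_ngt_values path/to/helm/values.yaml && echo "values match chiVe" prints the message only when both lines are found.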
-
Verify
kubectl get pods
-
Download the dataset
Before running the Docker image, download the chiVe dataset in Magnitude format.
curl "https://sudachi.s3-ap-northeast-1.amazonaws.com/chive/chive-1.2-mc90.magnitude" -o "chive-1.2-mc90.magnitude"
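To avoid downloading the dataset again on repeated runs, the download can be wrapped in a small guard. This is a minimal sketch, assuming curl is installed; fetch_if_missing is a hypothetical helper name, not part of Vald.

```shell
# fetch_if_missing URL DEST
# Hypothetical helper: downloads URL to DEST unless DEST already exists
# and is non-empty.
fetch_if_missing() {
    url="$1"
    dest="$2"
    if [ -s "$dest" ]; then
        echo "already present: $dest"
        return 0
    fi
    # --fail makes curl exit non-zero on HTTP errors; --location follows redirects.
    curl --fail --location --output "$dest" "$url"
}
```

It could be called with the URL and file name from the command above. Note that the guard only checks that the file is non-empty, so an interrupted partial download would still be treated as present.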
-
Verify the endpoint of Vald cluster
The notebook sends its requests to the Vald cluster endpoint, so verify your cluster endpoint first.
-
If Kubernetes ingress is enabled, you can use the ingress host and port.
kubectl get ingress
-
If ingress is disabled, you can reach the endpoint by running kubectl port-forward.
# port-forward (the endpoint will be {host ip}:8081)
kubectl port-forward svc/vald-lb-gateway 8081:8081
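Whichever option applies, it may be worth confirming that the endpoint accepts TCP connections before opening the notebook. The sketch below assumes bash with /dev/tcp support (plain sh will not work); endpoint_reachable is a hypothetical helper name. It only tests that the port is open, not that Vald answers requests correctly.

```shell
#!/usr/bin/env bash
# endpoint_reachable HOST PORT
# Hypothetical helper: returns 0 if a TCP connection to HOST:PORT can be opened.
# Assumption: bash was built with /dev/tcp redirection support.
endpoint_reachable() {
    local host="$1" port="$2"
    # The redirection attempts a TCP connect; the subshell closes the
    # descriptor when it exits, and errors are silenced.
    (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}
```

With the port-forward above running, endpoint_reachable localhost 8081 && echo "endpoint is up" would be one way to use it. There is no timeout in this sketch, so an unreachable remote host may take a while to fail.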
-
Run Jupyter Notebook on Docker
# use the python-3.7.6 image because Magnitude does not support newer Python versions (as of 2022-06)
docker run --user root -it -v $(pwd):/home/jovyan/work -p 8888:8888 -e UB_UID=root -e GRANT_SUDO=yes jupyter/datascience-notebook:python-3.7.6
-
Access via browser
-
Select Notebook and execute the example
-
Stop the Docker container by pressing Ctrl-C.
-
Delete the Vald cluster with helm uninstall vald, or by whichever method matches how you deployed it.