Vespa Search Application

Overview

This project builds a search application on Vespa, a platform for scalable, low-latency data serving. It processes movie data (tmdb_5000_movies.csv), deploys a Vespa instance in Docker, feeds the processed data into it, and runs keyword and semantic searches against it.

Prerequisites

- Python 3.x
- Docker Desktop (installed and running)
- vespacli (provides the vespa command used to deploy and feed) and pyvespa (the Python module used to run queries)

Steps to Complete the Assignment

1. Data Processing

Run the provided script (process_tmdb_csv_2_jsonl.py) to convert tmdb_5000_movies.csv into a Vespa-compatible JSON Lines file.

from process_tmdb_csv_2_jsonl import process_tmdb_csv
process_tmdb_csv("tmdb_5000_movies.csv", "clean_tmdb.jsonl")

Verify the output: Ensure that clean_tmdb.jsonl contains the required fields (doc_id, title, and text).
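
The verification step can be scripted. The sketch below is one possible check, not part of the provided script: `find_invalid_records` is a hypothetical helper, and it assumes each line of clean_tmdb.jsonl is a JSON object whose doc_id, title, and text sit either at the top level or under a "fields" key (the layout used by Vespa feed operations).

```python
import json

REQUIRED_FIELDS = ("doc_id", "title", "text")


def find_invalid_records(path, required=REQUIRED_FIELDS):
    """Return a list of (line_number, missing_fields) for incomplete records."""
    problems = []
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # ignore blank lines
            record = json.loads(line)
            # Assumption: fields are either nested under "fields" or top-level.
            fields = record.get("fields")
            if not isinstance(fields, dict):
                fields = record
            missing = [name for name in required if name not in fields]
            if missing:
                problems.append((line_number, missing))
    return problems
```

Calling `find_invalid_records("clean_tmdb.jsonl")` returns an empty list when every record carries all three fields; otherwise each entry names the line and the fields it lacks.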

2. Run Vespa as a Docker Container

Pull and Run Vespa Container:

docker pull vespaengine/vespa
docker run --detach --name vespa-hybrid --hostname vespa-container --publish 19071:19071 --publish 8082:8080 vespaengine/vespa

Verify the Container:
- Run docker ps to confirm the container is running.
- Access http://localhost:19071/state/v1/health to check that the config server, which hosts the deployment API, is up.
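
The container can take a while to start, so a single request right after docker run may fail. The sketch below polls the config server until it reports that it is up. It assumes the /state/v1/health endpoint on port 19071 and a JSON response of the form {"status": {"code": "up"}}; `wait_until_up` and `fetch_json` are hypothetical helper names, and the retry count and delay are arbitrary.

```python
import json
import time
import urllib.request

# Assumption: the config server's health endpoint on the published port 19071.
HEALTH_URL = "http://localhost:19071/state/v1/health"


def fetch_json(url, timeout=5):
    """Fetch a URL and decode the response body as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def wait_until_up(url=HEALTH_URL, attempts=30, delay=2.0, fetch=fetch_json):
    """Poll the health endpoint; return True once it reports status code "up"."""
    for attempt in range(attempts):
        try:
            payload = fetch(url)
            if payload.get("status", {}).get("code") == "up":
                return True
        except (OSError, ValueError):
            pass  # connection refused, timeout, or a non-JSON response so far
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling `wait_until_up()` before deploying gives a clear True or False instead of a failed deploy command.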

3. Configure Vespa and Ingest Data

Install vespacli:
```
pip install --ignore-installed vespacli
```

Deploy the Application:

vespa config set target local
vespa deploy --wait 300 app

Feed Data into Vespa:

vespa feed -t http://localhost:8082 clean_tmdb.jsonl

4. Run Search Queries

Connect to Vespa Using Python:

from vespa.application import Vespa

app = Vespa(url="http://localhost", port=8082)

Run Keyword Search:

df = keyword_search(app, "Harry Potter and the Half-Blood Prince")
print(df)
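
The README does not show how keyword_search is defined. The sketch below is one way such a helper could look, not the repository's implementation. It assumes the deployed schema has a rank profile named bm25 and a title field, and it returns plain rows rather than a DataFrame (wrapping the result in pandas.DataFrame would give one). `keyword_query_body` and `run_keyword_search` are hypothetical names; `app.query(body=...)` and `response.hits` are the pyvespa calls it relies on.

```python
def keyword_query_body(query, hits=5, rank_profile="bm25"):
    """Build a Vespa query body for plain text matching."""
    return {
        # userQuery() matches the text passed in the "query" parameter.
        "yql": "select * from sources * where userQuery()",
        "query": query,
        "ranking": rank_profile,  # assumption: a "bm25" rank profile exists
        "hits": hits,
    }


def run_keyword_search(app, query, hits=5):
    """Run the keyword query and return one row per hit."""
    response = app.query(body=keyword_query_body(query, hits))
    return [
        {
            "title": hit.get("fields", {}).get("title"),
            "relevance": hit.get("relevance"),
        }
        for hit in response.hits
    ]
```

With the `app` object created above, `run_keyword_search(app, "Harry Potter and the Half-Blood Prince")` would return the top five titles with their relevance scores.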

Run Semantic Search:

df = semantic_search(app, "Harry Potter and the Half-Blood Prince")
print(df)
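
As with the keyword case, the definition of semantic_search is not shown. The sketch below illustrates one common approach, a nearest-neighbor query over an embedding field, under several assumptions: the schema has a tensor field named embedding, a rank profile named semantic that declares the query tensor q, and an embedder component configured in the application so that embed(@query) works. `semantic_query_body` and `run_semantic_search` are hypothetical names.

```python
def semantic_query_body(query, hits=5, rank_profile="semantic",
                        embedding_field="embedding", target_hits=100):
    """Build a Vespa query body for nearest-neighbor (vector) search."""
    yql = (
        "select * from sources * where "
        f"({{targetHits:{target_hits}}}nearestNeighbor({embedding_field}, q))"
    )
    return {
        "yql": yql,
        "query": query,
        "ranking": rank_profile,  # assumption: a "semantic" rank profile exists
        # Assumption: Vespa embeds the query text server-side into tensor q.
        "input.query(q)": "embed(@query)",
        "hits": hits,
    }


def run_semantic_search(app, query, hits=5):
    """Run the nearest-neighbor query and return one row per hit."""
    response = app.query(body=semantic_query_body(query, hits))
    return [
        {
            "title": hit.get("fields", {}).get("title"),
            "relevance": hit.get("relevance"),
        }
        for hit in response.hits
    ]
```

Unlike the keyword query, this one matches documents by vector distance, so it can return related movies even when the titles share no words with the query.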
