Backend for retrieving documents that can be used as context for LLMs
This Flask API provides endpoints for scraping news articles from the Google News website. It includes background tasks that periodically scrape news for a given query, and a mechanism to log API requests and track how often each user calls the API.
Setup Instructions
Prerequisites
Python 3.7 or higher
Docker (optional, for containerization)
Flask
SQLite
The API will be available at http://127.0.0.1:5000.
/health: Checks whether the server is active and, if so, reports its status as active.
/search: Takes a JSON body with parameters such as text, top_k, threshold, and user_id, and performs the query. Since no frontend is implemented yet, the endpoint can be called from the terminal with the following commands. On PowerShell:
Invoke-RestMethod -Uri http://127.0.0.1:5000/search -Method Post -Body '{
"text": "testing",
"top_k": 3,
"threshold": 0.8,
"user_id": "user123"
}' -ContentType "application/json"
On Linux:
curl -X POST http://127.0.0.1:5000/search \
-H "Content-Type: application/json" \
-d '{
"text": "testing",
"top_k": 3,
"threshold": 0.8,
"user_id": "user123"
}'
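The same request can also be sent from Python. The following is a minimal sketch using only the standard library; the helper names `build_search_request` and `search` are illustrative and not part of this project, and the sketch assumes the server replies with a JSON body.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:5000"  # address given in this README


def build_search_request(text, top_k=3, threshold=0.8, user_id="user123"):
    """Build (but do not send) a POST request for the /search endpoint."""
    payload = {
        "text": text,
        "top_k": top_k,
        "threshold": threshold,
        "user_id": user_id,
    }
    return urllib.request.Request(
        f"{BASE_URL}/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(text, **kwargs):
    """Send the request and decode the reply (assumed to be JSON)."""
    with urllib.request.urlopen(build_search_request(text, **kwargs)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `search("testing")` sends the same body as the commands above.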
The results will be displayed as:
If the user's API limit is hit, then:
The program first checks the articles database to see whether the query has been made before. If it has, the stored data is returned from the database. If the query is not found, the scraper runs and its results are added to articles.db, so that repeated queries do not trigger a new scrape.
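The lookup-then-scrape flow can be sketched roughly as follows. This is not the project's actual code: the function `get_links`, the stand-in `fake_scrape`, and the `query` column are assumptions made for illustration (the README lists only `id` and `link` for the articles table), and an in-memory database is used instead of articles.db.

```python
import sqlite3


def get_links(conn, query, scrape):
    """Return cached links for `query`, scraping only on a cache miss."""
    rows = conn.execute(
        "SELECT link FROM articles WHERE query = ? ORDER BY id", (query,)
    ).fetchall()
    if rows:
        return [row[0] for row in rows]  # cache hit: skip the scraper

    links = scrape(query)  # cache miss: fetch fresh results
    conn.executemany(
        "INSERT INTO articles (query, link) VALUES (?, ?)",
        [(query, link) for link in links],
    )
    conn.commit()
    return links


# In-memory database so the sketch runs without touching articles.db.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "query TEXT, "  # hypothetical column, not listed in this README
    "link TEXT)"
)

calls = []  # records every time the stand-in scraper is invoked


def fake_scrape(query):
    """Stand-in for the real scraper; returns placeholder links."""
    calls.append(query)
    return [f"https://example.com/{query}/1", f"https://example.com/{query}/2"]


first = get_links(conn, "testing", fake_scrape)   # scraper runs
second = get_links(conn, "testing", fake_scrape)  # served from the table
```

After both calls, `calls` contains a single entry, since the second lookup is answered from the table.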
articles.db
Table: articles
Columns:
id: INTEGER PRIMARY KEY AUTOINCREMENT
link: TEXT
Purpose: Stores news articles associated with queries.
api_requests.db
Table: api_requests
Columns:
id: INTEGER PRIMARY KEY AUTOINCREMENT
user_id: TEXT
query: TEXT
results: TEXT
inference_time: REAL
timestamp: DATETIME DEFAULT CURRENT_TIMESTAMP
Purpose: Logs API requests including query, results, and inference time.
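As an illustration of how a row in this table might be written, here is a small sketch using the schema above. The helper names `init_request_log` and `timed_search` are hypothetical, and two details are assumptions rather than documented behaviour: results are serialized as JSON text, and inference_time is measured in seconds.

```python
import json
import sqlite3
import time


def init_request_log(conn):
    """Create the api_requests table with the columns listed above."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS api_requests (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            user_id TEXT,
            query TEXT,
            results TEXT,
            inference_time REAL,
            timestamp DATETIME DEFAULT CURRENT_TIMESTAMP)"""
    )


def timed_search(conn, user_id, query, run_query):
    """Run `run_query`, then log the query, its results, and the elapsed time."""
    start = time.perf_counter()
    results = run_query(query)
    elapsed = time.perf_counter() - start  # seconds (assumed unit)
    conn.execute(
        "INSERT INTO api_requests (user_id, query, results, inference_time) "
        "VALUES (?, ?, ?, ?)",
        (user_id, query, json.dumps(results), elapsed),  # JSON is an assumption
    )
    conn.commit()
    return results


# In-memory database so the sketch does not touch api_requests.db.
conn = sqlite3.connect(":memory:")
init_request_log(conn)
timed_search(conn, "user123", "testing", lambda q: ["https://example.com/a"])
row = conn.execute(
    "SELECT user_id, query, results, inference_time, timestamp FROM api_requests"
).fetchone()
```

The timestamp column is filled in by SQLite's default, so the insert does not need to supply it.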
user_calls.db
Table: user_calls
Columns:
user_id: TEXT PRIMARY KEY
call_frequency: INTEGER
Purpose: Tracks the frequency of API calls per user.
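One way the counter could be updated and checked against a limit is sketched below. The function `register_call` and the value of `CALL_LIMIT` are hypothetical; the README does not state the actual limit or how the check is implemented.

```python
import sqlite3

CALL_LIMIT = 5  # hypothetical value; the real limit is not documented here


def init_user_calls(conn):
    """Create the user_calls table with the columns listed above."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS user_calls ("
        "user_id TEXT PRIMARY KEY, call_frequency INTEGER)"
    )


def register_call(conn, user_id, limit=CALL_LIMIT):
    """Increment the user's counter and report whether the call is allowed."""
    # Create the row on the first call; do nothing if it already exists.
    conn.execute(
        "INSERT OR IGNORE INTO user_calls (user_id, call_frequency) VALUES (?, 0)",
        (user_id,),
    )
    conn.execute(
        "UPDATE user_calls SET call_frequency = call_frequency + 1 WHERE user_id = ?",
        (user_id,),
    )
    conn.commit()
    (count,) = conn.execute(
        "SELECT call_frequency FROM user_calls WHERE user_id = ?", (user_id,)
    ).fetchone()
    return count <= limit


# In-memory database so the sketch does not touch user_calls.db.
conn = sqlite3.connect(":memory:")
init_user_calls(conn)
outcomes = [register_call(conn, "user123", limit=2) for _ in range(3)]
print(outcomes)  # [True, True, False]
```

The third call is rejected because the counter has reached 3, which exceeds the limit of 2 used in this example.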
The Docker image can be built from the directory containing the Dockerfile with:
docker build -t my-flask-app .
To run the docker container:
docker run -p 5000:5000 my-flask-app
The application can be accessed at http://localhost:5000.