This is an experiment using LLM (large language model) completions to help me remember the details of books and other text I've read. In this case, our AI assistant has "read" the contents of Flights by Olga Tokarczuk and can look up excerpts from the book before answering questions.
This happens in a few steps:
- Pre-process the source text(s) by generating embeddings (vector representations of chunks of text) and writing these to file.
- Seed a database with the text + embeddings for easy lookup (in this case, using Postgres + pgvector to store the embeddings).
- Spin up a web server. The frontend is a simple form that takes a text `question` and displays a text `answer`. The API handles this with `POST /ask`.
- The backend fetches a vector embedding for the `question`, then uses it to look up the most relevant chunk(s) of text (pgvector uses a version of approximate nearest neighbor search to calculate the vectors' cosine similarity).
- Construct a text `query` using the `context` and `question` and run it through the LLM (in this case, OpenAI's `/completions` API).
- Save questions/answers in another table (this lets us skip the lookup and LLM call if the question was already asked).
The above is built using Ruby on Rails / React. It uses OpenAI for the embeddings/completions, but these could be swapped for a different LLM, especially if we want to do more fine-tuning down the line.
Initial version based on askmybook.com. Also, thanks to these resources.
This is a pretty standard RoR setup. The main concerns/files are:
- pre-processing script `bin/generate_embeddings`
  - this reads a PDF page-by-page, fetches OpenAI embeddings for each page, and saves them along with the page data to CSV
- database
- api
- frontend
  - it's a React app, bundled with esbuild (see `package.json`) and served by the home/index route. The main layout (`app/views/layouts/application.html.erb`) imports the bundle; the route to `views/home/index.html.erb` is just a blank page.
  - the React code that gets bundled/loaded lives in `app/javascript`
  - note: the frontend uses react-router for some browser-side history manipulation (`{index_path}/questions/:id`, etc.)
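The API's relevant-chunk lookup against pgvector typically uses a query along these lines. The table and column names here are hypothetical (not this repo's actual schema), and `<=>` is pgvector's cosine-distance operator, where a smaller distance means a more similar chunk:

```ruby
# Build a pgvector nearest-neighbor query for a question embedding.
# Table/column names are illustrative; `<=>` is pgvector's cosine-distance
# operator, so ordering ascending returns the most similar pages first.
def nearest_pages_sql(limit: 3)
  <<~SQL
    SELECT content, embedding <=> $1 AS distance
    FROM pages
    ORDER BY embedding <=> $1
    LIMIT #{limit}
  SQL
end
```

The `$1` placeholder would be bound to the question's embedding (serialized in pgvector's `[x, y, ...]` literal format) when the query is executed.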
See `Gemfile` and `package.json` for dependencies. Install them with:

```
bundle install
yarn install
```
You'll also need:

- postgres
- pgvector
Run the script `bin/generate_embeddings` with `book.pdf` in the project root:

```
rails runner bin/generate_embeddings --pdf book.pdf
```

This will generate two CSV files: `book.pdf.pages.csv` and `book.pdf.embeddings.csv`.
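One way an embedding vector can be serialized into a single CSV cell is to JSON-encode the float array. This is an assumption about the format for illustration; the actual column layout of `book.pdf.embeddings.csv` may differ:

```ruby
require "csv"
require "json"

# Write page embeddings to CSV, JSON-encoding each vector into one cell.
# (Illustrative format, not necessarily what bin/generate_embeddings emits.)
def write_embeddings_csv(path, rows)
  CSV.open(path, "w") do |csv|
    csv << ["page", "embedding"]
    rows.each { |page, vec| csv << [page, JSON.generate(vec)] }
  end
end

# Read the CSV back, decoding each cell into a float array.
def read_embeddings_csv(path)
  CSV.read(path, headers: true).map do |row|
    [row["page"].to_i, JSON.parse(row["embedding"])]
  end
end
```

JSON-encoding keeps the vector readable and round-trippable without worrying about CSV escaping of the comma-separated floats.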
Then start the app with:

```
bin/dev
```
- Ruby version: `ruby-3.0.0`
- View available routes: `rails routes`
- Database creation: `rails db:create`
- Database initialization: `rails db:migrate`
- How to run the test suite: `bin/dev test`
- setup react (`react react-dom`)
- setup esbuild (`build.js`)
- basic App.jsx and view/controller setup
- ruby script to create csvs
- api routing (`/ask`)
- `/ask` controller for api integrations
- db etc
  - setup Questions model
  - store/lookup questions
  - allow looking up times called?
- tidy up
  - frontend design
  - tweaks
  - seed some questions
- tests
  - cover the major pieces, priority: ask_helper, ask_controller, openai_service
- improvements
  - storage/lookup
    - store doc embeddings in pg (use pgvector)
    - store question embeddings in pg
  - pre-processing script
    - experiment with other chunks besides pages (chapter headings, for example)
    - batch calls to the embeddings endpoint (right now it's one per page)
  - context
    - store previous Q+A
    - add more metadata (loc in book)
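For the "store doc embeddings in pg" improvement, a migration might look roughly like this, assuming the `neighbor` gem's Rails integration for pgvector. The class and table names are hypothetical, and 1536 is the dimensionality of OpenAI's `text-embedding-ada-002` embeddings:

```ruby
# Hypothetical Rails migration adding a pgvector column for page embeddings.
# Assumes the `neighbor` gem, which teaches ActiveRecord the :vector type.
class AddEmbeddingToPages < ActiveRecord::Migration[7.0]
  def change
    enable_extension "vector" # requires the pgvector extension to be installed
    add_column :pages, :embedding, :vector, limit: 1536
  end
end
```

With the embeddings in Postgres, the nearest-neighbor lookup moves from application code into a single SQL query ordered by pgvector's distance operator.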