diff --git a/README.md b/README.md
index 743a5c9ff8..254e3c6b5b 100644
--- a/README.md
+++ b/README.md
@@ -10,9 +10,6 @@
     <a href="https://github.com/devflowinc/trieve/stargazers">
         <img src="https://img.shields.io/github/stars/devflowinc/trieve.svg?style=flat&color=yellow" alt="Github stars"/>
     </a>
-    <a href="https://github.com/devflowinc/trieve/issues">
-        <img src="https://img.shields.io/github/issues/devflowinc/trieve.svg?style=flat&color=success" alt="GitHub issues"/>
-    </a>
     <a href="https://discord.gg/CuJVfgZf54">
         <img src="https://img.shields.io/discord/1130153053056684123.svg?label=Discord&logo=Discord&colorB=7289da&style=flat" alt="Join Discord"/>
     </a>
@@ -25,7 +22,7 @@
     <b>All-in-one solution for search, recommendations, and RAG</b>
 </h2>
 
-[![Trieve dashboard preivew](https://cdn.trieve.ai/dashboard.webp)](https://dashboard.trieve.ai)
+[![Trieve dashboard preview](https://cdn.trieve.ai/landing-tabs/dark-mode-docsearch.webp)](https://dashboard.trieve.ai)
 
 ## Quick Links
 
diff --git a/pdf2md/.env.dist b/pdf2md/.env.dist
index 6fd9317487..982f95e615 100644
--- a/pdf2md/.env.dist
+++ b/pdf2md/.env.dist
@@ -26,6 +26,6 @@ LLM_MODEL=gpt-4o-mini
 # PDF2MD HTTP API server
 API_KEY=admin
 
-# Chunkr - Get your API key from https://chunkr.ai
+# OPTIONAL: Chunkr - Get your API key from https://chunkr.ai
 CHUNKR_API_URL=https://api.chunkr.ai
 CHUNKR_API_KEY=*********************
\ No newline at end of file
diff --git a/pdf2md/CONTRIBUTING.md b/pdf2md/CONTRIBUTING.md
index 12501286a7..f74cb7bf87 100644
--- a/pdf2md/CONTRIBUTING.md
+++ b/pdf2md/CONTRIBUTING.md
@@ -4,10 +4,16 @@
 
 ```bash
 cd server
-cp .env.dist .env
+cp ../.env.dist .env
 ```
 
-## Run dep processes
+You will need to replace `LLM_API_KEY` with your key for OpenRouter, OpenAI, LiteLLM, or whichever OpenAI-compatible API you have set as `LLM_BASE_URL`.
+
+If you want Chunkr support, get an API key for their service from [chunkr.ai](https://chunkr.ai) and set it as the value of `CHUNKR_API_KEY`.
+
+## Run dependency services
+
+This will start MinIO S3, ClickHouse, and Redis.
 
 ```bash
 docker compose --profile dev up -d
diff --git a/pdf2md/LICENSE b/pdf2md/LICENSE
new file mode 100644
index 0000000000..d13cc4b26a
--- /dev/null
+++ b/pdf2md/LICENSE
@@ -0,0 +1,19 @@
+The MIT License (MIT)
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
diff --git a/pdf2md/README.md b/pdf2md/README.md
index 1a3a507134..0dca7e99b9 100644
--- a/pdf2md/README.md
+++ b/pdf2md/README.md
@@ -1,153 +1,93 @@
-# Contributing to PDF2MD
+<p align="center">
+  <img height="100" src="https://trieve.b-cdn.net/trieve-logo.png" alt="Trieve Logo">
+</p>
+<p align="center">
+<strong><a href="https://pdf2md.trieve.ai/redoc">API reference</a> | <a href="https://cal.com/nick.k/meet">Meet a Maintainer</a> | <a href="https://discord.gg/eBJXXZDB8z">Discord</a> | <a href="https://matrix.to/#/#trieve-general:trieve.ai">Matrix</a> | <a href="mailto:humans@trieve.ai">humans@trieve.ai</a>
+</strong>
+</p>
 
-## Project Setup
+<p align="center">
+    <a href="https://github.com/devflowinc/trieve/stargazers">
+        <img src="https://img.shields.io/github/stars/devflowinc/trieve.svg?style=flat&color=yellow" alt="Github stars"/>
+    </a>
+    <a href="https://discord.gg/CuJVfgZf54">
+        <img src="https://img.shields.io/discord/1130153053056684123.svg?label=Discord&logo=Discord&colorB=7289da&style=flat" alt="Join Discord"/>
+    </a>
+    <a href="https://matrix.to/#/#trieve-general:trieve.ai">
+        <img src="https://img.shields.io/badge/matrix-join-purple?style=flat&logo=matrix&logocolor=white" alt="Join Matrix"/>
+    </a>
+</p>
 
-### Setup ENV's
+<h1 align="center">🦀 PDF2MD 🦀</h1>
 
-```bash
-cd server
-cp ../.env.dist .env
-```
+<h2 align="center">
+    <b>Self-hostable API server and pipeline for converting PDFs to Markdown using thrifty large language vision models like GPT-4o-mini and gemini-flash-1.5.</b>
+</h2>
 
-### Start docker dependency services
+<h4 align="center">Written in Rust. Try at <a href="https://pdf2md.trieve.ai">pdf2md.trieve.ai</a>.</h4>
 
-- redis
-- s3
-- clickhouse-db
+[![PDF2MD service preview](https://cdn.trieve.ai/pdf2md/pdf2md-preview.webp)](https://pdf2md.trieve.ai)
 
-```bash
-docker compose up -d
-```
+## The Stack
 
-### Run Server + Workers
+There's no compelling reason why Rust is necessary for this, but we wanted to have some fun 😜. Everything is free and open source. You can self-host easily with `docker-compose` or `kube` following the [SELF-HOSTING guide here](https://github.com/devflowinc/trieve/tree/main/pdf2md/SELF-HOSTING.md).
 
-Strongly recommend using tmux or another multiplex system to handle the different proceses.
+- [minijinja templates](https://github.com/mitsuhiko/minijinja) for the [UI](https://pdf2md.trieve.ai)
+    - there was no way I was going to write more JSX
+- [PDFObject](https://github.com/pipwerks/pdfobject) to view PDFs in the [demo UI](https://pdf2md.trieve.ai)
+- [actix/actix-web](https://github.com/actix/actix-web) for the HTTP server
+- [fun redis queue macro system](https://github.com/devflowinc/trieve/blob/main/pdf2md/server/src/operators/redis.rs#L7-L62) for worker pattern async processing
+    - redis queues are a core part of our infra for Trieve, but we made our system a lot more repeatable with this macro
+    - there will be a future release of this macro in an isolated crate
+- [ClickHouse](https://github.com/ClickHouse/ClickHouse) for task storage
+    - we have had a surprising number of Postgres issues (especially write locks) building Trieve, so ClickHouse as the primary data store here is cool
+- [MinIO S3](https://github.com/minio/minio) for file storage
 
-```bash
-cargo watch -x run #HTTP server
-cargo run --bin supervisor-worker
-cargo run --bin chunk-worker
-```
+## How does PDF2MD work?
 
-### Run tailwindcss server for demo UI
+Workers horizontally scale on-demand to handle high volume periods. Usually `chunk-worker` needs to scale before `supervisor-worker`. Pages for a given `Task` stream in as the `chunk-worker` calls out to the LLM to get markdown for them.
 
-```
-npx tailwindcss -i ./static/in.css -o ./static/output.css --watch
-```
+### 1. HTTP server
 
-### Testing using the CLI
+1. HTTP server receives a base64-encoded PDF and decodes it
+2. Creates `FileTask` for document in ClickHouse
+3. Adds `FileTask` along with the base64-encoded file to `files_to_process` queue in Redis
 
-Make your changes then use the following to run:
+### 2. Supervisor Worker
 
-```bash
-cd cli
-cargo run -- help #or other command instead of help
-```
+1. `supervisor-worker` continuously polls the `files_to_process` Redis queue until it grabs a `FileTask` and its base64-encoded file
+2. Decodes the base64 into a PDF and puts the PDF into S3
+3. Splits the PDF into pages, converts them to JPEGs
+4. Puts each JPEG page image into S3
+5. Pushes a `ChunkingTask` for each page into the `files_to_chunk` Redis queue
 
-## Deploying 
+### 3. Chunk Worker
 
-### Docker Compose
+1. `chunk-worker` continuously polls the `files_to_chunk` Redis queue until it grabs a `ChunkingTask`
+2. Gets its page image from S3
+3. Sends the image to the LLM provider at `LLM_BASE_URL` along with the `prompt` and `model` on the request to get markdown
+4. Updates the task with the markdown for the page
 
-Use the docker-compose-prod.yaml file to deploy the application.
+## Why Make This?
 
-```bash
-docker compose up -f docker-compose-prod.yaml -d
-```
+Trieve has used [Apache Tika](https://tika.apache.org/) to process various filetypes for the past year, which means that files with complex layouts and diagrams have been poorly ingested.
 
-You can either chose to build locally or pull the pre-built images from the docker hub.
+We saw [OmniAI](https://github.com/getomni-ai) launch [zerox](https://github.com/getomni-ai/zerox) and show that 4o-mini was a viable and cheap way to handle these filetypes, and decided it was time to integrate something better than Tika into Trieve.
 
-#### Build Options
-##### Build On Machine:
+We previously lightly contributed to [Chunkr](https://github.com/lumina-ai-inc/chunkr), which is a more advanced system that leverages layout detection and dedicated OCR models to process documents, but still felt the need to build something ourselves since it was a bit complex to work into Trieve's local dev and self-hosting setup. Zerox's approach using just a vision LLM was ideal and the path we went with.
 
-```bash
-docker compose up -f docker-compose-prod.yaml -d --build
-```
+We wrote our own API server and pipeline using Rust, Redis queues, and ClickHouse in the Trieve style to achieve this. Try it using our demo UI hosted at [pdf2md.trieve.ai](https://pdf2md.trieve.ai).
 
-##### Use Pre-built Images:
-```bash
-docker compose up -f docker-compose-prod.yaml -d --pull always
-```
+## Roadmap
 
-#### Setup Caddy reverse proxy (optional)
+Please contribute if you can! We could use help 🙏.
 
-Setup a Caddyfile with the following content:
+1. Rename everything from `chunk` to `page` because we eventually decided that we would only deal with PDF --> Markdown conversion and not chunking. Consider using [chonkie](https://github.com/bhavnicksm/chonkie) with the markdown output for this.
+2. Use [ClickHouse MergeTree](https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/mergetree) instead of updating `Task`s in ClickHouse as that's more correct.
+3. `supervisor-worker` can get overwhelmed when it receives a large PDF as splitting into pages can take a while. There should be something better here.
+4. Users should be able to send a URL to a file instead of base64 encoding it if they have one because that's easier. 
+5. Users should be able to point `PDF2MD` at an S3 bucket and let it process all of the files in it automatically instead of having to send each file 1 by 1 🤮.
 
-```bash
-# Global options
-{
-    email developer@example.com
-}
+---
 
-# Define a site block for pdftomd.example.com
-pdftomd.example.com {
-    reverse_proxy localhost:8081
-}
-```
-
-Start the caddy reverse proxy. This should also handle your ssl
-
-```bash
-sudo systemctl reload caddy.service
-```
-
-### Kubernetes
-
-```bash
-kubectl apply -f k8s/
-```
-
-You can now access pdf2md within the kubernetes cluster at `http://pdf2md.default.svc.cluster.local`
-To access it from outside the cluster:
-- You can use a service of type `LoadBalancer` or `NodePort`.
-- You can setup an Ingress (by default, the ingress is enabled in the k8s files).
-
-#### Setup Ingress (optional)
-
-```bash
-kubectl get ingress
-```
-
-##### GKE Ingress
-
-For gke ingress, you need to set add `kubernetes.io/ingress.class` annotation to `gce` in the ingress yaml file.
-
-Here is an example of how it looks:
-
-```yaml
-apiVersion: networking.k8s.io/v1
-kind: Ingress
-metadata:
-  name: pdf2md-ingress
-  annotations:
-    kubernetes.io/ingress.class: "gce"
-spec:
-  defaultBackend:
-    service:
-      name: pdf2md-api
-      port:
-        number: 80
-```
-
-NAME             CLASS    HOSTS   ADDRESS          PORTS   AGE
-pdf2md-ingress   <none>   *       34.107.134.128   80      4h33m
-```
-
-##### EKS Ingress
-
-For eks you need to set kubernetes.io/ingress.class to `alb` and set `spec.ingressClassName` to `alb` in the ingress yaml file.
-
-```yaml
-apiVersion: networking.k8s.io/v1
-kind: Ingress
-metadata:
-  name: pdf2md-ingress
-  annotations:
-    kubernetes.io/ingress.class: "alb"
-spec:
-  ingressClassName: "alb"
-  defaultBackend:
-    service:
-      name: pdf2md-api
-      port:
-        number: 80
-```
+Made with ❤️ in San Francisco
diff --git a/pdf2md/SELF-HOSTING.md b/pdf2md/SELF-HOSTING.md
new file mode 100644
index 0000000000..74f5f3fdb8
--- /dev/null
+++ b/pdf2md/SELF-HOSTING.md
@@ -0,0 +1,107 @@
+# Deploying 
+
+### Docker Compose
+
+Use the docker-compose-prod.yaml file to deploy the application.
+
+```bash
+docker compose -f docker-compose-prod.yaml up -d
+```
+
+You can either choose to build locally or pull the pre-built images from Docker Hub.
+
+#### Build Options
+##### Build On Machine:
+
+```bash
+docker compose -f docker-compose-prod.yaml up -d --build
+```
+
+##### Use Pre-built Images:
+```bash
+docker compose -f docker-compose-prod.yaml up -d --pull always
+```
+
+#### Setup Caddy reverse proxy (optional)
+
+Set up a Caddyfile with the following content:
+
+```bash
+# Global options
+{
+    email developer@example.com
+}
+
+# Define a site block for pdftomd.example.com
+pdftomd.example.com {
+    reverse_proxy localhost:8081
+}
+```
+
+Reload the Caddy reverse proxy so it picks up the new Caddyfile. This should also handle your SSL certificates.
+
+```bash
+sudo systemctl reload caddy.service
+```
+
+### Kubernetes
+
+```bash
+kubectl apply -f k8s/
+```
+
+You can now access pdf2md within the Kubernetes cluster at `http://pdf2md.default.svc.cluster.local`.
+To access it from outside the cluster:
+- You can use a service of type `LoadBalancer` or `NodePort`.
+- You can set up an Ingress (by default, the ingress is enabled in the k8s files).
+
+#### Setup Ingress (optional)
+
+```bash
+kubectl get ingress
+```
+
+##### GKE Ingress
+
+For GKE ingress, you need to set the `kubernetes.io/ingress.class` annotation to `gce` in the ingress YAML file.
+
+Here is an example of how it looks:
+
+```yaml
+apiVersion: networking.k8s.io/v1
+kind: Ingress
+metadata:
+  name: pdf2md-ingress
+  annotations:
+    kubernetes.io/ingress.class: "gce"
+spec:
+  defaultBackend:
+    service:
+      name: pdf2md-api
+      port:
+        number: 80
+```
+
+```
+NAME             CLASS    HOSTS   ADDRESS          PORTS   AGE
+pdf2md-ingress   <none>   *       34.107.134.128   80      4h33m
+```
+##### EKS Ingress
+
+For EKS, you need to set the `kubernetes.io/ingress.class` annotation to `alb` and set `spec.ingressClassName` to `alb` in the ingress YAML file.
+
+```yaml
+apiVersion: networking.k8s.io/v1
+kind: Ingress
+metadata:
+  name: pdf2md-ingress
+  annotations:
+    kubernetes.io/ingress.class: "alb"
+spec:
+  ingressClassName: "alb"
+  defaultBackend:
+    service:
+      name: pdf2md-api
+      port:
+        number: 80
+```
diff --git a/pdf2md/server/src/templates/skeleton.html b/pdf2md/server/src/templates/skeleton.html
index c5b77bbb81..3943a003a1 100644
--- a/pdf2md/server/src/templates/skeleton.html
+++ b/pdf2md/server/src/templates/skeleton.html
@@ -91,7 +91,7 @@
               Meet With Sales
             </a>
             <a
-              href="https://github.com/devflowinc/trieve/tree/main/pdf2md/self-hosting.md"
+              href="https://github.com/devflowinc/trieve/tree/main/pdf2md/SELF-HOSTING.md"
               class="text-sm/6 font-semibold text-gray-900 hover:text-magenta-500"
               target="_blank"
             >