feat: add KNOWLEDGE_ALLOW_PDF environment variable #42

0xbbjoker · 2025-10-15T05:59:06Z

Add KNOWLEDGE_ALLOW_PDF env var to control PDF upload support (default: enabled)
Block PDF uploads in file upload, URL upload, and docs loader when disabled
Improve error handling to return clean JSON responses (HTTP 403) instead of stack traces
Implement proper toast notification system in frontend UI
Fix toast positioning to top-right for better visibility with console open
Update README with security note about disabling PDF support

Addresses security concerns around PDF file format while maintaining backward
compatibility. When KNOWLEDGE_ALLOW_PDF=false is set, all PDF uploads are
blocked with user-friendly error messages.

Note

Adds KNOWLEDGE_ALLOW_PDF to toggle PDF support (default enabled), blocks PDFs across loaders/uploads when disabled, improves frontend toasts and error handling, and updates docs.

Config/Backend:
- Add KNOWLEDGE_ALLOW_PDF env with helper isPdfAllowed (default: true); refine parseBooleanEnv.
- Pass runtime to loadDocsFromPath; service updated to use it.
Uploads/Routes:
- Multer allowed types respect PDF setting; return clean JSON errors with PDF_DISABLED (403).
- URL import rejects PDFs when disabled; improved error responses.
Docs Loader:
- Skip .pdf files when PDF support is disabled.
Frontend (Knowledge tab):
- Implement toast system and top-right toast container; wire into uploads/imports.
- Robust error parsing for uploads/URL imports with user-friendly PDF-disabled messages.
- Minor UI enhancements: modal header controls and PDF viewer zoom/reset.
Docs:
- README: security note and config sample for KNOWLEDGE_ALLOW_PDF.
Version:
- Bump to 1.5.12.

^{Written by Cursor Bugbot for commit ffc0fc1. This will update automatically on new commits. Configure here.}

Summary by CodeRabbit

New Features
- Option to disable PDF uploads for added control; app now skips PDFs when disabled and shows clear messages.
- Enhanced Knowledge tab: toast notifications, improved error handling, and success feedback.
- Expanded URL import dialog with list management and validation.
- Upgraded search panel with input, threshold slider, and loading/error states.
- Improved document viewer with PDF zoom controls and fallbacks.
Documentation
- Added guidance on disabling PDF support via environment configuration.
Chores
- Version bumped to 1.5.12.

coderabbitai · 2025-10-15T05:59:36Z

Walkthrough

Adds a runtime-configurable PDF support flag, wiring it through config, routes, docs loading, and service initialization. Updates backend upload filters and URL import to reject PDFs when disabled. Expands the frontend Knowledge tab with a full toast system, enhanced search and URL import UI, and improved PDF viewing. Documentation/version bumped.

Changes

Cohort / File(s)	Summary
Docs & Versioning `README.md`, `package.json`	Documented PDF-disable flag (KNOWLEDGE_ALLOW_PDF) and bumped version 1.5.11 → 1.5.12.
Config & Feature Flag `src/config.ts`	Added isPdfAllowed(runtime?) and updated parseBooleanEnv to accept a defaultValue; defaults PDF allowance to true when unspecified.
Docs Loader Integration `src/docs-loader.ts`	loadDocsFromPath signature adds optional runtime; skips PDFs when disabled using isPdfAllowed(runtime) with warnings.
Backend Routes & Service `src/routes.ts`, `src/service.ts`	Enforced PDF checks in multer filter and URL-based uploads; tailored errors/status (403 for PDF disabled); service passes runtime to loadDocsFromPath.
Frontend Knowledge Tab & Toasts `src/frontend/ui/knowledge-tab.tsx`	Replaced ad-hoc toasts with centralized toast state, hook, and container; expanded search UI, URL import dialog, error parsing, and PDF viewing with zoom/iframe and fallbacks.
Index Cleanup `src/index.ts`	Removed unused logger and IAgentRuntime imports; no API changes.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor U as User
  participant UI as Frontend UI
  participant R as Routes (upload)
  participant C as Config (isPdfAllowed)
  participant S as KnowledgeService

  U->>UI: Upload file / URL
  UI->>R: POST /knowledge/upload (file or URL)
  R->>C: isPdfAllowed(runtime)
  alt PDF disabled
    R-->>UI: 403 PDF_DISABLED (JSON/text)
    UI->>UI: Show toast "PDF uploads disabled"
  else PDF allowed
    R->>S: Process upload
    S-->>R: Success
    R-->>UI: 200 OK
    UI->>UI: Show success toast
  end

sequenceDiagram
  autonumber
  participant KS as KnowledgeService
  participant DL as Docs Loader
  participant C as Config (isPdfAllowed)

  KS->>DL: loadDocsFromPath(..., runtime)
  DL->>C: isPdfAllowed(runtime)
  alt PDF disabled
    DL->>DL: Skip .pdf files, log warning
  else PDF allowed
    DL->>DL: Load all supported files
  end
  DL-->>KS: Loaded documents (filtered)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

A hop, a skip, a flag flips on—
PDFs allowed? Or softly gone.
I nibble logs, toast pops: “Hooray!”
Search burrows deep to find the way.
Zooming pages, carrot-bright,
Version bumped—we launch tonight! 🥕✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The title succinctly describes the primary feature introduced in this pull request—adding the KNOWLEDGE_ALLOW_PDF environment variable to control PDF support—while following a clear and concise conventional commit style. It directly reflects the main change without extraneous details and is specific enough for team members to understand the purpose at a glance.
Docstring Coverage	✅ Passed	Docstring coverage is 80.00% which is sufficient. The required threshold is 80.00%.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch feat/knowledge-allow-pdf-env-var

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

0xbbjoker · 2025-10-15T06:06:20Z

@BugBot review

coderabbitai

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/service.ts (1)

323-341: Add service‑level PDF guard (defense‑in‑depth).

Routes/docs‑loader enforce the flag, but direct service calls can bypass it. Guard here to uphold the security invariant.

-import { validateModelConfig } from './config';
+import { validateModelConfig, isPdfAllowed } from './config';

       let documentContentToStore: string;
       const isPdfFile =
         contentType === 'application/pdf' || originalFilename.toLowerCase().endsWith('.pdf');
 
       if (isPdfFile) {
+        // Enforce PDF policy at the service layer too (defense-in-depth)
+        if (!isPdfAllowed(this.runtime)) {
+          const err: any = new Error('PDF uploads are disabled by policy');
+          err.code = 'PDF_DISABLED';
+          err.status = 403;
+          throw err;
+        }
         // For PDFs: extract text for fragments but store original base64 in main document
         try {
           fileBuffer = Buffer.from(content, 'base64');

If you’d prefer the check earlier, you can also add a fast‑fail in addKnowledge before processDocument.

🧹 Nitpick comments (9)

package.json (1)

81-232: Expose KNOWLEDGE_ALLOW_PDF in agentConfig.pluginParameters.

Add this boolean so ops/UIs can discover/configure it (default true).

       "KNOWLEDGE_PATH": {
         "type": "string",
         "description": "Filesystem path where the knowledge loader searches for documents. Overrides the default ./docs directory.",
         "required": false,
         "default": "./docs",
         "sensitive": false
       },
+      "KNOWLEDGE_ALLOW_PDF": {
+        "type": "boolean",
+        "description": "Allow PDF uploads and processing. Set to false to block PDFs (403).",
+        "required": false,
+        "default": true,
+        "sensitive": false
+      },
       "LOAD_DOCS_ON_STARTUP": {
         "type": "boolean",
         "description": "Controls whether the plugin should automatically load documents from the docs folder when the agent starts. Any value other than the string 'false' enables loading.",
         "required": false,
         "default": true,
         "sensitive": false
       }

src/frontend/ui/knowledge-tab.tsx (6)

91-147: Improve toast accessibility (screen readers).

Add ARIA roles/live regions so toasts announce properly.

-  return (
-    <div className="fixed top-4 right-4 z-[9999] flex flex-col gap-2 w-full max-w-md pointer-events-none">
+  return (
+    <div
+      className="fixed top-4 right-4 z-[9999] flex flex-col gap-2 w-full max-w-md pointer-events-none"
+      role="region"
+      aria-live="polite"
+      aria-label="Notifications"
+    >
       {toastState.toasts.map((toast) => (
-        <div
+        <div
           key={toast.id}
           className={cn(
             'p-4 rounded-lg shadow-2xl border pointer-events-auto',
             'animate-in slide-in-from-top-5 fade-in duration-300',
             'bg-card text-card-foreground border-border',
             toast.variant === 'destructive' && 'bg-destructive text-destructive-foreground border-destructive'
           )}
+          role={toast.variant === 'destructive' ? 'alert' : 'status'}
         >

818-839: Add 403/PDF-disabled fallback for non-JSON error bodies (URL import).

If server returns 403 without JSON, show the PDF-disabled message.

-      if (!result.ok) {
+      if (!result.ok) {
         // Try to parse error response as JSON
         let errorMessage = `Upload failed: ${result.statusText}`;
         try {
           const contentType = result.headers.get('content-type');
           if (contentType && contentType.includes('application/json')) {
             const errorData = await result.json();
             if (errorData.error?.message) {
               errorMessage = errorData.error.message;
             } else if (errorData.error?.code === 'PDF_DISABLED') {
               errorMessage = 'PDF uploads are currently disabled. Please contact your administrator to enable PDF support.';
             }
           } else {
             const errorText = await result.text();
             if (errorText) {
               errorMessage = errorText;
             }
           }
         } catch (e) {
           // If parsing fails, keep the default error message
           console.error('Error parsing URL upload error response:', e);
         }
+        if (result.status === 403 && !/pdf uploads are currently disabled/i.test(errorMessage)) {
+          errorMessage =
+            'PDF uploads are currently disabled. Please contact your administrator to enable PDF support.';
+        }
         throw new Error(errorMessage);
       }

891-914: Mirror 403/PDF-disabled fallback for file uploads.

Same handling as URL path for consistency.

-      if (!response.ok) {
+      if (!response.ok) {
         // Try to parse error response as JSON
         let errorMessage = `Upload failed: ${response.statusText}`;
         try {
           const contentType = response.headers.get('content-type');
           if (contentType && contentType.includes('application/json')) {
             const errorData = await response.json();
             if (errorData.error?.message) {
               errorMessage = errorData.error.message;
             } else if (errorData.error?.code === 'PDF_DISABLED') {
               errorMessage = 'PDF uploads are currently disabled. Please contact your administrator to enable PDF support.';
             }
           } else {
             const errorText = await response.text();
             if (errorText) {
               errorMessage = errorText;
             }
           }
         } catch (e) {
           // If parsing fails, keep the default error message
           console.error('Error parsing upload error response:', e);
         }
+        if (response.status === 403 && !/pdf uploads are currently disabled/i.test(errorMessage)) {
+          errorMessage =
+            'PDF uploads are currently disabled. Please contact your administrator to enable PDF support.';
+        }
         throw new Error(errorMessage);
       }

1536-1543: Hide .pdf from accept list when PDFs are disabled (UX).

Read a flag (e.g., window.ELIZA_CONFIG.knowledgeAllowPdf) and omit .pdf from the accept string to avoid misleading users. Backward-compatible if the flag isn’t present.

1768-1791: Sandbox the PDF iframe for defense‑in‑depth.

Add sandbox/referrer policy to isolate the embedded viewer.

-                          <iframe
+                          <iframe
                             src={pdfDataUrl}
                             className="w-full border-0 shadow-md"
                             style={{
                               height: '90vh',
                               maxWidth: '1200px',
                               backgroundColor: 'var(--background)',
                             }}
+                            sandbox=""
+                            referrerPolicy="no-referrer"
                             title="PDF Document"
                             onError={() => {
                               console.error('Failed to load PDF in iframe');
                             }}
                           />

224-338: Minor: normalize extension handling in getCorrectMimeType.

You already lowercase ext; drop the uppercase 'R' entry and consider consolidating with a Set for readability.

README.md (1)

68-69: Clarify runtime behavior when PDFs are disabled.

Note that blocked attempts return HTTP 403 with JSON code PDF_DISABLED to aid operators and UI handling.

Also applies to: 142-146

src/routes.ts (1)

303-311: Consider checking content type before fetching URL content.

The current implementation fetches the content from the URL (line 276) before checking if it's a PDF (line 304). This means the system downloads the PDF content before rejecting it.

While this works correctly, it could be optimized by:

Sending a HEAD request first to check the content type
Only fetching the content if the type is allowed

However, this optimization adds complexity and may not be necessary if most URLs are valid. The current approach is acceptable.

If you'd like to optimize this, here's a suggested approach:

          logger.debug(`[Document Processor] 🌐 Fetching content from URL: ${fileUrl}`);

+         // Optional: Check content type with HEAD request first
+         if (!isPdfAllowed(runtime)) {
+           try {
+             const headResponse = await fetch(fileUrl, { method: 'HEAD' });
+             const contentType = headResponse.headers.get('content-type');
+             if (contentType === 'application/pdf') {
+               logger.warn(
+                 `[Document Processor] ❌ PDF upload rejected (PDF support disabled): ${fileUrl}`
+               );
+               throw new Error(
+                 'PDF file uploads are disabled. Contact administrator to enable PDF support.'
+               );
+             }
+           } catch (headError) {
+             // If HEAD fails, continue with normal fetch
+             logger.debug(`[Document Processor] HEAD request failed, proceeding with fetch`);
+           }
+         }
+
          // Fetch the content from the URL
          const { content, contentType: fetchedContentType } = await fetchUrlContent(fileUrl);

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 33680de and ffc0fc1.

📒 Files selected for processing (8)

README.md (2 hunks)
package.json (1 hunks)
src/config.ts (1 hunks)
src/docs-loader.ts (4 hunks)
src/frontend/ui/knowledge-tab.tsx (5 hunks)
src/index.ts (1 hunks)
src/routes.ts (5 hunks)
src/service.ts (1 hunks)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Cursor Bugbot

🔇 Additional comments (13)

package.json (1)

4-4: Version bump looks good.

src/index.ts (1)

6-6: Types‑only import cleanup is correct.

src/service.ts (1)

67-73: Good: runtime passed to loadDocsFromPath.

Enables runtime‑aware PDF gating during startup loads.

src/config.ts (2)

5-10: LGTM! Backward-compatible default parameter.

The updated signature with an optional defaultValue parameter is well-designed. The fallback logic correctly returns the defaultValue when the input is undefined or null, maintaining the function's robustness.

12-27: LGTM! Runtime-aware PDF control with safe defaults.

The isPdfAllowed function properly checks runtime settings first, then falls back to environment variables, with a sensible default of true (enabled). This maintains backward compatibility—existing deployments continue allowing PDFs unless explicitly disabled.

src/docs-loader.ts (4)

7-7: LGTM! Import added for PDF control.

The import of isPdfAllowed is correctly placed and aligns with the new PDF support feature.

37-43: LGTM! Runtime parameter properly added.

The function signature now accepts an optional runtime parameter, enabling runtime-aware PDF filtering. This change aligns with the service-level integration mentioned in the AI summary.

53-57: LGTM! Clear warning when PDF support is disabled.

The check for PDF support and corresponding warning message properly inform operators when PDFs will be skipped during document loading.

91-95: LGTM! PDF files correctly skipped when disabled.

The conditional skip logic is correct—when a PDF file is encountered and PDF support is disabled, it's silently skipped with appropriate debug logging. The use of continue ensures the loop proceeds to the next file.

src/routes.ts (4)

8-8: LGTM! Import added for PDF control.

The import of isPdfAllowed is correctly placed and enables runtime PDF filtering in the routes.

16-34: LGTM! Dynamic MIME type filtering based on PDF support.

The implementation correctly:

Initializes allowedMimeTypes with all supported types

Checks PDF support via isPdfAllowed(runtime)

Filters out 'application/pdf' when disabled

Logs an appropriate warning

This ensures PDFs are blocked at the Multer level when support is disabled.

42-51: LGTM! User-friendly error message for blocked PDFs.

The conditional error message properly distinguishes between:

PDF uploads when PDF support is disabled (clear, actionable message)

Other unsupported file types (generic message with allowed types)

The error messaging is user-friendly and avoids exposing technical details.

1239-1252: LGTM! Error differentiation works and the PDF-disabled message is consistent across the codebase.
Consider extracting the literal into a constant and adding a unit test to guard the err.message.includes('PDF file uploads are disabled') logic against future changes.

cursor

✅ Bugbot reviewed your changes and found no bugs!

feat: add KNOWLEDGE_ALLOW_PDF environment variable

ffc0fc1

0xbbjoker requested a review from odilitime October 15, 2025 05:59

coderabbitai bot reviewed Oct 15, 2025

View reviewed changes

cursor bot reviewed Oct 15, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: add KNOWLEDGE_ALLOW_PDF environment variable #42

feat: add KNOWLEDGE_ALLOW_PDF environment variable #42

Uh oh!

0xbbjoker commented Oct 15, 2025 •

edited by cursor bot

Loading

Uh oh!

coderabbitai bot commented Oct 15, 2025 •

edited

Loading

Uh oh!

0xbbjoker commented Oct 15, 2025

Uh oh!

coderabbitai bot left a comment

Uh oh!

cursor bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

feat: add KNOWLEDGE_ALLOW_PDF environment variable #42

Are you sure you want to change the base?

feat: add KNOWLEDGE_ALLOW_PDF environment variable #42

Uh oh!

Conversation

0xbbjoker commented Oct 15, 2025 • edited by cursor bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Oct 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

Pre-merge checks and finishing touches

Uh oh!

0xbbjoker commented Oct 15, 2025

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

cursor bot left a comment

Choose a reason for hiding this comment

✅ Bugbot reviewed your changes and found no bugs!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

0xbbjoker commented Oct 15, 2025 •

edited by cursor bot

Loading

coderabbitai bot commented Oct 15, 2025 •

edited

Loading