Advanced Categorization, Rate Limiting, and Feature Flags #177

Merged
merged 17 commits into from
Mar 23, 2025
17 commits
20aa0c0
Add rate limiting and error handling utilities for LLM and transactio…
kevingatera Feb 26, 2025
01186f7
Add new category suggestion feature with LLM support - WIP
kevingatera Feb 28, 2025
c42d7f5
Implement web search tool and improve category suggestion workflow
kevingatera Mar 1, 2025
82e5228
Add rule-based transaction categorization support
kevingatera Mar 5, 2025
3284dae
Add dry run mode and unified LLM response handling
kevingatera Mar 6, 2025
0a048cd
Improve JSON parsing and category logging in transaction processing
kevingatera Mar 7, 2025
ecd871c
Refactor prompt generation and LLM service & update tests
kevingatera Mar 8, 2025
f7b15e6
Implement feature flag system for dynamic configuration management
kevingatera Mar 8, 2025
d3e9d39
Add VSCode debug configuration for tests
kevingatera Mar 8, 2025
359f788
Refactor feature flag and transaction processing logic with improved …
kevingatera Mar 8, 2025
5a81371
Bump version to 2.0.0 given how much has changed
kevingatera Mar 8, 2025
07aba42
Add unit tests for RateLimiter
kevingatera Mar 23, 2025
c56dd2f
Replace env vars with features for startup options
kevingatera Mar 23, 2025
e153b52
Add tests for RateLimiter to enforce rate limits and handle retries w…
kevingatera Mar 23, 2025
c6e9870
Remove extra rules from prompt template added by accident
kevingatera Mar 23, 2025
6e506b1
Add FreeWebSearchService and integrate with ToolService. Update types…
kevingatera Mar 23, 2025
dd6029d
Update .env.example and README for feature flags; remove DRY_RUN and …
kevingatera Mar 23, 2025
11 changes: 9 additions & 2 deletions .env.example
@@ -2,8 +2,15 @@ ACTUAL_SERVER_URL=http://actual_server:5006
ACTUAL_PASSWORD=
ACTUAL_BUDGET_ID=
CLASSIFICATION_SCHEDULE_CRON="0 */4 * * *"
- CLASSIFY_ON_STARTUP=true
- SYNC_ACCOUNTS_BEFORE_CLASSIFY=true
+
+ # Feature flags - can be specified as an array
+ FEATURES='["freeWebSearch", "suggestNewCategories", "rerunMissedTransactions", "classifyOnStartup", "syncAccountsBeforeClassify"]'
+
+ # Tools and API keys
+ # ENABLED_TOOLS=webSearch
+ VALUESERP_API_KEY=
+
+ # LLM configuration
LLM_PROVIDER=openai
OPENAI_API_KEY=
OPENAI_MODEL=gpt-4o-mini
2 changes: 1 addition & 1 deletion .eslintrc.json
@@ -31,7 +31,7 @@
}
],
"no-unused-vars": "off",
- "@typescript-eslint/no-unused-vars": ["error"]
+ "@typescript-eslint/no-unused-vars": ["error", { "argsIgnorePattern": "^_" }]
},
"parserOptions": {
"ecmaVersion": 2020,
1 change: 1 addition & 0 deletions .gitignore
@@ -3,3 +3,4 @@ tmp/budgets/
dist/

.env
*.log
19 changes: 19 additions & 0 deletions .vscode/launch.json
@@ -16,6 +16,25 @@
"sourceMaps": true,
"console": "integratedTerminal",
"cwd": "${workspaceFolder}"
},
{
"type": "node",
"request": "launch",
"name": "Debug Tests",
"runtimeExecutable": "npm",
"runtimeArgs": [
"run",
"test",
"--",
"--runInBand",
"--watchAll=false"
],
"skipFiles": [
"<node_internals>/**"
],
"sourceMaps": true,
"console": "integratedTerminal",
"cwd": "${workspaceFolder}"
}
]
}
95 changes: 93 additions & 2 deletions README.md
@@ -31,6 +31,22 @@ The app sends requests to the LLM to classify transactions based on their descri

#### ✅ Every guessed transaction is marked as guessed in notes, so you can review the classification.

#### 🌱 Suggest and create new categories for transactions that don't fit existing ones

When enabled, the LLM can suggest entirely new categories for transactions it cannot classify, and optionally create them automatically.

#### 🌐 Web search for unfamiliar merchants

Using the ValueSerp API, the system can search the web for information about unfamiliar merchants to help the LLM make better categorization decisions.

#### 🔎 Free web search alternative

A self-hosted alternative to ValueSerp that uses a free public search API (DuckDuckGo) to look up merchant information without requiring an API key.

#### 🔄 Re-run missed transactions

Re-process transactions previously marked as unclassified.

## 🚀 Usage

Sample `docker-compose.yml` file:
@@ -53,9 +69,9 @@ services:
ACTUAL_PASSWORD: your_actual_password
ACTUAL_BUDGET_ID: your_actual_budget_id # This is the ID from Settings → Show advanced settings → Sync ID
CLASSIFICATION_SCHEDULE_CRON: 0 */4 * * * # How often to run classification.
- CLASSIFY_ON_STARTUP: true # Whether to classify transactions on startup (don't wait for cron schedule)
- SYNC_ACCOUNTS_BEFORE_CLASSIFY: false # Whether to sync accounts before classification
LLM_PROVIDER: openai # Can be "openai", "anthropic", "google-generative-ai", "ollama" or "groq"
+ FEATURES: '["classifyOnStartup", "syncAccountsBeforeClassify", "freeWebSearch", "suggestNewCategories"]'
+ # VALUESERP_API_KEY: your_valueserp_api_key # API key for ValueSerp, required if webSearch tool is enabled
# OPENAI_API_KEY: # optional. required if you want to use the OpenAI API
# OPENAI_MODEL: # optional. required if you want to use a specific model, default is "gpt-4o-mini"
# OPENAI_BASE_URL: # optional. required if you don't want to use the OpenAI API but OpenAI compatible API, ex: "http://ollama:11424/v1
@@ -95,6 +111,26 @@ services:
# ANSWER BY A CATEGORY ID - DO NOT CREATE ENTIRE SENTENCE - DO NOT WRITE CATEGORY NAME, JUST AN ID. Do not guess, if you don't know the answer, return "uncategorized".
```

## Feature Configuration

You can configure features using the `FEATURES` array (recommended).

The `FEATURES` environment variable accepts a JSON array of feature names to enable:

```
FEATURES='["freeWebSearch", "suggestNewCategories", "classifyOnStartup", "syncAccountsBeforeClassify"]'
```

Available features:
- `webSearch` - Enable web search for merchant information
- `freeWebSearch` - Enable free web search for merchant information (self-hosted alternative to ValueSerp)
- `suggestNewCategories` - Allow suggesting new categories for transactions
- `classifyOnStartup` - Run classification when the application starts
- `syncAccountsBeforeClassify` - Sync accounts before running classification
- `dryRun` - Run in dry run mode (enabled by default)
- `dryRunNewCategories` - Only log suggested categories without creating them (enabled by default)
- `rerunMissedTransactions` - Re-process transactions previously marked as unclassified
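
As a rough illustration of how a JSON array like this could be turned into flag lookups, here is a minimal sketch. It is not the project's actual code: the real `isFeatureEnabled` lives in `src/config` and may behave differently. The names `parseFeatures`, `KNOWN_FEATURES`, and `isEnabled` are hypothetical, and the sketch does not model flags that are enabled by default.

```typescript
// Hypothetical sketch: parse a FEATURES value into a set of enabled flags.
// The project's real logic is in src/config and may differ.
const KNOWN_FEATURES = [
  'webSearch',
  'freeWebSearch',
  'suggestNewCategories',
  'classifyOnStartup',
  'syncAccountsBeforeClassify',
  'dryRun',
  'dryRunNewCategories',
  'rerunMissedTransactions',
] as const;

type FeatureName = (typeof KNOWN_FEATURES)[number];

function parseFeatures(raw: string | undefined): Set<FeatureName> {
  const enabled = new Set<FeatureName>();
  if (!raw) return enabled;

  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    // Assumption: malformed JSON is treated as "no features enabled".
    return enabled;
  }
  if (!Array.isArray(parsed)) return enabled;

  for (const item of parsed) {
    // Assumption: unknown names and non-string entries are silently ignored.
    if (
      typeof item === 'string' &&
      (KNOWN_FEATURES as readonly string[]).indexOf(item) !== -1
    ) {
      enabled.add(item as FeatureName);
    }
  }
  return enabled;
}

// In the application the raw value would come from the FEATURES variable.
const features = parseFeatures('["freeWebSearch", "classifyOnStartup"]');
const isEnabled = (name: FeatureName): boolean => features.has(name);
```

Keeping the known names in one typed list means a typo in a call such as `isEnabled('classifyOnStartup')` is caught at compile time rather than silently evaluating to `false`.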

## Customizing the Prompt

To create a custom prompt, modify the `PROMPT_TEMPLATE` environment variable to include or exclude variables as needed.
@@ -120,3 +156,58 @@ loops.
7. `date`: The date of the transaction. This is taken from `transaction.date`.
8. `cleared`: A boolean indicating if the transaction is cleared. This is taken from `transaction.cleared`.
9. `reconciled`: A boolean indicating if the transaction is reconciled. This is taken from `transaction.reconciled`.

## New Category Suggestions

When the `suggestNewCategories` feature is enabled, the system will:

1. First try to classify transactions using existing categories
2. For transactions that can't be classified, request a new category suggestion from the LLM
3. Check if similar categories already exist
4. If in dry run mode (`dryRunNewCategories` is enabled), just log the suggestions
5. If not in dry run mode, create the new categories and assign transactions to them

This feature is particularly useful when you have transactions that don't fit your current category structure and you want the LLM to help expand your categories intelligently.
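
The decision made for each suggestion in steps 3 to 5 can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation: the types, the `planSuggestion` function, and the name-normalization rule used to detect "similar" categories are all hypothetical, and reusing a matching category is an assumed outcome of the similarity check.

```typescript
// Hypothetical types and helpers; none of these names come from the project.
interface Category {
  id: string;
  name: string;
}

interface Suggestion {
  transactionId: string;
  categoryName: string;
}

type Decision =
  | { kind: 'useExisting'; transactionId: string; categoryId: string }
  | { kind: 'logOnly'; transactionId: string; categoryName: string }
  | { kind: 'create'; transactionId: string; categoryName: string };

// Assumed similarity rule: names match after lowercasing and collapsing
// punctuation and whitespace, so "Pet Care" equals " pet-care ".
const normalize = (name: string): string =>
  name.toLowerCase().replace(/[^a-z0-9]+/g, ' ').trim();

function planSuggestion(
  suggestion: Suggestion,
  existing: Category[],
  dryRunNewCategories: boolean,
): Decision {
  const { transactionId, categoryName } = suggestion;
  const wanted = normalize(categoryName);

  // Step 3: check whether a similar category already exists.
  const match = existing.find((c) => normalize(c.name) === wanted);
  if (match) {
    return { kind: 'useExisting', transactionId, categoryId: match.id };
  }

  // Step 4: in dry run mode, only log the suggestion.
  if (dryRunNewCategories) {
    return { kind: 'logOnly', transactionId, categoryName };
  }

  // Step 5: otherwise the category would be created and assigned.
  return { kind: 'create', transactionId, categoryName };
}
```

Returning a decision value instead of performing the write directly keeps the dry-run branch and the real branch on the same code path, which makes the behaviour easy to test.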

## Tools Integration

The system supports various tools that can be enabled to enhance the LLM's capabilities:

1. Enable tools by including them in the `FEATURES` array or by setting `ENABLED_TOOLS`
2. Provide any required API keys for the tools you want to use

Currently supported tools:

### webSearch

The webSearch tool uses the ValueSerp API to search for information about merchants that the LLM might not be familiar with, providing additional context for categorization decisions.

To use this tool:
1. Include `webSearch` in your `FEATURES` array or `ENABLED_TOOLS` list
2. Provide your ValueSerp API key as `VALUESERP_API_KEY`

This is especially helpful for:
- New or uncommon merchants
- Merchants with ambiguous names
- Specialized services that might be difficult to categorize without additional information

The search results are included in the prompts sent to the LLM, helping it make more accurate category assignments or suggestions.
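
The two setup requirements above can be expressed as a small pre-flight check. The sketch below is only an illustration: `ToolEnv` and `checkWebSearch` are hypothetical names, and treating `ENABLED_TOOLS` as a comma-separated list is an assumption based on the `ENABLED_TOOLS=webSearch` example in `.env.example`.

```typescript
// Hypothetical pre-flight check for the webSearch tool.
interface ToolEnv {
  features: string[]; // parsed FEATURES array
  enabledTools?: string; // raw ENABLED_TOOLS value, assumed comma-separated
  valueSerpApiKey?: string; // VALUESERP_API_KEY
}

interface ToolCheck {
  usable: boolean;
  reason: string;
}

function checkWebSearch(env: ToolEnv): ToolCheck {
  const tools = (env.enabledTools ?? '')
    .split(',')
    .map((t) => t.trim())
    .filter((t) => t.length > 0);

  // Requirement 1: the tool is requested via FEATURES or ENABLED_TOOLS.
  const requested =
    env.features.indexOf('webSearch') !== -1 ||
    tools.indexOf('webSearch') !== -1;
  if (!requested) {
    return {
      usable: false,
      reason: 'webSearch is not listed in FEATURES or ENABLED_TOOLS',
    };
  }

  // Requirement 2: a ValueSerp API key is present.
  if (!env.valueSerpApiKey) {
    return { usable: false, reason: 'VALUESERP_API_KEY is missing' };
  }

  return { usable: true, reason: 'ok' };
}
```

A check like this could run once at startup so that a missing key is reported immediately instead of on the first search attempt.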

## Dry Run Mode

The `dryRun` feature is enabled by default. In this mode:
- No transactions will be modified
- No categories will be created
- All proposed changes will be logged to the console
- The system will show what would happen in a real run

To perform actual changes:
1. Remove `dryRun` from your FEATURES array
2. Ensure `suggestNewCategories` is enabled if you want new category creation
3. Run the classification process

Dry run messages will show:
- Which transactions would be categorized
- Which rules would be applied
- What new categories would be created
- How many transactions would be affected by each change
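
One common way to implement this kind of dry run is to route every write through a single guard that either performs the change or only describes it. The sketch below illustrates that pattern; `ProposedChange`, `applyOrLog`, and the log message format are hypothetical and are not taken from the project.

```typescript
// Hypothetical dry-run guard: every write goes through applyOrLog.
interface ProposedChange {
  transactionId: string;
  categoryId: string;
}

// Returns true if the change was written, false if it was only logged.
function applyOrLog(
  change: ProposedChange,
  dryRun: boolean,
  write: (c: ProposedChange) => void,
  log: (message: string) => void,
): boolean {
  if (dryRun) {
    // Assumed message format; the real log output may look different.
    log(
      `DRY RUN: would set category ${change.categoryId} ` +
        `on transaction ${change.transactionId}`,
    );
    return false;
  }
  write(change);
  return true;
}
```

Passing `write` and `log` as parameters keeps the guard independent of the budget API and makes it straightforward to verify that nothing is written while `dryRun` is on.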
4 changes: 2 additions & 2 deletions app.ts
@@ -1,5 +1,5 @@
import cron from 'node-cron';
- import { cronSchedule, classifyOnStartup } from './src/config';
+ import { cronSchedule, isFeatureEnabled } from './src/config';
import actualAi from './src/container';

if (!cron.validate(cronSchedule)) {
@@ -12,7 +12,7 @@ cron.schedule(cronSchedule, async () => {
});

console.log('Application started');
- if (classifyOnStartup) {
+ if (isFeatureEnabled('classifyOnStartup')) {
(async () => {
await actualAi.classify();
})();