refactor: remove unused code
gventuri committed Jan 8, 2025
1 parent 3c6698c commit c1429b3
Showing 57 changed files with 34 additions and 10,371 deletions.
4 changes: 0 additions & 4 deletions Makefile
@@ -7,7 +7,6 @@ all: help ## default target executed when no arguments are given to make
#############################

UNIT_TESTS_DIR ?= tests/unit_tests/
INTEGRATION_TESTS_DIR ?= tests/integration_tests/

# setup_python: ## ensure we're using Python 3.10
# @echo "Setting up Python 3.10..."
@@ -77,9 +76,6 @@ tests-coverage: install_deps ## run unit tests and generate coverage report
poetry run coverage run --source=pandasai -m pytest $(UNIT_TESTS_DIR)
poetry run coverage xml

integration: ## run integration tests
poetry run pytest $(INTEGRATION_TESTS_DIR)

###########################
# SPELLCHECK AND FORMATTING
###########################
6 changes: 0 additions & 6 deletions README.md
@@ -158,12 +158,6 @@ Olivia gets paid the most.

You can find more examples in the [examples](examples) directory.

## 🔒 Privacy & Security

To generate the Python code to run, we take a few random samples from the dataframe, randomize them (using random generation for sensitive data and shuffling for non-sensitive data), and send only the randomized head to the LLM.

If you want to further enforce your privacy, you can instantiate PandasAI with `enforce_privacy = True`, which will send only the column names (not the head) to the LLM.

## 📜 License

PandasAI is available under the MIT Expat license, except for the `pandasai/ee` directory (which has its [license here](https://github.com/Sinaptik-AI/pandas-ai/blob/master/pandasai/ee/LICENSE), if applicable).
3 changes: 0 additions & 3 deletions docs/v2/library.mdx
@@ -210,15 +210,12 @@ To customize PandasAI's `SmartDataframe`, you can either pass a `config` object
Settings:

- `llm`: the LLM to use. You can pass an instance of an LLM or the name of an LLM. You can use one of the LLMs supported. You can find more information about LLMs [here](/v2/llms)
- `llm_options`: the options to use for the LLM (for example, the API token). You can find more information about the settings [here](/v2/llms).
- `save_logs`: whether to save the logs of the LLM. Defaults to `True`. You will find the logs in the `pandasai.log` file in the root of your project.
- `verbose`: whether to print the logs in the console as PandasAI is executed. Defaults to `False`.
- `enforce_privacy`: whether to enforce privacy. Defaults to `False`. If set to `True`, PandasAI will not send any data to the LLM, but only the metadata. By default, PandasAI will send 5 samples that are anonymized to improve the accuracy of the results.
- `save_charts`: whether to save the charts generated by PandasAI. Defaults to `False`. You will find the charts in the root of your project or in the path specified by `save_charts_path`.
- `save_charts_path`: the path where to save the charts. Defaults to `exports/charts/`. You can use this setting to override the default path.
- `open_charts`: whether to open the chart during parsing of the response from the LLM. Defaults to `True`. You can completely disable displaying of charts by setting this option to `False`.
- `enable_cache`: whether to enable caching. Defaults to `True`. If set to `True`, PandasAI will cache the results of the LLM to improve the response time. If set to `False`, PandasAI will always call the LLM.
- `use_error_correction_framework`: whether to use the error correction framework. Defaults to `True`. If set to `True`, PandasAI will try to correct the errors in the code generated by the LLM with further calls to the LLM. If set to `False`, PandasAI will not try to correct the errors in the code generated by the LLM.
- `max_retries`: the maximum number of retries to use when using the error correction framework. Defaults to `3`. You can use this setting to override the default number of retries.
- `custom_whitelisted_dependencies`: the custom whitelisted dependencies to use. Defaults to `{}`. You can use this setting to override the default custom whitelisted dependencies. You can find more information about custom whitelisted dependencies [here](/v2/custom-whitelisted-dependencies).
- `security`: the `security` parameter accepts three levels depending on the use case: `"none"`, `"standard"`, and `"advanced"`. `"standard"` and `"advanced"` are especially useful for detecting malicious intent in user queries and avoiding the execution of potentially harmful code. By default, `security` is set to `"standard"`. The security check may introduce stricter rules that flag benign queries as harmful; you can deactivate it by setting `security` to `"none"` in the configuration.
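The settings above map one-to-one onto keys of the `config` dict. A sketch collecting the documented defaults (values per the list above; shown as plain data, since passing it requires the `pandasai` package, e.g. `SmartDataframe("data.csv", config=config)`):

```python
# Defaults as documented above; override any key as needed.
config = {
    "save_logs": True,
    "verbose": False,
    "enforce_privacy": False,
    "save_charts": False,
    "save_charts_path": "exports/charts/",
    "open_charts": True,
    "enable_cache": True,
    "use_error_correction_framework": True,
    "max_retries": 3,
    "custom_whitelisted_dependencies": {},
    "security": "standard",
}

# Example override: tighten privacy and disable caching for a one-off run.
config.update({"enforce_privacy": True, "enable_cache": False})
print(config["enforce_privacy"], config["enable_cache"])  # → True False
```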
33 changes: 1 addition & 32 deletions docs/v2/llms.mdx
@@ -7,38 +7,7 @@ The generated code is then executed to produce the result.

[![Choose the LLM](https://cdn.loom.com/sessions/thumbnails/5496c9c07ee04f69bfef1bc2359cd591-00001.jpg)](https://www.loom.com/share/5496c9c07ee04f69bfef1bc2359cd591 "Choose the LLM")

You can choose an LLM either by instantiating it and passing it to the `SmartDataframe` or `SmartDatalake` constructor, or by specifying it in the `pandasai.json` configuration file.

If the model expects one or more parameters, you can pass them to the constructor or specify them in the `llm_options` parameter of the `pandasai.json` file. Here's an example of how to structure your `pandasai.json` file:

```json
{
"llm": "BambooLLM",
"llm_options": {
"api_key": "API_KEY_GOES_HERE"
}
}
```
> **Note:**
> `pandasai.json` can be configured for any LLM.

## Working with the pandasai.json file

In this example, `data.csv` is your data file, and `pandasai.json` is the configuration file. Make sure the configuration file is named `pandasai.json` and is in the same folder as your code.

```python
from pandasai import SmartDataframe
from pandasai.config import load_config_from_json

# Load configuration from pandasai.json
config = load_config_from_json()

df = SmartDataframe("data.csv", config=config)
response = df.chat("give me revenue of Top 5 companies for year 2021")
print(response)
```
Alternatively, you can instantiate the LLM yourself and pass it in the `config` to the `SmartDataframe` or `SmartDatalake` constructor.

## BambooLLM

15 changes: 0 additions & 15 deletions docs/v3/overview-nl.mdx
@@ -26,18 +26,13 @@ import pandasai as pai

pai.config.set({
    "llm": "openai",
    "llm_options": {
        "api_key": "YOUR_API_KEY"
    },
    "save_logs": True,
    "verbose": False,
    "save_charts": False,
    "save_charts_path": "exports/charts/",
    "open_charts": True,
    "enable_cache": True,
    "use_error_correction_framework": True,
    "max_retries": 3,
    "enforce_privacy": False,
    "security": "none",
    "custom_whitelisted_dependencies": {}
})
@@ -78,21 +73,11 @@ pai.config.set({
- **Default**: `True`
- **Description**: Whether to enable caching. If set to True, PandasAI will cache the results of the LLM to improve the response time. If set to False, PandasAI will always call the LLM. Learn more about [caching](/v3/chat-and-cache#cache).

#### use_error_correction_framework
- **Type**: `bool`
- **Default**: `True`
- **Description**: Whether to use the error correction framework. If set to True, PandasAI will try to correct the errors in the code generated by the LLM with further calls to the LLM. If set to False, PandasAI will not try to correct the errors in the code generated by the LLM.

#### max_retries
- **Type**: `int`
- **Default**: `3`
- **Description**: The maximum number of retries to use when using the error correction framework. You can use this setting to override the default number of retries.

#### enforce_privacy
- **Type**: `bool`
- **Default**: `False`
- **Description**: Whether to enforce privacy. If set to True, PandasAI will not send any data to the LLM, but only the metadata. By default, PandasAI will send 5 samples that are anonymized to improve the accuracy of the results. Learn more about [privacy settings](/v3/privacy-and-security).

#### security
- **Type**: `str`
- **Default**: `"none"`
19 changes: 0 additions & 19 deletions docs/v3/privacy-and-security.mdx
@@ -21,25 +21,6 @@ pai.config.set({
})
```

## Enforce Privacy

PandaAI allows you to control how much data is shared with the LLM during analysis. By default, PandaAI sends 5 anonymized samples to improve the accuracy of results. However, you can enforce stricter privacy by configuring the privacy settings:

```python
import pandasai as pai

pai.config.set({
"enforce_privacy": True
})
```

When `enforce_privacy` is set to `True`:
- Only metadata about your data will be sent to the LLM
- No actual data samples will be shared
- The LLM will rely solely on column names and data types for analysis

This is particularly useful when working with sensitive or confidential data where data privacy is crucial.

## Custom whitelisted dependencies
By default, PandasAI only allows running code that uses certain whitelisted modules.
This prevents malicious code from being executed on the server or locally.
28 changes: 18 additions & 10 deletions examples/from_csv.py
@@ -1,14 +1,22 @@
"""Example of using PandasAI with a CSV file."""
# """Example of using PandasAI with a CSV file."""

from pandasai import Agent
# from pandasai import Agent

# By default, unless you choose a different LLM, it will use BambooLLM.
# You can get your free API key signing up at https://pandabi.ai (you can also configure it in your .env file)
# # By default, unless you choose a different LLM, it will use BambooLLM.
# # You can get your free API key signing up at https://pandabi.ai (you can also configure it in your .env file)

agent = Agent(
"examples/data/Loan payments data.csv",
)
response = agent.chat("How many loans are from men and have been paid off?")
# agent = Agent(
# "examples/data/Loan payments data.csv",
# )
# response = agent.chat("How many loans are from men and have been paid off?")

print(response)
# Output: 247 loans have been paid off by men.
# print(response)
# # Output: 247 loans have been paid off by men.


import pandasai as pai
import pandasai_sql

df = pai.load("sipa/users")
response = df.chat("How many users in total?")
print(response)
17 changes: 0 additions & 17 deletions examples/from_google_sheets.py

This file was deleted.

31 changes: 0 additions & 31 deletions examples/judge_agent.py

This file was deleted.

18 changes: 0 additions & 18 deletions examples/security_agent.py

This file was deleted.

64 changes: 0 additions & 64 deletions examples/semantic_agent.py

This file was deleted.

47 changes: 0 additions & 47 deletions examples/templates/sample_flask_salaries.html

This file was deleted.

