refactor: remove unused code
gventuri committed Jan 8, 2025
1 parent 3c6698c commit c1429b3
Showing 57 changed files with 34 additions and 10,371 deletions.
4 changes: 0 additions & 4 deletions Makefile
@@ -7,7 +7,6 @@ all: help ## default target executed when no arguments are given to make
#############################

UNIT_TESTS_DIR ?= tests/unit_tests/
INTEGRATION_TESTS_DIR ?= tests/integration_tests/

# setup_python: ## ensure we're using Python 3.10
# @echo "Setting up Python 3.10..."
@@ -77,9 +76,6 @@ tests-coverage: install_deps ## run unit tests and generate coverage report
poetry run coverage run --source=pandasai -m pytest $(UNIT_TESTS_DIR)
poetry run coverage xml

integration: ## run integration tests
poetry run pytest $(INTEGRATION_TESTS_DIR)

###########################
# SPELLCHECK AND FORMATTING
###########################
6 changes: 0 additions & 6 deletions README.md
@@ -158,12 +158,6 @@ Olivia gets paid the most.

You can find more examples in the [examples](examples) directory.

## 🔒 Privacy & Security

To generate the Python code to run, we take a few random samples from the dataframe, randomize them (using random generation for sensitive data and shuffling for non-sensitive data), and send only the randomized head to the LLM.

If you want to further enforce your privacy, you can instantiate PandasAI with `enforce_privacy = True`, which will send only the column names (not the head) to the LLM.

## 📜 License

PandasAI is available under the MIT Expat license, except for the `pandasai/ee` directory (which has its [license here](https://github.com/Sinaptik-AI/pandas-ai/blob/master/pandasai/ee/LICENSE), if applicable).
3 changes: 0 additions & 3 deletions docs/v2/library.mdx
@@ -210,15 +210,12 @@ To customize PandasAI's `SmartDataframe`, you can either pass a `config` object
Settings:

- `llm`: the LLM to use. You can pass an instance of an LLM or the name of an LLM. You can use one of the LLMs supported. You can find more information about LLMs [here](/v2/llms)
- `llm_options`: the options to use for the LLM (for example, the API token). You can find more information about the settings [here](/v2/llms).
- `save_logs`: whether to save the logs of the LLM. Defaults to `True`. You will find the logs in the `pandasai.log` file in the root of your project.
- `verbose`: whether to print the logs in the console as PandasAI is executed. Defaults to `False`.
- `enforce_privacy`: whether to enforce privacy. Defaults to `False`. If set to `True`, PandasAI will not send any data to the LLM, but only the metadata. By default, PandasAI will send 5 samples that are anonymized to improve the accuracy of the results.
- `save_charts`: whether to save the charts generated by PandasAI. Defaults to `False`. You will find the charts in the root of your project or in the path specified by `save_charts_path`.
- `save_charts_path`: the path where to save the charts. Defaults to `exports/charts/`. You can use this setting to override the default path.
- `open_charts`: whether to open the chart during parsing of the response from the LLM. Defaults to `True`. You can completely disable displaying of charts by setting this option to `False`.
- `enable_cache`: whether to enable caching. Defaults to `True`. If set to `True`, PandasAI will cache the results of the LLM to improve the response time. If set to `False`, PandasAI will always call the LLM.
- `use_error_correction_framework`: whether to use the error correction framework. Defaults to `True`. If set to `True`, PandasAI will try to correct the errors in the code generated by the LLM with further calls to the LLM. If set to `False`, PandasAI will not try to correct the errors in the code generated by the LLM.
- `max_retries`: the maximum number of retries to use when using the error correction framework. Defaults to `3`. You can use this setting to override the default number of retries.
- `custom_whitelisted_dependencies`: the custom whitelisted dependencies to use. Defaults to `{}`. You can use this setting to override the default custom whitelisted dependencies. You can find more information about custom whitelisted dependencies [here](/v2/custom-whitelisted-dependencies).
- `security`: the `security` parameter accepts three levels depending on the use case: `"none"`, `"standard"`, and `"advanced"`. `"standard"` and `"advanced"` are especially useful for detecting malicious intent in user queries and avoiding the execution of potentially harmful code. By default, `security` is set to `"standard"`. The security check may introduce stricter rules that flag benign queries as harmful; you can deactivate it by setting `security` to `"none"` in the configuration.
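The settings above map one-to-one onto keys of the `config` dict. A sketch collecting the documented defaults (values per the list above; shown as plain data, since passing it requires the `pandasai` package, e.g. `SmartDataframe("data.csv", config=config)`):

```python
# Defaults as documented above; override any key as needed.
config = {
    "save_logs": True,
    "verbose": False,
    "enforce_privacy": False,
    "save_charts": False,
    "save_charts_path": "exports/charts/",
    "open_charts": True,
    "enable_cache": True,
    "use_error_correction_framework": True,
    "max_retries": 3,
    "custom_whitelisted_dependencies": {},
    "security": "standard",
}

# Example override: tighten privacy and disable caching for a one-off run.
config.update({"enforce_privacy": True, "enable_cache": False})
print(config["enforce_privacy"], config["enable_cache"])  # → True False
```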
33 changes: 1 addition & 32 deletions docs/v2/llms.mdx
@@ -7,38 +7,7 @@ The generated code is then executed to produce the result.

[![Choose the LLM](https://cdn.loom.com/sessions/thumbnails/5496c9c07ee04f69bfef1bc2359cd591-00001.jpg)](https://www.loom.com/share/5496c9c07ee04f69bfef1bc2359cd591 "Choose the LLM")

You can choose an LLM either by instantiating it and passing it to the `SmartDataframe` or `SmartDatalake` constructor, or by specifying it in the `pandasai.json` configuration file.

If the model expects one or more parameters, you can pass them to the constructor or specify them in the `llm_options` parameter of the `pandasai.json` file. Here's an example of how to structure your `pandasai.json` file:

```json
{
"llm": "BambooLLM",
"llm_options": {
"api_key": "API_KEY_GOES_HERE"
}
}
```
> **Note:**
> `pandasai.json` can be configured for any LLM.

## Working with the pandasai.json file

In this example, `data.csv` is your data file, and `pandasai.json` is the configuration file. Make sure the configuration file is named `pandasai.json` and is in the same folder as your code.

```python
from pandasai import SmartDataframe
from pandasai.config import load_config_from_json

# Load configuration from pandasai.json
config = load_config_from_json()

df = SmartDataframe("data.csv", config=config)
response = df.chat("give me revenue of Top 5 companies for year 2021")
print(response)
```
Alternatively, you can instantiate the LLM yourself and pass it in the `config` to the `SmartDataframe` or `SmartDatalake` constructor.

## BambooLLM

15 changes: 0 additions & 15 deletions docs/v3/overview-nl.mdx
@@ -26,18 +26,13 @@ import pandasai as pai

pai.config.set({
    "llm": "openai",
    "llm_options": {
        "api_key": "YOUR_API_KEY"
    },
    "save_logs": True,
    "verbose": False,
    "save_charts": False,
    "save_charts_path": "exports/charts/",
    "open_charts": True,
    "enable_cache": True,
    "use_error_correction_framework": True,
    "max_retries": 3,
    "enforce_privacy": False,
    "security": "none",
    "custom_whitelisted_dependencies": {}
})
@@ -78,21 +73,11 @@ pai.config.set({
- **Default**: `True`
- **Description**: Whether to enable caching. If set to True, PandasAI will cache the results of the LLM to improve the response time. If set to False, PandasAI will always call the LLM. Learn more about [caching](/v3/chat-and-cache#cache).

#### use_error_correction_framework
- **Type**: `bool`
- **Default**: `True`
- **Description**: Whether to use the error correction framework. If set to True, PandasAI will try to correct the errors in the code generated by the LLM with further calls to the LLM. If set to False, PandasAI will not try to correct the errors in the code generated by the LLM.

#### max_retries
- **Type**: `int`
- **Default**: `3`
- **Description**: The maximum number of retries to use when using the error correction framework. You can use this setting to override the default number of retries.

#### enforce_privacy
- **Type**: `bool`
- **Default**: `False`
- **Description**: Whether to enforce privacy. If set to True, PandasAI will not send any data to the LLM, but only the metadata. By default, PandasAI will send 5 samples that are anonymized to improve the accuracy of the results. Learn more about [privacy settings](/v3/privacy-and-security).

#### security
- **Type**: `str`
- **Default**: `"none"`
19 changes: 0 additions & 19 deletions docs/v3/privacy-and-security.mdx
@@ -21,25 +21,6 @@ pai.config.set({
})
```

## Enforce Privacy

PandaAI allows you to control how much data is shared with the LLM during analysis. By default, PandaAI sends 5 anonymized samples to improve the accuracy of results. However, you can enforce stricter privacy by configuring the privacy settings:

```python
import pandasai as pai

pai.config.set({
"enforce_privacy": True
})
```

When `enforce_privacy` is set to `True`:
- Only metadata about your data will be sent to the LLM
- No actual data samples will be shared
- The LLM will rely solely on column names and data types for analysis

This is particularly useful when working with sensitive or confidential data where data privacy is crucial.

## Custom whitelisted dependencies
By default, PandasAI only allows running code that uses certain whitelisted modules.
This prevents malicious code from being executed on the server or locally.
28 changes: 18 additions & 10 deletions examples/from_csv.py
@@ -1,14 +1,22 @@
"""Example of using PandasAI with a CSV file."""
# """Example of using PandasAI with a CSV file."""

from pandasai import Agent
# from pandasai import Agent

# By default, unless you choose a different LLM, it will use BambooLLM.
# You can get your free API key signing up at https://pandabi.ai (you can also configure it in your .env file)
# # By default, unless you choose a different LLM, it will use BambooLLM.
# # You can get your free API key signing up at https://pandabi.ai (you can also configure it in your .env file)

agent = Agent(
"examples/data/Loan payments data.csv",
)
response = agent.chat("How many loans are from men and have been paid off?")
# agent = Agent(
# "examples/data/Loan payments data.csv",
# )
# response = agent.chat("How many loans are from men and have been paid off?")

print(response)
# Output: 247 loans have been paid off by men.
# print(response)
# # Output: 247 loans have been paid off by men.


import pandasai as pai
import pandasai_sql

df = pai.load("sipa/users")
response = df.chat("How many users in total?")
print(response)
17 changes: 0 additions & 17 deletions examples/from_google_sheets.py

This file was deleted.

31 changes: 0 additions & 31 deletions examples/judge_agent.py

This file was deleted.

18 changes: 0 additions & 18 deletions examples/security_agent.py

This file was deleted.

64 changes: 0 additions & 64 deletions examples/semantic_agent.py

This file was deleted.

47 changes: 0 additions & 47 deletions examples/templates/sample_flask_salaries.html

This file was deleted.

