Data-Copilot

Overview

Data-Copilot is an LLM-based system that helps you address data-related tasks.

Data-Copilot connects data sources from different domains and serves diverse user needs, with the ability to autonomously manage, process, analyze, predict, and visualize data.

See our paper: Data-Copilot: Bridging Billions of Data and Humans with Autonomous Workflow, Wenqi Zhang, Yongliang Shen, Weiming Lu, Yueting Zhuang

🔥Demo video

Because GPT-3.5 has only a 4k input token limit, Data-Copilot can currently access only Chinese stocks, funds, and some economic data.

The Data-Copilot can query and predict data autonomously:

Supported models and data sources:

Model	CHN Stock	CHN Fund	CHN Economic data	CHN Financial data
Openai-GPT3.5	✓	✓	✓	✓
Azure-GPT3.5	✓	✓	✓	✓
Qwen-72b-Chat	✓	✓	✓	✓

We propose Data-Copilot, an LLM-based system that links Chinese financial market data such as stocks, funds, economic data, financial data, and live news.

⭐ Data-Copilot can autonomously manage, process, analyze, predict, and visualize data. When a request is received, it transforms raw data into informative results that best match the user’s intent.
⭐ Acting as a designer: Data-Copilot independently designs versatile interface tools with different functions through self-request and iterative refinement.
⭐ Acting as a dispatcher: Data-Copilot invokes the corresponding interfaces sequentially or in parallel and transforms raw data from heterogeneous sources into graphics, tables, and text, without human assistance.
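
The dispatcher role can be illustrated with a minimal sketch. The tool names (`get_stock_prices`, `get_fund_returns`, `merge_results`), the codes, and the hard-coded data are hypothetical placeholders, not Data-Copilot's actual interfaces. The sketch only shows two independent steps running in parallel, followed by a dependent step that runs once both results are available.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical interface tools with placeholder data (assumption:
# the real tools are defined in tool.py and tool_lib).
def get_stock_prices(code):
    return {"code": code, "prices": [10.0, 10.5, 11.0]}

def get_fund_returns(code):
    return {"code": code, "returns": [0.01, -0.02, 0.03]}

def merge_results(stock, fund):
    # Dependent step: needs both upstream results.
    return {
        "stock_mean": sum(stock["prices"]) / len(stock["prices"]),
        "fund_total": round(sum(fund["returns"]), 4),
    }

def run_workflow(stock_code, fund_code):
    # Independent steps are dispatched in parallel.
    with ThreadPoolExecutor(max_workers=2) as pool:
        stock_future = pool.submit(get_stock_prices, stock_code)
        fund_future = pool.submit(get_fund_returns, fund_code)
        stock, fund = stock_future.result(), fund_future.result()
    # The dependent step runs sequentially once its inputs are ready.
    return merge_results(stock, fund)

# Illustrative codes only.
summary = run_workflow("600519.SH", "000001.OF")
```

A thread pool is used here because data-fetching steps are typically I/O-bound; a real dispatcher would also need error handling for failed calls.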

🌳 QuickStart

First, replace openai.key and the Tushare token in main.py with your own OpenAI key and Tushare token. The project is organized as follows:

|-- README.md
|-- app.py
|-- create_tool
|   |-- Atomic_api_json.py
|   `-- all_atomic_api.json
|-- lab_gpt4_call.py
|-- lab_llms_call.py
|-- main.py
|-- output
|-- prompt_lib
|   |-- prompt_economic.json
|   |-- prompt_financial.json
|   |-- prompt_fund.json
|   |-- prompt_intent_detection.json
|   |-- prompt_stock.json
|   |-- prompt_task.json
|   `-- prompt_visualization.json
|-- requirements.txt
|-- tool.py
|-- tool_lib
|   |-- atomic_api.json
|   |-- tool_backup.json
|   |-- tool_economic.json
|   |-- tool_financial.json
|   |-- tool_fund.json
|   |-- tool_stock.json
|   `-- tool_visualization.json

app.py starts the Gradio demo. main.py implements the interface scheduling flow, and lab_gpt4_call.py calls the GPT-3.5 model. tool_lib and tool.py contain the interface tools produced by the first phase, interface design. The prompt_lib folder contains the prompt designs and the in-context demonstrations.
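
The naming pattern in the tree (`prompt_lib/prompt_<domain>.json` alongside `tool_lib/tool_<domain>.json`) can be captured in a small helper. This is a sketch only: `domain_files` is a hypothetical function, and whether the project itself resolves files by domain this way is an assumption.

```python
from pathlib import Path

# Domains that have both a prompt file and a tool file in the tree above.
DOMAINS = ("economic", "financial", "fund", "stock", "visualization")

def domain_files(domain, root="."):
    """Return the (prompt, tool) JSON paths for one domain."""
    if domain not in DOMAINS:
        raise ValueError(f"unknown domain: {domain!r}")
    base = Path(root)
    prompt_path = base / "prompt_lib" / f"prompt_{domain}.json"
    tool_path = base / "tool_lib" / f"tool_{domain}.json"
    return prompt_path, tool_path

prompt_path, tool_path = domain_files("stock")
```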

Requirements

pip install -r requirements.txt

Then run the following command:

For Local

python main.py

You can select the LLM in main.py by setting:

model='<the model you choose>'

Remember to fill in the key of the LLM you chose:

For GPT, set the OpenAI key in main.py:
```
openai_key = os.getenv("OPENAI_KEY")
```
For Qwen-72b-Chat, set the key in lab_llms_call.py:
```
dashscope.api_key='<your api key>'
```

Also, remember to set the Tushare token in tool.py before running the code:

tushare_token = os.getenv('TUSHARE_TOKEN')
pro = ts.pro_api(tushare_token)
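
Because the OpenAI key and the Tushare token are read from environment variables, it can help to check that they are set before starting. The following is a minimal sketch for the GPT path; the variable names `OPENAI_KEY` and `TUSHARE_TOKEN` come from the snippets above, while the helper `missing_credentials` is hypothetical and not part of the project.

```python
import os

# Environment variable names taken from the snippets above.
REQUIRED_VARS = ("OPENAI_KEY", "TUSHARE_TOKEN")

def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]

missing = missing_credentials()
if missing:
    print("Set these environment variables first: " + ", ".join(missing))
```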

For Gradio

The Gradio demo is hosted on Hugging Face Spaces. You can also run the following command to start the demo locally:

python app.py

🌿 How to play

You can try our Data-Copilot for Chinese financial markets in Hugging Face Space:

The demo has access to Chinese stocks, funds, and some economic data. Because GPT-3.5 has only a 4k input token limit, the accessible data is still relatively limited. In the future, Data-Copilot will support more data from foreign financial markets.

Step 1 Enter your OpenAI or Azure OpenAI key. A paid OpenAI API key is recommended. If you use Azure, enter the api-base and engine in addition to the key.
Step 2 Click the OK button to submit the key.
Step 3 Enter your request in the text box, or select a question from the example box and it will appear in the text box.
Step 4 Click the Start button to submit the request.
Step 5 Data-Copilot displays the intermediate scheduling process under Solving Step, and the final output presents text (Summary and Result), images, and tables.
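
The credential requirement in Step 1 can be expressed as a small check. This is a sketch under stated assumptions: `validate_credentials` and the provider labels `"openai"` and `"azure"` are hypothetical names for illustration, not the demo's actual code. It only encodes the rule that Azure needs api-base and engine in addition to the key.

```python
def validate_credentials(provider, key, api_base=None, engine=None):
    """Return a list of missing fields for the chosen provider."""
    missing = []
    if not key:
        missing.append("key")
    if provider == "azure":
        # Azure needs api-base and engine in addition to the key.
        if not api_base:
            missing.append("api-base")
        if not engine:
            missing.append("engine")
    elif provider != "openai":
        raise ValueError(f"unsupported provider: {provider!r}")
    return missing
```

For example, `validate_credentials("azure", "my-key")` reports that api-base and engine are still missing, while `validate_credentials("openai", "my-key")` reports nothing.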

🍺 Some cases

A case for the request "Check the inflow of northbound every trading date":

Citation

If you find this work useful, please cite the paper as follows:

@article{zhang2023data,
  title={Data-Copilot: Bridging Billions of Data and Humans with Autonomous Workflow},
  author={Zhang, Wenqi and Shen, Yongliang and Lu, Weiming and Zhuang, Yueting},
  journal={arXiv preprint arXiv:2306.07209},
  year={2023}
}

Contact

If you have any questions, please contact us by email: [email protected]

