Scripts for data collection
- yahoo: get US/CN stock data from Yahoo Finance
- fund: get fund data from http://fund.eastmoney.com
- cn_index: get CN index from http://www.csindex.com.cn, CSI300/CSI100
- us_index: get US index from https://en.wikipedia.org/wiki, SP500/NASDAQ100/DJIA/SP400
- contrib: scripts for some auxiliary functions

To add a custom data collector, follow the steps below. Reference implementation: https://github.com/microsoft/qlib/tree/main/scripts/data_collector/yahoo

- Create a dataset code directory in the current directory
- Add `collector.py`
  - add collector class:

        import sys
        from pathlib import Path

        # Put the directory two levels up on sys.path so that `data_collector` is importable
        CUR_DIR = Path(__file__).resolve().parent
        sys.path.append(str(CUR_DIR.parent.parent))
        from data_collector.base import BaseCollector, BaseNormalize, BaseRun

        class UserCollector(BaseCollector):
            ...

  - add normalize class:

        class UserNormalize(BaseNormalize):
            ...

  - add `CLI` class:

        class Run(BaseRun):
            ...

- Add `README.md`
- Add `requirements.txt`
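
The steps above can be scaffolded with a short script. This is a minimal sketch: the function name `scaffold_collector`, the placeholder file contents, and the example directory name `my_dataset` are hypothetical. Only the three file names and the class skeletons come from the steps listed above.

```python
from pathlib import Path

# Placeholder content for collector.py; a hypothetical starting point that
# mirrors the class skeletons from the steps above, not a qlib-provided template.
COLLECTOR_STUB = '''import sys
from pathlib import Path

CUR_DIR = Path(__file__).resolve().parent
sys.path.append(str(CUR_DIR.parent.parent))

from data_collector.base import BaseCollector, BaseNormalize, BaseRun


class UserCollector(BaseCollector):
    ...


class UserNormalize(BaseNormalize):
    ...


class Run(BaseRun):
    ...
'''


def scaffold_collector(base_dir, name):
    """Create <base_dir>/<name>/ containing the three files from the steps above."""
    target = Path(base_dir) / name
    target.mkdir(parents=True, exist_ok=True)
    files = {
        "collector.py": COLLECTOR_STUB,
        "README.md": f"# {name}\n",
        "requirements.txt": "",
    }
    for filename, content in files.items():
        path = target / filename
        if not path.exists():  # never overwrite files that already exist
            path.write_text(content, encoding="utf-8")
    return target
```

Calling `scaffold_collector(".", "my_dataset")` from the data collector directory would create `my_dataset/` with the three files; the class bodies still have to be filled in by hand.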

Data type | Basic data |
---|---|
Features | Price/Volume: `$close`/`$open`/`$low`/`$high`/`$volume`/`$change`/`$factor` |
Calendar | `<freq>.txt`: `day.txt`, `1min.txt` |
Instruments | `<market>.txt`: required: `all.txt`; market-specific examples: `csi300.txt`/`csi500.txt`/`sp500.txt` |

- `Features`: numeric data
  - if prices are not adjusted, `factor=1`
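
The `factor` rule can be illustrated with a small sketch. It assumes that the factor is the ratio of adjusted close to raw close and that an adjusted price is the raw price multiplied by the factor. The function names `compute_factor` and `adjust_price` are hypothetical and not part of these scripts.

```python
def compute_factor(raw_close, adjusted_close=None):
    """Return the adjustment factor for one bar.

    Assumption: factor = adjusted close / raw close. When no adjusted
    price is available (the data is not adjusted), the factor is 1.
    """
    if adjusted_close is None:
        return 1.0
    if raw_close == 0:
        raise ValueError("raw_close must be non-zero")
    return adjusted_close / raw_close


def adjust_price(raw_price, factor):
    """Scale a raw price field (open/high/low/close) by the factor."""
    return raw_price * factor
```

With unadjusted data, `compute_factor(10.0)` gives `1.0`, so `adjust_price` leaves every price unchanged.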

For each component to run correctly, the data it depends on must be available:

Component | required data |
---|---|
Data retrieval | Features, Calendar, Instruments |
Backtest | Features[Price/Volume], Calendar, Instruments |
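
The table above can be turned into a quick pre-flight check. The sketch below is hedged: the directory layout (`features/`, `calendars/<freq>.txt`, `instruments/all.txt`) is an assumption about how the collected data is stored, and the helper `missing_data` is hypothetical. It only checks that the data kinds are present, not that the Price/Volume fields needed for backtesting exist.

```python
from pathlib import Path

# Required data kinds per component, taken from the table above.
REQUIRED = {
    "data_retrieval": ("features", "calendar", "instruments"),
    "backtest": ("features", "calendar", "instruments"),
}


def missing_data(data_dir, component, freq="day"):
    """Return the required data kinds that are absent from data_dir.

    Assumed layout (hypothetical, adjust to your dataset):
      <data_dir>/features/             non-empty directory
      <data_dir>/calendars/<freq>.txt
      <data_dir>/instruments/all.txt
    """
    root = Path(data_dir)
    features = root / "features"
    present = {
        "features": features.is_dir() and any(features.iterdir()),
        "calendar": (root / "calendars" / f"{freq}.txt").is_file(),
        "instruments": (root / "instruments" / "all.txt").is_file(),
    }
    return [kind for kind in REQUIRED[component] if not present[kind]]
```

An empty list from `missing_data(data_dir, "backtest")` means all three data kinds were found under the assumed layout.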