Preprocessor of raw Binance candle data, normalizes, adds indicators and labels.

waffle-empire/DataPreprocessor

Data Preprocessor

Cloning

git clone https://github.com/Research-Project-Crypto/DataPreprocessor.git --recursive

If you forgot to clone recursively you can use the following command:

git submodule update --init --recursive

Build Dependencies

The following instructions are written for Arch-based Linux systems, but they should give you an idea of how to port the setup to other systems.

pacman -S --noconfirm --needed gcc make
pacman -S --noconfirm --needed premake

yay -S ta-lib
# or
paru -S ta-lib

Compiling the application

premake5 gmake2

make config=release

Using the application

With arguments

Position Argument
1 Data input folder
2 Data output folder

Example:

./bin/Release/DataPreprocessor data/input data/output

Downside of argument-only mode

In argument-only mode you cannot specify the type of the input data; only CSV text data can be parsed.

Without arguments, using settings.json

If you don't pass any arguments, the application defaults to reading its settings from settings.json.

{
    "input": {
        "input_folder": "data/input",
        "is_binary": false
    },
    "output": {
        "output_folder": "data/output"
    }
}
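The two invocation modes described above can be sketched in Python as follows. This is only an illustration of the documented behaviour, not the application's actual code: the function name `resolve_config` is hypothetical, and rejecting a single argument is an assumption, since the README only covers the zero-argument and two-argument cases.

```python
import json
from pathlib import Path


def resolve_config(argv, settings_path="settings.json"):
    """Return (input_folder, output_folder, is_binary) for an invocation.

    Hypothetical helper mirroring the README's description.
    """
    args = argv[1:]  # drop the program name
    if len(args) >= 2:
        # Argument-only mode: the input type cannot be chosen, so CSV is implied.
        return args[0], args[1], False
    if len(args) == 1:
        # Assumption: the README does not say what a single argument does.
        raise ValueError("expected both an input and an output folder")
    # No arguments: fall back to settings.json.
    settings = json.loads(Path(settings_path).read_text())
    return (
        settings["input"]["input_folder"],
        settings["output"]["output_folder"],
        bool(settings["input"]["is_binary"]),
    )
```

Calling `resolve_config(["DataPreprocessor", "data/input", "data/output"])` never touches settings.json, which is why binary input can only be enabled through the settings file.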

Data Input Format

CSV

The CSV reader expects 6 fields, all of which should be double-precision floating-point numbers.

event_time,open,close,high,low,volume
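Before running the preprocessor, it can help to check that an input file matches this layout. The sketch below is a hypothetical pre-flight check, not part of the project: `check_rows` is an invented name, and tolerating an optional header line is an assumption, since the README does not say whether the file carries one.

```python
import csv
import math

# Field order as documented in the README.
FIELDS = ["event_time", "open", "close", "high", "low", "volume"]


def check_rows(lines):
    """Return a list of (line_number, problem) for rows that do not fit."""
    problems = []
    for number, row in enumerate(csv.reader(lines), start=1):
        if number == 1 and row == FIELDS:
            continue  # assumption: an optional header line is skipped
        if len(row) != len(FIELDS):
            problems.append((number, f"expected {len(FIELDS)} fields, got {len(row)}"))
            continue
        try:
            values = [float(cell) for cell in row]
        except ValueError:
            problems.append((number, "non-numeric field"))
            continue
        if not all(math.isfinite(value) for value in values):
            problems.append((number, "non-finite value"))
    return problems
```

To check a file, pass an open file object, for example `check_rows(open(path, newline=""))`, where `path` points at one of your input CSVs. An empty result means every row has six finite numeric fields.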

Binary

Usually you won't need this mode. The exception is data produced by TickerTimescaleSwap: in that case you must set the is_binary value to true in settings.json.

Verify Data Integrity

This project includes a Python script that you can use to verify the binary output data.

python3 scripts/binary_reader.py

The script requires NumPy.

It loops over all the cells slowly; it is mostly meant as a quick way to spot calculation mistakes in the program.
