A Service to pull and validate OHLCV Data from various Crypto Exchanges using Python and CCXT.
Black Oak uses the ccxt library which is a collection of available crypto exchanges or exchange classes. Please note that some exchanges don't offer to pull unlimited historical data. Some exchanges may limit the access to historical data to a few months or days only.
The recommended to use Black Oak is running it with Docker.
Runs the docker container and mounts the local ./data
directory of the host system inside the Container.
make docker-run-with-mountpoint
Black Oak requires Python 3.7 or higher.
pip install -r requirements.txt
The .envrc takes care of installing the conda environment, dependencies and pre-commit
direnv allow
./bin/run_ohlcv_data_fetcher.sh config.toml
[settings]
# debug output true / false
debug = true
# persistence ( at the moment we only support file, aws kinesis firehose will be implemented later )
persistence = "file"
# a list of proxies to rotate through
# you can use {rand} to append a unique session id for every http request
proxies = [
"http://username:[email protected]?session={rand}:22225",
]
# rate limit should be enabled in case you don't use any proxy
ratelimit = false
# this section is the list of exchanges, assets and resolutions to poll
[exchanges]
[exchanges.bitfinex]
filter_symbols = [] # passing an empty list will pull all available symbols!!!
filter_resolutions = [] # passing an empty list will pull all available resolutions
since = "2017-01-01 00:00:00"
until = "2019-06-26 00:00:00"
limit = 5000
[exchanges.bitmex]
filter_symbols = ["BTC/USD", "ETH/USD"]
filter_resolutions = ["1m", "1h", "1d"]
since = "" # leave since empty to fetch data from the first date available
until = "2019-06-26 00:00:00"
limit = 750
It's recommended to explicitly set since and until dates. Some Exchanges such as Binance might only return the latest data points if you don't explicitly set a date which goes further back in time. Therefore to say, the since date should always be set.
It's recommended to run black-oak with a proxy server such as luminati.io to work around the exchange's rate limits.
Black Oak comes with a script to check the data integrity of a file. This script will check line by line if the time distance between the current and the previous candle is accurate. If it isn't, it will return the corrupted items with some details such as missing number of candles.
The distance in seconds between two candles we can pass with the --resolution
parameter. It's an optional parameter. If
you don't pass one, the script will take the distance between the first two candles. In general, we advise you to pass the
resolution
Usage:
./bin/check_data_integrity.sh --help
Run the integrity checks and write result
# check a single file and write to BINANCE_LTCUSDT_1h_REPAIRED
./bin/check_data_integrity.sh data/BINANCE_LTCUSDT_1h -w -o BINANCE_LTCUSDT_1h_COMPLETE
# check the file and explicitly set the resolution to minutes
./bin/check_data_integrity.sh data/BINANCE_LTCUSDT_1h -r minute
# check a single file and write to the same file
./bin/check_data_integrity.sh data/BINANCE_LTCUSDT_1h -w
# checks a directory of files and writes to new files (FILE_REPAIRED)
./bin/check_data_integrity.sh data/ -wn
# checks a directory of files and writes to new files (FILE_REPAIRED) and saves the test report to the file test_report.txt
./bin/check_data_integrity.sh data/ -wn > test_report.txt
# checks a directory of files and writes to new files (FILE_REPAIRED) and redirects the output to the console and file
./bin/check_data_integrity.sh data/ -wn >&1 | tee test_report.txt