Data obtained from http://188.226.170.94/; a Reddit user claims it comes from Binance.

In [7]:
import pandas as pd
import matplotlib.pyplot as plt
In [2]:
df = pd.read_csv("/home/adrian/Downloads/ETHBTC_1m.csv")
In [13]:
len(df) / 60 / 24
Out[13]:
471.25347222222223

That is about 471 days of 1-minute bars (assuming no gaps), which seems like a decent amount of data.
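The day count above only holds if there is exactly one row per minute. A minimal sketch for checking that, assuming the `Open time` column holds millisecond epoch timestamps (as the `df.head()` output further down suggests); `count_missing_minutes` is a hypothetical helper name, not something from the data source:

```python
import pandas as pd

def count_missing_minutes(open_time_ms: pd.Series) -> int:
    """Count 1-minute bars missing between the first and last timestamp."""
    # Assumption: values are epoch milliseconds marking the start of each bar.
    ts = pd.to_datetime(open_time_ms, unit="ms").sort_values().drop_duplicates()
    if ts.empty:
        return 0
    # Number of bars a gap-free series would contain over the same span.
    expected = int((ts.iloc[-1] - ts.iloc[0]) / pd.Timedelta(minutes=1)) + 1
    return expected - len(ts)

# Toy example: the 04:02 bar is missing between 04:01 and 04:03.
sample = pd.Series([1500004800000, 1500004860000, 1500004980000])
missing = count_missing_minutes(sample)
```

On the real file this would be called as `count_missing_minutes(df["Open time"])`; a non-zero result means the true time span is longer than `len(df) / 60 / 24` days.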

In [9]:
df.Volume.plot()
Out[9]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f860e47a0f0>
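Plotting every minute bar against the row number can be hard to read. A small sketch of one alternative, assuming `Open time` is in epoch milliseconds: index by datetime and sum the volume per day before plotting. `daily_volume` is a hypothetical helper name.

```python
import pandas as pd

def daily_volume(frame: pd.DataFrame) -> pd.Series:
    """Sum minute-bar volume into one value per calendar day (UTC)."""
    # Assumption: "Open time" is epoch milliseconds, "Volume" is per-bar volume.
    idx = pd.to_datetime(frame["Open time"], unit="ms")
    return frame.set_index(idx)["Volume"].resample("1D").sum()

# Toy example: two bars on one day, one bar exactly 24 hours later.
toy = pd.DataFrame({
    "Open time": [1500004800000, 1500004860000, 1500004800000 + 86_400_000],
    "Volume": [0.043, 0.306, 0.5],
})
per_day = daily_volume(toy)
```

With the real data, `daily_volume(df).plot()` would give a date axis instead of row numbers.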

Having volume from just one exchange, rather than aggregated across several, is less than optimal.

In [14]:
df.Close.plot()
Out[14]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f86124af400>
In [32]:
cryptoarchive = pd.read_csv("/home/adrian/Downloads/ETHBTC_cryptoarchive.csv", sep='|', header=None,
                            names=['Timestamp','Open','High','Low','Close','Volume','TakerBuyQuoteAssetVolume',
                                   'TakerBuyBaseAssetVolume','QuoteAssetVolume','TradesNumber'])
In [34]:
cryptoarchive.Volume.plot()
Out[34]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f860e301518>
In [37]:
cryptoarchive.head()
Out[37]:
    Timestamp  Open  High   Low  Close  Volume  TakerBuyQuoteAssetVolume  TakerBuyBaseAssetVolume  QuoteAssetVolume  TradesNumber
0  1500004800  0.08  0.08  0.08   0.08   0.043                       0.0                      0.0             0.003             1
1  1500004860  0.08  0.08  0.08   0.08   0.000                       0.0                      0.0             0.000             0
2  1500004920  0.08  0.08  0.08   0.08   0.306                       0.0                      0.0             0.024             2
3  1500004980  0.08  0.08  0.08   0.08   0.212                       0.0                      0.0             0.017             1
4  1500005040  0.08  0.08  0.08   0.08   0.165                       0.0                      0.0             0.013             2
In [38]:
df.head()
Out[38]:
       Open time  Open  High   Low  Close  Volume     Close time  Number of trades
0  1500004800000  0.08  0.08  0.08   0.08   0.043  1500004859999                 1
1  1500004860000  0.08  0.08  0.08   0.08   0.000  1500004919999                 0
2  1500004920000  0.08  0.08  0.08   0.08   0.306  1500004979999                 2
3  1500004980000  0.08  0.08  0.08   0.08   0.212  1500005039999                 1
4  1500005040000  0.08  0.08  0.08   0.08   0.165  1500005099999                 2

Judging by the identical first rows (only the timestamp unit differs: seconds vs. milliseconds), cryptoarchive also seems to get its data from Binance.
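Five matching rows are only a hint. A rough sketch of how the two files could be compared over their whole overlap, assuming the column names shown in the two `head()` outputs and that `Open time` is in milliseconds while `Timestamp` is in seconds; `compare_sources` and the tolerance are my own choices, not part of either data source:

```python
import pandas as pd

OHLCV = ("Open", "High", "Low", "Close", "Volume")

def compare_sources(binance_df, archive_df, cols=OHLCV, tol=1e-9):
    """Join both frames on the bar start time and compare column by column.

    Returns (number of matched bars, {column: True if equal within tol}).
    """
    # Convert milliseconds to seconds so both frames share one join key.
    left = binance_df.assign(Timestamp=binance_df["Open time"] // 1000)
    merged = left.merge(archive_df, on="Timestamp", suffixes=("_a", "_b"))
    same = {
        c: bool(((merged[f"{c}_a"] - merged[f"{c}_b"]).abs() < tol).all())
        for c in cols
    }
    return len(merged), same

# Toy example built from the first three rows shown above.
bars = {"Open": [0.08] * 3, "High": [0.08] * 3, "Low": [0.08] * 3,
        "Close": [0.08] * 3, "Volume": [0.043, 0.0, 0.306]}
toy_a = pd.DataFrame({"Open time": [1500004800000, 1500004860000, 1500004920000], **bars})
toy_b = pd.DataFrame({"Timestamp": [1500004800, 1500004860, 1500004920], **bars})
matched, same = compare_sources(toy_a, toy_b)
```

On the real data the call would be `compare_sources(df, cryptoarchive)`. The matched-bar count matters: if no timestamps overlap, every column trivially reports `True`.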

CryptoCompare has very relaxed rate limits, so I could pull most of the minute-level data in about 2 hours. The downside is that I don't have a good feeling about that site, and a Reddit user claims its data has look-ahead ("future") bias issues, so I want to stay as far away from it as possible. Though maybe I could use its volume estimates to aggregate data from a different source...
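One possible reading of that last idea, purely as a sketch: take per-exchange closing prices from a trusted source and weight them by per-exchange volume estimates. The frame layout (one column per exchange, one row per minute), the exchange labels, and the fallback to a plain mean when no volume is reported are all assumptions of mine.

```python
import pandas as pd

def volume_weighted_close(closes: pd.DataFrame, volumes: pd.DataFrame) -> pd.Series:
    """Combine per-exchange closes into one series, weighted by volume.

    Both frames are assumed to share the same index (bar times) and the
    same columns (one per exchange).
    """
    total = volumes.sum(axis=1)
    # Rows with zero total volume become NaN here instead of dividing by zero.
    weighted = (closes * volumes).sum(axis=1) / total.where(total > 0)
    # Fallback (an assumption): unweighted mean when nobody reported volume.
    return weighted.fillna(closes.mean(axis=1))

# Toy example with two hypothetical exchanges "a" and "b".
closes = pd.DataFrame({"a": [0.08, 0.10], "b": [0.09, 0.12]})
volumes = pd.DataFrame({"a": [1.0, 0.0], "b": [3.0, 0.0]})
combined = volume_weighted_close(closes, volumes)
```

Whether weights taken from a source suspected of look-ahead bias are safe to use is a separate question; this only shows the arithmetic.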

Nomics is basically useless here because the data it provides only covers a recent period.

I guess coinapi.io is my only option...