How To Download Intraday Stock Data From Google Finance?

In this post we discuss how to download intraday high frequency trading (HFT) data using Python. Python is a powerful scripting language that is now used extensively by big banks, hedge funds and other large institutions for machine learning, artificial intelligence, statistical analysis and algorithmic trading. In this post, high frequency data simply means intraday data, that is, price bars on the M1, M5, M15, M30, H1 and H4 timeframes. Daily and weekly timeframes are low frequency timeframes.

Price feeds are expensive. Most price feed vendors charge $50-$100 per month for their service, but we don't need to pay this monthly cost. In this post we give you a few places online from where you can download price data for free. Quantitative trading is on the rise. Today more than 60% of the trades on NYSE are being placed by algorithms. Algorithmic trading is the future. If you are a manual trader, in the next few years you will be competing more and more with these algorithmic trading systems.

Unless we have good intraday HFT data, we cannot build quantitative predictive models. The idea is to find patterns in the HFT data and use them for making trading decisions. First we build models for testing. Once we have a model with high predictive accuracy during testing, we can build a prototype trading system and test it in live trading. Python is a good prototyping language. Once we have a good system we can switch to C++, C# or Java and build a more robust version that is very fast. So first we need to build our model, and building a good model requires high quality HFT data.

I started off as a manual trader. I use candlestick patterns in my trading system a lot. Naked trading means trading solely based on price action. The idea is to build quantitative trading models based on our manual trading strategies, so that the model supplements the manual trading system and makes it more accurate. We will download the data from Google Finance and Yahoo Finance. Of course this will be historical data, but historical data serves our purpose of building quantitative models. The following is the url for downloading data from Google Finance:

http://www.google.com/finance/getprices?i=[PERIOD]&p=[DAYS]d&f=d,o,h,l,c,v&df=cpct&q=[TICKER]

In the above url, PERIOD is the time interval of the data in seconds. For example, to download 1 minute data we use 60 for PERIOD. 1 minute is also the lowest time interval available. TICKER is the ticker symbol, for example MSFT for Microsoft or AAPL for Apple. DAYS is the number of days of data that you want. Keep in mind that the data from Google Finance is delayed and is not live data, so you cannot use it for live trading. As said above, though, you can use it for quantitative model building, so this data serves our purpose.

There is one thing to know about this data: the time is in Unix format, and we need to convert it into the standard time format. Sounds complicated? Not really, and I will show you how to make the conversion. We will download the 1 minute data for AAPL, the ticker symbol for Apple stock, which is pretty popular with day traders.
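To see how the placeholders fit together, here is a minimal sketch that fills in the url template. The helper name build_url and its default values are my own choices for illustration and are not part of any Google API.

```python
def build_url(ticker, period=60, days=10):
    """Fill in the getprices url template shown above.

    build_url is a hypothetical helper: period is the bar size in
    seconds and days is how many days of history to request.
    """
    return (
        "http://www.google.com/finance/getprices"
        f"?i={period}&p={days}d&f=d,o,h,l,c,v&df=cpct&q={ticker}"
    )


# 1 minute bars for Apple over the last 10 days
print(build_url("AAPL"))
```

Changing period to 300 would request 5 minute bars instead.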

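#import the libraries
import pandas as pd
#download 1 minute data for AAPL stock (i=60 means 60 second bars)
#skiprows=8 skips the header lines and the first row, which holds the unix timestamp
data = pd.read_csv("http://www.google.com/finance/getprices?q=AAPL&i=60&p=10d&f=d,o,h,l,c,v", skiprows=8, header=None)
data.head()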

This is the output!

data.head()
3    116.92    117.03    116.78       117  466878
0  4  117.0360  117.0765  116.8700  116.9200  402779
1  5  117.1573  117.1900  116.8900  117.0400  457673
2  6  117.1100  117.1850  117.0600  117.1500  311011
3  7  117.1100  117.1600  117.0000  117.1100  291641
4  8  117.1500  117.1600  117.0599  117.1045  246471

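Now we add a timestamp in a familiar format. The first row of each day carries a Unix timestamp prefixed with the letter a, and the rows after it carry only an offset that counts the intervals elapsed since that row. We need to convert both into the usual format that includes the day, hour and minutes.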
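Here is a small sketch of the conversion on its own, using the first stamp from the AAPL data, a1482330600. I assume 1 minute bars and work in UTC so that the result does not depend on your computer's time zone. The function name to_datetime is just for illustration.

```python
from datetime import datetime, timedelta, timezone


def to_datetime(stamp, base=None, interval=60):
    """Convert one value from the first column of the feed.

    A stamp such as 'a1482330600' is a full unix timestamp. A plain
    number such as '3' means 3 intervals after the last full timestamp,
    which must then be passed in as base.
    """
    if stamp.startswith("a"):
        return datetime.fromtimestamp(int(stamp[1:]), tz=timezone.utc)
    return base + timedelta(seconds=interval * int(stamp))


base = to_datetime("a1482330600")
print(base)                    # 2016-12-21 14:30:00+00:00
print(to_datetime("3", base))  # 2016-12-21 14:33:00+00:00
```

14:30 UTC is 9:30 in New York in December. If you call fromtimestamp without a time zone, as the code in this post does, you get the same moment expressed in your computer's local time.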

#add a timestamp to the intraday AAPL data
import pandas as pd, numpy as np, datetime
interval=60 #bar size in seconds, must match i= in the url
x=np.array(pd.read_csv("http://www.google.com/finance/getprices?q=AAPL&i=60&p=10d&f=d,o,h,l,c,v",skiprows=7,header=None))
date=[]
for i in range(0,len(x)):
    if x[i][0][0]=='a':
        #rows starting with 'a' carry a full unix timestamp
        t=datetime.datetime.fromtimestamp(int(x[i][0].replace('a','')))
        date.append(t)
    else:
        #other rows count the intervals elapsed since the last 'a' row
        date.append(t+datetime.timedelta(seconds=interval*int(x[i][0])))
data1=pd.DataFrame(x,index=date)
#the feed lists close before open: each bar's last price column matches the previous bar's first price column
data1.columns=['a','Close','High','Low','Open','Vol']
data1.head()
data1.tail()

When you use the above code, make sure you indent the body of the for loop and of the if/else statement. Indentation is mandatory in Python, otherwise you will get an IndentationError. Below is the output:

>>> data1.head()
                               a    Close     High     Low    Open     Vol
2016-12-21 19:30:00  a1482330600   116.84   116.84   116.8   116.8  225329
2016-12-21 19:31:00            1   117.27   117.35   116.8  116.84  633227
2016-12-21 19:32:00            2      117   117.35     117  117.28  400675
2016-12-21 19:33:00            3   116.92   117.03  116.78     117  466878
2016-12-21 19:34:00            4  117.036  117.076  116.87  116.92  402779
>>> data1.tail()
                       a    Close    High     Low     Open      Vol
2017-01-04 06:20:00  650  116.695   116.7  116.61   116.64   240149
2017-01-04 06:21:00  651  116.625  116.72  116.62  116.695   243736
2017-01-04 06:22:00  652  116.745  116.75  116.62   116.62   316471
2017-01-04 06:23:00  653   116.61  116.76   116.6  116.745   402803
2017-01-04 06:24:00  654   116.64  116.74  116.58    116.6  1839289

In the above data you can see the original time format, which is a1482330600, in the first column. The index shows the same time converted to the regular format. In the next post I am going to show you how to download Google Finance intraday data in real time.
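If you want to try the conversion without calling the url, the sketch below parses a few rows retyped from the output above into the raw comma separated layout. The function name parse_feed and the 60 second interval are my assumptions for illustration. The price columns are left with their numeric labels, and the index is built in UTC so the result is the same on any machine.

```python
import io

import pandas as pd

# The first five rows from the output above, retyped as raw csv text
SAMPLE = """a1482330600,116.84,116.84,116.8,116.8,225329
1,117.27,117.35,116.8,116.84,633227
2,117,117.35,117,117.28,400675
3,116.92,117.03,116.78,117,466878
4,117.036,117.076,116.87,116.92,402779
"""


def parse_feed(text, interval=60):
    """Turn raw feed rows into a DataFrame indexed by UTC time.

    parse_feed is a hypothetical helper. interval is the bar size in
    seconds and must match the i= value used in the url.
    """
    # Read the first column as text because it mixes 'a...' stamps and offsets
    df = pd.read_csv(io.StringIO(text), header=None, dtype={0: str})
    stamps = []
    base = None
    for value in df[0]:
        if value.startswith("a"):
            # Full unix timestamp: remember it as the new base
            base = int(value[1:])
            stamps.append(base)
        else:
            # Offset: number of intervals since the last 'a' row
            stamps.append(base + interval * int(value))
    df.index = pd.to_datetime(stamps, unit="s", utc=True)
    # The raw stamp column is no longer needed once the index is set
    return df.drop(columns=0)


bars = parse_feed(SAMPLE)
print(bars)
```

Keeping the base timestamp as an integer and adding interval times offset means the same function also works for 5 minute data if you pass interval=300.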

AAPL Intraday data

In the above chart, the market is closed on holidays, so you can see straight lines. If you compare Python with R, I would say R is much simpler when it comes to dealing with financial time series data, and Python is a bit more difficult. In the above code we have changed the Unix time format to the standard time format. You can use the above method to download intraday stock data absolutely free.