8 minutes reading time (1506 words)

Accessing and Manipulating Historical Data from Yahoo Finance in Excel

cover

Yahoo Finance displays historical data for stock prices, indices, bond yields, fx rates, commodity prices etc on their website and allow anyone to download a text file with these data for free.

If all you want is a very quick and uncomplicated way (three mouse clicks!) to simply get live feeds in your spreadsheet, check out the 30 seconds videos below:

Getting easily live stock prices in Excel

Getting easily live forex rates in Excel

Getting historical data in Excel

Update as of January 2020: Deriscope has now a new dedicated ribbon that greatly simplifies the live feeds user interface.

For details, visit the Introduction to Deriscope Live Feeds.

Deriscope automates this process through a special spreadsheet formula called dsLive that takes as input the handle name of an object containing the specs of your particular request.

This means that in order to get historical data in the spreadsheet through Deriscope, I need to place two separate function calls:

1) Use the Create function of the type HistRequestYF to build the request object

2) Apply the dsLive formula on that object to get the data from Yahoo Finance

The easiest way to generate both formulas is to use the wizard.

I simply click on the Tools button, select the Insert Function item, followed by the Live Feeds item, then the (Yahoo Finance) item and finally click on Time Series in order to paste the necessary formulas at and below my currently selected cell.

The following video shows these steps: 

that results in the following: 


Understanding the dsLive formula in cell A1

As you see, cell A1 contains the formula =dsLive(A2), which takes one input argument and returns the text &Variant_A1:1.1

The prefix & indicates that &Variant_A1:1.1 is the handle name of some object. In fact it points to an object of type Variant that contains the historical data retrieved from Yahoo Finance.

Because cell A1 is currently selected, I can see the object's contents inside the Browse Area of the wizard.

The input cell A2 contains the handle name of an object that has been created by the ds formula as explained next.


Understanding the ds formula in cell A2

Cell A2 contains the formula =ds (A3:B15), which takes one input argument and returns the text &HistYF_A2:1.1, which is the handle name of an object of type Hist YF.

In the following screenshot I have selected the cell A2 so that you can see the formula in the formula bar and also the contents of the respective object inside the wizard.

Note the request is constructed by the wizard with regard to the daily historical prices of the Microsoft stock (ticker of MSFT) for the course of one year.  


Mandatory versus optional input

The first important point is that the Deriscope wizard has pasted at range A3:B15 all input key/value pairs, including those that are optional.

I can tell which keys are optional by looking at the object's contents inside the wizard, where the optional keys are dimmed out (see previous screenshot).

The wizard has actually used the default values for the optional keys, so that I can get the exact same result even if I truncate my input data, as shown below: 


Output as array versus output as object 

An important key is the Output As Object that can be set to either TRUE or FALSE.

The default value is TRUE that results in the output being the handle name of an object. The historical data are then not returned to the spreadsheet but are kept inside the referenced object.

If I wish I can use the wizard to transfer all or part of the object's contents into the spreadsheet as shown in the next video: 

The above method involves an unnecessary redirection if my goal is to have all of the output data in the spreadsheet cells.

An easier and more efficient method would then be to set the Output As Object to TRUE, in which case dsLive should be inserted as an array formula because it returns the historical data as an array without the intermediate creation of an object, as shown below: 


Setting the sampling interval

I can easily change the sampling interval by clicking the validation dropdown on the cell next to the Interval key as shown below:  

Setting it to Month, I get the following result: 


History, Dividends or Splits?

The Events key controls the kind of data requested from the server.

There exist three options as shown below: 

Setting it to Dividends, I get the following result:  


Controlling the time range

The keys From and To supply the earliest and latest date of the requested historical data respectively.

If I select any date-containing cell, a popup calendar appears that helps me to change the contained date, as shown below: 


Controlling the displayed columns

You have noticed that the historical data occupy 7 columns.

It is nevertheless possible to display only a subset of these columns by changing the value for the key Columns. If I select the value cell, the following validation dropdown appears: 

If I set it to Exclude, the function returns error.

I don't need to panic! I only need to select the cell containing the error and read the information displayed in the Info Area of the wizard, which tells me to add one additional key called Column List, the purpose of which is to specify the indices of the about to be excluded columns.

In the screenshot below, you see how I manage to exclude the Open, High, Low and Volume columns: 


Multiple symbols

It is possible to enter an array of values next to the Symbols key.

So I add the symbol GOOG next to MSFT, but I get an error as shown below: 

The displayed error message tells me that a required key called Column List is missing from the input data.

The reason for this complaint is that in the case of multiple symbols the default value for the missing Columns key is Include rather than All, because it is more common to request one single column when multiple symbols are processed.

So I add the missing Column List key with a value of 4 to get the Close column as shown below: 

If I had set 4,5 I would have gotten both the Close and Adj Close columns as shown below:


Rescaling all values to some common base

The key Rescale Values applies in all cases, but it is really useful when several symbols are processed.

The previous example with MSFT and GOOG is indicative of the difficulty we face when we analyze and compare series of numbers that do not start from the same base. MSFT price is around 100 where GOOG price is around 1,000.

The solution is to set some arbitrary date on which both stocks are rescaled so that they are equal to some arbitrary common vase value.

If not specified, Deriscope assumes that the common value is 100 and applies to the earliest date in the set of historical data.

This is exactly what I have done below, where I have also set the Interval to 3 Months so that all output data can be easily seen inside the wizard: 

Now I can easily compare the price evolution of the two stocks.

Since both have started at 100 a year ago, I can safely conclude that Microsoft grew faster than Google. I can also create a common chart without simultaneous display problems since both curves move around 100. 


Setting the Timeout

The key Timeout allows you to increase the default maximum waiting time in case there are network problems while contacting the Yahoo server.  


Visiting the server's url

The key Go to URL allows you to navigate to the Yahoo website where the data are coming from.

This might be useful if you do not trust the data displayed in Excel and would like to visit their origin in order to double-check their validity. 


A YouTube video with voice narration is also available below: 

You may download the spreadsheet produced with the above steps here

Feel free to contact me if you want to share any thoughts with regard to this product or if you want to request any particular features. Contact info and social media links are available at my web site https://www.deriscope.com

Parametric Yield Curve Fitting to Bond Prices unde...
Asian Option Pricing in Excel using QuantLib: Mont...