
# How to compute the VaR: Step-by-Step Excel Guide

The purpose of this article is to show you, step by step, how to calculate the Value at Risk (VaR) of any portfolio by generating all the simulation samples in the spreadsheet. This approach is great for understanding what is going on, but it becomes too complex and slow when the number of samples generated by the simulation exceeds 100.

If you don't already know what VaR is or how it is defined, or if you are looking for a production-grade VaR computation tool for real portfolios, you should read the Value at Risk article at the Risk Management Guru site.

The Value at Risk (VaR) computation of any given portfolio of financial products can be broken down into the following 8 stages:

### Stage 1: Determination of the underlying market variables that drive the portfolio price.

Simply collect together all market variables that drive the price of each constituent product.

Take as an example a US-based portfolio manager, who has a portfolio that consists of fixed coupon USD treasuries, fixed coupon European government bonds, Microsoft shares and two-year at-the-money options on the DJI index. Then the market variables that drive the portfolio price consist of:

- USD interest rates
- EUR interest rates
- EUR/USD exchange rate
- MSFT price
- DJI index quote
- 2-year ATM DJI index implied volatility

A natural simplifying assumption is to replace the "USD interest rates" and "EUR interest rates" above with "Flat increase in USD interest rates" and "Flat increase in EUR interest rates", because the interest rates for any given currency are not a single number, but rather a curve consisting of many maturity-dependent numbers.

For simplicity, below I will build a spreadsheet that will calculate the VaR of a single asset, specifically a European equity option on Microsoft stock. Then my market variables will consist of three numbers: the stock price, the implied vol and the interest rate for the option maturity.

### Stage 2: Assumption about the stochastic process that each market variable is assumed to follow during its future time evolution.

Now I must make assumptions about how the market data evolve. These assumptions lead to a collection of so-called risk factors: mathematical stochastic variables that drive the observable market data. Often the risk factors are identical to the market data, but they need not be.

For example, in my stock option case, the risk factors driving the stock price and the implied vol are these market data themselves, and both are assumed to follow a lognormal diffusion. For the risk factor associated with the interest rate, however, I choose the spread between the interest rate realized at the future time and its initial value today. In this way I can model the spread with a normal diffusion, which is a good assumption for short time intervals.
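These process assumptions can be sketched in Python. Below is a minimal one-step simulation of the three risk factors over a short horizon dt; all parameter values are illustrative placeholders, not the article's actual spreadsheet inputs.

```python
import numpy as np

# One-step evolution of the three risk factors over a horizon dt (in years).
# Parameter values are placeholders for illustration only.
rng = np.random.default_rng(0)

def step_gbm(x0, mu, sigma, dt, z):
    # Lognormal (geometric Brownian) step:
    # x0 * exp((mu - sigma^2/2) * dt + sigma * sqrt(dt) * z)
    return x0 * np.exp((mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z)

def step_normal_spread(mu, sigma, dt, z):
    # Normal diffusion for the rate spread, which starts at 0 by construction
    return mu * dt + sigma * np.sqrt(dt) * z

dt = 2 / 365                        # e.g. a two-day horizon
z = rng.standard_normal(3)          # independent kicks; correlation is added in stage 4
price = step_gbm(100.0, 0.07, 0.25, dt, z[0])   # stock price: lognormal
vol = step_gbm(0.20, 0.0, 0.50, dt, z[1])       # implied vol: lognormal
spread = step_normal_spread(0.0, 0.01, dt, z[2])  # rate spread: normal
```

With zero noise and zero volatility, `step_gbm` reduces to pure exponential growth, which matches the deterministic first sample discussed in stage 4 below.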

The yellow area below contains two columns. In the first column I enter the labels I want to use for referring to my three risk factors: Price, Vol and Spread. I may choose any names I want here. In the second column I enter the special Deriscope Types that I use further below in my spreadsheet (see second image below) to create the Deriscope objects that represent the respective stochastic processes.

The name Geom Brownian Process is the formal name of the Deriscope Type that represents a lognormal diffusion. The name OU Process is the Deriscope Type that represents an Ornstein–Uhlenbeck diffusion, which can also be used to model a normal diffusion with drift.

If you are not familiar with the Deriscope notion of Type, you may want to read the article series at Introduction to Deriscope.

The image below shows the section of my spreadsheet with the three formulas that create the three objects of type Stoch Process. The precedent arrows are also displayed for each formula to assist in the visualization of the dependencies.

Because cell B30 is selected, the formula bar shows the formula in that cell, which is =ds(B31:C33;B34:D35) and returns the text &Price.1, the handle name of the object of Type Geom Brownian Process created by that formula. Since this Type inherits from Stoch Process, the object &Price.1 is of Type Stoch Process as well. The green cells contain links to hard-coded input data in other cells, as explained below in stage 3.

### Stage 3: Determination of the initial conditions and SDE (Stochastic Differential Equation) parameters that apply to the stochastic processes assumed in stage 2.

This stage involves the determination of the initial risk factor values, drifts and volatilities. The initial values are simply read off the current market. The drifts and volatilities are calculated by applying statistical techniques to the respective time series of historical data. This is because VaR is computed through a histogram that reflects real-world probabilities, which precludes the use of option-implied drifts and volatilities.

The yellow areas below show the entry of these data, which are hard-coded here. In a real-world application, these data would be the result of some regression analysis of historical data.

These data are then linked to the cells used as input to the formulas that create the respective Stoch Process objects described in stage 2.
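As a hedged illustration of the statistical estimation mentioned above, the snippet below estimates an annualized volatility and lognormal drift from a short series of daily closes. The price series is made up; a real calibration would use a much longer history and possibly more sophisticated regression techniques.

```python
import numpy as np

# Sketch: estimate annualized drift and volatility from a daily close series.
# The closes below are made-up placeholders, not real MSFT data.
closes = np.array([100.0, 101.2, 100.7, 102.3, 101.9, 103.0])
log_ret = np.diff(np.log(closes))        # daily log returns

trading_days = 252
sigma = log_ret.std(ddof=1) * np.sqrt(trading_days)   # annualized volatility
mu = log_ret.mean() * trading_days + 0.5 * sigma**2   # annualized lognormal drift
```

The sigma and mu computed this way are what would be linked into the Stoch Process input cells of stage 2.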

### Stage 4: Monte Carlo generation (simulation) of a large number N of possible combinations of values that are attained by the risk factors at the end of the specified time horizon.

In my example, where there exist a total of 3 risk factors, the Monte Carlo should generate N arrays, where each array should consist of 3 numbers. Each such array is called a sample.

The Deriscope Function that generates the Monte Carlo samples is called Simulated Values and is a static Function of the Deriscope Type Stoch Process. It returns an array consisting of N rows, where N is the number of requested samples. Since it returns an array, the respective ds formula must be entered as an array formula using CTRL-SHIFT-ENTER.

The image below shows the formula =ds(B11:C18;C20:C23;B25;B26:D28) pasted over the grey area, which is a range consisting of three columns and 100 rows, since in this case N = 100 according to cell C15.

The top row of the output array contains the first simulated sample, the values {100.0383562, 0.2, 0} for the risk factors Price, Vol and Spread respectively. These happen to be the values attained at the horizon time 0.005479452 (according to the specification in cell C13) in the absence of any stochastic noise. This peculiarity is a side effect of having set the random engine to Low Discrepancy in cell C17, which produces random numbers that follow a deterministic Sobol sequence. The value for Price is slightly higher than 100 due to the assumed stock price growth of 7%.

The remaining rows contain realizations of Price, Vol and Spread that correspond to stochastic "kicks" generated by the random engine. These "kicks" are correlated according to the input matrix in range B26:D28.
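The sample generation of this stage can be sketched in Python as follows. The correlation matrix and all process parameters are illustrative stand-ins for the spreadsheet inputs (the actual B26:D28 matrix is not reproduced here).

```python
import numpy as np

# Sketch of stage 4: generate N correlated samples of (Price, Vol, Spread)
# at the horizon dt. All numbers are illustrative placeholders.
rng = np.random.default_rng(42)
N, dt = 100, 2 / 365

corr = np.array([[1.0, 0.3, -0.2],      # stand-in for the B26:D28 matrix
                 [0.3, 1.0, 0.0],
                 [-0.2, 0.0, 1.0]])
L = np.linalg.cholesky(corr)
z = rng.standard_normal((N, 3)) @ L.T    # correlated standard normal kicks

s0, mu_s, sig_s = 100.0, 0.07, 0.25      # Price: lognormal diffusion
v0, mu_v, sig_v = 0.20, 0.0, 0.50        # Vol: lognormal diffusion
mu_r, sig_r = 0.0, 0.01                  # Spread: normal diffusion, starts at 0

price = s0 * np.exp((mu_s - 0.5 * sig_s**2) * dt + sig_s * np.sqrt(dt) * z[:, 0])
vol = v0 * np.exp((mu_v - 0.5 * sig_v**2) * dt + sig_v * np.sqrt(dt) * z[:, 1])
spread = mu_r * dt + sig_r * np.sqrt(dt) * z[:, 2]

samples = np.column_stack([price, vol, spread])   # N rows x 3 risk factors
```

Each row of `samples` plays the role of one row of the grey output range in the spreadsheet.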

### Digression: Monte Carlo generation of correlated, normally distributed random numbers

Before moving on to the next stage, it is a good idea to gain trust in the output of the Simulated Values Function by running a small example, in which I simulate three correlated random variables x1, x2 and x3, each following the standard normal distribution with mean 0 and standard deviation 1, i.e. with probability density (2π)^(-1/2) e^(-x²/2). In this simple case I know that the sample mean and standard deviation should converge to 0 and 1 respectively as the number of samples tends to infinity, and similarly for the pairwise correlations.

The following image shows the array output of the Simulated Values Function in that case. I have run a simulation that generates 100 samples with a 0.7 correlation for the pair (x1,x3) and a 0 correlation for the pairs (x1,x2) and (x2,x3).

The formulas at the middle/top are built-in Excel formulas that return the correlations, standard deviations and means of the numbers in the output array. Notice that they approach the corresponding input values, as they should.

The charts on the left also indicate that all three variables are, to a good approximation, normally distributed.
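The same cross-check can be reproduced in Python: draw correlated standard normals via a Cholesky factor of the target correlation matrix and verify that the sample moments approach the inputs. The seed and sample count are arbitrary choices.

```python
import numpy as np

# Cross-check of the digression: correlated N(0,1) triples with a 0.7
# correlation for (x1, x3) and zero for the other pairs.
rng = np.random.default_rng(7)
corr = np.array([[1.0, 0.0, 0.7],
                 [0.0, 1.0, 0.0],
                 [0.7, 0.0, 1.0]])
L = np.linalg.cholesky(corr)
x = rng.standard_normal((100_000, 3)) @ L.T   # correlated standard normals

means = x.mean(axis=0)                        # each close to 0
stds = x.std(axis=0, ddof=1)                  # each close to 1
rho_13 = np.corrcoef(x, rowvar=False)[0, 2]   # close to 0.7
```

With 100,000 samples the statistics land within a couple of hundredths of the targets, mirroring what the Excel formulas show for larger N.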

This next image shows the same example, but this time with 1,000 samples.

Finally, this one shows the case of 1,000 samples again, but with the Sobol random engine replaced with a Pseudo Random engine. The result explains why I have chosen a Sobol engine for the VaR calculation.

### Stage 5: Calculation of the current asset price.​

Here a pricing software such as Deriscope is indispensable. The fair price of the referenced asset must be calculated using a preferred set of models and the current market data. The resulting price must be copied and pasted as value in a safe cell because it will serve as the reference asset price later, in the calculation of the simulated loss.

The image below shows the pricing result of the Deriscope formula =ds(E3:F6) in cell E2 as 25.41655799.
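The article uses Deriscope for this pricing; as a self-contained stand-in, here is the textbook Black-Scholes price of a European call. The inputs are illustrative and do not reproduce the article's option or its quoted price.

```python
import math

# Textbook Black-Scholes price of a European call, used here only as a
# stand-in for the Deriscope pricing formula. Inputs are illustrative.
def bs_call(s, k, t, r, sigma):
    d1 = (math.log(s / k) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    N = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))  # standard normal CDF
    return s * N(d1) - k * math.exp(-r * t) * N(d2)

# Current price V0 for an at-the-money 2-year call
v0 = bs_call(s=100.0, k=100.0, t=2.0, r=0.02, sigma=0.20)
```

Whatever model produces it, this current price V0 is the value that must be stored in a safe cell for the loss calculation of stage 7.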

### Stage 6: Calculation of the asset price corresponding to each sample of risk factors generated in stage 4.

Now the asset price calculation must be repeated N times, which can easily last several minutes or even hours, depending on the complexity of the portfolio and, of course, on how big N is. The same model inputs must be used as in the previous stage. The market data, though, will differ: they must equal the market data expected to prevail at the horizon according to the simulated risk factors. This means the Market Set object &Mkt_E13:6.1 shown above cannot be reused. A new Market Set object must be built for each sample, from which a sample-specific asset price can then be calculated.

On another note, each price calculation must be relative to the future time specified by the VaR horizon. Since Deriscope always returns the price relative to the valuation date set in the wizard, I should first change that date to whatever date my VaR horizon is.

This is how the valuation date looks in my wizard right now:

and this is how it looks after I change the date to the next business day. The deep red color acts as a visual reminder of the unusual setting, whereby the valuation date differs from today:

The next image shows my previous table with the output simulated values, where I have added three columns with the respective market objects and one more column with the formula that returns the asset price with respect to these market objects:
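The per-sample revaluation loop can be sketched as below. The `bs_call` function is again a hedged stand-in for the Deriscope pricing formula, and the three hard-coded sample rows are placeholders for the simulated output of stage 4.

```python
import math
import numpy as np

# Sketch of stage 6: reprice the option once per simulated sample.
def bs_call(s, k, t, r, sigma):
    # Textbook Black-Scholes call, standing in for the Deriscope formula
    d1 = (math.log(s / k) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    N = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return s * N(d1) - k * math.exp(-r * t) * N(d2)

k, r0, dt = 100.0, 0.02, 2 / 365
t_left = 2.0 - dt                     # remaining maturity at the horizon

# Each row is one simulated (price, vol, spread) sample; placeholder values.
samples = np.array([[100.04, 0.20, 0.000],
                    [ 98.70, 0.21, 0.001],
                    [101.90, 0.19, -0.001]])

# Rebuild the sample-specific market (spot, vol, shifted rate) and reprice
prices = np.array([bs_call(s, k, t_left, r0 + spr, v)
                   for s, v, spr in samples])
```

The list comprehension plays the role of the extra spreadsheet columns: one fresh market per sample, one price per market.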

### Stage 7: Construction of the probability distribution of the asset loss realized at horizon as implied by the N future asset prices calculated in stage 6.​

This is a simple charting step and involves no modelling assumptions or any particular statistical expertise. The final result can be displayed as a histogram where the horizontal x axis shows the realized asset loss and the vertical y axis shows the observed sample frequency, i.e. the number of samples that give rise to the respective asset loss recorded on the x axis.

I only need an additional column that reports the loss Lᵢ for each sample i as the difference V₀ − Vᵢ, where V₀ is the current asset value in cell N11 and Vᵢ is the asset value corresponding to the ith sample, as calculated in column O.

Below you see the additional Loss column and the corresponding histogram chart.
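The loss column and the histogram's frequency counts amount to the following, where the current value and the simulated prices are illustrative placeholders.

```python
import numpy as np

# Sketch of stage 7: the loss column Li = V0 - Vi and its histogram counts.
# v0 and v_sim are placeholder values, not the article's actual results.
v0 = 13.10
v_sim = np.array([13.05, 12.80, 13.40, 12.95, 13.20, 12.60, 13.10, 13.30])
loss = v0 - v_sim                            # positive values are losses

counts, edges = np.histogram(loss, bins=4)   # frequencies for the chart
```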

### Stage 8: VaR equals the special point on the x axis with the following property: the samples that lie to its left constitute a% of N, where a% is the assumed confidence level, typically 99% or 95%.

To this end, I make use of the built-in Excel formula =PERCENTILE.INC(array,k), which returns the kth percentile of the probability distribution implied by the data in array.
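For reference, PERCENTILE.INC uses linear interpolation between the closest ranks, which matches numpy's default percentile method, so the same computation can be written as below on an illustrative loss array.

```python
import numpy as np

# Stage 8 equivalent of =PERCENTILE.INC(loss_range, 0.95):
# numpy's default (linear-interpolation) percentile matches PERCENTILE.INC.
loss = np.array([-0.30, -0.20, -0.10, 0.05, 0.10, 0.15, 0.30, 0.50])
var_95 = np.percentile(loss, 95)   # 95% of simulated losses lie below this value
# var_95 ≈ 0.43 for this illustrative array
```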

This final image shows the VaR result in cell L2 and the respective formula inside the formula bar: