\section{Basics on Brownian Motion}\label{sec:BBM}

Suppose that $G_k$, for $k\in\{1,\dots,N\}$, are iid random variables with $G_k\sim N(0,\sigma)$. Because these variables are independent, we have

$$ Z_N=\displaystyle\sum_{k=1}^N G_k \sim N\left(0,\sqrt{N}\sigma\right). $$
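Indeed, a sum of independent Gaussian variables is again Gaussian, its mean is the sum of the means and, by independence, its variance is the sum of the variances:

\begin{align*} E\left(Z_N\right) &= \sum_{k=1}^N E\left(G_k\right) = 0,\\ Var\left(Z_N\right) &= \sum_{k=1}^N Var\left(G_k\right) = N\sigma^2, \end{align*}

so the standard deviation of $Z_N$ is $\sqrt{N}\sigma$.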

For a reason that will become clear later, we set $Z_0=G_0=0$ and $\sigma=\sqrt{\Delta}$, where $\Delta$ is the time step between two consecutive elements of the sequence. This particular choice of $\sigma$ means that we are working with the so-called Standard Brownian Motion (see Definition \ref{def:SBM}). Supposing that $t\in(t_i,t_f)$, we then have

$$ \Delta =\displaystyle\frac{t_f-t_i}{N}. $$

When we add up noisy processes, we necessarily end up with more aggregated noise. When we look at our processes, we would like to have finer detail while controlling the overall level of noise (variance). By locking the per-step noise to the inverse of the number of points in the interval, we get as much detail as we want without increasing the global level of noise. You are encouraged to change the value of the variable nsteps below to see this idea in action. At the same time, the standard deviation of the process is always proportional to the square root of its duration.

In [1]:
# just load the libraries
using Plots
using Distributions
using StatPlots
In [3]:
nsteps = 1000                                  # number of points in the path
initial_t = 0 ;
final_t = 10 ;
x = linspace(initial_t,final_t,nsteps) ;       # the time grid
sigma = sqrt((final_t - initial_t)/nsteps)     # per-step standard deviation, sqrt(Delta)
y = sigma * randn(nsteps);                     # independent Gaussian increments G_k
y[1] = 0;                                      # so that Z_0 = 0
z = cumsum(y) ;                                # the Brownian path Z_n
plot(x,z,legend = false)
Out[3]:
[Figure: a sample path of the standard Brownian motion on the interval $[0,10]$.]
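As a quick numerical check of this scaling argument, the following sketch (reusing initial_t and final_t from the cell above) compares the spread of the final value for two very different resolutions; the sample size of 20000 paths is an arbitrary choice. In both cases the standard deviation stays close to $\sqrt{t_f-t_i}$.

In [ ]:
# sanity check: refining the grid does not increase the overall level of noise;
# the spread of the final value only depends on the length of the time interval
for n in (100, 10000)
    sigma_n = sqrt((final_t - initial_t)/n)                 # per-step standard deviation
    finals = [sum(sigma_n * randn(n)) for k in 1:20000]     # final values of 20000 paths
    println("nsteps = $n  =>  std of the final value = $(std(finals))")
end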

Because we want to create a correspondence between time and the ordering of the sequence $Z_n$, to each $t\in(t_i,t_f)$ we associate some $n\in\{1,\dots,N\}$ via $t = t_i + n\cdot\Delta$. We will switch between the two designations whenever one is more convenient than the other.
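For illustration only, this correspondence can be written as a pair of small helper functions; the names time_of_index and index_of_time are ours and are not used in the rest of the notebook.

In [ ]:
delta = (final_t - initial_t)/nsteps                  # the time step Delta
time_of_index(n) = initial_t + n * delta              # n in 0,...,N  ->  a time in [t_i, t_f]
index_of_time(t) = round(Int, (t - initial_t)/delta)  # nearest index for a given time t

println(time_of_index(250))    # 2.5 with initial_t = 0, final_t = 10, nsteps = 1000
println(index_of_time(2.5))    # 250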

A special remark must be made with respect to the structure of the process we have just created. From the global point of view, assuming that the time interval is $[0,T]$, at time $t=0$ the value of the process, call it $Z_0$, is not random. Indeed, we even decided to have $Z_0=0$. This just means that at the present time we have all the information we need and no randomness. If $Z_T$ denotes the value of the process at time $t=T$, then $Z_T$ is indeed a random variable. By the previous observations we know that $Z_T\sim N\left(0,\sqrt{N\Delta}\right)$, with $T=N\cdot\Delta$. Simple algebra then gives the conclusion

$$ Z_T\sim N\left(0,\sqrt{T}\right). $$

From the local point of view, between two consecutive points $Z_{t}$ and $Z_{t+\Delta}$, by definition, and for some $n$, $Z_{t+\Delta}-Z_{t}=G_n\sim N\left(0,\sqrt{\Delta}\right)$. So, locally, the increment $Z_{t+\Delta}-Z_{t}$ has the same structure as the global process $Z_T$. This is usually called a fractal structure. This type of process is so special that it deserves to be defined.

\begin{definition}[Standard Brownian Motion]\label{def:SBM} A stochastic process $Z_t$, defined for $t\in[0,T]$, such that \begin{enumerate} \item $Z_0=0$; \item $Z_t$ is continuous; \item all the increments of $Z_t$ are independent; \item for $0\leq s < t\leq T$, $Z_t-Z_s\sim N\left(0,\sqrt{t-s}\right)$, \end{enumerate} is called a (unidimensional) \textbf{Standard Brownian Motion} or \textbf{Wiener Process}. \end{definition}
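A small simulation can illustrate property 4 of the definition for a generic increment; the instants $s=2$ and $t=7$ and the number of paths below are arbitrary choices.

In [ ]:
# illustration of property 4: Z_t - Z_s ~ N(0, sqrt(t - s)) for 0 <= s < t <= T
s, t = 2.0, 7.0
n = 1000                                     # points per path
dt = (final_t - initial_t)/n
i_s, i_t = round(Int, s/dt), round(Int, t/dt)
incs = zeros(10000)
for i in 1:10000
    path = cumsum(sqrt(dt) * randn(n))       # one standard Brownian path
    incs[i] = path[i_t] - path[i_s]          # the increment Z_t - Z_s
end
println(std(incs))                           # should be close to sqrt(t - s)
println(sqrt(t - s))

The empirical standard deviation should be close to $\sqrt{t-s}$ no matter which pair $s<t$ we pick.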

In this case the proof of normality of $Z_T$ was quite straightforward. In most cases it is not so simple to know which distribution rules the values of the process far from the origin. In those cases we have no alternative but to use simulation in order to try to understand its behavior.

We will now run a basic simulation of the final values of the process $Z_T$ in order to verify empirically the theoretical result $Z_T\sim N\left(0,\sqrt{T}\right)$.

In [4]:
nsteps = 100;
npaths = 50000;

initial_t = 0 ;
final_t = 10 ;
sigma = sqrt((final_t - initial_t)/nsteps);

zvalues = zeros(npaths) ;

for i in 1:npaths
    tmpx = sigma * randn(nsteps)   # independent increments G_k
    tmpx[1] = 0                    # so that Z_0 = 0
    y = cumsum(tmpx)               # one Brownian path
    zvalues[i] = y[nsteps]         # keep only the final value Z_T
end

# now we just print the values of mean~0 and std~sqrt(final_t - initial_t)
println(mean(zvalues))
println(std(zvalues))
-0.0139939572928784
3.153991247547653

This observation becomes clearer if we make a density histogram of the final values of the trajectories and compare it with the density of the Gaussian $N\left(0,\sqrt{T}\right)$.

In [5]:
# using the PyPlot backend.
pyplot()
nbins = 50 ;

histogram(zvalues, bins = nbins, normed = true)   # density histogram of the final values
plot!(Normal(0,sigma * sqrt(nsteps)),lw=2)
Out[5]:

\section{Brownian Motion with a drift}\label{sec:BMD}

In this section we extend the concept of Brownian Motion (BM) introduced in the previous section a little further, by adding a drift to it. Conceptually, if $Z_t$ is a standard BM (so that $Z_t\sim N\left(0,\sqrt{t}\right)$ for each $t$), we consider the continuous stochastic process

\begin{equation}\label{eq:BBD.1} X_t = \mu t + \sigma Z_t. \end{equation}

It is immediate that for each $t\in(0,T)$ we have $X_t\sim N\left(\mu t,\sigma\sqrt{t}\right)$, since $E\left(X_t\right)=\mu t$ and $Var\left(X_t\right)=\sigma^2\,Var\left(Z_t\right)=\sigma^2 t$. The coefficients $\mu$ and $\sigma^2$ are called the \textbf{drift} and \textbf{diffusion} coefficients, respectively. Equation (\ref{eq:BBD.1}) states that $X_t$ is the solution of the stochastic differential equation (SDE)

\begin{equation}\label{eq:BBD.2} dX_t = \mu dt + \sigma dZ_t. \end{equation}

We are here in the realm of Itô's calculus. However, because we are mainly interested in the simulation of the stochastic processes, we skip most of the results of this wonderful branch of mathematics. This relation may be extended to the case when the drift and the diffusion depend on the variable $t$, i.e.,

\begin{equation}\label{eq:BBD.3} dX_t = \mu(t) dt + \sigma(t) dZ_t, \end{equation}

in which case the solution of the SDE (\ref{eq:BBD.3}) is given by

\begin{equation}\label{eq:BBD.4} X(t) = X(0) + \int_0^t \mu(s)ds +\int_0^t\sigma(s)dZ_s. \end{equation}

While the first integral in (\ref{eq:BBD.4}) is the \textit{usual} integral, the second is an Itô integral. The difference between the two lies in the fact that the quadratic variation of differentiable functions is zero while, because the BM is continuous but nowhere differentiable, the quadratic variation of the BM is equal to the length of the interval. The most relevant results for us are

\begin{align*} E\left(X_t-X_s\right) &= \int_s^t \mu(u)du,\\ Var\left(X_t-X_s\right) &= \int_s^t \sigma^2(u)du, \end{align*}

both being \textit{usual} integrals. See \cite{shreve2004} for a thorough treatment of the subject.
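The claim about quadratic variations can also be checked numerically: for a differentiable function the sum of squared increments vanishes as the partition is refined, while for a Brownian path it stabilizes at the length of the interval. The sketch below uses the interval $[0,10]$ and the smooth function $\sin(t)$ purely as examples.

In [ ]:
# quadratic variation over [0, T]: -> 0 for a smooth function, -> T for a Brownian path
T = 10
for n in (100, 10000, 1000000)
    dt = T/n
    dz = sqrt(dt) * randn(n)                         # increments of a standard BM
    df = [sin(j*dt) - sin((j-1)*dt) for j in 1:n]    # increments of the smooth function sin(t)
    println("n = $n : BM -> $(sum(dz.^2))   smooth -> $(sum(df.^2))")
end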

The next cell simulates a process with constant drift and constant diffusion coefficient.

In [6]:
npaths = 50000; # the number of paths
nsteps = 100;   # the number of points in each path

initial_t = 0 ;
final_t = 10 ;
stdsigma = sqrt((final_t - initial_t)/nsteps); # this is the sigma to SBM

# constant drift and constant diffusion coefficient
drift = 3
vol = 2

# the place where we will put the final values of the paths
zvalues = zeros(npaths) ; 

for i in 1:npaths
    tmpx = stdsigma * randn(nsteps)
    tmpx[1] = 0 
    y = drift * linspace(initial_t,final_t,nsteps) + vol * cumsum(tmpx)
    zvalues[i] = y[nsteps]
end

# now we just print the empirical values and the theoretical values
println("The mean value of the process is $(mean(zvalues))")
println("The theoretical value for the mean is $(drift*(final_t - initial_t))")
println("The standard deviation of the proces is $(std(zvalues))")
println("The theoretical value for std is $(vol * sqrt(final_t - initial_t))")
The mean value of the process is 30.013226397783022
The theoretical value for the mean is 30
The standard deviation of the process is 6.274037887706898
The theoretical value for std is 6.324555320336759

It may seem not very difficult to change the previous code in order to run the same simulation with drift and diffusion coefficient both dependent on time. However, that is not the case. The deep reason for this is rooted, again, in the most fundamental principles of Itô's calculus.

Looking back at the case where both the drift and the diffusion coefficient were constant, the discretization of equation (\ref{eq:BBD.2}) is

\begin{equation}\label{eq:BMD.10} X_{t_{i+1}} = X_{t_{i}} + \mu\,\Delta + \sigma\left(Z_{t_{i+1}}-Z_{t_{i}}\right). \end{equation}

Because $Z_{t_{i+1}}-Z_{t_{i}}=G_{t_{i+1}}$ and because $\mu$ and $\sigma$ are both constant, the cumulative sum of the increments defined by equation (\ref{eq:BMD.10}) leads to the solution

$$ X_n = X_0 + \mu\cdot(n\Delta) + \sigma Z_n. $$

When $X_0=0$, this is exactly what the previous code computes. In the case when both the drift and the diffusion depend on time (in fact, the substantial changes happen when the diffusion is a function of time), the solution of the differential equation is more delicate, because its closed form uses the Itô integral of the diffusion

\begin{equation}\label{eq:BMD.20} X_t = X_0 + \int_0^t \mu(s)ds + \int_0^t\sigma(s)\,dZ_s. \end{equation}

The discretization of the interval $(0,T)$ gives a similar expression over each subinterval

\begin{equation}\label{eq:BMD.30} X_{t_{i+1}} = X_{t_{i}} + \int_{t_i}^{t_{i+1}} \mu(s)ds + \int_{t_i}^{t_{i+1}}\sigma(s)\,dZ_s. \end{equation}

Under the scope of our study it is enough to know that $$ \int_{t_i}^{t_{i+1}}\sigma(s)\,dZ_s = \sqrt{\int_{t_i}^{t_{i+1}}\sigma(s)^2\,ds}\cdot G_{i+1}^S, $$

where $G_{i+1}^S$ is a standard normal variable. Consequently, the solution in (\ref{eq:BMD.30}) at $t=T$ is approximated by the cumulative sum of the increments

$$ X_{t_{i+1}} = X_{t_{i}} + \mu(t_{i+1})\Delta + \sigma(t_{i+1})\sqrt{\Delta} G_{i+1}^S. $$

This shows that there is indeed a substantial difference between the constant-diffusion case and the time-dependent one. This is the reasoning coded below.

In [7]:
npaths = 50000; # the number of paths
nsteps = 100;   # the number of points in each path

initial_t = 0 ;
final_t = 10 ;
delta = (final_t - initial_t)/nsteps
stdsigma = sqrt(delta); # this is the sigma to SBM

# nonconstant drift and nonconstant diffusion coefficient
driftfunc(t) = t^2
volfunc(t) = sqrt(t)
volsqfunc(t) = volfunc(t)^2

# compute the integrals to find the theoretical mean value and 
# variance of our model
qdrift = quadgk(driftfunc, initial_t, final_t)
qvol = quadgk(volsqfunc, initial_t , final_t)

time_array = linspace(initial_t,final_t,nsteps)

drift = map(driftfunc,time_array)
vol = map(volfunc,time_array)

# the place where we will put the final values of the paths
zvalues = zeros(npaths) ; 

for i in 1:npaths
    tmpx = randn(nsteps)                              # standard normal draws G_i^S
    y = drift * delta + sqrt(delta) * vol .* tmpx     # mu(t_i)*Delta + sigma(t_i)*sqrt(Delta)*G_i^S
    y[1] = 0                                          # X_0 = 0
    zvalues[i] = cumsum(y)[nsteps]                    # keep only the final value X_T
end

# now we just print the empirical values and the theoretical values
println("The empirical mean value of the process is $(mean(zvalues))")
println("The theoretical value for the mean is $(qdrift[1])")
println("The empirical variance of the proces is $(var(zvalues))")
println("The theoretical value for std is $(qvol[1])")
The empirical mean value of the process is 335.0494003668029
The theoretical value for the mean is 333.33333333333326
The empirical variance of the proces is 50.05807529238948
The theoretical value for std is 50.0
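Finally, the identity used above for the stochastic integral over a single subinterval can itself be checked by brute force: we approximate $\int_{t_i}^{t_{i+1}}\sigma(s)\,dZ_s$ with a very fine partition of the subinterval and compare the empirical standard deviation with $\sqrt{\int_{t_i}^{t_{i+1}}\sigma(s)^2\,ds}$. The subinterval $[4.0,4.1]$ and the sample sizes below are arbitrary choices; volfunc, volsqfunc and quadgk are the ones used in the previous cell.

In [ ]:
# brute-force check: int_{t_i}^{t_{i+1}} sigma(s) dZ_s has standard deviation
# sqrt( int_{t_i}^{t_{i+1}} sigma(s)^2 ds )
t_lo, t_hi = 4.0, 4.1
m = 1000                                     # fine sub-partition of the subinterval
h = (t_hi - t_lo)/m
grid = [t_lo + h*j for j in 0:m-1]           # left endpoints of the sub-partition
samples = zeros(20000)
for j in 1:20000
    dz = sqrt(h) * randn(m)                  # increments of the driving BM
    samples[j] = sum(volfunc.(grid) .* dz)   # Riemann-Ito sum for the stochastic integral
end
println("empirical std   = $(std(samples))")
println("theoretical std = $(sqrt(quadgk(volsqfunc, t_lo, t_hi)[1]))")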

\section*{References}

(Shreve, 2004) Steven E. Shreve, ``Stochastic Calculus for Finance II: Continuous-Time Models'', Springer, 2004.

(Cont, 2001) Rama Cont, ``Empirical properties of asset returns: stylized facts and statistical issues'', Quantitative Finance, vol. 1, no. 2, pp. 223--236, 2001.