Basic Definitions and Notations.
A measurable space \((F,{\cal F})\) is separable when \({\cal F}\) is generated by a countable collection of sets. In particular, if \(F\) is a locally compact space with countable basis, the \(\sigma\)-field of its Borel sets is separable; it will often be denoted by \({\cal B}(F)\). For instance, \({\cal B}(\mathbb{R}^{n})\) is the \(\sigma\)-field of Borel subsets of the \(n\)-dimensional Euclidean space.
Example. The following sequence of simple functions approximates a random variable \(X\).
\[X_{n}(\omega )=\sum_{k=-n\cdot 2^{n}}^{n\cdot 2^{n}-1}\frac{k}{2^{n}}\cdot
I_{[\frac{k}{2^{n}},\frac{k+1}{2^{n}})}(X(\omega )).\]
These variables are constructed by taking the interval \([-n,n]\) and dividing it into \(n\cdot 2^{n+1}\) equal subintervals. On the set where the values of \(X\) belong to the interval \([\frac{k}{2^{n}},\frac{k+1}{2^{n}})\), \(X\) is replaced by \(\frac{k}{2^{n}}\), its smallest value on that set. Note that all the sets \(\left\{\frac{k}{2^{n}}\leq X<\frac{k+1}{2^{n}}\right\}\) are measurable, since \(X\) is a random variable. It is easy to see that the \(X_{n}\) are nondecreasing, therefore converge to a limit, and for all \(\omega\), \(X(\omega )=\lim_{n\rightarrow\infty}X_{n}(\omega )\). \(\sharp\)
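As a numerical illustration of this dyadic scheme (the function `dyadic_approx` and the sample values are our own, not from the text): the \(n\)-th approximation rounds a value in \([-n,n)\) down to the grid of width \(2^{-n}\) and sends everything outside that interval to \(0\).

```python
import math

def dyadic_approx(x, n):
    """Value at x of the n-th dyadic simple function: round x down to the
    grid of width 2**-n on [-n, n); zero outside that interval."""
    if -n <= x < n:
        return math.floor(x * 2 ** n) / 2 ** n
    return 0.0

# On [0, n) the approximations are nondecreasing in n and within 2**-n of x.
for x in (0.7, 2.25, math.pi):
    vals = [dyadic_approx(x, n) for n in range(4, 12)]
    assert all(a <= b for a, b in zip(vals, vals[1:]))
    assert abs(vals[-1] - x) < 2 ** -11
```

For negative \(x\) the truncation to \([-n,n)\) means monotonicity only sets in once \(n\) exceeds \(|x|\), which is why the text's monotone convergence is cleanest for nonnegative \(X\).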
Let \((F,{\cal F})\) be a measurable space. Any \({\cal F}\)-measurable function is a pointwise limit of a sequence of \({\cal F}\)-measurable simple functions. For example, an \({\cal F}\)-measurable function \(f:F\rightarrow \mathbb{R}\) is the pointwise limit of the sequence \(\{f_{n}\}\) of \({\cal F}\)-measurable simple functions defined by (see Chung and Williams \cite[p.3]{chu})
\[f_{n}=\sum_{k=0}^{n\cdot 2^{n}}\frac{k}{2^{n}}\cdot I_{\{k\cdot 2^{-n}\leq f<(k+1)\cdot 2^{-n}\}}+
\sum_{k=-n\cdot 2^{n}}^{-1}\frac{k+1}{2^{n}}\cdot I_{\{k\cdot 2^{-n}\leq f<(k+1)\cdot 2^{-n}\}}\]
satisfying \(|f_{n}|\uparrow |f|\).
Stochastic Processes.
An important measure is the Borel measure \(\mu_{B}\) on the \(\sigma\)-field \({\cal B}\) of Borel subsets of \(\mathbb{R}\), which assigns to each finite interval \([a,b]\) its length, i.e., \(\mu_{B}([a,b])=b-a\). We see that \(\mu_{B}\) is not a finite measure. However, it is \(\sigma\)-finite, since we can write \(\mathbb{R}=\bigcup_{n=1}^{\infty}[-n,n]\) with \(\mu_{B}([-n,n])=2n<\infty\) for each \(n=1,2,\cdots\). The measure space \((\mathbb{R},{\cal B},\mu_{B})\) is not complete in the sense that there exist subsets \(B^{*}\) of \(\mathbb{R}\) with \(B^{*}\not\in {\cal B}\) but \(B^{*}\subset B\) for some \(B\in {\cal B}\) with \(\mu_{B}(B)=0\). Intuitively we would expect such subsets to have measure zero too. A standard procedure of measure theory allows us to enlarge the \(\sigma\)-field \({\cal B}\) to a \(\sigma\)-field \({\cal E}\) and to extend the measure \(\mu_{B}\) uniquely to a measure \(\mu\) on \({\cal E}\) so that \((\mathbb{R},{\cal E},\mu )\) is complete, that is, \(E^{*}\in {\cal E}\) with \(\mu (E^{*})=0\) whenever \(E^{*}\subset E\) for some \(E\in {\cal E}\) with \(\mu (E)=0\). In particular, we have \(\mu (B)=\mu_{B}(B)\) for each \(B\in {\cal B}\), whereas for each \(E\in {\cal E}\setminus {\cal B}\) there exist a \(B\in {\cal B}\) and an \(E^{*}\in {\cal E}\) so that \(E=B\cup E^{*}\) with \(\mu(E)=\mu_{B}(B)\) and \(\mu (E^{*})=0\). We call \({\cal E}\) the \(\sigma\)-field of Lebesgue subsets of \(\mathbb{R}\) and \(\mu\) the Lebesgue measure on \(\mathbb{R}\).
A random variable \(X\) induces a probability distribution \(\mathbb{P}_{X}\) on \((\mathbb{R},{\cal B})\) defined by \(\mathbb{P}_{X}(B)=\mathbb{P}(X^{-1}(B))\) for all \(B\in {\cal B}\). \(\mathbb{P}_{X}\) is a probability measure, but the resulting probability space \((\mathbb{R},{\cal B},\mathbb{P}_{X})\) is not complete. However, we can complete it to \((\mathbb{R},{\cal E},\mathbb{P}_{X})\) by extending \(\mathbb{P}_{X}\) to the Lebesgue subsets of \(\mathbb{R}\), where for convenience we use the same symbol \(\mathbb{P}_{X}\) for the extended measure. One advantage of working with this completed probability space is that every subset of a null event is itself a null event. Thus if \(X\) is a (Lebesgue measurable) random variable and if \(Y(\omega)=X(\omega )\) a.s., then \(Y\) is also a random variable, and \(Y\) is essentially indistinguishable from \(X\). Another advantage is that \(\mathbb{P}_{X}\) and the corresponding distribution function \(F_{X}\), defined by \(F_{X}(x)=\mathbb{P}_{X}((-\infty ,x])\) for all \(x\in \mathbb{R}\), can be related to the Lebesgue measure \(\mu\) on \((\mathbb{R},{\cal E})\). For example, when we have an absolutely continuous probability measure \(\mathbb{P}_{X}\ll\mu\), then there is a density function \(f(x)\) and the integral relationship
\[F_{X}(x)=\int_{-\infty}^{x}f(s)ds\]
holds.
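A small numerical check of this relationship (our own example, using the exponential density \(f(s)=e^{-s}\) on \([0,\infty )\), for which \(F_{X}(x)=1-e^{-x}\); the function name `cdf_from_density` is ours):

```python
import math

def cdf_from_density(f, x, lower=0.0, steps=100_000):
    """Approximate F_X(x) = integral of the density f from `lower` to x
    by the midpoint rule."""
    if x <= lower:
        return 0.0
    h = (x - lower) / steps
    return h * sum(f(lower + (i + 0.5) * h) for i in range(steps))

density = lambda s: math.exp(-s)  # exponential(1) density, for illustration
assert abs(cdf_from_density(density, 1.0) - (1 - math.exp(-1))) < 1e-8
```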
Often the \(\sigma\)-field \({\cal F}\) of events in a probability space \((\Omega ,{\cal F},\mathbb{P})\) contains much more information than is detectable by looking at the values taken by a particular random variable \(X\). For this reason, it is often useful to consider the \(\sigma\)-field \({\cal F}(X)\) generated by the subsets of the form \(X^{-1}(B)\) for all \(B\in {\cal B}\). This is the coarsest \(\sigma\)-field consistent with the measurability of \(X\), or equivalently the minimal one containing the events detectable by \(X\). In general, if \(\Omega\) is a set and \(f_{i}\) for \(i\in I\) is a collection of functions from \(\Omega\) to measurable spaces \((F_{i},{\cal F}_{i})\), the smallest \(\sigma\)-field on \(\Omega\) for which the functions \(f_{i}\) are measurable is denoted by \(\sigma (f_{i},i\in I)\). Let \(\Lambda\) be a collection of subsets of \(\Omega\). We denote by \(\sigma (\Lambda )\) the smallest \(\sigma\)-field containing \(\Lambda\), and say that \(\sigma (\Lambda )\) is generated by \(\Lambda\). Therefore, the \(\sigma\)-field \(\sigma (f_{i},i\in I)\) is generated by the family
\[\Lambda =\{f_{i}^{-1}(A_{i}):A_{i}\in {\cal F}_{i}\mbox{ for }i\in I\}.\]
Finally if \({\cal F}_{i}\) for \(i\in I\) is a family of \(\sigma\)-fields on \(\Omega\), we denote by \(\bigvee_{i}{\cal F}_{i}\) the \(\sigma\)-field generated by \(\bigcup_{i}{\cal F}_{i}\).
We now consider product \(\sigma\)-fields and product measures. Let \((\Omega_{1},{\cal F}_{1},\mu_{1})\) and \((\Omega_{2},{\cal F}_{2},\mu_{2})\) be two measure spaces. Then, we define the product \(\sigma\)-field \({\cal F}_{1}\times {\cal F}_{2}\) on \(\Omega_{1}\times\Omega_{2}\) to be the \(\sigma\)-field generated by the subsets \(A_{1}\times A_{2}\) for all \(A_{1}\in {\cal F}_{1}\) and \(A_{2}\in {\cal F}_{2}\). The product measure \(\mu_{1}\times\mu_{2}\) is the extension to a measure on \({\cal F}_{1}\times {\cal F}_{2}\) of the set function satisfying
\[\mu_{1}\times\mu_{2}(A_{1}\times A_{2})=\mu_{1}(A_{1})\mu_{2}(A_{2})\mbox{ for any }A_{1}\in {\cal F}_{1}\mbox{ and } A_{2}\in {\cal F}_{2}.\]
In the case that the measurable spaces coincide, we write \(\Omega^{2}\) for \(\Omega\times\Omega\) and \({\cal F}^{2}\) for \({\cal F}\times {\cal F}\). When \(\Omega =\mathbb{R}\) and \({\cal F}={\cal E}\), we obtain the \(\sigma\)-field \({\cal E}^{2}\) of two-dimensional Lebesgue subsets and call the product measure \(\mu\times\mu\) the two-dimensional Lebesgue measure. Analogous definitions apply for the \(n\)-dimensional products \(\mathbb{R}^{n}\) and \({\cal E}^{n}\).
A stochastic process is a mathematical model for the occurrence of a random phenomenon. The randomness is captured by the introduction of a measurable space \((\Omega ,{\cal F})\), called the sample space, on which probability measures can be placed. Formally, a stochastic process is a collection of random variables \(\{X_{t}\}_{0\leq t<\infty}\) on \((\Omega ,{\cal F})\), which take values in a second measurable space \((S,{\cal S})\) called the state space. For a fixed sample point \(\omega\in\Omega\), the function \(t\mapsto X_{t}(\omega )\) for \(t\geq 0\), is the sample path (realization, trajectory) of the process \(X\) associated with \(\omega\). It provides the mathematical model for a random experiment whose outcome can be observed continuously in time.
Definition. Let \(\{X_{t}\}_{t\geq 0}\) be a \((\mathbb{R}^{d},{\cal B}(\mathbb{R}^{d}))\)-valued stochastic process. The stochastic process \(\{X_{t}\}_{t\geq 0}\) is called measurable when, for every \(A\in {\cal B}(\mathbb{R}^{d})\), the set \(\{(t,\omega ): X_{t}(\omega )\in A\}\) belongs to the product \(\sigma\)-field \({\cal B}([0,\infty ))\times {\cal F}\); that is, the mapping
\[(t,\omega )\mapsto X_{t}(\omega ):([0,\infty )\times\Omega,
{\cal B}([0,\infty ))\times {\cal F})\rightarrow (\mathbb{R}^{d},{\cal B}(\mathbb{R}^{d}))\]
is measurable. \(\sharp\)
It is an immediate consequence of Fubini’s theorem that the sample paths of such a process are Borel-measurable functions of \(t\in [0,\infty )\).
Definition. For a given index set \(T\) there may exist distinct stochastic processes \(\{X_{t}\}_{t\in T}\) and \(\{Y_{t}\}_{t\in T}\) with the same finite-dimensional distribution functions. We say that such processes are equivalent and that one is a version of the other. \(\sharp\)
A very broad class of stochastic processes, called separable processes, was introduced by Doob in order to circumvent difficulties arising from the uncountability of the time interval \(T\). A stochastic process \(\{X_{t}\}_{t\in T}\) defined on a complete probability space \((\Omega ,{\cal F},\mathbb{P})\) is called a separable process when there is a countable dense subset \(S=\{s_{1},s_{2},\cdots\}\) of \(T\) such that for any open interval \(I_{o}\) and any closed interval \(I_{c}\) the subset
\[\hat{A}=\bigcup_{t\in T\cap I_{o}}\{\omega\in\Omega :X_{t}(\omega )\in I_{c}\}\]
of \(\Omega\) differs from the event
\[A=\bigcup_{s_{j}\in S\cap I_{o}}\{\omega\in\Omega :X_{s_{j}}(\omega )\in I_{c}\}\]
by a subset of a null event. By the completeness of the probability space, the set \(\hat{A}\) is itself an event and \(\mathbb{P}(\hat{A})=\mathbb{P}(A)\). Consequently, for a separable process \(\{X_{t}\}_{t\in T}\), sets like \(\{\omega\in\Omega :X_{t}(\omega )\geq 0\mbox{ for all }t\in T\}\) are events and the functions defined, a.s., by
\[\inf_{s\leq t}X_{s}(\omega ),\sup_{s\leq t}X_{s}(\omega ),\lim_{s\rightarrow t}X_{s}(\omega )\]
are random variables when they exist.
Loève showed that for any stochastic process \(\{X_{t}\}_{t\in T}\) there is always an equivalent process \(\{\tilde{X}_{t}\}_{t\in T}\) defined on the same probability space which is separable, although his proof required the \(\tilde{X}_{t}\) to be extended random variables, that is, possibly taking the values \(\pm\infty\). When the process \(\{X_{t}\}_{t\in T}\) is continuous in probability at each \(t\in T\), matters are much nicer: there is an equivalent separable process \(\{\tilde{X}_{t}\}_{t\in T}\) which is jointly measurable, that is, the function \(\tilde{X}:T\times\Omega \rightarrow \mathbb{R}\) is measurable on the product measure space \((T\times\Omega ,{\cal E}\times {\cal F},\mu\times\mathbb{P})\).
Monotone Class Theorem.
A fundamental theorem of measure theory that we will need from time to time is known as the Monotone Class Theorem. Actually there are several such theorems.
Proposition. Let \(\Lambda\) be a collection of subsets of \(\Omega\) satisfying the following conditions:
- \(\Omega\in\Lambda\);
- if \(A,B\in\Lambda\) and \(A\subset B\), then \(B\setminus A\in \Lambda\);
- if \(\{A_{n}\}_{n=1}^{\infty}\) is an increasing sequence of elements of \(\Lambda\), then \(\bigcup_{n}A_{n}\in\Lambda\).
If \(\Gamma\subset\Lambda\), where \(\Gamma\) is closed under finite intersections, then \(\sigma (\Gamma )\subset\Lambda\). \(\sharp\)
The above version deals with sets. We turn to the functional version.
\begin{equation}{\label{prod1}}\tag{1}\mbox{}\end{equation}
Definition \ref{prod1}. A monotone vector space \({\cal H}\) on a space \(\Omega\) is defined to be a collection of bounded, real-valued functions \(f\) on \(\Omega\) satisfying the following conditions:
(i) \({\cal H}\) is a vector space over \(\mathbb{R}\);
(ii) \(I_{\Omega}\in {\cal H}\), i.e., constant functions are in \({\cal H}\);
(iii) if \(\{f_{n}\}_{n\geq 1}\subseteq {\cal H}\), \(0\leq f_{1}\leq f_{2}\leq\cdots\leq f_{n}\leq\cdots\), \(\lim_{n\rightarrow\infty}f_{n}=f\) and \(f\) is bounded, then \(f\in {\cal H}\). \(\sharp\)
Definition. A collection \({\cal M}\) of real-valued functions on a space \(\Omega\) is said to be multiplicative when \(f,g\in {\cal M}\) implies \(fg\in {\cal M}\). \(\sharp\)
Theorem (Monotone Class Theorem). Let \({\cal M}\) be a multiplicative class of bounded real-valued functions
defined on a space \(\Omega\). If \({\cal H}\) is a monotone vector space containing \({\cal M}\), then \({\cal H}\) contains all bounded, \(\sigma ({\cal M})\)-measurable functions. \(\sharp\)
We have a family \(f_{i}\), \(i\in I\), of functions from \(\Omega\) into measurable spaces \((F_{i},{\cal F}_{i})\). We assume that for each \(i\in I\) there is a subclass \({\cal N}_{i}\) of \({\cal F}_{i}\), closed under finite intersections and such that \(\sigma ({\cal N}_{i})={\cal F}_{i}\). We then have the following results.
Proposition. Let \({\cal N}\) be the family of sets of the form \(\bigcap_{i\in J} f_{i}^{-1}(A_{i})\), where \(A_{i}\) ranges through \({\cal N}_{i}\) and \(J\) ranges through the finite subsets of \(I\). Then \(\sigma ({\cal N})=\sigma (f_{i},i\in I)\). \(\sharp\)
Proposition. Let \({\cal H}\) be a vector space of real-valued functions on \(\Omega\), containing \(I_{\Omega}\), satisfying property (iii) of Definition \ref{prod1} and containing all the functions \(I_{A}\) for \(A\in {\cal N}\). Then \({\cal H}\) contains all the bounded, real-valued, \(\sigma (f_{i},i\in I)\)-measurable functions. \(\sharp\)
Completion.
If \((\Omega ,{\cal F})\) is a measurable space and \(\mathbb{P}\) a probability measure on \({\cal F}\), the completion \({\cal F}^{\mathbb{P}}\) of \({\cal F}\) with respect to \(\mathbb{P}\) is the \(\sigma\)-field of subsets \(B\) of \(\Omega\) such that there exist \(B_{1}\) and \(B_{2}\) in \({\cal F}\) with \(B_{1}\subset B\subset B_{2}\) and \(\mathbb{P}(B_{2}\setminus B_{1})=0\). If \({\cal P}\) is a family of probability measures on \({\cal F}\), the \(\sigma\)-field
\[{\cal F}^{{\cal P}}=\bigcap_{\mathbb{P}\in {\cal P}}{\cal F}^{\mathbb{P}}\]
is called the completion of \({\cal F}\) with respect to \({\cal P}\). If \({\cal P}\) is the family of all probability measures on \({\cal F}\), then \({\cal F}^{{\cal P}}\) is denoted by \({\cal F}^{*}\) and is called the \(\sigma\)-field of universally measurable sets.
Filtrations and Stopping Times.
Let \((\Omega ,{\cal F},P)\) be a probability space. We wish to model a situation involving randomness unfolding with time. First we consider the discrete-time case. We suppose, for simplicity, that information is never lost or forgotten: thus, as time increases we learn more. We recall that \(\sigma\)-fields represent information or knowledge. We thus need a sequence of \(\sigma\)-fields \(\{{\cal F}_{n}\}\) which are increasing, \({\cal F}_{n}\subseteq {\cal F}_{n+1}\) for \(n=0,1,2,\cdots\), with \({\cal F}_{n}\) representing the information or knowledge available to us at time \(n\). We shall always suppose all \(\sigma\)-fields to be complete. Thus \({\cal F}_{0}\) represents the initial information (if there is none, \({\cal F}_{0}=\{\emptyset ,\Omega\}\), the trivial \(\sigma\)-field). On the other hand,
\[{\cal F}_{\infty}\equiv\bigvee_{n=1}^{\infty}{\cal F}_{n}=\sigma\left (\bigcup_{n=1}^{\infty}{\cal F}_{n}\right )\]
represents all we ever will know. Often \({\cal F}_{\infty}\) will be \({\cal F}\). Such a family \(\{{\cal F}_{n}\}\) is called a filtration. A probability space endowed with such a filtration, \((\Omega ,{\cal F}, \{{\cal F}_{n}\},P)\) is called a filtered probability space.
A random variable \(T\) taking values in \(\{0,1,2,\cdots\}\) is called a stopping time (or optional time) if
\[\{T\leq n\}=\{\omega :T(\omega )\leq n\}\in {\cal F}_{n}\mbox{ for all }n=0,1,2,\cdots .\]
Equivalently,
\[\{T=n\}\in {\cal F}_{n}\mbox{ for all }n,\mbox{ or }\{T\geq n+1\}\in {\cal F}_{n}\mbox{ for all }n.\]
Think of \(T\) as a time at which you decide to quit a gambling game: whether or not you quit at time \(n\) depends only on the history up to and including time \(n\), not on the future. Thus stopping times model gambling and other situations in which there is no foreknowledge or prescience of the future; in particular, in a financial context, no insider trading.
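A discrete-time sketch of this idea (the random walk and the helper `first_hit` are our own illustration, not from the text): the first time a gambler's fortune reaches a level is a stopping time, because whether \(\{T\leq n\}\) has occurred is decided by the first \(n\) steps alone.

```python
import random

def first_hit(partial_sums, level):
    """First index n with S_n == level; None if the level is never reached."""
    for n, s in enumerate(partial_sums):
        if s == level:
            return n
    return None

random.seed(0)
S = [0]
for _ in range(200):                       # a simple +/-1 random walk
    S.append(S[-1] + random.choice([-1, 1]))

T = first_hit(S, 5)
# Stopping-time property: whether {T <= n} occurred is decided by S[0..n]
# alone -- the hitting time computed from the truncated path agrees.
for n in range(len(S)):
    truncated = first_hit(S[: n + 1], 5)
    assert (truncated is not None) == (T is not None and T <= n)
```

By contrast, "the last time the walk visits 5" cannot be decided from the history alone and is not a stopping time.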
Now we consider the continuous-time case. Let \((\Omega ,{\cal F},P)\) be a complete probability space. A filtration \(\{{\cal F}_{t}\}_{t\geq 0}\) is a nondecreasing family of sub-\(\sigma\)-fields of \({\cal F}\), i.e., \({\cal F}_{s}\subseteq {\cal F}_{t}\subseteq {\cal F}\) for \(0\leq s<t<\infty\). Here \({\cal F}_{t}\) represents the information available at time \(t\), and the filtration \(\{{\cal F}_{t}\}_{t\geq 0}\) represents the information flow evolving with time. A measurable space \((\Omega ,{\cal F})\) endowed with a filtration \(\{{\cal F}_{t}\}_{t\geq 0}\) is said to be a filtered space.
With a filtration \(\{{\cal F}_{t}\}_{t\geq 0}\), one can associate two other filtrations by setting
\[{\cal F}_{t-}=\bigvee_{s<t}{\cal F}_{s}=\sigma\left (\bigcup_{s<t}{\cal F}_{s}\right )\mbox{ and }
{\cal F}_{t+}=\bigcap_{s>t}{\cal F}_{s}.\]
We have
\[\bigvee_{t\geq 0}{\cal F}_{t-}=\bigvee_{t\geq 0}{\cal F}_{t}=\bigvee_{t\geq 0}{\cal F}_{t+}\]
and this \(\sigma\)-field will be denoted by \({\cal F}_{\infty}\). The \(\sigma\)-field \({\cal F}_{\infty}\) is also defined by \({\cal F}_{\infty}=\sigma (\bigcup_{t\geq 0}{\cal F}_{t})\) in Karatzas and Shreve \cite{kar}. The \(\sigma\)-field \({\cal F}_{0-}\) is not defined by the above and, by convention, we put \({\cal F}_{0-}={\cal F}_{0}\). We always have \({\cal F}_{t-}\subseteq {\cal F}_{t}\subseteq {\cal F}_{t+}\). We assume that the filtered probability space \((\Omega ,{\cal F},P,\{{\cal F}_{t}\}_{t\geq 0})\) satisfies the usual conditions:
- \({\cal F}_{0}\) contains all the \(P\)-null sets of \({\cal F}\);
- \(\{{\cal F}_{t}\}_{t\geq 0}\) is right-continuous, i.e., \({\cal F}_{t}={\cal F}_{t+}\).
The filtration \(\{{\cal F}_{t}\}_{t\geq 0}\) is also called the standard filtration.
Definition. A random variable \(T:\Omega\rightarrow [0,\infty ]\) is a stopping time when the event \(\{T\leq t\}\in {\cal F}_{t}\) for all \(0\leq t<\infty\). \(\sharp\)
\begin{equation}{\label{prot11}}\tag{2}\mbox{}\end{equation}
Proposition \ref{prot11}. Suppose the filtration is right-continuous. Then \(T\) is a stopping time if and only if \(\{T<t\}\in {\cal F}_{t}\) for all \(0\leq t<\infty\).
Proof. (Method 1). Since \(\{T\leq t\}=\bigcap_{u>t}\{T<u\}\), we have \(\{T\leq t\}\in\bigcap_{u>t}{\cal F}_{u}={\cal F}_{t+}={\cal F}_{t}\) by the right continuity, so \(T\) is a stopping time. For the converse, \(\{T<t\}=\bigcup_{0<\epsilon <t}\{T\leq t-\epsilon\}\), and \(\{T\leq t-\epsilon\}\in {\cal F}_{t-\epsilon}\subseteq {\cal F}_{t}\) since \(T\) is a stopping time; hence \(\{T<t\}\in {\cal F}_{t}\).
(Method 2). If \(T<t\), then for some \(n\), \(T\leq t-1/n\). Thus \(\{T<t\}=\bigcup_{n=1}^{\infty}\{T\leq t-1/n\}\). The event \(\{T\leq t-1/n\}\in {\cal F}_{t-1/n}\), and since the \({\cal F}_{t}\) are increasing, \(\{T\leq t-1/n\}\in {\cal F}_{t}\); therefore \(\{T<t\}\in {\cal F}_{t}\). For the converse, \(\{T\leq t\}=\bigcap_{n=1}^{\infty}\{T<t+1/n\}\). Since \(\{T<t+1/n\}\in {\cal F}_{t+1/n}\), by right-continuity of the filtration \(\{T\leq t\}\in {\cal F}_{t}\). \(\blacksquare\)
Remark. In Karatzas and Shreve \cite{kar}, \(T\) is called an optional time when \(\{T<t\}\in {\cal F}_{t}\) for every \(t\geq 0\). Then every stopping time is optional, and the two concepts coincide when the filtration is right-continuous. \(\sharp\)
A stochastic process \(\{X_{t}\}_{t\geq 0}\) is adapted to \(\{{\cal F}_{t}\}_{t\geq 0}\) when \(X_{t}\) is \({\cal F}_{t}\)-measurable for each \(t\geq 0\). Thus \(X_{t}\) is known when \({\cal F}_{t}\) is known at time \(t\). Any stochastic process \(\{X_{t}\}_{t\geq 0}\) is adapted to its natural filtration \(\{{\cal F}_{t}^{X}=\sigma (X_{s}:0\leq s\leq t)\}_{t\geq 0}\). We interpret \(A\in {\cal F}_{t}^{X}\) to mean that by time \(t\), an observer of \(X\) knows whether or not \(A\) has occurred.
Definition. Two stochastic processes \(X\) and \(Y\) are modifications when \(X_{t}=Y_{t}\) a.s. for each \(t\). Two stochastic processes \(X\) and \(Y\) are indistinguishable when \(X_{t}=Y_{t}\) for all \(t\) a.s. \(\sharp\)
If \(X\) and \(Y\) are modifications, there exists for each \(t\) a null set \(N_{t}\) such that \(X_{t}(\omega )=Y_{t}(\omega )\) for \(\omega\not\in N_{t}\). The null set depends on \(t\). Since the interval \([0,\infty )\) is uncountable, the set \(N=\bigcup_{0\leq t<\infty}N_{t}\) could have any probability between \(0\) and \(1\), and it could even be nonmeasurable. If \(X\) and \(Y\) are indistinguishable, however, then there exists one null set \(N\) such that \(X_{t}(\omega )=Y_{t}(\omega )\) for all \(t\) and for \(\omega\not\in N\). In other words, the functions \(t\mapsto X_{t}(\omega )\) and \(t\mapsto Y_{t}(\omega )\) are the same for all \(\omega\not\in N\), where \(\mathbb{P}(N)=0\). It is customary and convenient to require \(X_{t}\) to be right-continuous with left limits (RCLL or c\`{a}dl\`{a}g).
If \(X\) is adapted to \(\{{\cal F}_{t}\}_{t\geq 0}\) and \(Y\) is a modification of \(X\), then \(Y\) is also adapted to \(\{{\cal F}_{t}\}_{t\geq 0}\) provided that \({\cal F}_{0}\) contains all the \(P\)-null sets in \({\cal F}\). Note that this requirement is not the same as saying that \({\cal F}_{0}\) is complete, since some of the \(P\)-null sets in \({\cal F}\) may not be in the completion of \({\cal F}_{0}\).
\begin{equation}{\label{prot12}}\tag{3}\mbox{}\end{equation}
Proposition \ref{prot12}. Let \(X\) and \(Y\) be two stochastic processes with \(X\) a modification of \(Y\). If \(X\) and \(Y\) have right-continuous sample paths a.s., then \(X\) and \(Y\) are indistinguishable.
Proof. Let \(A\) be the null set where the paths of \(X\) are not right continuous, and let \(B\) be the analogous set for \(Y\). Let \(N_{t}=\{\omega :X_{t}(\omega )\neq Y_{t}(\omega )\}\), and let \(N=\bigcup_{t\in {\bf Q}}N_{t}\), where \({\bf Q}\) denotes the rationals in \([0,\infty )\). Then \(\mathbb{P}(N)=0\). Let \(M=A\cup B\cup N\). Then \(\mathbb{P}(M)=0\). We have \(X_{t}(\omega )=Y_{t}(\omega )\) for \(t\in {\bf Q}\) and \(\omega\not\in M\). If \(t\) is not rational, let \(t_{n}\) decrease to \(t\) through \({\bf Q}\). For \(\omega\not\in M\), \(X_{t_{n}}(\omega )=Y_{t_{n}}(\omega )\) for each \(n\), and
\[X_{t}(\omega )=\lim_{n\rightarrow\infty} X_{t_{n}}(\omega )=\lim_{n\rightarrow\infty} Y_{t_{n}}(\omega )=Y_{t}(\omega )\]
by the right continuity. Since \(P(M)=0\), \(X\) and \(Y\) are indistinguishable. \(\blacksquare\)
Corollary. Let \(X\) and \(Y\) be two stochastic processes which are RCLL. If \(X\) is a modification of \(Y\), then \(X\) and \(Y\) are indistinguishable. \(\sharp\)
Definition. Let \(X\) be a stochastic process and let \(A\) be a Borel set in \(\mathbb{R}\). Define
\[T_{A}(\omega )=\inf\{t>0:X_{t}(\omega )\in A\}.\]
Then \(T_{A}\) is called a {\bf hitting time of \(A\) for \(X\)}. The first exit time from a set \(A\) is defined by
\[\tau_{A}(\omega )=\inf\{t>0:X_{t}(\omega )\not\in A\}. \sharp\]
We see that \(\tau_{A}=T_{\mathbb{R}\setminus A}\) and \(T_{A}=\tau_{\mathbb{R}\setminus A}\).
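On a discretized sample path the duality \(\tau_{A}=T_{\mathbb{R}\setminus A}\) can be checked directly (a sketch on a finite time grid; the path values and the helper names `hitting_time`, `exit_time` are our own):

```python
def hitting_time(path, in_A):
    """First index t > 0 with path[t] in A; A is given by its indicator in_A."""
    for t in range(1, len(path)):
        if in_A(path[t]):
            return t
    return float("inf")                    # A is not hit within the horizon

def exit_time(path, in_A):
    """First index t > 0 with path[t] outside A."""
    return hitting_time(path, lambda x: not in_A(x))

path = [0.0, 0.4, 0.9, 1.3, 0.8, 2.1]
in_D = lambda x: 0.0 <= x < 1.0            # the set D = [0, 1)
assert exit_time(path, in_D) == 3          # path[3] = 1.3 is the first exit
assert exit_time(path, in_D) == hitting_time(path, lambda x: not in_D(x))
```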
Proposition. Let \(X\) be an adapted continuous stochastic process. If \(D=(a,b)\) is an open interval, or any other open set on \(\mathbb{R}\) (a countable union of open intervals), then \(\tau_{D}\) is a stopping time. If \(A=[a,b]\) is a closed interval, or any other closed set on \(\mathbb{R}\), then \(T_{A}\) is a stopping time. If in addition the filtration \(\{{\cal F}_{t}\}_{t\geq 0}\) is right-continuous, then \(\tau_{D}\) and \(T_{A}\) are stopping times also for closed sets \(D\) and open sets \(A\).
Proof. We have
\[\{\tau_{D}>t\}=\{X_{s}\in D\mbox{ for all }s\leq t\}=\bigcap_{s\in [0,t]}\{X_{s}\in D\}.\]
This event is an uncountable intersection over all \(s\in [0,t]\) of events in \({\cal F}_{s}\). The point of the proof is to represent this event as a countable intersection. Due to the continuity of the paths and the openness of \(D\), if \(X_{q}\in D\) for all rational \(q\in [0,t]\), then \(X_{s}\in D\) for all \(s\in [0,t]\). Therefore
\[\bigcap_{s\in [0,t]}\{X_{s}\in D\}=\bigcap_{s\in [0,t]\cap {\bf Q}}\{X_{s}\in D\},\]
which is now a countable intersection of events from \({\cal F}_{t}\), and hence is itself in \({\cal F}_{t}\). This shows that \(\tau_{D}\) is a stopping time. Since for any closed set \(A\), \(\mathbb{R}\setminus A\) is open and \(T_{A}=\tau_{\mathbb{R}\setminus A}\), \(T_{A}\) is also a stopping time. Assume now that the filtration is right-continuous. If \(D=[a,b]\) is a closed interval, then \(D=\bigcap_{n=1}^{\infty}(a-1/n,b+1/n)\). If \(D_{n}=(a-1/n,b+1/n)\), then \(\tau_{D_{n}}\) is a stopping time, and the event \(\{\tau_{D_{n}}>t\}\in {\cal F}_{t}\). It is easy to see that \(\bigcap_{n=1}^{\infty}\{\tau_{D_{n}}>t\}=\{\tau_{D}\geq t\}\), hence \(\{\tau_{D}\geq t\}\in {\cal F}_{t}\), and also \(\{\tau_{D}<t\}\in {\cal F}_{t}\) as its complement, for any \(t\). The rest of the proof follows by Method 2 of Proposition \ref{prot11}. \(\blacksquare\)
Proposition. Let \(X\) be an adapted RCLL stochastic process; and let \(A\) be an open set. Then, the hitting time of \(A\) is a stopping time.
Proof. By Proposition \ref{prot11}, it suffices to show that \(\{T<t\}\in {\cal F}_{t}\) for \(0\leq t<\infty\). If \(A\) is open and \(X_{s}(\omega )\in A\), then by the right-continuity of the paths \(X_{u}(\omega )\in A\) for every \(u\in [s,s+\epsilon )\), for some \(\epsilon >0\). As a result
\[\{T<t\}=\bigcup_{s\in {\bf Q}\cap [0,t)}\{X_{s}\in A\}.\]
Since \(\{X_{s}\in A\}=X_{s}^{-1}(A)\in {\cal F}_{s}\subseteq {\cal F}_{t}\), the result follows. \(\blacksquare\)
We write \(X_{t-}(\omega )\) as
\[X_{t-}(\omega )=\lim_{s\rightarrow t,s<t}X_{s}(\omega ).\]
\begin{equation}{\label{prot14}}\tag{4}\mbox{}\end{equation}
Proposition \ref{prot14}. Let \(X\) be an adapted RCLL stochastic process and suppose the filtration is right-continuous. If \(A\) is a closed set, then the random variable
\[T_{A}(\omega )=\inf\{t>0:X_{t}(\omega )\in A\mbox{ or }X_{t-}(\omega )\in A\}\]
is a stopping time.
Proof. Let \(A_{n}=\{x:d(x,A)<\frac{1}{n}\}\), where \(d(x,A)\) denotes the distance from a point \(x\) to \(A\). Then \(A_{n}\) is an open set and
\[\{T\leq t\}=\{X_{t}\in A\mbox{ or }X_{t-}\in A\}\bigcup\left\{
\bigcap_{n}\bigcup_{s\in {\bf Q}\cap [0,t)}\{X_{s}\in A_{n}\}\right\}.\]
Each event on the right-hand side belongs to \({\cal F}_{t}\): the first because \(X_{t}\) and \(X_{t-}\) are \({\cal F}_{t}\)-measurable, and the second as a countable combination of events \(\{X_{s}\in A_{n}\}\in {\cal F}_{s}\subseteq {\cal F}_{t}\). This completes the proof. \(\blacksquare\)
It is a very deep result that the hitting time of a Borel set is a stopping time.
\begin{equation}{\label{prot*1}}\tag{5}\mbox{}\end{equation}
Proposition \ref{prot*1}. Let \(S\) and \(T\) be stopping times. Then the following are stopping times:
(i) \(S\wedge T=\min\{S,T\}\);
(ii) \(S\vee T=\max\{S,T\}\);
(iii) \(S+T\);
(iv) \(\alpha S\) for \(\alpha >1\). \(\sharp\)
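On a discrete path, properties (i) and (ii) are easy to see by hand (a finite-horizon illustration of our own, with \(S\) and \(T\) hitting times of two levels): \(\{S\wedge T\leq n\}=\{S\leq n\}\cup\{T\leq n\}\) and \(\{S\vee T\leq n\}=\{S\leq n\}\cap\{T\leq n\}\) are decided by time \(n\).

```python
def first_time(path, pred):
    """First index n with pred(path[n]); the horizon length if there is none."""
    for n, x in enumerate(path):
        if pred(x):
            return n
    return len(path)

path = [0, 1, -1, 2, -2, 3]
S = first_time(path, lambda x: x >= 2)     # S = 3
T = first_time(path, lambda x: x <= -2)    # T = 4
assert (min(S, T), max(S, T), S + T) == (3, 4, 7)

# {min(S,T) <= n} = {S <= n} or {T <= n}; {max(S,T) <= n} needs both:
# in each case the event is decided by the history up to time n.
for n in range(len(path)):
    assert (min(S, T) <= n) == ((S <= n) or (T <= n))
    assert (max(S, T) <= n) == ((S <= n) and (T <= n))
```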
The \(\sigma\)-field \({\cal F}_{t}\) can be thought of as representing all (theoretically) observable events up to and including time \(t\). We would like to have an analogous notion of events that are observable before a random time. How can we measure the information accumulated up to a stopping time \(T\)? Let us suppose that an event \(A\) is part of this information, i.e., that the occurrence or nonoccurrence of \(A\) has been decided by time \(T\). Now if by time \(t\) one observes the value of \(T\), which can happen only if \(T\leq t\), then one must also be able to tell whether \(A\) has occurred. In other words, \(A\cap\{T\leq t\}\) and \(A^{c}\cap\{T\leq t\}\) must both be \({\cal F}_{t}\)-measurable, and this must be the case for every \(t\geq 0\). Since
\[A^{c}\cap\{T\leq t\}=\{T\leq t\}\cap (A\cap\{T\leq t\})^{c},\]
it is enough to check only that \(A\cap\{T\leq t\}\in {\cal F}_{t}\) for \(t\geq 0\).
Definition. Let \(T\) be a stopping time of the filtration \(\{{\cal F}_{t}\}_{t\geq 0}\). The stopping time \(\sigma\)-field \({\cal F}_{T}\) is defined to be \(\left\{A\in {\cal F}:A\cap\{T\leq t\}\in {\cal F}_{t}\mbox{ for all }t\geq 0\right\}\). \(\sharp\)
The sets in \({\cal F}_{T}\) may be thought of as events which may occur before time \(T\). The constant times, i.e., \(T(\omega )=t\) for every \(\omega\), are stopping times; in that case \({\cal F}_{T}={\cal F}_{t}\), and \(T\) is \({\cal F}_{T}\)-measurable.
\begin{equation}{\label{prot*}}\tag{6}\mbox{}\end{equation}
Proposition \ref{prot*}. Let \(T\) be a finite stopping time. Then \({\cal F}_{T}\) is the smallest \(\sigma\)-field containing all adapted RCLL processes sampled at \(T\). That is,
\[{\cal F}_{T}=\sigma\{X_{T}:X\mbox{ an adapted RCLL process}\}.\]
Proof. Let
\[{\cal G}=\sigma\{X_{T}:X\mbox{ an adapted RCLL process}\}.\]
Let \(\Gamma\in {\cal F}_{T}\). Then \(X_{t}=I_{\Gamma}\cdot I_{\{t\geq T\}}\) is an adapted RCLL process, and \(X_{T}=I_{\Gamma}\); hence \(\Gamma\in {\cal G}\), and \({\cal F}_{T}\subseteq {\cal G}\). Now let \(X\) be an adapted RCLL process. We need to show that \(X_{T}\) is \({\cal F}_{T}\)-measurable. Consider \(X(s,\omega )\) as a function from \([0,\infty )\times\Omega\) into \(\mathbb{R}\). Define \(\phi :\{T\leq t\}\rightarrow [0,\infty)\times\Omega\) by \(\phi (\omega )= (T(\omega ),\omega )\). Then, since \(X\) is adapted and RCLL, the composition \(X_{T}=X\circ\phi\) is a measurable mapping from \((\{T\leq t\}, {\cal F}_{t}\cap\{T\leq t\})\) into \((\mathbb{R},{\cal B})\), where \({\cal B}\) denotes the Borel subsets of \(\mathbb{R}\). Therefore
\[\{\omega :X(T(\omega ),\omega )\in B\}\cap\{T\leq t\}\]
is in \({\cal F}_{t}\) for every \(B\in {\cal B}\), and this implies that \(X_{T}\) is \({\cal F}_{T}\)-measurable. Therefore \({\cal G}\subseteq {\cal F}_{T}\). \(\blacksquare\)
From the proof of Proposition \ref{prot*}, we see that \(X_{T}\) is \({\cal F}_{T}\)-measurable.
Proposition. For any two stopping times \(T\) and \(S\), and for any \(A\in {\cal F}_{S}\), we have \(A\cap\{S\leq T\}\in {\cal F}_{T}\). In particular, if \(S\leq T\) a.s., then \({\cal F}_{S}\subseteq {\cal F}_{T}\).
Proposition. Let \(T\) and \(S\) be stopping times. Then \({\cal F}_{S}\cap {\cal F}_{T}={\cal F}_{S\wedge T}\), and each of the events \(\{T<S\}\), \(\{S<T\}\), \(\{T\leq S\}\), \(\{S\leq T\}\), \(\{T=S\}\) belongs to \({\cal F}_{S}\cap {\cal F}_{T}\).
Proposition. Let \(S\) and \(T\) be stopping times and \(Z\) an integrable random variable. Then, we have
(i) \(\mathbb{E}[Z|{\cal F}_{T}]=\mathbb{E}[Z|{\cal F}_{S\wedge T}]\) \(\mathbb{P}\)-a.s. on \(\{T\leq S\}\);
(ii) \(\mathbb{E}[\mathbb{E}[Z|{\cal F}_{T}]|{\cal F}_{S}]=\mathbb{E}[Z|{\cal F}_{S\wedge T}]\) \(\mathbb{P}\)-a.s. \(\sharp\)
If \(X\) and \(Y\) are RCLL, then \(X_{t}=Y_{t}\) a.s. for each \(t\) implies that \(X\) and \(Y\) are indistinguishable, as we have already noted. Since fixed times are stopping times, it follows that if \(X_{T}=Y_{T}\) a.s. for each finite stopping time \(T\), then \(X\) and \(Y\) are indistinguishable. If \(X\) is RCLL, let \(\Delta X\) denote the process \(\Delta X_{t}=X_{t}-X_{t-}\). Then \(\Delta X\) is not RCLL, though it is adapted, and for a.e. \(\omega\), \(\Delta X_{t}(\omega )=0\) except for at most countably many \(t\).
Proposition. Let \(X\) be adapted and RCLL. If \(\Delta X_{T}\cdot I_{\{T<\infty\}}=0\) a.s. for each stopping time \(T\), then \(\Delta X\) is indistinguishable from the zero process.
Proof. It suffices to prove the result on \([0,t_{0}]\) for \(0<t_{0}<\infty\). The set \(\{t:|\Delta X_{t}|>0\}\) is countable a.s. since \(X\) is RCLL. Moreover
\[\{t:|\Delta X_{t}|>0\}=\bigcup_{n=1}^{\infty}\left\{t:|\Delta X_{t}|>\frac{1}{n}\right\}\]
and the set \(\{t:|\Delta X_{t}|>\frac{1}{n}\}\) must be finite for each \(n\) since \(t_{0}<\infty\). Using Proposition \ref{prot14} we define stopping times for each \(n\) inductively as follows
\[T^{n,1}=\inf\left\{t>0:|\Delta X_{t}|>\frac{1}{n}\right\}\mbox{ and }
T^{n,k}=\inf\left\{t>T^{n,k-1}:|\Delta X_{t}|>\frac{1}{n}\right\}.\]
Then \(T^{n,k}>T^{n,k-1}\) a.s. on \(\{T^{n,k-1}<\infty\}\). Moreover
\[\left\{\exists\, t:|\Delta X_{t}|>0\right\}=\bigcup_{n,k}\left\{\left |\Delta X_{T^{n,k}}\cdot I_{\{T^{n,k}<\infty\}}\right |>0\right\},\]
where the right side of the equality is a countable union. The result follows. \(\blacksquare\)
Corollary. Let \(X\) and \(Y\) be adapted and RCLL. If \(\Delta X_{T}\cdot I_{\{T<\infty\}}=\Delta Y_{T}\cdot I_{\{T<\infty\}}\) a.s. for each stopping time \(T\), then \(\Delta X\) and \(\Delta Y\) are indistinguishable.
Definition. A process \(X\) is {\bf progressively measurable} with respect to the filtration \(\{{\cal F}_{t}\}_{t\geq 0}\) if for every \(t\) the map \((s,\omega ) \mapsto X_{s}(\omega )\) from \([0,t]\times\Omega\) into \((S,{\cal S})\) is \({\cal B}([0,t])\times {\cal F}_{t}\)-measurable. A subset \(\Gamma\) of \(\mathbb{R}_{+}\times\Omega\) is progressive if the process \(X=I_{\Gamma}\) is progressively measurable. \(\sharp\)
The family of progressive sets is a \(\sigma\)-field on \(\mathbb{R}_{+}\times \Omega\) called the progressive \(\sigma\)-field and denoted by \({\cal P}\). A process \(X\) is progressively measurable if and only if the map \((t,\omega )\mapsto X_{t}(\omega )\) is measurable with respect to the \(\sigma\)-field \({\cal P}\).
Proposition. If the stochastic process \(X\) is measurable and adapted to the filtration \(\{{\cal F}_{t}\}_{t\geq 0}\), then it has a progressively measurable modification.
Proposition. Let \(\{(X_{t},{\cal F}_{t})\}_{t\geq 0}\) be a progressively measurable process, and let \(T\) be a stopping time of the filtration \(\{{\cal F}_{t}\}_{t\geq 0}\). Then the random variable \(X_{T}\) defined on the set \(\{T<\infty\}\in {\cal F}_{T}\) is \({\cal F}_{T}\)-measurable, and the “stopped process” \(\{(X_{T\wedge t},{\cal F}_{t})\}_{t\geq 0}\) is progressively measurable.
Proposition. An adapted process with right- or left-continuous paths is progressively measurable.
Proposition. If \(X\) is progressively measurable and \(T\) is a stopping time with respect to the same filtration \(\{{\cal F}_{t}\}\), then \(X_{T}\) is \({\cal F}_{T}\)-measurable on the set \(\{T<\infty\}\).
Proof. (We can also refer to the proof of Proposition \ref{prot*}.) The set \(\{T<\infty\}\) is itself in \({\cal F}_{T}\). To say that \(X_{T}\) is \({\cal F}_{T}\)-measurable on this set is to say that \(X_{T}\cdot I_{\{T\leq t\}}\) is \({\cal F}_{t}\)-measurable for every \(t\), that is, \(\{\omega :X(T(\omega ),\omega )\in {\cal E}\}\cap\{T\leq t\}\in {\cal F}_{t}\) for every \(t\). The map
\[\psi :\omega\mapsto T(\omega )\mbox{ from }\left (\{T\leq t\},{\cal F}_{t}\cap\{T\leq t\}\right )\mbox{ into }([0,t],{\cal B}([0,t]))\]
is measurable because \(T\) is a stopping time; hence \(\phi :\omega\mapsto (T(\omega ),\omega )\) from \(\left (\{T\leq t\},{\cal F}_{t}\cap\{T\leq t\}\right )\) into \(([0,t]\times\Omega ,{\cal B}([0,t])\times {\cal F}_{t})\) is measurable, and \(X_{T}=X\circ\phi\) on \(\{T\leq t\}\), where \(X\) is \({\cal B}([0,t])\times {\cal F}_{t}\)-measurable by hypothesis. \(\blacksquare\)
\begin{equation}{\label{b}}\tag{B}\mbox{}\end{equation}
Stieltjes Integration and Change of Variables.
Stochastic integration with respect to martingales can be thought of as an extension of path-by-path Stieltjes integration.
Variation of a Function.
We deal with real-valued, right-continuous functions \(g\) with domain \([0,\infty )\). The value of \(g\) at \(t\) is denoted by \(g_{t}\) or \(g(t)\). Let \(\pi\) be a subdivision of the interval \([0,t]\) with \(0=t_{0}<t_{1}<\cdots <t_{n}=t\); the number \(\parallel\pi\parallel = \sup_{i}|t_{i+1}-t_{i}|\) is called the modulus or mesh of \(\pi\). We consider the sum
\[V_{g}^{\pi}(t)=\sum_{i}|g_{t_{i+1}}-g_{t_{i}}|.\]
If \(\pi^{\prime}\) is another subdivision which is a refinement of \(\pi\), that is, every point \(t_{i}\) of \(\pi\) is a point of \(\pi^{\prime}\), then plainly \(V_{g}^{\pi^{\prime}}(t)\geq V_{g}^{\pi}(t)\) by the triangle inequality.
Definition. The function \(g\) is said to be of {\bf bounded variation on \([0,t]\)} if
\[V_{g}(t)=\sup_{\pi}V_{g}^{\pi}(t)<\infty.\]
The function \(g\) is said to be of finite variation when \(V_{g}(t)<\infty\) for every \(t\). The function \(V_{g}:t\mapsto V_{g}(t)\) is called the total variation of \(g\) and \(V_{g}(t)\) is the variation of \(g\) on \([0,t]\). The function \(V_{g}\) is obviously positive and increasing and if \(\lim_{t\rightarrow\infty}V_{g}(t)=\sup_{t}V_{g}(t)<\infty\), the function \(g\) is said to be of bounded variation. \(\sharp\)
The same notion could be defined on any interval \([a,b]\). We shall say that a function \(g\) on the whole line is of finite variation if it is of finite variation on every compact interval, but not necessarily of bounded variation on the whole of \(\mathbb{R}\). Let us observe that \(C^{1}\)-functions are of finite variation. Finite monotone functions are of finite variation, and conversely we have the following result.
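As a small numerical illustration of the partition sums \(V_{g}^{\pi}(t)\) (the function choices below are ours, purely for illustration): for a monotone function every partition sum telescopes to \(g(t)-g(0)\), while for \(\cos (2\pi t)\) on \([0,1]\) refining sums recover the total variation \(4\).

```python
import math

def variation_sum(g, partition):
    """V_g^pi(t): sum of |g(t_{i+1}) - g(t_i)| over one subdivision of [0, t]."""
    return sum(abs(g(b) - g(a)) for a, b in zip(partition, partition[1:]))

def uniform_partition(t, n):
    return [t * k / n for k in range(n + 1)]

# For a monotone g every partition sum telescopes to g(t) - g(0),
# so V_g(1) = 1 for g(t) = t^2 on [0, 1].
v_monotone = variation_sum(lambda s: s * s, uniform_partition(1.0, 1000))

# g(t) = cos(2*pi*t) decreases on [0, 1/2] and increases on [1/2, 1];
# since 1/2 is a partition point, the sum gives the total variation 4.
v_cos = variation_sum(lambda s: math.cos(2 * math.pi * s), uniform_partition(1.0, 1000))
```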
Proposition (Jordan Decomposition). Any function \(g:[0,\infty )\rightarrow \mathbb{R}\) of finite variation is the difference of two increasing functions. If \(g\) is right-continuous then it can be expressed as the difference of two right-continuous increasing functions. \(\sharp\)
The decomposition is not unique. For example,
\[g(t)=(V_{g}(t))-(V_{g}(t)-g(t))=\frac{1}{2}(V_{g}(t)+g(t))-\frac{1}{2}(V_{g}(t)-g(t)).\]
This decomposition is moreover minimal in the sense that if \(g=a-b\) where \(a\) and \(b\) are positive and increasing, then \((V_{g}+g)/2\leq a\) and \((V_{g}-g)/2\leq b\). The sum, the difference and the product of functions of finite variation are again functions of finite variation. This is also true for the ratio of two functions of finite variation, provided the modulus of the denominator is bounded below by a positive constant. Being the difference of two increasing functions, \(g\) has left limits at every \(t\in (0,\infty )\). We write \(g_{t-}\) or \(g(t-)\) for \(\lim_{s\uparrow t}g_{s}\), and we set \(g_{0-}=0\). We moreover set \(\Delta g_{t}=g_{t}-g_{t-}\); this is the jump of \(g\) at \(t\).
Let \(g(t)\), \(t\geq 0\), be a right-continuous increasing function. Then it can have at most countably many jumps, moreover the sum of the jumps is finite over finite time intervals. Define the discontinuous part \(g_{d}\) of \(g\) by
\[g_{d}(t)=\sum_{s\leq t}(g(s)-g(s-))=\sum_{0<s\leq t}\Delta g(s),\]
and the continuous part \(g_{c}\) of \(g\) by
\begin{equation}{\label{kleeq113}}\tag{7}
g_{c}(t)=g(t)-g_{d}(t).
\end{equation}
Clearly \(g_{d}\) changes only by jumps, \(g_{c}\) is continuous and \(g(t)=g_{c}(t)+g_{d}(t)\). Since a finite variation function is the difference of two increasing functions, the decomposition (\ref{kleeq113}) holds for functions of finite variation. Although the representation as a difference of increasing functions is not unique, the decomposition (\ref{kleeq113}) is essentially unique, in the sense that any two such decompositions differ by a constant. Indeed, if there were another such decomposition \(g(t)=h_{c}(t)+h_{d}(t)\), then \(h_{c}(t)-g_{c}(t)=g_{d}(t)-h_{d}(t)\), so \(g_{d}-h_{d}\) is continuous. Hence \(h_{d}\) and \(g_{d}\) have the same set of jump points, and it follows that \(g_{d}(t)-h_{d}(t)=c\) for some constant \(c\).
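The decomposition (\ref{kleeq113}) is easy to carry out explicitly when the jumps are known. The sketch below (an illustrative choice of \(g\), not from the text) takes \(g(t)=t\) plus unit jumps at \(t=1\) and \(t=2\) and separates the jump part \(g_{d}\) from the continuous part \(g_{c}\).

```python
# Illustrative right-continuous increasing function: g(t) = t plus jumps of
# size 1 at t = 1 and t = 2.  The decomposition g = g_c + g_d is computed
# directly from the list of jumps.
jumps = {1.0: 1.0, 2.0: 1.0}            # jump location -> jump size

def g(t):
    return t + sum(h for s, h in jumps.items() if s <= t)

def g_d(t):
    """Discontinuous part: the sum of the jumps of g up to time t."""
    return sum(h for s, h in jumps.items() if s <= t)

def g_c(t):
    """Continuous part, as in (7): g_c = g - g_d."""
    return g(t) - g_d(t)
```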
If \(g\) is a function of a real variable, define its quadratic variation over the interval \([0,t]\) as the limit (when it exists)
\[[g](t)=\lim_{n\rightarrow\infty}\sum_{i=1}^{N_{n}}\left (g(t_{i}^{(n)})-g(t_{i-1}^{(n)})\right )^{2},\]
where the limit is taken over all partitions \(\pi =\left\{0=t_{0}^{(n)}<t_{1}^{(n)}<\cdots <t_{N_{n}}^{(n)}=t\right\}\) of \([0,t]\) with \(\parallel\pi\parallel\rightarrow 0\) as \(n\rightarrow\infty\). Quadratic variation plays a major role in stochastic calculus, but is hardly ever met in standard calculus because smooth functions have zero quadratic variation.
Proposition. If \(g\) is continuous and of finite variation then its quadratic variation is zero.
Proof. We have
\begin{align*}
[g](t) & =\lim_{n\rightarrow\infty}\sum_{i=1}^{N_{n}}\left (g(t_{i}^{(n)})-g(t_{i-1}^{(n)})\right )^{2}\\
& \leq\lim_{n\rightarrow\infty}\max_{1\leq i\leq N_{n}}\left |g(t_{i}^{(n)})-
g(t_{i-1}^{(n)})\right |\cdot\sum_{i=1}^{N_{n}}\left |g(t_{i}^{(n)})-g(t_{i-1}^{(n)})\right |\\
& \leq\lim_{n\rightarrow\infty}\max_{1\leq i\leq N_{n}}\left |g(t_{i}^{(n)})-g(t_{i-1}^{(n)})\right |\cdot V_{g}(t).
\end{align*}
Since \(g\) is continuous, it is uniformly continuous on \([0,t]\), hence \(\lim_{n\rightarrow\infty}\max_{1\leq i\leq N_{n}}\left |g(t_{i}^{(n)})-g(t_{i-1}^{(n)})\right |=0\), and the result follows. \(\blacksquare\)
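Numerically, the vanishing of quadratic variation is visible in how fast the squared-increment sums shrink under refinement. A small Python sketch (illustrative function choice) for the \(C^{1}\) function \(g(t)=t^{2}\):

```python
def quadratic_variation_sum(g, partition):
    """Sum of squared increments of g over one partition of [0, t]."""
    return sum((g(b) - g(a)) ** 2 for a, b in zip(partition, partition[1:]))

def uniform_partition(t, n):
    return [t * k / n for k in range(n + 1)]

# g(t) = t^2 is C^1, hence of finite variation on [0, 1]; its squared-increment
# sums shrink roughly like 1/n, consistent with [g](1) = 0.
qv_coarse = quadratic_variation_sum(lambda s: s * s, uniform_partition(1.0, 1000))
qv_fine = quadratic_variation_sum(lambda s: s * s, uniform_partition(1.0, 2000))
```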
The converse is not true: there are functions with zero quadratic variation and infinite variation. The same proof works for the following result.
\begin{equation}{\label{klet111}}\tag{8}\mbox{}\end{equation}
Proposition \ref{klet111}. Let \(f\) and \(g\) be continuous, and \(f\) be of finite variation. Then
\[\lim_{n\rightarrow\infty}\sum_{i=1}^{N_{n}}\left (f(t_{i}^{(n)})-
f(t_{i-1}^{(n)})\right )\cdot\left (g(t_{i}^{(n)})-g(t_{i-1}^{(n)})\right )=0\]
where the limit is taken over all partitions \(\pi =\{t_{i}^{(n)}\}\) of \([0,t]\) with \(\parallel\pi\parallel\rightarrow 0\). \(\sharp\)
Define the quadratic covariation of \(f\) and \(g\) on \([0,t]\) by the following limit (when it exists)
\[[f,g](t)=\lim_{n\rightarrow\infty}\sum_{i=1}^{N_{n}}\left (f(t_{i}^{(n)})-
f(t_{i-1}^{(n)})\right )\cdot\left (g(t_{i}^{(n)})-g(t_{i-1}^{(n)})\right )\]
where the limit is taken over all partitions \(\pi =\{t_{i}^{(n)}\}\) of \([0,t]\) with \(\parallel\pi\parallel\rightarrow 0\). Proposition \ref{klet111} has an immediate and important corollary.
Corollary. If \(f\) and \(g\) are continuous and \(g\) is of finite variation, then the covariation \([f,g](t)=0\). \(\sharp\)
Let \(f\) and \(g\) be such that their quadratic variations, and that of \(f+g\), are defined. One can see that covariation satisfies
\begin{equation}{\label{kleeq118}}\tag{9}
[f,g](t)=\frac{1}{2}\left ([f+g,f+g](t)-[f,f](t)-[g,g](t)\right ),
\end{equation}
and could be defined in terms of quadratic variation. The above relation is known as the polarization identity. It is obvious that covariation is symmetric, \([f,g](t)=[g,f](t)\), and it follows from (\ref{kleeq118}) that it is linear; that is, for any constants \(\alpha\) and \(\beta\),
\[[\alpha f+\beta g,h](t)=\alpha [f,h](t)+\beta [g,h](t).\]
Due to symmetry, it is bilinear, that is, linear in both arguments. By the polarization identity, covariation is also of finite variation.
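The polarization identity (\ref{kleeq118}) in fact holds exactly at the level of each partition sum, since term by term \(\Delta f\,\Delta g=\frac{1}{2}\left ((\Delta f+\Delta g)^{2}-(\Delta f)^{2}-(\Delta g)^{2}\right )\). A quick Python check (illustrative choices \(f=\sin\), \(g=\exp\)):

```python
import math

def covariation_sum(f, g, partition):
    """Sum of products of increments of f and g over one partition."""
    return sum((f(b) - f(a)) * (g(b) - g(a))
               for a, b in zip(partition, partition[1:]))

# Polarization holds exactly partition by partition, since term by term
# df*dg = ((df + dg)^2 - df^2 - dg^2) / 2; here f = sin, g = exp on [0, 1].
part = [k / 100 for k in range(101)]
f, g = math.sin, math.exp
lhs = covariation_sum(f, g, part)
rhs = 0.5 * (covariation_sum(lambda s: f(s) + g(s), lambda s: f(s) + g(s), part)
             - covariation_sum(f, f, part)
             - covariation_sum(g, g, part))
```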
Stieltjes Integral.
First of all, we consider the Riemann integral of \(f\) over the interval \([a,b]\), defined by
\[\int_{a}^{b}f(t)dt=\lim_{\parallel\pi\parallel\rightarrow 0}
\sum_{i=1}^{N_{n}}f(\xi_{i}^{(n)})\cdot (t_{i}^{(n)}-t_{i-1}^{(n)}),\]
where the limit is taken over all partitions \(\pi =\left\{a=t_{0}^{(n)}<t_{1}^{(n)}<\cdots <t_{N_{n}}^{(n)}=b\right\}\) and \(t_{i-1}^{(n)}\leq\xi_{i}^{(n)}\leq t_{i}^{(n)}\). The Riemann integral is defined for continuous functions \(f\). However, it can be extended to functions which are discontinuous at finitely many points, by splitting up the interval. In fact, a bounded function continuous at all but countably many points is Riemann integrable.
Proposition. The necessary and sufficient condition for a function \(f\) to be Riemann integrable on a finite closed interval \([a,b]\) is that \(f\) is bounded on \([a,b]\) and almost everywhere continuous on \([a,b]\), that is, continuous at all points except possibly on a set of Lebesgue measure zero. \(\sharp\)
For \(t\in \mathbb{R}_{+}\), a right-continuous function \(g\) which is of bounded variation on \([0,t]\) induces a signed measure \(\mu\) on \(([0,t],{\cal B}_{t})\)
\[\mu((a,b])=g(b)-g(a)\mbox{ for }0\leq a<b\leq t\mbox{ and }\mu (\{0\})=0.\]
The right continuity of \(g\) is used in verifying that \(\mu\) is a measure on \(([0,t],{\cal B}_{t})\), and \(\mu\) is uniquely determined since intervals of the form \((a,b]\) together with \(\{0\}\) generate \({\cal B}_{t}\). If \(g\) is continuous, then \(\mu\) has no atoms. If \(g\) is right-continuous and of finite variation, then the measures \(\mu\) for different intervals \([0,t]\) are consistent with one another, but they do not in general extend to a measure on \(\mathbb{R}_{+}\). However if \(g\) is of bounded variation, then the measures extend to a finite signed measure on \(\mathbb{R}_{+}\); if \(g\) is increasing on \(\mathbb{R}_{+}\), they extend to a positive \(\sigma\)-finite measure on \(\mathbb{R}_{+}\). We also have the following result.
Proposition. There is a one-to-one correspondence between Radon measures \(\mu\) on \([0,\infty )\) and right-continuous functions \(g\) of finite variation given by
\[g(t)=\mu ([0,t]). \sharp\]
Consequently \(g(t-)=\mu ([0,t))\) and \(\Delta g(t)=\mu (\{t\})\). Moreover if \(\mu (\{0\})=0\), the variation \(V_{g}\) of \(g\) corresponds to the total variation \(|\mu|\) of \(\mu\).
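This correspondence is concrete for purely atomic measures. The sketch below (an illustrative atomic measure of our choosing; the negative mass makes \(\mu\) a signed measure, corresponding to a function of finite variation) recovers \(g(t)\), \(g(t-)\) and \(\Delta g(t)\) from \(\mu\).

```python
# A purely atomic measure mu on [0, infinity), given as atom -> mass
# (an illustrative choice; the negative mass corresponds to a signed measure
# induced by a function of finite variation).
atoms = {0.5: 2.0, 1.0: -1.0}

def mu(pred):
    """mu of the set {s : pred(s)}, for this atomic measure."""
    return sum(m for s, m in atoms.items() if pred(s))

def g(t):            # g(t) = mu([0, t])
    return mu(lambda s: s <= t)

def g_left(t):       # g(t-) = mu([0, t))
    return mu(lambda s: s < t)

def jump(t):         # Delta g(t) = mu({t})
    return g(t) - g_left(t)
```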
Approach I (Chung and Williams \cite{chu}). Returning to the case of a right-continuous \(g\) that is of bounded variation on a fixed interval \([0,t]\), we note that the variation \(|\mu |\) of \(\mu\) is the measure associated with the variation \(V_{g}\) of \(g\). If \(f\in L^{1}([0,t],{\cal B}_{t},|\mu |)\), then the Lebesgue-Stieltjes integral of \(f\) with respect to \(g\) over \([0,t]\) is defined by
\begin{align*}
\int_{[0,t]}f(s)dg(s) & \equiv\int_{[0,t]}fd\mu =\lim_{n\rightarrow\infty}
\left (\sum_{k=0}^{\infty}\frac{k}{2^{n}}\cdot\mu\left (\left\{s\in [0,t]:
\frac{k}{2^{n}}\leq f(s)<\frac{k+1}{2^{n}}\right\}\right )\right .\\
& +\left .\sum_{k=-1}^{-\infty}\frac{k+1}{2^{n}}\cdot\mu\left (\left\{
s\in [0,t]:\frac{k}{2^{n}}\leq f(s)<\frac{k+1}{2^{n}}\right\}\right )\right )
\end{align*}
and
\[\left |\int_{[0,t]}f(s)dg(s)\right |\leq\int_{[0,t]}|f|d|\mu |.\]
If the last integral is finite for all \(t\in [0,T]\) and \(g\) is continuous, then \(\int_{[0,t]}f(s)dg(s)\) is a continuous function of \(t\in [0,T]\) and we denote it by \(\int_{0}^{t}f(s)dg(s)\). If \(f\) is a continuous function on \([0,t]\), then the Riemann-Stieltjes integral of \(f\) with respect to \(g\) on \([0,t]\) is well-defined and equals the Lebesgue-Stieltjes integral, i.e.
\begin{equation}{\label{chueq111}}\tag{10}
\int_{[0,t]}f(s)dg(s)=\lim_{n\rightarrow\infty}\sum_{i=1}^{N_{n}}
f\left (\xi_{i}^{(n)}\right )\cdot\left (g(t_{i}^{(n)})-g(t_{i-1}^{(n)})\right ),
\end{equation}
for any sequence of partitions \(\pi =\left\{0=t_{0}^{(n)}<t_{1}^{(n)}<\cdots <t_{N_{n}}^{(n)}=t\right\}\) of \([0,t]\) where \(\xi_{i}^{(n)}\in [t_{i-1}^{(n)},t_{i}^{(n)}]\) and \(\parallel\pi\parallel\rightarrow 0\) as \(n\rightarrow\infty\). On the other hand, if \(g\) is continuous as well as being of bounded variation on \([0,t]\), then \(\int_{0}^{t}f(s)dg(s)\) is also given by (\ref{chueq111}) when \(f\) is right-continuous on \([0,t)\) with finite left limits on \((0,t]\), or is left-continuous on \((0,t]\) with finite right limits on \([0,t)\).
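The approximating sums in (\ref{chueq111}) are easy to compute. A Python sketch with an illustrative example that has a closed form, \(f(s)=s\) and \(g(s)=s^{2}\) on \([0,1]\), where \(\int_{0}^{1}s\,dg(s)=\int_{0}^{1}2s^{2}ds=2/3\):

```python
def stieltjes_sum(f, g, partition):
    """Riemann-Stieltjes sum with left evaluation points xi_i = t_{i-1}."""
    return sum(f(a) * (g(b) - g(a)) for a, b in zip(partition, partition[1:]))

# Example with a closed form: for g(t) = t^2 on [0, 1],
# integral of s dg(s) = integral of 2 s^2 ds = 2/3.
part = [k / 10000 for k in range(10001)]
approx = stieltjes_sum(lambda s: s, lambda s: s * s, part)
```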
Approach II (Revuz and Yor \cite{rev}). If \(f\) is a locally bounded Borel function on \(\mathbb{R}_{+}\), its Stieltjes integral with respect to \(g\), denoted by
\[\int_{0}^{t}f_{s}dg_{s},\int_{0}^{t}f(s)dg(s)\mbox{ or }\int_{(0,t]}f(s)dg_{s},\]
is the integral of \(f\) with respect to \(\mu\) on the interval \((0,t]\). We will observe that the jump of \(g\) at zero does not come into play and that \(\int_{0}^{t}dg_{s}=g_{t}-g_{0}\). If we want to consider the integral on \([0,t]\), we will write \(\int_{[0,t]}f(s)dg_{s}\). The integral on \((0,t]\) is also denoted by \((f\bullet g)_{t}\). We point out that the map \(t\mapsto (f\bullet g)_{t}\) is itself a right-continuous function of finite variation. A consequence of the Radon-Nikodym theorem applied to \(\mu\) and to the Lebesgue measure \(\lambda\) is the following result.
Proposition. A function \(g\) of finite variation is \(\lambda\)-a.e. differentiable and there exists a function \(h\) of finite variation such that \(h'=0\) \(\lambda\)-a.e. and
\[g_{t}=h_{t}+\int_{0}^{t}g'_{s}ds. \sharp\]
The function \(g\) is said to be absolutely continuous if \(h=0\). The corresponding measure \(\mu\) is then absolutely continuous with respect to \(\lambda\).
Proposition (Integration by Parts Formula). If \(f\) and \(g\) are two functions of finite variation, then for any \(t\),
\begin{equation}{\label{reveq*6}}\tag{11}
f_{t}g_{t}=f_{0}g_{0}+\int_{0}^{t}f_{s}dg_{s}+\int_{0}^{t}g_{s-}df_{s}. \sharp
\end{equation}
To reestablish the symmetry, the formula (\ref{reveq*6}) can also be written as
\begin{equation}{\label{reveq*7}}\tag{12}
f_{t}g_{t}=\int_{0}^{t}f_{s-}dg_{s}+\int_{0}^{t}g_{s-}df_{s}+\sum_{s\leq t}\Delta f_{s}\Delta g_{s}.
\end{equation}
Note that although the sum in (\ref{reveq*7}) is written over uncountably many values, it has at most countably many nonzero terms. This is because a finite variation function can have at most a countable number of jumps. If \(f\) and \(g\) are both continuous and of bounded variation on \([0,t]\), then we have the integration by parts formula
\[\int_{0}^{t}g(s)df(s)=f(t)g(t)-f(0)g(0)-\int_{0}^{t}f(s)dg(s).\]
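The discrete analogue of (\ref{reveq*6}) is an exact telescoping identity: evaluating \(f\) at the left endpoint and \(g\) at the right endpoint of each subinterval, each term equals \(f(t_{i+1})g(t_{i+1})-f(t_{i})g(t_{i})\). A Python check with illustrative smooth choices \(f=\sin\), \(g=\exp\):

```python
import math

# Discrete integration by parts: each term of
#   f(t_i)(g(t_{i+1}) - g(t_i)) + g(t_{i+1})(f(t_{i+1}) - f(t_i))
# telescopes exactly to f(t_{i+1})g(t_{i+1}) - f(t_i)g(t_i),
# so the sum equals f(1)g(1) - f(0)g(0) for any partition.
part = [k / 100 for k in range(101)]
f, g = math.sin, math.exp
total = sum(f(a) * (g(b) - g(a)) + g(b) * (f(b) - f(a))
            for a, b in zip(part, part[1:]))
target = f(1.0) * g(1.0) - f(0.0) * g(0.0)
```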
In fact, a right-continuous function \(f\) of finite variation can be written uniquely as
\[f_{t}=f_{t}^{c}+\sum_{s\leq t}\Delta f_{s},\]
where \(f^{c}\) is continuous and of finite variation. The next result is a “chain rule” formula.
Proposition. If \(f\) is a \(C^{1}\)-function and \(g\) is of finite variation then \(f(g)\) is of finite variation and
\[f(g_{t})=f(g_{0})+\int_{0}^{t}f'(g_{s-})dg_{s}+\sum_{s\leq t}\left (f(g_{s})-f(g_{s-})-f'(g_{s-})\Delta g_{s}\right ).\]
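For a pure-jump \(g\) this chain rule can be checked by hand: the Stieltjes integral reduces to a sum over the jumps, and adding the correction terms makes the right side telescope to \(f(g_{t})-f(g_{0})\). A Python sketch with illustrative data (\(f(x)=x^{2}\), jump times and sizes of our choosing):

```python
# For a pure-jump g the Stieltjes integral over (0, t] is a sum over jumps,
# and the chain-rule formula telescopes term by term.  Illustrative data:
# f(x) = x^2, starting level g_0 = 1.0, three jumps before t = 3.
jumps = [(0.5, 1.0), (1.2, -0.5), (2.0, 2.0)]   # (time, jump size)

def f(x):
    return x * x

def fp(x):                                       # f'
    return 2.0 * x

level = 1.0                                      # g_0
integral = 0.0                                   # int_0^t f'(g_{s-}) dg_s
correction = 0.0                                 # sum of the jump corrections
for s, dg in jumps:
    left = level                                 # g_{s-}
    level += dg                                  # g_s
    integral += fp(left) * dg
    correction += f(level) - f(left) - fp(left) * dg

lhs = f(level) - f(1.0)                          # f(g_t) - f(g_0)
rhs = integral + correction
```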
\begin{equation}{\label{c}}\tag{C}\mbox{}\end{equation}
Other Results.
Lipschitz and H\"{o}lder conditions appear as conditions on the coefficients in the results on the existence and uniqueness of solutions of ordinary and stochastic differential equations.
Definition. \(f\) satisfies a {\bf Lipschitz condition} on \([a,b]\) if there is a constant \(K>0\) such that for all \(x,y\in [a,b]\),
\begin{equation}{\label{kleeq131}}\tag{13}
|f(x)-f(y)|\leq K\cdot |x-y|. \sharp
\end{equation}
\(f\) is smooth on \([a,b]\) when it possesses a continuous derivative \(f'\) on \((a,b)\) such that the limits \(f'(a+)\) and \(f'(b-)\) exist. \(f\) is piecewise continuous on \([a,b]\) when it is continuous on \([a,b]\) except possibly at a finite number of points at which right-hand and left-hand limits exist. \(f\) is piecewise smooth on \([a,b]\) when it is piecewise continuous on \([a,b]\) and \(f'\) exists and is also piecewise continuous on \([a,b]\). We have the following properties.
(i) If \(f\) is continuously differentiable on a finite interval \([a,b]\), then it is Lipschitz. Indeed, since \(f'\) is continuous on \([a,b]\), it is bounded, \(|f'|\leq K\). Therefore
\[|f(x)-f(y)|=\left |\int_{x}^{y}f'(t)dt\right |\leq\int_{x}^{y}|f'(t)|dt\leq K\cdot |x-y|.\]
(ii) If \(f\) is continuous and piecewise smooth then it is Lipschitz; the proof is similar to that of (i).
(iii) A Lipschitz function does not have to be differentiable; for example, \(f(x)=|x|\) is Lipschitz but is not differentiable at zero.
(iv) It follows from the definition (\ref{kleeq131}) of a Lipschitz function that if it is differentiable, then its derivative is bounded by \(K\).
(v) A Lipschitz function has finite variation on finite intervals, since for any partition \(\{x_{i}\}\) of \([a,b]\),
\[\sum |f(x_{i})-f(x_{i-1})|\leq K\cdot\sum (x_{i}-x_{i-1})=K\cdot (b-a).\]
(vi) As functions of finite variation have derivatives almost everywhere with respect to Lebesgue measure, a Lipschitz function is almost everywhere differentiable. Note that functions of finite variation have derivatives which are integrable with respect to Lebesgue measure, but the function does not have to be equal to the integral of the derivative.
(vii) A Lipschitz function multiplied by a constant, and the sum of two Lipschitz functions, are again Lipschitz. The product of two bounded Lipschitz functions is again a Lipschitz function.
If \(f\) is Lipschitz on \([0,t]\) for any \(t>0\), but with the constant \(K_{t}\) depending on \(t\), then it is called {\bf locally Lipschitz}. For example, \(x^{2}\) is Lipschitz on \([0,t]\) for any \(t>0\), but it is not Lipschitz on \([0,+\infty )\), since its derivative is unbounded. If \(f\) is a function of two variables \(f(x,t)\) and it satisfies a Lipschitz condition in \(x\) for all \(t\in [0,s]\) with some constant \(K\) independent of \(t\), we say that \(f\) satisfies a Lipschitz condition in \(x\) uniformly in \(t\), \(0\leq t\leq s\).
Definition. \(f\) satisfies a {\bf H\"{o}lder condition of order \(\alpha\)}, \(0<\alpha\leq 1\), if there is a constant \(K\) such that
\[|f(x)-f(y)|\leq K\cdot |x-y|^{\alpha}. \sharp\]
If \(\alpha =1\), the H\"{o}lder condition becomes the Lipschitz condition. The linear growth condition also appears in the results on existence and uniqueness of solutions of differential equations. \(f(x)\) satisfies the linear growth condition when
\[|f(x)|\leq K\cdot (1+|x|).\]
This condition describes growth of a function for large values of \(x\), and states that \(f\) is bounded for small values of \(x\).
Example. It can be shown that if \(f(0,t)\) is a bounded function of \(t\), that is, \(|f(0,t)|\leq C\) for all \(t\), and \(f(x,t)\) satisfies Lipschitz condition in \(x\) uniformly in \(t\), that is, \(|f(x,t)-f(y,t)|\leq K\cdot |x-y|\), then \(f(x,t)\) satisfies the linear growth condition in \(x\), that is, \(|f(x,t)|\leq K_{1}\cdot (1+|x|)\). \(\sharp\)
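The argument behind this example is the triangle inequality, \(|f(x,t)|\leq |f(x,t)-f(0,t)|+|f(0,t)|\leq K|x|+C\leq K_{1}(1+|x|)\) with \(K_{1}=\max (K,C)\). A Python check with an illustrative choice \(f(x,t)=\sin t+2x\) (so \(C=1\), \(K=2\)):

```python
import math

# Illustrative f(x, t) = sin(t) + 2x: |f(0, t)| <= C = 1 for all t, and f is
# Lipschitz in x with K = 2 uniformly in t, so the triangle inequality gives
# |f(x, t)| <= K|x| + C <= K_1 (1 + |x|) with K_1 = max(K, C).
def f(x, t):
    return math.sin(t) + 2.0 * x

C, K = 1.0, 2.0
K1 = max(C, K)
linear_growth_holds = all(abs(f(x, t)) <= K1 * (1.0 + abs(x))
                          for x in (-100.0, -1.0, 0.0, 0.5, 3.0, 100.0)
                          for t in (0.0, 1.0, 2.0, 10.0))
```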
The {\bf polynomial growth condition} on \(f\) is the condition of the form
\[|f(x)|\leq K\cdot (1+|x|^{m})\]
for some \(K,m>0\).
Theorem (Fubini's Theorem). Let \(\{X_{s}\}_{0\leq s\leq t}\) be a stochastic process with regular sample paths (for all \(\omega\), at any point \(s\), \(X_{s}\) has left and right limits). Then
\[\int_{0}^{t}\mathbb{E}|X_{s}|ds=\mathbb{E}\left [\int_{0}^{t}|X_{s}|ds\right ].\]
Furthermore, if this quantity is finite, then
\[\mathbb{E}\left [\int_{0}^{t}X_{s}ds\right ]=\int_{0}^{t}\mathbb{E}[X_{s}]ds. \sharp\]
Definition. Let \(A=\{A_{t}\}_{t\geq 0}\) be an RCLL process. \(A\) is an increasing process when the paths \(t\mapsto A_{t}(\omega )\) are nondecreasing for almost all \(\omega\). \(A\) is called a finite variation (FV) process if almost all paths of \(A\) are of finite variation on each compact interval of \(\mathbb{R}_{+}\). \(\sharp\)
Let \(A\) be an increasing process. Fix an \(\omega\) such that \(t\mapsto A_{t}(\omega )\) is right-continuous and nondecreasing. This function induces a measure \(\mu_{A}(\omega ,ds)\) on \(\mathbb{R}_{+}\). If \(f\) is a bounded Borel function on \(\mathbb{R}_{+}\), then \(\int_{0}^{t}f(s)\mu_{A} (\omega ,ds)\) is well-defined for each \(t>0\). We denote this integral by \(\int_{0}^{t}f(s)dA_{s}(\omega )\). If \(F_{s}=F(s,\omega )\) is bounded and jointly measurable, we can define, \(\omega\)-by-\(\omega\), the integral
\[I(t,\omega )=\int_{0}^{t}F(s,\omega )dA_{s}(\omega ).\]
\(I\) is right-continuous in \(t\) and jointly measurable.
Proceeding analogously for an FV process \(A\) (except that the induced measure \(\mu_{A}(\omega ,ds)\) may assign negative mass; that is, it is a signed measure), we can define
\[I(t,\omega )=\int_{0}^{t}F(s,\omega )dA_{s}(\omega )\]
for \(F\) bounded and jointly measurable.
Let \(A\) be an FV process. We define
\begin{equation}{\label{proeq*}}\tag{14}
|A|_{t}=\sup_{n\geq 1}\sum_{k=1}^{2^{n}}\left |A_{\frac{t\cdot k}{2^{n}}}-A_{\frac{t\cdot (k-1)}{2^{n}}}\right |.
\end{equation}
Then \(|A|_{t}<\infty\) a.s., and \(\{|A|_{t}\}_{t\geq 0}\) is an increasing process.
Definition. For \(A\) an FV process, the {\bf total variation process}, \(|A|=\{|A|_{t}\}_{t\geq 0}\), is the increasing process defined in (\ref{proeq*}) above. \(\sharp\)
Let \(A\) be an FV process and let \(F\) be a jointly measurable process such that \(\int_{0}^{t}F(s,\omega )dA_{s}(\omega )\) exists and is finite for all \(t>0\) a.s. We let
\[(F\bullet A)_{t}(\omega )=\int_{0}^{t}F(s,\omega )dA_{s}(\omega )\]
and we write \(F\bullet A\) to denote the process \(F\bullet A=\{(F\bullet A)_{t}\}_{t\geq 0}\). We will also write \(\int_{0}^{t} F_{s}d|A|_{s}\) for \((F\bullet |A|)_{t}\).
Proposition. Let \(A\) and \(C\) be adapted, increasing processes such that \(C-A\) is also an increasing process. Then there exists a jointly measurable, adapted process \(H\) defined on \((0,\infty )\) such that \(0\leq H\leq 1\) and
\[A=H\bullet C\mbox{ or equivalently }A_{t}=\int_{0}^{t}H_{s}dC_{s}.\]
Corollary. Let \(A\) be an FV process. There exists a jointly measurable, adapted process \(H\), \(-1\leq H\leq 1\) such that
\[|A|=H\bullet A\mbox{ and }A=H\bullet |A|\]
or equivalently
\[|A|_{t}=\int_{0}^{t}H_{s}dA_{s}\mbox{ and }A_{t}=\int_{0}^{t}H_{s}d|A|_{s}. \sharp\]
When the integrand process \(H\) has continuous paths, the Stieltjes integral \(\int_{0}^{t}H_{s}dA_{s}\) is also known as the Riemann-Stieltjes integral for fixed \(\omega\). In this case, we can define the integral as the limit of approximating sums.
Proposition. Let \(A\) be an FV process and let \(H\) be a jointly measurable process such that \(s\mapsto H(s,\omega )\) is continuous a.s. Let \(\pi_{n}\) be a sequence of finite random partitions of \([0,t]\) with
\[\lim_{n\rightarrow\infty}\mathrm{mesh}(\pi_{n})=0.\]
Then, for any choice of points \(S_{k}\) with \(T_{k}\leq S_{k}\leq T_{k+1}\),
\[\int_{0}^{t}H_{s}dA_{s}=\lim_{n\rightarrow\infty}\sum_{T_{k},T_{k+1}\in
\pi_{n}} H_{S_{k}}\cdot\left (A_{T_{k+1}}-A_{T_{k}}\right ).\]
\begin{equation}{\label{prot150}}\tag{15}\mbox{}\end{equation}
Theorem \ref{prot150} (Change of Variable). Let \(A\) be an FV process with continuous paths, and let \(f\) be such that its derivative \(f'\) exists and is continuous. Then \(\{f(A_{t})\}_{t\geq 0}\) is an FV process and
\[f(A_{t})-f(A_{0})=\int_{0}^{t}f'(A_{s})dA_{s}. \sharp\]
We will see that the sums
\[\sum_{t_{k}\in\pi_{n}[0,t]}f(A_{t_{k}})\cdot (A_{t_{k+1}}-A_{t_{k}})\]
converge in probability to
\[\int_{0}^{t}f(A_{s-})dA_{s}\]
for a continuous function \(f\) and an FV process \(A\). This leads to the more general change of variable formula, valid for any FV process \(A\) and \(f\in C^{1}\):
\[f(A_{t})-f(A_{0})=\int_{0}^{t}f'(A_{s-})dA_{s}+\sum_{0<s\leq t}\left (f(A_{s})-f(A_{s-})-f'(A_{s-})\Delta A_{s}\right ).\]
The next result explains why Theorem \ref{prot150} is known as a change of variable formula.
Corollary. Let \(g\) be continuous, and let \(h(t)=\int_{0}^{t}g(u)du\). Let \(A\) be an FV process with continuous paths. Then
\[h(A_{t})-h(A_{0})=\int_{A_{0}}^{A_{t}}g(u)du=\int_{0}^{t}g(A_{s})dA_{s}. \sharp\]
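The corollary can be checked numerically along a single path. The sketch below (illustrative choices of ours: \(g(u)=u\), so \(h(u)=u^{2}/2\), and the continuous FV path \(A_{s}=s^{2}\) on \([0,1]\)) compares the approximating Stieltjes sums with the closed form \(h(A_{1})-h(A_{0})=1/2\).

```python
# Check of the corollary for g(u) = u, h(u) = u^2/2 and the continuous FV
# path A_s = s^2 on [0, 1]: both sides should equal (A_1^2 - A_0^2)/2 = 1/2.
def A(s):
    return s * s

part = [k / 10000 for k in range(10001)]
# Left-point Stieltjes sums approximating int_0^1 A_s dA_s.
approx = sum(A(a) * (A(b) - A(a)) for a, b in zip(part, part[1:]))
exact = (A(1.0) ** 2 - A(0.0) ** 2) / 2.0
```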


