On the Wiener space

I would like to discuss in this post the importance of the heuristic functional

$\displaystyle I(W) := \frac{1}{2} \int_0^T |\frac{d}{dt} W|^2 \, dt \ \ \ \ \ (1)$

that often shows up when doing analysis on the Wiener space ${S = (C([0,T], {\mathbb R}), \|\cdot\|_{\infty})}$. An element of the Wiener space is traditionally denoted by ${\omega}$: this is a continuous function and ${\omega(t)}$ is its value at time ${t \in [0,T]}$. For a (nice) subset ${A}$ of ${S}$, the Wiener measure of ${A}$ is nothing other than the probability that a Brownian path belongs to ${A}$. Having said that, the quantity ${I(W)}$ hardly makes sense since a Brownian path ${(W_t: t \in [0,T])}$ is (almost surely) nowhere differentiable. Still, this is a very useful heuristic in many situations. A review of probability can be found on the excellent blog of Terry Tao.

Where does it come from?

As is often the case, it is very instructive to go back to the discrete setting. Consider a time interval ${[0,T]}$ and a discretization parameter ${\Delta t = \frac{T}{N}}$: a discrete Brownian path is represented by the ${N}$-tuple

$\displaystyle W^{(N)} = (W_{\Delta t},W_{2\Delta t},\ldots,W_{T}) \in {\mathbb R}^{N}.$

The random variables ${\Delta W_k := W_{k \Delta t} - W_{(k-1) \Delta t}}$ (with the convention ${W_0 = 0}$) are independent centred Gaussian variables with variance ${\Delta t}$, so that the random vector ${W^{(N)}}$ has a density ${\mathop{\mathbb P}^{(N)}}$ with respect to the ${N}$-dimensional Lebesgue measure

$\displaystyle \begin{array}{rcl} \frac{d \mathop{\mathbb P}^{(N)}}{d \lambda^{\textrm{Leb}}}(W^{(N)}) &\propto& \exp\{-\frac{1}{2 \Delta t} \sum_{k=1}^N |W_{k \Delta t} - W_{(k-1) \Delta t}|^2\} \\ &=& \exp\{-\frac{1}{2} \sum_{k=1}^N |\frac{W_{k \Delta t} - W_{(k-1) \Delta t}}{\Delta t}|^2 \, \Delta t\} \\ &=& \exp\{- I^{(N)}(W^{(N)})\}. \end{array}$

The functional ${I^{(N)} := \frac{1}{2} \sum_{k=1}^N |\frac{W_{k \Delta t} - W_{(k-1) \Delta t}}{\Delta t}|^2 \, \Delta t}$ is indeed a discretization of ${I}$. Informally, the Wiener measure has a density proportional to ${\exp\{-I(W)\}}$ with respect to the “infinite dimensional Lebesgue measure”. Taken literally this does not make much sense, because there is no such thing as the infinite dimensional Lebesgue measure: it should be understood as the limiting case ${N \rightarrow \infty}$ of the discretization procedure presented above. Indeed, it is not complete nonsense to say that

$\displaystyle \mathop{\mathbb P}[ \omega = g] \sim e^{-I(g)} \ \ \ \ \ (2)$

because we will see that if ${f,g}$ are two nice functions then

$\displaystyle \lim_{ \epsilon \rightarrow 0} \frac{\mathop{\mathbb P}[ \|W-f\|_{\infty} < \epsilon]}{\mathop{\mathbb P}[ \|W-g\|_{\infty} < \epsilon]} = \frac{e^{-I(f)}}{e^{-I(g)}}.$

It is then very convenient to write

$\displaystyle \mathop{\mathbb P}[\omega \in A] = \int_{A} e^{-I(W)} \, d\lambda(W)$

where ${\lambda}$ is a fictional infinite dimensional Lebesgue measure (i.e. a translation invariant one).
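The discretization of this section can be explored numerically. The following is only a minimal sketch, assuming NumPy is available; the helper names `discrete_action` and `sample_brownian_path`, the test function ${f(t) = \sin t}$ and the seed are illustrative choices, not part of the argument. It evaluates ${I^{(N)}}$ on a smooth path, where it should approach ${I(f) = \frac{1}{4} + \frac{\sin 2}{8}}$ for ${T=1}$, and on a simulated Brownian path, where it should grow roughly like ${N/2}$, which is one way to see why ${I(W)}$ is only a heuristic.

```python
import numpy as np

def discrete_action(values, T):
    """Discretized functional I^(N) for a path sampled at dt, 2*dt, ..., T.

    The path is assumed to start at 0 (convention W_0 = 0).
    """
    n = len(values)
    dt = T / n
    increments = np.diff(values, prepend=0.0)
    return 0.5 * np.sum((increments / dt) ** 2) * dt

def sample_brownian_path(n, T, rng):
    """Values of a Brownian path at dt, 2*dt, ..., T (hypothetical helper)."""
    dt = T / n
    return np.cumsum(rng.normal(0.0, np.sqrt(dt), size=n))

T = 1.0
rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
for n in (100, 1_000, 10_000):
    t = (T / n) * np.arange(1, n + 1)
    smooth = discrete_action(np.sin(t), T)                       # converges
    rough = discrete_action(sample_brownian_path(n, T, rng), T)  # blows up
    print(n, smooth, rough)
```

For the Brownian path, ${I^{(N)}}$ is half a chi-square variable with ${N}$ degrees of freedom, so its mean is exactly ${N/2}$.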

Translations in the Wiener space

As a first illustration of the heuristic ${\mathop{\mathbb P}[\omega = g] \sim e^{-I(g)}}$, let us see how the Wiener measure behaves under translations. If we choose a nice continuous function ${f}$ such that ${I(f)}$ is well defined (i.e. ${\dot{f} \in L^2([0,T])}$), a translated probability measure ${\mathop{\mathbb P}^{f}}$ can be defined through the relation

$\displaystyle \mathop{\mathbb P}^f(A) := \mathop{\mathbb P}(f+\omega \in A). \ \ \ \ \ (3)$

It is not clear a priori that ${\mathop{\mathbb P}^f}$ is absolutely continuous with respect to the Wiener measure ${\mathop{\mathbb P}}$. Of course, we impose that ${f(0)=0}$. For a set ${A \subset S}$, the heuristic says that

$\displaystyle \begin{array}{rcl} \mathop{\mathbb P}^f( \omega \in A) &=& \mathop{\mathbb P}(\omega \in A-f) = \int_{A-f} e^{-I(W)} \, d\lambda(W)\\ &\stackrel{\textrm{(trans. inv.)}}{=}& \int_{A} e^{-I(W-f)} \, d\lambda(W)\\ &=& \int_{A} e^{-\frac{1}{2} \int_0^T |\dot{W}-\dot{f}|^2 \, dt} \, d\lambda(W)\\ &=& \int_{A} e^{\int_0^T \dot{f} \dot{W} \, dt -\frac{1}{2} \int_0^T \dot{f}^2 \, dt} e^{-I(W)}\, d\lambda. \end{array}$

This is why, writing ${\dot{W} \, dt = dW}$, we obtain the following change of probability formula

Proposition 1 Cameron-Martin-Girsanov change of probability formula:

for any continuous function ${f}$ such that ${f(0)=0}$ and ${\dot{f} \in L^2([0,T])}$,

$\displaystyle \frac{d \mathop{\mathbb P}^f}{d \mathop{\mathbb P}}(W) = Z^f(W) := \exp\{ \int_0^T \dot{f} dW_t -\frac{1}{2} \int_0^T \dot{f}^2 \, dt \} \ \ \ \ \ (4)$

This change of probability formula is extremely useful since it is typically much more convenient to work with a Brownian motion ${W_t}$ than with a drifted Brownian motion ${f(t) + W_t}$. In many situations, we can get rid of the annoying stochastic integral ${\int_0^T \dot{f} dW_t}$ by integrating by parts: if ${f}$ is regular enough (${f \in C^3([0,T])}$, say) we have

$\displaystyle \begin{array}{rcl} \frac{d \mathop{\mathbb P}^f}{d \mathop{\mathbb P}}(W) &=& Z^f(W)\\ &=& \exp\{ \dot{f}(T)W(T) - \int_0^T \ddot{f}(t) W(t)\, dt -\frac{1}{2} \int_0^T \dot{f}^2 \, dt\}. \end{array}$
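The change of probability formula can be sanity-checked by simulation. The sketch below is only an illustration under stated assumptions: it uses NumPy, the shift ${f(t) = \sin t}$, the event "the path exceeds level 1 on the time grid", and hypothetical helper names. It compares the probability of the event for the shifted path ${f + W}$ with the expectation of the indicator weighted by a discretized ${Z^f(W)}$; for the discretized path the two quantities coincide exactly in expectation, so they should agree up to Monte Carlo error.

```python
import numpy as np

def cameron_martin_weight(increments, f_increments, dt):
    """Discrete Z^f = exp( sum f_dot * dW - 0.5 * sum f_dot^2 * dt )."""
    f_dot = f_increments / dt
    stochastic_integral = increments @ f_dot     # one value per path
    energy = 0.5 * np.sum(f_dot ** 2) * dt       # discretized I(f)
    return np.exp(stochastic_integral - energy)

def compare_shifted_and_weighted(n_paths=50_000, n_steps=50, T=1.0,
                                 level=1.0, seed=0):
    """Return (P[f + W in A], E[1_A(W) Z^f(W)], mean of Z^f) by Monte Carlo."""
    rng = np.random.default_rng(seed)            # arbitrary seed
    dt = T / n_steps
    t = dt * np.arange(1, n_steps + 1)
    f = np.sin(t)                                # illustrative shift, f(0) = 0
    f_increments = np.diff(f, prepend=0.0)
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    W = np.cumsum(dW, axis=1)
    weights = cameron_martin_weight(dW, f_increments, dt)

    def in_A(paths):
        # Event A: the path exceeds `level` at some grid time.
        return (paths.max(axis=1) > level).astype(float)

    direct = in_A(W + f).mean()
    weighted = (in_A(W) * weights).mean()
    return direct, weighted, weights.mean()

direct, weighted, mean_weight = compare_shifted_and_weighted()
print(direct, weighted, mean_weight)
```

The third returned value estimates ${\mathop{\mathbb E}[Z^f(W)]}$, which should be close to ${1}$ since ${Z^f}$ is a probability density.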

The next section is a straightforward application of this change of probability formula.

Probability to be in an ${\epsilon}$-tube

Suppose that ${f,g}$ are two nice functions (smooth, say): for small ${\epsilon \ll 1}$, what is a good approximation of the quotient

$\displaystyle Q(f,g,\epsilon) = \frac{\mathop{\mathbb P}(\|W-f\|_{\infty}<\epsilon)}{\mathop{\mathbb P}(\|W-g\|_{\infty}<\epsilon)}.$

In words, this basically asks the question: how much more probable is the event ${\{ W \textrm{ looks like } f\}}$ than the event ${\{ W \textrm{ looks like } g\}}$? Of course, since this can also be read as

$\displaystyle Q(f,g,\epsilon) = \frac{Q(f,0,\epsilon)}{Q(g,0,\epsilon)}$

where ${0}$ indicates the function identically equal to zero, it suffices to consider the case ${g=0}$. If we introduce the event

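$\displaystyle A_{\epsilon} = \{ \omega \in S: \|\omega\|_{\infty} < \epsilon\},$

the quotient ${Q(f,0,\epsilon)}$ is equal to ${\frac{\mathop{\mathbb P}^f(A_{\epsilon})}{\mathop{\mathbb P}(A_{\epsilon})}}$: by definition (3), ${\mathop{\mathbb P}^f(A_{\epsilon}) = \mathop{\mathbb P}(\|W+f\|_{\infty} < \epsilon)}$, which equals ${\mathop{\mathbb P}(\|W-f\|_{\infty} < \epsilon)}$ since ${-W}$ is also a Brownian motion. This is why, using the change of probability formula (4),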

$\displaystyle \begin{array}{rcl} Q(f,0,\epsilon) &=& \frac{\mathop{\mathbb P}^f(A_{\epsilon})}{\mathop{\mathbb P}(A_{\epsilon})} = \frac{\mathop{\mathbb E}\left[ 1_{A_{\epsilon}}(W) Z^f(W) \right]}{\mathop{\mathbb E}\left[ 1_{A_{\epsilon}}(W) \right]} \end{array}$

with ${Z^f(W) = \exp\{\dot{f}(T)W(T) - \int_0^T \ddot{f}(t) W(t)\, dt -\frac{1}{2} \int_0^T \dot{f}^2 \, dt\}}$. If ${\|\dot{f}\|_{\infty}, \|\ddot{f}\|_{\infty} \leq C}$, it is clear that for ${W \in A_{\epsilon}}$,

$\displaystyle -(C\epsilon + \epsilon C T) < \dot{f}(T)W(T) - \int_0^T \ddot{f}(t) W(t)\, dt < C\epsilon + \epsilon C T.$

Both sides going to zero when ${\epsilon}$ goes to zero, this is enough to conclude that

$\displaystyle \lim_{\epsilon \rightarrow 0} Q(f,0,\epsilon) = \exp\{-\frac{1}{2} \int_0^T \dot{f}^2 \, dt\} = \exp\{ -I(f)\}. \ \ \ \ \ (5)$

In short, for any two reasonably nice functions ${f,g}$ (for example ${f,g \in C^3([0,T])}$) that satisfy ${f(0)=g(0)=0}$,

$\displaystyle \lim_{\epsilon \rightarrow 0} Q(f,g,\epsilon) = \lim_{\epsilon \rightarrow 0} \frac{\mathop{\mathbb P}(\|W-f\|_{\infty}<\epsilon)}{\mathop{\mathbb P}(\|W-g\|_{\infty}<\epsilon)} = \frac{ \exp\{ -I(f)\}}{\exp\{ -I(g)\} }. \ \ \ \ \ (6)$

Large deviation result

Take a subset ${A}$ of ${S}$ (it might be useful to think of sets like ${A_{f,\epsilon,\alpha} = \{\omega: |\omega(u)-f(u)| < \alpha \, \forall u \in U\}}$). We are interested in the probability that the rescaled (in space) Brownian motion

$\displaystyle W^{(\epsilon)}(t) = \epsilon W(t)$

belongs to ${A}$ when ${\epsilon}$ goes to ${0}$. Typically, if the null function does not belong to (the closure of) ${A}$, the probability ${\mathop{\mathbb P}( W^{(\epsilon)} \in A)}$ is exponentially small. It turns out that if ${A}$ is regular enough

$\displaystyle \ln \mathop{\mathbb P}( \epsilon W \in A) \sim -\frac{1}{\epsilon^2} \inf_{f \in A} I(f) =: -\frac{I(A)}{\epsilon^2}.$

Again, the usual heuristic gives this result in no time if we accept not to be too rigorous:

$\displaystyle \begin{array}{rcl} \epsilon^2 \, \ln \mathop{\mathbb P}( \epsilon W \in A) &=& \epsilon^2 \, \ln \int_{\epsilon W \in A} e^{-I(W)} \, d\lambda \\ &=& \epsilon^2 \, \ln \int_{W \in A} e^{-I(\frac{W}{\epsilon})} \textrm{(Jacobian)}\, d\lambda \\ &=& \epsilon^2 \, \ln \int_{W \in A} e^{-\frac{I(W)}{\epsilon^2}} \textrm{(Jacobian)}\, d\lambda \\ &\stackrel{\epsilon \rightarrow 0}{\rightarrow}& -\inf \{I(f): f \in A\}. \end{array}$

This is very fishy since the Jacobian should behave very badly (actually the measures ${\mathop{\mathbb P}[W \in \cdot]}$ and ${\mathop{\mathbb P}[\epsilon W \in \cdot]}$ are mutually singular for ${0 < \epsilon < 1}$), but all this mess can be made perfectly rigorous. The basic idea is already there, and it can be proved (Freidlin-Wentzell theory) that for any open set ${G}$,

$\displaystyle \liminf_{\epsilon \rightarrow 0} \epsilon^2 \, \ln \mathop{\mathbb P}( \epsilon W \in G) \geq -\inf \{I(f): f \in G\}$

while for any closed set ${F}$,

$\displaystyle \limsup_{\epsilon \rightarrow 0} \epsilon^2 \, \ln \mathop{\mathbb P}( \epsilon W \in F) \leq -\inf \{I(f): f \in F\}.$
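These bounds can be checked by hand on a simple example, chosen here only for illustration: the closed set ${F = \{\omega: \omega(T) \geq a\}}$ with ${a > 0}$. The infimum of ${I}$ over ${F}$ is attained by the straight line ${f(t) = at/T}$ and equals ${\frac{a^2}{2T}}$, while ${\mathop{\mathbb P}(\epsilon W \in F)}$ is an explicit Gaussian tail. The short sketch below (standard library only; the function name and the values ${a = T = 1}$ are assumptions of the example) evaluates ${\epsilon^2 \ln \mathop{\mathbb P}(\epsilon W \in F)}$ for decreasing ${\epsilon}$; the values should approach ${-\frac{a^2}{2T}}$.

```python
import math

def scaled_log_probability(eps, a=1.0, T=1.0):
    """eps^2 * ln P(eps * W_T >= a), using the exact Gaussian tail.

    W_T is N(0, T), so P(eps * W_T >= a) = 0.5 * erfc(a / (eps * sqrt(2 T))).
    """
    prob = 0.5 * math.erfc(a / (eps * math.sqrt(2.0 * T)))
    return eps ** 2 * math.log(prob)

a, T = 1.0, 1.0
rate = a ** 2 / (2.0 * T)  # inf of I over F, attained at f(t) = a t / T
# Smaller eps would underflow erfc in double precision, so we stop at 0.03.
for eps in (0.5, 0.2, 0.1, 0.05, 0.03):
    print(eps, scaled_log_probability(eps, a, T), -rate)
```

The convergence is slow because of the polynomial prefactor in the Gaussian tail, which the logarithmic scale only kills in the limit.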

One cleaner way to prove this is to use the usual Cramér theorem of large deviations for sums of i.i.d. random variables (in a Banach space) and to notice that for ${\epsilon(N) = \frac{1}{\sqrt{N}} }$ we have the equality in law

$\displaystyle W^{(\epsilon(N))} \stackrel{\textrm{(law)}}{=} \frac{W_1+W_2+\ldots+W_N}{N}$

where ${(W_i:i=1,2,\ldots,N)}$ are independent standard Brownian motions. Cramér's theorem states that

$\displaystyle \mathop{\mathbb P}[ \frac{W_1+W_2+\ldots+W_N}{N} \in A] \sim \exp\{-N \inf\{I(f): f \in A\} \}$

with

$\displaystyle I(f) = \sup\{ \int_{0}^T f(t)g(t)\, dt - \ln \, \mathop{\mathbb E} e^{ \int_0^T g(t) W(t) \, dt } : g \in L^2([0,T])\}.$

It is not very hard to see that the supremum is indeed ${\frac{1}{2} \int_0^T \dot{f}^2 \, dt}$.
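A discrete version of this last claim can be verified numerically. The sketch below is a hedged illustration, assuming NumPy and a Riemann-sum discretization on the grid ${\Delta t, 2\Delta t, \ldots, T}$; the function names and the test function ${f(t) = \sin t}$ are choices made for the example. Since ${\int_0^T g W \, dt}$ is a centred Gaussian variable, ${\ln \mathop{\mathbb E} e^{\int_0^T g W \, dt}}$ is half its variance, which on the grid reads ${\frac{1}{2} \Delta t^2 \, g^{\top} K g}$ with ${K_{ij} = \min(t_i, t_j)}$. The supremum of the resulting concave quadratic is then explicit and should coincide with the discretized functional ${I^{(N)}(f)}$.

```python
import numpy as np

def discrete_action(values, T):
    """I^(N)(f) = 0.5 * sum |df / dt|^2 * dt, with the convention f(0) = 0."""
    n = len(values)
    dt = T / n
    increments = np.diff(values, prepend=0.0)
    return 0.5 * np.sum((increments / dt) ** 2) * dt

def legendre_rate(f_values, T):
    """sup over g of  dt * g.f - 0.5 * dt^2 * g.K.g  on the time grid."""
    n = len(f_values)
    dt = T / n
    t = dt * np.arange(1, n + 1)
    K = np.minimum.outer(t, t)   # Cov(W_s, W_t) = min(s, t)
    # First-order condition of the concave quadratic: dt * f = dt^2 * K g.
    g_star = np.linalg.solve(K, f_values) / dt
    return dt * (g_star @ f_values) - 0.5 * dt ** 2 * (g_star @ K @ g_star)

T, n = 1.0, 200
t = (T / n) * np.arange(1, n + 1)
f = np.sin(t)                    # illustrative smooth path with f(0) = 0
print(legendre_rate(f, T), discrete_action(f, T))
```

The agreement is exact up to rounding because the inverse of the matrix ${(\min(i,j))_{i,j}}$ is the tridiagonal second-difference matrix, so that ${\frac{1}{2} f^{\top} K^{-1} f = \frac{1}{2} \sum_k \frac{(f_k - f_{k-1})^2}{\Delta t}}$.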