On the Wiener space

I would like to discuss in this post the importance of the heuristic functional

\displaystyle I(W) := \frac{1}{2} \int_0^T |\frac{d}{dt} W|^2 \, dt \ \ \ \ \ (1)


that often shows up when doing analysis on the Wiener space {S = (C([0,T], {\mathbb R}), \|\cdot\|_{\infty})}. An element of the Wiener space is traditionally denoted by {\omega}: this is a continuous function and {\omega(t)} is its value at time {t \in [0,T]}. For a (nice) subset {A} of {S}, the Wiener measure of {A} is nothing but the probability that a Brownian path belongs to {A}. Having said that, the quantity {I(W)} hardly makes sense since a Brownian path {(W_t: t \in [0,T])} is (almost surely) nowhere differentiable; still, this is a very useful heuristic in many situations. A review of probability can be found on the excellent blog of Terry Tao.

Where does it come from?

As is often the case, it is very instructive to go back to the discrete setting. Consider a time interval {[0,T]} and a discretization parameter {\Delta t = \frac{T}{N}}: a discrete Brownian path is represented by the {N}-tuple

\displaystyle W^{(N)} = (W_{\Delta t},W_{2\Delta t},\ldots,W_{T}) \in {\mathbb R}^{N}.

The random variables {\Delta W_k := W_{k \Delta t} - W_{(k-1) \Delta t}} (with {W_0 = 0}) are independent centred Gaussian variables with variance {\Delta t}, so that the random vector {W^{(N)}} has a density {\mathop{\mathbb P}^{(N)}} with respect to the {N}-dimensional Lebesgue measure

\displaystyle \begin{array}{rcl} \frac{d \mathop{\mathbb P}^{(N)}}{d \lambda^{\textrm{Leb}}}(W^{(N)}) &\propto& \exp\{-\frac{1}{2 \Delta t} \sum_{k=1}^N |W_{k \Delta t} - W_{(k-1) \Delta t}|^2\} \\ &=& \exp\{-\frac{1}{2} \sum_{k=1}^N |\frac{W_{k \Delta t} - W_{(k-1) \Delta t}}{\Delta t}|^2 \, \Delta t\} \\ &=& \exp\{- I^{(N)}(W^{(N)})\}. \end{array}

The functional {I^{(N)}(W^{(N)}) := \frac{1}{2} \sum_{k=1}^N |\frac{W_{k \Delta t} - W_{(k-1) \Delta t}}{\Delta t}|^2 \, \Delta t} is indeed a discretization of {I}. Informally, the Wiener measure has a density proportional to {\exp\{-I(W)\}} with respect to the “infinite dimensional Lebesgue measure”. Taken literally this does not make sense, because there is no such thing as the infinite dimensional Lebesgue measure; it should be understood as the limiting case {N \rightarrow \infty} of the discretization procedure presented above. Still, it is not complete nonsense to say that

\displaystyle \mathop{\mathbb P}[ \omega = g] \sim e^{-I(g)}. \ \ \ \ \ (2)


because we will see that if {f,g} are two nice functions then

\displaystyle \lim_{ \epsilon \rightarrow 0} \frac{\mathop{\mathbb P}[ \|W-f\|_{\infty} < \epsilon]}{\mathop{\mathbb P}[ \|W-g\|_{\infty} < \epsilon]} = \frac{e^{-I(f)}}{e^{-I(g)}}.

It is then very convenient to write

\displaystyle \mathop{\mathbb P}[\omega \in A] = \int_{A} e^{-I(W)} \, d\lambda(W)

where {\lambda} is a fictional infinite dimensional Lebesgue measure (i.e. a translation invariant one).
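To make the discretization concrete, here is a minimal Python sketch that evaluates {I^{(N)}} on a smooth path and on a simulated Brownian path. The function names (`discrete_action`, `brownian_path`, `smooth_path`) and the choice of the test path {t \mapsto \sin(t)} are illustrative assumptions, and the paths are stored with the starting point {W_0 = 0} included, unlike the {N}-tuple above.

```python
import math
import random


def discrete_action(path, T):
    """I^(N) = 1/2 * sum_k |(W_k - W_{k-1}) / dt|^2 * dt for path = (W_0, ..., W_N)."""
    n = len(path) - 1
    dt = T / n
    return 0.5 * sum((path[k] - path[k - 1]) ** 2 for k in range(1, n + 1)) / dt


def brownian_path(n, T, rng):
    """Discrete Brownian path: independent N(0, dt) increments, started at 0."""
    dt = T / n
    path = [0.0]
    for _ in range(n):
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return path


def smooth_path(n, T):
    """Illustrative smooth path f(t) = sin(t) sampled on the same grid."""
    return [math.sin(k * T / n) for k in range(n + 1)]


rng = random.Random(0)
T = 1.0
for n in (10, 100, 1000, 10000):
    # Columns: N, I^(N) of the smooth path, I^(N) of a Brownian path.
    print(n, discrete_action(smooth_path(n, T), T),
          discrete_action(brownian_path(n, T, rng), T))
```

For the smooth path the values settle near {\frac{1}{2}\int_0^1 \cos^2(t) \, dt}, whereas for the Brownian path {I^{(N)}} has expectation {N/2} and grows without bound, which is the numerical face of the fact that {I(W)} does not make sense for a Brownian path.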

Translations in the Wiener space

As a first illustration of the heuristic {\mathop{\mathbb P}[\omega = g] \sim e^{-I(g)}}, let us see how the Wiener measure behaves under translations. If we choose a nice continuous function {f} such that {I(f)} is well defined (i.e. {\dot{f} \in L^2([0,T])}), a translated probability measure {\mathop{\mathbb P}^{f}} can be defined through the relation

\displaystyle \mathop{\mathbb P}^f(A) := \mathop{\mathbb P}(f+\omega \in A). \ \ \ \ \ (3)


Of course, we impose that {f(0)=0}. It is not clear a priori that {\mathop{\mathbb P}^f} is absolutely continuous with respect to the Wiener measure {\mathop{\mathbb P}}. For a set {A \subset S}, the heuristic says that

\displaystyle \begin{array}{rcl} \mathop{\mathbb P}^f( \omega \in A) &=& \mathop{\mathbb P}(\omega \in A-f) = \int_{A-f} e^{-I(W)} \, d\lambda(W)\\ &\stackrel{\textrm{(trans. inv.)}}{=}& \int_{A} e^{-I(W-f)} \, d\lambda(W)\\ &=& \int_{A} e^{-\frac{1}{2} \int_0^T |\dot{W}-\dot{f}|^2 \, dt} \, d\lambda(W)\\ &=& \int_{A} e^{\int_0^T \dot{f} \dot{W} \, dt -\frac{1}{2} \int_0^T \dot{f}^2 \, dt} e^{-I(W)}\, d\lambda. \end{array}

This is why, writing {\dot{W} \, dt = dW}, we obtain the following change of probability formula

Proposition 1 Cameron-Martin-Girsanov change of probability formula:

for any continuous function {f} such that {f(0)=0} and {\dot{f} \in L^2([0,T])},

\displaystyle \frac{d \mathop{\mathbb P}^f}{d \mathop{\mathbb P}}(W) = Z^f(W) := \exp\{ \int_0^T \dot{f} dW_t -\frac{1}{2} \int_0^T \dot{f}^2 \, dt \} \ \ \ \ \ (4)


This change of probability formula is extremely useful since it is typically much more convenient to work with a Brownian motion {W_t} than with a drifted Brownian motion {f(t) + W_t}. In many situations, we can get rid of the annoying stochastic integral {\int_0^T \dot{f} dW_t} by integrating by parts: if {f} is regular enough ({f \in C^3([0,T])}, say) we have

\displaystyle \begin{array}{rcl} \frac{d \mathop{\mathbb P}^f}{d \mathop{\mathbb P}}(W) &=& Z^f(W)\\ &=& \exp\{ \dot{f}(T)W(T) - \int_0^T \ddot{f}(t) W(t)\, dt -\frac{1}{2} \int_0^T \dot{f}^2 \, dt\}. \end{array}
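The change of probability formula can be sanity-checked by simulation. The sketch below compares {\mathop{\mathbb E}[\varphi(W+f)]} with {\mathop{\mathbb E}[\varphi(W) Z^f(W)]} on a time grid, using the discrete analogue of {Z^f} obtained from the Gaussian density of the increments. The function name `cameron_martin_check`, the drift {f(t) = t^2/2}, the functional {\varphi(\omega) = \max_t \omega(t)} and all numerical parameters are assumptions chosen for illustration only.

```python
import math
import random


def cameron_martin_check(f, phi, T=1.0, n_steps=50, n_paths=20000, seed=0):
    """Monte Carlo estimates of E[phi(W + f)] and E[phi(W) * Z^f(W)] on a grid."""
    rng = random.Random(seed)
    dt = T / n_steps
    sd = math.sqrt(dt)
    f_grid = [f(k * dt) for k in range(n_steps + 1)]
    df = [f_grid[k] - f_grid[k - 1] for k in range(1, n_steps + 1)]
    # Discrete version of (1/2) * int_0^T f'(t)^2 dt.
    half_energy = 0.5 * sum(d * d for d in df) / dt
    shifted_sum = 0.0
    weighted_sum = 0.0
    for _ in range(n_paths):
        w = [0.0]
        stoch_int = 0.0  # discrete version of int_0^T f'(t) dW_t
        for k in range(n_steps):
            dw = rng.gauss(0.0, sd)
            stoch_int += (df[k] / dt) * dw
            w.append(w[-1] + dw)
        z = math.exp(stoch_int - half_energy)
        shifted_sum += phi([wk + fk for wk, fk in zip(w, f_grid)])
        weighted_sum += phi(w) * z
    return shifted_sum / n_paths, weighted_sum / n_paths


# Drift f(t) = t^2 / 2 (so f(0) = 0) and phi = running maximum of the path.
lhs, rhs = cameron_martin_check(lambda t: 0.5 * t * t, max)
print(lhs, rhs)
```

The two printed numbers should agree up to Monte Carlo error. For the discretized path the identity is exact, so no discretization bias separates the two estimates.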


The next section is a straightforward application of this change of probability formula.

Probability to be in an {\epsilon}-tube

Suppose that {f,g} are two nice functions (smooth, say): for small {\epsilon \ll 1}, what is a good approximation of the quotient

\displaystyle Q(f,g,\epsilon) = \frac{\mathop{\mathbb P}(\|W-f\|_{\infty}<\epsilon)}{\mathop{\mathbb P}(\|W-g\|_{\infty}<\epsilon)}.

In words, this basically asks the question: how much more probable is the event {\{ W \textrm{ looks like } f\}} than the event {\{ W \textrm{ looks like } g\}}? Of course, since this can also be read as

\displaystyle Q(f,g,\epsilon) = \frac{Q(f,0,\epsilon)}{Q(g,0,\epsilon)}

where {0} indicates the function identically equal to zero, it suffices to consider the case {g=0}. If we introduce the event

\displaystyle A_{\epsilon} = \{ \omega \in S: \|\omega\|_{\infty} < \epsilon\},

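the quotient {Q(f,0,\epsilon)} is equal to {\frac{\mathop{\mathbb P}^f(A_{\epsilon})}{\mathop{\mathbb P}(A_{\epsilon})}}: indeed, since {-W} is also a Brownian motion, {\mathop{\mathbb P}(\|W-f\|_{\infty}<\epsilon) = \mathop{\mathbb P}(\|W+f\|_{\infty}<\epsilon) = \mathop{\mathbb P}^f(A_{\epsilon})}. This is why, using the change of probability formula (4),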

\displaystyle \begin{array}{rcl} Q(f,0,\epsilon) &=& \frac{\mathop{\mathbb P}^f(A_{\epsilon})}{\mathop{\mathbb P}(A_{\epsilon})} = \frac{\mathop{\mathbb E}\left[ 1_{A_{\epsilon}}(W) Z^f(W) \right]}{\mathop{\mathbb E}\left[ 1_{A_{\epsilon}}(W) \right]} \end{array}

with {Z^f(W) = \exp\{\dot{f}(T)W(T) - \int_0^T \ddot{f}(t) W(t)\, dt -\frac{1}{2} \int_0^T \dot{f}^2 \, dt\}}. If {\|\dot{f}\|_{\infty}, \|\ddot{f}\|_{\infty} \leq C}, it is clear that for {W \in A_{\epsilon}},

\displaystyle -(C\epsilon + \epsilon C T) < \dot{f}(T)W(T) - \int_0^T \ddot{f}(t) W(t)\, dt < C\epsilon + \epsilon C T.

Both sides going to zero when {\epsilon} goes to zero, this is enough to conclude that

\displaystyle \lim_{\epsilon \rightarrow 0} Q(f,0,\epsilon) = \exp\{-\frac{1}{2} \int_0^T \dot{f}^2 \, dt\} = \exp\{ -I(f)\}. \ \ \ \ \ (5)

In short, for any two reasonably nice functions {f,g} (for example {f,g \in C^3([0,T])}) that satisfy {f(0)=g(0)=0},

\displaystyle \lim_{\epsilon \rightarrow 0} Q(f,g,\epsilon) = \lim_{\epsilon \rightarrow 0} \frac{\mathop{\mathbb P}(\|W-f\|_{\infty}<\epsilon)}{\mathop{\mathbb P}(\|W-g\|_{\infty}<\epsilon)} = \frac{ \exp\{ -I(f)\}}{\exp\{ -I(g)\} }. \ \ \ \ \ (6)

Large deviation result

Take a subset {A} of {S} (it might be useful to think of sets like {A_{f,U,\alpha} = \{\omega: |\omega(u)-f(u)| < \alpha \, \forall u \in U\}} for some {U \subset [0,T]}). We are interested in the probability that the rescaled (in space) Brownian motion

\displaystyle W^{(\epsilon)}(t) = \epsilon W(t)

belongs to {A} when {\epsilon} goes to {0}. Typically, if the null function does not belong to (the closure of) {A}, the probability {\mathop{\mathbb P}( W^{(\epsilon)} \in A)} is exponentially small. It turns out that if {A} is regular enough

\displaystyle \ln \mathop{\mathbb P}( \epsilon W \in A) \sim -\frac{1}{\epsilon^2} \inf_{f \in A} I(f) =: -\frac{I(A)}{\epsilon^2}.

Again, the usual heuristic gives this result in no time if we accept not to be too rigorous:

\displaystyle \begin{array}{rcl} \epsilon^2 \, \ln \mathop{\mathbb P}( \epsilon W \in A) &=& \epsilon^2 \, \ln \int_{\epsilon W \in A} e^{-I(W)} \, d\lambda \\ &=& \epsilon^2 \, \ln \int_{W \in A} e^{-I(\frac{W}{\epsilon})} \textrm{(Jacobian)}\, d\lambda \\ &=& \epsilon^2 \, \ln \int_{W \in A} e^{-\frac{I(W)}{\epsilon^2}} \textrm{(Jacobian)}\, d\lambda \\ &\stackrel{\epsilon \rightarrow 0}{\rightarrow}& -\inf \{I(f): f \in A\}. \end{array}

This is very fishy since the Jacobian should behave very badly (in fact, for {\epsilon \neq 1} the measures {\mathop{\mathbb P}[W \in \cdot]} and {\mathop{\mathbb P}[\epsilon W \in \cdot]} are mutually singular). Nevertheless, the basic idea is there and the argument can be made perfectly rigorous: it can be proved (Schilder's theorem, a basic case of Freidlin-Wentzell theory) that for any open set {G},

\displaystyle \liminf_{\epsilon \rightarrow 0} \, \epsilon^2 \, \ln \mathop{\mathbb P}( \epsilon W \in G) \geq -\inf \{I(f): f \in G\}

while for any closed set {F},

\displaystyle \limsup_{\epsilon \rightarrow 0} \, \epsilon^2 \, \ln \mathop{\mathbb P}( \epsilon W \in F) \leq -\inf \{I(f): f \in F\}.
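These bounds can be checked by hand on a set for which everything is explicit. As an illustrative assumption (not an example taken from the text above), take {A = \{\omega: \omega(T) \geq a\}} with {a > 0}: since {W_T} is a centred Gaussian with variance {T}, the probability is a Gaussian tail, and the infimum of {I} over paths with {f(0)=0} and {f(T) \geq a} equals {\frac{a^2}{2T}}, attained by the straight line {f(t) = at/T}. The function name `scaled_log_prob` is hypothetical.

```python
import math


def scaled_log_prob(eps, a=1.0, T=1.0):
    """eps^2 * ln P(eps * W_T >= a), using the fact that W_T ~ N(0, T)."""
    # Gaussian tail: P(N(0,1) >= x) = erfc(x / sqrt(2)) / 2 with x = a / (eps * sqrt(T)).
    tail = 0.5 * math.erfc(a / (eps * math.sqrt(2.0 * T)))
    return eps ** 2 * math.log(tail)


a, T = 1.0, 1.0
rate = a ** 2 / (2.0 * T)  # inf of I over {f(0) = 0, f(T) >= a}
for eps in (1.0, 0.5, 0.2, 0.1, 0.05):
    # Columns: eps, eps^2 * ln P(eps W in A), and the predicted limit -inf I.
    print(eps, scaled_log_prob(eps, a, T), -rate)
```

The middle column approaches {-\frac{a^2}{2T}} as {\epsilon} decreases. Much smaller values of {\epsilon} would make the tail probability underflow in double precision, so the sketch stops at {\epsilon = 0.05}.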

One cleaner way to prove this is to use the usual Cramér theorem of large deviations for sums of i.i.d. random variables (in a Banach space) and to notice that for {\epsilon(N) = \frac{1}{\sqrt{N}}} we have the equality in law

\displaystyle W^{(\epsilon(N))} \stackrel{\textrm{(law)}}{=} \frac{W_1+W_2+\ldots+W_N}{N}

where {(W_i:i=1,2,\ldots,N)} are independent standard Brownian motions. Cramér's theorem states that

\displaystyle \mathop{\mathbb P}[ \frac{W_1+W_2+\ldots+W_N}{N} \in A] \sim \exp\{-N \inf\{I(f): f \in A\} \}

where the rate function {I} is given by

\displaystyle I(f) = \sup\{ \int_{0}^T f(t)g(t)\, dt - \ln \, \mathop{\mathbb E} e^{ \int_0^T g(t) W(t) \, dt } : g \in L^2([0,T])\}.

It is not very hard to see that the supremum is indeed {\frac{1}{2} \int_0^T \dot{f}^2 \, dt}.
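The computation behind this last claim can be sketched as follows, assuming that {f} is smooth with {f(0)=0} and writing {G(t) := \int_t^T g(s)\, ds} (a notation introduced here for convenience only). An integration by parts turns {\int_0^T g W \, dt} into the Wiener integral {\int_0^T G \, dW_t}, which is a centred Gaussian variable with variance {\int_0^T G^2 \, dt}, and the rest is completing the square:

```latex
\begin{aligned}
\int_0^T g(t) W(t)\, dt &= \int_0^T G(t)\, dW_t
  \quad\Longrightarrow\quad
  \ln \mathbb{E}\, e^{\int_0^T g(t) W(t)\, dt} = \frac{1}{2} \int_0^T G(t)^2 \, dt, \\
\int_0^T f(t) g(t)\, dt &= \int_0^T \dot{f}(t) G(t)\, dt
  \qquad (\text{since } f(0)=0 \text{ and } G(T)=0), \\
\int_0^T \dot{f} G \, dt - \frac{1}{2} \int_0^T G^2 \, dt
  &= \frac{1}{2} \int_0^T \dot{f}^2 \, dt - \frac{1}{2} \int_0^T |G - \dot{f}|^2 \, dt
  \;\leq\; \frac{1}{2} \int_0^T \dot{f}^2 \, dt .
\end{aligned}
```

The upper bound is approached by choosing {g} such that {G} is close to {\dot{f}} in {L^2([0,T])} (exactly {g = -\ddot{f}} when {\dot{f}(T) = 0}), which gives the value of the supremum.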