Sheet 9

Prof. Leif Döring, Felix Benning
Course: Wahrscheinlichkeitstheorie 1
Semester: FSS 2022
Tutorial date: 02.05.2022
Due: Monday, 02.05.2022, 10:15 in the exercise
Exercise 1 (Calculating Moments).

Using the characteristic function, calculate the first three moments of

  (i)

    $\mathcal{N}(\mu,\sigma^2)$

    Solution.

    We have $\varphi_{\mathcal{N}(\mu,\sigma^2)}(t)=\exp(i\mu t-\frac{\sigma^2}{2}t^2)$ by Example 7.4.12 of the lecture, so by Theorem 7.6.1

    \begin{align*}
    i\mathbb{E}[X] &= \varphi_{\mathcal{N}(\mu,\sigma^2)}'(t)\Big|_{t=0} = \exp\Bigl(i\mu t-\frac{\sigma^2}{2}t^2\Bigr)\bigl[i\mu-\sigma^2t\bigr]\Big|_{t=0} = i\mu \\
    i^2\mathbb{E}[X^2] &= \varphi_{\mathcal{N}(\mu,\sigma^2)}''(t)\Big|_{t=0} = \exp\Bigl(i\mu t-\frac{\sigma^2}{2}t^2\Bigr)\Bigl(\bigl[i\mu-\sigma^2t\bigr]^2-\sigma^2\Bigr)\Big|_{t=0} = i^2(\sigma^2+\mu^2) \\
    i^3\mathbb{E}[X^3] &= \varphi_{\mathcal{N}(\mu,\sigma^2)}'''(t)\Big|_{t=0} = \exp\Bigl(i\mu t-\frac{\sigma^2}{2}t^2\Bigr)\Bigl(\bigl[i\mu-\sigma^2t\bigr]^3-\sigma^2\bigl[i\mu-\sigma^2t\bigr]-2\sigma^2\bigl[i\mu-\sigma^2t\bigr]\Bigr)\Big|_{t=0} \\
    &= i^3(\mu^3+3\sigma^2\mu)
    \end{align*}
  (ii)

    $\operatorname{Exp}(\lambda)$

    Solution.

    We have $\varphi_{\operatorname{Exp}(\lambda)}(t)=\frac{\lambda}{\lambda-it}$ by Example 7.4.12 of the lecture, so by Theorem 7.6.1

    \begin{align*}
    i\mathbb{E}[X] &= \varphi_{\operatorname{Exp}(\lambda)}'(t)\Big|_{t=0} = i\lambda(\lambda-it)^{-2}\Big|_{t=0} = i\,\frac{1}{\lambda} \\
    i^2\mathbb{E}[X^2] &= \varphi_{\operatorname{Exp}(\lambda)}''(t)\Big|_{t=0} = 2i^2\lambda(\lambda-it)^{-3}\Big|_{t=0} = i^2\,\frac{2}{\lambda^2} \\
    i^3\mathbb{E}[X^3] &= \varphi_{\operatorname{Exp}(\lambda)}'''(t)\Big|_{t=0} = 3!\,i^3\lambda(\lambda-it)^{-4}\Big|_{t=0} = i^3\,\frac{3!}{\lambda^3}
    \end{align*}
  (iii)

    $\operatorname{Poi}(\lambda)$

    Solution.

    We have $\varphi_{\operatorname{Poi}(\lambda)}(t)=e^{\lambda(e^{it}-1)}$ by Example 7.4.10 of the lecture, so by Theorem 7.6.1

    \begin{align*}
    i\mathbb{E}[X] &= \varphi_{\operatorname{Poi}(\lambda)}'(t)\Big|_{t=0} = e^{\lambda(e^{it}-1)}\lambda e^{it}i\Big|_{t=0} = i\lambda \\
    i^2\mathbb{E}[X^2] &= \varphi_{\operatorname{Poi}(\lambda)}''(t)\Big|_{t=0} = e^{\lambda(e^{it}-1)}(\lambda e^{it}i)^2 + e^{\lambda(e^{it}-1)}\lambda e^{it}i^2\Big|_{t=0} = i^2(\lambda^2+\lambda) \\
    i^3\mathbb{E}[X^3] &= \varphi_{\operatorname{Poi}(\lambda)}'''(t)\Big|_{t=0} = e^{\lambda(e^{it}-1)}\bigl[(\lambda e^{it}i)^3 + 2(\lambda e^{it})^2i^3 + (\lambda e^{it})^2i^3 + \lambda e^{it}i^3\bigr]\Big|_{t=0} \\
    &= i^3(\lambda^3+3\lambda^2+\lambda)
    \end{align*}
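These three computations can be cross-checked symbolically. The following is a minimal sketch (not part of the sheet, assuming sympy is available) that recovers the moments from $\mathbb{E}[X^k]=\varphi^{(k)}(0)/i^k$ as in Theorem 7.6.1:

```python
# Sanity check: recover the first three moments by differentiating
# the characteristic functions symbolically.
import sympy as sp

t = sp.symbols("t", real=True)
mu = sp.symbols("mu", real=True)
sigma, lam = sp.symbols("sigma lambda", positive=True)

def moments(phi, n=3):
    """First n moments from the characteristic function phi(t)."""
    return [sp.simplify(sp.diff(phi, t, k).subs(t, 0) / sp.I**k) for k in range(1, n + 1)]

print(moments(sp.exp(sp.I * mu * t - sigma**2 / 2 * t**2)))  # normal: mu, mu^2+sigma^2, mu^3+3*sigma^2*mu
print(moments(lam / (lam - sp.I * t)))                       # exponential: 1/lam, 2/lam^2, 6/lam^3
print(moments(sp.exp(lam * (sp.exp(sp.I * t) - 1))))         # Poisson: lam, lam^2+lam, lam^3+3*lam^2+lam
```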
Exercise 2 (Gaussian Vectors 2).

In Sheet 8 Exercise 4 we defined Gaussian vectors a little differently than in the lecture. Here we will show that these two definitions are equivalent.

So let $X=\mu+BY$ be a Gaussian vector as defined in the lecture, with $Y_i$ independent $\mathcal{N}(0,1)$. Since we have talked about centered Gaussian vectors before, let us first deal with the expectation.

  (i)

    Prove that $\mathbb{E}[X]=\mu$, i.e. $\mathbb{E}[X_k]=\mu_k$.

    Solution.

    Let $B_k$ be the $k$-th row of $B$. Then

    \[
    \mathbb{E}[X_k] = \mathbb{E}[\mu_k + \langle B_k, Y\rangle] = \mu_k + \sum_{i=1}^d B_{ki}\underbrace{\mathbb{E}[Y_i]}_{=0} = \mu_k
    \]
  (ii)

    Prove that for any random vector $X$ (not just Gaussian) the characteristic function of $X+\mu$ is simply

    \[
    \varphi_{X+\mu}(t) = e^{i\langle t,\mu\rangle}\varphi_X(t).
    \]

    Solution.

    We can interpret $\mu$ as a (constant) random variable and use Lemma 7.4.9 (iv): the characteristic function of a sum of independent random variables is the product of the individual characteristic functions (and a constant is independent of any random variable). So we only need to calculate the characteristic function of a constant:

    \[
    \varphi_{\delta_\mu}(t) = \int e^{i\langle t,x\rangle}\,d\delta_\mu(x) = e^{i\langle t,\mu\rangle}.
    \]

Now let us continue on to the (co-)variance and characteristic function.

  (iii)

    Prove that $\operatorname{Cov}(X_i,X_j)=\Sigma_{ij}$ for $\Sigma := BB^T$.

    Solution.

    We can assume $\mu=0$ without loss of generality, since we subtract $\mu$ from $X$ in the covariance anyway, so

    \[
    \operatorname{Cov}(X_i,X_j) = \mathbb{E}[X_iX_j] = \mathbb{E}[\langle B_i, Y\rangle\langle Y, B_j\rangle] = B_i\underbrace{\mathbb{E}[YY^T]}_{=\mathbb{1}_{d\times d}}B_j^T = \Sigma_{ij}
    \]
  (iv)

    Prove that $\langle t,X\rangle \sim \mathcal{N}(\langle t,\mu\rangle, t^T\Sigma t)$. (In particular, every linear combination is a Gaussian random variable.)

    Solution.

    We have

    \[
    \langle t,X\rangle = \langle t,\mu\rangle + \langle t,BY\rangle = \langle t,\mu\rangle + \langle B^Tt, Y\rangle.
    \]

    The first part is a constant, while the second part is a linear combination of independent centered normal random variables. Therefore it is normal with expected value $\langle t,\mu\rangle$, and the variance is

    \[
    \mathbb{E}[\langle B^Tt, Y\rangle^2] = \mathbb{E}[t^TBYY^TB^Tt] = t^TB\,\mathbb{1}_{d\times d}\,B^Tt = t^T\Sigma t.
    \]
  (v)

    Show that the characteristic function of $X$ is equal to

    \[
    \varphi_X(t) = \exp\Bigl(i\langle t,\mu\rangle - \frac12\langle t,\Sigma t\rangle\Bigr).
    \]

    Solution.

    Since we know the distribution of $\langle t,X\rangle$ and the characteristic function of normally distributed random variables in one dimension, we have

    \[
    \varphi_X(t) = \mathbb{E}[e^{i\langle t,X\rangle}] = \varphi_{\langle t,X\rangle}(1) = \exp\Bigl(i\langle t,\mu\rangle - \frac12\langle t,\Sigma t\rangle\Bigr).
    \]

Due to the uniqueness of characteristic functions we have now shown that the two definitions are equivalent. Lastly,

  (vi)

    prove that the density of $X$ is given by

    \[
    f_X(x) = \frac{1}{(2\pi)^{d/2}\sqrt{\det(\Sigma)}}\exp\Bigl(-\frac12\langle x-\mu,\Sigma^{-1}(x-\mu)\rangle\Bigr), \qquad x\in\mathbb{R}^d.
    \]

    You may assume $B$ to be invertible.

    Hint.

    Transformation Theorem.

    Solution.

    Let $g$ be any measurable function. Then

    \begin{align*}
    \mathbb{E}[g(X)] &= \mathbb{E}[g(\mu+BY)] = \frac{1}{(2\pi)^{d/2}}\int g(\mu+By)\prod_{i=1}^d\exp\Bigl(-\frac{y_i^2}{2}\Bigr)\,dy \\
    &= \frac{1}{(2\pi)^{d/2}}\int g(\mu+By)\exp\Bigl(-\frac{y^Ty}{2}\Bigr)\,dy \\
    &= \frac{1}{(2\pi)^{d/2}}\int g(\mu+By)\exp\Bigl(-\frac{y^TB^T(B^{-1})^TB^{-1}By}{2}\Bigr)\,dy \\
    &= \frac{1}{(2\pi)^{d/2}}\int g(\mu+By)\exp\Bigl(-\frac{(By)^T\Sigma^{-1}(By)}{2}\Bigr)\,dy \\
    &= \frac{1}{(2\pi)^{d/2}}\int g(x)\exp\Bigl(-\frac{(x-\mu)^T\Sigma^{-1}(x-\mu)}{2}\Bigr)\frac{1}{\sqrt{\det(\Sigma)}}\,dx,
    \end{align*}

    where the last step is the substitution $x=\mu+By$ with $dx=|\det B|\,dy$ and $|\det B|=\sqrt{\det(\Sigma)}$. Since $g$ was arbitrary, the claimed density follows.
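To make (i) and (iii) concrete, here is a small simulation sketch (not part of the sheet, assuming plain numpy): it samples $X=\mu+BY$ and compares the empirical mean and covariance with $\mu$ and $\Sigma=BB^T$.

```python
# Sample X = mu + B Y with i.i.d. standard normal Y_i and check
# E[X] ~ mu and Cov(X) ~ Sigma = B B^T empirically.
import numpy as np

rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0])
B = np.array([[2.0, 0.0],
              [1.0, 3.0]])

Y = rng.standard_normal((2, 1_000_000))  # columns are i.i.d. samples of Y
X = mu[:, None] + B @ Y

print(X.mean(axis=1))  # ~ mu = [1, -2]
print(np.cov(X))       # ~ Sigma = B @ B.T = [[4, 2], [2, 10]]
```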
Exercise 3 (Binomial against Poisson).

Let $(p_n)$ be a sequence of real numbers in $[0,1]$. Suppose that $\lim_{n\to\infty}np_n=\lambda>0$. Using Lévy's continuity theorem, show that

\[
\lim_{n\to\infty}\binom{n}{k}p_n^k(1-p_n)^{n-k} = e^{-\lambda}\frac{\lambda^k}{k!}, \qquad 0\le k\le n.
\]
Solution.

From Example 7.4.10 we know the characteristic functions of binomial and Poisson random variables. So let $X_n\sim\operatorname{Bin}(n,p_n)$ and $X\sim\operatorname{Poi}(\lambda)$; then

\[
\varphi_{X_n}(t) = (1-p_n+p_ne^{it})^n = \Bigl(1+\frac{np_ne^{it}-p_nn}{n}\Bigr)^n \longrightarrow \exp(\lambda e^{it}-\lambda) = \varphi_X(t)
\]

as $np_n\to\lambda$ for $n\to\infty$. Hence, by Lévy's continuity theorem 7.5.3, $X_n$ converges in distribution to $X$. For

\[
\lim_{n\to\infty}\mathbb{P}(X_n=k) = \mathbb{P}(X=k),
\]

we want to apply the Portmanteau theorem. But (v) is not applicable, because the boundary of the point set $\{k\}$ generally does not have probability zero. Instead we view the distributions of $X_n$ and $X$ as probability measures on $\mathbb{N}_0$. Using balls of radius $1/2$ it becomes obvious that every subset of this metric space is open, and therefore every subset is also closed. This allows us to use (iii) and (iv) of the Portmanteau theorem for the same result, finishing our proof. ∎
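A quick numerical illustration of this convergence (a sketch assuming numpy and scipy are available, not part of the sheet):

```python
# The Bin(n, p_n) pmf approaches the Poi(lambda) pmf when n * p_n -> lambda.
import numpy as np
from scipy import stats

lam = 3.0
k = np.arange(8)
for n in (10, 100, 10_000):
    p_n = lam / n  # so that n * p_n = lambda
    gap = np.max(np.abs(stats.binom.pmf(k, n, p_n) - stats.poisson.pmf(k, lam)))
    print(f"n = {n:>6}: max |pmf difference| = {gap:.2e}")
```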

Exercise 4 (Inverse Fourier).

(Optional - Bonus) Assuming the measure $\mu$ has a density $f=\frac{d\mu}{d\lambda}$, prove that

\[
f(x) = \frac{1}{2\pi}\underbrace{\int e^{-itx}\varphi_\mu(t)\,dt}_{=:\hat\varphi_\mu(x)} \qquad (\lambda\text{-a.e.})
\]

for the characteristic function $\varphi_\mu$ of $\mu$.

Hint.

Simply plugging in the definition of the characteristic function and using Fubini won't work. The result would be

\[
\hat\varphi_f(x) = \int e^{-ixt}\varphi_f(t)\,dt = \int e^{-ixt}\int e^{iyt}f(y)\,dy\,dt = \int\underbrace{\int e^{-ixt}e^{iyt}\,dt}_{\text{“}\delta_x(y)\text{”}}f(y)\,dy,
\]

but $\delta_x(y)$ would need to behave like a Dirac measure at $x$ for us to get back $f(x)$ from this point. Fubini, however, is only allowed for measurable functions, and no such function exists: it would have to be zero at every value but $x$, hence zero almost everywhere, while at the same time integrating to one.

So we have to work around this infinitely high function concentrated at a single point. The idea is to approximate this "delta function" by Gauß kernels

\[
g_\epsilon(x) = \frac{1}{\sqrt{2\pi\epsilon^2}}\exp\Bigl(-\frac{x^2}{2\epsilon^2}\Bigr) \longrightarrow \text{“}\delta_0(x)\text{”} \qquad (\epsilon\to0).
\]

So you show

\[
\frac{1}{2\pi}\hat\varphi_\mu(x) \xleftarrow{\ \epsilon\to0\ } \frac{1}{2\pi}\,\hat\varphi_\mu*g_\epsilon(x) = f*g_\epsilon(x) \xrightarrow{\ \epsilon\to0\ } f(x).
\]

Here you can plug in definitions and use Fubini freely. Then use the fact that the Fourier transform of a normal density is essentially again a normal density to iteratively get rid of all the transforms.

Solution.

We are using a Gauß kernel

\[
g_\epsilon(x) = \frac{1}{\sqrt{2\pi\epsilon^2}}\exp\Bigl(-\frac{x^2}{2\epsilon^2}\Bigr) \longrightarrow \text{“}\delta_0(x)\text{”} \qquad (\epsilon\to0)
\]

to smudge the equation:

\begin{align*}
\hat\varphi_\mu*g_\epsilon(x) &\overset{\text{def.}}{=} \frac{1}{\sqrt{2\pi\epsilon^2}}\int\hat\varphi_\mu(y)\exp\Bigl(-\frac{(x-y)^2}{2\epsilon^2}\Bigr)\,dy \\
&\overset{\text{def.}}{=} \frac{1}{\sqrt{2\pi\epsilon^2}}\int\int e^{-ity}\varphi_\mu(t)\,dt\,\exp\Bigl(-\frac{(x-y)^2}{2\epsilon^2}\Bigr)\,dy \\
&\overset{\text{Fub.}}{=} \int\underbrace{\frac{1}{\sqrt{2\pi\epsilon^2}}\int e^{ity}\exp\Bigl(-\frac{(y-(-x))^2}{2\epsilon^2}\Bigr)\,dy}_{=\varphi_{\mathcal{N}(-x,\epsilon^2)}(t)=e^{it(-x)-\frac12t^2\epsilon^2}}\,\varphi_\mu(t)\,dt \\
&= \int\underbrace{e^{-\frac12t^2\epsilon^2}}_{\text{regularization}}\,\underbrace{e^{-itx}\varphi_\mu(t)}_{\text{Fourier, cf. }\hat\varphi_\mu}\,dt,
\end{align*}

where we substituted $y\mapsto-y$ in the inner integral of the third line. Here we can see that another interpretation of the smudging is that we are applying a regularized version of the Fourier transform to $\varphi_\mu$. Now if $\hat\varphi_\mu$ is well defined (i.e. if $t\mapsto e^{-itx}\varphi_\mu(t)$ is integrable), then we have $\hat\varphi_\mu*g_\epsilon(x)\to\hat\varphi_\mu(x)$ for $\epsilon\to0$ by the dominated convergence theorem. Continuing on we get

\begin{align*}
\hat\varphi_\mu*g_\epsilon(x) &\overset{\text{def.}}{=} \int e^{-\frac12t^2\epsilon^2}e^{-itx}\int e^{itz}f(z)\,dz\,dt \\
&\overset{\text{Fub.}}{=} \int\underbrace{\int e^{it(z-x)}e^{-\frac12t^2\epsilon^2}\,dt}_{=\varphi_{\mathcal{N}(0,1/\epsilon^2)}(z-x)\sqrt{2\pi/\epsilon^2}}\,f(z)\,dz \\
&= 2\pi\int\frac{1}{\sqrt{2\pi\epsilon^2}}e^{-\frac{(z-x)^2}{2\epsilon^2}}f(z)\,dz \\
&= 2\pi\,f*g_\epsilon(x).
\end{align*}

What is left to do is to take the limit on this side as well. This is a bit tricky: we are going to show that

\[
f\in L^1:\qquad f*g_\epsilon\longrightarrow f \quad\text{in } L^1, \tag{1}
\]

while the other limit was a pointwise limit! But this is okay due to Fatou's lemma:

\begin{align*}
\int|\hat\varphi_\mu(x)-2\pi f(x)|\,dx &= \int\liminf_{\epsilon\to0}|\hat\varphi_\mu*g_\epsilon(x)-2\pi f(x)|\,dx \\
&\le 2\pi\liminf_{\epsilon\to0}\int|f*g_\epsilon(x)-f(x)|\,dx = 0,
\end{align*}

which implies $f=\frac{1}{2\pi}\hat\varphi_\mu$ Lebesgue almost everywhere. Since densities are only unique up to Lebesgue null sets, we are then done.

Okay, so let us show (1): since $g_\epsilon$ is a density, we can write

\[
f*g_\epsilon(x)-f(x) = \int(f(x-y)-f(x))g_\epsilon(y)\,dy = \int(f(x-\epsilon z)-f(x))g_1(z)\,dz,
\]

where we used the substitution $y=\epsilon z$ in the last equality. Note that the constant $\frac{1}{\sqrt{2\pi\epsilon^2}}$ of the normal density eats up the $\epsilon$ from the substitution $dy=\epsilon\,dz$. Therefore we have

\begin{align*}
\|f*g_\epsilon-f\|_{L^1(\lambda)} &= \int\Bigl|\int(f(x-\epsilon z)-f(x))g_1(z)\,dz\Bigr|\,dx \\
&\overset{\text{Fub.}}{\le} \int\underbrace{\int|f(x-\epsilon z)-f(x)|\,dx}_{=\|f(\cdot-\epsilon z)-f\|_{L^1(\lambda)}}\,g_1(z)\,dz.
\end{align*}

Writing $\tau_t(f)=f(\cdot-t)$ for the translation by $t$, we have

\[
\|\tau_{\epsilon z}(f)-f\|_{L^1} \le \|\tau_{\epsilon z}(f)\|_{L^1}+\|f\|_{L^1} = 2\|f\|_{L^1},
\]

because a shift does not change the integral over $\mathbb{R}$. With this upper bound, which is integrable against $\mathcal{N}(0,1)$, we can use the dominated convergence theorem to move the limit into the integral:

\[
\lim_{\epsilon\to0}\|f*g_\epsilon-f\|_{L^1} \le \int\underbrace{\lim_{\epsilon\to0}\|\tau_{\epsilon z}(f)-f\|_{L^1}}_{=0}\,g_1(z)\,dz = 0.
\]

Here $\lim_{\epsilon\to0}\|\tau_{\epsilon z}(f)-f\|_{L^1}=0$ because this holds for $f\in C_c$, and since $C_c$ is dense in $L^1$ (we can approximate any integrable function by linear combinations of indicators, which in turn can be approximated by continuous functions of compact support), the triangle inequality extends the claim to all $f\in L^1$. ∎
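The inversion formula can also be checked numerically. The following sketch (not part of the sheet, assuming plain numpy) inverts $\varphi_\mu(t)=e^{-t^2/2}$ for $\mu=\mathcal{N}(0,1)$ and compares with the standard normal density:

```python
# Evaluate f(x) = (1/2pi) * int e^{-itx} phi_mu(t) dt on a truncated grid;
# the integrand decays like exp(-t^2/2), so truncation at |t| = 50 is harmless.
import numpy as np

t = np.linspace(-50, 50, 100_001)
phi = np.exp(-t**2 / 2)  # characteristic function of N(0, 1)

for x in (0.0, 1.0, 2.0):
    f_x = np.trapz(np.exp(-1j * t * x) * phi, t).real / (2 * np.pi)
    exact = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
    print(f"f({x}) ~ {f_x:.6f}  (exact: {exact:.6f})")
```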

Exercise 5 (Pólya’s Theorem).
  (i)

    Let $X,Y$ be independent and $\mathcal{U}[-\frac{a}{2},\frac{a}{2}]$ distributed. For $Z=X+Y$ prove that the density is

    \[
    f_Z(x) = \frac1a\Bigl(1-\frac{|x|}{a}\Bigr)^+
    \]

    and the characteristic function is

    \[
    \varphi_Z(t) = 2\,\frac{1-\cos(at)}{a^2t^2}.
    \]

    Hint.

    Lemma 7.4.9. And use the double angle formula

    \[
    \cos(2x) = \cos^2(x)-\sin^2(x).
    \]
    Solution.

    The density of a sum of independent random variables is the convolution of their densities:

    \begin{align*}
    f_Z(x) &= \int f_X(x-y)f_Y(y)\,dy = \frac{1}{a^2}\int\mathbb{1}_{[-\frac a2,\frac a2]}(x-y)\,\mathbb{1}_{[-\frac a2,\frac a2]}(y)\,dy \\
    &= \frac{1}{a^2}\int\mathbb{1}_{[-\frac a2,\frac a2]}(y-x)\,\mathbb{1}_{[-\frac a2,\frac a2]}(y)\,dy \\
    &= \frac{1}{a^2}\int\mathbb{1}_{[-\frac a2+x,\frac a2+x]}(y)\,\mathbb{1}_{[-\frac a2,\frac a2]}(y)\,dy \\
    &= \frac{1}{a^2}\Bigl(\underbrace{\min\{\tfrac a2+x,\tfrac a2\}}_{=\frac a2-(-x)^+} - \underbrace{\max\{-\tfrac a2+x,-\tfrac a2\}}_{=-\frac a2+(x)^+}\Bigr)^+ \\
    &= \frac{1}{a^2}\Bigl(a-\underbrace{[(-x)^++(x)^+]}_{=|x|}\Bigr)^+ \\
    &= \frac1a\Bigl(1-\frac{|x|}{a}\Bigr)^+
    \end{align*}

    For the characteristic function we use Lemma 7.4.9 (iv) to get

    \begin{align*}
    \varphi_Z(t) &= \varphi_X(t)\varphi_Y(t) = \Bigl(\frac{\exp(it\frac a2)-\exp(-it\frac a2)}{ita}\Bigr)^2 \\
    &= \Bigl(\frac{\cos(t\frac a2)+i\sin(t\frac a2)-[\cos(-t\frac a2)+i\sin(-t\frac a2)]}{ita}\Bigr)^2 \\
    &= \Bigl(\frac{2\sin(t\frac a2)}{ta}\Bigr)^2 = \frac{4\sin^2(t\frac a2)}{a^2t^2} = 2\,\frac{1-\cos(ta)}{a^2t^2},
    \end{align*}

    where we have used the trigonometric double angle formula

    \[
    1-\cos(2x) = [\sin^2(x)+\cos^2(x)]-[\cos^2(x)-\sin^2(x)] = 2\sin^2(x).
    \]
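    Both results can be sanity-checked by simulation. A small sketch (not part of the sheet, assuming plain numpy) compares the empirical characteristic function of $Z$ with the formula just derived:

```python
# Monte Carlo check: E[e^{itZ}] for Z = X + Y with X, Y ~ U[-a/2, a/2]
# should match 2(1 - cos(at)) / (a^2 t^2) up to ~1e-3 sampling error.
import numpy as np

rng = np.random.default_rng(0)
a = 2.0
Z = rng.uniform(-a / 2, a / 2, 1_000_000) + rng.uniform(-a / 2, a / 2, 1_000_000)

for t in (0.5, 1.0, 3.0):
    empirical = np.exp(1j * t * Z).mean().real  # E[e^{itZ}] is real by symmetry
    exact = 2 * (1 - np.cos(a * t)) / (a * t) ** 2
    print(f"t = {t}: {empirical:.4f} vs {exact:.4f}")
```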
  (ii)

    Prove that a random variable $X_k\sim\mu_k$ with density

    \[
    f_{X_k}(x) = \frac1\pi\,\frac{1-\cos(a_kx)}{a_kx^2}
    \]

    has the characteristic function

    \[
    \varphi_{\mu_k}(t) = \Bigl(1-\frac{|t|}{a_k}\Bigr)^+.
    \]

    Deduce the characteristic function $\varphi_\mu$ of $\mu=\sum_{k=1}^np_k\mu_k$.

    Hint.

    Use (i) and Fourier inversion.

    Solution.

    Using the symmetry of the density, we can substitute $x$ with $-x$ to get

    \begin{align*}
    \varphi_{\mu_k}(t) &= \int e^{itx}f_{X_k}(x)\,dx = \int e^{-itx}f_{X_k}(-x)\,dx \\
    &= a_k\,\frac{1}{2\pi}\int e^{-itx}\underbrace{2\,\frac{1-\cos(a_kx)}{a_k^2x^2}}_{=\varphi_Z(x)}\,dx \\
    &= a_kf_Z(t) \\
    &= \Bigl(1-\frac{|t|}{a_k}\Bigr)^+,
    \end{align*}

    where we have used Fourier inversion (Exercise 4) in the second-to-last equality, with $Z$ defined as in (i) but with $a_k$ in place of $a$. Since $\varphi_{\mu_k}(0)=1$, $\mu_k$ is a probability measure. By linearity of the Fourier transform we have

    \[
    \varphi_\mu(t) = \sum_{k=1}^np_k\Bigl(1-\frac{|t|}{a_k}\Bigr)^+.
    \]
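    The tent shape of $\varphi_{\mu_k}$ can be confirmed numerically as well. This sketch (not part of the sheet, assuming plain numpy) integrates against the density on a large truncated grid, so the printed values should only agree up to roughly $10^{-4}$:

```python
# Integrate e^{itx} f_{X_k}(x) dx numerically and compare with (1 - |t|/a_k)^+.
import numpy as np

a_k = 2.0
x = np.linspace(-4000, 4000, 4_000_001) + 1e-9  # tiny offset avoids the removable 0/0 at x = 0
f = (1 - np.cos(a_k * x)) / (np.pi * a_k * x**2)

for t in (0.0, 0.5, 1.0, 1.5, 2.5):
    val = np.trapz(np.cos(t * x) * f, x)        # the density is even, so the sine part vanishes
    print(f"t = {t}: {val:.4f} vs {max(0.0, 1 - abs(t) / a_k):.4f}")
```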

  (iii)

    Assuming $0\le a_1\le\dots\le a_n$, calculate the slope of $\varphi_\mu(t)$ on $[a_{k-1},a_k]$. Prove that for given slopes $s_1\le\dots\le s_n\le0=:s_{n+1}$ we can select $\mu$ with $p_k=a_k(s_{k+1}-s_k)$, such that $\varphi_\mu$ has slope $s_k$ on the interval $[a_{k-1},a_k]$, where $a_0=0$.

    Solution.

    For $t\in[a_{k-1},a_k]$ we can drop the absolute value, and the summands with $a_i<t$ vanish, so we get

    \[
    \frac{d}{dt}\varphi_\mu = \frac{d}{dt}\sum_{i=1}^np_i\Bigl(1-\frac{t}{a_i}\Bigr)^+ = \frac{d}{dt}\sum_{i=k}^np_i\Bigl(1-\frac{t}{a_i}\Bigr) = -\sum_{i=k}^n\frac{p_i}{a_i}.
    \]

    So for $p_k=a_k(s_{k+1}-s_k)$ we have

    \[
    \frac{d}{dt}\varphi_\mu = -\sum_{i=k}^n\frac{a_i(s_{i+1}-s_i)}{a_i} = -(\underbrace{s_{n+1}}_{=0}-s_k) = s_k.
    \]
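    A short numerical check of this slope formula (not part of the sheet, assuming plain numpy), with slopes chosen so that $\sum_kp_k=1$:

```python
# With p_k = a_k (s_{k+1} - s_k), phi_mu is piecewise linear with
# slope s_k on [a_{k-1}, a_k]; since phi_mu is linear there, the
# difference quotient over each interval equals the slope exactly.
import numpy as np

a = np.array([1.0, 2.0, 4.0])
s = np.array([-0.5, -0.3, -0.1, 0.0])  # s_1 <= s_2 <= s_3 <= 0 =: s_4
p = a * (s[1:] - s[:-1])               # p = [0.2, 0.4, 0.4], sums to 1

def phi_mu(t):
    return np.sum(p * np.clip(1 - abs(t) / a, 0.0, None))

for k, (lo, hi) in enumerate(zip([0.0, *a[:-1]], a)):
    slope = (phi_mu(hi) - phi_mu(lo)) / (hi - lo)
    print(f"slope on [{lo}, {hi}]: {slope:+.3f}  (target s_{k+1} = {s[k]:+.3f})")
```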
  (iv)

    Prove Pólya's Theorem using interpolation points $a_k$, the previous results and Lévy's continuity theorem.

    Theorem (Pólya).

    Let $f:\mathbb{R}\to[0,1]$ be a function which is continuous, even, convex on $[0,\infty)$ and satisfies $f(0)=1$ and $\lim_{x\to\infty}f(x)=0$. Then $f$ is the characteristic function of some probability measure.

    Solution.

    Choose interpolation points $0=a_0\le a_1\le\dots\le a_n$ and let $s_k:=\frac{f(a_k)-f(a_{k-1})}{a_k-a_{k-1}}$ be the slope of $f$ on $[a_{k-1},a_k]$, with $s_{n+1}:=0$. As $f$ is convex on $[0,\infty)$, the negative slopes between the interpolation points are increasing (decreasing in absolute value). Therefore the $p_k=a_k(s_{k+1}-s_k)$ are positive, so the resulting $\mu$ is a measure. If we can show

    \[
    f(a_k) = \varphi_\mu(a_k), \qquad k=0,\dots,n,
    \]

    we get the fact that $\mu$ is a probability measure for free, as then $\varphi_\mu(0)=f(0)=1$. And by increasing the number of interpolation points we get pointwise convergence to $f$, so $f$ is a characteristic function by Lévy's continuity theorem. Okay, so let us try to prove this equality:

    \begin{align*}
    \varphi_\mu(a_l) &= \sum_{k=l}^n\underbrace{a_k(s_{k+1}-s_k)}_{=p_k}\Bigl(1-\frac{a_l}{a_k}\Bigr)^+ = \sum_{k=l}^n(s_{k+1}-s_k)(a_k-a_l)^+ \\
    &= \Bigl[(a_n-a_l)\underbrace{s_{n+1}}_{=0}-(a_l-a_l)s_l\Bigr] - \sum_{k=l+1}^ns_k(a_k-a_{k-1}) \\
    &\overset{\text{def. slope}}{=} -\sum_{k=l+1}^n\bigl(f(a_k)-f(a_{k-1})\bigr) \\
    &= f(a_l)-f(a_n).
    \end{align*}

    Strictly speaking, the telescoping sum leaves a defect $f(a_n)$. We can repair this by adding a point mass at zero: $\tilde\mu:=\mu+f(a_n)\delta_0$ has characteristic function $\varphi_{\tilde\mu}=\varphi_\mu+f(a_n)$ (as $\varphi_{\delta_0}\equiv1$), so $\varphi_{\tilde\mu}(a_l)=f(a_l)$ for all $l$, and in particular $\varphi_{\tilde\mu}(0)=f(0)=1$. Choosing more and more interpolation points with $a_n\to\infty$ (so that $f(a_n)\to0$) then finishes the proof as described above. ∎
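    To see the construction in action, here is a sketch (not part of the sheet, assuming plain numpy) that interpolates $f(t)=e^{-|t|}$, which satisfies all assumptions of the theorem (it is in fact the characteristic function of the standard Cauchy distribution):

```python
# Polya interpolation of f(t) = exp(-|t|): weights from (iii) plus the
# point mass f(a_n) at zero, as in the solution of (iv).
import numpy as np

f = lambda t: np.exp(-np.abs(t))
a = np.linspace(0.5, 30.0, 60)          # interpolation points a_1 < ... < a_n (a_0 = 0)
grid = np.concatenate(([0.0], a))
s = np.append(np.diff(f(grid)) / np.diff(grid), 0.0)  # slopes of f, with s_{n+1} := 0
p = a * (s[1:] - s[:-1])                # weights from (iii); all >= 0 by convexity

def phi(t):  # characteristic function of mu + f(a_n) * delta_0
    return np.sum(p * np.clip(1 - np.abs(t) / a, 0.0, None)) + f(a[-1])

for t in (0.0, 1.0, 2.25, 5.0):         # 2.25 lies between interpolation points
    print(f"t = {t}: interpolant {phi(t):.4f} vs f(t) = {f(t):.4f}")
```

    The interpolant agrees with $f$ exactly at the points $a_k$ and deviates only slightly in between, which is exactly the pointwise convergence used in the Lévy continuity argument above.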