These are reading notes on Introduction to Probability.
Contents
- Expectation
- Variance, Moments, and the Expected Value Rule
- Properties of Mean and Variance
- Mean and Variance of Some Common Random Variables
- Decision Making Using Expected Values
Expectation
- Expectation of $X$: a weighted (in proportion to probabilities) average of the possible values of $X$. For a discrete random variable $X$ with PMF $p_X$,
$$E[X]=\sum_x x\,p_X(x)$$
Variance, Moments, and the Expected Value Rule
Moments
- We define the $n$th moment as $E[X^n]$. With this terminology, the first moment of $X$ is just the mean.
Variance
$$\mathrm{var}(X)=E[(X-E[X])^2]$$
- The variance provides a measure of dispersion of $X$ around its mean.
Standard Deviation
- Another measure of dispersion is the standard deviation of $X$:
$$\sigma_X=\sqrt{\mathrm{var}(X)}$$
- The standard deviation is often easier to interpret because it has the same units as $X$.
Expected Value Rule
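- Let $X$ be a random variable with PMF $p_X$, and let $g(X)$ be a function of $X$. Then the expected value of the random variable $g(X)$ is given by
$$E[g(X)]=\sum_x g(x)p_X(x)$$
Proof
- Let $Y=g(X)$. Since $p_Y(y)=\sum_{\{x\mid g(x)=y\}}p_X(x)$, we have
$$\begin{aligned}E[g(X)]=E[Y]&=\sum_y y\,p_Y(y)\\&=\sum_y y\sum_{\{x\mid g(x)=y\}}p_X(x)\\&=\sum_y\sum_{\{x\mid g(x)=y\}}g(x)p_X(x)\\&=\sum_x g(x)p_X(x)\end{aligned}$$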
- Using the expected value rule, we can write the variance of $X$ as
$$\mathrm{var}(X)=E[(X-E[X])^2]=\sum_x(x-E[X])^2p_X(x)$$
- Similarly, the $n$th moment is given by
$$E[X^n]=\sum_x x^n p_X(x)$$
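The expected value rule translates almost directly into code. The following is a minimal sketch, assuming a PMF is stored as a dictionary from values to probabilities; the helper names `expected_value`, `variance`, and `moment`, and the fair-die example, are illustrative choices and do not come from the book.

```python
from fractions import Fraction

def expected_value(pmf, g=lambda x: x):
    """Expected value rule: E[g(X)] = sum over x of g(x) * p_X(x)."""
    return sum(g(x) * p for x, p in pmf.items())

def variance(pmf):
    """var(X) = E[(X - E[X])^2], evaluated with the expected value rule."""
    mean = expected_value(pmf)
    return expected_value(pmf, lambda x: (x - mean) ** 2)

def moment(pmf, n):
    """n-th moment E[X^n]."""
    return expected_value(pmf, lambda x: x ** n)

# Hypothetical example PMF: a fair six-sided die.
die = {k: Fraction(1, 6) for k in range(1, 7)}

print(expected_value(die))  # 7/2
print(variance(die))        # 35/12
print(moment(die, 2))       # 91/6
```

Exact fractions are used so that the results can be compared with hand calculations without rounding error.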
Properties of Mean and Variance
- We will now use the expected value rule in order to derive some important properties of the mean and the variance.
Mean and Variance of a Linear Function of a Random Variable
- We start with a random variable $X$ and define a new random variable $Y=aX+b$, where $a$ and $b$ are given scalars. Let us derive the mean and the variance of the linear function $Y$. We have
$$E[Y]=\sum_x(ax+b)p_X(x)=a\sum_x x\,p_X(x)+b\sum_x p_X(x)=aE[X]+b$$
$$\begin{aligned}\mathrm{var}(Y)&=\sum_x(ax+b-E[Y])^2p_X(x)\\&=\sum_x(ax+b-aE[X]-b)^2p_X(x)\\&=a^2\sum_x(x-E[X])^2p_X(x)\\&=a^2\,\mathrm{var}(X)\end{aligned}$$
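The two identities can be checked numerically on a small example. This is only a sketch: the PMF `pmf_x` and the scalars `a`, `b` are made-up values, and `linear_transform` assumes `a` is nonzero so that distinct values of $X$ map to distinct values of $Y$.

```python
from fractions import Fraction

def mean(pmf):
    """E[X] for a PMF given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())

def var(pmf):
    """var(X) = sum over x of (x - E[X])^2 * p_X(x)."""
    m = mean(pmf)
    return sum((x - m) ** 2 * p for x, p in pmf.items())

def linear_transform(pmf, a, b):
    """PMF of Y = aX + b (assumes a != 0, so no two x values collide)."""
    return {a * x + b: p for x, p in pmf.items()}

# Hypothetical PMF and scalars, chosen only for illustration.
pmf_x = {0: Fraction(1, 4), 1: Fraction(1, 2), 3: Fraction(1, 4)}
a, b = -2, 5
pmf_y = linear_transform(pmf_x, a, b)

print(mean(pmf_y), a * mean(pmf_x) + b)   # 5/2 5/2
print(var(pmf_y), a ** 2 * var(pmf_x))    # 19/4 19/4
```

Note that the shift `b` moves the mean but leaves the variance unchanged, while the scale `a` enters the variance squared.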
Variance in Terms of Moments Expression
$$\mathrm{var}(X)=E[X^2]-(E[X])^2$$
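The notes state this formula without a derivation. One standard way to obtain it, sketched here, is to expand the square inside the sum given by the expected value rule and use $\sum_x p_X(x)=1$:

```latex
\begin{aligned}
\mathrm{var}(X) &= \sum_x (x - E[X])^2 \, p_X(x) \\
                &= \sum_x \bigl(x^2 - 2xE[X] + (E[X])^2\bigr) \, p_X(x) \\
                &= \sum_x x^2 p_X(x) - 2E[X]\sum_x x \, p_X(x) + (E[X])^2 \sum_x p_X(x) \\
                &= E[X^2] - 2(E[X])^2 + (E[X])^2 \\
                &= E[X^2] - (E[X])^2
\end{aligned}
```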
We finally illustrate by example a common pitfall:
- Unless $g(X)$ is a linear function, it is not generally true that $E[g(X)]$ is equal to $g(E[X])$.
Example 2.4. Average Speed Versus Average Time.
If the weather is good (which happens with probability $0.6$), Alice walks the 2 miles to class at a speed of $V=5$ miles per hour, and otherwise rides her motorcycle at a speed of $V=30$ miles per hour. What is the mean of the time $T$ to get to class?
- A correct way to solve the problem is to first derive the PMF of $T$ and then calculate its mean by
$$E[T]=0.6\cdot\frac{2}{5}+0.4\cdot\frac{2}{30}=\frac{4}{15}\ \text{hours}$$
- However, it is wrong to calculate the mean of the speed $V$,
$$E[V]=0.6\cdot 5+0.4\cdot 30=15\ \text{miles/hour}$$
and then claim that the mean of the time $T$ is
$$\frac{2}{E[V]}=\frac{2}{15}\ \text{hours}$$
- To summarize, in this example we have
$$T=\frac{2}{V},\qquad\text{and}\qquad E[T]=E\left[\frac{2}{V}\right]\neq\frac{2}{E[V]}$$
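The gap between the two answers is easy to reproduce. The sketch below uses the numbers from Example 2.4; the variable names are illustrative.

```python
from fractions import Fraction

distance = 2  # miles
# PMF of the speed V (miles per hour): walk with probability 0.6, ride with 0.4.
pmf_v = {5: Fraction(3, 5), 30: Fraction(2, 5)}

# Correct: apply the expected value rule to T = g(V) = distance / V.
e_t = sum(Fraction(distance, v) * p for v, p in pmf_v.items())

# Incorrect shortcut: plug the mean speed into g.
e_v = sum(v * p for v, p in pmf_v.items())
shortcut = Fraction(distance) / e_v

print(e_t)       # 4/15
print(e_v)       # 15
print(shortcut)  # 2/15
```

The shortcut underestimates the mean travel time by a factor of two here, because $g(v)=2/v$ is not linear in $v$.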
Mean and Variance of Some Common Random Variables
Bernoulli Random Variable
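- Consider a Bernoulli random variable $X$ that takes the value $1$ with probability $p$ and the value $0$ with probability $1-p$. The mean, second moment, and variance of $X$ are given by the following calculations:
$$\begin{aligned}E[X]&=1\cdot p+0\cdot(1-p)=p\\E[X^2]&=1^2\cdot p+0^2\cdot(1-p)=p\\\mathrm{var}(X)&=(1-p)^2p+(0-p)^2(1-p)=p(1-p)(1-p+p)=p(1-p)\end{aligned}$$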
Geometric Random Variable
$$E[X]=\frac{1}{p},\qquad \mathrm{var}(X)=\frac{1-p}{p^2}$$
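These closed forms can be sanity-checked by summing the series numerically. The sketch below assumes the usual geometric PMF $p_X(k)=(1-p)^{k-1}p$ for $k=1,2,\ldots$ (the number of trials up to and including the first success), which the notes do not write out; the function name and the truncation at `terms` are illustrative.

```python
import math

def geometric_mean_var(p, terms=2000):
    """Approximate E[X] and var(X) by truncating the series at `terms` terms.

    Assumes p_X(k) = (1 - p)**(k - 1) * p for k = 1, 2, ...
    """
    mean = 0.0
    second_moment = 0.0
    for k in range(1, terms + 1):
        pk = (1 - p) ** (k - 1) * p
        mean += k * pk
        second_moment += k * k * pk
    return mean, second_moment - mean ** 2

p = 0.25
mean, var = geometric_mean_var(p)
print(math.isclose(mean, 1 / p))            # True
print(math.isclose(var, (1 - p) / p ** 2))  # True
```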
Problem 23.
- (a) A fair coin is tossed repeatedly and independently until two consecutive heads or two consecutive tails appear. Find the PMF, the expected value, and the variance of the number of tosses.
- (b) Assume now that the coin is tossed until we obtain a tail that is immediately preceded by a head. Find the PMF and the expected value of the number of tosses.
SOLUTION
- Let $X$ be the total number of tosses.
- (a) The random variable $X$ is of the form $X=Y+1$, where $Y$ is a geometric random variable with parameter $p=1/2$ (after the first toss, each subsequent toss matches its predecessor with probability $1/2$). It follows that
$$p_X(k)=\begin{cases}(1/2)^{k-1}, & \text{if } k\geq 2,\\ 0, & \text{otherwise,}\end{cases}$$
and
$$E[X]=E[Y]+1=\frac{1}{p}+1=3,\qquad \mathrm{var}(X)=\mathrm{var}(Y)=\frac{1-p}{p^2}=2$$
- (b) If $k>2$, there are $k-1$ sequences that lead to the event $\{X=k\}$. One such sequence is $H\cdots HT$, where $k-1$ heads are followed by a tail. The other $k-2$ possible sequences are of the form $T\cdots TH\cdots HT$, for various lengths of the initial $T\cdots T$ segment. For the case where $k=2$, there is only one (hence $k-1$) possible sequence that leads to the event $\{X=k\}$, namely the sequence $HT$. Therefore, for any $k\geq 2$,
$$P(X=k)=(k-1)(1/2)^k$$
It follows that
$$p_X(k)=\begin{cases}(k-1)(1/2)^k, & \text{if } k\geq 2,\\ 0, & \text{otherwise,}\end{cases}$$
and
$$E[X]=\sum_{k=2}^\infty k(k-1)(1/2)^k=\sum_{k=1}^\infty k(k-1)(1/2)^k=\sum_{k=1}^\infty k^2(1/2)^k-\sum_{k=1}^\infty k(1/2)^k=6-2=4$$
We have used here the equalities
$$\sum_{k=1}^\infty k(1/2)^k=E[Y]=2$$
and
$$\sum_{k=1}^\infty k^2(1/2)^k=E[Y^2]=\mathrm{var}(Y)+(E[Y])^2=2+2^2=6$$
where $Y$ is a geometric random variable with parameter $p=1/2$.
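Both PMFs in Problem 23 can be confirmed by brute force for small $k$. The sketch below enumerates all fair-coin sequences of a fixed length and records the toss at which each stopping rule first fires; the function names and the cutoff `MAX_LEN = 12` are arbitrary illustrative choices.

```python
from fractions import Fraction
from itertools import product

def same_as_previous(prev, cur):
    """Part (a): stop at two consecutive heads or two consecutive tails."""
    return prev == cur

def head_then_tail(prev, cur):
    """Part (b): stop at a tail immediately preceded by a head."""
    return prev == "H" and cur == "T"

def stop_time(seq, rule):
    """First toss index n (1-based) at which rule(seq[n-2], seq[n-1]) holds, else None."""
    for n in range(2, len(seq) + 1):
        if rule(seq[n - 2], seq[n - 1]):
            return n
    return None

def pmf_by_enumeration(rule, max_len):
    """Exact P(X = k) for k = 2..max_len, from all 2**max_len equally likely sequences."""
    counts = {k: 0 for k in range(2, max_len + 1)}
    for seq in product("HT", repeat=max_len):
        k = stop_time(seq, rule)
        if k is not None:
            counts[k] += 1
    return {k: Fraction(c, 2 ** max_len) for k, c in counts.items()}

MAX_LEN = 12
pmf_a = pmf_by_enumeration(same_as_previous, MAX_LEN)
pmf_b = pmf_by_enumeration(head_then_tail, MAX_LEN)

print(all(pmf_a[k] == Fraction(1, 2 ** (k - 1)) for k in pmf_a))  # True
print(all(pmf_b[k] == Fraction(k - 1, 2 ** k) for k in pmf_b))    # True
```

The enumeration only covers $k\leq 12$, so it checks the PMF formulas rather than the infinite sums for the mean and variance.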
Discrete Uniform Random Variable
Discrete uniformly distributed random variable (or discrete uniform for short)
- A discrete uniform random variable takes one out of a range of contiguous integer values, with equal probability. Its PMF is
$$p_X(k)=\begin{cases}\dfrac{1}{b-a+1}, & \text{if } k=a,a+1,\ldots,b,\\ 0, & \text{otherwise,}\end{cases}$$
where $a$ and $b$ are two integers with $a<b$.
$$E[X]=\frac{a+b}{2}$$
- To calculate the variance of $X$, we first consider the simpler case where $a=1$ and $b=n$. It can be verified by induction on $n$ that
$$E[X^2]=\frac{1}{n}\sum_{k=1}^n k^2=\frac{1}{6}(n+1)(2n+1)$$
The variance can now be obtained in terms of the first and second moments:
$$\mathrm{var}(X)=E[X^2]-(E[X])^2=\frac{(n+1)(2n+1)}{6}-\frac{(n+1)^2}{4}=\frac{n^2-1}{12}$$
- For the case of general integers $a$ and $b$, we note that a random variable which is uniformly distributed over the interval $[a,b]$ has the same variance as one which is uniformly distributed over $[1,b-a+1]$, since the two differ only by a constant shift. Therefore, the desired variance is given by the above formula with $n=b-a+1$, which yields
$$\mathrm{var}(X)=\frac{(b-a+1)^2-1}{12}=\frac{(b-a)(b-a+2)}{12}$$
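As a quick check of the closed forms, the mean and variance can be computed directly from the definition for a few integer ranges. This is a sketch with exact fractions; `uniform_mean_var` and the test ranges are illustrative.

```python
from fractions import Fraction

def uniform_mean_var(a, b):
    """Mean and variance of a discrete uniform on the integers a, a+1, ..., b."""
    n = b - a + 1
    values = range(a, b + 1)
    mean = Fraction(sum(values), n)
    second_moment = Fraction(sum(k * k for k in values), n)
    return mean, second_moment - mean ** 2

for a, b in [(1, 6), (-3, 4), (10, 25)]:
    mean, var = uniform_mean_var(a, b)
    assert mean == Fraction(a + b, 2)
    assert var == Fraction((b - a) * (b - a + 2), 12)

print(uniform_mean_var(1, 6))  # (Fraction(7, 2), Fraction(35, 12))
```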
Poisson Random Variable
$$\begin{aligned}E[X]&=\sum_{k=0}^\infty k e^{-\lambda}\frac{\lambda^k}{k!}\\&=\sum_{k=1}^\infty k e^{-\lambda}\frac{\lambda^k}{k!}&&(\text{the } k=0 \text{ term is zero})\\&=\lambda\sum_{k=1}^\infty e^{-\lambda}\frac{\lambda^{k-1}}{(k-1)!}\\&=\lambda\sum_{m=0}^\infty e^{-\lambda}\frac{\lambda^{m}}{m!}&&(\text{let } m=k-1)\\&=\lambda\end{aligned}$$
The last equality holds because the terms $e^{-\lambda}\lambda^m/m!$ are the values of the Poisson PMF, which sum to $1$.
- A similar calculation shows that the variance of a Poisson random variable is also $\lambda$.
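Both claims can be checked numerically by truncating the series. The sketch below assumes the Poisson PMF $p_X(k)=e^{-\lambda}\lambda^k/k!$ that appears in the sum above; the function name, the value `lam = 3.5`, and the truncation point are illustrative.

```python
import math

def poisson_mean_var(lam, terms=200):
    """Approximate E[X] and var(X) for a Poisson PMF by summing the first `terms` terms."""
    pk = math.exp(-lam)  # p_X(0)
    mean = 0.0
    second_moment = 0.0
    for k in range(1, terms):
        pk *= lam / k  # recurrence: p_X(k) = p_X(k-1) * lam / k
        mean += k * pk
        second_moment += k * k * pk
    return mean, second_moment - mean ** 2

lam = 3.5
mean, var = poisson_mean_var(lam)
print(math.isclose(mean, lam, rel_tol=1e-9))  # True
print(math.isclose(var, lam, rel_tol=1e-9))   # True
```

The recurrence avoids computing large factorials directly; for much larger values of `lam` more terms would be needed.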
Decision Making Using Expected Values
Example 2.8. The Quiz Problem.
Consider a quiz game where a person is given two questions and must decide which one to answer first. Question 1 will be answered correctly with probability $p_1$, and the person will then receive as prize $v_1$, while question 2 will be answered correctly with probability $p_2$, and the person will then receive as prize $v_2$. If the first question attempted is answered incorrectly, the quiz terminates. If the first question is answered correctly, the person is allowed to attempt the second question. Which question should be answered first to maximize the expected value of the total prize money received?
- If question 1 is answered first, we have
$$E[X]=p_1(1-p_2)v_1+p_1p_2(v_1+v_2)=p_1v_1+p_1p_2v_2$$
while if question 2 is answered first, we have
$$E[X]=p_2(1-p_1)v_2+p_2p_1(v_2+v_1)=p_2v_2+p_2p_1v_1$$
- It is thus optimal to answer question 1 first if and only if
$$p_1v_1+p_1p_2v_2\geq p_2v_2+p_2p_1v_1$$
or equivalently if
$$\frac{p_1v_1}{1-p_1}\geq\frac{p_2v_2}{1-p_2}$$
Therefore, it is optimal to order the questions in decreasing value of the expression $pv/(1-p)$, which provides a convenient index of quality for a question with probability of correct answer $p$ and value $v$.
- Interestingly, this rule generalizes to the case of more than two questions.
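The index rule can be compared against an exhaustive search over orderings. The following is a sketch under two assumptions: answers to different questions are independent (as the products $p_1p_2$ above imply), and every $p$ is strictly less than $1$ so the index is defined. The three `(p, v)` pairs and the function names are made up for illustration.

```python
import math
from itertools import permutations

def expected_prize(order):
    """Expected total prize when questions (p, v) are attempted in the given order.

    The quiz stops at the first wrong answer, so question i pays v_i only if
    it and all earlier questions are answered correctly.
    """
    total = 0.0
    reach = 1.0  # probability that every question so far was answered correctly
    for p, v in order:
        reach *= p
        total += reach * v
    return total

def index_order(questions):
    """Order questions by decreasing p*v/(1-p); assumes every p < 1."""
    return sorted(questions, key=lambda q: q[0] * q[1] / (1 - q[0]), reverse=True)

# Hypothetical questions as (probability of correct answer, prize).
questions = [(0.8, 100), (0.5, 200), (0.9, 50)]

best_by_search = max(permutations(questions), key=expected_prize)
by_index = index_order(questions)

print(by_index)  # [(0.9, 50), (0.8, 100), (0.5, 200)]
print(math.isclose(expected_prize(best_by_search), expected_prize(by_index)))  # True
```

For two questions, `expected_prize([(p1, v1), (p2, v2)])` reduces to $p_1v_1+p_1p_2v_2$, matching the expression derived in the example.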