This post contains my reading notes on *Introduction to Probability*.
Joint PMF
- Consider two discrete random variables $X$ and $Y$ associated with the same experiment. The probabilities of the values that $X$ and $Y$ can take are captured by the joint PMF of $X$ and $Y$, denoted $p_{X,Y}$.
$$p_{X,Y}(x,y)=P(X=x,\,Y=y)$$
- In fact, we can calculate the PMFs of $X$ and $Y$ by using the formulas
$$p_X(x)=\sum_yp_{X,Y}(x,y),\qquad p_Y(y)=\sum_xp_{X,Y}(x,y)$$
We sometimes refer to $p_X$ and $p_Y$ as the marginal PMFs, to distinguish them from the joint PMF.
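As a quick sanity check, the marginalization formulas can be sketched in Python (the joint PMF below is a made-up toy example, not from the book):

```python
# Sketch: recovering marginal PMFs from a joint PMF (values are illustrative).
from collections import defaultdict

# A small joint PMF p_{X,Y}(x, y), stored as {(x, y): probability}.
joint = {
    (0, 0): 0.1, (0, 1): 0.2,
    (1, 0): 0.3, (1, 1): 0.4,
}

def marginals(joint_pmf):
    """Sum the joint PMF over the other variable to get p_X and p_Y."""
    p_x, p_y = defaultdict(float), defaultdict(float)
    for (x, y), p in joint_pmf.items():
        p_x[x] += p   # p_X(x) = sum_y p_{X,Y}(x, y)
        p_y[y] += p   # p_Y(y) = sum_x p_{X,Y}(x, y)
    return dict(p_x), dict(p_y)

p_x, p_y = marginals(joint)
```

Each marginal automatically sums to 1 because the joint PMF does.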
Problem 26. PMF of the minimum of several random variables.
On a given day, your golf score takes each value in the range 101 to 110 with probability 0.1, independent of other days. Determined to improve your score, you decide to play on three different days and declare as your score the minimum $X$ of the scores $X_1$, $X_2$, and $X_3$ on the different days. Calculate the PMF of $X$.
SOLUTION
- We have $P(X>100)=1$ and for $k=101,\dots,110$,
$$P(X>k)=P(X_1>k,\,X_2>k,\,X_3>k)=P(X_1>k)P(X_2>k)P(X_3>k)=\left(\frac{110-k}{10}\right)^3$$
It follows that
$$p_X(k)=P(X>k-1)-P(X>k)=\left(\frac{111-k}{10}\right)^3-\left(\frac{110-k}{10}\right)^3,\qquad k=101,\dots,110$$
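The derived PMF is easy to verify numerically; the sketch below (illustrative Python, parameter names are mine) computes $p_X(k)$ directly from the tail formula:

```python
# Sketch: PMF of the minimum of three i.i.d. uniform scores on 101..110,
# computed from P(X > k) = ((110 - k) / 10) ** 3 as in the solution above.

def pmf_min(k, days=3, lo=101, hi=110):
    """p_X(k) = P(X > k - 1) - P(X > k) for k in lo..hi."""
    n = hi - lo + 1
    tail = lambda t: max(0, hi - t) ** days / n ** days  # P(X > t)
    return tail(k - 1) - tail(k)

pmf = {k: pmf_min(k) for k in range(101, 111)}
```

For example, $p_X(101)=1-(9/10)^3=0.271$, and the ten values sum to 1.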
Functions of Multiple Random Variables
- A function $Z=g(X,Y)$ of the random variables $X$ and $Y$ defines another random variable. Its PMF can be calculated from the joint PMF $p_{X,Y}$ according to
$$p_Z(z)=\sum_{\{(x,y)\mid g(x,y)=z\}}p_{X,Y}(x,y)$$
- Furthermore, the expected value rule for functions naturally extends and takes the form
$$E[g(X,Y)]=\sum_x\sum_yg(x,y)\,p_{X,Y}(x,y)$$
In the special case where $g$ is linear and of the form $aX+bY+c$, where $a$, $b$, and $c$ are given scalars, we have
$$E[aX+bY+c]=aE[X]+bE[Y]+c$$
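A minimal sketch of both formulas on a toy joint PMF (the table and the choice of $g$ are invented for illustration):

```python
# Sketch: PMF of Z = g(X, Y) and the expected value rule, on a toy joint PMF.
from collections import defaultdict

joint = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}
g = lambda x, y: x + y  # any function of (x, y)

# p_Z(z) = sum over {(x, y) | g(x, y) = z} of p_{X,Y}(x, y)
p_z = defaultdict(float)
for (x, y), p in joint.items():
    p_z[g(x, y)] += p

# Expected value rule: E[g(X, Y)] = sum_x sum_y g(x, y) p_{X,Y}(x, y)
e_g = sum(g(x, y) * p for (x, y), p in joint.items())
```

Computing $E[Z]$ from `p_z` gives the same number as the expected value rule, as it must.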
More than Two Random Variables
- The joint PMF of three random variables $X$, $Y$, and $Z$ is defined in analogy with the above as
$$p_{X,Y,Z}(x,y,z)=P(X=x,\,Y=y,\,Z=z)$$
for all possible triplets of numerical values $(x,y,z)$. Corresponding marginal PMFs are analogously obtained by equations such as
$$p_{X,Y}(x,y)=\sum_zp_{X,Y,Z}(x,y,z)\qquad\text{and}\qquad p_X(x)=\sum_y\sum_zp_{X,Y,Z}(x,y,z)$$
- The expected value rule for functions is given by
$$E[g(X,Y,Z)]=\sum_x\sum_y\sum_zg(x,y,z)\,p_{X,Y,Z}(x,y,z)$$
and if $g$ is linear and has the form $aX+bY+cZ+d$, then
$$E[aX+bY+cZ+d]=aE[X]+bE[Y]+cE[Z]+d$$
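The three-variable linearity identity can be checked on an arbitrary joint PMF; the following sketch builds a random valid PMF (all names and constants here are illustrative):

```python
# Sketch: linearity of expectation for three variables on a random joint PMF.
import random

rng = random.Random(0)
support = [(x, y, z) for x in range(2) for y in range(3) for z in range(2)]
weights = [rng.random() for _ in support]
total = sum(weights)
joint = {s: w / total for s, w in zip(support, weights)}  # a valid joint PMF

a, b, c, d = 2.0, -1.0, 3.0, 0.5
# Left side: E[aX + bY + cZ + d] via the expected value rule.
lhs = sum((a * x + b * y + c * z + d) * p for (x, y, z), p in joint.items())
# Right side: aE[X] + bE[Y] + cE[Z] + d via marginal expectations.
ex = sum(x * p for (x, y, z), p in joint.items())
ey = sum(y * p for (x, y, z), p in joint.items())
ez = sum(z * p for (x, y, z), p in joint.items())
rhs = a * ex + b * ey + c * ez + d
```

The two sides agree (up to floating-point roundoff) for any joint PMF, not just independent ones.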
Example 2.10. Mean of the Binomial
Your probability class has 300 students and each student has probability $1/3$ of getting an A, independent of any other student. What is the mean of $X$, the number of students that get an A?
- Let
$$X_i=\begin{cases}1,&\text{if the }i\text{th student gets an A},\\0,&\text{otherwise}.\end{cases}$$
- Thus $X_1,X_2,\dots,X_n$ are Bernoulli random variables with common mean $p=1/3$. Their sum
$$X=X_1+X_2+\cdots+X_n$$
is the number of students that get an A. Since $X$ is the number of “successes” in $n$ independent trials, it is a binomial random variable with parameters $n$ and $p$.
- Using the linearity of $X$ as a function of the $X_i$, we have
$$E[X]=\sum_{i=1}^{300}E[X_i]=\sum_{i=1}^{300}\frac{1}{3}=100$$
If we repeat this calculation for a general number of students $n$ and probability of A equal to $p$, we obtain
$$E[X]=\sum_{i=1}^nE[X_i]=\sum_{i=1}^np=np$$
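The result $E[X]=np$ can be confirmed both exactly and by a small Monte Carlo run (an illustrative sketch; the trial count is arbitrary):

```python
# Sketch: E[X] = np for a binomial, checked exactly and by simulation.
import random

n, p = 300, 1 / 3
exact_mean = n * p  # linearity: n Bernoulli terms, each with mean p

random.seed(0)  # fixed seed for reproducibility
trials = 5000
# Each trial: count "successes" among n independent Bernoulli(p) draws.
estimate = sum(
    sum(random.random() < p for _ in range(n)) for _ in range(trials)
) / trials
```

The simulated average should land very close to 100, with fluctuations on the order of $\sqrt{np(1-p)/\text{trials}}\approx 0.12$.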
Example 2.11. The Hat Problem.
Suppose that $n$ people throw their hats in a box and then each picks one hat at random. (Each hat can be picked by only one person, and each assignment of hats to persons is equally likely.) What is the expected value of $X$, the number of people that get back their own hat?
- For the $i$th person, we introduce a random variable $X_i$ that takes the value 1 if the person selects his/her own hat, and takes the value 0 otherwise. Since $P(X_i=1)=1/n$ and $P(X_i=0)=1-1/n$, the mean of $X_i$ is $1/n$.
- We now have
$$X=X_1+X_2+\cdots+X_n$$
so that
$$E[X]=E[X_1]+E[X_2]+\cdots+E[X_n]=n\cdot\frac{1}{n}=1$$
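A short simulation (illustrative, not from the book) shows the average number of matches hovering around 1 for any $n$:

```python
# Sketch: hat problem by simulation; the mean number of own-hat matches is 1.
import random

def simulate_hats(n, trials, seed=0):
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        perm = list(range(n))
        rng.shuffle(perm)  # a uniformly random assignment of hats to people
        total += sum(perm[i] == i for i in range(n))  # fixed points = own hats
    return total / trials

avg_matches = simulate_hats(n=10, trials=50000)
```

Note the $X_i$ here are dependent, but linearity of expectation needs no independence.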
Problem 27. The multinomial distribution.
A die with $r$ faces, numbered $1,\dots,r$, is rolled a fixed number of times $n$. The probability that the $i$th face comes up on any one roll is denoted $p_i$, and the results of different rolls are assumed independent. Let $X_i$ be the number of times that the $i$th face comes up. Find $E[X_iX_j]$ for $i\neq j$.
SOLUTION
- Let $Y_{i,k}$ (or $Y_{j,k}$) be the Bernoulli random variable that takes the value 1 if face $i$ (respectively, $j$) comes up on the $k$th roll, and the value 0 otherwise. Note that $Y_{i,k}Y_{j,k}=0$ (faces $i$ and $j$ cannot both come up on the same roll), and that for $l\neq k$, $Y_{i,k}$ and $Y_{j,l}$ are independent, so that $E[Y_{i,k}Y_{j,l}]=p_ip_j$. Expanding the product below, the $n$ same-roll terms vanish and the remaining $n(n-1)$ cross terms all have the same expectation. Therefore,
$$\begin{aligned}E[X_iX_j]&=E[(Y_{i,1}+\cdots+Y_{i,n})(Y_{j,1}+\cdots+Y_{j,n})]\\&=n(n-1)E[Y_{i,1}Y_{j,2}]\\&=n(n-1)p_ip_j\end{aligned}$$
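The identity $E[X_iX_j]=n(n-1)p_ip_j$ can be spot-checked by simulating a small multinomial experiment (a sketch with made-up parameters):

```python
# Sketch: Monte Carlo check of E[X_i X_j] = n(n-1) p_i p_j for a 3-faced die.
import random

def estimate_cross_moment(n, probs, i, j, trials, seed=0):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        counts = [0] * len(probs)
        for _ in range(n):  # n independent rolls
            counts[rng.choices(range(len(probs)), weights=probs)[0]] += 1
        total += counts[i] * counts[j]
    return total / trials

n, probs = 6, [0.5, 0.3, 0.2]
est = estimate_cross_moment(n, probs, 0, 1, trials=40000)
exact = n * (n - 1) * probs[0] * probs[1]  # = 6 * 5 * 0.5 * 0.3 = 4.5
```

The estimate should agree with 4.5 to within ordinary Monte Carlo noise.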
Problem 29. The inclusion-exclusion formula.
Let $A_1,A_2,\dots,A_n$ be events. Let $S_1=\{i\mid 1\leq i\leq n\}$, $S_2=\{(i_1,i_2)\mid 1\leq i_1<i_2\leq n\}$, and more generally, let $S_m$ be the set of all $m$-tuples $(i_1,\dots,i_m)$ of indices that satisfy $1\leq i_1<i_2<\cdots<i_m\leq n$. Show that
$$P\Big(\bigcup_{k=1}^nA_k\Big)=\sum_{i\in S_1}P(A_i)-\sum_{(i_1,i_2)\in S_2}P(A_{i_1}\cap A_{i_2})+\cdots+(-1)^{n-1}P\Big(\bigcap_{k=1}^nA_k\Big)$$
- Hint: Let $X_i$ be a binary random variable which is equal to 1 when $A_i$ occurs, and equal to 0 otherwise. Relate the event of interest to the random variable $(1-X_1)(1-X_2)\cdots(1-X_n)$.
SOLUTION
- Let us express the event $B=\cup_{k=1}^nA_k$ in terms of the random variables $X_1,\dots,X_n$. The event $B^C$ occurs when $Y=(1-X_1)(1-X_2)\cdots(1-X_n)$ is equal to $1$, so
$$P(B^C)=P(Y=1)=E[Y]$$
Therefore,
$$\begin{aligned}P(B)&=1-E[(1-X_1)(1-X_2)\cdots(1-X_n)]\\&=E[X_1+\cdots+X_n]-E\bigg[\sum_{(i_1,i_2)\in S_2}X_{i_1}X_{i_2}\bigg]+\cdots+(-1)^{n-1}E[X_1\cdots X_n]\end{aligned}$$
We note that
$$E[X_i]=P(A_i),\qquad E[X_{i_1}X_{i_2}]=P(A_{i_1}\cap A_{i_2}),\qquad E[X_{i_1}X_{i_2}X_{i_3}]=P(A_{i_1}\cap A_{i_2}\cap A_{i_3}),\qquad E[X_1X_2\cdots X_n]=P\Big(\cap_{k=1}^nA_k\Big)$$
etc., from which the desired formula follows.
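The formula can be verified by brute force on a small finite sample space with equally likely outcomes (an illustrative sketch; the events are randomly generated):

```python
# Sketch: brute-force check of inclusion-exclusion on small random events.
from itertools import combinations
import random

rng = random.Random(0)
omega = range(12)  # sample space with 12 equally likely outcomes
events = [set(rng.sample(omega, rng.randint(1, 8))) for _ in range(4)]
n = len(events)

# Direct probability of the union.
p_union = len(set().union(*events)) / len(omega)

# Inclusion-exclusion: alternating sum over all m-tuples of indices in S_m.
ie = 0.0
for m in range(1, n + 1):
    for idx in combinations(range(n), m):
        inter = set.intersection(*(events[k] for k in idx))
        ie += (-1) ** (m - 1) * len(inter) / len(omega)
```

Both quantities are exactly equal, since each outcome in the union is counted a net total of once by the alternating sum.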
Problem 30.
Alvin’s database of friends contains $n$ entries, but due to a software glitch, the addresses correspond to the names in a totally random fashion. Alvin writes a holiday card to each of his friends and sends it to the (software-corrupted) address. What is the probability that at least one of his friends will get the correct card?
- Hint: Use the inclusion-exclusion formula.
SOLUTION
- Let $A_k$ be the event that the $k$th card is sent to the correct address. We have for any distinct $k,j,i$,
$$P(A_k)=\frac{1}{n},\qquad P(A_k\cap A_j)=\frac{1}{n}\cdot\frac{1}{n-1},\qquad P(A_k\cap A_j\cap A_i)=\frac{1}{n}\cdot\frac{1}{n-1}\cdot\frac{1}{n-2}$$
etc., and in general the probability that any fixed set of $m$ cards all reach the correct addresses is $(n-m)!/n!$.
- Applying the inclusion-exclusion formula, we obtain the desired probability
$$P\Big(\bigcup_{k=1}^nA_k\Big)=n\cdot\frac{1}{n}-\binom{n}{2}\frac{(n-2)!}{n!}+\binom{n}{3}\frac{(n-3)!}{n!}-\cdots=1-\frac{1}{2!}+\frac{1}{3!}-\cdots+\frac{(-1)^{n-1}}{n!}$$
When $n$ is large, this probability can be approximated by $1-e^{-1}$.
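A quick numerical sketch (assuming only the series just derived) shows how fast the probability approaches $1-e^{-1}$:

```python
# Sketch: the alternating series 1 - 1/2! + 1/3! - ... + (-1)^(n-1)/n!
# converges rapidly to 1 - e^{-1} as n grows.
import math

def p_at_least_one_match(n):
    """Probability that at least one of the n cards reaches the right friend."""
    return sum((-1) ** (m - 1) / math.factorial(m) for m in range(1, n + 1))

exact_limit = 1 - math.exp(-1)
values = {n: p_at_least_one_match(n) for n in (3, 5, 10)}
```

Already at $n=10$ the truncation error is below $1/11!\approx 2.5\times 10^{-8}$, so the $1-e^{-1}\approx 0.632$ approximation is excellent even for modest $n$.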