Homework1
Question 2
Consider the following model the location of a tennis player’s serve when serving to the right. Let where is the bivariate normal distribution with mean vector and covariance matrix and and represents the lateral and depth
locations of the landing point of the serve on the court. A serve is considered legal if the ball lands in the cross court service box, the area
bounded by and . The following R code can be used to generate an image of the tennis court.
#Creates a data.frame object, the easy structure to use for ggploting
tennisCourt = data.frame(x1 = c(
0
,
4.5
,
18
,
31.5
,
36
,
0
,
4.5
,
4.5
,
0
,-
2
),
x2 = c(
0
,
4.5
,
18
,
31.5
,
36
,
36
,
31.5
,
31.5
,
36
,
38
),
y1 = c(-
39
,-
39
,-
21
,-
39
,-
39
,
39
,
21
,-
21
,-
39
,
0
), y2 = c(
39
,
39
,
21
,
39
,
39
,
39
,
21
,-
21
,-
39
,
0
),
width = c(rep(
1
,
9
),
3
))
#Creates a plot object called ggTennis
ggTennis = ggplot(tennisCourt) + geom_segment(aes(x = x1,y = y1,xend = x2,yend = y2), linewidth = tennisCourt$width) + labs(x = "Court Length"
,y = 'Court Width'
,
title = 'Tennis Court'
)
a. Generate 5,000 independent realizations of and use ggplot to create a scatterplot of your simulated values of over
the provided tennis court plot.
# set seed and create data vectors
set.seed(
98989
)
sample_size <- 5000 sample_meanvector <- c(
29
, 16
) sample_covariance_matrix <- matrix(c(
4
, 4
, 4
, 16
),
ncol = 2
)
# create bivariate normal distribution samples
sample_tennis <- mvrnorm(n = sample_size,
mu = sample_meanvector, Sigma = sample_covariance_matrix)
tennis<- as.data.frame(sample_tennis)
pointsToAdd = data.frame(x=tennis$V1, y=tennis$V2)
ggTennisWithPoints = ggTennis + geom_point(data = pointsToAdd, aes(x = x, y = y),color = 'firebrick'
)
ggTennisWithPoints
b.Using the model, what is the theoretical probablity a serve from the player will be legal? Additionally, show how you can
approximate this probability from the realizations of and provide the numeric value of your approximation.
lower <- c(
18
, 0
)
upper <- c(
31.5
, 21
)
mean <- c(
29
, 16
)
sigma <- diag(sqrt(
4
),
2
)
sigma[
2
,
2
] <- sqrt(
16
)
prob <- pmvnorm(lower=lower, upper=upper, mean=mean, sigma = sigma)
print(prob)
## [1] 0.9554798
## attr(,"error")
## [1] 1e-15
## attr(,"msg")
## [1] "Normal Completion"
cov_sample<-cov(tennis)
sigma2 <- diag(sqrt(cov_sample[
1
,
1
]),
2
)
sigma2[
2
,
2
] <- sqrt(cov_sample[
2
,
2
])
mean2 <- c(mean(tennis$V1), mean(tennis$V2))
prob2 <- pmvnorm(lower=lower, upper=upper, mean=mean2, sigma = sigma2)
print(prob2)
## [1] 0.954265
## attr(,"error")
## [1] 1e-15
## attr(,"msg")
## [1] "Normal Completion"
c. Say the player decides to evaluate their serve that land further to the right (positive direction). Given that the player
examines their serves landing around , what is the conditional distribution of ? What is the probability that
these serves are legal (only considering depth, not width)?(Hink: Consider using the pnorm function)
The conditional distribution of given is:
where x <- 30.5
mu_x2 <- 29
sigma_x2 <- sqrt(
16
)
sigma_x1 <- sqrt(
4
)
rho <- 4
/(sigma_x1*sigma_x2)
mean_x2_given_x1 <- 16 + rho*(sigma_x2/sigma_x1)*(x-mu_x2)
var_x2_given_x1 <- (sigma_x2)^
2
*(
1
-rho^
2
)
print(mean_x2_given_x1)
## [1] 17.5
print(var_x2_given_x1)
## [1] 12
prob_x2<-pnorm(
21
, mean_x2_given_x1, sqrt(var_x2_given_x1))
print(prob_x2)
## [1] 0.8438393
Therefore, the conditional distribution of and the probability that these serves are legal is 0.844.
d. Generate 500 realizations of from the conditional distribution found in part c. Create a new version of the scatterplot
constructed for part a that includes plots of the 500 realizations of plotted as different color points with fixed at
30.5. Add a small amount of random noise to the component to reduce the effects of overplotting (consider using R’s
jitter function). Describe your results. How do your values of generated from the conditional distribution compare to the
values generated directly from the original distribution?
x2_data <- rnorm(
500
, mean=mean_x2_given_x1, sd=sqrt(var_x2_given_x1))
#conditional_tennis <- data.frame(x=30.5, y=x2_data)
pointsToAdd_conditional <- data.frame(x=
30.5
, y=x2_data)
ggTennisWithPoints2 = ggTennisWithPoints + geom_jitter(data = pointsToAdd_conditional, aes(x = x, y = y),color = 'blue'
, width=
1
)
ggTennisWithPoints2
Since the condition guarantees one direction is satisfied to be considered legal, the probability that these serves are legal goes higher.
As we can see the plot above, the points are extremely condensed. The values of generated from the conditional distribution mostly land
inside the legal box. However, the values generated directly from the original distribution shape a ellipse, so the upper part of points locate
beyond the legal box.
χ
=
[
]
∼
([
]
,
[
])
X
1
X
2
N
2
29
16
4
4
4
16
(
υ
,
C
)
N
2
υ
C
X
1
X
2
18
≤
≤
31.5
X
1
0
≤
≤
21
X
2
x
x
χ
X
1
= 30.5
X
1
X
2
X
2
=
x
X
1
|
∼
N
(
+
ρ
(
x
−
),
)
X
2
x
1
μ
X
2
σ
X
2
σ
X
1
μ
X
1
σ
2
|
X
2
X
1
=
(1
−
)
σ
2
|
X
2
X
1
σ
2
X
2
ρ
2
|
∼
N
(17.5, 12)
X
2
x
1
X
2
X
2
X
1
X
1
X
2
= 30.5
X
1
X
2