School: Biola University
Course: 210
Subject: Economics
Date: Apr 3, 2024
Pages: 16
Uploaded by: JusticeOpossumMaster1074
Additional Exercises
1. The General Social Survey asked the question, “After an average workday, about
how many hours do you have to relax or pursue activities that you enjoy?” to
a random sample of 1,155 Americans. The average relaxing time was found to
be 1.65 hours. Determine which of the following is an observation, a variable, a
sample statistic (value calculated based on the observed sample), or a population
parameter.
(a) Average number of hours all Americans spend relaxing after an average
workday.
(b) Number of hours spent relaxing after an average workday.
(c) An American in the sample.
(d) 1.65.
2. Researchers collected data to examine the relationship between air pollutants
and preterm births in Southern California. During the study air pollution lev-
els were measured by air quality monitoring stations. Specifically, levels of car-
bon monoxide were recorded in parts per million, nitrogen dioxide and ozone
in parts per hundred million, and coarse particulate matter (PM$_{10}$) in µg/m$^3$.
Length of gestation data were collected on 143,196 births between the years 1989
and 1993, and air pollution exposure during gestation was calculated for each
birth. The analysis suggested that increased ambient PM$_{10}$ and, to a lesser degree,
CO concentrations may be associated with the occurrence of preterm births.
(https://www.jstor.org/stable/3703990)
(a) Who are the subjects in this study, and how many are included?
(b) Comment on whether the results of the study can be generalized to the pop-
ulation, and if the findings of the study can be used to establish causal rela-
tionships.
3. Researchers studying the effect of antibiotic treatment for acute sinusitis com-
pared to symptomatic treatments randomly assigned 89 adults diagnosed with
acute sinusitis to one of two groups: treatment or control.
Study participants
received either a 10-day course of amoxicillin (an antibiotic) or a placebo simi-
lar in appearance and taste.
The placebo consisted of symptomatic treatments
such as acetaminophen, nasal decongestants, etc. At the end of the 10-day pe-
riod, patients were asked if they experienced improvement in symptoms.
The
distribution of responses is summarized below.
(https://jamanetwork.com/journals/jama/fullarticle/1104985)
(a) What percent of patients in the treatment group experienced improvement
in symptoms?
(b) What percent experienced improvement in symptoms in the control group?
(c) In which group did a higher percentage of patients experience improvement
in symptoms?
(d) Your findings so far might suggest a real difference in the effectiveness of
antibiotic and placebo treatments for improving symptoms of sinusitis. However,
this is not the only possible conclusion. What is one other possible explanation
for the observed difference between the percentages of patients who
experienced improvement in symptoms?
(e) What are the explanatory and response variables in this study?
4. In a study of the relationship between socio-economic class and unethical be-
havior, 129 University of California undergraduates at Berkeley were asked to
identify themselves as having low or high social class by comparing themselves
to others with the most (least) money, most (least) education, and most (least)
respected jobs.
They were also presented with a jar of individually wrapped
candies and informed that the candies were for children in a nearby laboratory,
but that they could take some if they wanted.
After completing some unre-
lated tasks, participants reported the number of candies they had taken. It was
found that those who were identified as upper-class took more candy than others.
(https://www.pnas.org/content/109/11/4086)
(a) Identify the population of interest and the sample in this study.
(b) Comment on whether the results of the study can be generalized to the pop-
ulation, and if the findings of the study can be used to establish causal rela-
tionships.
5. Researchers collected data to examine the relationship between air pollutants
and preterm births in Southern California. During the study air pollution lev-
els were measured by air quality monitoring stations. Specifically, levels of car-
bon monoxide were recorded in parts per million, nitrogen dioxide and ozone
in parts per hundred million, and coarse particulate matter (PM$_{10}$) in µg/m$^3$.
Length of gestation data were collected on 143,196 births between the years 1989
and 1993, and air pollution exposure during gestation was calculated for each
birth. The analysis suggested that increased ambient PM$_{10}$ and, to a lesser degree,
CO concentrations may be associated with the occurrence of preterm births.
(https://www.jstor.org/stable/3703990)
(a) Identify the main research question of the study.
(b) Who are the subjects in this study, and how many are included?
(c) What are the variables in the study? Identify each variable as numerical or
categorical. If numerical, state whether the variable is discrete or continuous.
If categorical, state whether the variable is ordinal.
6. The bar graph and the pie chart below show the distribution of pre-existing medi-
cal conditions of children involved in a study on the optimal duration of antibiotic
use in treatment of tracheitis, which is an upper respiratory infection.
(a) What features are apparent in the bar graph but not in the pie chart?
(b) What features are apparent in the pie chart but not in the bar graph?
(c) Which graph would you prefer to use for displaying these categorical data?
7. A local news survey asked 500 randomly sampled Los Angeles residents which
shipping carrier they prefer to use for shipping holiday gifts. The table below
shows the distribution of responses by age group as well as the expected counts
for each cell.
(a) Which graph (top or bottom) would you use to understand the shipping
choices of people of different ages?
(b) Which graph (top or bottom) would you use to understand the age distribu-
tion across different types of shipping choices?
(c) A new shipping company would like to market to people over the age of 55.
Who will be their biggest competitor?
(d) FedEx would like to reach out to grow their market share to balance the age
demographics of FedEx users. To what age group should FedEx market?
8. Here are the starting salaries, in thousands of dollars, offered to the 20 students
who earned degrees in computer science in 2011 at a university.
63 56 66 77 50 53 78 55 90 65
64 69 59 76 48 54 49 68 51 50
(a) Make a graph to describe the distribution and write a brief description of its
important features.
(b) Find the median salary.
(c) Find the mean salary.
(d) Find the mode of the salaries.
(e) Is the mean about the same as the median or not? What feature of the dis-
tribution explains the difference between the mean and the median? Is the
mode a good measure of the center for these data?
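The summary measures asked for in Exercise 8 can be checked with a short script. This is only a sketch using Python's standard `statistics` module; the variable names are illustrative and not part of the exercise.

```python
import statistics

# Starting salaries (in thousands of dollars) listed in Exercise 8.
salaries = [63, 56, 66, 77, 50, 53, 78, 55, 90, 65,
            64, 69, 59, 76, 48, 54, 49, 68, 51, 50]

mean_salary = statistics.mean(salaries)      # arithmetic mean
median_salary = statistics.median(salaries)  # average of the two middle values (n is even)
mode_salary = statistics.mode(salaries)      # most frequent value

print(f"mean = {mean_salary}, median = {median_salary}, mode = {mode_salary}")
```

Comparing the printed mean with the median gives a quick hint about the direction of any skew before drawing the graph in part (a).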
9. Each month, the Commerce Department reports the “average” price of new single-
family homes. For August 2012, the two “averages” reported were $256,900 and
$295,300. Which of these numbers was the mean price and which was the median
price? Explain your answer.
10. In 1961 New York Yankee outfielder Roger Maris held the major league record for
home runs in a single season, with 61 home runs. That record held for 37 years.
Here are Maris’s home run totals for his 10 years in the American League.
13 23 26 16 33 61 28 39 14 8
(a) Find the mean number of home runs that Maris hit in a year, both with and
without his record 61. How does removing the record number of home runs
affect his mean number of runs?
(b) Find the median number of home runs that Maris hit in a year, both with and
without his record 61. How does removing the record number of home runs
affect his median number of runs?
(c) If you had to choose between the mean and median to describe Maris’s home
run hitting pattern, which would you use?
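One way to see how a single extreme value affects the two measures in Exercise 10 is to compute each of them twice. The sketch below uses the standard `statistics` module; the names `home_runs` and `without_record` are illustrative.

```python
import statistics

# Maris's home run totals for his 10 American League seasons (Exercise 10).
home_runs = [13, 23, 26, 16, 33, 61, 28, 39, 14, 8]

# Same list with the record season (61) removed.
without_record = [h for h in home_runs if h != 61]

mean_all = statistics.mean(home_runs)
mean_without = statistics.mean(without_record)
median_all = statistics.median(home_runs)
median_without = statistics.median(without_record)

print(f"mean:   {mean_all:.2f} with 61, {mean_without:.2f} without")
print(f"median: {median_all:.2f} with 61, {median_without:.2f} without")
```

The size of each change is what parts (a) and (b) ask you to compare.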
11. In a class of 25 students, 24 of them took an exam in class and 1 student took
a make-up exam the following day.
The professor graded the first batch of 24
exams and found an average score of 74 points with a standard deviation of 8.9
points. The student who took the make-up the following day scored 64 points on
the exam.
(a) Does the new student’s score increase or decrease the average score?
(b) What is the new average?
(c) Does the new student’s score increase or decrease the standard deviation of
the scores?
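The arithmetic behind Exercise 11 can be sketched as follows. The code assumes that the reported 8.9 is a sample standard deviation (divisor n - 1); the exercise does not say which divisor was used, so treat that as an assumption.

```python
import math

n = 24          # exams in the first batch
mean_24 = 74.0  # their average score
sd_24 = 8.9     # their standard deviation (assumed to use the n - 1 divisor)
makeup = 64.0   # score on the make-up exam

# New average: total points divided by the new number of exams.
new_mean = (n * mean_24 + makeup) / (n + 1)

# Sum of squared deviations for the first 24 exams, recovered from the SD.
ss_old = (n - 1) * sd_24 ** 2

# Standard updating identity for adding one observation:
# SS_new = SS_old + n / (n + 1) * (x_new - old_mean)^2
ss_new = ss_old + n / (n + 1) * (makeup - mean_24) ** 2

# New sample SD uses divisor (n + 1) - 1 = n.
new_sd = math.sqrt(ss_new / n)

print(f"new mean = {new_mean:.2f}, new SD = {new_sd:.3f}")
```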
12. Compare the two plots below. What characteristics of the distribution are appar-
ent in the histogram and not in the box plot? What characteristics are apparent in
the box plot but not in the histogram?
13. Estimate the median for the 400 observations shown in the histogram, and note
whether you expect the mean to be higher or lower than the median.
14. For each of the following, state whether you expect the distribution to be sym-
metric, right skewed, or left skewed. Also specify whether the mean or median
would best represent a typical observation in the data, and whether the variability
of observations would be best represented using the standard deviation or IQR.
Explain your reasoning.
(a) Housing prices in a country where 25% of the houses cost below $350,000,
50% of the houses cost below $450,000, 75% of the houses cost below $1,000,000,
and there are a meaningful number of houses that cost more than $6,000,000.
(b) Housing prices in a country where 25% of the houses cost below $300,000,
50% of the houses cost below $600,000, 75% of the houses cost below $900,000,
and there are very few houses that cost more than $1,200,000.
(c) Number of alcoholic drinks consumed by college students in a given week.
Assume that most of these students don’t drink since they are under 21 years
old, and only a few drink excessively.
(d) Annual salaries of the employees at a Fortune 500 company where only a few
high level executives earn much higher salaries than all the other employees.
(e) Gestation time in humans where 25% of the babies are born by 38 weeks of
gestation, 50% of the babies are born by 39 weeks, 75% of the babies are born
by 40 weeks, and the maximum gestation length is 46 weeks.
15. The final exam scores of twenty introductory statistics students, arranged in
ascending order, are as follows: 57, 66, 69, 71, 72, 73, 74, 77, 78, 78, 79, 79, 81, 81, 82,
83, 83, 88, 89, 94. Suppose students who score above the 75th percentile on the
final exam get an A in the class. How many students will get an A in this class?
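A count like the one in Exercise 15 depends on how the 75th percentile is defined. The sketch below uses `statistics.quantiles` with its default "exclusive" method, which is one common convention and not necessarily the one your textbook uses.

```python
import statistics

# Final exam scores from Exercise 15, already in ascending order.
scores = [57, 66, 69, 71, 72, 73, 74, 77, 78, 78,
          79, 79, 81, 81, 82, 83, 83, 88, 89, 94]

# quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
q1, q2, q3 = statistics.quantiles(scores, n=4)

# Students strictly above the 75th percentile earn an A.
num_a = sum(score > q3 for score in scores)

print(f"Q3 = {q3}, students with an A = {num_a}")
```

For these data the cut point falls between 82 and 83 under the usual conventions, so the count does not depend on the exact interpolation rule.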
16. Describe (in words) the distribution in the histograms below and match them to
the box plots.
17. Daily air quality is measured by the air quality index (AQI) reported by the En-
vironmental Protection Agency. This index reports the pollution level and what
associated health effects might be a concern. The index is calculated for five ma-
jor air pollutants regulated by the Clean Air Act and takes values from 0 to 300,
where a higher value indicates lower air quality. AQI was reported for a sample
of 91 days in 2011 in Durham, NC. The histogram below shows the distribution
of the AQI values on these days.
(a) Estimate $Q_1$, $Q_3$, and $IQR$ for the distribution.
(b) Would any of the days in this sample be considered to have an unusually
low or high AQI? Explain your reasoning.
18. Match each correlation to the corresponding scatterplot.
(a) $r = -0.7$
(b) $r = 0.4$
(c) $r = 0.06$
(d) $r = 0.92$
19. What would be the correlation between the ages of partners if people always
dated others who are
(a) 3 years younger than themselves?
(b) 2 years older than themselves?
(c) half as old as themselves?
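Exercise 19 can be explored numerically. The sketch below computes the Pearson correlation by hand for a small, made-up list of ages (the ages are hypothetical and chosen only for illustration) and applies each of the three dating rules.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical ages, not taken from the exercise.
ages = [20, 24, 31, 38, 45, 52]

r_younger = pearson_r(ages, [a - 3 for a in ages])  # partner 3 years younger
r_older = pearson_r(ages, [a + 2 for a in ages])    # partner 2 years older
r_half = pearson_r(ages, [a / 2 for a in ages])     # partner half as old

print(r_younger, r_older, r_half)
```

Each rule is an exact linear function of age with positive slope, which is why the three results agree.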
20. Determine if the following statements are true or false. If false, explain why.
(a) A correlation coefficient of $-0.90$ indicates a stronger linear relationship than
a correlation of $0.5$.
(b) Correlation is a measure of the association between any two variables.
21. The Coast Starlight Amtrak train runs from Seattle to Los Angeles. Correlation
between travel time (in minutes) and distance (in miles) is $r = 0.636$. The equation
of the regression line is
$\widehat{\text{travel time}} = 51 + 0.726 \times \text{distance}$.
(a) What is the correlation coefficient between travel time (in hours) and
distance (in kilometers)?
(b) Does the intercept have a useful interpretation in this situation? If so, give
the interpretation. If not, explain why not.
(c) Does the slope have a useful interpretation in this situation? If so, give the
interpretation. If not, explain why not.
(d) Calculate $r^2$ of the regression line for predicting travel time from distance
traveled for the Coast Starlight, and interpret it in the context of the
application.
(e) The distance between Santa Barbara and Los Angeles is 103 miles. Use the
model to estimate the time it takes for the Starlight to travel between these
two cities.
(f) It actually takes the Coast Starlight about 168 mins to travel from Santa Bar-
bara to Los Angeles. Calculate the residual and explain the meaning of this
residual value.
(g) Suppose Amtrak is considering adding a stop to the Coast Starlight 500 miles
away from Los Angeles. Would it be appropriate to use this linear model to
predict the travel time from Los Angeles to this point?
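The numerical parts of Exercise 21 (parts d, e, and f) reduce to plugging values into the fitted line. A minimal sketch, with constant and function names chosen here for illustration:

```python
# Fitted line from Exercise 21: travel time (minutes) = 51 + 0.726 * distance (miles).
INTERCEPT_MIN = 51.0
SLOPE_MIN_PER_MILE = 0.726
R = 0.636  # correlation between travel time and distance

def predict_travel_time(distance_miles: float) -> float:
    """Predicted travel time in minutes for a given distance in miles."""
    return INTERCEPT_MIN + SLOPE_MIN_PER_MILE * distance_miles

r_squared = R ** 2                    # part (d)
predicted = predict_travel_time(103)  # part (e): Santa Barbara to Los Angeles
residual = 168 - predicted            # part (f): observed minus predicted

print(f"r^2 = {r_squared:.3f}")
print(f"predicted = {predicted:.1f} min, residual = {residual:.1f} min")
```

A positive residual means the observed trip took longer than the model predicts.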
22. Which type of probability applies to each of the following situations?
(a) Based on data from the 1991 to 1993 General Social Survey, the probability
was about $\frac{612}{1669} \approx 0.367$ in those years that a randomly selected adult who
had ever married had divorced.
(b) Your friend says that her chance of getting an A in statistics is 30%.
(c) If you guess on a true-false question, the probability of getting it right is 0.50.
23. The following risks are associated with tendon surgery: infection (3%), repair
fails (14%), both infection and repair fails (1%). What percent of tendon surgeries
succeed and are free of infection?
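Exercise 23 is an application of the general addition rule followed by a complement. A short sketch of that calculation (variable names are illustrative):

```python
# Risks stated in Exercise 23, as probabilities.
p_infection = 0.03
p_repair_fails = 0.14
p_both = 0.01

# General addition rule: P(A or B) = P(A) + P(B) - P(A and B).
p_any_problem = p_infection + p_repair_fails - p_both

# Complement: surgery succeeds and is free of infection.
p_success_no_infection = 1 - p_any_problem

print(f"P(any problem) = {p_any_problem:.2f}")
print(f"P(success and no infection) = {p_success_no_infection:.2f}")
```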
24. A study of the US clinical population found that 22.5% of patients are diagnosed
with a mental disorder, 13.5% of patients are diagnosed with an alcohol-related
disorder, and 5% are diagnosed with both disorders.
(a) Are mental disorders and alcohol-related disorders disjoint events (statistically
speaking)? Explain why or why not.
(b) Are mental disorders and alcohol-related disorders independent events
(statistically speaking)? Explain why or why not.
25. Which of the following sequences resulting from tossing a fair coin 5 times is most
likely, where $H$ = head and $T$ = tail?
• HHHHH
• HTHHT
• HHHTT
• They are equally likely
26. Suppose 80% of people like peanut butter, 89% like jelly, and 78% like both. Given
that a randomly selected person likes peanut butter, what is the probability that
he also likes jelly?
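The conditional probability in Exercise 26 follows from the definition P(B | A) = P(A and B) / P(A). A minimal sketch with illustrative variable names:

```python
# Percentages from Exercise 26, as probabilities.
p_peanut_butter = 0.80
p_jelly = 0.89
p_both = 0.78

# Definition of conditional probability: P(jelly | peanut butter).
p_jelly_given_pb = p_both / p_peanut_butter

print(f"P(jelly | peanut butter) = {p_jelly_given_pb:.3f}")
```

Note that `p_jelly` is not needed for this particular question; it would matter if the conditioning were reversed.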
27. The average daily high temperature in June in LA is 77°F with a standard deviation
of 5°F. Suppose that the temperatures in June closely follow a normal distribution.
(a) What is the probability of observing an 83°F temperature or higher in LA
during a randomly chosen day in June?
(b) How cold are the coldest 10% of the days during June in LA?
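Instead of a z-table, the two parts of Exercise 27 can be checked with `statistics.NormalDist` from the Python standard library. This is a sketch under the exercise's own assumption that June highs are normally distributed; the variable names are illustrative.

```python
from statistics import NormalDist

# June daily high temperature in LA: mean 77 F, standard deviation 5 F.
june_highs = NormalDist(mu=77, sigma=5)

# Part (a): upper-tail probability of 83 F or higher.
p_83_or_higher = 1 - june_highs.cdf(83)

# Part (b): temperature below which the coldest 10% of days fall.
coldest_10_cutoff = june_highs.inv_cdf(0.10)

print(f"P(X >= 83) = {p_83_or_higher:.4f}")
print(f"10th percentile = {coldest_10_cutoff:.2f} F")
```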
28. MENSA is an organization whose members have IQs in the top 2% of the population.
IQs are normally distributed with mean 100, and the minimum IQ score
required for admission to MENSA is 132. Find the standard deviation of the dis-
tribution of IQs.
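Exercise 28 runs the usual z-score calculation backwards: the cutoff and its percentile are known, and the standard deviation is the unknown. A sketch using `statistics.NormalDist` (names are illustrative):

```python
from statistics import NormalDist

mean_iq = 100
mensa_cutoff = 132     # minimum IQ for admission
top_fraction = 0.02    # MENSA members are in the top 2%

# z-score of the 98th percentile of the standard normal distribution.
z_cutoff = NormalDist().inv_cdf(1 - top_fraction)

# Solve z = (x - mean) / sd for sd.
sd_iq = (mensa_cutoff - mean_iq) / z_cutoff

print(f"z = {z_cutoff:.3f}, implied SD = {sd_iq:.2f}")
```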
29. Consider the salaries of major league baseball (MLB) players, where each player
is a member of the league’s 30 teams. Suppose we would like to draw a sample
of the salaries of 120 players. How would you draw a
(a) simple random sample?
(b) systematic sample?
(c) stratified sample?
(d) cluster sample?
(e) convenience sample?
30. For each of the following situations, can we use the standard normal distribution
table ($z$-table) to compute probabilities?
(a) Weights of adults are approximately Normally distributed with mean 150
lbs and standard deviation 25 lbs. We want to know the probability that a
randomly selected person weighs more than 200 pounds.
(b) Weights of adults are approximately Normally distributed with mean 150
lbs and standard deviation 25 lbs. We want to know the probability that the
average weight of 10 randomly selected people is more than 200 pounds.
(c) Weights of adults are approximately Normally distributed with mean 150
lbs and standard deviation 25 lbs. We want to know the probability that the
average weight of 50 randomly selected people is more than 200 pounds.
(d) Salaries at a large corporation have mean of $40,000 and standard deviation
of $20,000. We want to know the probability that a randomly selected em-
ployee makes more than $50,000.
(e) Salaries at a large corporation have mean of $40,000 and standard deviation
of $20,000. We want to know the probability that the average of ten randomly
selected employees is more than $50,000.
(f) A club has 50 members, 10 of whom think the president should be deposed.
What is the probability that, if we select 20 members at random, 18% or more
in our sample think the president should be deposed?
(g) A club has 5000 members, 1000 of whom think the president should be deposed.
What is the probability that, if we select 91 members at random, 18%
or more in our sample think the president should be deposed?
31. Suppose the average weight of adult males (age 18 or older) in a certain county is
190 lbs with a standard deviation of 20 lbs.
(a) What is the probability that the weight of a randomly selected adult male
from this county is greater than 193 lbs?
(b) Suppose we take a random sample of 16 adult males (age 18 or older) from
this county. What is the probability that their average weight is greater than
193 lbs?
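The contrast between the two parts of Exercise 31 is the spread: an individual weight varies with the population standard deviation, while a sample mean varies with the standard error. The sketch below assumes weights are approximately normally distributed, which the exercise does not state explicitly, so treat that as an assumption.

```python
import math
from statistics import NormalDist

mu = 190      # population mean weight (lbs)
sigma = 20    # population standard deviation (lbs)
n = 16        # sample size in part (b)

# Part (a): one randomly selected adult male (assumes a normal population).
individual = NormalDist(mu=mu, sigma=sigma)
p_individual = 1 - individual.cdf(193)

# Part (b): the mean of 16 weights has standard error sigma / sqrt(n).
standard_error = sigma / math.sqrt(n)
sample_mean = NormalDist(mu=mu, sigma=standard_error)
p_sample_mean = 1 - sample_mean.cdf(193)

print(f"P(X > 193) = {p_individual:.4f}")
print(f"P(sample mean > 193) = {p_sample_mean:.4f}")
```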
32. Which statement is not true about confidence intervals?
(a) A confidence interval is an interval of values computed from sample data
that is likely to include the true population value.
(b) An approximate formula for a 95% confidence interval is sample estimate ±
margin of error.
(c) A confidence interval between 0.2 and 0.4 means that the population propor-
tion lies between 0.2 and 0.4.
(d) A 99% confidence interval procedure has a higher probability of producing
intervals that will include the population parameter than a 95% confidence
interval procedure.
33. The General Social Survey asked the question: “For how many days during the
past 30 days was your mental health, which includes stress, depression, and prob-
lems with emotions, not good?” Based on responses from 1,151 US residents, the
survey reported a 95% confidence interval of 3.40 to 4.24 days in 2010.
(a) Interpret this interval in context of the data.
(b) What does “95% confident” mean? Explain in the context of the application.
(c) Suppose the researchers think a 99% confidence level would be more appropriate
for this interval. Will this new interval be narrower or wider than the
95% confidence interval?
(d) If a new survey were to be done with 500 Americans, do you think the mar-
gin of error will be larger, smaller, or about the same?
34. As seen from the density curves, the tails of a $t$-distribution are longer than those
of the standard normal, which results in the $t$ critical value being larger than the
$z$ critical value for any given confidence level. When finding a confidence interval
for a population mean, explain how mistakenly using the $z$ critical value (instead
of the correct $t$ critical value) would affect the confidence level.
35. A 90% confidence interval for a population mean is $(65, 77)$. The population
distribution is approximately normal and the population standard deviation is
unknown. This confidence interval is based on a simple random sample of 25
observations. Calculate the sample mean, the margin of error, and the sample standard
deviation. Assume that all conditions necessary for inference are satisfied. Use
the $t$-distribution in any calculations.
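Exercise 35 works backwards from the interval to its ingredients. In the sketch below, the critical value 1.711 is hard-coded as the $t$-table value for 24 degrees of freedom at 90% confidence; it is an assumption read from a table, not something computed by the code.

```python
import math

lower, upper = 65, 77   # reported 90% confidence interval
n = 25                  # sample size

# t critical value for df = n - 1 = 24 at 90% confidence (from a t-table).
T_STAR = 1.711

# The interval is centered at the sample mean; half its width is the margin of error.
sample_mean = (lower + upper) / 2
margin_of_error = (upper - lower) / 2

# margin_of_error = T_STAR * s / sqrt(n), solved for s.
sample_sd = margin_of_error * math.sqrt(n) / T_STAR

print(f"mean = {sample_mean}, ME = {margin_of_error}, s = {sample_sd:.2f}")
```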
36. In 2013, the Pew Research Foundation reported that “45% of U.S. adults report
that they live with one or more chronic conditions.” However, this value was
based on a sample, so it may not be a perfect estimate for the population parame-
ter of interest on its own. The study reported a standard error of about 1.2%, and
a normal model may reasonably be used in this setting.
(a) Create a 95% confidence interval for the proportion of U.S. adults who live
with one or more chronic conditions. Also interpret the confidence interval
in the context of the study.
(b) Identify each of the following statements as true or false. Provide an expla-
nation to justify each of your answers.
i. We can say with certainty that the confidence interval from part (a) con-
tains the true percentage of U.S. adults who suffer from a chronic illness.
ii. If we repeated this study 1,000 times and constructed a 95% confidence
interval for each study, then approximately 950 of those confidence in-
tervals would contain the true fraction of U.S. adults who suffer from
chronic illnesses.
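The interval in part (a) of Exercise 36 is the point estimate plus or minus a critical value times the standard error. A minimal sketch, using the conventional value 1.96 for 95% confidence under a normal model (variable names are illustrative):

```python
point_estimate = 0.45   # reported proportion with a chronic condition
standard_error = 0.012  # reported standard error
Z_STAR = 1.96           # conventional critical value for 95% confidence

margin_of_error = Z_STAR * standard_error
ci_lower = point_estimate - margin_of_error
ci_upper = point_estimate + margin_of_error

print(f"95% CI: ({ci_lower:.3f}, {ci_upper:.3f})")
```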
37. Georgianna claims that in a small city renowned for its music school, the average
child takes less than 5 years of piano lessons. We have a random sample of 20
children from the city, with a mean of 4.6 years of piano lessons and a standard
deviation of 2.2 years.
(a) Evaluate Georgianna’s claim (or that the opposite might be true) using a hy-
pothesis test.
(b) Construct a 95% confidence interval for the number of years students in this
city take piano lessons, and interpret it in context of the data.
(c) Do your results from the hypothesis test and the confidence interval agree?
Explain your reasoning.
38. You are given the hypotheses shown below. We know that the sample standard
deviation is 8 and the sample size is 20. For what sample mean would the $P$-value
be equal to 0.05? Assume that all conditions necessary for inference are satisfied.
$H_0: \mu = 60$, $H_a: \mu \neq 60$
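For a two-sided test, a $P$-value of exactly 0.05 occurs when the test statistic equals the critical value in either direction. The sketch below hard-codes 2.093 as the $t$-table value for 19 degrees of freedom with 0.025 in each tail; that number is an assumption read from a table.

```python
import math

null_mean = 60
sample_sd = 8
n = 20

# t critical value for df = n - 1 = 19, two-sided alpha = 0.05 (from a t-table).
T_STAR = 2.093

standard_error = sample_sd / math.sqrt(n)

# Two sample means give |T| = T_STAR, one on each side of the null value.
mean_below = null_mean - T_STAR * standard_error
mean_above = null_mean + T_STAR * standard_error

print(f"sample mean = {mean_below:.2f} or {mean_above:.2f}")
```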
39. A food safety inspector is called upon to investigate a restaurant with a few cus-
tomer reports of poor sanitation practices. The food safety inspector uses a hy-
pothesis testing framework to evaluate whether regulations are not being met.
If he decides the restaurant is in gross violation, its license to serve food will be
revoked.
(a) Write the hypotheses in words.
(b) What is a Type 1 Error in this context?
(c) What is a Type 2 Error in this context?
(d) Which error is more problematic for the restaurant owner? Why?
(e) Which error is more problematic for the diners? Why?
(f) As a diner, would you prefer that the food safety inspector requires strong
evidence or very strong evidence of health concerns before revoking a restau-
rant’s license? Explain your reasoning.
40. True or false. Determine if the following statements are true or false, and explain
your reasoning. If false, state how it could be corrected.
(a) If a given value (for example, the null hypothesized value of a parameter)
is within a 95% confidence interval, it will also be within a 99% confidence
interval.
(b) Decreasing the significance level ($\alpha$) will increase the probability of making
a Type 1 Error.
(c) Suppose the null hypothesis is $p = 0.5$ and we fail to reject $H_0$. Under this
scenario, the true population proportion is 0.5.
(d) With large sample sizes, even small differences between the null value and
the observed point estimate, a difference often called the effect size, will be
identified as statistically significant.
41. Researchers interviewed over 14,000 college students and asked them how many
hours they studied per credit hour taken, on average.
They hypothesized the
number of hours college students studied was less than the recommended 2 hours
per week per credit hour taken, on average. The average from the students in the
sample was 1.98 hours per week per credit hour taken.
After performing the
hypothesis test, researchers found the results “statistically significant”. Which of
the following can explain why the results were statistically significant when there
is practically no difference between the sample mean and the hypothesized value?
• The standard deviation was probably quite large.
• The $P$-value was large.
• The sample size was not large enough.
• The sample size was large.
42. Suppose you conduct a hypothesis test based on a sample where the sample size
is $n = 50$, and arrive at a $P$-value of 0.08. You then refer back to your notes
and discover that you made a careless mistake: the sample size should have been
$n = 500$. Will your $P$-value increase, decrease, or stay the same? Explain.
43. Chicken farming is a multi-billion dollar industry, and any methods that increase
the growth rate of young chicks can reduce consumer costs while increasing com-
pany profits, possibly by millions of dollars. An experiment was conducted to
measure and compare the effectiveness of various feed supplements on the growth
rate of chickens. Newly hatched chicks were randomly allocated into six groups,
and each group was given a different feed supplement. In this exercise we con-
sider chicks that were fed horsebean and linseed. Below are some summary statis-
tics from this dataset along with box plots showing the distribution of weights by
feed type.
(a) Describe the distributions of weights of chickens that were fed horsebean
and linseed.
(b) Do these data provide strong evidence that the average weights of chickens
that were fed linseed and horsebean are different?
Use a 5% significance
level.
(c) What type of error might we have committed? Explain.
(d) Would your conclusion change if we used $\alpha = 0.01$?