School: University of California, Los Angeles
Course: STATS 10
Subject: Statistics
Date: Apr 3, 2024
STATS 10 Assignment 2
Lalonye Calhoun 006059433
Discussion 3A/B
Exercise 1
Work with lead and copper data obtained from the residents of Flint, Michigan from January-
February, 2017. Data are reported in PPB (parts per billion, or μg/L) from each residential
testing kit. Remember that “Pb” denotes lead, and “Cu” denotes copper. You can learn more
about the Flint water crisis at https://en.wikipedia.org/wiki/Flint_water_crisis.
a. Download the data from the course site and read it into R, or use the online data link:
read.csv("https://ucla.box.com/shared/static/e9xuft4h3p8fdi4ydoj2hhujee0vmopb.csv")
When you read in the data, name your object “flint”.
b. The EPA states a water source is especially dangerous if the lead level is 15 PPB or
greater. What proportion of the locations tested were found to have dangerous lead
levels?
- 0.04436229 (about 4.4% of the locations tested)
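The calculation in part (b) can be sketched as below. This is a minimal illustration on made-up numbers, not the actual Flint file: `flint_toy` is a hypothetical stand-in, and the column name `Pb` is an assumption about how the CSV labels lead.

```r
# Hypothetical toy data; the real `flint` object comes from read.csv() in part (a).
# The column name Pb is an assumption about the CSV header.
flint_toy <- data.frame(Pb = c(0, 2, 3, 15, 1, 40, 5, 0, 7, 14))

# Comparing against the EPA threshold gives a logical vector (TRUE = dangerous).
dangerous <- flint_toy$Pb >= 15

# The mean of a logical vector is the proportion of TRUE values.
prop_dangerous <- mean(dangerous)
prop_dangerous         # 0.2 for this toy vector (a proportion, not a percent)
prop_dangerous * 100   # the same value expressed as a percentage
```

Using `>=` rather than `>` matters here because the problem counts a level of exactly 15 PPB as dangerous.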
c. Report the mean copper level for only test sites in the North region.
- 44.6424
d. Report the mean copper level for only test sites with dangerous lead levels (at least 15 PPB).
- 141.9631
e. Report the mean lead and copper levels.
- Copper: 54.581 PPB
- Lead: 3.383 PPB
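Parts (c) through (e) all come down to taking a mean over a subset of rows. A minimal sketch on made-up values follows; the column names `Pb`, `Cu`, and `Region`, and the label `"North"`, are assumptions about the file and should be checked against `names(flint)`.

```r
# Hypothetical toy data standing in for the real `flint` data frame.
flint_toy <- data.frame(
  Pb     = c(1, 20, 3, 16, 2, 0),
  Cu     = c(40, 150, 30, 130, 60, 50),
  Region = c("North", "South", "North", "North", "South", "South")
)

# (c) Mean copper level for rows in the North region only.
mean_cu_north <- mean(flint_toy$Cu[flint_toy$Region == "North"])

# (d) Mean copper level for rows with dangerous lead levels (Pb >= 15).
mean_cu_danger <- mean(flint_toy$Cu[flint_toy$Pb >= 15])

# (e) Overall means of lead and copper in one call.
overall_means <- colMeans(flint_toy[, c("Pb", "Cu")])

mean_cu_north
mean_cu_danger   # 140 for this toy data
overall_means
```

If the real columns contain missing values, `mean(..., na.rm = TRUE)` would be needed; the toy data has none.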
f. Create a box plot with a good title for the lead levels.
- (box plot not reproduced here)
g. Based on what you see in part (f), does the mean seem to be a good measure of center
for the data? Report a more useful statistic for this data.
- No. The median would be a better measure of center because the data are skewed, and the mean is pulled toward the extreme values.
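The reasoning in parts (f) and (g) can be illustrated with a small sketch. The values below are invented to be right-skewed and are not the Flint measurements; the title and axis label are only one possible choice.

```r
# Hypothetical right-skewed lead values (PPB), with one extreme reading.
pb_toy <- c(0, 0, 1, 1, 2, 2, 3, 4, 6, 45)

# Box plot with a descriptive title; extreme points are drawn individually.
boxplot(pb_toy,
        horizontal = TRUE,
        main = "Lead levels in toy sample (PPB)",
        xlab = "Lead (PPB)")

# The single extreme value drags the mean well above the median.
mean_pb   <- mean(pb_toy)     # 6.4
median_pb <- median(pb_toy)   # 2
c(mean = mean_pb, median = median_pb)
```

Here nine of the ten values fall below the mean, which is why the median describes a typical reading better for skewed data.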
Exercise 2
The data here represent life expectancies (Life) and per capita income (Income) in 1974 dollars
for 101 countries in the early 1970’s. The source of these data is: Leinhardt and Wasserman
(1979), New York Times (September 28, 1975, p. E-3). They also appear in Regression
Analysis by Ashish Sen and Muni Srivastava. You can access these data in R using:
life <- read.table("https://ucla.box.com/shared/static/rqk4lc030pabv30wknx2ft9jy848ub9n.txt",
                   header = TRUE)
a. Construct a scatterplot of Life against Income. Note: Income should be on the
horizontal axis. How does income appear to affect life expectancy?
- Life expectancy tends to rise with income. Countries with higher per capita income are more likely to have a life expectancy above 70, while countries with lower income tend to have a life expectancy around 50.
b. Construct the boxplot and histogram of Income. Are there any outliers?
- Boxplot: There were some outliers around 3000 to 5000.
- Histogram: I don't see any outliers.
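The box plot and the histogram can seem to disagree because `boxplot()` applies a fixed rule: by default it flags points more than 1.5 times the interquartile range beyond the hinges. A minimal sketch of that rule on invented incomes (not the real `life` data) uses `boxplot.stats()`, which returns the same flagged points without drawing anything.

```r
# Hypothetical per capita incomes; two values are far above the rest.
income_toy <- c(110, 150, 200, 250, 300, 400, 520, 700, 900, 3500, 5000)

# boxplot.stats() computes the five-number summary and the flagged points
# using the default 1.5 * IQR rule (coef = 1.5).
bs <- boxplot.stats(income_toy)
bs$stats            # lower whisker, lower hinge, median, upper hinge, upper whisker
outliers <- bs$out
outliers            # 3500 5000 for this toy vector

# The same points show up as a long right tail in a histogram.
hist(income_toy, main = "Income (toy data)", xlab = "Income")
```

In a histogram with wide bins, such points can blend into the tail, so checking `boxplot.stats(life$Income)$out` is one way to confirm what the box plot shows.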
c. Split the data set into two parts: One for which the Income is strictly below $1000, and
one for which the Income is at least $1000. Come up with your own names for these two
objects.
- lowerthan1000 = life[life$Income < 1000,]
- Above1000 = life[life$Income >= 1000,]
d. Use the data for which the Income is below $1000. Plot Life against Income and
compute the correlation coefficient. Hint: use the function cor()
- 0.752886
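Parts (c) and (d) can be sketched together as below. The data frame `life_toy` and the object names are hypothetical; only the column names `Income` and `Life` come from the problem statement. The toy data includes a country at exactly 1000 to show why the second subset needs `>=`.

```r
# Hypothetical toy data with the same column names as `life`.
life_toy <- data.frame(
  Income = c(120, 300, 450, 800, 999, 1000, 2500, 4000),
  Life   = c(42, 48, 51, 58, 60, 66, 70, 72)
)

# Strictly below 1000 versus at least 1000; together they cover every row.
below_1000    <- life_toy[life_toy$Income < 1000, ]
at_least_1000 <- life_toy[life_toy$Income >= 1000, ]
stopifnot(nrow(below_1000) + nrow(at_least_1000) == nrow(life_toy))

# Scatterplot of Life against Income (Income on the horizontal axis).
plot(Life ~ Income, data = below_1000,
     main = "Life expectancy vs. income (Income < 1000, toy data)")

# Pearson correlation coefficient for the lower-income subset.
r_below <- cor(below_1000$Income, below_1000$Life)
r_below
```

With `>` instead of `>=`, the row with income exactly 1000 would be dropped from both subsets and the row-count check would fail.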
Exercise 3
The Maas river data contain the concentration of lead and zinc in ppm at 155 locations at
the banks of the Maas river in the Netherlands. You can read the data in R as follows:
maas <- read.table("https://ucla.box.com/shared/static/tv3cxooyp6y8fh6gb0qj2cxihj8klg1h.txt",
                   header = TRUE)
a. Compute the summary statistics for lead and zinc using the summary() function.
- Lead:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   37.0    72.5   123.0   153.4   207.0   654.0
- Zinc:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  113.0   198.0   326.0   469.7   674.5  1839.0
b. Plot two histograms: one of lead and one of log(lead).
- Lead: (histogram not reproduced here)
- Log(lead): (histogram not reproduced here)
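One way to draw the two histograms side by side is sketched below. The values are simulated and only mimic a right-skewed concentration; they are not the Maas measurements, and the real column name (assumed here to be `lead`) should be checked with `names(maas)`.

```r
# Simulated, hypothetical concentrations: positive and right-skewed.
set.seed(1)
lead_toy <- exp(rnorm(155, mean = 4.8, sd = 0.6))

# Two panels in one row; restore the previous layout afterwards.
op <- par(mfrow = c(1, 2))
hist(lead_toy,      main = "Lead (toy data)",      xlab = "lead (ppm)")
hist(log(lead_toy), main = "log(lead) (toy data)", xlab = "log(lead)")
par(op)
```

`log()` in R is the natural logarithm. The transformation compresses large values more than small ones, which is why a right-skewed variable often looks more symmetric on the log scale.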
c. Plot log(lead) against log(zinc). What do you observe?
- The two variables are positively correlated, and the scatterplot shows an approximately linear relationship.
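The observation in part (c) can be checked numerically as well as visually. The sketch below uses a few invented pairs of values; the column names `lead` and `zinc` are assumptions about the Maas file.

```r
# Hypothetical toy data standing in for `maas`.
maas_toy <- data.frame(
  lead = c(40, 75, 120, 200, 650),
  zinc = c(115, 200, 330, 670, 1800)
)

# log(lead) on the vertical axis, log(zinc) on the horizontal axis.
plot(log(lead) ~ log(zinc), data = maas_toy,
     main = "log(lead) vs. log(zinc) (toy data)")

# Correlation on the log scale; a value near 1 suggests a strong
# positive linear relationship between the logged variables.
r_log <- cor(log(maas_toy$zinc), log(maas_toy$lead))
r_log
```

A correlation coefficient alone does not establish linearity, so the plot and the number are best read together.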
d. The level of risk for surface soil based on lead concentration in ppm is given in the
table below:
The following commands give different colors and sizes on a scatterplot
For two variables: x, y