STATS 10 Assignment 2 (1).pdf
School: University of California, Los Angeles
Course: 10
Subject: Statistics
Date: Apr 3, 2024
Type: pdf
Pages: 12

STATS 10 Assignment 2
Lalonye Calhoun
006059433
Discussion 3A/B

Exercise 1

Work with lead and copper data obtained from the residents of Flint, Michigan from January-February 2017. Data are reported in PPB (parts per billion, or μg/L) from each residential testing kit. Remember that "Pb" denotes lead, and "Cu" denotes copper. You can learn more about the Flint water crisis at https://en.wikipedia.org/wiki/Flint_water_crisis.

a. Download the data from the course site and read it into R, or use the online data link:
read.csv("https://ucla.box.com/shared/static/e9xuft4h3p8fdi4ydoj2hhujee0vmopb.csv")
When you read in the data, name your object "flint".

b. The EPA states a water source is especially dangerous if the lead level is 15 PPB or greater. What proportion of the locations tested were found to have dangerous lead levels?
- 0.04436229 (about 4.4% of the locations tested)

c. Report the mean copper level for only test sites in the North region.
- 44.6424

d. Report the mean copper level for only test sites with dangerous lead levels (at least 15 PPB).
- 141.9631

e. Report the mean lead and copper levels.
- Mean lead level: 3.383
- Mean copper level: 54.581

f. Create a box plot with a good title for the lead levels.
- [Box plot of lead levels]

g. Based on what you see in part (f), does the mean seem to be a good measure of center for the data? Report a more useful statistic for this data.
- No. The median is a more useful measure of center because the data are skewed.

Exercise 2

The data here represent life expectancies (Life) and per capita income (Income) in 1974 dollars for 101 countries in the early 1970s. The source of these data is: Leinhardt and Wasserman (1979), New York Times (September 28, 1975, p. E-3). They also appear in Regression Analysis by Ashish Sen and Muni Srivastava. You can access these data in R using:
life <- read.table("https://ucla.box.com/shared/static/rqk4lc030pabv30wknx2ft9jy848ub9n.txt", header = TRUE)

a. Construct a scatterplot of Life against Income. Note: Income should be on the horizontal axis.
How does income appear to affect life expectancy?
- Life expectancy tends to rise with income: higher-income countries are more likely to have life expectancies above 70, while lower-income countries are more likely to have life expectancies around 50.

b. Construct the boxplot and histogram of Income. Are there any outliers?
- Boxplot: There are some outliers at incomes of roughly 3000 to 5000.
- Histogram: I don't see any outliers.

c. Split the data set into two parts: one for which the Income is strictly below $1000, and one for which the Income is at least $1000. Come up with your own names for these two objects.
- lowerthan1000 = life[life$Income < 1000, ]
- Above1000 = life[life$Income >= 1000, ]

d. Use the data for which the Income is below $1000. Plot Life against Income and compute the correlation coefficient. Hint: use the function cor()
- 0.752886

Exercise 3

The Maas river data contain the concentration of lead and zinc in ppm at 155 locations at the banks of the Maas river in the Netherlands. You can read the data in R as follows:
maas <- read.table("https://ucla.box.com/shared/static/tv3cxooyp6y8fh6gb0qj2cxihj8klg1h.txt", header = TRUE)

a. Compute the summary statistics for lead and zinc using the summary() function.
- Lead:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   37.0    72.5   123.0   153.4   207.0   654.0
- Zinc:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  113.0   198.0   326.0   469.7   674.5  1839.0

b. Plot two histograms: one of lead and one of log(lead).
- Lead: [histogram of lead]
- Log: [histogram of log(lead)]

c. Plot log(lead) against log(zinc). What do you observe?
- The relationship is positive and approximately linear: log(lead) increases with log(zinc), so the correlation coefficient is positive.

d. The level of risk for surface soil based on lead concentration in ppm is given in the table below:
The following commands give different colors and sizes on a scatterplot for two variables: x, y
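
Separately from the part (d) commands, which are not reproduced here, the computations behind Exercises 1 to 3 can be sketched in R as follows. This is only an illustrative sketch, not the submitted solution. It runs on small made-up data frames rather than the course files, so its numbers will not match the answers above. The column names Pb, Cu, Life and Income follow the exercise text, while Region, lead and zinc are assumed names, and the result names (prop_dangerous, below_1000, and so on) are hypothetical.

```r
# Toy stand-ins for the course data sets. All values are made up, and the
# column names Region, lead and zinc are assumptions about the real files.
flint <- data.frame(
  Pb     = c(0, 2, 20, 5, 30, 1, 3, 0),
  Cu     = c(10, 50, 200, 40, 100, 20, 60, 0),
  Region = c("North", "South", "North", "East",
             "South", "North", "East", "South")
)
life <- data.frame(
  Income = c(200, 400, 600, 800, 1000, 3000, 5000),
  Life   = c(40, 45, 52, 55, 65, 70, 72)
)
maas <- data.frame(
  lead = c(40, 80, 120, 200, 650),
  zinc = c(115, 200, 330, 670, 1800)
)

# Exercise 1: the mean of a logical vector is the proportion of TRUE values,
# so this is a proportion between 0 and 1, not a percentage.
prop_dangerous <- mean(flint$Pb >= 15)
mean_cu_north  <- mean(flint$Cu[flint$Region == "North"])
mean_cu_danger <- mean(flint$Cu[flint$Pb >= 15])
mean_pb        <- mean(flint$Pb)
median_pb      <- median(flint$Pb)   # less sensitive to skew than the mean

# Exercise 2: "<" and ">=" together put every row in exactly one subset,
# including a country whose income is exactly 1000.
below_1000    <- life[life$Income < 1000, ]
at_least_1000 <- life[life$Income >= 1000, ]
r_below       <- cor(below_1000$Income, below_1000$Life)

# Exercise 3: summary statistics and the correlation on the log scale.
print(summary(maas$lead))
print(summary(maas$zinc))
r_log <- cor(log(maas$lead), log(maas$zinc))

# Plots. When run as a script, draw to a null device so no file is written;
# in an interactive session the plots appear as usual.
if (!interactive()) pdf(NULL)
boxplot(flint$Pb, main = "Lead levels (toy data)", ylab = "Pb (PPB)")
plot(life$Income, life$Life, xlab = "Income", ylab = "Life")
hist(maas$lead, main = "Lead", xlab = "lead (ppm)")
hist(log(maas$lead), main = "log(lead)", xlab = "log(lead)")
plot(log(maas$zinc), log(maas$lead), xlab = "log(zinc)", ylab = "log(lead)")
if (!interactive()) invisible(dev.off())
```

To apply the same steps to the real data, replace the three toy data frames with the read.csv() and read.table() calls given in the exercises, and check the actual column names with names(flint), names(life) and names(maas) first.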