BME 335 - Homework 3 - Solutions

School

University of Texas

Course

335

Subject

Mechanical Engineering

Date

Dec 6, 2023

BME 335 – HOMEWORK 3
Due Sept 16, 2022 at 11:59 pm on Gradescope

Homework should be completed individually. Homework is graded out of 60 pts. There is a total of 120 pts possible in this homework set. You may do as many problems as you want to earn as many points as possible, with a maximum score of 60 points. For example, if you earn 50 pts, your score will be 50/60. If you earn 70 pts, your score will be 60/60. If you earn 120 pts, your score will be 60/60. It is to your advantage to try as many problems as possible to maximize your score.

NOTE: All problem narratives are fictitious unless otherwise indicated.

Intuition and Comprehension [30 pts]

1. In the accompanying graph of a normal distribution, each of the two red areas represents 1/6 of the area under the curve. Estimate the following quantities from the graph. [10 pts]

1.1. The mean [2 pts]
14

1.2. The mode [2 pts]
14 (because for a normal probability density the most common value is the mean)

1.3. The median [2 pts]
14 (because the normal probability density is symmetric about the mean)

1.4. The standard deviation [2 pts]
2 (the two red areas together hold 2 × 1/6 = 1/3 of the data, all lying outside the range [12, 16]; since approximately 2/3 of the data lie within one standard deviation of the mean, the standard deviation is approximately 16 − 14 = 2)

1.5. The variance [2 pts]
4 (the square of the standard deviation)

2. Consider the following: [10 pts]

2.1. Sketch the probability density function for a variable with a normal distribution with mean equal to 14 cm and variance of 4 cm^2. Label the axes appropriately and mark on the x-axis what the mean is. [3 pts]
See plot below (blue line)

2.2. Add to your sketch the probability density function for the means of samples taken from the above distribution, if each sample was a random sample of size n = 10. [3 pts]
See plot below (red line)
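The plot referenced in 2.1 and 2.2 can be approximated with a short R sketch. The parameters come from the problem statement (mean 14 cm, variance 4 cm^2, n = 10). The variable names, plotting range, and labels are illustrative choices and are not taken from the original solution.

```r
# Parameters from Problem 2 (variable names are illustrative)
mu <- 14                              # mean (cm)
sigma <- sqrt(4)                      # standard deviation (cm), from a variance of 4 cm^2
n_sample <- 10                        # size of each random sample
sigma_mean <- sigma / sqrt(n_sample)  # standard deviation of the sample means

# x values spanning four standard deviations either side of the mean (arbitrary range)
x <- seq(mu - 4 * sigma, mu + 4 * sigma, length.out = 400)
dens_original <- dnorm(x, mean = mu, sd = sigma)
dens_means <- dnorm(x, mean = mu, sd = sigma_mean)

# Draw the taller, narrower curve first so both curves fit on the axes
plot(x, dens_means, type = "l", col = "red",
     xlab = "x (cm)", ylab = "Probability density",
     main = "Original distribution vs. distribution of sample means")
lines(x, dens_original, col = "blue")
abline(v = mu, lty = 2)               # mark the mean on the x-axis
legend("topright",
       legend = c("Original (sd = 2)", "Sample means, n = 10"),
       col = c("blue", "red"), lty = 1)
```

Both curves are centered at 14 cm. The red curve is narrower and taller because its standard deviation is 2/sqrt(10), about 0.63 cm.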

2.3. How does the mean of the sampling distribution relate to the mean of the original distribution (i.e. 14 cm)? How does the standard deviation of the sampling distribution relate to the standard deviation of the original distribution? [4 pts]
The mean of the sampling distribution is the same as the mean of the original distribution. The standard deviation of the sampling distribution equals the standard deviation of the original distribution divided by the square root of the sample size, so it is smaller than that of the original distribution whenever n > 1. In this case, the standard deviation of the sampling distribution is √10 ≈ 3.16 times smaller: 2/√10 ≈ 0.63 cm.

3. Dr. Anita Petty is interested in how white blood cell counts change during infection with the Marburg virus. In a cohort of mice, she measures white blood cell counts prior to infection and after infection, and computes a normalized differential white blood cell count (([Before infection] – [After infection])/[Before infection]). Let X be the continuous variable used to describe the normalized differential white blood cell count for a mouse in the research study.

3.1. What is Pr(|X| > 2)? Write your answer in terms of probability statements without absolute values. [3 pts]
Pr(|X| > 2) = Pr(X < -2 or X > 2) = Pr(X < -2) + Pr(X > 2), since the two events are mutually exclusive.

3.2. Dr. Petty determines that the normalized differential white blood cell counts follow a standard normal distribution. Using this information, solve for Pr(|X| > 2). Use R (hint: use pnorm() function) or a statistical table to get a final numerical value. [3 pts]
Pr(|X| > 2) = Pr(X < -2) + Pr(X > 2) = 0.0228 + 0.0228 = 0.0456
(In R, 2 * pnorm(-2) gives 0.0455 when the tails are not rounded first.)

3.3. Dr. Petty realized that the data are actually best described by a normal distribution with mean 0 and standard deviation 1.2 (rather than 1). Would this make the probability computed above smaller or larger? Why? [4 pts]
Larger. A higher standard deviation means more spread, so there is more area in the tails beyond ±2. With a standard deviation of 1.2, 2 * pnorm(-2, mean = 0, sd = 1.2) is approximately 0.096.

R Application [30 pts]

4. The following data are temperatures (in Fahrenheit) reported from a random sample of days during the time period July 1, 2022 – Aug 31, 2022. Use these data to complete the following problems. [10 pts]

Temperatures: 79, 80, 82, 83, 86, 85, 86, 86, 88, 87, 87, 89, 89, 90, 92, 94, 92, 94, 96, 95, 95, 95, 96, 98, 98, 98, 101, 103, 101, 102, 99, 98, 100, 98, 100

4.1. In R, compute the sample mean and sample standard deviation for the temperature data. Provide the code and values in your answer. [2 pts]

    ################################################
    # Problem 4.1 - Descriptive statistics
    ################################################
    # Load data from csv file. Data has header "Temps".
    temp_data <- read.csv(file = "temperatures_in_austin.csv", header = TRUE,
                          sep = ",", stringsAsFactors = FALSE)
    # Compute the sample mean
    mean_temp <- mean(temp_data$Temps) # 92.63
    # Compute the sample standard deviation
    sd_temp <- sd(temp_data$Temps) # 6.71

4.2. In R, compute the standard error of the mean for your sample estimate. Provide the code and value in your answer. [1 pts]

    ################################################
    # Problem 4.2 - SEM
    ################################################
    # Standard error of the mean = sample standard deviation / sqrt(sample size)
    sem_temp <- sd_temp/sqrt(length(temp_data$Temps)) # 1.13

4.3. In R, generate a set of random numbers of the same size as the original sample. Generate these values from a normal distribution with mean and standard deviation equal to your sample estimates from 4.1 (hint: use rnorm() function). Compute the mean of this random set of values. Provide the code and values in your answer. [2 pts]

    ################################################
    # Problem 4.3 - Random values
    ################################################
    # Determine number of original values
    n <- length(temp_data$Temps) # 35
    # Generate random values
    random_num <- rnorm(n, mean = mean_temp, sd = sd_temp)
    # Compute mean of random values
    mean_random_num <- mean(random_num) # 94.00

*Note: the value of the mean will vary since it comes from randomly generated values.

4.4. In R, repeat 4.3 a hundred times, i.e. make 100 sets of randomly generated values, with each set the same size as the original sample. For each set, compute the mean.
Generate a histogram of the mean values. What does this histogram describe? Provide the code and figure in your answer. [5 pts]
The histogram approximates the sampling distribution of the sample mean for the randomly generated values. Because each set is drawn from a normal distribution, the sample means are also normally distributed, so we expect this histogram to look approximately normal.
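One way to check this interpretation is to compare the spread of the simulated means with the standard error from 4.2. The sketch below is self-contained and hard-codes the sample estimates reported in 4.1 (mean 92.63, standard deviation 6.71, n = 35). The seed and the use of replicate() are illustrative choices, not part of the original solution.

```r
# Sample estimates reported in 4.1 (hard-coded so the sketch runs on its own)
mean_temp <- 92.63
sd_temp <- 6.71
n <- 35
num_sets <- 100

set.seed(335)  # arbitrary seed, used only to make the result reproducible

# Mean of each of the 100 randomly generated sets
sample_means <- replicate(num_sets, mean(rnorm(n, mean = mean_temp, sd = sd_temp)))

# Theoretical standard error of the mean versus the observed spread of the means
sem_theory <- sd_temp / sqrt(n)
sd_observed <- sd(sample_means)
print(c(sem_theory = sem_theory, sd_observed = sd_observed))
```

The two numbers should be close to each other (sem_theory is about 1.13), although sd_observed changes with the seed.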

    ################################################
    # Problem 4.4 - Histogram
    ################################################
    # Define parameters
    num_sets <- 100 # number of sets of random values
    mean_random_num <- rep(NA, num_sets) # placeholder vector for the sample means
    # Use for loop to generate each set of random values
    for (i in 1:num_sets) {
      # Generate random values
      random_num <- rnorm(n, mean = mean_temp, sd = sd_temp)
      # Compute mean for each set
      mean_random_num[i] <- mean(random_num)
    }
    # Make histogram
    hist(mean_random_num, main = "Frequency Distribution of Sample Means",
         xlab = "Sample means of randomly generated temps (F)", ylab = "Frequency",
         xlim = c(85, 100), ylim = c(0, 20), breaks = 10)
    # Save plot
    dev.copy(png, 'Figure4-4.png')
    dev.off()

*Note: plots will vary since they come from randomly generated values.

5. Dr. Yasemin Avalos collected the following measurements of diameter growth rates of malignant tumors in mm/day. Use these data to answer the following questions. [10 pts]

Tumor growth rates: 0.0, 0.1, 0.1, 0.3, 0.4, 0.4, 0.4, 0.5, 0.5, 0.5, 0.6, 0.6, 0.7, 0.7, 0.7, 0.7, 0.8, 0.8, 0.8, 1.2, 1.2, 1.3, 1.4, 1.6, 1.9, 2.0, 2.1, 2.1, 2.2, 2.2, 2.3, 2.4, 2.5, 2.5, 2.7, 2.7, 2.7, 2.7, 2.8, 3.1
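The solution code for Problem 5 reads the data from tumor_growth_rates.csv, which is not reproduced here. As a minimal sketch, assuming the 40 values listed above are the full contents of that file, the same data frame can be built directly from the listed values. The vector name growth_rates is illustrative.

```r
# Tumor growth rates (mm/day) as listed in the problem statement
growth_rates <- c(0.0, 0.1, 0.1, 0.3, 0.4, 0.4, 0.4, 0.5, 0.5, 0.5,
                  0.6, 0.6, 0.7, 0.7, 0.7, 0.7, 0.8, 0.8, 0.8, 1.2,
                  1.2, 1.3, 1.4, 1.6, 1.9, 2.0, 2.1, 2.1, 2.2, 2.2,
                  2.3, 2.4, 2.5, 2.5, 2.7, 2.7, 2.7, 2.7, 2.8, 3.1)

# Same column name ("GrowthRate") that the CSV header is said to use
tumor_data <- data.frame(GrowthRate = growth_rates)

# Quick check on the number of measurements
print(nrow(tumor_data))
```

With tumor_data defined this way, code that refers to tumor_data$GrowthRate can run without the CSV file.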

5.1. In R, make a histogram showing the frequency distribution of the data. Provide the code and figure in your answer. [2 pts]

    ################################################
    # Problem 5.1 - Make histogram
    ################################################
    # Load data from csv file. Data has header "GrowthRate".
    tumor_data <- read.csv(file = "tumor_growth_rates.csv", header = TRUE,
                           sep = ",", stringsAsFactors = FALSE)
    # Make histogram
    hist(tumor_data$GrowthRate, main = "Frequency Distribution for Tumor Growth Rates",
         xlab = "Tumor growth rates (mm/day)", ylab = "Frequency",
         xlim = c(0, 3.5), ylim = c(0, 15), breaks = 10)
    # Save plot
    dev.copy(png, 'Figure5-1.png')
    dev.off()

5.2. In R, compute the mean and variance of the data. Provide the code and values in your answer. [2 pts]

    ################################################
    # Problem 5.2 - Descriptive statistics
    ################################################
    # Compute descriptive statistics
    mean_tumor <- mean(tumor_data$GrowthRate) # 1.38
    var_tumor <- var(tumor_data$GrowthRate) # 0.9006

5.3. In R, randomly generate 40 values from a normal distribution with mean and variance as computed in 5.2. Make a frequency distribution plot from these values. Provide the code and figure in your answer. [3 pts]
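The original solution to 5.3 is not shown here. As a hedged sketch of one possible approach, the code below hard-codes the estimates reported in 5.2 and converts the variance to a standard deviation, since rnorm() takes sd rather than variance. The seed, the variable name random_tumor, and the plot labels are assumptions made for illustration.

```r
# Estimates reported in 5.2 (hard-coded so the sketch runs on its own)
mean_tumor <- 1.38
var_tumor <- 0.9006

set.seed(335)  # arbitrary seed, used only to make the result reproducible

# rnorm() expects a standard deviation, so take the square root of the variance
random_tumor <- rnorm(40, mean = mean_tumor, sd = sqrt(var_tumor))

# Frequency distribution of the randomly generated values
hist(random_tumor,
     main = "Frequency Distribution for Randomly Generated Values",
     xlab = "Randomly generated growth rates (mm/day)",
     ylab = "Frequency")
```

Because a normal distribution is unbounded, some generated values may fall below 0, which none of the measured growth rates do.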
