Due to financial hardship, the Nyke shoe company feels they only need to make one size of shoes, regardless of gender or height. They have collected data on gender, shoe size, and height and have asked you to tell them if they can change their business model to include only one size of shoes – regardless of height or gender of the wearer. In no more 5-10 pages (including figures), explain your recommendations, using statistical evidence to support your findings. The data found are below:
Assuming unequal variances, the two sampled t-Test was applied on the data sets of female and male shoe sizes with the alpha value of 0.05. The null hypothesis was that the female and male shoe sizes have an equal mean while the alternative hypothesis was that female and male shoe sizes do not have an equal mean. With the degrees of freedom being 27, the t-statistic is -8.16. The probability that -8.16 is ≤ -1.70 is 4.5×10-9 for the one-tailed test. Also, the probability that -8.16 is ≤ ±2.05. is 9.1×10-9 for the two-tailed test. Given that both probabilities are under the alpha value of 0.05, the null hypothesis is therefore rejected, and the alternative hypothesis is accepted at the 95% confidence level.
Then we graphed a scatterplot, graphed clusters of dots, which represented the values of the two variables, on the ratings people gave for eating at home verse ones eating out. We got the mean, the arithmetic average of a distribution, median, the middle score in a distribution, mode, the value in the set that occurs most often, range, the difference between the highest and lowest scores in a distribution, and standard deviation, a computed measure of how much scores vary around the mean score, for the data we collected. For communication at home we were given the scores, respectively: 9,4,4,2,2,7,2,5,4,9,2,8,6,7,4,3,9,9. For communication eating out we were given the respective scores: 7,3,6,2,2,6,1,5,6,8,3,7,6,3,7,6,5,4,5,6. So for communication at home, we got a median of 4.5, a standard deviation of 2.74, a range of 7, a mean of 5.4, and a mode of 9. For communication eating out, we got a mean of 4.9, a standard deviation of 1.97, a range of 7, a median of 5.5, and a mode of 4. Then when you reached a correlation, a measure of the extent to which two favorites vary together, and thus of how well either factors predicts the
A scatter plot diagram provides a graphical observation of how two different variables are related to one another. Looking at the data collected for credit balance of customers along with the data collected for income of customers, it’s easy to recognize that there is a correlation between the two variables. The linear positive slope indicates that an increase in the credit balance correlates with an increase in income.
A pharmaceutical company is testing the effectiveness of a new drug for lowering cholesterol. As part of this trial, they wish to determine whether there is a difference between the effectiveness for women and for men. Using = .05, what is the value the test statistic?
In How to Lie with Statistics (Huff, 1954), Darrel Huff deciphers statistical examples and explains the means of deception that statistics and statisticians sometimes use to relay false information. Huff also conveys an underlying message of don’t believe everything you’re told, something him and my mother have in common. At first glance, a reader might think that this book will teach people how to actually lie using statistics, but that is not the case. It gives the reader a glimpse or a behind the curtain view of how easily it is to be deceived using numbers and how it is slyly achieved. Ironically he calls the book How to Lie with Statistics almost to tease his audience that the content in this book is not as it appears. To my utmost surprise, I actually rather enjoyed this book. It was a fairly simple read that was filled with new information and showed me how to look closer at statistical figures in the future. The humor was spot on so much, so that I even chuckled aloud occasionally. For the icing on the cake, I even expanded my vocabulary to learn fun words such as rotogravure.
Why does the sampling distribution of the mean follow a normal distribution for a large enough sample size, even though the population may not be normally distributed?
Starting in 1972, the General Social Survey (GSS) used a four-category response scale for respondents to answer a question on how they view their own health, known as the self-reported health question (SRH) (Smith 2005, 1). The four-categories used were: poor, fair, good, and excellent (Smith et al. 2017, 385) Starting in 2002, the GSS started using both a four and five-category scale for people to respond to the SRH (Smith et al. 2017,1537). The five-category scale used the same measures from the four-category scale, but also included “very good” as the fifth option. The question is: which response category form gives a better ability to determine SRH among people?
The data provided from the collection of transactions provided details on the amount spent with the pages viewed. Grouping the number of pages viewed in increments and with the data in Figure A-1, compares the amount spent categorized by the number of pages viewed. Based on the information graphed it can be concluded that the shoppers who viewed 3-4 pages spent more but looking at the mean