BU1007 Business Data Analysis and Interpretation
Singapore Campus, Study Period SP53, 2013
Statistical Report
Analysis of Case Study 3: Heavenly Chocolates website transactions
Prepared for Dr Tjong Budisantoso
Done by Mr. Keung, Tseung
Student ID: 12776910
20/12/2013

Table of contents: Introduction, Question a, Question b
Conversely, the less time a visitor spent and the fewer pages viewed, the smaller the payment made. Total transaction amounts peaked twice, at $758.86 on Monday and $925.43 on Friday. From Tuesday to Thursday the total declined slightly, from $414.86 to $294.03. On Saturday and Sunday the totals were low relative to the other days, at $394.40 and $222.15, respectively.

c). Develop a scatter diagram and compute a simple correlation coefficient to explore the relationship between the number of pages viewed and the amount ($) spent (use the horizontal axis for the number of pages viewed).

Correlation Analysis

Step 1: Scatter diagram of the amount ($) spent against the number of pages viewed
Independent variable: number of pages viewed
Dependent variable: amount spent

Step 2: Elaboration
The scatter diagram shows a strong positive linear correlation, that is, a direct relationship between the two variables. Applying the coefficient of correlation (r) formula gives r = 0.9 (solved by Excel). Because r is positive, the relationship between the number of pages viewed and the amount spent is direct, and because the value of 0.9 is close to 1.00, it can be concluded that the
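The report obtains r from Excel. As a rough cross-check, the same coefficient can be sketched in a few lines of Python. The function name `pearson_r` and the six data points below are assumptions made for illustration only; they are not the Heavenly Chocolates transactions and will not reproduce the reported value.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical illustration data, NOT the case-study transactions.
pages_viewed = [2, 3, 4, 5, 6, 7]                    # horizontal axis
amount_spent = [20.0, 35.5, 41.0, 58.0, 62.5, 80.0]  # vertical axis

r = pearson_r(pages_viewed, amount_spent)
print(f"r = {r:.3f}")
```

A value near +1 indicates a strong direct linear relationship, a value near -1 a strong inverse one, and a value near 0 little linear association.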
A scatter plot diagram provides a graphical observation of how two different variables are related to one another. Looking at the data collected for credit balance of customers along with the data collected for income of customers, it’s easy to recognize that there is a correlation between the two variables. The linear positive slope indicates that an increase in the credit balance correlates with an increase in income.
Assuming unequal variances, a two-sample t-test was applied to the female and male shoe-size data with an alpha value of 0.05. The null hypothesis was that female and male shoe sizes have equal means, while the alternative hypothesis was that the means are not equal. With 27 degrees of freedom, the t-statistic is -8.16. For the one-tailed test, -8.16 lies beyond the critical value of -1.70, and the p-value is 4.5×10⁻⁹. For the two-tailed test, -8.16 lies beyond the critical values of ±2.05, and the p-value is 9.1×10⁻⁹. Because both p-values are below the alpha value of 0.05, the null hypothesis is rejected in favor of the alternative hypothesis at the 5% significance level.
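The unequal-variance test described here is commonly called Welch's t-test. A minimal sketch of its t-statistic and degrees of freedom follows, assuming the usual Welch-Satterthwaite approximation. The function name `welch_t` and the shoe sizes are hypothetical, since the original data set is not shown, so the output will not match the -8.16 and 27 reported above.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two observations")
    # Squared standard error of each group mean (sample variance / n)
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    if se2_a + se2_b == 0:
        raise ValueError("t is undefined when both samples are constant")
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical shoe sizes, NOT the data analysed in the text.
female = [6.5, 7.0, 7.5, 8.0, 7.0, 6.0, 8.5, 7.5]
male = [10.0, 11.0, 9.5, 12.0, 10.5, 11.5, 9.0, 10.5]

t_stat, df = welch_t(female, male)
print(f"t = {t_stat:.2f}, df = {df:.1f}")
```

The computed t is then compared with the critical value for the computed degrees of freedom, or converted to a p-value with a t-distribution table or a statistics package.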
(1) A study of the number of cars sold looked at the number of cars sold at 500
Answer = A visual representation of the relationship between the independent and the dependent variables, shown as either a bar graph or a line graph.
A pharmaceutical company is testing the effectiveness of a new drug for lowering cholesterol. As part of this trial, they wish to determine whether there is a difference between the effectiveness for women and for men. Using α = .05, what is the value of the test statistic?
Then we drew a scatterplot, a graphed cluster of dots representing the values of two variables, of the ratings people gave for eating at home versus eating out. For the data we collected, we computed the mean (the arithmetic average of a distribution), the median (the middle score in a distribution), the mode (the value in the set that occurs most often), the range (the difference between the highest and lowest scores in a distribution), and the standard deviation (a computed measure of how much scores vary around the mean score). For communication at home we were given the scores, respectively: 9,4,4,2,2,7,2,5,4,9,2,8,6,7,4,3,9,9. For communication eating out we were given the respective scores: 7,3,6,2,2,6,1,5,6,8,3,7,6,3,7,6,5,4,5,6. So for communication at home, we got a median of 4.5, a standard deviation of 2.74, a range of 7, a mean of 5.4, and a mode of 9. For communication eating out, we got a mean of 4.9, a standard deviation of 1.97, a range of 7, a median of 5.5, and a mode of 6. Then when we reached a correlation, a measure of the extent to which two factors vary together, and thus of how well either factor predicts the
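The summary statistics defined above can be recomputed with Python's standard `statistics` module. The sketch below uses the eating-out scores exactly as listed in the text; the helper name `describe` is an assumption, as is the choice of the sample (n - 1) standard deviation, since the text does not say which form was used.

```python
from statistics import mean, median, multimode, stdev

# Eating-out communication scores as listed in the text
eating_out = [7, 3, 6, 2, 2, 6, 1, 5, 6, 8, 3, 7, 6, 3, 7, 6, 5, 4, 5, 6]


def describe(scores):
    """Return the five summary statistics defined in the text."""
    return {
        "mean": mean(scores),
        "median": median(scores),
        "modes": multimode(scores),  # every value tied for most frequent
        "range": max(scores) - min(scores),
        "stdev": stdev(scores),      # sample standard deviation (n - 1)
    }


summary = describe(eating_out)
for name, value in summary.items():
    print(name, value)
```

`multimode` is used rather than `mode` because short rating lists often have several values tied for most frequent, and it returns all of them.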
Due to financial hardship, the Nyke shoe company feels it only needs to make one size of shoe, regardless of gender or height. The company has collected data on gender, shoe size, and height and has asked you to tell them whether they can change their business model to include only one size of shoe, regardless of the height or gender of the wearer. In no more than 5-10 pages (including figures), explain your recommendations, using statistical evidence to support your findings. The data found are below:
Answer: A positive correlation means that increases in the value of one variable are associated
c.) Find a 95% confidence interval for the difference between the mean starting salaries obtained above.
The collection of transactions detailed the amount spent alongside the number of pages viewed. Figure A-1 groups the number of pages viewed into increments and compares the amount spent across those groups. Based on the information graphed, it can be concluded that the shoppers who viewed 3-4 pages spent more, but looking at the mean
Why does the sampling distribution of the mean follow a normal distribution for a large enough sample size, even though the population may not be normally distributed?
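One way to see the effect this question describes is a small simulation. The sketch below assumes a strongly skewed population (exponential with mean 1), a sample size of 50, and 2,000 repeated samples; all three choices, the seed, and the function name `sample_means` are illustrative assumptions.

```python
import math
import random
from statistics import mean, stdev


def sample_means(sample_size, replicates, seed=0):
    """Means of repeated samples drawn from an exponential(1) population."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        mean([rng.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(replicates)
    ]


n = 50
means = sample_means(sample_size=n, replicates=2000)

# Theory: the sample means centre on the population mean (1.0) with
# standard error sigma / sqrt(n) = 1 / sqrt(50).
standard_error = 1.0 / math.sqrt(n)
within = sum(abs(m - 1.0) <= 1.96 * standard_error for m in means) / len(means)

print(f"mean of sample means: {mean(means):.3f}")
print(f"sd of sample means:   {stdev(means):.3f} (theory {standard_error:.3f})")
print(f"share within 1.96 standard errors: {within:.3f}")
```

Although individual draws are heavily right-skewed, the share of sample means falling within 1.96 standard errors of the population mean comes out close to the 95% expected under a normal distribution.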
c. What kind of display would you use to show the association between job class and mode of
In How to Lie with Statistics (Huff, 1954), Darrell Huff deciphers statistical examples and explains the means of deception that statistics and statisticians sometimes use to relay false information. Huff also conveys an underlying message of don’t believe everything you’re told, something he and my mother have in common. At first glance, a reader might think that this book will teach people how to actually lie using statistics, but that is not the case. It gives the reader a behind-the-curtain view of how easy it is to be deceived using numbers and how slyly it is achieved. Ironically, he calls the book How to Lie with Statistics almost to tease his audience that the content of the book is not as it appears. To my utmost surprise, I rather enjoyed this book. It was a fairly simple read that was filled with new information and showed me how to look closer at statistical figures in the future. The humor was so spot on that I even chuckled aloud occasionally. As the icing on the cake, I even expanded my vocabulary with fun words such as rotogravure.
Test results showed that most of the independent variables were positively related in the strength of the
Starting in 1972, the General Social Survey (GSS) used a four-category response scale for respondents to answer a question on how they view their own health, known as the self-reported health question (SRH) (Smith 2005, 1). The four categories used were: poor, fair, good, and excellent (Smith et al. 2017, 385). Starting in 2002, the GSS began using both a four-category and a five-category scale for people to respond to the SRH (Smith et al. 2017, 1537). The five-category scale used the same measures as the four-category scale, but also included “very good” as the fifth option. The question is: which response category form gives a better ability to determine SRH among people?