STA2023 Lab 1 Fall 2021

School

University of Florida

Course

2023

Subject

Statistics

Date

Feb 20, 2024

Type

docx

Pages

2

Uploaded by SuperComputerFinch28

STA 2023 Fall 2021 Lab 1

Answer or perform the following. All answers, graphs, and explanations should be typed and displayed in a Microsoft Word document. Explanations should be clear and complete to receive full credit. You can use any program (Excel, StatCrunch, Minitab, or even your TI-83 calculator, taking a photo of your screen).

Part I: Box office totals

The data are the 50 highest-grossing films (as of 2021). Box office totals are measured in billions of dollars. Retrieve the data from Canvas. Do not include the data in the Word document.

1) Identify the variables.
The variables in the data are the year the movie was released, the ranking number, the movie title, and the revenue in billions of dollars.

2) Create a histogram of the box office totals.

3) Report the mean, standard deviation, and five-number summary.
The mean revenue is 1.2934 billion and the standard deviation is 0.423 billion. The five-number summary is minimum 0.97, Q1 1.05, median 1.13, Q3 1.34, and maximum 2.85.

4) Both the mean and median describe the center of the data set. Discuss why one value is larger than the other for the box office totals.
The mean (1.2934) is larger than the median (1.13) because the distribution is skewed to the right: a few very high totals, up to 2.85, pull the mean upward, while the median is resistant to extreme values. In general, the mean is pulled toward the longer tail, and in a symmetric distribution the mean and median are approximately equal.

5) Is the standard deviation or quartiles a better description for the spread of the data set? In general, when is the standard deviation a good representation of the spread? In general, when are the quartiles a good representation of the spread?
Quartiles are the better description of the spread for this data set. Quartiles work well for skewed data sets because they are not affected by outliers. The standard deviation summarizes continuous data, not categorical data, and is generally a good representation of the spread only when the data are not highly skewed and do not include outliers.

6) Describe the distribution of the box office totals. This should address the overall pattern, shape, center, and spread.
The box office totals are skewed to the right, which is a positive skew. Three quarters of the films fall between 0.97 and 1.34 billion, with a long tail of a few films reaching as high as 2.85 billion. The center is best described by the median of 1.13 billion, and the spread by the quartiles: the middle half of the films lies between 1.05 and 1.34 billion.

Part II:

7) Create a scatterplot with the explanatory variable being the Year and the response being the Revenue (in billions). Make sure to include a copy of the scatterplot in the Word document.

8) Find the equation of the least-squares regression line.
ŷ = 0.005x + 10.219, where x is the year and ŷ is the predicted revenue in billions of dollars.

9) Interpret the correlation coefficient and the coefficient of determination (one sentence each).
The correlation coefficient is r = 0.089, which means there is a negligible positive linear relationship between year and revenue. The coefficient of determination is r^2 = 0.008, which means that only about 0.8% of the variation in revenue is explained by the linear relationship with year.

10) Is year a good predictor of box office revenue? Give a 1-2 sentence reason why you think so (or not).
Year is not a good predictor of box office revenue: with r = 0.089 and r^2 = 0.008, year explains less than 1% of the variation in revenue. Older films can rank among the highest totals, such as the 1997 movie Titanic with a revenue of $2.19 billion, so a later release year does not reliably mean a higher revenue.
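
The answers to questions 4, 6, and 9 can be sanity-checked from the reported summary statistics alone, without the Canvas data. The following is a minimal Python sketch under stated assumptions: the function names are hypothetical, the mean-versus-median comparison is only a rule of thumb for skew direction, and the 1.5 x IQR outlier rule is a common convention that the lab itself does not mention.

```python
# Summary statistics as reported in the lab answers (revenue in billions of dollars).
mean = 1.2934
minimum, q1, median, q3, maximum = 0.97, 1.05, 1.13, 1.34, 2.85
r = 0.089  # reported correlation between Year and Revenue


def skew_direction(mean, median):
    """Rule of thumb (an assumption, not a proof): the mean is pulled toward the longer tail."""
    if mean > median:
        return "right"
    if mean < median:
        return "left"
    return "symmetric"


def iqr_fences(q1, q3, k=1.5):
    """Return the (lower, upper) fences of the conventional 1.5 * IQR outlier rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


direction = skew_direction(mean, median)
lower_fence, upper_fence = iqr_fences(q1, q3)
max_is_outlier = maximum > upper_fence
r_squared = r ** 2  # coefficient of determination for simple linear regression

print(f"skew direction: {direction}")
print(f"IQR fences: ({lower_fence:.3f}, {upper_fence:.3f})")
print(f"maximum flagged as outlier: {max_is_outlier}")
print(f"r^2 = {r_squared:.3f}")
```

With the reported values, the mean exceeds the median, which points to a right skew. The upper fence is about 1.775, so the maximum of 2.85 is flagged as an outlier, and squaring r = 0.089 gives a value that rounds to the reported 0.008.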