Assignment 2 STAT1070

School

The University of Newcastle

Course

1070

Subject

Statistics

Date

May 22, 2024


STAT1070 – Assignment 2
C3380626 – Jordan Proctor Jones

QUESTION 1

Using appropriate graphs and statistics, describe the relationship between distance lived from campus and the type of enrolment.

From the histogram and boxplot in Figure 1, the distribution of distance lived from campus appears right-skewed for both enrolment types. For centre, the descriptive statistics show a mean of 17.1 and a median of 13.1 for online students, and a mean of 10.0 and a median of 6.20 for face-to-face students. In both groups the mean exceeds the median, which further supports a right-skewed shape. For spread, online students have a standard deviation of 16.4 and an IQR of 18.3, while face-to-face students have a standard deviation of 12.4 and an IQR of 8.03. Both boxplots show multiple outliers at the upper end. Overall, online students tend to live further from campus, and their distances are more variable.

Figure 1: Side-by-side boxplot/histogram and descriptive statistics for Question 1a

Are the online and face-to-face samples paired or independent? Write a sentence justifying your choice.

The samples are independent: the online and face-to-face groups are two separate samples of 50 students each, so an observation in one group is not linked to any observation in the other.

Is there evidence that the average distance lived from campus is different for students enrolled in online classes and students enrolled in face-to-face classes? Conduct the appropriate test in Jamovi and include relevant output. Be sure to define any parameters you use, state the null and alternative hypotheses, observed test statistic, null distribution, p-value, decision and provide an appropriate conclusion in plain language.

Let $\mu_d$ be the true difference in average distance lived from campus between students enrolled in online classes and students enrolled in face-to-face classes (online minus face-to-face).
The hypotheses are:

$H_0: \mu_d = 0$
$H_A: \mu_d \neq 0$

The test statistic is t = 2.44 (see the Statistic column in Figure 2).

Null distribution: if $H_0$ is true, $t \sim t_{98}$, where the degrees of freedom are $n_1 + n_2 - 2 = 50 + 50 - 2 = 98$ (see the df column in Figure 2).

The p-value is given by: $2 \times P(t_{98} \geq 2.44) = 0.017$ (see the p column in Figure 2).

Conclusion: Because the p-value is small, we reject $H_0$ and conclude that there is strong evidence that the average distance lived from campus differs between students enrolled online and students enrolled face-to-face.

Independent Samples T-Test

                                                                              95% Confidence Interval
                         Statistic   df     p       Mean difference   SE difference   Lower   Upper
Distance   Student's t   2.44        98.0   0.017   7.08              2.90            1.32    12.8

Figure 2: Jamovi output for the independent samples t-test comparing distance lived from campus for students enrolled face-to-face and online.

Report the 95% confidence interval using Jamovi for the difference in average distance lived from campus. Write a sentence interpreting this interval in plain language.

From Figure 2, a 95% confidence interval for $\mu_d$ is (1.32, 12.8). We can be 95% confident that, on average, students enrolled online live between 1.32 and 12.8 units of distance further from campus than students enrolled face-to-face.

Does this confidence interval from (d) support the decision made in part (c)?

Yes. The interval does not contain 0, which was the value of $\mu_d$ claimed under the null hypothesis, so it is consistent with the decision to reject $H_0$ in part (c).

What are the assumptions of your analyses in parts (c) and (d)? Are these assumptions met? Justify why or why not for each assumption, with appropriate references to Jamovi output where needed.

For both the independent samples t-test and the confidence interval for $\mu_d$ we assume that:
- The two samples are independent.
- Each sample comes from a normal population, or the sample size in each group is large enough to rely on the Central Limit Theorem.
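As a cross-check on the figures reported for parts (c) and (d), the test statistic, p-value and confidence interval can be approximately reproduced from the rounded summary values in Figure 2. The sketch below is not part of the Jamovi analysis. It assumes 50 students in each group (consistent with df = 98) and uses only the Python standard library, with a hand-written numerical integration of the t density in place of a statistics package. The function names `t_pdf`, `t_cdf` and `t_quantile` are illustrative helpers introduced here.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] and symmetry about zero."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # integral of the density from 0 to |x|
    return 0.5 + area if x >= 0 else 0.5 - area

def t_quantile(p, df):
    """Value q with P(T <= q) = p, by bisection (assumes 0.5 < p < 1)."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Rounded values read from Figure 2; group sizes are an assumption (50 each).
n_online, n_f2f = 50, 50
mean_diff, se_diff = 7.08, 2.90

df = n_online + n_f2f - 2                    # pooled degrees of freedom
t_stat = mean_diff / se_diff                 # observed test statistic
p_value = 2 * (1 - t_cdf(abs(t_stat), df))   # two-sided p-value
t_crit = t_quantile(0.975, df)               # critical value for a 95% interval
ci = (mean_diff - t_crit * se_diff, mean_diff + t_crit * se_diff)

print(f"df = {df}, t = {t_stat:.2f}, p = {p_value:.3f}")
print(f"95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

Because the inputs are already rounded, the results should agree with Figure 2 only to within rounding, for example in the last reported digit of the p-value.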
Checking each assumption:
- The online and face-to-face groups are two separate samples of students, with no link between a student in one group and a student in the other, so the samples are independent.
- The normal quantile plot in Figure 3 shows that the points do not fall well along the expected line, so the data do not appear to be normally distributed. However, with 50 students in each group, the samples may be large enough to rely on the Central Limit Theorem.

Figure 3: Normal quantile plot of distance lived from campus by enrolment type.

QUESTION 2

income level and school life expectancy using descriptive statistics.

Is there evidence of a difference in average school life expectancy values among the four income levels? State the null and alternative hypotheses, observed test statistic, null distribution, p-value, decision and provide an appropriate conclusion in plain language.

Let $\mu_L$, $\mu_{LM}$, $\mu_{UM}$ and $\mu_H$ be the population mean school life expectancy for the four income levels: low income, lower-middle income, upper-middle income and high income, respectively. The ANOVA output produced by Jamovi is shown in Figure 4.

ANOVA - School life expectancy

                Sum of Squares   df   Mean Square   F      p
Income Level    376              3    125.25        38.0   < .001
Residuals       185              56   3.30

Figure 4: ANOVA output for school life expectancy by income level.
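The entries in Figure 4 are linked by simple arithmetic: each mean square is a sum of squares divided by its degrees of freedom, and F is the ratio of the two mean squares. The short sketch below recomputes these from the rounded sums of squares as a consistency check. It is not part of the Jamovi output, and the group and sample counts it derives from the degrees of freedom are inferences, not values stated in the assignment.

```python
# Rounded values read from Figure 4.
ss_between, df_between = 376.0, 3   # Income Level row
ss_resid, df_resid = 185.0, 56      # Residuals row

# Mean square = sum of squares / degrees of freedom.
ms_between = ss_between / df_between
ms_resid = ss_resid / df_resid

# F statistic = between-group mean square / residual mean square.
f_stat = ms_between / ms_resid

# Implied design, inferred from the degrees of freedom only.
n_groups = df_between + 1               # k - 1 = 3  ->  k = 4 income levels
n_total = df_between + df_resid + 1     # N - 1 = 59 ->  N = 60 observations

print(f"MS between = {ms_between:.2f}, MS residual = {ms_resid:.2f}")
print(f"F = {f_stat:.1f} on ({df_between}, {df_resid}) degrees of freedom")
print(f"groups = {n_groups}, observations = {n_total}")
```

Small differences from Figure 4 are expected because the sums of squares shown there are rounded to whole numbers.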
The hypotheses are:

$H_0: \mu_L = \mu_{LM} = \mu_{UM} = \mu_H$
$H_A:$ at least two of the $\mu_i$ are different from each other

The test statistic is F = 38.0 (see the F column of the Income Level row in Figure 4).

Null distribution: if $H_0$ is true, $F \sim F_{3, 56}$. The degrees of freedom are shown in the df column of the Income Level and Residuals rows of Figure 4.

The p-value is given by: $P(F_{3, 56} > 38.0) < 0.001$ (see the p column of the Income Level row in Figure 4).

Conclusion: Because the p-value is small, we reject $H_0$. There is strong evidence that at least two income levels have different average school life expectancy.

If appropriate, perform post-hoc tests to determine which income levels have significantly different average school life expectancies. If post-hoc tests are not appropriate, explain the purpose of a post-hoc test and why it’s not appropriate in this example.

Because $H_0$ was rejected, there is evidence that at least two of the means differ, so post-hoc tests are appropriate. The ANOVA does not identify which means differ; post-hoc tests are needed to determine that.

Post Hoc Comparisons - Income Level

Comparison
Income Level             Income Level          Mean Difference   SE      df     t        p tukey
Low income            -  Lower-middle income   -2.06             0.712   56.0   -2.89    0.027
                      -  Upper-middle income   -4.38             0.654   56.0   -6.69    < .001
                      -  High income           -6.92             0.688   56.0   -10.05   < .001
Lower-middle income   -  Upper-middle income   -2.32             0.654   56.0   -3.55    0.004
                      -  High income           -4.86             0.688   56.0   -7.06    < .001
Upper-middle income   -  High income           -2.54             0.627   56.0   -4.05    < .001

Figure 5: Post-hoc output for the average school life expectancy across the four income levels.

The post-hoc tests suggest that:
- The mean school life expectancy for low income was significantly different from the mean school life expectancy for lower-middle income (Tukey-adjusted p = 0.027). Therefore