MNET 315 Ch 11 Text Nonparametric Tests (Missing from book)

School

New Jersey Institute Of Technology

Course

315

Subject

Statistics

Date

Jan 9, 2024

CHAPTER 11
Nonparametric Tests

11.1 The Sign Test
11.2 The Wilcoxon Tests
     Case Study
11.3 The Kruskal-Wallis Test
11.4 Rank Correlation
11.5 The Runs Test
     Uses and Abuses
     Real Statistics—Real Decisions
     Technology

In a recent year, the most common form of reported identity theft was employment- or tax-related fraud, which accounted for 34% of cases. The second most common form was credit card fraud, which accounted for 33% of cases.
Where You've Been

Up to this point in the text, you have studied dozens of different statistical formulas and tests that can help you in a decision-making process. Specific conditions had to be satisfied in order to use these formulas and tests.

Suppose it is believed that as the number of fraud complaints in a state increases, the number of identity theft victims also increases. Can this belief be supported by actual data? The table below shows the numbers of fraud complaints and the numbers of identity theft victims for 25 randomly selected states in a recent year. (Source: Federal Trade Commission)

Fraud complaints        39,344  45,528  33,745  21,117   7593  117,189     5768    7800  14,635
Identity theft victims    4007    8748    6203    4933   1484   12,787      789    1348    2532

Fraud complaints          5642  48,594  107,557   4600  25,636    7525  112,006  77,213
Identity theft victims    1170    8251   17,430    711    3993    1352   20,205  11,009

Fraud complaints        20,350  22,385    7206    2775  51,036  12,750   40,423    9948
Identity theft victims    3337    4312    1216     503    5718    2540     8310    1093

Where You're Going

In this chapter, you will study additional statistical tests that do not require the population distribution to meet any specific conditions. Each of these tests is useful in real-life applications.

With the data above, the number of fraud complaints $F$ and the number of identity theft victims $V$ can be related by the regression equation $V = 0.145F + 429.103$. The correlation coefficient is approximately 0.915, so there is a strong positive correlation. You can determine that the correlation is significant by using Table 11 in Appendix B.

Further analysis of the data, however, can show that the variables do not appear to have a bivariate normal distribution, which is one of the requirements for using the Pearson correlation coefficient. So, although a simple correlation test might indicate a relationship between the number of fraud complaints and the number of identity theft victims, one might question the results because the data do not fit the requirements for the test.

Similar tests you will study in this chapter, such as Spearman's rank correlation test, will give you additional information. The Spearman rank correlation coefficient for this data is approximately 0.962. At $\alpha = 0.01$, there is in fact a significant correlation between the number of fraud complaints and the number of identity theft victims for each state.

[Figure: scatter plot titled "Number of Fraud Complaints and Identity Theft Victims for 25 States"; x-axis: fraud complaints, y-axis: identity theft victims]
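The Spearman rank correlation coefficient quoted for this data can be checked with a short script. The sketch below is only an illustration: the function names are hypothetical, and it assumes the shortcut formula $r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$, which is valid only when neither variable has tied values (true for this table).

```python
# Data transcribed from the table of 25 states.
fraud = [39344, 45528, 33745, 21117, 7593, 117189, 5768, 7800, 14635,
         5642, 48594, 107557, 4600, 25636, 7525, 112006, 77213,
         20350, 22385, 7206, 2775, 51036, 12750, 40423, 9948]
victims = [4007, 8748, 6203, 4933, 1484, 12787, 789, 1348, 2532,
           1170, 8251, 17430, 711, 3993, 1352, 20205, 11009,
           3337, 4312, 1216, 503, 5718, 2540, 8310, 1093]


def ranks(values):
    """Rank each value from 1 (smallest) to n. Assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_no_ties(x, y):
    """Spearman rank correlation via the no-ties shortcut formula."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


r_s = spearman_no_ties(fraud, victims)
print(round(r_s, 3))  # 0.962
```

The sum of squared rank differences for this table is 100, so $r_s = 1 - 600/15{,}600 \approx 0.962$, which agrees with the value stated in the text.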
11.1 The Sign Test

What You Should Learn
  How to use the sign test to test a population median
  How to use the paired-sample sign test to test the difference between two population medians (dependent samples)

The Sign Test for a Population Median; The Paired-Sample Sign Test

The Sign Test for a Population Median

Many of the hypothesis tests studied so far have imposed one or more requirements for a population distribution. For instance, some tests require that a population must have a normal distribution, and other tests require that population variances be equal. What should you do when such requirements cannot be met? For these cases, statisticians have developed hypothesis tests that are "distribution free." Such tests are called nonparametric tests.

DEFINITION
A nonparametric test is a hypothesis test that does not require any specific conditions concerning the shapes of population distributions or the values of population parameters.

Nonparametric tests are usually easier to perform than corresponding parametric tests. They are, however, usually less efficient than parametric tests. Stronger evidence is required to reject a null hypothesis using the results of a nonparametric test. Consequently, whenever possible, you should use a parametric test.

One of the easiest nonparametric tests to perform is the sign test. The only condition necessary to use a sign test is that the sample is randomly selected.

DEFINITION
The sign test is a nonparametric test that can be used to test a population median against a hypothesized value $k$.

The sign test for a population median can be left-tailed, right-tailed, or two-tailed. The null and alternative hypotheses for each type of test are shown below.
Left-tailed test: $H_0$: median $\geq k$ and $H_a$: median $< k$
Right-tailed test: $H_0$: median $\leq k$ and $H_a$: median $> k$
Two-tailed test: $H_0$: median $= k$ and $H_a$: median $\neq k$

To use the sign test, first compare each entry in the sample with the hypothesized median $k$. When the entry is below the median, assign it a - sign; when the entry is above the median, assign it a + sign; and when the entry is equal to the median, assign it a 0. Then compare the number of + and - signs. (The 0's are ignored.) When there is a large difference between the number of + signs and the number of - signs, it is likely that the median is different from the hypothesized value and you should reject the null hypothesis.

Study Tip
For many nonparametric tests, statisticians test the median instead of the mean.
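The sign-assignment step described above can be sketched in a few lines of code. The function name `count_signs` and the sample values are hypothetical and chosen only for illustration; they do not come from the text.

```python
def count_signs(sample, k):
    """Compare each entry with the hypothesized median k.

    Returns (plus, minus, zeros): the number of entries above k,
    below k, and equal to k.
    """
    plus = sum(1 for value in sample if value > k)
    minus = sum(1 for value in sample if value < k)
    zeros = sum(1 for value in sample if value == k)
    return plus, minus, zeros


# Hypothetical sample, tested against a hypothesized median of 10.
sample = [12, 15, 9, 10, 14, 8, 10, 13]
plus, minus, zeros = count_signs(sample, 10)

n = plus + minus      # the 0's are ignored when counting the sample size
x = min(plus, minus)  # the smaller number of + or - signs
print(plus, minus, zeros, n, x)  # 4 2 2 6 2
```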
Table 8 in Appendix B lists the critical values for the sign test for selected levels of significance and sample sizes. When the sign test is used, the sample size $n$ is the total number of + and - signs. When the sample size is greater than 25, you can use the standard normal distribution to find the critical values.

Test Statistic for the Sign Test
When $n \leq 25$, the test statistic for the sign test is $x$, the smaller number of + or - signs.
When $n > 25$, the test statistic for the sign test is

$$z = \frac{(x + 0.5) - 0.5n}{\sqrt{n}/2}$$

where $x$ is the smaller number of + or - signs and $n$ is the sample size, i.e., the total number of + and - signs.

Because $x$ is defined to be the smaller number of + or - signs, the rejection region is always in the left tail. Consequently, the sign test for a population median is always a left-tailed test or a two-tailed test. When the test is two-tailed, use only the left-tailed critical value. (When $x$ is defined to be the larger number of + or - signs, the rejection region is always in the right tail. Right-tailed sign tests are presented in the exercises.)

GUIDELINES
Performing a Sign Test for a Population Median

1. Verify that the sample is random.
2. Identify the claim. State the null and alternative hypotheses.
   In symbols: State $H_0$ and $H_a$.
3. Specify the level of significance.
   In symbols: Identify $\alpha$.
4. Determine the sample size $n$ by assigning + signs, - signs, and 0's to the sample data.
   In symbols: $n$ = total number of + and - signs
5. Determine the critical value.
   In symbols: When $n \leq 25$, use Table 8 in Appendix B. When $n > 25$, use Table 4 in Appendix B.
6. Find the test statistic.
   In symbols: When $n \leq 25$, use $x$ = smaller number of + or - signs. When $n > 25$, use $z = \frac{(x + 0.5) - 0.5n}{\sqrt{n}/2}$.
7. Make a decision to reject or fail to reject the null hypothesis.
   In symbols: If the test statistic is less than or equal to the critical value, then reject $H_0$. Otherwise, fail to reject $H_0$.
8. Interpret the decision in the context of the original claim.

Study Tip
Because the 0's are ignored, there are two possible outcomes when comparing a data entry with a hypothesized median: a + or a - sign. If the median is $k$, then about half of the values will be above $k$ and half will be below. As such, the probability for each sign is 0.5. Table 8 in Appendix B is constructed using the binomial distribution where $p = 0.5$. When $n > 25$, you can use the normal approximation (with a continuity correction) for the binomial. In this case, use $\mu = np = 0.5n$ and $\sigma = \sqrt{npq} = \frac{\sqrt{n}}{2}$.
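Because the critical values for small samples come from the binomial distribution with $p = 0.5$, a one-tailed critical value can be approximated directly. The sketch below assumes that the tabled critical value is the largest $x$ whose left-tail probability $P(X \leq x)$ does not exceed $\alpha$; that construction and the function names are assumptions made for illustration, not a description of how Table 8 was actually produced.

```python
from math import comb


def left_tail_prob(x, n):
    """P(X <= x) for X ~ Binomial(n, 0.5)."""
    return sum(comb(n, i) for i in range(x + 1)) / 2 ** n


def sign_test_critical_value(n, alpha):
    """Largest x with P(X <= x) <= alpha, or None if no such x exists.

    Assumption: this mirrors a one-tailed critical value for the sign test.
    """
    critical = None
    for x in range(n + 1):
        if left_tail_prob(x, n) <= alpha:
            critical = x
        else:
            break  # the left-tail probability only grows with x
    return critical


print(sign_test_critical_value(19, 0.05))  # 5
```

For $n = 19$ and a one-tailed $\alpha = 0.05$, $P(X \leq 5) \approx 0.032$ and $P(X \leq 6) \approx 0.084$, so this rule gives 5.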
EXAMPLE 1
Using the Sign Test

A website administrator for a company claims that the median number of visitors per day to the company's website is no more than 1500. An employee doubts the accuracy of this claim. The numbers of visitors per day for 20 randomly selected days are listed below. At $\alpha = 0.05$, can the employee reject the administrator's claim?

1469  1462  1634  1602  1500
1463  1476  1570  1544  1452
1487  1523  1525  1548  1511
1579  1620  1568  1492  1649

SOLUTION
The claim is "the median number of visitors per day to the company's website is no more than 1500." So, the null and alternative hypotheses are

$H_0$: median $\leq 1500$ (Claim) and $H_a$: median $> 1500$.

To compare each data entry with the hypothesized median 1500, subtract 1500 from each data entry and assign the appropriate sign or 0. For instance, here are the comparisons for the first row of data entries.

1469 - 1500 = -31, assign a - sign
1462 - 1500 = -38, assign a - sign
1634 - 1500 = +134, assign a + sign
1602 - 1500 = +102, assign a + sign
1500 - 1500 = 0, assign a 0

The results of comparing each data entry with the hypothesized median 1500 are shown.

-  -  +  +  0
-  -  +  +  -
-  +  +  +  +
+  +  +  -  +

You can see that there are 7 - signs and 12 + signs. So, $n = 12 + 7 = 19$. Because $n \leq 25$, use Table 8 in Appendix B to find the critical value. The test is a one-tailed test with $\alpha = 0.05$ and $n = 19$. So, the critical value is 5. Because $n \leq 25$, the test statistic $x$ is the smaller number of + or - signs. So, $x = 7$. Because $x = 7$ is greater than the critical value, the employee should fail to reject the null hypothesis.

Interpretation
There is not enough evidence at the 5% level of significance for the employee to reject the website administrator's claim that the median number of visitors per day to the company's website is no more than 1500.

TRY IT YOURSELF 1
A real estate agency claims that the median number of days a home is on the market in its city is greater than 120. A homeowner wants to verify the accuracy of this claim. The numbers of days on the market for 24 randomly selected homes are shown below. At $\alpha = 0.025$, can the homeowner support the agency's claim?

118  167   72   79   76  106  102  113
 73  119  162  114  120   93  135  147
 77  157  115   88  152   70   65   91

Answer: Page T1
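The counts and the decision in Example 1 can be reproduced with a short script. The critical value 5 is taken from the example itself. The exact binomial p-value computed at the end is an addition for illustration and is not part of the textbook's procedure, which relies on Table 8.

```python
from math import comb

# Visitors per day for the 20 randomly selected days in Example 1.
visitors = [1469, 1462, 1634, 1602, 1500,
            1463, 1476, 1570, 1544, 1452,
            1487, 1523, 1525, 1548, 1511,
            1579, 1620, 1568, 1492, 1649]
k = 1500  # hypothesized median

plus = sum(1 for v in visitors if v > k)
minus = sum(1 for v in visitors if v < k)
n = plus + minus      # entries equal to k are ignored
x = min(plus, minus)  # test statistic for n <= 25

critical_value = 5    # stated in Example 1 (one-tailed, alpha = 0.05, n = 19)
reject = x <= critical_value

# Illustrative extra: exact left-tail probability P(X <= x), X ~ Binomial(n, 0.5).
p_value = sum(comb(n, i) for i in range(x + 1)) / 2 ** n

print(plus, minus, n, x, reject)  # 12 7 19 7 False
print(round(p_value, 3))          # 0.18
```

The left-tail probability of about 0.18 is well above 0.05, which is consistent with the example's decision to fail to reject the null hypothesis.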
EXAMPLE 2
Using the Sign Test

An organization claims that the median annual attendance for museums in the United States is at least 39,000. A random sample of 125 museums reveals that the annual attendances for 79 museums were less than 39,000, the annual attendances for 42 museums were more than 39,000, and the annual attendances for 4 museums were 39,000. At $\alpha = 0.01$, is there enough evidence to reject the organization's claim? (Adapted from American Association of Museums)

SOLUTION
The claim is "the median annual attendance for museums in the United States is at least 39,000." So, the null and alternative hypotheses are

$H_0$: median $\geq 39{,}000$ (Claim) and $H_a$: median $< 39{,}000$.

Because $n > 25$, use Table 4 in Appendix B, the Standard Normal Table, to find the critical value. Because the test is a left-tailed test with $\alpha = 0.01$, the critical value is $z_0 = -2.33$. Of the 125 museums, there are 79 - signs and 42 + signs. When the 0's are ignored, the sample size is $n = 79 + 42 = 121$, and $x = 42$. With these values, the test statistic is

$$z = \frac{(42 + 0.5) - 0.5(121)}{\sqrt{121}/2} = \frac{-18}{5.5} \approx -3.27.$$

The figure shows the location of the rejection region and the test statistic $z$. Because $z$ is less than the critical value, it is in the rejection region. So, you reject the null hypothesis.

[Figure: standard normal curve showing the rejection region to the left of $z_0 = -2.33$ ($\alpha = 0.01$) and the test statistic $z \approx -3.27$]

Interpretation
There is enough evidence at the 1% level of significance to reject the organization's claim that the median annual attendance for museums in the United States is at least 39,000.

TRY IT YOURSELF 2
An organization claims that the median age of museum workers in the United States is 46 years old. A random sample of 95 museum workers reveals that 57 museum workers were less than 46 years old, 34 museum workers were more than 46 years old, and 4 museum workers were 46 years old. At $\alpha = 0.10$, can you reject the organization's claim? (Adapted from American Association of Museums)

Answer: Page T1

Picturing the World
For recent college graduates in the United States, a financial analyst claims that the median auto loan is $21,883. A random sample of recent college graduates reveals that the loans for 42 graduates were less than $21,883 and the loans for 35 graduates were greater than $21,883. (Adapted from lendedu.com)

Would you use a parametric test or a nonparametric test to test the claim that for recent college graduates in the United States, the median auto loan is $21,883? Explain your reasoning.

Study Tip
When performing a two-tailed sign test, remember to use only the left-tailed critical value.
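The arithmetic in Example 2 can be checked with the large-sample test statistic. In this sketch the function name `sign_test_z` is hypothetical, and the critical value is computed with `statistics.NormalDist` from the Python standard library instead of being read from Table 4, so it agrees with the tabled $-2.33$ only after rounding.

```python
from math import sqrt
from statistics import NormalDist


def sign_test_z(x, n):
    """Sign test statistic for n > 25, with continuity correction."""
    return ((x + 0.5) - 0.5 * n) / (sqrt(n) / 2)


minus, plus = 79, 42   # counts from Example 2; the 4 zeros are ignored
n = minus + plus       # 121
x = min(minus, plus)   # 42

z = sign_test_z(x, n)
z0 = NormalDist().inv_cdf(0.01)  # left-tailed critical value for alpha = 0.01
reject = z <= z0

print(round(z, 2), round(z0, 2), reject)  # -3.27 -2.33 True
```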