MNET 315 Ch 11 Text Nonparametric Tests (Missing from book)
School: New Jersey Institute of Technology
Course: 315
Subject: Statistics
Date: Jan 9, 2024
Pages: 53
CHAPTER 11
Nonparametric Tests
11.1 The Sign Test
11.2 The Wilcoxon Tests
Case Study
11.3 The Kruskal-Wallis Test
11.4 Rank Correlation
11.5 The Runs Test
Uses and Abuses
Real Statistics—Real Decisions
Technology
In a recent year, the most common form of reported identity theft was employment- or tax-related fraud, which accounted for 34% of cases. The second most common form was credit card fraud, which accounted for 33% of cases.
Where You’re Going
In this chapter, you will study additional statistical tests that do not require the population distribution to meet any specific conditions. Each of these tests has usefulness in real-life applications.
With the data shown in the table, the number of fraud complaints $F$ and the number of identity theft victims $V$ can be related by the regression equation $V = 0.145F + 429.103$. The correlation coefficient is approximately 0.915, so there is a strong positive correlation. You can determine that the correlation is significant by using Table 11 in Appendix B. Further analysis of the data, however, can show that the variables do not appear to have a bivariate normal distribution, which is one of the requirements for using the Pearson correlation coefficient.
So, although a simple correlation test might indicate a relationship between the number of fraud complaints and the number of identity theft victims, one might question the results because the data do not fit the requirements for the test. Similar tests you will study in this chapter, such as Spearman’s rank correlation test, will give you additional information. The Spearman’s rank correlation coefficient for these data is approximately 0.962. At $\alpha = 0.01$, there is in fact a significant correlation between the number of fraud complaints and the number of identity theft victims for each state.
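The comparison between the Pearson and Spearman coefficients can be reproduced with a short sketch. The code below is an illustration, not the text's own procedure: it assumes that the table's values pair up in the order printed (the first fraud count with the first victim count, and so on), and the helper names `pearson`, `ranks`, and `spearman` are hypothetical. Spearman's coefficient is computed here as the Pearson coefficient of the ranks.

```python
from math import sqrt

# Data transcribed from the chapter's table of 25 states.
# Assumption: values pair up in the order they are printed.
fraud = [39344, 45528, 33745, 21117, 7593, 117189, 5768, 7800, 14635,
         5642, 48594, 107557, 4600, 25636, 7525, 112006, 77213,
         20350, 22385, 7206, 2775, 51036, 12750, 40423, 9948]
victims = [4007, 8748, 6203, 4933, 1484, 12787, 789, 1348, 2532,
           1170, 8251, 17430, 711, 3993, 1352, 20205, 11009,
           3337, 4312, 1216, 503, 5718, 2540, 8310, 1093]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """Ranks starting at 1 for the smallest value; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            result[order[pos]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: the Pearson coefficient of the ranks."""
    return pearson(ranks(xs), ranks(ys))


r = pearson(fraud, victims)
rs = spearman(fraud, victims)
print(f"Pearson r = {r:.3f}, Spearman r_s = {rs:.3f}")
```

Because the rank-based coefficient depends only on the ordering of the values, it does not rely on the bivariate normality requirement mentioned above.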
[Figure: Number of Fraud Complaints and Identity Theft Victims for 25 States. x-axis: fraud complaints (20,000 to 120,000); y-axis: identity theft victims (5,000 to 25,000).]
Where You’ve Been
Up to this point in the text, you have studied dozens of different statistical formulas and tests that can help you in a decision-making process. Specific conditions had to be satisfied in order to use these formulas and tests.
Suppose it is believed that as the number of fraud complaints in a state increases, the number of identity theft victims also increases. Can this belief be supported by actual data? The table below shows the numbers of fraud complaints and the numbers of identity theft victims for 25 randomly selected states in a recent year. (Source: Federal Trade Commission)
Fraud complaints   Identity theft victims
39,344             4007
45,528             8748
33,745             6203
21,117             4933
7593               1484
117,189            12,787
5768               789
7800               1348
14,635             2532
5642               1170
48,594             8251
107,557            17,430
4600               711
25,636             3993
7525               1352
112,006            20,205
77,213             11,009
20,350             3337
22,385             4312
7206               1216
2775               503
51,036             5718
12,750             2540
40,423             8310
9948               1093
11.1 The Sign Test
What You Should Learn
How to use the sign test to test a population median
How to use the paired-sample sign test to test the difference between two population medians (dependent samples)
The Sign Test for a Population Median The Paired-Sample Sign Test
The Sign Test for a Population Median
Many of the hypothesis tests studied so far have imposed one or more requirements for a population distribution. For instance, some tests require that a population must have a normal distribution, and other tests require that population variances be equal. What should you do when such requirements cannot be met? For these cases, statisticians have developed hypothesis tests that are “distribution free.” Such tests are called nonparametric tests.
DEFINITION
A nonparametric test is a hypothesis test that does not require any specific conditions concerning the shapes of population distributions or the values of population parameters.
Nonparametric tests are usually easier to perform than corresponding parametric tests. They are, however, usually less efficient than parametric tests. Stronger evidence is required to reject a null hypothesis using the results of a nonparametric test. Consequently, whenever possible, you should use a parametric test. One of the easiest nonparametric tests to perform is the sign test.
The only condition necessary to use a sign test is that the sample is randomly selected.
DEFINITION
The sign test is a nonparametric test that can be used to test a population median against a hypothesized value $k$.
The sign test for a population median can be left-tailed, right-tailed, or two-tailed. The null and alternative hypotheses for each type of test are shown below.
Left-tailed test: $H_0$: median $\ge k$ and $H_a$: median $< k$
Right-tailed test: $H_0$: median $\le k$ and $H_a$: median $> k$
Two-tailed test: $H_0$: median $= k$ and $H_a$: median $\ne k$
To use the sign test, first compare each entry in the sample with the hypothesized median $k$. When the entry is below the median, assign it a − sign; when the entry is above the median, assign it a + sign; and when the entry is equal to the median, assign it a 0. Then compare the number of + and − signs. (The 0’s are ignored.) When there is a large difference between the number of + signs and the number of − signs, it is likely that the median is different from the hypothesized value and you should reject the null hypothesis.
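The sign-assignment step can be sketched in a few lines. This is a minimal illustration: the function name `count_signs` is hypothetical, and the data are the daily visitor counts used in Example 1 of this section, with the hypothesized median 1500.

```python
def count_signs(data, k):
    """Count entries above k (+), below k (-), and equal to k (0)."""
    plus = sum(1 for value in data if value > k)
    minus = sum(1 for value in data if value < k)
    zeros = sum(1 for value in data if value == k)
    return plus, minus, zeros


# Daily visitor counts from Example 1; hypothesized median k = 1500.
visitors = [1469, 1462, 1634, 1602, 1500, 1463, 1476, 1570, 1544, 1452,
            1487, 1523, 1525, 1548, 1511, 1579, 1620, 1568, 1492, 1649]

plus, minus, zeros = count_signs(visitors, 1500)
n = plus + minus  # entries equal to the median are ignored
print(f"+ signs: {plus}, - signs: {minus}, zeros: {zeros}, n = {n}")
```

Only the counts of + and − signs feed into the test, so the magnitudes of the differences play no role.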
Study Tip
For many nonparametric tests, statisticians test the median instead of the mean.
Table 8 in Appendix B lists the critical values for the sign test for selected levels of significance and sample sizes. When the sign test is used, the sample size $n$ is the total number of + and − signs. When the sample size is greater than 25, you can use the standard normal distribution to find the critical values.
Test Statistic for the Sign Test
When $n \le 25$, the test statistic for the sign test is $x$, the smaller number of + or − signs.
When $n > 25$, the test statistic for the sign test is
$$z = \frac{(x + 0.5) - 0.5n}{\sqrt{n}/2}$$
where $x$ is the smaller number of + or − signs and $n$ is the sample size, i.e., the total number of + and − signs.
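The two cases of the test statistic can be expressed as one small function. This is a sketch under the definitions given here; the function name `sign_test_statistic` and its return convention (a label plus a value) are illustrative choices, not part of the text.

```python
from math import sqrt


def sign_test_statistic(plus, minus):
    """Return ("x", x) when n <= 25, otherwise ("z", z).

    plus and minus are the counts of + and - signs; zeros are already excluded.
    """
    n = plus + minus
    x = min(plus, minus)  # the smaller number of + or - signs
    if n <= 25:
        return "x", float(x)
    # Normal approximation with continuity correction.
    z = ((x + 0.5) - 0.5 * n) / (sqrt(n) / 2)
    return "z", z


print(sign_test_statistic(12, 7))   # small sample: statistic is x
print(sign_test_statistic(42, 79))  # large sample: statistic is z
```

The two calls use the sign counts from Examples 1 and 2 of this section, so the second result should agree with the value of z worked out in Example 2.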
Because $x$ is defined to be the smaller number of + or − signs, the rejection region is always in the left tail. Consequently, the sign test for a population median is always a left-tailed test or a two-tailed test. When the test is two-tailed, use only the left-tailed critical value. (When $x$ is defined to be the larger number of + or − signs, the rejection region is always in the right tail. Right-tailed sign tests are presented in the exercises.)
GUIDELINES
Performing a Sign Test for a Population Median
1. In words: Verify that the sample is random.
2. In words: Identify the claim. State the null and alternative hypotheses.
   In symbols: State $H_0$ and $H_a$.
3. In words: Specify the level of significance.
   In symbols: Identify $\alpha$.
4. In words: Determine the sample size $n$ by assigning + signs, − signs, and 0’s to the sample data.
   In symbols: $n$ = total number of + and − signs
5. In words: Determine the critical value.
   In symbols: When $n \le 25$, use Table 8 in Appendix B. When $n > 25$, use Table 4 in Appendix B.
6. In words: Find the test statistic.
   In symbols: When $n \le 25$, use $x$ = smaller number of + or − signs. When $n > 25$, use $z = \frac{(x + 0.5) - 0.5n}{\sqrt{n}/2}$.
7. In words: Make a decision to reject or fail to reject the null hypothesis.
   In symbols: If the test statistic is less than or equal to the critical value, then reject $H_0$. Otherwise, fail to reject $H_0$.
8. In words: Interpret the decision in the context of the original claim.
Study Tip
Because the 0’s are ignored, there are two possible outcomes when comparing a data entry with a hypothesized median: a + or a − sign. If the median is $k$, then about half of the values will be above $k$ and half will be below. As such, the probability for each sign is 0.5. Table 8 in Appendix B is constructed using the binomial distribution where $p = 0.5$.
When $n > 25$, you can use the normal approximation (with a continuity correction) for the binomial. In this case, use $\mu = np = 0.5n$ and $\sigma = \sqrt{npq} = \frac{\sqrt{n}}{2}$.
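Since the critical values come from a binomial distribution with p = 0.5, they can be approximated without the table. The sketch below assumes that each tabulated critical value is the largest count c whose left-tail probability P(X ≤ c) does not exceed the tail significance level; that construction is an assumption about Table 8, although it agrees with the critical value of 5 used in Example 1 (n = 19, one-tailed α = 0.05). The function names are hypothetical.

```python
from math import comb


def binomial_cdf_half(c, n):
    """P(X <= c) for X ~ Binomial(n, 0.5)."""
    return sum(comb(n, i) for i in range(c + 1)) / 2 ** n


def sign_test_critical_value(n, tail_alpha):
    """Largest c with P(X <= c) <= tail_alpha, or None if no such c exists.

    tail_alpha is the area of the left tail (assumed construction of Table 8).
    """
    critical = None
    for c in range(n + 1):
        if binomial_cdf_half(c, n) <= tail_alpha:
            critical = c
        else:
            break  # the CDF only increases, so no larger c can qualify
    return critical


print(sign_test_critical_value(19, 0.05))
```

A return value of `None` means the sample is too small for any count of signs to fall in the rejection region at that level.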
EXAMPLE 1
Using the Sign Test
A website administrator for a company claims that the median number of visitors per day to the company’s website is no more than 1500. An employee doubts the accuracy of this claim. The numbers of visitors per day for 20 randomly selected days are listed below. At $\alpha = 0.05$, can the employee reject the administrator’s claim?
1469 1462 1634 1602 1500
1463 1476 1570 1544 1452
1487 1523 1525 1548 1511
1579 1620 1568 1492 1649
SOLUTION
The claim is “the median number of visitors per day to the company’s website is no more than 1500.” So, the null and alternative hypotheses are
$H_0$: median $\le 1500$ (Claim) and $H_a$: median $> 1500$.
To compare each data entry with the hypothesized median 1500, subtract 1500 from each data entry and assign the appropriate sign or 0. For instance, here are the comparisons for the first row of data entries.
$1469 - 1500 = -31$, assign a − sign
$1462 - 1500 = -38$, assign a − sign
$1634 - 1500 = +134$, assign a + sign
$1602 - 1500 = +102$, assign a + sign
$1500 - 1500 = 0$, assign a 0
The results of comparing each data entry with the hypothesized median 1500 are shown.
− − + + 0
− − + + −
− + + + +
+ + + − +
You can see that there are 7 − signs and 12 + signs. So, $n = 12 + 7 = 19$. Because $n \le 25$, use Table 8 in Appendix B to find the critical value. The test is a one-tailed test with $\alpha = 0.05$ and $n = 19$. So, the critical value is 5. Because $n \le 25$, the test statistic $x$ is the smaller number of + or − signs. So, $x = 7$. Because $x = 7$ is greater than the critical value, the employee should fail to reject the null hypothesis.
Interpretation
There is not enough evidence at the 5% level of significance for the employee to reject the website administrator’s claim that the median number of visitors per day to the company’s website is no more than 1500.
TRY IT YOURSELF 1
A real estate agency claims that the median number of days a home is on the market in its city is greater than 120. A homeowner wants to verify the accuracy of this claim. The numbers of days on the market for 24 randomly selected homes are shown below. At $\alpha = 0.025$, can the homeowner support the agency’s claim?
118 167 72 79 76 106 102 113 73 119 162 114 120 93 135 147 77 157 115 88 152 70 65 91
Answer: Page T1
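As a cross-check on Example 1, the decision can also be reached through an exact binomial P-value instead of a critical value. This is a complementary approach, not the method the example uses: under the null hypothesis each sign has probability 0.5, so the P-value is the probability of seeing 7 or fewer − signs among 19 signs.

```python
from math import comb

n = 19        # number of + and - signs in Example 1
x = 7         # smaller number of signs (the - signs)
alpha = 0.05  # level of significance

# Exact left-tail probability P(X <= x) for X ~ Binomial(n, 0.5).
p_value = sum(comb(n, i) for i in range(x + 1)) / 2 ** n
reject = p_value <= alpha

print(f"P(X <= {x}) = {p_value:.4f}; reject H0: {reject}")
```

The P-value is well above 0.05, which matches the example's conclusion to fail to reject the null hypothesis.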
EXAMPLE 2
Using the Sign Test
An organization claims that the median annual attendance for museums in the United States is at least 39,000. A random sample of 125 museums reveals that the annual attendances for 79 museums were less than 39,000, the annual attendances for 42 museums were more than 39,000, and the annual attendances for 4 museums were 39,000. At $\alpha = 0.01$, is there enough evidence to reject the organization’s claim? (Adapted from American Association of Museums)
SOLUTION
The claim is “the median annual attendance for museums in the United States is at least 39,000.” So, the null and alternative hypotheses are
$H_0$: median $\ge 39{,}000$ (Claim) and $H_a$: median $< 39{,}000$.
Because $n > 25$, use Table 4 in Appendix B, the Standard Normal Table, to find the critical value. Because the test is a left-tailed test with $\alpha = 0.01$, the critical value is $z_0 = -2.33$. Of the 125 museums, there are 79 − signs and 42 + signs. When the 0’s are ignored, the sample size is
$n = 79 + 42 = 121$, and $x = 42$.
With these values, the test statistic is
$$z = \frac{(42 + 0.5) - 0.5(121)}{\sqrt{121}/2} = \frac{-18}{5.5} \approx -3.27.$$
The figure shows the location of the rejection region and the test statistic $z$.
Because $z$ is less than the critical value, it is in the rejection region. So, you reject the null hypothesis.
[Figure: standard normal curve on a z-axis from −4 to 4, showing the rejection region to the left of $z_0 = -2.33$ ($\alpha = 0.01$) and the test statistic $z \approx -3.27$.]
Interpretation
There is enough evidence at the 1% level of significance to reject the organization’s claim that the median annual attendance for museums in the United States is at least 39,000.
TRY IT YOURSELF 2
An organization claims that the median age of museum workers in the United States is 46 years old. A random sample of 95 museum workers reveals that 57 museum workers were less than 46 years old, 34 museum workers were more than 46 years old, and 4 museum workers were 46 years old. At $\alpha = 0.10$, can you reject the organization’s claim? (Adapted from American Association of Museums)
Answer: Page T1
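The large-sample computation in Example 2 can be sketched with Python's standard library, using `statistics.NormalDist` in place of the Standard Normal Table. The variable names are illustrative, and the left-tail area is computed only as a supplementary check that the example itself does not report.

```python
from math import sqrt
from statistics import NormalDist

minus, plus = 79, 42  # sign counts from Example 2 (the 4 zeros are ignored)
alpha = 0.01          # level of significance for the left-tailed test

n = plus + minus
x = min(plus, minus)

# Test statistic: normal approximation with continuity correction.
z = ((x + 0.5) - 0.5 * n) / (sqrt(n) / 2)

# Left-tailed critical value from the standard normal distribution.
z0 = NormalDist().inv_cdf(alpha)

# Supplementary: area to the left of z under the standard normal curve.
left_tail_area = NormalDist().cdf(z)

reject = z <= z0
print(f"n = {n}, x = {x}, z = {z:.2f}, z0 = {z0:.2f}, reject H0: {reject}")
```

The computed critical value rounds to the table value of −2.33, and the test statistic falls in the rejection region, as in the worked solution.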
Picturing the World
For recent college graduates in the United States, a financial analyst claims that the median auto loan is $21,883. A random sample of recent college graduates reveals that the loans for 42 graduates were less than $21,883 and the loans for 35 graduates were greater than $21,883. (Adapted from lendedu.com)
Would you use a parametric test or a nonparametric test to test the claim that for recent college graduates in the United States, the median auto loan is $21,883? Explain your reasoning.
Study Tip
When performing a two-tailed sign test, remember to use only the left-tailed critical value.