Bartleby Sitemap - Textbook Solutions

All Textbook Solutions for Understandable Statistics: Concepts and Methods

Statistical Literacy In a statistical study, what is the difference between an individual and a variable?
Statistical Literacy Are data at the nominal level of measurement quantitative or qualitative?
Statistical Literacy What is the difference between a parameter and a statistic?
Statistical Literacy For a set population, does a parameter ever change? If there are three different samples of the same size from a set population, is it possible to get three different values for the same statistic?
Critical Thinking Numbers are often assigned to data that are categorical in nature. (a) Consider these number assignments for category items describing electronic ways of expressing personal opinions: 1 = Twitter; 2 = e-mail; 3 = text message; 4 = Facebook; 5 = blog. Are these numerical assignments at the ordinal data level or higher? Explain. (b) Consider these number assignments for category items describing usefulness of customer service: 1 = not helpful; 2 = somewhat helpful; 3 = very helpful; 4 = extremely helpful. Are these numerical assignments at the ordinal data level? Explain. What about at the interval level or higher? Explain.
Interpretation Lucy conducted a survey asking some of her friends to specify their favorite type of TV entertainment from the following list of choices: sitcom; reality; documentary; drama; cartoon; other. Do Lucy's observations apply to all adults? Explain. From the description of the survey group, can we draw any conclusions regarding age of participants, gender of participants, or education level of participants?
Marketing: Fast Food A national survey asked 1261 U.S. adult fast-food customers which meal (breakfast, lunch, dinner, snack) they ordered. (a) Identify the variable. (b) Is the variable quantitative or qualitative? (c) What is the implied population?
Advertising: Auto Mileage What is the average miles per gallon (mpg) for all new hybrid small cars? Using Consumer Reports, a random sample of such vehicles gave an average of 35.7 mpg.
(a) Identify the variable. (b) Is the variable quantitative or qualitative? (c) What is the implied population?
Ecology: Wetlands Government agencies carefully monitor water quality and its effect on wetlands (Reference: Environmental Protection Agency Wetland Report EPA 832-R-93-005). Of particular concern is the concentration of nitrogen in water draining from fertilized lands. Too much nitrogen can kill fish and wildlife. Twenty-eight samples of water were taken at random from a lake. The nitrogen concentration (milligrams of nitrogen per liter of water) was determined for each sample. (a) Identify the variable. (b) Is the variable quantitative or qualitative? (c) What is the implied population?
Archaeology: Ireland The archaeological site of Tara is more than 4000 years old. Tradition states that Tara was the seat of the high kings of Ireland. Because of its archaeological importance, Tara has received extensive study (Reference: Tara: An Archaeological Survey by Conor Newman, Royal Irish Academy, Dublin). Suppose an archaeologist wants to estimate the density of ferromagnetic artifacts in the Tara region. For this purpose, a random sample of 55 plots, each of size 100 square meters, is used. The number of ferromagnetic artifacts for each plot is determined. (a) Identify the variable. (b) Is the variable quantitative or qualitative? (c) What is the implied population?
Student Life: Levels of Measurement Categorize these measurements associated with student life according to level: nominal, ordinal, interval, or ratio. (a) Length of time to complete an exam (b) Time of first class (c) Major field of study (d) Course evaluation scale: poor, acceptable, good (e) Score on last exam (based on 100 possible points) (f) Age of student
Business: Levels of Measurement Categorize these measurements associated with a robotics company according to level: nominal, ordinal, interval, or ratio.
(a) Salesperson's performance: below average, average, above average (b) Price of company's stock (c) Names of new products (d) Temperature (°F) in CEO's private office (e) Gross income for each of the past 5 years (f) Color of product packaging
Fishing: Levels of Measurement Categorize these measurements associated with fishing according to level: nominal, ordinal, interval, or ratio. (a) Species of fish caught: perch, bass, pike, trout (b) Cost of rod and reel (c) Time of return home (d) Guidebook rating of fishing area: poor, fair, good (e) Number of fish caught (f) Temperature of water
Education: Teacher Evaluation If you were going to apply statistical methods to analyze teacher evaluations, which question form, A or B, would be better? Form A: In your own words, tell how this teacher compares with other teachers you have had. Form B: Use the following scale to rank your teacher as compared with other teachers you have had.
Critical Thinking You are interested in the weights of backpacks students carry to class and decide to conduct a study using the backpacks carried by 30 students. (a) Give some instructions for weighing the backpacks. Include unit of measure, accuracy of measure, and type of scale. (b) Do you think each student asked will allow you to weigh his or her backpack? (c) Do you think telling students ahead of time that you are going to weigh their backpacks will make a difference in the weights?
Statistical Literacy Explain the difference between a stratified sample and a cluster sample.
Statistical Literacy Explain the difference between a simple random sample and a systematic sample.
Statistical Literacy Marcie conducted a study of the cost of breakfast cereal. She recorded the costs of several boxes of cereal. However, she neglected to take into account the number of servings in each box. Someone told her not to worry because she just had some sampling error.
Comment on that advice.
Statistical Literacy A random sample of students who use the college recreation center were asked if they approved increasing student fees for all students in order to add a climbing wall to the recreation center. Describe the sample frame. Does the sample frame include all students enrolled in the college? Explain.
Interpretation In a random sample of 50 students from a large university, all the students were between 18 and 20 years old. Can we conclude that the entire population of students at the university is between 18 and 20 years old? Explain.
Interpretation A campus performance series features plays, music groups, dance troupes, and stand-up comedy. The committee responsible for selecting the performance groups includes three students chosen at random from a pool of volunteers. This year the 30 volunteers came from a variety of majors. However, the three students for the committee were all music majors. Does this fact indicate there was bias in the selection process and that the selection process was not random? Explain.
Critical Thinking Greg took a random sample of size 100 from the population of current season ticket holders to State College men's basketball games. Then he took a random sample of size 100 from the population of current season ticket holders to State College women's basketball games. (a) What sampling technique (stratified, systematic, cluster, multistage, convenience, random) did Greg use to sample from the population of current season ticket holders to all State College basketball games played by either men or women? (b) Is it appropriate to pool the samples and claim to have a random sample of size 200 from the population of current season ticket holders to all State College home basketball games played by either men or women? Explain.
Critical Thinking Consider the students in your statistics class as the population and suppose they are seated in four rows of 10 students each. To select a sample, you toss a coin.
If it comes up heads, you use the 20 students sitting in the first two rows as your sample. If it comes up tails, you use the 20 students sitting in the last two rows as your sample. (a) Does every student have an equal chance of being selected for the sample? Explain. (b) Is it possible to include students sitting in row 3 with students sitting in row 2 in your sample? Is your sample a simple random sample? Explain. (c) Describe a process you could use to get a simple random sample of size 20 from a class of size 40.
Critical Thinking Suppose you are assigned the number 1, and the other students in your statistics class call out consecutive numbers until each person in the class has his or her own number. Explain how you could get a random sample of four students from your statistics class. (a) Explain why the first four students walking into the classroom would not necessarily form a random sample. (b) Explain why four students coming in late would not necessarily form a random sample. (c) Explain why four students sitting in the back row would not necessarily form a random sample. (d) Explain why the four tallest students would not necessarily form a random sample.
Critical Thinking In each of the following situations, the sampling frame does not match the population, resulting in undercoverage. Give examples of population members that might have been omitted. (a) The population consists of all 250 students in your large statistics class. You plan to obtain a simple random sample of 30 students by using the sampling frame of students present next Monday. (b) The population consists of all 15-year-olds living in the attendance district of a local high school. You plan to obtain a simple random sample of 200 such residents by using the student roster of the high school as the sampling frame.
Sampling: Random Use a random-number table to generate a list of 10 random numbers between 1 and 99.
Explain your work.
12P
Sampling: Random Use a random-number table to generate a list of six random numbers from 1 to 8615. Explain your work.
14P
Computer Simulation: Roll of a Die A die is a cube with dots on each face. The faces have 1, 2, 3, 4, 5, or 6 dots. The table below is a computer simulation (from the software package Minitab) of the results of rolling a fair die 20 times. (a) Assume that each number in the table corresponds to the number of dots on the upward face of the die. Is it appropriate that the same number appears more than once? Why? What is the outcome of the fourth roll? (b) If we simulate more rolls of the die, do you expect to get the same sequence of outcomes? Why or why not?
16P
Education: Test Construction Professor Gill is designing a multiple-choice test. There are to be 10 questions. Each question is to have five choices for answers. The choices are to be designated by the letters a, b, c, d, and e. Professor Gill wishes to use a random-number table to determine which letter choice should correspond to the correct answer for a question. Using the number correspondence 1 for a, 2 for b, 3 for c, 4 for d, and 5 for e, use a random-number table to determine the letter choice for the correct answer for each of the 10 questions.
Education: Test Construction Professor Gill uses true–false questions. She wishes to place 20 such questions on the next test. To decide whether to place a true statement or a false statement in each of the 20 questions, she uses a random-number table. She selects 20 digits from the table. An even digit tells her to use a true statement. An odd digit tells her to use a false statement. Use a random-number table to pick a sequence of 20 digits, and describe the corresponding sequence of 20 true–false questions.
What would the test key for your sequence look like?
Sampling Methods: Benefits Package An important part of employee compensation is a benefits package, which might include health insurance, life insurance, child care, vacation days, retirement plan, parental leave, bonuses, etc. Suppose you want to conduct a survey of benefits packages available in private businesses in Hawaii. You want a sample size of 100. Some sampling techniques are described below. Categorize each technique as simple random sample, stratified sample, systematic sample, cluster sample, or convenience sample. (a) Assign each business in the Island Business Directory a number, and then use a random-number table to select the businesses to be included in the sample. (b) Use postal ZIP Codes to divide the state into regions. Pick a random sample of 10 ZIP Code areas and then include all the businesses in each selected ZIP Code area. (c) Send a team of five research assistants to Bishop Street in downtown Honolulu. Let each assistant select a block or building and interview an employee from each business found. Each researcher can have the rest of the day off after getting responses from 20 different businesses. (d) Use the Island Business Directory. Number all the businesses. Select a starting place at random, and then use every 50th business listed until you have 100 businesses. (e) Group the businesses according to type: medical, shipping, retail, manufacturing, financial, construction, restaurant, hotel, tourism, other. Then select a random sample of 10 businesses from each business type.
Sampling Methods: Health Care Modern Managed Hospitals (MMH) is a national for-profit chain of hospitals. Management wants to survey patients discharged this past year to obtain patient satisfaction profiles. They wish to use a sample of such patients. Several sampling techniques are described below.
Categorize each technique as simple random sample, stratified sample, systematic sample, cluster sample, or convenience sample. (a) Obtain a list of patients discharged from all MMH facilities. Divide the patients according to length of hospital stay (2 days or less, 3–7 days, 8–14 days, more than 14 days). Draw simple random samples from each group. (b) Obtain lists of patients discharged from all MMH facilities. Number these patients, and then use a random-number table to obtain the sample. (c) Randomly select some MMH facilities from each of five geographic regions, and then include all the patients on the discharge lists of the selected hospitals. (d) At the beginning of the year, instruct each MMH facility to survey every 500th patient discharged. (e) Instruct each MMH facility to survey 10 discharged patients this week and send in the results.
Statistical Literacy A study of college graduates involves three variables: income level, job satisfaction, and one-way commute times to work. List some ways the variables might be confounded.
Statistical Literacy Consider a completely randomized experiment in which a control group is given a placebo for congestion relief and a treatment group is given a new drug for congestion relief. Describe a double-blind procedure for this experiment and discuss some benefits of such a procedure.
Critical Thinking A brief survey regarding opinions about recycling was carefully designed so that the wording of the questions would not influence the responses. Jill administered the survey at a farmers market. She approached adults and asked if they would fill out the survey, explaining that the results might be used to set trash collection and recycling policy in the city. She stood by silently while the form was filled out. Jill was wearing a green T-shirt with the slogan "fight global warming." Are the respondents a random sample of people in the community?
Are there any concerns that Jill might have influenced the respondents?
Critical Thinking A randomized block design was used to study the amount of grants awarded to students at a large university. One block consisted of undergraduate students and the other block consisted of graduate students. Samples of size 30 were taken from each block. Could the combined sample of 60 be considered a simple random sample from the population of all students, undergraduate and all graduate, at the university? Explain.
Interpretation Zane is examining two studies involving how different generations classify specified items as either luxuries or necessities. In the first study, the Echo generation is defined to be people ages 18–29. The second study defined the Echo generation to be people ages 20–31. Zane notices that the first study was conducted in 2006 while the second one was conducted in 2008. (a) Are the two studies inconsistent in their description of the Echo generation? (b) What are the birth years of the Echo generation?
Interpretation Suppose you are looking at the 2006 results of how the Echo generation classified specified items as either luxuries or necessities. Do you expect the results to reflect how the Echo generation would classify items in 2020? Explain.
Ecology: Gathering Data Which technique for gathering data (observational study or experiment) do you think was used in the following studies? (a) The Colorado Division of Wildlife netted and released 774 fish at Quincy Reservoir. There were 219 perch, 315 blue gill, 83 pike, and 157 rainbow trout. (b) The Colorado Division of Wildlife caught 41 bighorn sheep on Mt. Evans and gave each one an injection to prevent heartworm. A year later, 38 of these sheep did not have heartworm, while the other three did. (c) The Colorado Division of Wildlife imposed special fishing regulations on the Deckers section of the South Platte River. All trout under 15 inches had to be released.
A study of trout before and after the regulation went into effect showed that the average length of a trout increased by 4.2 inches after the new regulation. (d) An ecology class used binoculars to watch 23 turtles at Lowell Ponds. It was found that 18 were box turtles and 5 were snapping turtles.
General: Gathering Data Which technique for gathering data (sampling, experiment, simulation, or census) do you think was used in the following studies? (a) An analysis of a sample of 31,000 patients from New York hospitals suggests that the poor and the elderly sue for malpractice at one-fifth the rate of wealthier patients (Journal of the American Medical Association). (b) The effects of wind shear on airplanes during both landing and takeoff were studied by using complex computer programs that mimic actual flight. (c) A study of all league football scores attained through touchdowns and field goals was conducted by the National Football League to determine whether field goals account for more scoring events than touchdowns (USA Today). (d) An Australian study included 588 men and women who already had some precancerous skin lesions. Half got a skin cream containing a sunscreen with a sun protection factor of 17; half got an inactive cream. After 7 months, those using the sunscreen with the sun protection had fewer new precancerous skin lesions (New England Journal of Medicine).
General: Completely Randomized Experiment How would you use a completely randomized experiment in each of the following settings? Is a placebo being used or not? Be specific and give details. (a) A veterinarian wants to test a strain of antibiotic on calves to determine their resistance to common infection. In a pasture are 22 newborn calves. There is enough vaccine for 10 calves. However, blood tests to determine resistance to infection can be done on all calves. (b) The Denver Police Department wants to improve its image with teenagers.
A uniformed officer is sent to a school 1 day a week for 10 weeks. Each day the officer visits with students, eats lunch with students, attends pep rallies, and so on. There are 18 schools, but the police department can visit only half of these schools this semester. A survey regarding how teenagers view police is sent to all 18 schools at the end of the semester. (c) A skin patch contains a new drug to help people quit smoking. A group of 75 cigarette smokers have volunteered as subjects to test the new skin patch. For 1 month, 40 of the volunteers receive skin patches with the new drug. The other volunteers receive skin patches with no drugs. At the end of 2 months, each subject is surveyed regarding his or her current smoking habits.
Surveys: Manipulation The New York Times did a special report on polling that was carried in papers across the nation. The article pointed out how readily the results of a survey can be manipulated. Some features that can influence the results of a poll include the following: the number of possible responses, the phrasing of the questions, the sampling techniques used (voluntary response or sample designed to be representative), the fact that words may mean different things to different people, the questions that precede the question of interest, and finally, the fact that respondents can offer opinions on issues they know nothing about. (a) Consider the expression "over the last few years." Do you think that this expression means the same time span to everyone? What would be a more precise phrase? (b) Consider this question: "Do you think fines for running stop signs should be doubled?" Do you think the response would be different if the question "Have you ever run a stop sign?" preceded the question about fines? (c) Consider this question: "Do you watch too much television?" What do you think the responses would be if the only responses possible were "yes" or "no"?
What do you think the responses would be if the possible responses were "rarely," "sometimes," or "frequently"?
Critical Thinking An agricultural study is comparing the harvest volume of two types of barley. The site for the experiment is bordered by a river. The field is divided into eight plots of approximately the same size. The experiment calls for the plots to be blocked into four plots per block. Then, two plots of each block will be randomly assigned to one of the two barley types. Two blocking schemes are shown below, with one block indicated by the white region and the other by the gray region. Which blocking scheme, A or B, would be better? Explain. (Figure: Scheme A, Scheme B)
Critical Thinking Sudoku is a puzzle consisting of squares arranged in 9 rows and 9 columns. The 81 squares are further divided into nine 3 × 3 square boxes. The object is to fill in the squares with numerals 1 through 9 so that each column, row, and box contains all nine numbers. However, there is a requirement that each number appear only once in any row, column, or box. Each puzzle already has numbers in some of the squares. Would it be appropriate to use a random-number table to select a digit for each blank square? Explain.
2CRP
Statistical Literacy You are conducting a study of students doing work-study jobs on your campus. Among the questions on the survey instrument are: A. How many hours are you scheduled to work each week? Answer to the nearest hour. B. How applicable is this work experience to your future employment goals? Respond using the following scale: 1 = not at all, 2 = somewhat, 3 = very (a) Suppose you take random samples from the following groups: freshmen, sophomores, juniors, and seniors. What kind of sampling technique are you using (simple random, stratified, systematic, cluster, multistage, convenience)? (b) Describe the individuals of this study. (c) What is the variable for question A? Classify the variable as qualitative or quantitative. What is the level of the measurement?
(d) What is the variable for question B? Classify the variable as qualitative or quantitative. What is the level of the measurement? (e) Is the proportion of responses "3 = very" to question B a statistic or a parameter? (f) Suppose only 40% of the students you selected for the sample respond. What is the nonresponse rate? Do you think the nonresponse rate might introduce bias into the study? Explain. (g) Would it be appropriate to generalize the results of your study to all work-study students in the nation? Explain.
4CRP
5CRP
General: Type of Sampling Categorize the type of sampling (simple random, stratified, systematic, cluster, or convenience) used in each of the following situations. (a) To conduct a preelection opinion poll on a proposed amendment to the state constitution, a random sample of 10 telephone prefixes (first three digits of the phone number) was selected, and all households from the phone prefixes selected were called. (b) To conduct a study on depression among the elderly, a sample of 30 patients in one nursing home was used. (c) To maintain quality control in a brewery, every 20th bottle of beer coming off the production line was opened and tested. (d) Subscribers to a new smart phone app that streams songs were assigned numbers. Then a sample of 30 subscribers was selected by using a random-number table. The subscribers in the sample were invited to rate the process for selecting the songs in the playlist. (e) To judge the appeal of a proposed television sitcom, a random sample of 10 people from each of three different age categories was selected and those chosen were asked to rate a pilot show.
7CRP
General: Experiment How would you use a completely randomized experiment in each of the following settings? Is a placebo being used or not? Be specific and give details. (a) A charitable nonprofit organization wants to test two methods of fundraising.
From a list of 1000 past donors, half will be sent literature about the successful activities of the charity and asked to make another donation. The other 500 donors will be contacted by phone and asked to make another donation. The percentage of people from each group who make a new donation will be compared. (b) A tooth-whitening gel is to be tested for effectiveness. A group of 85 adults have volunteered to participate in the study. Of these, 43 are to be given a gel that contains the tooth-whitening chemicals. The remaining 42 are to be given a similar-looking package of gel that does not contain the tooth-whitening chemicals. A standard method will be used to evaluate the whiteness of teeth for all participants. Then the results for the two groups will be compared. How could this experiment be designed to be double-blind? (c) Consider the experiment described in part (a). Describe how you would use a randomized block experiment with blocks based on age. Use three blocks: donors younger than 30 years old, donors 30 to 59 years old, donors 60 and older.
9CRP
10CRP
11CRP
1DH
2DH
1LC
Discuss each of the following topics in class or review the topics on your own. Then write a brief but complete essay in which you summarize the main points. Please include formulas and graphs as appropriate. In your own words, explain the differences among the following sampling techniques: simple random sample, stratified sample, systematic sample, cluster sample, multistage sample, and convenience sample. Describe situations in which each type might be useful.
1UT
2UT
Statistical Literacy What is the difference between a class boundary and a class limit?
Statistical Literacy A data set has values ranging from a low of 10 to a high of 52. What's wrong with using the class limits 10–19, 20–29, 30–39, 40–49 for a frequency table?
Statistical Literacy A data set has values ranging from a low of 10 to a high of 50.
What's wrong with using the class limits 10–20, 20–30, 30–40, 40–50 for a frequency table?
4P
Basic Computation: Class Limits A data set with whole numbers has a low value of 20 and a high value of 82. Find the class width and class limits for a frequency table with 7 classes.
6P
Interpretation You are manager of a specialty coffee shop and collect data throughout a full day regarding waiting time for customers from the time they enter the shop until the time they pick up their order. (a) What type of distribution do you think would be most desirable for the waiting times: skewed right, skewed left, mound-shaped symmetric? Explain. (b) What if the distribution for waiting times were bimodal? What might be some explanations?
8P
Critical Thinking Look at the histogram in Figure 2-10(a), which shows mileage, in miles per gallon (mpg), for a random selection of older passenger cars (Reference: Consumer Reports). (a) Is the shape of the histogram essentially bimodal? (b) Jose looked at the raw data and discovered that the 54 data values included both the city and the highway mileages for 27 cars. He used the city mileages for the 27 cars to make the histogram in Figure 2-10(b). Using this information and Figure 2-10, parts (a) and (b), construct a histogram for the highway mileages of the same cars. Use class boundaries 16.5, 20.5, 24.5, 28.5, 32.5, 36.5, and 40.5. FIGURE 2-10
Critical Thinking The following data represent annual salaries, in thousands of dollars, for employees of a small company. Notice that the data have been sorted in increasing order. (a) Make a histogram using the class boundaries 53.5, 99.5, 145.5, 191.5, 237.5, 283.5. (b) Look at the last data value. Does it appear to be an outlier? Could this be the owner's salary? (c) Eliminate the high salary of 280 thousand dollars. Make a new histogram using the class boundaries 53.5, 62.5, 71.5, 80.5, 89.5, 98.5.
Does this histogram reflect the salary distribution of most of the employees better than the histogram in part (a)?
Interpretation Histograms of random sample data are often used as an indication of the shape of the underlying population distribution. The histograms on the next page are based on random samples of size 30, 50, and 100 from the same population. (a) Using the midpoint labels of the three histograms, what would you say about the estimated range of the population data from smallest to largest? Does the bulk of the data seem to be between 8 and 12 in all three histograms? (b) The population distribution from which the samples were drawn is symmetric and mound-shaped, with the top of the mound at 10, 95% of the data between 8 and 12, and 99.7% of data between 7 and 13. How well does each histogram reflect these characteristics? (i) Sample of size 30 (ii) Sample of size 50 (iii) Sample of size 100
12P
Interpretation The ogives shown are based on U.S. Census data and show the average annual personal income per capita for each of the 50 states. The data are rounded to the nearest thousand dollars. (a) How were the percentages shown in graph (ii) computed? (b) How many states have average per capita income less than 37.5 thousand dollars? (c) How many states have average per capita income between 42.5 and 52.5 thousand dollars? (d) What percentage of the states have average per capita income more than 47.5 thousand dollars? (i) Ogive (ii) Ogive Showing Cumulative Percentage of Data
14P
For Problems 15-20, use the specified number of classes to do the following. (a) Find the class width. (b) Make a frequency table showing class limits, class boundaries, midpoints, frequencies, relative frequencies, and cumulative frequencies. (c) Draw a histogram. (d) Draw a relative-frequency histogram. (e) Categorize the basic distribution shape as uniform, mound-shaped symmetric, bimodal, skewed left, or skewed right. (f) Draw an ogive.
(g) Interpretation Discuss some of the features about the data that the graphs reveal. Consider items such as data range, location of the middle half of the data, unusual values, outliers, etc. 15. Sports: Dog Sled Racing How long does it take to finish the 1161-mile Iditarod Dog Sled Race from Anchorage to Nome, Alaska (see Viewpoint)? Finish times (to the nearest hour) for 57 dogsled teams are shown below. Use five classes.
For Problems 15-20, use the specified number of classes to do the following. (a) Find the class width. (b) Make a frequency table showing class limits, class boundaries, midpoints, frequencies, relative frequencies, and cumulative frequencies. (c) Draw a histogram. (d) Draw a relative-frequency histogram. (e) Categorize the basic distribution shape as uniform, mound-shaped symmetric, bimodal, skewed left, or skewed right. (f) Draw an ogive. (g) Interpretation Discuss some of the features about the data that the graphs reveal. Consider items such as data range, location of the middle half of the data, unusual values, outliers, etc. 16. Medical: Glucose Testing The following data represent glucose blood levels (mg/100 ml) after a 12-hour fast for a random sample of 70 women (Reference: American Journal of Clinical Nutrition, Vol. 19, pp. 345–351). Note: These data are also available for download at the Companion Sites for this text. Use six classes.
For Problems 15-20, use the specified number of classes to do the following. (a) Find the class width. (b) Make a frequency table showing class limits, class boundaries, midpoints, frequencies, relative frequencies, and cumulative frequencies. (c) Draw a histogram. (d) Draw a relative-frequency histogram. (e) Categorize the basic distribution shape as uniform, mound-shaped symmetric, bimodal, skewed left, or skewed right. (f) Draw an ogive. (g) Interpretation Discuss some of the features about the data that the graphs reveal.
Consider items such as data range, location of the middle half of the data, unusual values, outliers, etc. 17. Medical: Tumor Recurrence Certain kinds of tumors tend to recur. The following data represent the lengths of time, in months, for a tumor to recur after chemotherapy (Reference: D. P. Byar, Journal of Urology, Vol. 10, pp. 556–561). Note: These data are also available for download at the Companion Sites for this text. Use five classes.18PFor Problems 15-20, use the specified number of classes to do the following. (a) Find the class width. (b) Make a frequency table showing class limits, class boundaries, midpoints, frequencies, relative frequencies, and cumulative frequencies. (c) Draw a histogram. (d) Draw a relative-frequency histogram. (e) Categorize the basic distribution shape as uniform, mound-shaped symmetric, bimodal, skewed left, or skewed right. (f) Draw an ogive. (g) Interpretation Discuss some of the features about the data that the graphs reveal. Consider items such as data range, location of the middle half of the data, unusual values, outliers, etc. 19. Education: College Enrollment What percent of undergraduate enrollment in coed colleges and universities in the United States is male? A random sample of 50 such institutions gave the following data (Source: USA Today College Guide). Percent Males Enrolled in Coed Universities and Colleges Use five classes.20PExpand Your Knowledge: Decimal Data The following data represent tonnes of wheat harvested each year (1894–1925) from Plot 19 at the Rothamsted Agricultural Experiment Stations, England. (a) Multiply each data value by 100 to clear the decimals. (b) Use the standard procedures of this section to make a frequency table and histogram with your whole-number data. Use six classes. 
(c) Divide class limits, class boundaries, and class midpoints by 100 to get back to your original data values.Decimal Data: Batting Averages The following data represent baseball batting averages for a random sample of National League players near the end of the baseball season. The data are from the baseball statistics section of the Denver Host. (a) Multiply each data value by 1000 to clear the decimals. (b) Use the standard procedures of this section to make a frequency table and histogram with your whole-number data. Use five classes. (c) Divide class limits, class boundaries, and class midpoints by 1000 to get back to your original data.Expand Your Knowledge: Dotplot Another display technique that is somewhat similar to a histogram is a dotplot. In a dotplot, the data values are displayed along the horizontal axis. A dot is then plotted over each data value in the data set. The next display shows a dotplot generated by Minitab (Graph Dotplot) for the number of licensed drivers per 1000 residents by state, including the District of Columbia (Source: U.S. Department of Transportation). Dotplot for Licensed Drivers per 1000 Residents (a) From the dotplot, how many states have 600 or fewer licensed drivers per 1000 residents? (b) About what percentage of the states (out of 51) seem to have close to 800 licensed drivers per 1000 residents? (c) Consider the intervals 550 to 650, 650 to 750, and 750 to 850 licensed drivers per 1000 residents. In which interval do most of the states fall?24PDotplot: Tumor Recurrence Make a dotplot for the data in Problem 17 regarding the recurrence of tumors after chemotherapy. Compare the dotplot to the histogram of Problem 17. 17. Medical: Tumor Recurrence Certain kinds of tumors tend to recur. The following data represent the lengths of time, in months, for a tumor to recur after chemotherapy (Reference: D. P. Byar, Journal of Urology, Vol. 10, pp. 556–561). 
Note: These data are also available for download at the Companion Sites for this text. Use five classes.Interpretation Consider graph (a) of Reasons People Like Texting on Cell Phones, based on a GfK Roper survey of 1000 adults. Reasons People Like Texting on Cell Phones (a) (a) Do you think respondents could select more than one response? Explain. (b) Could the same information be displayed in a circle graph? Explain. (c) Is graph (a) a Pareto chart?Reasons People Like Texting on Cell Phones (b) Interpretation Look at graph (b) of Reasons People Like Texting on Cell Phones. Is this a proper bar graph? Explain.Critical Thinking A personnel office is gathering data regarding working conditions. Employees are given a list of five conditions that they might want to see improved. They are asked to select the one item that is most critical to them. Which type of graph, circle graph or Pareto chart, would be most useful for displaying the results of the survey? Why?4PEducation: Does College Pay Off? It is costly in both time and money to go to college. Does it pay off? According to the Bureau of the Census, the answer is yes. The average annual income (in thousands of dollars) of a household headed by a person with the stated education level is as follows: 24.3 if ninth grade is the highest level achieved, 41.4 for high school graduates, 59.7 for those holding associate degrees, 82.7 for those with bachelor's degrees, 100.8 for those with master's degrees, and 121.6 for those with doctoral degrees. Make a bar graph showing household income for each education level.6PCommercial Fishing: Gulf of Alaska It's not an easy life, but it's a good life! Suppose you decide to take the summer off and sign on as a deck hand for a commercial fishing boat in Alaska that specializes in deep-water fishing for groundfish. What kind of fish can you expect to catch? One way to answer this question is to examine government reports on groundfish caught in the Gulf of Alaska. 
The following list indicates the types of fish caught annually in thousands of metric tons (Source: Report on the Status of U.S. Living Marine Resources, National Oceanic and Atmospheric Administration): flatfish, 36.3; Pacific cod, 68.6; sablefish, 16.0; Walleye pollock, 71.2; rockfish, 18.9. Make a Pareto chart showing the annual harvest for commercial fishing in the Gulf of Alaska.8PLifestyle: Hide the Mess! A survey of 1000 adults (reported in USA Today) uncovered some interesting housekeeping secrets. When unexpected company comes, where do we hide the mess? The survey showed that 68% of the respondents toss their mess into the closet, 23% shove things under the bed, 6% put things into the bathtub, and 3% put the mess into the freezer. Make a circle graph to display this information.10PFBI Report: Hawaii In the Aloha state, you are very unlikely to be murdered! However, it is considerably more likely that your house might be burgled, your car might be stolen, or you might be punched in the nose. That said, Hawaii is still a great place to vacation or, if you are very lucky, to live. The following numbers represent the crime rates per 100,000 population in Hawaii: murder, 2.6; rape, 33.4; robbery, 93.3; house burglary, 911.6; motor vehicle theft, 550.7; assault, 125.3 (Source: Crime in the United States, U.S. Department of Justice, Federal Bureau of Investigation). (a) Display this information in a Pareto chart, showing the crime rate for each category. (b) Could the information as reported be displayed as a circle graph? Explain. Hint: Other forms of crime, such as arson, are not included in the information. In addition, some crimes might occur together.12P13P14P15P16PCowboys: Longevity How long did real cowboys live? One answer may be found in the book The Last Cowboys by Connie Brooks (University of New Mexico Press). This delightful book presents a thoughtful sociological study of cowboys in west Texas and southeastern New Mexico around the year 1890. 
A sample of 32 cowboys gave the following years of longevity: (a) Make a stem-and-leaf display for these data. (b) Interpretation Consider the following quote from Baron von Richthofen in his Cattle Raising on the Plains of North America: Cowboys are to be found among the sons of the best families. The truth is probably that most were not a drunken, gambling lot, quick to draw and fire their pistols. Does the data distribution of longevity lend credence to this quote?Ecology: Habitat Wetlands offer a diversity of benefits. They provide a habitat for wildlife, spawning grounds for U.S. commercial fish, and renewable timber resources. In the last 200 years, the United States has lost more than half its wetlands. Environmental Almanac gives the percentage of wetlands lost in each state in the last 200 years. For the lower 48 states, the percentage loss of wetlands per state is as follows: Make a stem-and-leaf display of these data. Be sure to indicate the scale. How are the percentages distributed? Is the distribution skewed? Are there any gaps?Health Care: Hospitals The American Medical Association Center for Health Policy Research, in its publication State Health Care Data: Utilization, Spending, and Characteristics, included data, by state, on the number of community hospitals and the average patient stay (in days). The data are shown in the table. Make a stem-and-leaf display of the data for the average length of stay in days. Comment about the general shape of the distribution.Health Care: Hospitals Using the number of hospitals per state listed in the table in Problem 3, make a stem-and-leaf display for the number of community hospitals per state. Which states have an unusually high number of hospitals? 3. 
Health Care: Hospitals The American Medical Association Center for Health Policy Research, in its publication State Health Care Data: Utilization, Spending, and Characteristics, included data, by state, on the number of community hospitals and the average patient stay (in days). The data are shown in the table. Make a stem-and-leaf display of the data for the average length of stay in days. Comment about the general shape of the distribution.Expand Your Knowledge: Split Stem The Boston Marathon is the oldest and best-known U.S. marathon. It covers a route from Hopkinton, Massachusetts, to downtown Boston. The distance is approximately 26 miles. The Boston Marathon web site has a wealth of information about the history of the race. In particular, the site gives the winning times for the Boston Marathon. They are all over 2 hours. The following data are the minutes over 2 hours for the winning male runners over two periods of 20 years each: Earlier Period Recent Period (a) Make a stem-and-leaf display for the minutes over 2 hours of the winning times for the earlier period. Use two lines per stem. (b) Make a stem-and-leaf display for the minutes over 2 hours of the winning times for the recent period. Use two lines per stem. (c) Interpretation Compare the two distributions. How many times under 15 minutes are in each distribution?6P7P8P9P10P1CRPCritical Thinking A consumer interest group is tracking the percentage of household income spent on gasoline over the past 30 years. Which graphical display would be more useful, a histogram or a time-series graph? Why?3CRP4CRP5CRP6CRP7CRP8CRP9CRP10CRP11CRP12CRP1DH2DHIn your own words, explain the differences among histograms, relative-frequency histograms, bar graphs, circle graphs, time-series graphs, Pareto charts, and stem-and-leaf displays. If you have nominal data, which graphic displays might be useful? What if you have ordinal, interval, or ratio data?What do we mean when we say a histogram is skewed to the left? 
to the right? What is a bimodal histogram? Discuss the following statement: A bimodal histogram usually results if we draw a sample from two populations at once. Suppose you took a sample of weights of college football players and with this sample you included weights of cheerleaders. Do you think a histogram made from the combined weights would be bimodal? Explain.Discuss the statement that stem-and-leaf displays are quick and easy to construct. How can we use a stem-and-leaf display to make the construction of a frequency table easier? How does a stem-and-leaf display help you spot extreme values quickly?4LC1UT2UT3UTStatistical Literacy Consider the mode, median, and mean. Which average represents the middle value of a data distribution? Which average represents the most frequent value of a distribution? Which average takes all the specific values into account?2P3P4P5PBasic Computation: Mean, Median, Mode Find the mean, median, and mode of the data set 10 12 20 15 207PCritical Thinking Consider a data set with at least three data values. Suppose the highest value is increased by 10 and the lowest is decreased by 5. a) Does the mean change? Explain. b) Does the median change? Explain. c) Is it possible for the mode to change? Explain.Critical Thinking Consider a data set with at least three data values. Suppose the highest value is increased by 10 and the lowest is decreased by 10. a) Does the mean change? Explain. b) Does the median change? Explain. c) Is it possible for the mode to change? Explain.10PCritical Thinking When a distribution is mound-shaped symmetric, what is the general relationship among the values of the mean, median, and mode?Critical Thinking Consider the following types of data that were obtained from a random sample of 49 credit card accounts. Identify all the averages (mean, median, or mode) that can be used to summarize the data. a) Outstanding balance on each account b) Name of credit card (e.g., MasterCard, Visa, American Express, etc.) 
c) Dollar amount due on next paymentCritical Thinking Consider the numbers 2 3 4 5 5 a) Compute the mode, median, and mean. b) If the numbers represent codes for the colors of T-shirts ordered from a catalog, which average(s) would make sense? c) If the numbers represent one-way mileages for trails to different lakes, which average(s) would make sense? d) Suppose the numbers represent survey responses from 1 to 5, with 1 = disagree strongly, 2 = disagree, 3 = agree, 4 = agree strongly, and 5 = agree very strongly. Which averages make sense?14P15P16P17PCritical Thinking Consider a data set of 15 distinct measurements with mean A and median B. a) If the highest number were increased, what would be the effect on the median and mean? Explain. b) If the highest number were decreased to a value still larger than B, what would be the effect on the median and mean? c) If the highest number were decreased to a value smaller than B, what would be the effect on the median and mean?19P20P21PFootball: Age of Professional Players How old are professional football players? The 11th edition of The Pro Football Encyclopedia gave the following information. Random sample of pro football player ages in years: (a) Compute the mean, median, and mode of the ages. (b) Interpretation Compare the averages. Does one seem to represent the age of the pro football players most accurately? Explain.23P24P25P26P27P28PExpand Your Knowledge: Harmonic Mean When data consist of rates of change, such as speeds, the harmonic mean is an appropriate measure of central tendency. For n data values, Harmonic mean $= \dfrac{n}{\sum \frac{1}{x}}$, assuming no data value is 0. Suppose you drive 60 miles per hour for 100 miles, then 75 miles per hour for 100 miles. 
Use the harmonic mean to find your average speed.30PStatistical Literacy Which average (mean, median, or mode) is associated with the standard deviation?Statistical Literacy What is the relationship between the variance and the standard deviation for a sample data set?Statistical Literacy When computing the standard deviation, does it matter whether the data are sample data or data comprising the entire population? Explain.4P5PBasic Computation: Range, Standard Deviation Consider the data set 1 2 3 4 5 (a) Find the range. (b) Use the defining formula to compute the sample standard deviation s. (c) Use the defining formula to compute the population standard deviation $\sigma$.Critical Thinking For a given data set in which not all data values are equal, which value is smaller, s or $\sigma$? Explain.8P9P10P11PCritical Thinking: Outliers One indicator of an outlier is that an observation is more than 2.5 standard deviations from the mean. Consider the data value 80. (a) If a data set has mean 70 and standard deviation 5, is 80 a suspect outlier? (b) If a data set has mean 70 and standard deviation 3, is 80 a suspect outlier?13PBasic Computation: Coefficient of Variation, Chebyshev Interval Consider sample data with $\bar{x} = 15$ and s = 3. (a) Compute the coefficient of variation. (b) Compute a 75% Chebyshev interval around the sample mean.15P16PSpace Shuttle: Epoxy Kevlar epoxy is a material used on the NASA space shuttles. Strands of this epoxy were tested at the 90% breaking strength. The following data represent time to failure (in hours) for a random sample of 50 epoxy strands (Reference: R. E. Barlow, University of California, Berkeley). Let x be a random variable representing time to failure (in hours) at 90% breaking strength. Note: These data are also available for download at the Companion Sites for this text. (a) Find the range. (b) Use a calculator to verify that $\sum x = 62.11$ and $\sum x^2 \approx 164.23$. 
(c) Use the results of part (b) to compute the sample mean, variance, and standard deviation for the time to failure. (d) Interpretation Use the results of part (c) to compute the coefficient of variation. What does this number say about time to failure? Why does a small CV indicate more consistent data, whereas a larger CV indicates less consistent data? Explain.18P19P20P21P22P23P24P25P26PBrain Teaser: Sum of Squares If you like mathematical puzzles or love algebra, try this! Otherwise, just trust that the computational formula for the sum of squares is correct. We have a sample of x values. The sample size is n. Fill in the details for the following steps. $\sum (x - \bar{x})^2 = \sum x^2 - 2\bar{x}\sum x + n\bar{x}^2 = \sum x^2 - 2n\bar{x}^2 + n\bar{x}^2 = \sum x^2 - \dfrac{(\sum x)^2}{n}$28P29P30PStatistical Literacy Angela took a general aptitude test and scored in the 82nd percentile for aptitude in accounting. What percentage of the scores were at or below her score? What percentage were above?2P3P4P5PBasic Computation: Five-Number Summary, Interquartile Range Consider the following ordered data: 2 5 5 6 7 8 8 9 10 12 (a) Find the low, Q1, median, Q3, high. (b) Find the interquartile range. (c) Make a box-and-whisker plot.Health Care: Nurses At Center Hospital there is some concern about the high turnover of nurses. A survey was done to determine how long (in months) nurses had been in their current positions. The responses (in months) of 20 nurses were Make a box-and-whisker plot of the data. Find the interquartile range.8P9PSociology: High School Dropouts What percentage of the general U.S. population are high school dropouts? The Statistical Abstract of the United States, 120th edition, gives the percentage of high school dropouts by state. For convenience, the data are sorted in increasing order. (a) Make a box-and-whisker plot and find the interquartile range. (b) Wyoming has a dropout rate of about 7%. 
Into what quartile does this rate fall?Auto Insurance: Interpret Graphs Consumer Reports rated automobile insurance companies and listed annual premiums for top-rated companies in several states. Figure 3-9 shows box-and-whisker plots for annual premiums for urban customers (married couple with one 17-year-old son) in three states. The box-and-whisker plots in Figure 3-9 were all drawn using the same scale on a TI-84Plus/TI-83Plus/TI-nspire calculator. FIGURE 3-9 Insurance Premium (annual, urban) (a) Which state has the lowest premium? the highest? (b) Which state has the highest median premium? (c) Which state has the smallest range of premiums? the smallest interquartile range? (d) Figure 3-10 gives the five-number summaries generated on the TI-84Plus/TI-83Plus/TI-nspire calculators for the box-and-whisker plots of Figure 3-9. Match the five-number summaries to the appropriate box-and-whisker plots. FIGURE 3-10 Five-Number Summaries for Insurance Premiums12P1CRPCritical Thinking Look at the two histograms below. Each involves the same number of data. The data are all whole numbers, so the height of each bar represents the number of values equal to the corresponding midpoint shown on the horizontal axis. Notice that both distributions are symmetric. (a) Estimate the mode, median, and mean for each histogram. (b) Which distribution has the larger standard deviation? Why?Critical Thinking Consider the following Minitab display of two data sets. (a) What are the respective means? the respective ranges? (b) Which data set seems more symmetric? Why? (c) Compare the interquartile ranges of the two sets. How do the middle halves of the data sets compare?4CRPPolitical Science: Georgia Democrats How Democratic is Georgia? County-by-county results are shown for a recent election. For your convenience, the data have been sorted in increasing order (Source: County and City Data Book, 12th edition, U.S. Census Bureau). 
Percentage of Democratic vote by counties in Georgia (a) Make a box-and-whisker plot of the data. Find the interquartile range. (b) Grouped Data Make a frequency table using five classes. Then estimate the mean and sample standard deviation using the frequency table. Compute a 75% Chebyshev interval centered about the mean. (c) If you have a statistical calculator or computer, use it to find the actual sample mean and sample standard deviation. Otherwise, use the values $\sum x \approx 2769$ and $\sum x^2 \approx 132{,}179$ to compute the sample mean and sample standard deviation.Grades: Weighted Average Professor Cramer determines a final grade based on attendance, two papers, three major tests, and a final exam. Each of these activities has a total of 100 possible points. However, the activities carry different weights. Attendance is worth 5%, each paper is worth 8%, each test is worth 15%, and the final is worth 34%. (a) What is the average for a student with 92 on attendance, 73 on the first paper, 81 on the second paper, 85 on test 1, 87 on test 2, 83 on test 3, and 90 on the final exam? (b) Compute the average for a student with the above scores on the papers, tests, and final exam, but with a score of only 20 on attendance.7CRPAgriculture: Harvest Weight of Maize The following data represent weights in kilograms of maize harvest from a random sample of 72 experimental plots on St. Vincent, an island in the Caribbean (Reference: B. G. F. Springer, Proceedings, Caribbean Food Corps. Soc., Vol. 10, pp. 147–152). Note: These data are also available for download at the Companion Sites for this text. For convenience, the data are presented in increasing order. (a) Compute the five-number summary. (b) Compute the interquartile range. (c) Make a box-and-whisker plot. (d) Interpretation Discuss the distribution. Does the lower half of the distribution show more data spread than the upper half?9CRPAgriculture: Bell Peppers The pathogen Phytophthora capsici causes bell pepper plants to wilt and die. 
A research project was designed to study the effect of soil water content and the spread of the disease in fields of bell peppers (Source: Journal of Agricultural, Biological, and Environmental Statistics, Vol. 2, No. 2). It is thought that too much water helps spread the disease. The fields were divided into rows and quadrants. The soil water content (percent of water by volume of soil) was determined for each plot. An important first step in such a research project is to give a statistical description of the data. soil water content for bell Pepper study (a) Make a box-and-whisker plot of the data. Find the interquartile range. (b) Grouped Data Make a frequency table using four classes. Then estimate the mean and sample standard deviation using the frequency table. Compute a 75% Chebyshev interval centered about the mean. (c) If you have a statistical calculator or computer, use it to find the actual sample mean and sample standard deviation.Performance Rating: Weighted Average A performance evaluation for new sales representatives at Office Automation Incorporated involves several ratings done on a scale of 1 to 10, with 10 the highest rating. The activities rated include new contacts, successful contacts, total contacts, dollar volume of sales, and reports. Then an overall rating is determined by using a weighted average. The weights are 2 for new contacts, 3 for successful contacts, 3 for total contacts, 5 for dollar value of sales, and 3 for reports. What would the overall rating be for a sales representative with ratings of 5 for new contacts, 8 for successful contacts, 7 for total contacts, 9 for dollar volume of sales, and 7 for reports?1DH2DH1LC2LC3LC4LC1UT1CURPDescribe how the presence of possible outliers might be identified on (a) histograms. (b) dotplots. (c) stem-and-leaf displays. (d) box-and-whisker plots.3CURP4CURP5CURP6CURP7CURPIn west Texas, water is extremely important. 
The following data represent pH levels in ground water for a random sample of 102 west Texas wells. A pH less than 7 is acidic and a pH above 7 is alkaline. Scanning the data, you can see that water in this region tends to be hard (alkaline). Too high a pH means the water is unusable or needs expensive treatment to make it usable (Reference: C. E. Nichols and V. E. Kane, Union Carbide Technical Report K/UR-1). These data also available for download at the companion Sites for this text. For convenience, the data are presented in increasing order. x: pH of Ground Water in 102 West Texas Wells Make a frequency table, histogram, and relative frequency histogram using five classes. Recall that for decimal data, we clear the decimal to determine classes for whole-number data and then reinsert the decimal to obtain the classes for the frequency table of the original data.9CURP10CURP11CURPIn west Texas, water is extremely important. The following data represent pH levels in ground water for a random sample of 102 west Texas wells. A pH less than 7 is acidic and a pH above 7 is alkaline. Scanning the data, you can see that water in this region tends to be hard (alkaline). Too high a pH means the water is unusable or needs expensive treatment to make it usable (Reference: C. E. Nichols and V. E. Kane, Union Carbide Technical Report K/UR-1). These data are also available for download at the Companion Sites for this text. For convenience, the data are presented in increasing order. Compute a 75% Chebyshev interval centered on the mean.13CURP14CURP15CURPIn west Texas, water is extremely important. The following data represent pH levels in ground water for a random sample of 102 west Texas wells. A pH less than 7 is acidic and a pH above 7 is alkaline. Scanning the data, you can see that water in this region tends to be hard (alkaline). Too high a pH means the water is unusable or needs expensive treatment to make it usable (Reference: C. E. Nichols and V. E. 
Kane, Union Carbide Technical Report K/UR-1). These data also available for download at the companion Sites for this text. For convenience, the data are presented in increasing order. x: pH of Ground Water in 102 West Texas Wells Look at the stem-and-leaf plot. Are there any unusually high or low pH levels in this sample of wells? How many wells are neutral (pH of 7)?17CURP18CURPStatistical Literacy List three methods of assigning probabilities.2P3PStatistical Literacy What is the law of large numbers? If you were using the relative frequency of an event to estimate the probability of the event, would it be better to use 100 trials or 500 trials? Explain.5P6P7P8PInterpretation An investment opportunity boasts that the chance of doubling your money in 3 years is 95%. However, when you research the details of the investment, you estimate that there is a 3% chance that you could lose the entire investment. Based on this information, are you certain to make money on this investment? Are there risks in this investment opportunity?Interpretation A sample space consists of 4 simple events: A, B, C, D. Which events comprise the complement of A? Can the sample space be viewed as having two events, A and Ac? Explain.Critical Thinking Consider a family with 3 children. Assume the probability that one child is a boy is 0.5 and the probability that one child is a girl is also 0.5, and that the events boy and girl are independent. (a) List the equally likely events for the gender of the 3 children, from oldest to youngest. (b) What is the probability that all 3 children are male? Notice that the complement of the event all three children are male is at least one of the children is female. Use this information to compute the probability that at least one child is female.12P13PCritical Thinking (a) Explain why 0.41 cannot be the probability of some event. (b) Explain why 1.21 cannot be the probability of some event. (c) Explain why 120% cannot be the probability of some event. 
(d) Can the number 0.56 be the probability of an event? Explain.15P16PMyers-Briggs: Personality Types Isabel Briggs Myers was a pioneer in the study of personality types. The personality types are broadly defined according to four main preferences. Do married couples choose similar or different personality types in their mates? The following data give an indication (Source: I. B. Myers and M. H. McCaulley, A Guide to the Development and Use of the MyersBriggs Type Indicators). Similarities and Differences in a Random Sample of 375 Married Couples Suppose that a married couple is selected at random. (a) Use the data to estimate the probability that they will have 0, 1, 2, 3, or 4 personality preferences in common. (b) Do the probabilities add up to 1? Why should they? What is the sample space in this problem?General: Roll a Die (a) If you roll a single die and count the number of dots on top, what is the sample space of all possible outcomes? Are the outcomes equally likely? (b) Assign probabilities to the outcomes of the sample space of part (a). Do the probabilities add up to 1? Should they add up to 1? Explain. (c) What is the probability of getting a number less than 5 on a single throw? (d) What is the probability of getting 5 or 6 on a single throw?Psychology: Creativity When do creative people get their best ideas? USA Today did a survey of 966 inventors (who hold U.S. patents) and obtained the following information: Time of Day When Best Ideas Occur (a) Assuming that the time interval includes the left limit and all the times up to but not including the right limit, estimate the probability that an inventor has a best idea during each time interval: from 6 A.M. to 12 noon, from 12 noon to 6 P.M., from 6 P.M. to 12 midnight, from 12 midnight to 6 A.M. (b) Do the probabilities of part (a) add up to 1? Why should they? 
What is the sample space in this problem?Agriculture: Cotton A botanist has developed a new hybrid cotton plant that can withstand insects better than other cotton plants. However, there is some concern about the germination of seeds from the new plant. To estimate the probability that a seed from the new plant will germinate, a random sample of 3000 seeds was planted in warm, moist soil. Of these seeds, 2430 germinated. (a) Use relative frequencies to estimate the probability that a seed will germinate. What is your estimate? (b) Use relative frequencies to estimate the probability that a seed will not germinate. What is your estimate? (c) Either a seed germinates or it does not. What is the sample space in this problem? Do the probabilities assigned to the sample space add up to 1? Should they add up to 1? Explain. (d) Are the outcomes in the sample space of part (c) equally likely?Expand Your Knowledge: Odds in Favor Sometimes probability statements are expressed in terms of odds. The odds in favor of an event A are the ratio $\dfrac{P(A)}{P(\text{not } A)} = \dfrac{P(A)}{P(A^c)}$. For instance, if $P(A) = 0.60$, then $P(A^c) = 0.40$ and the odds in favor of A are $\dfrac{0.60}{0.40} = \dfrac{6}{4} = \dfrac{3}{2}$, written as 3 to 2 or 3:2. (a) Show that if we are given the odds in favor of event A as n:m, the probability of event A is given by $P(A) = \dfrac{n}{n+m}$. Hint: Solve the equation $\dfrac{n}{m} = \dfrac{P(A)}{1 - P(A)}$ for $P(A)$. (b) A telemarketing supervisor tells a new worker that the odds of making a sale on a single call are 2 to 15. What is the probability of a successful call? (c) A sports announcer says that the odds a basketball player will make a free throw shot are 3 to 5. What is the probability the player will make the shot?Expand Your Knowledge: Odds Against Betting odds are usually stated against the event happening (against winning). The odds against event W are the ratio $\dfrac{P(\text{not } W)}{P(W)} = \dfrac{P(W^c)}{P(W)}$. In horse racing, the betting odds are based on the probability that the horse does not win. 
(a) Show that if we are given the odds against an event W as a:b, the probability of not W is P(W^c) = a/(a + b). Hint: Solve the equation a/b = P(W^c)/[1 − P(W^c)] for P(W^c). (b) In a recent Kentucky Derby, the betting odds for the favorite horse, Point Given, were 9 to 5. Use these odds to compute the probability that Point Given would lose the race. What is the probability that Point Given would win the race? (c) In the same race, the betting odds for the horse Monarchos were 6 to 1. Use these odds to estimate the probability that Monarchos would lose the race. What is the probability that Monarchos would win the race? (d) Invisible Ink was a long shot, with betting odds of 30 to 1. Use these odds to estimate the probability that Invisible Ink would lose the race. What is the probability the horse would win the race? For further information on the Kentucky Derby, visit the web site of the Kentucky Derby.

Business: Customers John runs a computer software store. Yesterday he counted 127 people who walked by his store, 58 of whom came into the store. Of the 58, only 25 bought something in the store. (a) Estimate the probability that a person who walks by the store will enter the store. (b) Estimate the probability that a person who walks into the store will buy something. (c) Estimate the probability that a person who walks by the store will come in and buy something. (d) Estimate the probability that a person who comes into the store will buy nothing.

Statistical Literacy If two events are mutually exclusive, can they occur concurrently? Explain.

Statistical Literacy If two events A and B are independent and you know that P(A) = 0.3, what is the value of P(A | B)?

Basic Computation: Addition Rule Given P(A) = 0.3 and P(B) = 0.4: (a) If A and B are mutually exclusive events, compute P(A or B). (b) If P(A and B) = 0.1, compute P(A or B).

Basic Computation: Addition Rule Given P(A) = 0.7 and P(B) = 0.4: (a) Can events A and B be mutually exclusive? Explain.
(b) If P(A and B) = 0.2, compute P(A or B).

5P 6P 7P 8P 9P 10P 11P 12P 13P 14P 15P

Environmental: Land Formations Arches National Park is located in southern Utah. The park is famous for its beautiful desert landscape and its many natural sandstone arches. Park Ranger Edward McCarrick started an inventory (not yet complete) of natural arches within the park that have an opening of at least 3 feet. The following table is based on information taken from the book Canyon Country Arches and Bridges by F. A. Barnes. The height of the arch opening is rounded to the nearest foot. For an arch chosen at random in Arches National Park, use the preceding information to estimate the probability that the height of the arch opening is (a) 3 to 9 feet tall (b) 30 feet or taller (c) 3 to 49 feet tall (d) 10 to 74 feet tall (e) 75 feet or taller

17P

General: Roll Two Dice You roll two fair dice, a green one and a red one. (a) Are the outcomes on the dice independent? (b) Find P(1 on green die and 2 on red die). (c) Find P(2 on green die and 1 on red die). (d) Find P[(1 on green die and 2 on red die) or (2 on green die and 1 on red die)].

19P 20P 21P

General: Deck of Cards You draw two cards from a standard deck of 52 cards without replacing the first one before drawing the second. (a) Are the outcomes on the two cards independent? Why? (b) Find P(3 on 1st card and 10 on 2nd). (c) Find P(10 on 1st card and 3 on 2nd). (d) Find the probability of drawing a 10 and a 3 in either order.

General: Deck of Cards You draw two cards from a standard deck of 52 cards, but before you draw the second card, you put the first one back and reshuffle the deck. (a) Are the outcomes on the two cards independent? Why? (b) Find P(Ace on 1st card and King on 2nd). (c) Find P(King on 1st card and Ace on 2nd). (d) Find the probability of drawing an Ace and a King in either order.

24P

Marketing: Toys USA Today gave the information shown in the table about ages of children receiving toys. The percentages represent all toys sold.
What is the probability that a toy is purchased for someone (a) 6 years old or older? (b) 12 years old or younger? (c) between 6 and 12 years old? (d) between 3 and 9 years old? Interpretation A child between 10 and 12 years old looks at this probability distribution and asks, "Why are people more likely to buy toys for kids older than I am [13 and over] than for kids in my age group [10-12]?" How would you respond?

26P 27P 28P 29P

Survey: Medical Tests Diagnostic tests of medical conditions can have several types of results. The test result can be positive or negative, whether or not a patient has the condition. A positive test (+) indicates that the patient has the condition. A negative test (−) indicates that the patient does not have the condition. Remember, a positive test does not prove that the patient has the condition. Additional medical work may be required. Consider a random sample of 200 patients, some of whom have a medical condition and some of whom do not. Results of a new diagnostic test for the condition are shown. Assume the sample is representative of the entire population. For a person selected at random, compute the following probabilities: (a) P(+ | condition present); this is known as the sensitivity of a test. (b) P(− | condition present); this is known as the false-negative rate. (c) P(− | condition absent); this is known as the specificity of a test. (d) P(+ | condition absent); this is known as the false-positive rate. (e) P(condition present and +); this is the predictive value of the test. (f) P(condition present and −).

Survey: Lung/Heart In an article titled "Diagnostic accuracy of fever as a measure of postoperative pulmonary complications" (Heart Lung, Vol. 10, No. 1, p. 61), J. Roberts and colleagues discuss using a fever of 38°C or higher as a diagnostic indicator of postoperative atelectasis (collapse of the lung) as evidenced by x-ray observation.
For fever ≥ 38°C as the diagnostic test, the results for postoperative patients are shown. For the meaning of + and −, see Problem 30. Complete parts (a) through (f) from Problem 30.

Survey: Customer Loyalty Are customers more loyal in the east or in the west? The following table is based on information from Trends in the United States, published by the Food Marketing Institute, Washington, D.C. The columns represent length of customer loyalty (in years) at a primary supermarket. The rows represent regions of the United States. What is the probability that a customer chosen at random (a) has been loyal 10 to 14 years? (b) has been loyal 10 to 14 years, given that he or she is from the east? (c) has been loyal at least 10 years? (d) has been loyal at least 10 years, given that he or she is from the west?
(e) is from the west, given that he or she has been loyal less than 1 year? (f) is from the south, given that he or she has been loyal less than 1 year? (g) has been loyal 1 or more years, given that he or she is from the east? (h) has been loyal 1 or more years, given that he or she is from the west? (i) Are the events "from the east" and "loyal 15 or more years" independent? Explain.

Franchise Stores: Profits Wing Foot is a shoe franchise commonly found in shopping centers across the United States. Wing Foot knows that its stores will not show a profit unless they gross over $940,000 per year. Let A be the event that a new Wing Foot store grosses over $940,000 its first year. Let B be the event that a store grosses over $940,000 its second year. Wing Foot has an administrative policy of closing a new store if it does not show a profit in either of the first 2 years. The accounting office at Wing Foot provided the following information: 65% of all Wing Foot stores show a profit the first year; 71% of all Wing Foot stores show a profit the second year (this includes stores that did not show a profit the first year); however, 87% of Wing Foot stores that showed a profit the first year also showed a profit the second year. Compute the following: (a) P(A) (b) P(B) (c) P(B | A) (d) P(A and B) (e) P(A or B) (f) What is the probability that a new Wing Foot store will not be closed after 2 years? What is the probability that a new Wing Foot store will be closed after 2 years?

Education: College of Nursing At Litchfield College of Nursing, 85% of incoming freshmen nursing students are female and 15% are male. Recent records indicate that 70% of the entering female students will graduate with a BSN degree, while 90% of the male students will obtain a BSN degree. If an incoming freshman nursing student is selected at random, find (a) P(student will graduate | student is female). (b) P(student will graduate and student is female). (c) P(student will graduate | student is male).
(d) P(student will graduate and student is male). (e) P(student will graduate). Note that those who will graduate are either males who will graduate or females who will graduate. (f) The events described by the phrases "will graduate and is female" and "will graduate, given female" seem to be describing the same students. Why are the probabilities P(will graduate and is female) and P(will graduate | female) different?
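
The contrast between a conditional and a joint probability in the College of Nursing problem can be checked numerically. The following is a minimal Python sketch, not the textbook's solution: the four input figures come from the problem statement, while the variable names and the helper `joint` are hypothetical and introduced only for illustration.

```python
# Figures from the Litchfield College of Nursing problem statement.
p_female = 0.85             # P(student is female)
p_male = 0.15               # P(student is male)
p_grad_given_female = 0.70  # P(will graduate | female)
p_grad_given_male = 0.90    # P(will graduate | male)


def joint(p_condition, p_event_given_condition):
    """Multiplication rule: P(A and B) = P(B) * P(A | B).

    Hypothetical helper name, used here for illustration only.
    """
    return p_condition * p_event_given_condition


# Joint probabilities: the student graduates AND belongs to the group.
p_grad_and_female = joint(p_female, p_grad_given_female)
p_grad_and_male = joint(p_male, p_grad_given_male)

# Graduates are either females who graduate or males who graduate;
# the two events are mutually exclusive, so their probabilities add.
p_grad = p_grad_and_female + p_grad_and_male

print("P(grad | female)   =", p_grad_given_female)
print("P(grad and female) =", round(p_grad_and_female, 3))
print("P(grad and male)   =", round(p_grad_and_male, 3))
print("P(grad)            =", round(p_grad, 3))
```

The conditional probability restricts attention to female students only, whereas the joint probability is measured against all incoming students, which is why the joint value is smaller by the factor P(female).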