Bartleby Sitemap - Textbook Solutions

All Textbook Solutions for An Introduction to Statistical Methods and Data Analysis

Hansen (2006) describes a study to assess the migration and survival of salmon released from fish farms located in Norway. The mingling of escaped farmed salmon with wild salmon raises several concerns. First, the assessment of the abundance of wild salmon stocks will be biased if there is a presence of large numbers of farmed salmon. Second, potential interbreeding between farmed and wild salmon may result in a reduction in the health of the wild stocks. Third, diseases present in farmed salmon may be transferred to wild salmon. Two batches of farmed salmon were tagged and released in two locations, one batch of 1,996 fish in northern Norway and a second batch of 2,499 fish in southern Norway. The researchers recorded the time and location at which the fish were captured by either commercial fishermen or anglers in fresh water. Two of the most important pieces of information to be determined by the study were the distance from the point of the fish’s release to the point of its capture and the length of time it took for the fish to be captured. a. Identify the population that is of interest to the researchers. b. Describe the sample. c. What characteristics of the population are of interest to the researchers? d. If the sample measurements are used to make inferences about the population characteristics, why is a measure of reliability of the inferences important?

During 2012, Texas had listed on FracFocus, an industry fracking disclosure site, nearly 6,000 oil and gas wells in which the fracking methodology was used to extract natural gas. Fontenot et al. (2013) report on a study of 100 private water wells in or near the Barnett Shale in Texas. There were 91 private wells located within 5 km of an active gas well using fracking, 4 private wells with no gas wells located within a 14 km radius, and 5 wells outside of the Barnett Shale with no gas well located within a 60 km radius.
They found that there were elevated levels of potential contaminants such as arsenic and selenium in the 91 wells closest to natural gas extraction sites compared to the 9 wells that were at least 14 km away from an active gas well using the fracking technique to extract natural gas. Identify the population that is of interest to the researchers. Describe the sample. What characteristics of the population are of interest to the researchers? If the sample measurements are used to make inferences about the population characteristics, why is a measure of reliability of the inferences important?

In 2014, Congress cut $8.7 billion from the Supplemental Nutrition Assistance Program (SNAP), more commonly referred to as food stamps. The rationale for the decrease is that providing assistance to people will result in the next generation of citizens being more dependent on the government for support. Hoynes (2012) describes a study to evaluate this claim. The study examines 60,782 families over the time period of 1968 to 2009, which is subsequent to the introduction of the Food Stamp Program in 1961. This study examines the impact of a positive and policy-driven change in economic resources available in utero and during childhood on the economic health of individuals in adulthood. The study assembled data linking family background in early childhood to adult health and economic outcomes. The study concluded that the Food Stamp Program has effects decades after initial exposure. Specifically, access to food stamps in childhood leads to a significant reduction in the incidence of metabolic syndrome (obesity, high blood pressure, and diabetes) and, for women, an increase in economic self-sufficiency. Overall, the results suggest substantial internal and external benefits of SNAP. Identify the population that is of interest to the researchers. Describe the sample. What characteristics of the population are of interest to the researchers?
If the sample measurements are used to make inferences about the population characteristics, why is a measure of reliability of the inferences important?

Of all sports, football accounts for the highest incidence of concussion in the United States due to the large number of athletes participating and the nature of the sport. While there is general agreement that concussion incidence can be reduced by making rule changes and teaching proper tackling technique, there remains debate as to whether helmet design may also reduce the incidence of concussion. Rowson et al. (2014) report on a retrospective analysis of head impact data collected between 2005 and 2010 from eight collegiate football teams. Concussion rates for players wearing two types of helmets, Riddell VSR4 and Riddell Revolution, were compared. A total of 1,281,444 head impacts were recorded, from which 64 concussions were diagnosed. The relative risk of sustaining a concussion in a Revolution helmet compared with a VSR4 helmet was 46.1%. This study illustrates that differences in the ability to reduce concussion risk exist between helmet models in football. Although helmet design may never prevent all concussions from occurring in football, evidence illustrates that it can reduce the incidence of this injury. a. Identify the population that is of interest to the researchers. b. Describe the sample. c. What characteristics of the population are of interest to the researchers? d. If the sample measurements are used to make inferences about the population characteristics, why is a measure of reliability of the inferences important?

During the 2004 senatorial campaign in a large southwestern state, illegal immigration was a major issue. One of the candidates argued that illegal immigrants made use of educational and social services without having to pay property taxes.
The other candidate pointed out that the cost of new homes in their state was 20–30% less than the national average due to the low wages received by the large number of illegal immigrants working on new home construction. A random sample of 5,500 registered voters was asked the question, “Are illegal immigrants generally a benefit or a liability to the state’s economy?” The results were as follows: 3,500 people responded “liability,” 1,500 people responded “benefit,” and 500 people responded “uncertain.” What is the population of interest? What is the population from which the sample was selected? Does the sample adequately represent the population? If a second random sample of 5,000 registered voters was selected, would the results be nearly the same as the results obtained from the initial sample of 5,500 voters? Explain your answer.

An American history professor at a major university was interested in knowing the history literacy of college freshmen. In particular, he wanted to find what proportion of college freshmen at the university knew which country controlled the original 13 colonies prior to the American Revolution. The professor sent a questionnaire to all freshman students enrolled in HIST 101 and received responses from 318 students out of the 7,500 students who were sent the questionnaire. One of the questions was “What country controlled the original 13 colonies prior to the American Revolution?” What is the population of interest to the professor? What is the sampled population? Is there a major difference in the two populations? Explain your answer. Suppose that several lectures on the American Revolution had been given in HIST 101 prior to the students receiving the questionnaire. What possible source of bias has the professor introduced into the study relative to the population of interest?

In the following descriptions of a study, confounding is present.
Describe the explanatory and confounding variable in the study and how the confounding may invalidate the conclusions of the study. Furthermore, suggest how you would change the study to eliminate the effect of the confounding variable. A prospective study is conducted to study the relationship between incidence of lung cancer and level of alcohol drinking. The drinking status of 5,000 subjects is determined, and the health of the subjects is then followed for 10 years. The results are given below. A study was conducted to examine the possible relationship between coronary disease and obesity. The study found that the proportion of obese persons having developed coronary disease was much higher than the proportion of nonobese persons. A medical researcher states that the population of obese persons generally has higher incidences of hypertension and diabetes than the population of nonobese persons.

In the following descriptions of a study, confounding is present. Describe the explanatory and confounding variable in the study and how the confounding may invalidate the conclusions of the study. Furthermore, suggest how you would change the study to eliminate the effect of the confounding variable. a. A hospital introduces a new screening procedure to identify patients suffering from a stroke so that a new blood clot medication can be given to the patient during the crucial period of 12 hours after stroke begins. The procedure appears to be very successful because in the first year of its implementation there is a higher rate of total recovery by the patients in comparison to the rate in the previous year for patients admitted to the hospital. b. A high school mathematics teacher is convinced that a new software program will improve math scores for students taking the SAT. As a method of evaluating her theory, she offers the students an opportunity to use the software on the school’s computers during a 1-hour period after school.
The teacher concludes the software is effective because the students using the software had significantly higher scores on the SAT than did the students who did not use the software.

A news report states that minority children who take advanced mathematics courses in high school have a first-year GPA in college that is equivalent to that of white students. The newspaper columnist suggested that the lack of advanced mathematics courses in high school curriculums in inner-city schools was a major cause of the low college success rate of students from inner-city schools. What confounding variables may be present that invalidate the columnist’s conclusion?

A study was conducted to determine if the inclusion of a foreign language requirement in high schools may have a positive effect on students’ performance on standardized English exams. From a sample of 100 high schools, 50 of which had a foreign language requirement and 50 of which did not, it was found that the average score on the English proficiency exam was 25% higher for the students having a foreign language requirement. What confounding variables may be present that would invalidate the conclusion that requiring a foreign language in high school increases English language proficiency?

A large auto parts supplier with distribution centers throughout the United States wants to survey its employees concerning health insurance coverage. Employee insurance plans vary greatly from state to state. The company wants to obtain an estimate of the annual health insurance deductible its employees would find acceptable. What sampling plan would you suggest to the company to achieve its goal?

The circuit judges in a rural county are considering a change in how jury pools are selected for felony trials.
They ask the administrator of the courts to assess the county residents’ reaction to changing the requirement for membership in the jury pool from the current requirement of all registered voters to a new requirement of all registered voters plus all residents with a current driver’s license. The administrator sends questionnaires to a random sample of 1,000 people from the list of registered voters in the county and receives responses from 253 people. What is the population of interest? What is the sampling frame? What possible biases could be present in using the information from the survey?

Time magazine, in an article in the late 1950s, stated that “the average Yaleman, class of 1924, makes $25,111 a year,” which, in today’s dollars, would be over $150,000. Time’s estimate was based on replies to a sample survey questionnaire mailed to those members of the Yale class of 1924 whose addresses were on file with the Yale administration in the late 1950s. What is the survey’s population of interest? Were the techniques used in selecting the sample likely to produce a sample that was representative of the population of interest? What are the possible sources of bias in the procedures used to obtain the sample? Based on the sources of bias, do you believe that Time’s estimate of the salary of a 1924 Yale graduate in the late 1950s is too high, too low, or nearly the correct value?

The New York City school district is planning a survey of 1,000 of its 250,000 parents or guardians who have students currently enrolled. They want to assess the parents’ opinion about mandatory drug testing of all students participating in any extracurricular activities, not just sports. An alphabetical listing of all parents or guardians is available for selecting the sample. In each of the following descriptions of the method of selecting the 1,000 participants in the survey, identify the type of sampling method used (simple random sampling, stratified sampling, or cluster sampling).
Each name is randomly assigned a number. The names with numbers 1 through 1,000 are selected for the survey. The schools are divided into five groups according to grade level taught at the school: K–2, 3–5, 6–7, 8–9, 10–12. Five separate sampling frames are constructed, one for each group. A simple random sample of 200 parents or guardians is selected from each group. The school district is also concerned that the parent’s or guardian’s opinion may differ depending on the age and sex of the student. Each name is randomly assigned a number. The names with numbers 1 through 1,000 are selected for the survey. The parent is asked to fill out a separate survey for each of their currently enrolled children.

A professional society, with a membership of 45,000, is designing a study to evaluate its members’ satisfaction with the type of sessions presented at the society’s annual meeting. In each of the following descriptions of the method of selecting participants in the survey, identify the type of sampling method used (simple random sampling, stratified sampling, or cluster sampling). a. The society has an alphabetical listing of all its members. It assigns a number to each name and then, using a computer software program, generates 1,250 numbers between 1 and 45,000. It selects these 1,250 members for the survey. b. The society is interested in regional differences in its members’ opinions. Therefore, it divides the United States into nine regions with approximately 5,000 members per region. It then randomly selects 450 members from each region for inclusion in the survey. c. The society is composed of doctors, nurses, and therapists, all working in hospitals. There are a total of 450 distinct hospitals. The society decides to conduct onsite in-person interviews, so it randomly selects 20 hospitals and interviews all members working at the selected hospitals.

For each of the following situations, decide what sampling method you would use.
Provide an explanation of why you selected a particular method of sampling. a. A large automotive company wants to upgrade the software on its notebook computers. A survey of 1,500 employees will request information concerning frequently used software applications such as spreadsheets, word processing, e-mail, Internet access, statistical data processing, and so on. A list of employees with their job categories is available. b. A hospital is interested in what types of patients make use of their emergency room facilities. It is decided to sample 10% of all patients arriving at the emergency room for the next month and record their demographic information along with type of service required, the amount of time the patient waits prior to examination, and the amount of time needed for the doctor to assess the patient’s problem.

For each of the following situations, decide what sampling method you would use. Provide an explanation of why you selected a particular method of sampling. The major state university in the state is attempting to lobby the state legislature for a bill that would allow the university to charge a higher tuition rate than the other universities in the state. To provide a justification, the university plans to conduct a mail survey of its alumni to collect information concerning their current employment status. The university grants a wide variety of different degrees and wants to make sure that information is obtained about graduates from each of the degree types. A 5% sample of alumni is considered sufficient. The Environmental Protection Agency (EPA) is required to inspect landfills in the United States for the presence of certain types of toxic material. The materials were sealed in containers and placed in the landfills. The exact location of the containers is no longer known. The EPA wants to inspect a sample of 100 containers from the 4,000 containers known to be in the landfills to determine if leakage from the containers has occurred.
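
The three selection schemes named in the sampling exercises above (simple random, stratified, cluster) can be sketched in code using the professional-society numbers: 45,000 members, nine regions, 450 hospitals. This is only an illustration under stated assumptions, not a solution from the textbook. The member IDs, the equal-sized regions of exactly 5,000 members, the 100 members per hospital, and all function names are hypothetical.

```python
import random


def simple_random_sample(ids, n, rng):
    """Draw n distinct units; every subset of size n is equally likely."""
    return rng.sample(ids, n)


def stratified_sample(strata, n_per_stratum, rng):
    """Draw a separate simple random sample inside every stratum."""
    return {name: rng.sample(units, n_per_stratum)
            for name, units in strata.items()}


def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters and keep every unit inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {c: list(clusters[c]) for c in chosen}


rng = random.Random(0)  # fixed seed only so that reruns are reproducible
member_ids = list(range(1, 45001))

# ASSUMPTION: each region is a block of exactly 5,000 consecutive IDs
# (the exercise only says "approximately 5,000 members per region").
regions = {f"region_{r + 1}": member_ids[r * 5000:(r + 1) * 5000]
           for r in range(9)}

# ASSUMPTION: 100 members per hospital; real hospital sizes are not given.
hospitals = {h: [] for h in range(450)}
for m in member_ids:
    hospitals[m % 450].append(m)

srs = simple_random_sample(member_ids, 1250, rng)   # scheme (a)
strat = stratified_sample(regions, 450, rng)        # scheme (b)
clus = cluster_sample(hospitals, 20, rng)           # scheme (c)

print(len(srs))                              # 1250
print(sum(len(v) for v in strat.values()))   # 4050 = 9 regions x 450
print(sum(len(v) for v in clus.values()))    # 2000 = 20 hospitals x 100
```

The structural difference is where the randomness enters: over individual members in (a), over members within every region in (b), and over whole hospitals in (c), after which everyone in a chosen hospital is included.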
2.14 The process engineer designed a study to evaluate the quality of plastic irrigation pipes. The study involved a total of 48 pipes; 24 pipes were randomly selected from each of the company’s two manufacturing plants. The pipes were heat-treated at one of four temperatures (175, 200, 225, 250°F). The pipes were chemically treated with one of three types of hardeners (H1, H2, H3). The deviations from the nominal compressive strength were measured at five locations on each of the pipes. Identify each of the following components of the experimental design. a. Factors b. Factor levels c. Blocks d. Experimental unit e. Measurement unit f. Replications g. Covariates h. Treatments

In the descriptions of experiments given in Exercises 2.15–2.18, identify the important features of each design. Include as many of the components listed in Exercise 2.14 as needed to adequately describe the design. 2.16 A medical study is designed to evaluate a new drug, D1, for treating a particular illness. There is a widely used treatment, D2, for this disease to which the new drug will be compared. A placebo will also be included in the study. The researcher has selected 10 hospitals for the study. She does a thorough evaluation of the hospitals and concludes that there may be aspects of the hospitals that may result in the elevation of responses at some of the hospitals. Each hospital has six wards of patients. She will randomly select six patients in each ward to participate in the study. Within each hospital, two wards are randomly assigned to administer D1, two wards to administer D2, and two wards to administer the placebo. All six patients in each of the wards will be given the same treatment. Age, BMI, blood pressure, and a measure of degree of illness are recorded for each patient upon entry into the hospital. The response is an assessment of the degree of illness after 6 days of treatment.

In the descriptions of experiments given in Exercises 2.15–2.18, identify the important features of each design. Include as many of the components listed in Exercise 2.14 as needed to adequately describe the design. 2.17 In place of the design described in Exercise 2.16, make the following change. Within each hospital, the three treatments will be randomly assigned to the patients, with two patients in each ward receiving D1, two patients receiving D2, and two patients receiving the placebo.

In the descriptions of experiments given in Exercises 2.15–2.18, identify the important features of each design. Include as many of the components listed in Exercise 2.14 as needed to adequately describe the design. 2.18 Researchers in an education department at a large state university have designed a study to compare the math abilities of students in junior high. They will also examine the impact of three types of schools (public, private nonparochial, and parochial) on the scores the students receive in a standardized math test. Two large cities in each of four geographical regions of the United States were selected for the study. In each city, one school of each of the three types was randomly selected, and a single eighth-grade class was randomly selected within each school. The scores on the test were recorded for each student in the selected classrooms. The researcher was concerned about differences in socioeconomic status among the 8 cities, so she obtained a measure of socioeconomic status for each of the students that participated in the study.

A research specialist for a large seafood company plans to investigate bacterial growth on oysters and mussels subjected to three different storage temperatures. Nine cold-storage units are available. She plans to use three storage units for each of the three temperatures. One package of oysters and one package of mussels will be stored in each of the storage units for 2 weeks.
At the end of the storage period, the packages will be removed and the bacterial count made for two samples from each package. The treatment factors of interest are temperature (levels: 0, 5, 10°C) and seafood (levels: oysters, mussels). She will also record the bacterial count for each package prior to placing seafood in the cooler. Identify each of the following components of the experimental design. Factors Factor levels Blocks Experimental unit Measurement unit Replications Treatments

In Exercises 2.20–2.22, identify whether the design is a completely randomized design, randomized complete block design, or Latin square design. If there is a factorial structure for the treatments, specify whether it has a two-factor or three-factor structure. If the measurement units are different from the experimental units, identify both. 2.20 The researchers design an experiment to evaluate the effect of applying fertilizer having varying levels of nitrogen, potassium, and phosphorus on the yields of orange trees. There were three, four, and three different levels of N, P, and K, respectively, yielding 36 distinct combinations. Ten orange groves were randomly selected for the experiment. Each grove was then divided into 36 distinct plots, and the 36 fertilizer combinations were randomly assigned to the plots within each grove. The yield of five randomly selected trees in each plot is recorded to assess the variation within each of the 360 plots.

In Exercises 2.20–2.22, identify whether the design is a completely randomized design, randomized complete block design, or Latin square design. If there is a factorial structure for the treatments, specify whether it has a two-factor or three-factor structure. If the measurement units are different from the experimental units, identify both. 2.21 A company is planning on purchasing a software program to manage its inventory. Five vendors submit bids on supplying the inventory control software.
In order to evaluate the effectiveness of the software, the company’s personnel decide to evaluate the software by running each of the five software packages at each of the company’s 10 warehouses. The number of errors produced by each of the software packages is recorded at each of the warehouses.

In Exercises 2.20–2.22, identify whether the design is a completely randomized design, randomized complete block design, or Latin square design. If there is a factorial structure for the treatments, specify whether it has a two-factor or three-factor structure. If the measurement units are different from the experimental units, identify both. 2.22 Four different glazes are applied at two different thicknesses to clay pots. The kiln used in the glazing can hold eight pots at a time, and it takes 1 day to apply the glazes. The experimenter wanted eight replications of the experiment. Since conditions in the kiln vary somewhat from day to day, the experiment was conducted over an 8-day period. The experiment is conducted so that each combination of a thickness and type of glaze is randomly assigned to one pot in the kiln each day.

A bakery wants to evaluate new recipes for carrot cake. It decides to ask a random sample of regular customers to evaluate the recipes by tasting samples of the cakes. After a customer tastes a sample of the cake, the customer will provide scores for several characteristics of the cake, and these scores are then combined into a single overall score for the recipe. Thus, from each customer, a single numeric score is recorded for each recipe. The taste-testing literature indicates that in this type of study some consumers tend to give all samples low scores and others tend to give all samples high scores. a. There are two possible experimental designs. Design A would use a random sample of 100 customers. From this group, 20 would be randomly assigned to each of the five recipes, so that each customer tastes only one recipe.
Design B would use a random sample of 100 customers with each customer tasting all five recipes, the recipes being presented in a random order for each customer. Which design would you recommend? Justify your answer. b. The manager of the bakery asked for a progress report on the experiment. The person conducting the experiment replied that one recipe tasted so bad that she eliminated it from the analysis. Is this a problem for the analysis if Design B was used? Why or why not? Would it have been a problem if Design A was used? Why or why not?

A forester wants to estimate the total number of trees on a tree farm that have a diameter exceeding 12 inches. Because the farm contains too many trees to facilitate measuring all of them, she uses Google Earth to divide the farm into 250 rectangular plots of approximately the same area. An examination of the plots reveals that 27 of the plots have a sizable portion of their land under water. The forester excluded the 27 “watery” plots from the study. She then randomly selected 42 of the remaining 223 plots and counted all the trees having a diameter exceeding 12 inches on the 42 selected plots. What is the sampling frame for this study? How does the sampling frame differ from the population of interest, if at all? What biases may exist in the estimate of the number of trees having a diameter greater than 12 inches based on the collected data?

A transportation researcher is funded to estimate the proportion of automobile tires with an unsafe tread thickness in a small northern state. The researcher randomly selects one month during each of the four seasons for taking the measurements. During each of the four selected months, the researcher randomly selects 500 cars from the list of registered cars in the state and then measures the tread thickness of the four tires on each of the selected cars. What is the population of interest? What is the sampling frame?
What biases, if any, may result from using the data from this study to obtain the estimated proportion of cars with an unsafe tread thickness?

The department of agriculture in a midwestern state wants to estimate the amount of corn produced in the state that is used to make ethanol. There are 50,000 farms in the state that produce corn. The farms are classified into four groups depending on the total number of acres planted in corn. A random sample of 500 farms is selected from each of the four groups, and the amount of corn used to generate ethanol is determined for each of the 2,000 selected farms. What is the population of interest? What is the sampling frame? What type of sampling plan is being used in this study? What biases, if any, may result from using the data from this study to obtain an estimate of the amount of corn used to produce ethanol?

A Yankelovich, Skelly, and White poll taken in the fall of 1984 showed that one-fifth of the 2,207 people surveyed admitted to having cheated on their federal income taxes. Do you think that this fraction is close to the actual proportion who cheated? Why? (Discuss the difficulties of obtaining accurate information on a question of this type.)

The U.S. government spent more than $3.6 trillion in the 2014 fiscal year. The following table provides broad categories that demonstrate the expenditures of the federal government for domestic and defense programs. Construct a pie chart for these data. Construct a bar chart for these data. Construct a pie chart and bar chart using percentages in place of dollars. Which of the four charts is more informative to the tax-paying public?

The type of vehicle the U.S. public purchases varies depending on many factors. Table 1060 from the U.S. Census Bureau, Statistical Abstract of the United States: 2012 provides the following data. The numbers reported are in thousands of units; that is, 9,300 represents 9,300,000 vehicles sold in 1990. a.
Construct a graph that would display the changes from 1990 to 2010 in the public’s choice in vehicle. b. Do you observe any trends in the type of vehicle purchased? What factors may be influencing these trends?

It has been reported that there has been a change in the type of practice physicians are selecting for their career. In particular, there is concern that there will be a shortage of family practice physicians in future years. The following table contains data on the total number of office-based physicians and the number of those physicians declaring themselves to be family practice physicians. The numbers in the table are given in thousands of physicians. (Source: U.S. Census Bureau, Statistical Abstract of the United States: 2002.) a. Use a bar chart to display the increase in the number of family practice physicians from 1990 to 2001. b. Calculate the percentage of office-based physicians who are family practice physicians and then display these data in a bar chart. c. Is there a major difference in the trend displayed by the two bar charts?

The regulations of the board of health in a particular state specify that the fluoride level must not exceed 1.5 parts per million (ppm). The 25 measurements given here represent the fluoride levels for a sample of 25 days. Although fluoride levels are measured more than once per day, these data represent the early morning readings for the 25 days sampled. Determine the range of the measurements. Dividing the range by 7, the number of subintervals selected, and rounding, we have a class interval width of .05. Using .705 as the lower limit of the first interval, construct a frequency histogram. Compute relative frequencies for each class interval and construct a relative frequency histogram. Note that the frequency and relative frequency histograms for these data have the same shape.
If one of these 25 days were selected at random, what would be the chance (probability) that the fluoride reading would be greater than .90 ppm? Guess (predict) what proportion of days in the coming year will have a fluoride reading greater than .90 ppm.
The National Highway Traffic Safety Administration has studied the use of rear-seat automobile lap and shoulder seat belts. The number of lives potentially saved with the use of lap and shoulder seat belts is shown for various percentages of use. Suggest several different ways to graph these data. Which one seems more appropriate and why?
6E
The survival times (in months) for two treatments for patients with severe chronic left-ventricular heart failure are given in the following tables. a. Construct separate relative frequency histograms for the survival times of both the therapies. b. Compare the two histograms. Does the new therapy appear to generate a longer survival time? Explain your answer.
Combine the data from the separate therapies in Exercise 3.7 into a single data set, and construct a relative frequency histogram for this combined data set. Does the plot indicate that the data are from two separate populations? Explain your answer. 3.7. The survival times (in months) for two treatments for patients with severe chronic left-ventricular heart failure are given in the following tables. Construct separate relative frequency histograms for the survival times of both the therapies. Compare the two histograms. Does the new therapy appear to generate a longer survival time? Explain your answer.
9E
The following table presents homeownership rates, in percentages, by state for the years 1985, 1996, and 2002. These values represent the proportion of homes owned by the occupant to the total number of occupied homes. Construct relative frequency histogram plots for the homeownership data given in the table for the years 1985, 1996, and 2002. What major differences exist among the plots for the three years?
Why do you think the plots have changed over these 17 years? How could Congress use the information in these plots for writing tax laws that allow major tax deductions for homeownership?
11E 12E
A supplier of high-quality audio equipment for automobiles accumulates monthly sales data on speakers and receiver-amplifier units for 5 years. The data (in thousands of units per month) are shown in the following table. Plot the sales data. Do you see any overall trend in the data? Do there seem to be any cyclic or seasonal effects?
14E
Compute the mean, median, and mode for the following data:
16E 17E 18E
A study of the reliability of buses [“Large Sample Simultaneous Confidence Intervals for the Multinomial Probabilities on Transformations of the Cell Frequencies,” Technometrics (1980) 22:588] examined the reliability of 191 buses. The distance traveled (in 1,000s of miles) prior to the first major motor failure was classified into intervals. A modified form of the table follows. a. Sketch the relative frequency histogram for the distance data and describe its shape. b. Estimate the mode, median, and mean for the distance traveled by the 191 buses. c. What does the relationship among the three measures of center indicate about the shape of the histogram for these data? d. Which of the three measures would you recommend as the most appropriate representative of the distance traveled by one of the 191 buses? Explain your answer.
20E 21E
A study of the survival times, in days, of skin grafts on burn patients was examined by Woolson and Lachenbruch [Biometrika (1980) 67:597–606]. Two of the patients left the study prior to the failure of their grafts. The survival time for these individuals is some number greater than the reported value. Survival time (days): 37, 19, 57*, 93, 16, 22, 20, 18, 63, 29, 60* (The “*” indicates that the patient left the study prior to failure of the graft; values given are for the day the patient left the study.)
Calculate the measures of center (if possible) for the 11 patients. If the survival times of the two patients who left the study were obtained, how would these new values change the values of the summary statistics calculated in (a)?
23E 24E
Refer to Exercise 3.24. Average the three group means, the three group medians, and the three group modes, and compare your results to those of part (b). Comment on your findings. 3.24. Effective tax rates (per $100) on residential property for three groups of large cities, ranked by residential property tax rate, are shown in the following table. Source: Government of the District of Columbia, Department of Finance and Revenue, Tax Rates and Tax Burdens in the District of Columbia: A Nationwide Comparison (annual). Compute the mean, median, and mode separately for the three groups. Compute the mean, median, and mode for the complete set of 30 measurements. What measure or measures best summarize the center of these distributions? Explain.
Pushing economy and wheelchair-propulsion technique were examined for eight wheelchair racers on a motorized treadmill in a paper by Goosey and Campbell [Adapted Physical Activity Quarterly (1998) 15:36–50]. The eight racers had the following years of racing experience: Racing experience (years): 6, 3, 10, 4, 4, 2, 4, 7. Verify that the mean years of experience is 5 years. Does this value appear to adequately represent the center of the data set? Verify that . Calculate the sample variance and standard deviation for the experience data. How would you interpret the value of the standard deviation relative to the sample mean?
27E 28E
The treatment times (in minutes) for patients at a health clinic are as follows: Construct the quantile plot for the treatment times for the patients at the health clinic. Find the 25th percentile for the treatment times and interpret this value. The health clinic advertises that 90% of all its patients have a treatment time of 40 minutes or less.
Do the data support this claim?
To assist in estimating the amount of lumber in a tract of timber, an owner decided to count the number of trees with diameters exceeding 12 inches in randomly selected 50 × 50-foot squares. Seventy 50 × 50 squares were randomly selected from the tract and the number of trees (with diameters in excess of 12 inches) was counted for each. The data are as follows: a. Construct a relative frequency histogram to describe these data. b. Calculate the sample mean $\bar{y}$ as an estimate of $\mu$, the mean number of timber trees with diameter exceeding 12 inches for all 50 × 50 squares in the tract. c. Calculate s for the data. Construct the intervals $(\bar{y} \pm s)$, $(\bar{y} \pm 2s)$, and $(\bar{y} \pm 3s)$. Count the percentages of squares falling in each of the three intervals, and compare these percentages with the corresponding percentages given by the Empirical Rule.
Consumer Reports in its June 1998 issue reports on the typical daily room rate at six luxury and nine budget hotels. The room rates are given in the following table. Compute the mean and standard deviation of the room rates for both luxury and budget hotels. Verify that luxury hotels have a more variable room rate than budget hotels. Give a practical reason why the luxury hotels are more variable than the budget hotels. Might another measure of variability be better to compare luxury and budget hotel rates? Explain.
Many marine phanerogam species are highly sensitive to changes in environmental conditions. In the article “Posidonia oceanica: A Biological Indicator of Past and Present Mercury Contamination in the Mediterranean Sea” [Marine Environmental Research, March 1998 45:101–111], the researchers report the mercury concentrations over a period of about 20 years at several locations in the Mediterranean Sea. Samples of Posidonia oceanica were collected by scuba diving at a depth of 10 meters. For each site, 45 orthotropic shoots were sampled and the mercury concentration was determined.
The average mercury concentration is recorded in the following table for each of the sampled years. a. Generate a time-series plot of the mercury concentrations and place lines for both sites on the same graph. Comment on any trends in the lines across the years of data. Are the trends similar for both sites? b. Select the most appropriate measure of center for the mercury concentrations. Compare the centers for the two sites. c. Compare the variabilities of the mercury concentrations at the two sites. Use the CV in your comparison, and explain why it is more appropriate than using the standard deviations. d. When comparing the centers and variabilities of the two sites, should the years 1969–1972 be used for site 2?
33E
The following data are the resting pulse rates for 30 randomly selected individuals who were participants at a 10K race. Construct a stem-and-leaf plot of the pulse rates. Construct a boxplot of the pulse rates. Describe the shape of the distribution of the pulse rates. The boxplot provides information about the distribution of pulse rates for what population?
Consumer Reports in its May 1998 issue provides cost per daily feeding for 28 brands of dry dog food and 23 brands of canned dog food. Using the Minitab computer program, the following side-by-side boxplot for these data was created. From these graphs, determine the median, lower quartile, and upper quartile for the daily costs of both dry and canned dog food. Comment on the similarities and differences in the distributions of daily costs for the two types of dog food.
36E 37E 38E
In the paper “Demographic Implications of Socioeconomic Transition Among the Tribal Populations of Manipur, India” [Human Biology (1998) 70(3):597–619], the authors describe the tremendous changes that have taken place in all the tribal populations of Manipur, India, since the beginning of the twentieth century.
The tribal populations of Manipur are in the process of socioeconomic transition from a traditional subsistence economy to a market-oriented economy. The following table displays the relation between literacy level and subsistence group for a sample of 614 married men and women in Manipur, India. Graphically depict the data in the table using a stacked bar graph. Do a percentage comparison based on the row and column totals. What conclusions do you reach with respect to the relation between literacy and subsistence group?
40E 41E 42E 43SE 44SE 45SE 46SE 47SE 48SE
A random sample of 90 standard metropolitan statistical areas (SMSAs) was studied to obtain information on murder rates. The murder rates (number of murders per 100,000 people) were recorded, and these data are summarized in the following frequency table. Construct a relative frequency histogram for these data.
50SE 51SE 52SE 53SE
The Insurance Institute for Highway Safety published data on the total damage suffered by compact automobiles in a series of controlled, low-speed collisions. The data, in dollars, with brand names removed are as follows: a. Draw a histogram of the data using six or seven categories. b. On the basis of the histogram, what would you guess the mean to be? c. Calculate the median and mean. d. What does the relation between the mean and median indicate about the shape of the data?
55SE 56SE
Federal authorities have destroyed considerable amounts of wild and cultivated marijuana plants. The following table shows the number of plants destroyed and the number of arrests for a 12-month period for 15 states. Discuss the appropriateness of using the sample mean to describe these two variables. Compute the sample mean, 10% trimmed mean, and 20% trimmed mean. Which trimmed mean seems more appropriate for each variable? Why? Does there appear to be a relation between the number of plants destroyed and the number of arrests? How might you examine this question?
What other variable(s) might be related to the number of plants destroyed?
The most widely reported index of the performance of the New York Stock Exchange (NYSE) is the Dow Jones Industrial Average (DJIA). This index is computed from the stock prices of 30 companies. When the DJIA was invented in 1896, the index was the average price of 12 stocks. The index was modified over the years as new companies were added and dropped from the index and was also altered to reflect when a company splits its stock. The closing New York Stock Exchange (NYSE) prices for the 30 components (as of June 19, 2014) of the DJIA are given in the following table. a. Compute the average price of the 30 stock prices in the DJIA. b. The DJIA is no longer an average; the name includes the word average only for historical reasons. The index is computed by summing the stock prices and dividing by a constant, which is changed when stocks are added or removed from the index and when stocks split: $\text{DJIA} = \sum_{i=1}^{30} y_i / C$, where $y_i$ is the closing price for stock i and C = 0.155625. Using the stock prices given, compute the DJIA for June 19, 2014. c. The DJIA is a summary of data. Does the DJIA provide information about a population using sampled data? If so, to what population? Is the sample a random sample?
As one part of a review of middle-manager selection procedures, a study was made of the relation between the hiring source (promoted from within, hired from related business, hired from unrelated business) and the 3-year job history (additional promotion, same position, resigned, dismissed). The data for 120 middle managers follow. a. Compute the job-history percentages within each of the three sources. b. Describe the relation between job history and source. c. Use an appropriate graph to display the relation between job history and source.
60SE 61SE 62SE
The correlations computed for the six variables in the epilepsy study are given here.
Do the sizes of the correlation coefficients reflect the relationships displayed in the graphs given in Exercise 3.62? Explain your answer. 3.62. Refer to the epilepsy study data in Table 3.19. Examine the scatterplots of Y1, Y2, Y3, and Y4 versus baseline count and age given here. a. Does there appear to be a difference in the relationship between the seizure count (Y1–Y4) and either the baseline count or age when considering the two groups (treatment and placebo)? b. Describe the type of apparent differences, if any, that you found in part (a). Seizure counts versus age and baseline counts
64SE 65SE 66SE 67SE 68SE 69SE 70SE 71SE
Refer to the data in Exercise 3.69. Construct a quantile plot of the number of syphilis cases. From the quantile plot, determine the 90th percentile for the number of syphilis cases. Identify the states in which the number of syphilis cases is above the 90th percentile. 3.69. Certain types of diseases tend to occur in clusters. In particular, persons affected with AIDS, syphilis, and tuberculosis may have some common characteristics and associations that increase their chances of contracting these diseases. The following table lists the number of reported cases by state in 2001. Construct a scatterplot of the number of AIDS cases versus the number of syphilis cases. Compute the correlation between the number of AIDS cases and the number of syphilis cases. Does the value of the correlation coefficient reflect the degree of association shown in the scatterplot? Why do you think there may be a correlation between these two diseases?
73SE 74SE 75SE 76SE 77SE 78SE 79SE 80SE
Indicate which interpretation of the probability statement seems most appropriate. a. A casino in New Jersey posts a probability of .02 that the Dallas Cowboys will win Super Bowl L. b. A purchaser of a single ticket in the Texas Powerball has a probability of 1/175,223,510 of winning the big payout. c.
The quality control engineer of a large pharmaceutical firm conducts an intensive process reliability study. Based on the findings of the study, the engineer claims that the probability that a bottle of a newly produced drug will have a shelf life greater than 2 years is .952. d. The probability that the control computer on a nuclear power plant and its backup will both fail is .00001. e. The state meteorologist of Michigan reports that there is a 70/30 chance that the rainfall during the months of June through August in 2014 will be below normal; that is, there is a .70 probability of the rainfall being below normal and a .30 probability of the rainfall being above normal. f. A miniature tablet that is small enough to be worn as a watch is in beta testing. In a preliminary report, the company states that more than 55% of the 500 testers found the device to be easier to use than a full-sized tablet. The probability of this happening is .011 provided there is no difference in ease of use of the two devices.
If you are having a stroke, it is critical that you get medical attention right away. Immediate treatment may minimize the long-term effects of a stroke and even prevent death. A major U.S. city reported that there was a 1 in 250 chance of the patient not having long-term memory problems after suffering a stroke. That is, for a person suffering a stroke in the city, P(no memory problems) = 1/250 = .004. This very high chance of memory problems was attributed to many factors associated with large cities that affected response times, such as heavy traffic, the misidentification of addresses, and the use of cell phones, which results in emergency personnel not being able to obtain an address. The study documented the 1/250 probability based on a study of 15,000 requests for assistance by stroke victims. a. Provide a relative frequency interpretation of the .004 probability. b.
The value .004 was based on the records of 15,000 requests for assistance from stroke victims. How many of the 15,000 victims in the study had long-term memory problems? Explain your answer.
In reporting highway safety, the National Highway Traffic Safety Administration (NHTSA) reports the number of deaths in automobile accidents each year. If there is a decrease in the number of traffic deaths from the previous year, NHTSA claims that the chance of a death on the highways has decreased. Explain the flaw in NHTSA's claim.
In a cable TV program concerning the risk of travel accidents, it was stated that the chance of a fatal airplane crash was 1 in 11 million. An explanation of this risk was that you could fly daily for the next 11 million days (30,137 years) before you would experience a fatal crash. Provide an explanation why this statement is misleading.
The gaming commission in its annual examination of the casinos in the state reported that all roulette wheels were fair. Explain the meaning of the term fair with respect to the roulette wheel.
The state vehicle inspection bureau provided the following information on the percentage of cars that fail an annual vehicle inspection due to having faulty lights: 15% of all cars have one faulty light, 10% have two faulty lights, and 5% have three or more faulty lights. a. What is the probability that a randomly selected car will have no faulty lights? b. What is the probability that a randomly selected car will have at most one faulty light? c. What is the probability that a randomly selected car will fail an inspection due to a faulty light?
The Texas Lottery has a game, Daily 4, in which a player pays $1 to select four single-digit numbers. Each week the Lottery commission places a set of 10 balls numbered 0–9 in each of four containers. After the balls are thoroughly mixed, one ball is selected from each of the four containers. The winner is the player who matches all four numbers. a.
What is the probability of being the winning player if you purchase a single set of four numbers? b. Which of the probability approaches (subjective, classical, or relative frequency) did you employ in obtaining your answer in part (a)?
A die is rolled two times. Provide a list of the possible outcomes of the two rolls in this form: the result from the first roll and the result from the second roll.
Refer to Exercise 4.10. Assume that the die is a fair die, that is, each of the outcomes has a probability of 1/36. What is the probability of observing a. Event A: Exactly one dot appears on each of the two upturned faces? b. Event B: The sum of the dots on the two upturned faces is exactly 4? c. Event C: The sum of the dots on the two upturned faces is at most 4? 4.10. A die is rolled two times. Provide a list of the possible outcomes of the two rolls in this form: the result from the first roll and the result from the second roll.
Refer to Exercise 4.11. a. Describe the event that is the complement of event A. b. Compute the complement of event A. 4.11. Refer to Exercise 4.10. Assume that the die is a fair die, that is, each of the outcomes has a probability of 1/36. What is the probability of observing a. Event A: Exactly one dot appears on each of the two upturned faces? b. Event B: The sum of the dots on the two upturned faces is exactly 4? c. Event C: The sum of the dots on the two upturned faces is at most 4?
Refer to Exercise 4.11. a. Are events A and B mutually exclusive? b. Are events A and C mutually exclusive? c. Are events B and C mutually exclusive? 4.11. Refer to Exercise 4.10. Assume that the die is a fair die, that is, each of the outcomes has a probability of 1/36. What is the probability of observing a. Event A: Exactly one dot appears on each of the two upturned faces? b. Event B: The sum of the dots on the two upturned faces is exactly 4? c.
Event C: The sum of the dots on the two upturned faces is at most 4?
A credit union takes a sample of four mortgages each month to survey the homeowners' satisfaction with the credit union's servicing of their mortgage. Each mortgage is classified as a fixed rate (F) or variable rate (V). a. What are the 16 possible combinations of the four mortgages? Hint: One such possibility would be F1V2V3F4. b. List the combinations in event A: At least three of the mortgages are variable rate. c. List the combinations in event B: All four mortgages are the same type. d. List the combinations in event C: The union of events A and B. e. List the combinations in event D: The intersection of events A and B.
A nuclear power plant has double redundancy on the feedwater pumps used to remove heat from the reactor core. A safely operating plant requires only one of the three pumps to be functional. Define the events A, B, and C as follows: A: Pump 1 works properly B: Pump 2 works properly C: Pump 3 works properly Describe in words the following events: a. The intersection of A, B, and C b. The union of A, B, and C c. The complement of the intersection of A, B, and C d. The complement of the union of A, B, and C
The population distribution in the United States based on race/ethnicity and blood type as reported by the American Red Cross is given here. a. A volunteer blood donor walks into a Red Cross blood donation center. What is the probability she will be Asian and have Type O blood? b. What is the probability that a white donor will not have Type A blood? c. What is the probability that an Asian donor will have either Type A or Type B blood? d. What is the probability that a donor will have neither Type A nor Type AB blood?
The makers of the candy M&M's report that their plain M&M's are composed of 15% yellow, 10% red, 20% orange, 25% blue, 15% green, and 15% brown. If you randomly select an M&M, what is the probability of the following? a. It is brown. b. It is red or green. c.
It is not blue. d. It is both red and brown.
Refer to Exercise 4.11. Compute the following probabilities: a. P(A|B) b. P(A|C) c. P(B|C) 4.11. Refer to Exercise 4.10. Assume that the die is a fair die, that is, each of the outcomes has a probability of 1/36. What is the probability of observing a. Event A: Exactly one dot appears on each of the two upturned faces? b. Event B: The sum of the dots on the two upturned faces is exactly 4? c. Event C: The sum of the dots on the two upturned faces is at most 4?
19E 20E
Refer to Exercise 4.16. Let W be the event that the donor is white, B be the event that the donor is black, and A be the event that the donor is Asian. Also, let T1 be the event that the donor has blood type O, T2 be the event that the donor has blood type A, T3 be the event that the donor has blood type B, and T4 be the event that the donor has blood type AB. a. Describe in words the event T1|W. b. Compute the probability of the occurrence of the event T1|W, P(T1|W). c. Are the events W and T1 independent? Justify your answer. d. Are the events W and T1 mutually exclusive? Explain your answer. 4.16. The population distribution in the United States based on race/ethnicity and blood type as reported by the American Red Cross is given here. a. A volunteer blood donor walks into a Red Cross blood donation center. What is the probability she will be Asian and have Type O blood? b. What is the probability that a white donor will not have Type A blood? c. What is the probability that an Asian donor will have either Type A or Type B blood? d. What is the probability that a donor will have neither Type A nor Type AB blood?
Is it possible for events A and B to be both mutually exclusive and independent? Justify your answer.
A survey of 1,000 U.S. government employees who have an advanced college degree produced the following responses to the offering of a promotion to a higher grade position that would involve moving to a new location.
Use the results of the survey to estimate the following probabilities. a. What is the probability that a randomly selected government employee having an advanced college degree would accept a promotion? b. What is the probability that a randomly selected government employee having an advanced college degree would not accept a promotion? c. What is the probability that a randomly selected government employee having an advanced college degree has a spouse with a professional position?
Refer to Exercise 4.23. Define the following events. Event A: A randomly selected government employee having an advanced college degree would accept a promotion. Event B: A randomly selected government employee having an advanced college degree has a spouse in a professional career. Event C: A randomly selected government employee having an advanced college degree has a spouse without a professional position. Event D: A randomly selected government employee having an advanced college degree is unmarried. Use the results of the survey in Exercise 4.23 to compute the following probabilities: a. P(A) b. P(B) c. P(A|C) d. P(A|D) 4.23. A survey of 1,000 U.S. government employees who have an advanced college degree produced the following responses to the offering of a promotion to a higher grade position that would involve moving to a new location. Use the results of the survey to estimate the following probabilities. a. What is the probability that a randomly selected government employee having an advanced college degree would accept a promotion? b. What is the probability that a randomly selected government employee having an advanced college degree would not accept a promotion? c.
What is the probability that a randomly selected government employee having an advanced college degree has a spouse with a professional position?
25E
A large corporation has spent considerable time developing employee performance rating scales to evaluate an employee's job performance on a regular basis so major adjustments can be made when needed and employees who should be considered for a fast track can be isolated. Keys to this latter determination are ratings on the ability of an employee to perform to his or her capabilities and on his or her formal training for the job. The probabilities for being placed on a fast track are as indicated for the 12 categories of workload capacity and formal training. The following three events (A, B, and C) are defined: A: An employee works at the high-capacity level B: An employee falls into the highest (extensive) formal training category C: An employee has little or no formal training and works below high capacity a. Find P(A), P(B), and P(C). b. Find P(A|B), P(BB), and P(BC). c. Find P(A B), P(A C), and P(B C).
The utility company in a large metropolitan area finds that 70% of its customers pay a given monthly bill in full. a. Suppose two customers are chosen at random from the list of all customers. What is the probability that both customers will pay their monthly bill in full? b. What is the probability that at least one of them will pay in full?
28E
Of a finance company's loans, 1% are defaulted (not completely repaid). The company routinely runs credit checks on all loan applicants. It finds that 30% of defaulted loans went to poor risks, 40% to fair risks, and 30% to good risks. Of the nondefaulted loans, 10% went to poor risks, 40% to fair risks, and 50% to good risks. Use Bayes' Formula to calculate the probability that a poor-risk loan will be defaulted.
Refer to Exercise 4.29. Show that the posterior probability of default, given a fair risk, equals the prior probability of default.
Explain why this is a reasonable result. 4.29. Of a finance company's loans, 1% are defaulted (not completely repaid). The company routinely runs credit checks on all loan applicants. It finds that 30% of defaulted loans went to poor risks, 40% to fair risks, and 30% to good risks. Of the nondefaulted loans, 10% went to poor risks, 40% to fair risks, and 50% to good risks. Use Bayes' Formula to calculate the probability that a poor-risk loan will be defaulted.
31E
In Example 4.4, compute the probability that the test incorrectly identifies the defects D1, D2, and D3; that is, compute $P(\bar{D}_1 \mid A_1)$, $P(\bar{D}_2 \mid A_2)$, and $P(\bar{D}_3 \mid A_3)$. EXAMPLE 4.4 In the manufacture of circuit boards, there are three major types of defective boards. The types of defects, along with the percentage of all circuit boards having these defects, are (1) improper electrode coverage (D1), 2.8%; (2) plating separation (D2), 1.2%; and (3) etching problems (D3), 3.2%. A circuit board will contain at most one of the three defects. Defects can be detected with certainty using destructive testing of the finished circuit boards; however, this is not a very practical method for inspecting a large percentage of the circuit boards. A nondestructive inspection procedure has been developed that has the following outcomes: A1, which indicates the board has only defect D1; A2, which indicates the board has only defect D2; A3, which indicates the board has only defect D3; and A4, which indicates the board has no defects. The respective likelihoods for the four outcomes of the nondestructive test determined by evaluating a large number of boards known to have exactly one of the three types of defects are given in Table 4.5.
33E
In a January 15, 1998, article, the New England Journal of Medicine (338:141–146) reported on the utility of using computerized tomography (CT) as a diagnostic test for patients with clinically suspected appendicitis. In at least 20% of patients with appendicitis, the correct diagnosis was not made.
On the other hand, the appendix was normal in 15% to 40% of patients who underwent emergency appendectomy. A study was designed to determine the prospective effectiveness of using CT as a diagnostic test to improve the treatment of these patients. The study examined 100 consecutive patients suspected of having acute appendicitis who presented to the emergency department or were referred there from a physician's office. The 100 patients underwent a CT scan, and the surgeon made an assessment of the presence of appendicitis for each of the patients. The final clinical outcomes were determined at surgery and by pathological examination of the appendix after appendectomy or by clinical follow-up at least 2 months after CT scanning. The 1996 rate of occurrence of appendicitis was approximately P(C) = .00108. a. Find the sensitivity and specificity of the radiological determination of appendicitis. b. Find the probability that a patient truly had appendicitis given that the radiological determination was definitely appendicitis (DA). c. Find the probability that a patient truly did not have appendicitis given that the radiological determination was definitely appendicitis (DA). d. Find the probability that a patient truly did not have appendicitis given that the radiological determination was definitely not appendicitis (DNA).
35E
Classify each of the following random variables as either continuous or discrete: a. The survival time of a cancer patient after receiving a new treatment for cancer b. The number of ticks found on a cow entering an inspection station c. The average rainfall during August in College Station, Texas d. The daily dose level of medication prescribed to a patient having an iron deficiency e. The number of touchdowns thrown during an NFL game f. The number of monthly shutdowns of the sewage treatment plant in a large midwestern city
37E
Texting while driving is a very dangerous practice.
An electronic monitoring device is installed on rental cars at a randomly selected rental franchise. a. Is the number of times a randomly selected driver sends a text message during the first hour after leaving the rental companys parking lot a discrete or continuous random variable? b. Is the length of time the driver spends typing a text message while driving a discrete or continuous random variable? c. Is the brand of cell phone from which the text message is sent a discrete or continuous random variable?39EThe numbers of cars failing an emissions test on randomly selected days at a state inspection station are given in the following table. a. Construct a graph of P(y). b. Compute P(y 2). c. Compute P(y 7). d. Compute P(2 y 7).A traditional call center has a simple mission: Agents have to answer customer calls fast and end them as quickly as possible to move on to the next call. The quality of service rendered by the call center was evaluated by recording the number of times a customer called the center back within a week of his or her initial call to the center. a. What is the probability that a customer will recall the center more than three times? b. What is the probability that a customer will recall the center at least two times but less than five times? c. Suppose a call center must notify a supervisor if a customer recalls the center more than four times within a week of his or her initial call. What proportion of customers who contact the call center will require a supervisor to be contacted?A biologist randomly selects 10 portions of water, each equal to .1 cm3 in volume, from the local reservoir and counts the number of bacteria present in each portion. The biologist then totals the number of bacteria for the 10 portions to obtain an estimate of the number of bacteria per cubic centimeter present in the reservoir water. Is this a binomial experiment?Examine the accompanying newspaper clipping. 
Does this sampling appear to satisfy the characteristics of a binomial experiment?A survey is conducted to estimate the percentage of pine trees in a forest that are infected by the pine shoot moth. A grid is placed over a map of the forest, dividing the area into 25-foot by 25-foot square sections. One hundred of the squares are randomly selected, and the number of infected trees is recorded for each square. Is this a binomial experiment?In an attempt to decrease drunk driving, police set up vehicle checkpoints during the July 4 evening. The police randomly select vehicles to be stopped for informational checks. On a particular roadway, assume that 20% of all drivers have a blood alcohol level above the legal limit. For a random sample of 15 vehicles, compute the following probabilities: a. All 15 drivers will have a blood alcohol level exceeding the legal limit. b. Exactly 6 of the 15 drivers will exceed the legal limit. c. Of the 15 drivers, 6 or more will exceed the legal limit. d. All 15 drivers will have a blood alcohol level within the legal limit.The quality control department examines all the products returned to a store by customers. An examination of the returned products yields the following assessment: 5% are defective and not repairable, 45% are defective but repairable, 35% have small surface scratches but are functioning properly, and 15% have no problems. Compute the following probabilities for a random sample of 20 returned products: a. All of the 20 returned products have some type of problem. b. Exactly 6 of the 20 returned products are defective and not repairable. c. Of the 20 returned products, 6 or more are defective and not functioning properly. d. None of the 20 returned products has any sort of defect.47EThe CFO of a hospital is concerned about the risk of patients contracting an infection after a one-week or longer stay in the hospital. 
A long-term study estimates that the chance of contracting an infection after a one-week or longer stay in a hospital is 10%. A random sample of 50 patients who have been in the hospital at least 1 week is selected. a. If the 10% infection rate is correct, what is the probability that at least 5 patients out of the 50 will have an infection? b. What assumptions are you making in computing the probability in part (a)?Suppose the random variable y has a Poisson distribution. Compute the following probabilities: a. P(y 4) given 2 b. P(y 4) given 3.5 c. P(y 4) given 2 d. P(1 y 4) given 2Customers arrive at a grocery store checkout at a rate of six per 30 minutes during the hours of 5 p.m. and 7 p.m. during the workweek. Let C be the number of customers arriving at the checkout during any 30-minute period of time. The management of the store wants to determine the frequency of the following events. Compute the probabilities of these events: a. No customers arrive. b. More than six customers arrive. c. At most three customers arrive.A firm is considering using the Internet to supplement its traditional sales methods. Using data from an industry association, the firm estimates that 1 of every 1,000 Internet hits results in a sale. Suppose the firm has 2,500 hits per day. a. What is the probability that the firm will have more than five sales in a randomly selected day? b. What conditions must be satisfied in order for you to make the calculation in part (a)? c. Use the Poisson approximation to compute the probability that the firm will have more than five sales in a randomly selected day. d. Is the Poisson approximation accurate?52E53E54E55E56E57E58E59E60EIn Exercises 4.57 through 4.63, let z be a random variable with a standard normal distribution. 4.61 Find the value of z, denoted z0, such that P(z z0) = .0091.62E63E64ELet y be a random variable having a normal distribution with a mean equal to 250 and a standard deviation equal to 50. 
Find the following probabilities: a. P(y 250) b. P(y 150) c. P(150 y 350) d. Find k such that P(250 k y 250 + k) = .60Suppose that y is a random variable having a normal distribution with a mean equal to 250 and a standard deviation equal to 10. a. Show that the event y 260 has the same probability as z 1. b. Convert the event y 230 to the z-score equivalent. c. Find P(y 260) and P(y 230). d. Find P(y 265), P(y 242), and P(242 y 265).Suppose that z is a random variable having a standard normal distribution. a. Find a value z0, such that P(z z0) = .01. b. Find a value z0, such that P(z z0) = .025. c. Find a value z0, such that P(z0 z z0) = .95.68ERecords maintained by the office of budget in a particular state indicate that the amount of time elapsed between the submission of travel vouchers and the final reimbursement of funds has approximately a normal distribution with a mean of 36 days and a standard deviation of 3 days. a. What is the probability that the elapsed time between submission and reimbursement will exceed 30 days? b. If you had a travel voucher submitted more than 55 days ago, what might you conclude?The College Boards, which are administered each year to many thousands of high school students, are scored so as to yield a mean of 513 and a standard deviation of 130. These scores are close to being normally distributed. What percentage of the scores can be expected to satisfy each of the following conditions? a. Greater than 600 b. Greater than 700 c. Less than 450 d. Between 450 and 600Monthly sales figures for a particular food industry tend to be normally distributed with a mean of 155 (thousand dollars) and a standard deviation of 45 (thousand dollars). Compute the following probabilities: a. P(y 200) b. P(y 100) c. P(100 y 200)Refer to Exercise 4.70. An honor society wishes to invite those scoring in the top 5% on the College Boards to join their society. a. What score is required to be invited to join the society? b. 
What score separates the top 75% of the population from the bottom 25%? What do we call this value? 4.70 The College Boards, which are administered each year to many thousands of high school students, are scored so as to yield a mean of 513 and a standard deviation of 130. These scores are close to being normally distributed. What percentage of the scores can be expected to satisfy each of the following conditions? a. Greater than 600 b. Greater than 700 c. Less than 450 d. Between 450 and 60073E74EA psychologist is interested in studying women who are in the process of obtaining a divorce to determine whether the women experienced significant attitudinal changes after the divorce has been finalized. Existing records from the geographic area in question show that 798 couples have recently filed for divorce. Assume that a sample of 25 women is needed for the study, and use Table 12 in the Appendix to determine which women should be asked to participate in the study. (Hint: Begin in column 2, row 1, and proceed down.)76EA random sample of 16 measurements is drawn from a population with a mean of 60 and a standard deviation of 5. Describe the sampling distribution of , the sample mean. Within what interval would you expect to lie approximately 95% of the time? 78EPsychomotor retardation scores for a particular group of manic-depressive patients have approximately a normal distribution with a mean of 930 and a standard deviation of 130. A random sample of 20 patients from the group was selected, and their mean psychomotor retardation score was obtained. a. What is the probability that their mean score was between 900 and 960? b. What is the probability that their mean score was greater than 960? c. What is the 90th percentile of their mean scores?Federal resources have been tentatively approved for the construction of an outpatient clinic. 
In order to design a facility that will handle patient load requirements and stay within a limited budget, the designers studied patient demand. From studying a similar facility in the area, they found that the distribution of the number of patients requiring hospitalization during a week could be approximated by a normal distribution with a mean of 125 and a standard deviation of 32. a. Use the Empirical Rule to describe the distribution of y, the number of patients requesting service in a week. b. If the facility was built with a 160-patient capacity, what fraction of the weeks might the clinic be unable to handle the demand?Refer to Exercise 4.80. What size facility should be built so the probability of the patient loads exceeding the clinic capacity is .10? .30? 4.81 Federal resources have been tentatively approved for the construction of an outpatient clinic. In order to design a facility that will handle patient load requirements and stay within a limited budget, the designers studied patient demand. From studying a similar facility in the area, they found that the distribution of the number of patients requiring hospitalization during a week could be approximated by a normal distribution with a mean of 125 and a standard deviation of 32. a. Use the Empirical Rule to describe the distribution of y, the number of patients requesting service in a week. b. If the facility was built with a 160-patient capacity, what fraction of the weeks might the clinic be unable to handle the demand?Based on the 1990 census, the number of hours per day adults spend watching television is approximately normally distributed with a mean of 5 hours and a standard deviation of 1.3 hours. a. What proportion of the population spends more than 7 hours per day watching television? b. In a 1998 study of television viewing, a random sample of 500 adults reported that the average number of hours spent viewing television was greater than 5.5 hours per day. 
Do the results of this survey appear to be consistent with the 1990 census? (Hint: If the census results are still correct, what is the probability that the average viewing time would exceed 5.5 hours?)The level of a particular pollutant, nitrogen oxide, in the exhaust of a hypothetical model of car, the Polluter, when driven in city traffic has approximately a normal distribution with a mean level of 2.1 grams per mile (g/m) and a standard deviation of 0.3 g/m. a. If the EPA mandates that a nitrogen oxide level of 2.7 g/m cannot be exceeded, what proportion of Polluters would be in violation of the mandate? b. At most, 25% of Polluters exceed what nitrogen oxide level value (that is, find the 75th percentile)? c. The company producing the Polluter must reduce the nitrogen oxide level so that at most 5% of its cars exceed the EPA level of 2.7 g/m. If the standard deviation remains 0.3 g/m, to what value must the mean level be reduced so that at most 5% of Polluters would exceed 2.7 g/m?Refer to Exercise 4.83. A company has a fleet of 150 Polluters used by its sales staff. Describe the distribution of the total amount, in g/m, of nitrogen oxide produced in the exhaust of this fleet. What are the mean and standard deviation of the total amount, in g/m, of nitrogen oxide in the exhaust for the fleet? (Hint: The total amount of nitrogen oxide can be represented as i=1150Wi, where Wi is the amount of nitrogen oxide in the exhaust of the ith car. Thus, the Central Limit Theorem for sums is applicable.) 4.83 The level of a particular pollutant, nitrogen oxide, in the exhaust of a hypothetical model of car, the Polluter, when driven in city traffic has approximately a normal distribution with a mean level of 2.1 grams per mile (g/m) and a standard deviation of 0.3 g/m. a. If the EPA mandates that a nitrogen oxide level of 2.7 g/m cannot be exceeded, what proportion of Polluters would be in violation of the mandate? b. 
At most, 25% of Polluters exceed what nitrogen oxide level value (that is, find the 75th percentile)? c. The company producing the Polluter must reduce the nitrogen oxide level so that at most 5% of its cars exceed the EPA level of 2.7 g/m. If the standard deviation remains 0.3 g/m, to what value must the mean level be reduced so that at most 5% of Polluters would exceed 2.7 g/m?
The baggage limit for an airplane is set at 100 pounds per passenger. Thus, for an airplane with 200 passenger seats, there would be a limit of 20,000 pounds. The weight of the baggage of an individual passenger is a random variable with a mean of 95 pounds and a standard deviation of 35 pounds. If all 200 seats are sold for a particular flight, what is the probability that the total weight of the passengers' baggage will exceed the 20,000-pound limit?
A patient visits her doctor with concerns about her blood pressure. If the systolic blood pressure exceeds 150, the patient is considered to have high blood pressure, and medication may be prescribed. The problem is that there is considerable variation in a patient's systolic blood pressure readings during a given day. a. If a patient's systolic readings during a given day have a normal distribution with a mean of 160 mm mercury and a standard deviation of 20 mm, what is the probability that a single measurement will fail to detect that the patient has high blood pressure? b. If five measurements are taken at various times during the day, what is the probability that the average blood pressure reading will be less than 150 and hence fail to indicate that the patient has a high blood pressure problem? c. How many measurements would be required so that the probability of failing to detect that the patient has high blood pressure is at most 1%?
Critical key-entry errors in the data processing operation of a large district bank occur approximately .1% of the time. If a random sample of 10,000 entries is examined, determine the following: a.
The expected number of errors b. The probability of observing fewer than four errors c. The probability of observing more than two errors88ELet y be a binomial random variable with n = 10 and = .5. a. Calculate P(4 y 6). b. Use a normal approximation without the continuity correction to calculate the same probability. Compare your results. How well did the normal approximation work?90EA marketing research firm advises a new client that approximately 15% of all persons sent a sweepstakes offer will return the mailing. Suppose the client sends out 10,000 sweepstakes offers. a. What is the probability that fewer than 1,430 of the mailings will be returned? b. What is the probability that more than 1,600 of the mailings will be returned?92ESuppose a population consists of the 10 measurements (2, 3, 6, 8, 9, 12, 25, 29, 39, 50). Generate the 45 possible values for the sample mean based on a sample of n = 2 observations per sample. Use the 45 sample means to determine whether the sampling distribution of the sample mean is approximately normally distributed by constructing a boxplot, relative frequency histogram, and normal quantile plot of the 45 sample means. Compute the correlation coefficient and p-value to assess whether the 45 means appear to be sampled from a normal distribution. Do the results in part (b) confirm your conclusion from part (a)? The fracture toughness in concrete specimens is a measure of how likely it is that blocks used in new home construction may fail. A construction investigator obtains a random sample of 15 concrete blocks and determines the following toughness values: .47, .58, .67, .70, .77, .79, .81, .82, .84, .86, .91, .95, .98, 1.01, 1.04 Use a normal quantile plot to assess whether the data appear to fit a normal distribution. Compute the correlation coefficient and p-value for the normal quantile plot. Comment on the degree of fit of the data to a normal distribution. 
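For the fracture toughness exercise above, the correlation behind a normal quantile plot can be sketched in Python. This is a minimal sketch under stated assumptions, not the textbook's procedure: the plotting positions (i - 0.375)/(n + 0.25) are one common convention chosen here for illustration, the function name `normal_quantile_correlation` is hypothetical, and the p-value is left out because it needs a reference table or simulation that this listing does not provide.

```python
from math import sqrt
from statistics import NormalDist, mean

# Toughness values copied from the exercise statement.
toughness = [.47, .58, .67, .70, .77, .79, .81, .82, .84, .86,
             .91, .95, .98, 1.01, 1.04]


def normal_quantile_correlation(data):
    """Correlation between the sorted data and standard normal scores.

    Assumption: plotting positions (i - 0.375) / (n + 0.25); other
    conventions exist and give slightly different values.
    """
    ys = sorted(data)
    n = len(ys)
    std_normal = NormalDist()
    # Expected standard normal quantile for each ordered observation.
    zs = [std_normal.inv_cdf((i - 0.375) / (n + 0.25))
          for i in range(1, n + 1)]
    y_bar, z_bar = mean(ys), mean(zs)
    s_yz = sum((y - y_bar) * (z - z_bar) for y, z in zip(ys, zs))
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_zz = sum((z - z_bar) ** 2 for z in zs)
    return s_yz / sqrt(s_yy * s_zz)


r = normal_quantile_correlation(toughness)
print(f"normal quantile correlation: {r:.3f}")
```

A value of r close to 1 means the points lie near a straight line in the quantile plot. Judging how close is close enough still requires the critical values or p-value that the exercise refers to.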
One way to audit expense accounts for a large consulting firm is to sample all reports dated the last day of each month. Comment on whether such a sample constitutes a random sample.The breaking strengths for 1-foot-square samples of a particular synthetic fabric are approximately normally distributed with a mean of 2,250 pounds per square inch (psi) and a standard deviation of 10.2 psi. Find the probability of selecting a 1-foot-square sample of material at random that on testing would have a breaking strength in excess of 2,265 psi.Refer to Exercise 4.96. Suppose that a new synthetic fabric has been developed that may have a different mean breaking strength. A random sample of 15 1-foot sections is obtained, and each section is tested for breaking strength. If we assume that the population standard deviation for the new fabric is identical to that for the old fabric, describe the sampling distribution for y based on random samples of 15 1-foot sections of new fabric. The breaking strengths for 1-foot-square samples of a particular synthetic fabric are approximately normally distributed with a mean of 2,250 pounds per square inch (psi) and a standard deviation of 10.2 psi. Find the probability of selecting a 1-foot-square sample of material at random that on testing would have a breaking strength in excess of 2,265 psi.Refer to Exercise 4.97. Suppose that the mean breaking strength for the sample of 15 1-foot sections of the new synthetic fabric is 2,268 psi. What is the probability of observing a value of y equal to or greater than 2,268, assuming that the mean breaking strength for the new fabric is 2,250, the same as that for the old? Refer to Exercise 4.96. Suppose that a new synthetic fabric has been developed that may have a different mean breaking strength. A random sample of 15 1-foot sections is obtained, and each section is tested for breaking strength. 
If we assume that the population standard deviation for the new fabric is identical to that for the old fabric, describe the sampling distribution for y based on random samples of 15 1-foot sections of new fabric.99SE100SEExperts consider high serum cholesterol levels to be associated with an increased incidence of coronary heart disease. Suppose that the natural logarithm of cholesterol levels for males in a given age bracket is normally distributed with a mean of 5.35 and a standard deviation of .12. a. What percentage of the males in this age bracket could be expected to have a serum cholesterol level greater than 250 mg/ml, the upper limit of the clinical normal range? b. What percentage of the males could be expected to have serum cholesterol levels within the clinical normal range of 150-250 mg/ml? c. What percentage of the adult males in this age bracket could be expected to have a very risky cholesterol level that is, above 300 mg/ml?102SE103SE104SE105SE106SERefer to Exercise 4.106. Plot the sampling distribution of the sample median of Exercise 4.106. a. Does the sampling distribution appear to be approximately normal? b. Compute the mean of the sampling distribution of the sample median, and compare this value to the population median. 4.106 Refer to Exercise 4.104. Use the same population to find the sampling distribution for the sample median based on samples of size n = 4. 4.104 The sample mean to be calculated from a random sample of size n = 4 from a population that consists of eight measurements (2, 6, 9, 12, 25, 29, 39, 50). Find the sampling distribution of y. (Hint: There are 70 samples of size 4 when sampling from a population of eight measurements.)Random samples of size 5, 20, and 80 are drawn from a population with a mean of = 100 and a standard deviation of = 15. a. Give the mean of the sampling distribution of y for each of the three sample sizes. b. 
Give the standard deviation of the sampling distribution of y for each of the three sample sizes. c. Based on the results obtained in parts (a) and (b), what do you conclude about the accuracy of using the sample mean y as an estimate of population mean ?109SESuppose the probability that a major earthquake occurs on a given day in Fresno, California, is 1 in 10,000. a. In the next 1,000 days, what is the expected number of major earthquakes in Fresno? b. If the occurrence of major earthquakes can be modeled by the Poisson distribution, calculate the probability that there will be at least one major earthquake in Fresno during the next 1,000 days.111SEAirlines overbook (sell more tickets than there are seats) flights, based on past records that indicate that approximately 5% of all passengers fail to arrive on time for their flight. Suppose a plane will hold 250 passengers, but the airline books 260 seats. What is the probability that at least 1 passenger will be bumped from the flight?113SEAs part of a study to determine factors that may explain differences in animal species relative to their size, the following body masses (in grams) of 50 different bird species were reported in the paper Temperature and the Northern Distributions of Wintering Birds, by Richard Repasky (1991). a. Does the distribution of the body masses appear to follow a normal distribution? Provide both a graphical and a quantitative assessment. b. Repeat part (a), with the outlier 448.0 removed. c. Determine the sample mean and median with and without the value 448.0 in the data set. d. Determine the sample standard deviation and MAD with and without the value 448.0 in the data set.The county government in a city that is dominated by a large state university is concerned that a small subset of its population has been overutilized in the selection of residents to serve on county court juries. 
The county decides to determine the mean number of times that an adult resident of the county has been selected for jury duty during the past 5 years. They will then compare the mean jury participation for full-time students to that of nonstudents. Identify the populations of interest to the county officials. How might you select a sample of voters to gather this information?
In the research study on percentage of calories from fat, a. What is the population of interest? b. What dietary variables other than PCF might affect a person's health? c. What characteristics of the nurses other than dietary intake might be important in studying their health condition? d. Describe a method for randomly selecting which nurses participate in the study. e. State several hypotheses that may be of interest to the researchers.
Face masks used by firefighters often fail by having their lenses fall out when exposed to very high temperatures. A manufacturer of face masks claims that for its masks the average temperature at which pop-out occurs is 550°F. A sample of 75 masks is tested, and the average temperature at which the lenses popped out was 470°F. Based on this information is the manufacturer's claim valid? Identify the population of interest to the firefighters in this problem. Would an answer to the question posed involve estimation or hypothesis testing?
Refer to Exercise 5.3. Describe a process to select a sample of face masks from the manufacturer to evaluate the claim. 5.3 Face masks used by firefighters often fail by having their lenses fall out when exposed to very high temperatures. A manufacturer of face masks claims that for its masks the average temperature at which pop-out occurs is 550°F. A sample of 75 masks is tested, and the average temperature at which the lenses popped out was 470°F. Based on this information is the manufacturer's claim valid? a. Identify the population of interest to the firefighters in this problem. b.
Would an answer to the question posed involve estimation or hypothesis testing?
A company that manufactures coffee for use in commercial machines monitors the caffeine content in its coffee. The company selects 50 samples of coffee every hour from its production line and determines the caffeine content. From historical data, the caffeine content (in milligrams, mg) is known to have a normal distribution with σ = 7.1 mg. During a 1-hour time period, the 50 samples yielded a mean caffeine content of ȳ = 110 mg. a. Identify the population about which inferences can be made from the sample data. b. Calculate a 95% confidence interval for the mean caffeine content of the coffee produced during the hour in which the 50 samples were selected. c. Explain to the CEO of the company in nonstatistical language the interpretation of the constructed confidence interval.
Refer to Exercise 5.5. The engineer in charge of the coffee manufacturing process examines the confidence intervals for the mean caffeine content calculated over the past several weeks and is concerned that the intervals are too wide to be of any practical use. That is, they are not providing a very precise estimate of μ. a. What would happen to the width of the confidence intervals if the level of confidence of each interval is increased from 95% to 99%? b. What would happen to the width of the confidence intervals if the number of samples per hour was increased from 50 to 100? 5.5 A company that manufactures coffee for use in commercial machines monitors the caffeine content in its coffee. The company selects 50 samples of coffee every hour from its production line and determines the caffeine content. From historical data, the caffeine content (in milligrams, mg) is known to have a normal distribution with σ = 7.1 mg. During a 1-hour time period, the 50 samples yielded a mean caffeine content of ȳ = 110 mg. a. Identify the population about which inferences can be made from the sample data. b.
Calculate a 95% confidence interval for the mean caffeine content of the coffee produced during the hour in which the 50 samples were selected. c. Explain to the CEO of the company in nonstatistical language the interpretation of the constructed confidence interval.
Refer to Exercise 5.5. Because the company is sampling the coffee production process every hour, there are 720 confidence intervals for the mean caffeine content μ constructed every month. If the level of confidence remains at 95% for the 720 confidence intervals in a given month, how many of the confidence intervals would you expect to fail to contain the value of μ and hence provide an incorrect estimation of the mean caffeine content? If the number of samples is increased from 50 to 100 each hour, how many of the 95% confidence intervals would you expect to fail to contain the value of μ in a given month? If the number of samples remains at 50 each hour but the level of confidence is increased from 95% to 99% for each of the intervals, how many of the 99% confidence intervals would you expect to fail to contain the value of μ in a given month? 5.5 A company that manufactures coffee for use in commercial machines monitors the caffeine content in its coffee. The company selects 50 samples of coffee every hour from its production line and determines the caffeine content. From historical data, the caffeine content (in milligrams, mg) is known to have a normal distribution with σ = 7.1 mg. During a 1-hour time period, the 50 samples yielded a mean caffeine content of ȳ = 110 mg. Identify the population about which inferences can be made from the sample data. Calculate a 95% confidence interval for the mean caffeine content μ of the coffee produced during the hour in which the 50 samples were selected. Explain to the CEO of the company in nonstatistical language the interpretation of the constructed confidence interval.
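The interval arithmetic in the coffee exercises above can be sketched in a few lines of Python. This is an illustrative sketch rather than a worked solution from the textbook: it assumes the known-σ (z) interval applies, reads the garbled standard deviation as σ = 7.1 mg, and uses the hypothetical helper names `z_interval` and `expected_misses`.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(y_bar, sigma, n, confidence):
    """Two-sided confidence interval for a mean when sigma is known."""
    # Critical value that leaves (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / sqrt(n)
    return y_bar - half_width, y_bar + half_width


def expected_misses(num_intervals, confidence):
    """Expected number of intervals that fail to contain the true mean."""
    return num_intervals * (1 - confidence)


# Assumed inputs: sample mean 110 mg, sigma 7.1 mg, 50 samples per hour.
low, high = z_interval(y_bar=110, sigma=7.1, n=50, confidence=0.95)
print(f"95% CI: ({low:.2f}, {high:.2f})")
print(expected_misses(720, 0.95), expected_misses(720, 0.99))
```

Note that `expected_misses` takes no sample size argument: under these assumptions the long-run miss rate depends only on the confidence level, while the sample size changes the width of each interval.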
As part of the recruitment of new businesses, the city's economic development department wants to estimate the gross profit margin of small businesses (under 1 million in sales) currently residing in the city. A random sample of the previous year's annual reports of 15 small businesses shows the mean net profit margin to be 7.2% (of sales) with a standard deviation of 12.5%. a. Construct a 99% confidence interval for the mean gross profit margin of all small businesses in the city. b. The city manager reads the report and states that the confidence interval for μ constructed in part (a) is not valid because the data are obviously not normally distributed and thus the sample size is too small. Based on just knowing the mean and standard deviation of the sample of 15 businesses, do you think the city manager is valid in his conclusion about the data? Explain your answer.
A program to reduce recidivism has been in effect for two years in a large northeastern state. A sociologist investigates the effectiveness of the program by taking a random sample of 200 prison records of repeat offenders. The records were selected from the files in the courthouse of the largest city in the state. The average length of time out of prison between the first and second offenses is 2.8 years with a standard deviation of 1.3 years. Use this information to estimate the mean prison-free time between first and second offenses using a 95% confidence interval. Identify the group for which the confidence interval would be an appropriate estimate of the population mean. Would it be valid to use this confidence interval to estimate the mean prison-free time between first and second offenses for all two-time offenders in the whole state? In a large southern state?
The susceptibility of the root stocks of a variety of orange tree to a specific larva is investigated by a group of researchers. Forty orange trees are exposed to the larva and then examined by the researchers 6 months after exposure.
The number of larvae per gram is recorded on each root stock. The mean and standard deviation of the logarithm of the counts are recorded to be 9.02 and 1.12, respectively. Use the sample information to construct a 90% confidence interval on the mean of the logarithm of the larvae counts. Identify the population for which this confidence interval could be used to assess the susceptibility of the orange trees to the larva. 11EIn any given situation, if the level of confidence and the standard deviation are kept constant, how much would you need to increase the sample size to decrease the width of the interval to half its original size? A biologist wishes to estimate the effect of an antibiotic on the growth of a particular bacterium by examining the mean amount of bacteria present per plate of culture when a fixed amount of the antibiotic is applied. Previous experimentation with the antibiotic on this type of bacteria indicates that the standard deviation of the amount of bacteria present is approximately 13 cm2. Use this information to determine the number of observations (cultures that must be developed and then tested) necessary to estimate the mean amount of bacteria present, using a 99% confidence interval with a half-width of 3 cm2.Refer to Exercise 5.14. Suppose the mayors staff reviews the proposed survey and decides that in order for the survey to be taken seriously the requirements need to be increased. a. If the level of confidence is increased to 99% with the average rent estimated within 50, how many apartments need to be included in the survey? b. Suppose the budget for the survey will not support increasing the level of confidence to 99%. Provide an explanation to the mayor, who has never taken a statistics course, of the impact on the accuracy of the estimate of the average rent of not raising the level of confidence from 95% to 99%. 5.14 The housing department in a large city monitors the rent for rent-controlled apartments in the city. 
The mayor wants an estimate of the average rent. The housing department must determine the number of apartments to include in a survey in order to be able to estimate the average rent to within $100 using a 95% confidence interval. From past surveys, the monthly charge for rent-controlled apartments ranged from $1,000 to $3,500. How many renters must be included in the survey to meet the requirements?

A study is designed to test the hypotheses H0: μ ≥ 26 versus Ha: μ < 26. A random sample of 50 units was selected from a specified population, and the measurements were summarized to ȳ = 25.9 and s = 7.6. a. With α = .05, is there substantial evidence that the population mean is less than 26? b. Calculate the probability of making a Type II error if the actual value of the population mean is at most 24. c.
If the sample size is doubled to 100, what is the probability of making a Type II error if the actual value of the population mean is at most 24?

Refer to Exercise 5.16. Graph the power curve for rejecting H0: μ ≥ 26 for the following values of μ: 20, 21, 22, 23, 24, 25, and 26. Describe the change in the power as the value of μ decreases from μ0 = 26. Suppose the value of n remains at 50 but α is decreased to α = .01. Without recalculating the values of the power, superimpose on the graph for α = .05 and n = 50 the power curve for α = .01 and n = 50. Suppose the value of n is decreased to 35 but α is kept at α = .05. Without recalculating the values of the power, superimpose on the graph for α = .05 and n = 50 the power curve for α = .05 and n = 35. 5.16 A study is designed to test the hypotheses H0: μ ≥ 26 versus Ha: μ < 26. A random sample of 50 units was selected from a specified population, and the measurements were summarized to ȳ = 25.9 and s = 7.6. a. With α = .05, is there substantial evidence that the population mean is less than 26? b. Calculate the probability of making a Type II error if the actual value of the population mean is at most 24. c. If the sample size is doubled to 100, what is the probability of making a Type II error if the actual value of the population mean is at most 24?

A study was conducted of 90 adult male patients following a new treatment for congestive heart failure. One of the variables measured on the patients was the increase in exercise capacity (in minutes) over a 4-week treatment period. The previous treatment regime had produced an average increase of μ = 2 minutes. The researchers wanted to evaluate whether the new treatment had increased the value of μ in comparison to the previous treatment. The data yielded ȳ = 2.17 and s = 1.05. Using α = .05, what conclusions can you draw about the research hypothesis? What is the probability of making a Type II error if the actual value of μ is 2.1?
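
The congestive heart failure exercise gives every number needed for an upper-tail test, so its arithmetic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: with n = 90 the sample standard deviation s is substituted for σ and the standard normal distribution is used for both the test and the Type II error; the function names are hypothetical and are not taken from the textbook.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def upper_tail_z_test(ybar, s, n, mu0, alpha):
    """Large-sample z test of H0: mu <= mu0 versus Ha: mu > mu0.

    Assumption: s stands in for the unknown sigma because n is large.
    """
    se = s / sqrt(n)                          # estimated standard error of ybar
    z = (ybar - mu0) / se                     # test statistic
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)    # reject H0 when z >= z_crit
    p_value = 1 - STD_NORMAL.cdf(z)           # upper-tail p-value
    return z, z_crit, p_value


def type_ii_error(mu_a, s, n, mu0, alpha):
    """Approximate beta = P(fail to reject H0) when the true mean is mu_a."""
    se = s / sqrt(n)
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)
    return STD_NORMAL.cdf(z_crit - abs(mu0 - mu_a) / se)


# Values stated in the exercise: n = 90, ybar = 2.17, s = 1.05, mu0 = 2, alpha = .05
z, z_crit, p_value = upper_tail_z_test(ybar=2.17, s=1.05, n=90, mu0=2.0, alpha=0.05)
beta = type_ii_error(mu_a=2.1, s=1.05, n=90, mu0=2.0, alpha=0.05)

print(f"z = {z:.3f}, critical value = {z_crit:.3f}, p-value = {p_value:.4f}")
print(f"reject H0: {z >= z_crit}")
print(f"beta when mu = 2.1: {beta:.3f}")
```

Under these assumptions the statistic (about 1.54) falls short of the critical value 1.645, so H0 is not rejected at α = .05, and the Type II error probability at μ = 2.1 comes out at roughly 0.77. A t-based calculation would give slightly different figures.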
22E

A national agency sets recommended daily dietary allowances for many supplements. In particular, the allowance for zinc for males over the age of 50 years is 15 mg/day. The agency would like to determine if the dietary intake of zinc for active males is significantly higher than 15 mg/day. How many males would need to be included in the study if the agency wants to construct an α = .05 test with the probability of committing a Type II error at most .10 whenever the average zinc content is 15.3 mg/day or higher? Suppose from previous studies they estimate the standard deviation to be approximately 4 mg/day.

To evaluate the success of a 1-year experimental program designed to increase the mathematical achievement of underprivileged high school seniors, a random sample of participants in the program will be selected and their mathematics scores will be compared with the previous year’s statewide average of 525 for underprivileged seniors. The researchers want to determine whether the experimental program has increased the mean achievement level over the previous year’s statewide average. If α = .05, what sample size is needed to have a probability of Type II error of at most .025 if the actual mean is increased to 550? From previous results, σ ≈ 80.

Refer to Exercise 5.24. Suppose a random sample of 100 students is selected yielding ȳ = 542 and s = 76. Is there sufficient evidence to conclude that the mean mathematics achievement level has been increased? Explain. 5.24 To evaluate the success of a 1-year experimental program designed to increase the mathematical achievement of underprivileged high school seniors, a random sample of participants in the program will be selected and their mathematics scores will be compared with the previous year’s statewide average of 525 for underprivileged seniors. The researchers want to determine whether the experimental program has increased the mean achievement level over the previous year’s statewide average.
If α = .05, what sample size is needed to have a probability of Type II error of at most .025 if the actual mean is increased to 550? From previous results, σ ≈ 80.

The administrator of a nursing home would like to do a time-and-motion study of staff time spent per day performing nonemergency tasks. Prior to the introduction of some efficiency measures, the average number of person-hours per day spent on these tasks was μ = 16. The administrator wants to test whether the efficiency measures have reduced the value of μ. How many days must be sampled to test the proposed hypothesis if she wants a test having α = .05 and the probability of a Type II error of at most .10 when the actual value of μ is 12 hours or less (at least a 25% decrease from the number of hours spent before the efficiency measures were implemented)? Assume σ = 7.64.

The vulnerability of inshore environments to contamination due to urban and industrial expansion in Mombasa is discussed in the paper “Metals, Petroleum Hydrocarbons and Organochlorines in Inshore Sediments and Waters on Mombasa, Kenya” [Marine Pollution Bulletin (1997) 34:570–577]. A geochemical and oceanographic survey of the inshore waters of Mombasa, Kenya, was undertaken during the period from September 1995 to January 1996. In the survey, suspended particulate matter and sediment were collected from 48 stations within Mombasa’s estuarine creeks. The concentrations of major oxides and 13 trace elements were determined for a varying number of cores at each of the stations. In particular, the lead concentrations in suspended particulate matter (mg kg⁻¹ dry weight) were determined at 37 stations. The researchers were interested in determining whether the average lead concentration was greater than 30 mg kg⁻¹ dry weight. The data are given in the following table along with summary statistics and a normal probability plot. Lead concentrations (mg kg⁻¹ dry weight) from 37 stations in Kenya a.
Is there sufficient evidence (α = .05) in the data that the mean lead concentration exceeds 30 mg kg⁻¹ dry weight? b. What is the probability of a Type II error if the actual mean concentration is 50? c. Do the data appear to have a normal distribution? d. Based on your answer in (c), is the sample size large enough for the test procedures to be valid? Explain.

The R&D department of a paint company has developed an additive that it hopes will increase the ability of the company’s stain for outdoor decks to resist water absorption. The current formulation of the stain has a mean absorption rate of 35 units. Before changing the stain, a study was designed to evaluate whether the mean absorption rate of the stain with the additive was decreased from the current rate of 35 units. The stain with the additive was applied to 50 pieces of decking material. The resulting data were summarized to ȳ = 33.6 and s = 9.2. a. Is there substantial evidence (α = .01) that the additive reduces the mean absorption from its current value? b. What is the level of significance (p-value) of your test results? c. What is the probability of a Type II error if the stain with the additive in fact has a mean absorption rate of 30? d. Estimate the mean absorption using a 99% confidence interval. Is the confidence interval consistent with your conclusions from the test of hypotheses?

29E

A concern to public health officials is whether a concentration of lead in the paint of older homes may have an effect on the muscular development of young children. In order to evaluate this phenomenon, a researcher exposed 90 newly born mice to paint containing a specified amount of lead. The number of Type 2 fibers in the skeletal muscle was determined 6 weeks after exposure. The mean number of Type 2 fibers in the skeletal muscles of normal mice of this age is 21.7. The n = 90 mice yielded ȳ = 18.8, s = 15.3.
Is there significant evidence in the data to support the hypothesis that the mean number of Type 2 fibers is different from 21.7 using an α = .05 test?

31E 32E 33E

Provide the rejection region based on a t-test statistic for the following situations: a. H0: μ ≥ 28 versus Ha: μ < 28 with n = 11, α = .05 b. H0: μ ≤ 28 versus Ha: μ > 28 with n = 21, α = .025 c. H0: μ ≥ 28 versus Ha: μ < 28 with n = 8, α = .001 d. H0: μ = 28 versus Ha: μ ≠ 28 with n = 13, α = .01

A study was designed to evaluate whether the population of interest has a mean greater than 9. A random sample of n = 17 units was selected from a population, and the data yield ȳ = 10.1 and s = 3.1. Is there substantial evidence (α = .05) that the population mean is greater than 9? What is the level of significance of the test?

The ability to read rapidly and simultaneously maintain a high level of comprehension is often a determining factor in the academic success of many high school students. A school district is considering a supplemental reading program for incoming freshmen. Prior to implementing the program, the school runs a pilot program on a random sample of n = 20 students. The students were thoroughly tested to determine reading speed and reading comprehension. Based on a fixed-length standardized test reading passage, the following reading times (in minutes) and comprehension scores (based on a 100-point scale) were recorded. a. What is the population about which inferences are being made? b. Place a 95% confidence interval on the mean reading time for all incoming freshmen in the district. c. Plot the reading time using a normal probability plot or boxplot. Do the data appear to be a random sample from a population having a normal distribution? d. Provide an interpretation of the interval estimate in part (b).

Refer to Exercise 5.36.
Using the reading comprehension data, is there significant evidence that the reading program would produce for incoming freshmen a mean comprehension score greater than 80, the statewide average for comparable students during the previous year? Determine the level of significance for your test. Interpret your findings. 5.36 The ability to read rapidly and simultaneously maintain a high level of comprehension is often a determining factor in the academic success of many high school students. A school district is considering a supplemental reading program for incoming freshmen. Prior to implementing the program, the school runs a pilot program on a random sample of n = 20 students. The students were thoroughly tested to determine reading speed and reading comprehension. Based on a fixed-length standardized test reading passage, the following reading times (in minutes) and comprehension scores (based on a 100-point scale) were recorded. a. What is the population about which inferences are being made? b. Place a 95% confidence interval on the mean reading time for all incoming freshmen in the district. c. Plot the reading time using a normal probability plot or boxplot. Do the data appear to be a random sample from a population having a normal distribution? d. Provide an interpretation of the interval estimate in part (b).

Refer to Exercise 5.36. Does there appear to be a relationship between reading time and reading comprehension of the individual students? Provide a plot of the data to support your conclusion. What are some weak points in this study relative to evaluating the potential of the reading improvement program? How would you redesign the study to overcome these weak points? 5.36 The ability to read rapidly and simultaneously maintain a high level of comprehension is often a determining factor in the academic success of many high school students. A school district is considering a supplemental reading program for incoming freshmen.
Prior to implementing the program, the school runs a pilot program on a random sample of n = 20 students. The students were thoroughly tested to determine reading speed and reading comprehension. Based on a fixed-length standardized test reading passage, the following reading times (in minutes) and comprehension scores (based on a 100-point scale) were recorded. a. What is the population about which inferences are being made? b. Place a 95% confidence interval on the mean reading time for all incoming freshmen in the district. c. Plot the reading time using a normal probability plot or boxplot. Do the data appear to be a random sample from a population having a normal distribution? d. Provide an interpretation of the interval estimate in part (b).

A consumer testing agency wants to evaluate the claim made by a manufacturer of discount tires. The manufacturer claims that its tires can be driven at least 35,000 miles before wearing out. To determine the average number of miles that can be obtained from the manufacturer’s tires, the agency randomly selects 60 tires from the manufacturer’s warehouse and places the tires on 15 cars driven by test drivers on a 2-mile oval track. The number of miles driven (in thousands of miles) until the tires are determined to be worn out is given in the following table. a. Place a 99% confidence interval on the average number of miles driven, μ, prior to the tires wearing out. b. Is there significant evidence (α = .01) that the manufacturer’s claim is false? What is the level of significance of your test? Interpret your findings.

Refer to Exercise 5.39. Does the normality of the data appear to be valid? How close to the true value were your bounds on the p-value? Is there a contradiction between the interval estimate of μ and the conclusion reached by your test of the hypotheses? 5.39 A consumer testing agency wants to evaluate the claim made by a manufacturer of discount tires.
The manufacturer claims that its tires can be driven at least 35,000 miles before wearing out. To determine the average number of miles that can be obtained from the manufacturer’s tires, the agency randomly selects 60 tires from the manufacturer’s warehouse and places the tires on 15 cars driven by test drivers on a 2-mile oval track. The number of miles driven (in thousands of miles) until the tires are determined to be worn out is given in the following table. a. Place a 99% confidence interval on the average number of miles driven, μ, prior to the tires wearing out. b. Is there significant evidence (α = .01) that the manufacturer’s claim is false? What is the level of significance of your test? Interpret your findings.

The amount of sewage and industrial pollutants dumped into a body of water affects the health of the water by reducing the amount of dissolved oxygen available for aquatic life. Over a 2-month period, eight samples were taken from a river at a location 1 mile downstream from a sewage treatment plant. The amount of dissolved oxygen in the samples was determined and is reported in the following table. The current research asserts that the mean dissolved oxygen level must be at least 5.0 parts per million (ppm) for fish to survive. a. Place a 95% confidence interval on the mean dissolved oxygen level during the 2-month period. b. Using the confidence interval from part (a), does the mean oxygen level appear to be less than 5 ppm? c. Test the research hypothesis that the mean oxygen level is less than 5 ppm. What is the level of significance of your test? Interpret your findings.

A dealer in recycled paper places empty trailers at various sites. The trailers are gradually filled by individuals who bring in old newspapers and magazines and are picked up on several schedules. One such schedule involves pickup every second week. This schedule is desirable if the average amount of recycled paper is more than 1,600 cubic feet per 2-week period.
The dealer’s records for 18 2-week periods show the following volumes (in cubic feet) at a particular site: ȳ = 1,718.3 and s = 137.8. a. Assuming the 18 2-week periods are fairly typical of the volumes throughout the year, is there significant evidence that the average volume μ is greater than 1,600 cubic feet? b. Place a 95% confidence interval on μ. c. Compute the p-value for the test statistic. Is there strong evidence that μ is greater than 1,600?

47E 48E 49E 50E 51E 52E 53E 54E 55E 56E 57SE

The concentration of mercury in a lake has been monitored for a number of years. Measurements taken on a weekly basis yielded an average of 1.20 mg/m³ (milligrams per cubic meter) with a standard deviation of .32 mg/m³. Following an accident at a smelter on the shore of the lake, 15 measurements produced the following mercury concentrations. a. Give a point estimate of the mean mercury concentration after the accident. b. Construct a 95% confidence interval on the mean mercury concentration after the accident. Interpret this interval. c. Is there sufficient evidence that the mean mercury concentration has increased since the accident? Use α = .05. d. Assuming that the standard deviation of the mercury concentration is .32 mg/m³, calculate the power of the test to detect mercury concentrations of 1.28, 1.32, 1.36, and 1.40.

In a standard dissolution test for tablets of a particular drug product, the manufacturer must obtain the dissolution rate for a batch of tablets prior to release of the batch. Suppose that the dissolution test consists of assays for 24 randomly selected individual 25 mg tablets. For each test, the tablet is suspended in an acid bath and then assayed after 30 minutes. The results of the 24 assays are given here. Using a graphical display, determine whether the data appear to be a random sample from a normal distribution. Estimate the mean dissolution rate for the batch of tablets, for both a point estimate and a 99% confidence interval.
Is there significant evidence that the batch of pills has a mean dissolution rate less than 20 mg (80% of the labeled amount in the tablets)? Use α = .01. Calculate the probability of a Type II error if the true dissolution rate is 19.6 mg.

60SE

Over the past 5 years, the mean time for a warehouse to fill a buyer’s order has been 25 minutes. Officials of the company believe that the length of time has increased recently, either due to a change in the workforce or due to a change in customer purchasing policies. The processing times (in minutes) were recorded for a random sample of 15 orders processed over the past month. Do the data present sufficient evidence to indicate that the mean time to fill an order has increased?

If a new process for mining copper is to be put into full-time operation, it must produce an average of more than 50 tons of ore per day. A 15-day trial period gave the results shown in the accompanying table. a. Estimate the typical amount of ore produced by the mine using both a point estimate and a 95% confidence interval. b. Is there significant evidence that on a typical day the mine produces more than 50 tons of ore? Test by using α = .05.

63SE 64SE 65SE 66SE 67SE 68SE 69SE 70SE 71SE 73SE 74SE 75SE 76SE