All Textbook Solutions for Introduction To Statistics And Data Analysis

Give brief definitions of the terms descriptive statistics and inferential statistics.

Give brief definitions of the terms population and sample.

The following conclusion from a study appeared in the article "Smartphone Nation" (AARP Bulletin, September 2009): "If you love your smart phone, you are not alone. Half of all boomers sleep with their cell phone within arm's length. Two of three people age 50 to 64 use a cell phone to take photos, according to a 2010 Pew Research Center report." Are the given proportions (half and two of three) population values, or were they calculated from a sample?

Based on a study of 2121 children between the ages of 1 and 4, researchers at the Medical College of Wisconsin concluded that there was an association between iron deficiency and the length of time that a child is bottle-fed (Milwaukee Journal Sentinel, November 26, 2005). Describe the sample and the population of interest for this study.

The student senate at a university with 15,000 students is interested in the proportion of students who favor a change in the grading system to allow for plus and minus grades (for example, B+, B, B- rather than just B). Two hundred students are interviewed to determine their attitude toward this proposed change. a. What is the population of interest? b. What group of students constitutes the sample in this problem?

The National Retail Federation used data from a survey of 7439 adult Americans to estimate the percent who planned to spend more on holiday shopping in 2017 than they spent in 2016. They estimated that while 24% of adult Americans planned to spend more, for those age 16 to 24, the percentage was 46% ("Almost Half of Younger Consumers Plan to Spend More During the Holidays," nrf.com/media/press-releases/almost-half-of-younger-consumers-plan-spend-more-during-the-holidays, retrieved February 5, 2018).
Are the estimates given calculated using data from a sample or for the entire population?

The supervisors of a rural county are interested in the proportion of property owners who support the construction of a sewer system. Because it is too costly to contact all 7000 property owners, a survey of 500 owners is undertaken. Describe the population and sample for this problem.

A consumer group conducts crash tests of new model cars. To determine the severity of damage to 2019 Toyota Camrys resulting from a 10-mph crash into a concrete wall, the research group tests six cars of this type and assesses the amount of damage. Describe the population and the sample for this problem.

A building contractor has a chance to buy an odd lot of 5000 used bricks at an auction. She is interested in determining the proportion of bricks in the lot that are cracked and therefore unusable for her current project, but she does not have enough time to inspect all 5000 bricks. Instead, she checks 100 bricks to determine which ones are cracked. Describe the population and the sample for this problem.

The article "Brain Shunt Tested to Treat Alzheimer's" (San Francisco Chronicle, October 23, 2002) summarizes the findings of a study that appeared in the journal Neurology. Doctors at Stanford Medical Center were interested in determining whether a new surgical approach to treating Alzheimer's disease results in improved memory functioning. The surgical procedure involves implanting a thin tube, called a shunt, which is designed to drain toxins from the fluid-filled space that cushions the brain. Eleven patients had shunts implanted and were followed for a year, receiving quarterly tests of memory function. Another sample of Alzheimer's patients was used as a comparison group. Those in the comparison group received the standard care for Alzheimer's disease.
After analyzing the data from this study, the investigators concluded that the results suggested the treated patients essentially held their own in the cognitive tests while the patients in the control group steadily declined. However, the study was too small to produce conclusive statistical evidence. a. What were the researchers trying to learn? What questions motivated their research? b. Do you think that the study was conducted in a reasonable way? What additional information would you want in order to evaluate this study?

In a study of whether taking a garlic supplement reduces the risk of getting a cold, participants were assigned to either a garlic supplement group or to a group that did not take a garlic supplement ("Garlic for the Common Cold," Cochrane Database of Systematic Reviews, 2009). Based on the study, it was concluded that the proportion of people taking a garlic supplement who get a cold is lower than the proportion of those not taking a garlic supplement who get a cold. a. What were the researchers trying to learn? What questions motivated their research? b. Do you think that the study was conducted in a reasonable way? What additional information would you want in order to evaluate this study?

Classify each of the following variables as either categorical or numerical. For those that are numerical, determine whether they are discrete or continuous. a. Number of students in a class of 35 who turn in a term paper before the due date b. Gender of the next baby born at a particular hospital c. Amount of fluid (in ounces) dispensed by a machine used to fill bottles with soda pop d. Thickness of the gelatin coating of a vitamin E capsule e. Birth order classification (only child, firstborn, middle child, lastborn) of a math major

Classify each of the following variables as either categorical or numerical. For those that are numerical, determine whether they are discrete or continuous. a. Brand of computer purchased by a customer b.
State of birth for someone born in the United States c. Price of a textbook d. Concentration of a contaminant (micrograms per cubic centimeter) in a water sample e. Zip code (Think carefully about this one.) f. Actual weight of coffee in a 1-pound can, labeled as containing 1 pound

For the following numerical variables, state whether each is discrete or continuous. a. The number of insufficient-funds checks received by a grocery store during a given month b. The amount by which a 1-pound package of ground beef decreases in weight (because of moisture loss) before purchase c. The number of kernels in a bag of microwave popcorn that have not popped after 3 minutes of cooking d. The number of students in a class of 35 who have purchased a used copy of the textbook

For the following numerical variables, state whether each is discrete or continuous. a. The length of a 1-year-old rattlesnake b. The altitude of a location in California selected randomly by throwing a dart at a map of the state c. The distance from the left edge at which a 12-inch plastic ruler snaps when bent sufficiently to break d. The price per gallon paid by the next customer to buy gas at a particular station

For each of the following situations, give a set of possible data values that might arise from making the observations described. a. The manufacturer for each of the next 10 automobiles to pass through a given intersection is noted. b. The grade point average for each of the 15 seniors in a statistics class is determined. c. The number of gas pumps in use at each of 20 gas stations at a particular time is determined. d. The actual net weight of each of 12 bags of fertilizer having a labeled weight of 50 pounds is determined. e.
Fifteen different radio stations are monitored during a 1-hour period, and the amount of time devoted to commercials is determined for each.

In a survey of 100 people who had recently purchased motorcycles, data on the following variables were recorded: sex of purchaser; brand of motorcycle purchased; number of previous motorcycles owned by purchaser; telephone area code of purchaser; weight of motorcycle as equipped at purchase. a. Which of these variables are categorical? b. Which of these variables are discrete numerical? c. Which type of graphical display would be an appropriate choice for summarizing the sex data, a bar chart or a dotplot? d. Which type of graphical display would be an appropriate choice for summarizing the weight data, a bar chart or a dotplot?

The Gallup report "More Americans Say Real Estate Is Best Long-Term Investment" (gallup.com, April 20, 2016, retrieved April 15, 2017) included data from a poll of 1015 adults. The responses to the question "What do you think is the best long-term investment?" are summarized in the given relative frequency distribution. a. Use this information to construct a bar chart for the response data. b. Write a few sentences commenting on the distribution of responses to the question posed.

An article in the New Times San Luis Obispo (February 4, 2016) reported the accompanying concussion rates for different high school sports. The given data are concussion rates per 10,000 athletes participating in high school sports in 2012. a. Construct a dotplot of the concussion rate data. b. In addition to the three girls' sports indicated in the table (lacrosse, basketball, and soccer), the reported concussion rates for field hockey, volleyball, and softball are also for girls. Locate the points on the dotplot that correspond to concussion rates for girls' sports and highlight them in a different color. Based on the dotplot, would you say that the concussion rates for girls' sports tend to be lower than or higher than for boys' sports?
Explain.

Box Office Mojo (boxofficemojo.com) tracks movie ticket sales. Ticket sales (in millions of dollars) for each of the top 20 movies in 2014 and 2015 are shown in the accompanying table. a. Construct a dotplot of the 2014 ticket sales data. Comment on any interesting features of the dotplot. (Hint: See What to Look For in the Dotplots box on page 14.) b. Construct a dotplot of the 2015 ticket sales data. Comment on any interesting features of the dotplot. c. In what ways are the distributions of the 2014 and 2015 ticket sales observations similar? In what ways are they different?

The report "With Their Whole Lives Ahead of Them" (publicagenda.org/files/theirwholelivesaheadofthem.pdf, retrieved February 6, 2018) includes data from a survey of 200 students who started college but did not complete a degree. Each of these students was asked, "How much have you thought about going back to school?" The accompanying frequency distribution summarizes the responses to this question. a. Summarize the response data using a bar chart. b. Write a few sentences commenting on the distribution of the responses.

The following display is a graph similar to one that appeared in USA TODAY (June 29, 2009). This graph is meant to be a bar graph of responses to the question shown in the graph. a. Is response to the question a categorical or numerical variable? b. Explain why a bar chart rather than a dotplot was used to display the response data. c. There must have been an error made in constructing this graph. How can you tell that the graph is not a correct representation of the response data?

The accompanying table gives the total number of visits and the number of unique visitors for some popular social networking sites in the United States for the month of July 2017. The number of unique visitors data are taken from the online article "Top 15 Most Popular Social Networking Sites" (ism.org/images/files/Social-media-platforms-from-Engage-to-Succeed-webinar.pdf, retrieved February 7, 2018).
The total number of visits were estimated using data from semrush.com (retrieved February 7, 2018). The data on total visits and unique visitors were used to compute the values in the final column of the data table, in which visits per unique visitor = (total visits) / (number of unique visitors). a. A dotplot of the total visits data is shown at the top of the page. What are the most obvious features of the dotplot? What does it tell you about the online social networking sites? b. A dotplot for the number of unique visitors is shown at the top of the page. In what way is this dotplot different from the dotplot for total visits in Part (a)? What does this tell you about the online social networking sites? c. A dotplot for the visits per unique visitor data is shown at the top of the page. What new information about the online social networks is provided by this dotplot?

Heal the Bay is an environmental organization that releases an annual beach report card based on water quality (Heal the Bay Beach Report Card, beachreportcard.org, retrieved May 7, 2016). The grades for 20 beaches in three counties in Washington (Whatcom, Snohomish, and Island counties) during dry weather were: C, B, A+, A+, A+, A+, A+, A, A+, A+, A+, C, A+, A+, A+, C, A, F, F, B. Summarize the dry weather grades by constructing a relative frequency distribution and a bar chart.

The report referenced in the previous exercise also gave wet weather grades for the same beaches: A+, A+, A+, A+, A+, A+, A+, F, F, F, A+, A+, F, A+, A+, F, A+, A+, F, A+. a. Construct a bar chart for the wet weather grades. b. Do the bar charts from Part (a) and from the previous exercise support the statement that beach water quality tends to be better in dry weather conditions? Explain.

The U.S. Department of Health and Human Services reported the estimated percentage of households with only wireless phone service (no landline) in 2014 for the 50 states and the District of Columbia (cdc.gov/nchs/data/nhis/earlyrelease/wireless_state_201602.pdf, retrieved February 8, 2018).
In the accompanying data table, each state was also classified into one of three geographical regions: West (W), Middle states (M), and East (E). a. Display the data graphically in a way that makes it possible to compare wireless percent for the three geographical regions. b. Does the graphical display in Part (a) reveal any striking differences in wireless percent for the three geographical regions or are the distributions of wireless percent observations similar for the three regions?

Example 1.5 gave the accompanying data on violent crime on public college campuses in Florida during 2016 (fbi.gov, retrieved February 6, 2018): a. Construct a dotplot using the 16 observations on number of violent crimes reported. Which schools stand out from the rest? b. One of the Florida schools has only 861 students and a few of the schools are quite a bit larger than the rest. Because of this, it might make more sense to consider a crime rate by calculating the number of violent crimes reported per 1000 students. For example, for Florida A&M University the violent crime rate would be (6/9928)(1000) = (0.001)(1000) = 1.0. Calculate the violent crime rate for the other 15 schools and then use those values to construct a dotplot. Do the same schools stand out as unusual in this dotplot? c. Based on your answers from Parts (a) and (b), write a couple of sentences commenting on violent crimes reported at Florida universities and colleges in 2016.

The article "Fliers Trapped on Tarmac Push for Rules on Release" (USA TODAY, July 28, 2009) gave the following data for 17 airlines on number of flights that were delayed on the tarmac for at least 3 hours for the period from October 2008 to May 2009: The graph at the bottom of the page shows two dotplots: one displays the number of delays data, and one displays the rate per 10,000 flights data. a.
If you were going to rank airlines based on flights delayed on the tarmac for at least 3 hours, would you use the total number of flights data or the rate per 10,000 flights data? Explain the reason for your choice. b. Write a short paragraph that could be used as part of a newspaper article on flight delays that could accompany the dotplot of the rate per 10,000 flights data.

The report "Trends in Community Colleges" (collegeboard.com/trends, April 2016, trends.collegeboard.org/sites/default/files/trends-in-community-colleges-research-brief.pdf, retrieved February 8, 2018) included the accompanying information on student debt for students graduating with an AA degree from a public community college in 2012. a. Use the given information to construct a bar chart. b. Write a few sentences commenting on student debt for public community college graduates.

The article "Where College Students Buy Textbooks" (USA TODAY, October 14, 2010) gave data on where students purchased books. The accompanying frequency table summarizes data from a sample of 1152 full-time college students. a. Construct a bar chart to summarize the data distribution. b. Write a few sentences commenting on where students are buying textbooks.

31E

In the United States, movies are rated by the Motion Picture Association of America (MPAA). The accompanying table gives the MPAA rating of the 25 top money-making movies of 2015 (data from boxofficemojo.com, retrieved October 10, 2016). Use the given information to construct a bar chart of the ratings for the top 25 movies of 2015. Write a few sentences describing the ratings distribution.

The report "Testing the Waters 2009" (nrdc.org) included information on the water quality at the 82 most popular swimming beaches in California. Thirty-eight of these beaches are in Los Angeles County. a. Construct a dotplot of the percent of tests failing to meet water quality standards for the Los Angeles County beaches.
Write a few sentences describing any interesting features of the dotplot. b. Construct a dotplot of the percent of tests failing to meet water quality standards for the beaches in other counties. Write a few sentences describing any interesting features of the dotplot. c. Based on the two dotplots from Parts (a) and (b), describe how the percent of tests that fail to meet water quality standards for beaches in Los Angeles County differs from those of other counties.

The U.S. Department of Education reported that 14% of adults were classified as being below a basic literacy level, 29% were classified as being at a basic literacy level, 44% were classified as being at an intermediate literacy level, and 13% were classified as being at a proficient level (2003 National Assessment of Adult Literacy). a. Is the variable literacy level categorical or numerical? b. Construct a bar chart to display the given data on literacy level. c. Would it be appropriate to display the given information using a dotplot? Explain why or why not.

The Computer Assisted Assessment Center at the University of Luton published a report titled "Technical Review of Plagiarism Detection Software." The authors of this report asked faculty at academic institutions about the extent to which they agreed with the statement "Plagiarism is a significant problem in academic institutions." The responses are summarized in the accompanying table. Construct a bar chart for these data.

The article "Just How Safe Is That Jet?" (USA TODAY, March 13, 2000) gave the following relative frequency distribution that summarizes data on the type of violation for fines imposed on airlines by the Federal Aviation Administration: a. Use this information to construct a bar chart for type of violation. b. Write a sentence or two commenting on the relative occurrence of the various types of violation.

Each year, U.S. News and World Report publishes a ranking of U.S. business schools.
The following data give the acceptance rates (percentage of applicants admitted) for the best 25 programs in a recent survey: 16.3 12.0 25.1 20.3 31.9 20.7 30.1 19.5 36.2 46.9 25.8 36.7 33.8 24.2 21.5 35.1 37.6 23.9 17.0 38.4 31.2 43.8 28.9 31.4 48.9 a. Construct a dotplot. b. Comment on the interesting features of the plot.

Many adolescent boys aspire to be professional athletes. The paper "Why Adolescent Boys Dream of Becoming Professional Athletes" (Psychological Reports [1999]: 1075-1085) examined some of the reasons. Each boy in a sample of teenage boys was asked the following question: "Previous studies have shown that more teenage boys say that they are considering becoming professional athletes than any other occupation. In your opinion, why do these boys want to become professional athletes?" The resulting data are shown in the following table: Construct a bar chart to display these data.

The article "How Dangerous Is a Day in the Hospital?" (Medical Care [2011]: 1068-1075) describes a study to determine if the risk of an infection is related to the length of a hospital stay. The researchers looked at a large number of hospitalized patients and compared the proportion who got an infection for two groups of patients: those who were hospitalized overnight and those who were hospitalized for more than one night. Indicate whether the study is an observational study or an experiment. Give a brief explanation for your choice.

The authors of the paper "Fudging the Numbers: Distributing Chocolate Influences Student Evaluations of an Undergraduate Course" (Teaching in Psychology [2007]: 245-247) carried out a study to see if events unrelated to an undergraduate course could affect student evaluations. Students enrolled in statistics courses taught by the same instructor participated in the study. All students attended the same lectures and one of six discussion sections that met once a week.
At the end of the course, the researchers chose three of the discussion sections to be the chocolate group. Students in these three sections were offered chocolate prior to having them fill out course evaluations. Students in the other three sections were not offered chocolate. The researchers concluded that "Overall, students offered chocolate gave more positive evaluations than students not offered chocolate." Indicate whether the study is an observational study or an experiment. Give a brief explanation for your choice.

The article "Why We Fall for This" (AARP Magazine, May/June 2011) described a study in which a business professor divided his class into two groups. He showed students a mug and then asked students in one of the groups how much they would pay for the mug. Students in the other group were asked how much they would sell the mug for if it belonged to them. Surprisingly, the average value assigned to the mug was quite different for the two groups! Indicate whether the study is an observational study or an experiment. Give a brief explanation for your choice.

The article "Adolescents Living the 24/7 Lifestyle: Effects of Caffeine and Technology on Sleep Duration and Daytime Functioning" (Pediatrics [2009]: e1005-e1010) describes a study in which researchers investigated whether there is a relationship between amount of sleep and caffeine consumption. They found that teenagers who usually get less than 8 hours of sleep on school nights were more likely to report falling asleep during school and to consume more caffeine on average than teenagers who usually get 8 to 10 hours of sleep on school nights. a. Is the study described an observational study or an experiment? b. Is it reasonable to conclude that getting less than 8 hours of sleep on school nights causes teenagers to fall asleep during school and to consume more caffeine, on average? Explain.
(Hint: Look at Table 2.1.)

The article "Acupuncture for Bad Backs: Even Sham Therapy Works" (Time, May 12, 2009) summarized a study conducted by researchers in Seattle. In this study, 638 adults with back pain were randomly assigned to one of four groups. People in group 1 received the usual care for back pain. People in group 2 received acupuncture at a set of points tailored specifically for each individual. People in group 3 received acupuncture at a standard set of points typically used in the treatment of back pain. Those in group 4 received fake acupuncture: they were poked with a toothpick at the same set of points chosen for the people in group 3! Two conclusions from the study were: (1) patients receiving real or fake acupuncture experienced a greater reduction in pain than those receiving usual care; and (2) there was no significant difference in pain reduction for those who received acupuncture and those who received fake acupuncture toothpick pokes. a. Is this study an observational study or an experiment? Explain. b. Is it reasonable to conclude that receiving either real or fake acupuncture was the cause of the observed reduction in pain in those groups compared to the usual care group? What aspect of this study supports your answer? (Hint: Look at Table 2.1.)

The article "Display of Health Risk Behaviors on MySpace by Adolescents" (Archives of Pediatrics and Adolescent Medicine [2009]: 27-34) described a study in which researchers looked at a random sample of 500 publicly accessible MySpace web profiles posted by 18-year-olds. The content of each profile was analyzed. One of the conclusions reported was that displaying sport or hobby involvement was associated with decreased references to risky behavior (sexual references or references to substance abuse or violence). a. Is the study described an observational study or an experiment? b. Is it reasonable to generalize the stated conclusion to all 18-year-olds with a publicly accessible MySpace web profile?
What aspect of the study supports your answer? c. Not all MySpace users have a publicly accessible profile. Is it reasonable to generalize the stated conclusion to all 18-year-old MySpace users? Explain. d. Is it reasonable to generalize the stated conclusion to all MySpace users with a publicly accessible profile? Explain.

The article "Popping Cork Sound Makes Wine Taste Better" (decanter.com/wine-news/popping-cork-sound-makes-wine-taste-better-experiment-377364, retrieved February 11, 2018) describes a study in which 140 people were assigned to one of two groups. Those in one group heard the noise made by a bottle of wine being uncorked before tasting a glass of wine. Those in the other group heard the noise made by a person releasing a screw cap before tasting a glass of wine. Both groups tasted the same wine. It was reported that the average rating was higher for the group that heard the cork popping than the average rating for the group that heard the screw cap being released. a. Is the study described an observational study or an experiment? b. Can a case be made for the researchers' conclusion that hearing a cork pop rather than a screw cap being released was the cause for the higher rating? Explain.

"Fruit Juice May Be Fueling Pudgy Preschoolers, Study Says" is the title of an article that appeared in the San Luis Obispo Tribune (February 27, 2005). This article describes a study that found that for 3- and 4-year-olds, drinking something sweet once or twice a day doubled the risk of being seriously overweight one year later. The authors of the study state "Total energy may be a confounder if consumption of sweet drinks is a marker for other dietary factors associated with overweight" (Pediatrics, November 2005).
Give an example of a dietary factor that might be one of the potentially confounding variables the study authors are worried about.

The article "Americans are Getting the Wrong Idea on Alcohol and Health" (Associated Press, April 19, 2005) reported that observational studies in recent years that have concluded that moderate drinking is associated with a reduction in the risk of heart disease may be misleading. The article refers to a study conducted by the Centers for Disease Control and Prevention that showed that moderate drinkers, as a group, tended to be better educated, wealthier, and more active than nondrinkers. Explain why the existence of these potentially confounding variables prevents drawing the conclusion that moderate drinking is the cause of reduced risk of heart disease.

Based on a survey conducted on the eDiets.com web site, investigators concluded that women who regularly watched Oprah were only one-seventh as likely to crave fattening foods as those who watched other daytime talk shows (San Luis Obispo Tribune, October 14, 2000). a. Is it reasonable to conclude that watching Oprah causes a decrease in cravings for fattening foods? Explain. b. Is it reasonable to generalize the results of this survey to all women in the United States? To all women who watch daytime talk shows? Explain why or why not.

A survey of adult Americans who are Internet users carried out in 2016 found that 79% were Facebook users ("Social Media Update 2016," Pew Research Center, November 11, 2016). a. What condition on how the data were collected would make the generalization from the sample to the population of all adult American Internet users reasonable? b. Would it be reasonable to generalize from the sample and say that 79% of all adult Americans use Facebook? Explain.

Does sitting for long periods of time hurt your heart?
The article "Why Sitting May Be Bad for Your Heart" (The New York Times, December 20, 2017) describes a study of 1700 people who were participants in the Dallas Heart Study. The study found that the people who sat for long periods of time tended to have higher levels of troponin in their blood. Troponin is a protein that is released when the heart muscle has been damaged. The article states that "Of course, this was an observational study and can show only that sitting is linked to high troponin, not that it causes troponins to rise." Do you agree with this statement? Explain.

A New York psychologist recommends that if you feel the need to check your e-mail in the middle of a movie or if you sleep with your cell phone next to your bed, it might be time to "power off" (AARP Bulletin, September 2010). Suppose that you want to learn about the proportion of students at your college who would feel the need to check e-mail during the middle of a movie and that you have access to a list of all students enrolled at your college. Describe how you would use this list to select a simple random sample of 100 students.

As part of a curriculum review, a psychology department would like to select a simple random sample of 20 of last year's 140 graduates to obtain information on how graduates perceived the value of the curriculum. Describe two different methods that might be used to select the sample.

A petition with 500 signatures is submitted to a university's student council. The council president would like to determine the proportion of those who signed the petition who are actually registered students at the university. There is not enough time to check all 500 names with the registrar, so the council president decides to select a simple random sample of 30 signatures.
Describe how this might be done.

The article "Bicyclists and Other Cyclists" (Annals of Emergency Medicine [2010]: 426) reported that in 2008, there were 716 bicyclists killed on public roadways in the United States, and that the average age of the cyclists killed was 41 years. These figures were based on an analysis of the records of all traffic-related deaths of bicyclists on U.S. public roadways (this information is kept by the National Highway Traffic Safety Administration). a. Does the group of 716 bicycle fatalities represent a census or a sample of the 2008 bicycle fatalities? b. If the population of interest is 2008 bicycle traffic fatalities, is the given average age of 41 years a number that describes a sample or a number that describes the population?

The article "Teenage Physical Activity Reduces Risk of Cognitive Impairment in Later Life" (Journal of the American Geriatrics Society [2010]) describes a study of more than 9000 women from Maryland, Minnesota, Oregon, and Pennsylvania. The women were asked about their physical activity as teenagers and at ages 30 and 50. A press release about this study (wiley.com/WileyCDA/PressRelease/pressReleaseld-77637.html retrieved February 13, 2018) generalized the results of this study to all American women. In the press release, the researcher who conducted the study is quoted as saying "Our study shows that women who are regularly physically active at any age have lower risk of cognitive impairment than those who are inactive but that being physically active at teenage is most important in preventing cognitive impairment." Answer the following four questions for this observational study. (Hint: Reviewing Examples 2.1 and 2.2 might be helpful.) a. What is the population of interest? b. Was the sample selected in a reasonable way? c. Is the sample likely to be representative of the population of interest? d.
Are there any obvious sources of bias?

For each of the situations described, state whether the sampling procedure is simple random sampling, stratified random sampling, cluster sampling, systematic sampling, or convenience sampling. a. All first-year students at a university are enrolled in one of 30 sections of a seminar course. To select a sample of freshmen at this university, a researcher selects four sections of the seminar course at random from the 30 sections and all students in the four selected sections are included in the sample. b. To obtain a sample of students, faculty, and staff at a university, a researcher randomly selects 50 faculty members from a list of faculty, 100 students from a list of students, and 30 staff members from a list of staff. c. A university researcher obtains a sample of students at his university by using the 85 students enrolled in his Psychology 101 class. d. To obtain a sample of the seniors at a particular high school, a researcher writes the name of each senior on a slip of paper, places the slips in a box and mixes them, and then selects 10 slips. The students whose names are on the selected slips of paper are included in the sample. e. To obtain a sample of those attending a basketball game, a researcher selects the 24th person through the door. Then, every 50th person after that is also included in the sample.

Of the 6500 students enrolled at a community college, 3000 are part time and the other 3500 are full time. The college can provide a list of students that is sorted so that all full-time students are listed first, followed by the part-time students. a. Describe a procedure for selecting a stratified random sample that uses full-time and part-time students as the two strata and that includes 10 students from each stratum. b. Does every student at this community college have the same chance of being selected for inclusion in this stratified random sample? 
Explain.

Briefly explain why it is advisable to avoid the use of convenience samples.

A sample of pages from this book is to be selected, and the number of words on each page in the sample will be determined. For the purposes of this exercise, equations are not counted as words and a number is counted as a word only if it is spelled out; that is, ten is counted as a word, but 10 is not. a. Describe a sampling procedure that would result in a simple random sample of pages from this book. b. Describe a sampling procedure that would result in a stratified random sample. Explain why you chose the specific strata used in your sampling plan. c. Describe a sampling procedure that would result in a systematic sample. d. Describe a sampling procedure that would result in a cluster sample.

The chairman of a California ballot initiative campaign to add "none of the above" to the list of ballot options in all candidate races was quite critical of a Field poll that showed his measure trailing by 10 percentage points. The poll was based on a random sample of 1000 registered voters in California. He is quoted by the Associated Press (January 30, 2000) as saying, "Field's sample in that poll equates to one out of 17,505 voters," and he added that this was so dishonest that Field should get out of the polling business! If you worked on the Field poll, how would you respond to this criticism? (Hint: See discussion of sample size on page 39.)

The authors of the paper "Digital Inequality: Differences in Young Adults' Use of the Internet" (Communication Research [2008]: 602-621) were interested in determining if people with higher levels of education use the Internet in different ways than those who do not have as much formal education. To answer this question, they used data from a national telephone survey. Approximately 1300 households were selected for the survey, and 270 of them completed the interview. What type of bias should the researchers be concerned about and why? 
(Hint: See the box on page 35 that contains the definitions of different types of bias.)

The 2013 National Study of Substance Use Habits of College Student-Athletes surveyed student athletes at NCAA member colleges and universities. The passage below is from the survey website (ncaa.org/about/resources/research/about-survey, retrieved February 13, 2018). All NCAA member institutions are asked to participate. The sampling plan achieves an appropriate representation of all NCAA student-athletes while minimizing burden to institutions by asking that all student-athletes on no more than three teams be surveyed on any campus. The teams surveyed are determined by a computer-generated random draw. The survey is administered to the selected teams in a classroom setting, and no identifying information about the athletes or the college is collected. The web site also states "It is important to note that even with measures to ensure anonymity, self-reported data of this kind can be problematic due to the sensitive nature of the issues. Therefore, absolute levels of use might be underestimated in a study such as this." a. Was this sample a simple random sample, a stratified sample, a cluster sample, a systematic sample, or a convenience sample? Explain. b. Give two reasons why an estimate of the proportion of students who reported using illegal drugs based on data from this survey should not be generalized to all U.S. college students.

The paper "Deception and Design: The Impact of Communication Technology on Lying Behavior" (Computer-Human Interaction [2009]: 130-136) describes an investigation into whether lying is less common in face-to-face communication than in other forms of communication such as phone conversations or e-mail. Participants in this study were 30 students in an upper-division communications course at Cornell University who received course credit for participation. 
Participants were asked to record all of their social interactions for a week, making note of any lies told. Based on data from these records, the authors of the paper concluded that students lie more often in phone conversations than in face-to-face conversations and more often in face-to-face conversations than in e-mail. Discuss the limitations of this study, commenting on the way the sample was selected and potential sources of bias.

The authors of the paper "Playing by the Rules: Parental Mediation of Video Game Play" (Journal of Family Issues [2015]: 1-24) used data from a sample of parents to investigate the ways in which parents monitor their children's use of video games. The sample of parents consisted of 427 people who responded to a survey conducted on an Amazon-operated marketplace where people can complete surveys in exchange for compensation. a. Do you think that the sample of parents was selected in a way that makes it reasonable to think it is representative of the population of all parents? b. Is it reasonable to generalize conclusions based on data from this survey to all parents? Explain why or why not.

Participants in a study of honesty in online dating profiles were recruited through print and online advertisements in the Village Voice, one of New York City's most prominent weekly newspapers, and on Craigslist New York City ("The Truth About Lying in Online Dating Profiles," Computer-Human Interaction [2007]: 1-4). The actual height, weight, and age of the participants were compared to what appeared in their online dating profiles. The resulting data was then used to draw conclusions about how common deception was in online dating profiles. What concerns do you have about generalizing conclusions based on data from this study to the population of all people who have an online dating profile? 
Be sure to address at least two concerns and give the reasons for your concern.

The article "Credit Card Activity for College Students" (wallethub.com/edu/credit-card-statistics-for-college-students/25535/, retrieved February 13, 2018) estimated that in 2013, 62% of undergraduates with credit cards pay them off each month and that the average outstanding balance on undergraduates' credit cards is $650. These estimates were based on an online survey of 800 college students. What additional information would you want in order to decide if it is reasonable to generalize the reported estimates to the population of all undergraduate students?

The financial aid advisor of a university plans to use a stratified random sample to estimate the average amount of money that students spend on textbooks each term. For each of the following proposed stratification schemes, discuss whether it would be worthwhile to stratify the university students in this manner. (Hint: Remember that it is desirable to create strata that are homogeneous.) a. Strata corresponding to class standing (freshman, sophomore, junior, senior, graduate student) b. Strata corresponding to field of study, using the following categories: engineering, architecture, business, other c. Strata corresponding to the first letter of the last name: A-E, F-K, etc.

Suppose that you were asked to help design a survey of adult city residents in order to estimate the proportion who would support a sales tax increase. The plan is to use a stratified random sample, and three stratification schemes have been proposed. Which of the three stratification schemes would be best in this situation? Explain.

The article "High Levels of Mercury Are Found in Californians" (Los Angeles Times, February 9, 2006) describes a study in which hair samples were tested for mercury. The hair samples were obtained from more than 6000 people who voluntarily sent hair samples to researchers at Greenpeace and The Sierra Club. 
The researchers found that nearly one-third of those tested had mercury levels that exceeded the concentration thought to be safe. Is it reasonable to generalize this result to the larger population of U.S. adults? Explain why or why not.

Whether or not to continue a Mardi Gras Parade through downtown San Luis Obispo, California, is a hotly debated topic. The parade is popular with students and many residents, but some celebrations have led to complaints and a call to eliminate the parade. The local newspaper conducted online and telephone surveys of its readers and was surprised by the results. The online survey site received more than 400 responses, with more than 60% favoring continuing the parade, while the telephone response line received more than 120 calls, with more than 90% favoring banning the parade (San Luis Obispo Tribune, March 3, 2004). What factors may have contributed to these very different results?

The head of the quality control department at a printing company would like to carry out an experiment to determine which of three different glues results in the greatest binding strength. Although they are not of interest in the current study, other factors thought to affect binding strength are the number of pages in the book and whether the book is being bound as a paperback or a hardback. (Hint: See box on page 45.) a. What is the response variable in this experiment? b. What explanatory variable will determine the experimental conditions? c. What two extraneous variables are mentioned in the problem description? Can you think of any other extraneous variables that should be considered?

A study of college students showed a temporary gain of up to 9 IQ points after listening to a Mozart piano sonata. This conclusion, dubbed the Mozart effect, has since been criticized by a number of researchers who have been unable to confirm the result in similar studies. Suppose that you wanted to see whether there is a Mozart effect for students at your school. 
(Hint: See Examples 2.4 and 2.5.) a. Describe how you might design an experiment for this purpose. b. Does your experimental design include direct control of any extraneous variables? Explain. c. Does your experimental design use blocking? Explain why you did or did not include blocking in your design. d. What role does random assignment play in your design?

According to the article "Rubbing Hands Together Under Warm Air Dryers Can Counteract Bacteria Reduction" (Infectious Disease News, September 22, 2010), washing your hands isn't enough; good "hand hygiene" also includes drying hands thoroughly. The article described an experiment to compare bacteria reduction for three different hand-drying methods. In this experiment subjects handled uncooked chicken for 45 seconds, then washed their hands with a single squirt of soap for 60 seconds, and then used one of the three hand-drying methods. After completely drying their hands, the bacteria count on their hands was measured. Suppose you want to carry out a similar experiment and that you have 30 subjects who are willing to participate. Describe a method for randomly assigning each of the 30 subjects to one of the hand-drying methods. (Hint: See A Note on Random Assignment on page 50.)

The following is from an article titled "After the Workout, Got Chocolate Milk?" that appeared in the Chicago Tribune (January 18, 2005): "Researchers at Indiana University at Bloomington have found that chocolate milk effectively helps athletes recover from an intense workout. They had nine cyclists bike, rest four hours, then bike again, three separate times. After each workout, the cyclists downed chocolate milk or energy drinks Gatorade or Endurox (two to three glasses per hour); then, in the second workout of each set, they cycled to exhaustion. When they drank chocolate milk, the amount of time they could cycle until they were exhausted was similar to when they drank Gatorade and longer than when they drank Endurox." 
The article is not explicit about this, but in order for this to have been a well-designed experiment, it must have incorporated random assignment. Briefly explain where the researcher would have needed to use random assignment in order for the stated conclusion to be valid.

The report "Comparative Study of Two Computer Mouse Designs" (Cornell Human Factors Laboratory Technical Report RP7992) included the following description of the subjects used in an experiment: "Twenty-four Cornell University students and staff (12 males and 12 females) volunteered to participate in the study. Three groups of 4 men and 4 women were selected by their stature to represent the 5th percentile (female 152.1 ± 0.3 cm, male 164.1 ± 0.4 cm), 50th percentile (female 162.4 ± 0.1 cm, male 174.1 ± 0.7 cm), and 95th percentile (female 171.9 ± 0.2 cm, male 185.7 ± 0.6 cm) ranges ... All subjects reported using their right hand to operate a computer mouse." This experimental design incorporated direct control and blocking. a. Is the potential effect of the extraneous variable stature (height) addressed by blocking or direct control? b. Whether the right or left hand is used to operate the mouse was considered to be an extraneous variable. Is the potential effect of this variable addressed by blocking or direct control?

The Institute of Psychiatry at King's College London found that dealing with infomania (information overload) has a temporary, but significant derogatory effect on IQ (Discover, November 2005). Researchers divided volunteers into two groups. Each subject took an IQ test. One group had to check e-mail and respond to instant messages while taking the test, and the second group took the test without any distraction. The distracted group had an average score that was 10 points lower than the average for the control group. 
Explain why it is important that the researchers created the two experimental groups in this study by using random assignment.

In an experiment to compare two different surgical procedures for hernia repair ("A Single-Blinded, Randomized Comparison of Laparoscopic Versus Open Hernia Repair in Children," Pediatrics [2009]: 332-336), 89 children were assigned at random to one of the two surgical methods. The researchers relied on the random assignment of subjects to treatments to create comparable groups with respect to extraneous variables that they did not control. One such extraneous variable was age. After random assignment to treatments, the researchers looked at the age distribution of the children in each of the two experimental groups (laparoscopic repair [LR] and open repair [OR]). The accompanying figure is similar to one in the paper. Based on this figure, has the random assignment of subjects to experimental groups been successful in creating groups that are similar with respect to the ages of the children in the groups? Explain.

In many digital environments, users are allowed to choose how they are represented visually online. Does how people are represented online affect online behavior? This question was examined by the authors of the paper "The Proteus Effect: The Effect of Transformed Self-Representation on Behavior" (Human Communication Research [2007]: 271-290). Participants were randomly assigned either an attractive avatar (a graphical image that represents a person) to represent them or an unattractive avatar. a. The researchers concluded that when interacting with a person of the opposite gender in an online virtual environment, those assigned an attractive avatar moved significantly closer to the other person than those who had been assigned an unattractive avatar. This difference was attributed to the attractiveness of the avatar. 
Explain why the researchers would not have been able to reach this conclusion if participants had been allowed to choose one of the two avatars (attractive, unattractive) to represent them online. b. Construct a diagram to represent the underlying structure of this experiment.

Does playing action video games provide more than just entertainment? The authors of the paper "Action-Video-Game Experience Alters the Spatial Resolution of Vision" (Psychological Science [2007]: 88-94) concluded that spatial resolution, an important aspect of vision, is improved by playing action video games. They based this conclusion on data from an experiment in which 32 volunteers who had not played action video games were equally and randomly divided between the experimental and control groups. Subjects in each group played a video game for 30 hours over a period of 6 weeks. Those in the experimental group played Unreal Tournament 2004, an action video game. Those in the control group played the game Tetris, a game that does not require the user to process multiple objects at once. Explain why the random assignment to the two groups is an important aspect of this experiment.

Construct a diagram to represent the note-taking experiment of Example 2.4.

Example 2.4 Put Away That Laptop!
The article "The Pen Is Mightier Than the Keyboard" (Psychological Science [2014]: 1-10) describes several experiments designed to investigate whether the method that students use to take notes has an effect on learning. In one of the experiments described in the article, 67 students from Princeton University were assigned to one of two groups. Both groups were shown a video of a TED Talk (see ted.com/talks) and asked to take notes during the talk using their usual note-taking strategy. One group watched the talk in a room equipped with laptops that were not connected to the Internet, and they were asked to use the laptops to take notes. 
The other group watched the talk in a room where they were given notebooks and asked to take their notes by hand. When the talk was finished, students were engaged in other tasks for 30 minutes and then were asked to take a test on the material from the TED talk. The test included both fact-recall questions and conceptual-application questions. The researchers found that the two groups performed equally well on the fact-recall questions, but that on the conceptual-application questions, the group that used laptops to take notes performed significantly worse than the group that took notes by hand. If we assume that the researcher randomly assigned the subjects to the two groups, then this study is an experiment that compares two treatments (laptop used to take notes and handwritten notes). The responses measured were the scores on the fact-recall questions part of the test and the scores on the conceptual-application questions part of the test. The experiment uses replication (many subjects in each treatment group) and random assignment to control for extraneous variables that might affect the response.

Construct a diagram to represent the gasoline additive experiment described on page 48.

The experiment just described can be viewed as consisting of a sequence of trials. Because a number of extraneous variables (such as variations in environmental conditions like wind speed or humidity and small variations in the condition of the car) might have an effect on gas mileage, it would not be a good idea to use additive 1 for the first 10 trials, additive 2 for the next 10 trials, and so on. A better approach would be to randomly assign additive 1 to 10 of the 30 planned trials, and then randomly assign additive 2 to 10 of the remaining 20 trials. 
The resulting plan for carrying out the experiment might look as follows:

An advertisement for a sweatshirt that appeared in SkyMall Magazine (a catalog distributed by some airlines) stated the following: "This is not your ordinary hoody! Why? Fact: Research shows that written words on containers of water can influence the water's structure for better or worse depending on the nature and intent of the word. Fact: The human body is 70% water. What if positive words were printed on the inside of your clothing?" The reference to the fact that written words on containers of water can influence the water's structure appears to be based on the work of Dr. Masaru Emoto who typed words on paper, pasted the words on bottles of water, and observed how the water reacted to the words by seeing what kind of crystals were formed in the water. He describes several of his experiments in his self-published book, The Message from Water. If you were going to interview Dr. Emoto, what questions would you want to ask him about the design of his experiment?

The paper "Turning to Learn: Screen Orientation and Reasoning from Small Devices" (Computers in Human Behavior [2011]: 793-797) describes a study that investigated whether cell phones with small screens are useful for gathering information. The researchers wondered if the ability to reason using information read on a small screen was affected by the screen orientation. The researchers assigned 33 undergraduate students who were enrolled in a psychology course at a large public university to one of two groups at random. One group read material that was displayed on a small screen in portrait orientation, and the other group read material on the same size screen but turned to display the information in landscape orientation (see figure below). The researchers found that performance on a reasoning test that was based on the displayed material was better for the group that read material in the landscape orientation. a. 
Is the described study an observational study or an experiment? b. Did the study use random selection from some population? c. Did the study use random assignment to experimental groups?

Consider the study described in the previous exercise. a. Is the conclusion (that reasoning using information displayed on a small screen is improved by turning the screen to landscape orientation) appropriate, given the study design? Explain. b. Is it reasonable to generalize the conclusions from this study to some larger population? If so, what population?

The Pew Research Center conducted a study of gender bias. The report "Men or Women: Who is the Better Leader? A Paradox in Public Attitudes" (pewsocialtrends.org, August 28, 2008) describes how the study was conducted: In the experiment, two separate random samples of more than 1000 registered voters were asked to read a profile sent to them online of a hypothetical candidate for U.S. Congress in their district. 
One random sample of 1161 respondents read a profile of Ann Clark, described as a lawyer, a churchgoer, a member of the local Chamber of Commerce, an environmentalist and a member of the same party as the survey respondent. They were then asked what they liked and didn't like about her, whether they considered her qualified and whether they were inclined to vote for her. There was no indication that this was a survey about gender or gender bias. A second random sample of 1139 registered voters was asked to read a profile of Andrew Clark, who, except for his gender, was identical in every way to Ann Clark. These respondents were then asked the same questions. a. What are the two treatments in this experiment? b. What are the response variables in this experiment?

Consider the study described in the previous exercise. Explain why taking two separate random samples has the same benefits as random assignment to the two treatments in this experiment.

Red wine contains flavonol, an antioxidant thought to have beneficial health effects. But to have an effect, the antioxidant must be absorbed into the blood. The article "Red Wine is a Poor Source of Bioavailable Flavonols in Men" (The Journal of Nutrition [2001]: 745-748) describes a study to investigate three sources of dietary flavonol (red wine, yellow onions, and black tea) to determine the effect of source on absorption. The article included the following statement: "We recruited subjects via posters and local newspapers. To ensure that subjects could tolerate the alcohol in the wine, we only allowed men with a consumption of at least seven drinks per week to participate ... Throughout the study, the subjects consumed a diet that was low in flavonols." a. What are the three treatments in this experiment? b. What is the response variable? c. What are three extraneous variables that the researchers chose to control in the experiment?

Explain why some studies include both a control group and a placebo treatment. What additional comparisons are possible if both a control group and a placebo group are included?

Explain why blinding is a reasonable strategy in many experiments.

Give an example of an experiment for each of the following: a. Single-blind experiment with the subjects blinded b. Single-blind experiment with the individuals measuring the response blinded c. Double-blind experiment d. An experiment for which it is not possible to blind the subjects

The article "Study Points to Benefits of Knee Replacement Surgery Over Therapy Alone" (The New York Times, October 21, 2015) describes a study to compare two treatments for people with knee pain. In the study, 50 people with arthritis received knee replacement surgery followed by a program of exercise. 
Another 50 people with arthritis did not have surgery but received the same program of exercise. After 1 year, 85% of the people who had surgery and 68% of the people who did not have surgery reported pain relief. a. Why would it be important to determine if the researchers randomly assigned the people participating in the study to one of the two groups? b. Explain why you think that the researchers did not include a control group (a group that did not receive surgery and also did not receive an exercise program) in this study.

In an experiment to compare two different surgical procedures for hernia repair ("A Single-Blinded, Randomized Comparison of Laparoscopic Versus Open Hernia Repair in Children," Pediatrics [2009]: 332-336), 89 children were assigned at random to one of the two surgical methods. The methods studied were laparoscopic repair and open repair. In laparoscopic repair three small incisions are made and the surgeon works through these incisions with the aid of a small camera that is inserted through one of the incisions. In the open repair, a larger incision is used to open the abdomen. One of the response variables in this study was the amount of medication that was given after the surgery for the control of pain and nausea. The paper states "For postoperative pain, rescue fentanyl (1 µg/kg) and for nausea, ondansetron (0.1 mg/kg) were given as judged necessary by the attending nurse blinded to the operative approach." a. Why do you think it was important that the nurse who administered the medications did not know which type of surgery was performed? b. Explain why it was not possible for this experiment to be double-blind.

The article "Placebos Are Getting More Effective. Drug Makers Are Desperate to Know Why." (Wired Magazine, August 8, 2009) states that according to research, the color of a tablet can boost the effectiveness even of genuine meds, or help convince a patient that a placebo is a potent remedy. 
Describe how you would design an experiment to investigate if adding color to Tylenol tablets would result in greater perceived pain relief. Be sure to address how you would select subjects, how you would measure pain relief, what colors you would use, and whether or not you would include a control group in your experiment.

The article "Yes That Miley Cyrus Biography Helps Learning" (The Globe and Mail, August 5, 2010) describes an experiment investigating whether providing summer reading books to low-income children would affect school performance. Subjects in the experiment were 1330 children randomly selected from first and second graders at low-income schools in Florida. A group of 852 of these children were selected at random from the group of 1330 participants to be in the book group. The other 478 children were assigned to the control group. Children in the book group were invited to a book fair in the spring to choose any 12 reading books which they could then take home. Children in the control group were not given any reading books but were given some activity and puzzle books. This process was repeated each year for 3 years until the children reached third and fourth grade. The researchers then compared reading test scores of the two groups. a. Do you think that randomly selecting 852 of the 1330 children to be in the book group is equivalent to random assignment of the children to the two experimental groups? Explain. b. Explain the purpose of including a control group in this experiment.

Suppose that the researchers who carried out the experiment described in the previous exercise thought that sex might be a potentially confounding variable. If 700 of the children participating in the experiment were females and 630 were males, describe how blocking could be incorporated into the experiment. 
Be specific about how you would assign the children to treatment groups.

The article "Doctor Dogs Diagnose Cancer by Sniffing It Out" (Knight Ridder Newspapers, January 9, 2006) reports the results of an experiment described in the journal Integrative Cancer Therapies. In this experiment, dogs were trained to distinguish between people with breast and lung cancer and people without cancer by sniffing exhaled breath. Dogs were trained to lie down if they detected cancer in a breath sample. After training, the dogs' ability to detect cancer was tested using breath samples from people whose breath had not been used in training the dogs. The paper states, "The researchers blinded both the dog handlers and the experimental observers to the identity of the breath samples." Explain why this blinding is an important aspect of the design of this experiment.

Pismo Beach, California, has an annual clam festival that includes a clam chowder contest. Judges rate clam chowders from local restaurants, and the judging is done in such a way that the judges are not aware of which chowder is from which restaurant. One year, much to the dismay of the seafood restaurants on the waterfront, Denny's chowder was declared the winner! (When asked what the ingredients were, the cook at Denny's said he wasn't sure; he just had to add the right amount of nondairy creamer to the soup stock that he got from Denny's distribution center!) a. Do you think that Denny's chowder would have won the contest if the judging had not been blind? Explain. b. Although this was not an experiment, your answer to Part (a) helps to explain why those measuring the response in an experiment are often blinded. Using your answer in Part (a), explain why experiments are often blinded in this way.

The San Luis Obispo Tribune (May 7, 2002) reported that a new analysis has found that in the majority of trials conducted by drug companies in recent decades, sugar pills have done as well as, or better than, antidepressants.
What effect is being described here? What does this imply about the design of experiments with a goal of evaluating the effectiveness of a new medication?

The article "A Debate in the Dentist's Chair" (San Luis Obispo Tribune, January 28, 2000) described an ongoing debate over whether newer resin fillings are a better alternative to the more traditional silver amalgam fillings. Because amalgam fillings contain mercury, there is concern that they could be mildly toxic and prove to be a health risk to those with some types of immune and kidney disorders. One experiment described in the article used sheep as subjects and reported that sheep treated with amalgam fillings had impaired kidney function. a. In the experiment, a control group of sheep that received no fillings was used but there was no placebo group. Explain why it is not necessary to have a placebo group in this experiment. b. The experiment compared only an amalgam filling treatment group to a control group. What would be the benefit of also including a resin filling treatment group in the experiment? c. Why do you think the experimenters used sheep rather than human subjects?

The article "Effects of Too Much TV Can Be Undone" (USA TODAY, October 1, 2007) included the following paragraph: "Researchers at Johns Hopkins Bloomberg School of Public Health report that it's not only how many hours children spend in front of the TV, but at what age they watch that matters. They analyzed data from a national survey in which parents of 2707 children were interviewed first when the children were 30 to 33 months old and again when they were 5 1/2, about their TV viewing and their behavior." a. Is the study described an observational study or an experiment? b. The article says that data from a sample of 2707 parents were used in the study. What other information about the sample would you want in order to evaluate the study?

A mortgage lender routinely places advertisements in a local newspaper.
The advertisements are of three different types: one focusing on low interest rates, one featuring low fees for first-time buyers, and one appealing to people who may want to refinance their homes. The lender would like to determine which advertisement format is most successful in attracting customers to call for more information. a. Describe an experiment that would provide the information needed to make this determination. Be sure to consider extraneous variables, such as the day of the week that the advertisement appears in the paper, the section of the paper in which the advertisement appears, or daily fluctuations in the interest rate. b. What role does random assignment play in your design?

The article "Rethinking Calcium Supplements" (US Airways Magazine, October 2010) describes a study investigating whether taking calcium supplements increases the risk of heart attack. Consider the following four study descriptions. For each study, answer the following five questions:
Study 1: Every heart attack patient and every patient admitted for an illness other than a heart attack during the month of December at a large urban hospital was asked if he or she took calcium supplements. The researchers found that the proportion of heart attack patients who took calcium supplements was significantly higher than the proportion of patients admitted for other illnesses who took calcium supplements.
Study 2: Two hundred people were randomly selected from a list of all people living in Minneapolis who receive Social Security. Each person in the sample was asked whether or not they took calcium supplements. These people were followed for 5 years, and whether or not they had a heart attack during the 5-year period was noted. The researchers found that the proportion of heart attack victims in the group taking calcium supplements was significantly higher than the proportion of heart attack victims in the group not taking calcium supplements.
Study 3: Two hundred people were randomly selected from a list of all people living in Minneapolis who receive Social Security. Each person was asked to participate in a statistical study, and all agreed to participate. Those who had no previous history of heart problems were instructed to take calcium supplements. Those with a previous history of heart problems were instructed not to take calcium supplements. The participants were followed for 5 years, and whether or not they had a heart attack during the 5-year period was noted. The researchers found that the proportion of heart attack victims in the calcium supplement group was significantly higher than the proportion of heart attack victims in the no supplement group.
Study 4: Four hundred people volunteered to participate in a 10-year study. Each volunteer was assigned at random to either group 1 or group 2. Those in group 1 took a daily calcium supplement. Those in group 2 did not take a calcium supplement. The proportion who suffered a heart attack during the 10-year study period was noted for each group. The researchers found that the proportion of heart attack victims in group 1 was significantly higher than the proportion of heart attack victims in group 2.

A pollster for the Public Policy Institute of California explains how the Institute selects a sample of California adults ("It's about Quality, Not Quantity," San Luis Obispo Tribune, January 21, 2000): "That is done by using computer-generated random residential telephone numbers with all California prefixes, and when there are no answers, calling back repeatedly to the original numbers selected to avoid a bias against hard-to-reach people. Once a call is completed, a second random selection is made by asking for the adult in the household who had the most recent birthday. It is as important to randomize who you speak to in the household as it is to randomize the household you select. If you didn't, you'd primarily get women and older people."
Comment on this approach to selecting a sample. How does the sampling procedure attempt to minimize certain types of bias? Are there sources of bias that may still be a concern?

A study in Florida is examining whether health literacy classes and using simple medical instructions that include pictures and avoid big words and technical terms can keep Medicaid patients healthier (San Luis Obispo Tribune, October 16, 2002). Twenty-seven community health centers are participating in the study. For 2 years, half of the centers will administer standard care. The other centers will have patients attend classes and will provide special health materials that are easy to understand. Explain why it is important for the researchers to assign the 27 centers to the two groups (standard care and classes with simple health literature) at random.

The press release "Men Need to Man Up, According to Ball Park Brand Survey" (PR Newswire, October 14, 2015) describes the results of a study in which 1012 U.S. men were asked a number of questions about life's tough conversations. One result from this survey was summarized in a USA TODAY Snapshot (USA TODAY, November 6, 2015) that said that "nearly 1 in 5 men would pay someone to handle their breakup for them." a. Is the study described an observational study or an experiment? b. Give at least one reason why the conclusion that nearly 1 in 5 men would pay someone to handle their breakup for them may not generalize to the population of all U.S. men.

A news release from Intel, "Intel Security's International Internet of Things Smart Home Survey Shows Many Respondents Sharing Personal Data for Money" (March 30, 2016, newsroom.intel.com/news-releases/intel-securitys-international-internet-of-things-smart-home-survey/, retrieved September 25, 2016), described a survey conducted in 2015.
The news release states, "A total of 9,000 consumers were interviewed globally, including 2,500 from the United States, 1,000 from the United Kingdom, 1,000 from France, 1,000 from Germany, 1,000 from Brazil, 1,000 from India, 500 from Canada, 500 from Mexico and 500 from Australia." Among the findings from the survey were that 54% of the respondents worldwide would be willing to share personal data collected from devices in their homes with companies in exchange for money. Do you think that the study described was an observational study or an experiment?

USA TODAY (August 25, 2015) reported that American women favor Kate Middleton as a shopping buddy over Michelle Obama by 10 percentage points. This statement was based on a study in which 1001 adults were surveyed about their shopping preferences. Describe any potential sources of bias that might limit the researchers' ability to draw conclusions about American women based on the data collected in this survey.

The paper "Effect of a Nutritional Supplement on Hair Loss in Women" (Journal of Cosmetic Dermatology [2015]: 76-82) describes an experiment to see if a dietary supplement consisting of Omega 3, Omega 6, and antioxidants could reduce hair loss in women with stage 1 hair loss. One hundred twenty women volunteered to participate in the study and were randomly assigned to either the supplement group or a control group. The women in the supplement group took the supplement for 6 months. Photos of the top of the head were taken of all the women at the beginning of the study and 6 months later at the end of the study. The two photos of each woman were evaluated by an independent expert who visually determined the change in hair density. The expert who determined the change in hair density did not know which of the women had taken the supplement. a. Evaluate this experimental design. Do you think this is a good design or a poor design, and why? b.
If you were designing such a study, what, if anything, would you propose to do differently?

A manufacturer of clay roofing tiles would like to investigate the effect of clay type on the proportion of tiles that crack in the kiln during firing. Two different types of clay are to be considered. One hundred tiles can be placed in the kiln at any one time. Firing temperature varies slightly at different locations in the kiln, and firing temperature may also affect cracking. a. Discuss the design of an experiment to collect information that could be used to decide between the two clay types. b. How does your proposed design deal with the extraneous variable temperature?

A tropical forest survey conducted by Conservation International included the following statements in the material that accompanied the survey: "A massive change is burning its way through the earth's environment. The band of tropical forests that encircle the earth is being cut and burned to the ground at an alarming rate. Never in history has mankind inflicted such sweeping changes on our planet as the clearing of rain forest taking place right now!" The survey that followed included the questions given in Parts (a)-(d) below. For each of these questions, identify a word or phrase that might affect the response and possibly bias the results of any analysis of the responses. a. Did you know that the world's tropical forests are being destroyed at the rate of 80 acres per minute? b. Considering what you know about vanishing tropical forests, how would you rate the problem? c. Do you think we have an obligation to prevent the man-made extinction of animal and plant species? d.
Based on what you know now, do you think there is a link between the destruction of tropical forests and changes in the earth's atmosphere?

Each person in a nationally representative sample of 1252 young adults age 23 to 28 years old was asked how they viewed their "financial physique" (financial health) ("2009 Young Adults Money Survey Findings," Charles Schwab, 2009). "Toned and fit" was chosen by 18% of the respondents, while 55% responded "a little bit flabby," and 27% responded "seriously out of shape." Summarize this information in a pie chart. (Hint: See Examples 3.2 and 3.3.)

The graphical display on the next page is similar to one that appeared in USA TODAY (October 22, 2009). It summarizes survey responses to a question about whether visiting social networking sites is allowed at work. Which of the graph types introduced in this section is used to display the responses? (USA TODAY frequently adds artwork and text to their graphs to try to make them look more interesting.)

The survey referenced in the previous exercise was conducted by Robert Half Technology. This company issued a press release ("Whistle-But Don't Tweet-While You Work," roberthalftechnology.com, October 6, 2009) that provided more detail than in the USA TODAY graph. The actual question asked was "Which of the following most closely describes your company's policy on visiting social networking sites, such as Facebook, MySpace and Twitter, while at work?" The responses are summarized in the following table: a. Explain how the survey response categories and corresponding relative frequencies were used or modified to produce the graphical display in the previous exercise. b. Using the data in the table, construct a segmented bar chart. (Hint: See Example 3.5.) c. What are two other types of graphical displays that would be appropriate for summarizing these data?

The National Confectioners Association asked 1006 adults the following question: "Do you set aside a personal stash of Halloween candy?" Fifty-five percent of those surveyed responded no, 41% responded yes, and 4% either did not answer the question or said they did not know (USA TODAY, October 22, 2009). Use the given information to construct a pie chart.

College student attitudes about e-books were investigated in a survey of 1625 students. Students were asked to indicate their level of agreement with the following statement: "I would like to be able to get all my textbooks in digital form." The responses are summarized in the accompanying table. (The Chronicle of Higher Education, August 23, 2013) a. Construct an appropriate graphical display to summarize the information given in the table. b. Write a headline that would be appropriate for a newspaper article that summarized the results of this survey.

The Center for Science in the Public Interest evaluated school cafeterias in 20 school districts across the United States. Each district was assigned a numerical score on the basis of rigor of food codes, frequency of food safety inspections, access to inspection information, and the results of cafeteria inspections. Based on the score assigned, each district was also assigned one of four grades. The scores and grades are summarized in the accompanying table, which appears in the report "Making the Grade: An Analysis of Food Safety in School Cafeterias" (cspi.us/new/pdf/makingthegrade.pdf, 2007). a. Two variables are summarized in the figure, grade and overall score. Is overall score a numerical or categorical variable?
Is grade (indicated by the different colors in the figure) a numerical or categorical variable? b. Explain how the figure is equivalent to a segmented bar chart of the grade data.

Using the data given in the previous exercise, construct a dotplot of the overall score data. Based on the dotplot, suggest an alternate assignment of grades (top of class, passing, etc.) to the 20 school districts. Explain the reasoning you used to make your assignment. (Hint: Dotplots were covered in Section 1.4.)

The article "Housework around the World" (USA TODAY, September 15, 2009) included the percentage of women who say their spouses never help with household chores for five different countries. Display the information in the accompanying table in a bar chart.

The authors of the report "Findings from the 2009 Administration of the College Senior Survey" (Higher Education Research Institute, 2010) asked a large number of college seniors how they would rate themselves compared to the average person of their age with respect to physical health. The accompanying relative frequency table summarizes the responses for men and women. a. Construct a comparative bar chart of the responses that allows you to compare the responses of men and women. b. There were 8110 men and 15,260 women who responded to the survey.
Explain why it is important that the comparative bar chart be constructed using the relative frequencies rather than the actual numbers of people (the frequencies) responding in each category. c. Write a few sentences commenting on how college seniors perceive themselves with respect to physical health and how men and women differ in their perceptions.

The survey on student attitude toward e-books described in Exercise 3.5 was conducted in 2011. A similar survey was also conducted in 2012 (The Chronicle of Higher Education, August 23, 2013). Data from 1588 students who participated in the 2012 survey are summarized in the accompanying table. a. Use these data and the data from Exercise 3.5 to construct a comparative bar chart that shows the distribution of responses for the two years. (Hint: See Example 3.1.) b. Based on your graph from Part (a), do you think there was much of a change in attitude toward e-books from 2011 to 2012?

During 2017, Gallup conducted a survey of adult Americans and asked the following question: "What was the main reason you decided to enroll in the school or college where you completed your highest level of education?" ("Why Higher Ed?", Gallup, Inc., January 2018). The responses are summarized in the accompanying table. a. Construct a pie chart to summarize these data. b. Construct a bar chart to summarize these data. c.
Which of these charts, a pie chart or a bar chart, best summarizes the important information? Explain.

An article about college loans ("New Rules Would Protect Students," USA TODAY, June 16, 2010) reported the percentage of students who had defaulted on a student loan within 3 years of when they were scheduled to begin repayment. Information was given for public colleges, private non-profit colleges, and for-profit colleges. a. Construct a comparative bar chart that would allow you to compare loan status for the three types of colleges. b. The article states "those who attended for-profit schools were more likely to default than those who attended public or private non-profit schools." What aspect of the comparative bar chart supports this statement?

The report "Findings From the 2014 College Senior Survey" (Higher Education Research Institute, December 2014) summarizes data collected from more than 13,000 college seniors across the United States. One question in the survey asked students to rate themselves based on their critical thinking skills. For engineering majors, 60.4% rated critical thinking as a major strength while 39.6% did not see critical thinking as a major strength. Data were also provided for humanities majors, social science majors, biological sciences majors, and business majors, and these data are summarized in the accompanying table.
Construct a comparative bar chart and compare the responses over the five different majors.

The National Center for Health Statistics provided the data in the accompanying table in the report "National Vital Statistics Report" (January 5, 2017, cdc.gov/nchs/data/nvsr/nvsr66/nvsr66_01.pdf, retrieved February 17, 2018). Entries in the table are the birth rates (births per 1000 of population) for the year 2015. Construct a stem-and-leaf display using stems 9, 10, 11, ..., 16. Comment on the interesting features of the display. (Hint: See Example 3.9.)

The paper "State-Level Cancer Mortality Attributable to Cigarette Smoking in the United States" (JAMA Internal Medicine [2016]: 1792-1798) included the following state estimates of the total number of cancer deaths attributable to cigarette smoking in 2014. a. Construct a stem-and-leaf display using thousands as the stems and truncating the leaves to the tens digit. b. Write a few sentences describing the shape of the distribution and any unusual observations. c. The four largest values were for California, Texas, Florida, and New York. Does this indicate that cancer deaths due to cigarette smoking is more of a problem in these states than elsewhere? Explain. d. If you wanted to compare states on the basis of the cancer deaths due to cigarette smoking, would you use the data in the given table? If yes, explain why this would be reasonable. If no, what would you use instead as the basis for the comparison?

The accompanying data on seat belt use for each of the 50 U.S. states and the District of Columbia are from "Traffic Safety Facts" (National Highway Traffic Safety Administration, June 2015). The observations represent the percentage of drivers wearing seat belts in a large nationwide observational survey. a. The values in the data set range from 68.9% to 97.8%. Construct a stem-and-leaf display that uses repeated stems 6H, 7L, 7H, ..., 9H. (Hint: See Example 3.10.) b.
Write a few sentences commenting on what the stem-and-leaf display suggests about seat belt use.

The previous exercise gave data on seat belt use for each of the 50 U.S. states and the District of Columbia ("Traffic Safety Facts," National Highway Traffic Safety Administration, June 2015). The observations represent the percentage of drivers wearing seat belts in a large nationwide observational survey. Some, but not all, states enforce seat belt laws. Below are the seat belt usage data divided into two groups: states that enforce seat belt laws and those that do not. States with Seat Belt Law Enforcement: States without Seat Belt Law Enforcement: a. Construct a comparative stem-and-leaf display using the tens digit as the stem and truncating the leaves to a single digit (the ones digit). b. Write a few sentences commenting on similarities or differences in the seat belt use distributions for states with seat belt enforcement and states without seat belt enforcement.

The U.S. Department of Health and Human Services reported the estimated percentage of households with only wireless phone service (no land line) in 2014 for the 50 U.S. states and the District of Columbia (cdc.gov/nchs/data/nhis/earlyrelease/wireless_state_201602.pdf, retrieved February 17, 2018). In the accompanying data table, each state was also classified into one of three geographical regions: West (W), Middle states (M), and East (E). a. Construct a stem-and-leaf display for the wireless percentage using the data from all 50 states and the District of Columbia. What is a typical value for this data set? b. Construct a comparative stem-and-leaf display for the wireless percentage of the states in the West and the states in the East. How do the distributions of wireless percentages compare for states in the East and states in the West?
(Hint: See Example 3.11.)

The article "Economy Low, Generosity High" (USA TODAY, July 28, 2009) noted that despite a weak economy in 2008, more Americans volunteered in their communities than in previous years. Based on census data (volunteeringinamerica.gov), the top and bottom five states in terms of percentage of the population who volunteered in 2008 were identified. The top five states were Utah (43.5%), Nebraska (38.9%), Minnesota (38.4%), Alaska (38.0%), and Iowa (37.1%). The bottom five states were New York (18.5%), Nevada (18.8%), Florida (19.6%), Louisiana (20.1%), and Mississippi (20.9%). a. For the data set that includes the percentage who volunteered in 2008 for each of the 50 states, what is the largest value? What is the smallest value? b. If you were going to construct a stem-and-leaf display for the data set consisting of the percentage who volunteered in 2008 for the 50 states, what stems would you use to construct the display? Explain your choice.

The U.S. gasoline tax per gallon data for each of the 50 states and the District of Columbia in 2015 were obtained from the U.S. Energy Information Administration (eia.gov/tools/faqs/faq.cfm?id=10&t=10, retrieved April 17, 2017). a. Construct a stem-and-leaf display of these data. b. Based on the stem-and-leaf display, what do you notice about the center and spread of the data distribution? c. Do any values in the data set stand out as unusual? If so, which states correspond to the unusual observations, and how do these values differ from the rest?

A report from Texas Transportation Institute (Texas A&M University System, 2005) titled "Congestion Reduction Strategies" included the accompanying data on extra travel time for peak travel time in hours per year per traveler for different-sized urban areas. a. Construct a comparative stem-and-leaf plot for extra travel time per traveler for the two different sizes of urban areas. b. Is the following statement consistent with the display constructed in Part (a)?
Explain. "The larger the urban area, the greater the extra travel time during peak period travel."

The percentage of teens not in school or working in 2010 for the 50 states were given in the 2012 Kids Count Data Book (aecf.org) and are shown in the following table: Note that the percentages range from a low of 4% to a high of 15%. In constructing a stem-and-leaf display for these data, if we regard each percentage as a two-digit number and use the first digit for the stem, then there are only two possible stems, 0 and 1. One solution is to use repeated stems. Consider a scheme that divides the leaf range into five parts: 0 and 1, 2 and 3, 4 and 5, 6 and 7, and 8 and 9. Then, for example, stem 0 could be repeated as Construct a stem-and-leaf display for this data set that uses stems 0t 0f0s0 and 1, 1t, and 1f. Comment on the important features of the display. (Hint: See Example 3.10.)

The data in the accompanying table are from the Organization for Economic Co-operation and Development (data.oecd.org/eduatt/population-with-tertiary-education.htm, retrieved February 18, 2018). Entries in the table are the percentage of 25- to 34-year-old people who have completed a 4-year college degree for 27 countries in 2016. a. Construct a histogram of these data using the class intervals 20 to 30, 30 to 40, ..., 60 to 70. (Hint: See Example 3.16.) b. Write a few sentences describing the shape, center, and variability of the distribution.

The accompanying data on annual maximum wind speed (in meters per second) in Hong Kong for each year in a 45-year period were given in an article that appeared in the journal Renewable Energy (March, 2007). a. Use the annual maximum wind speed data to construct a histogram. b. Is the histogram approximately symmetric, positively skewed, or negatively skewed? c.
Would you describe the histogram as unimodal, bimodal, or multimodal?

The accompanying relative frequency table is based on data from the 2016 College Bound Seniors Report (collegeboard.org, retrieved February 18, 2018). a. Construct a relative frequency histogram for SAT critical reading score for males. b. Construct a relative frequency histogram for SAT critical reading score for females. c. Based on the histograms from Parts (a) and (b), write a few sentences commenting on the similarities and differences in the distribution of SAT critical reading scores for males and females.

The data in the accompanying table represent the percentage of workers who are members of a union for each U.S. state and the District of Columbia (AARP Bulletin, September 2009). a. Construct a histogram of these data using class intervals of 0 to 5, 5 to 10, 10 to 15, 15 to 20, and 20 to 25. b. Construct a dotplot of these data. Comment on the interesting features of the plot. (Hint: Dotplots were covered in Section 1.4.) c. For this data set, which is a more informative graphical display, the dotplot from Part (b) or the histogram constructed in Part (a)? Explain.

Construct a histogram for the data in the previous exercise using about twice as many class intervals. Use 2.5 to 5 as the first class interval. Write a few sentences that explain why this histogram does a better job of displaying this data set than the histogram in the previous exercise.

The following two relative frequency distributions are based on data that appeared in The Chronicle of Higher Education (August 23, 2013). The data are from a survey of students at four-year colleges. One relative frequency distribution is for the number of hours spent online at social network sites in a typical week. The second relative frequency distribution is for the number of hours spent playing video and computer games in a typical week. a. Construct a histogram for the social media data.
For purposes of constructing the histogram, assume that none of the students in the sample spent more than 40 hours on social media in a typical week and that the last interval can be regarded as 21 to 40. Be sure to use the density scale when constructing the histogram. (Hint: See Example 3.17.) b. Construct a histogram for the video and computer game data. Use the same scale that you used for the histogram in Part (a) so that it will be easy to compare the two histograms. c. Comment on the similarities and differences in the histograms from Parts (a) and (b).
U.S. Census data for San Luis Obispo County, California, were used to construct the following relative frequency distribution for commute time (in minutes) of working adults in 2015 (datausa.io/profile/geo/san-luis-bispo-paso-robles-ca-metro-area/#housing, retrieved February 18, 2018; the values are only approximate): a. Notice that not all intervals in the frequency distribution are equal in width. Why do you think that unequal width intervals were used? b. Construct a table that adds a density column to the given relative frequency distribution. (Hint: See Example 3.17.) c. Use the densities computed in Part (b) to construct a histogram for this data set. (The web site referenced earlier actually displays an incorrectly drawn histogram based on relative frequencies rather than densities!) Write a few sentences commenting on the important features of the histogram.
Use the commute time data given in the previous exercise to complete the following: a. Calculate the cumulative relative frequencies, and construct a cumulative relative frequency plot. b. Use the cumulative relative frequency plot constructed in Part (a) to answer the following questions. i. Approximately what proportion of commute times were less than 50 minutes? ii. Approximately what proportion of commute times were greater than 22 minutes? iii. 
What is the approximate commute time value that separates the shortest 50% of commute times from the longest 50%?
The report Trends in College Pricing 2012 (collegeboard.com) included the information in the accompanying relative frequency distributions for public and for private not-for-profit four-year college students. a. Construct a relative frequency histogram for tuition and fees for students at public four-year colleges. Write a few sentences describing the distribution of tuition and fees, commenting on center, variability, and shape. b. Construct a relative frequency histogram for tuition and fees for students at private not-for-profit four-year colleges. Use the same scale for the vertical and horizontal axes as you used for the histogram in Part (a). Write a few sentences describing the distribution of tuition and fees for students at private not-for-profit four-year colleges. c. Write a few sentences describing the differences in the distributions.
An exam is given to students in an introductory statistics course. What is likely to be true of the shape of the histogram of scores if: a. 
the exam is quite easy? b. the exam is quite difficult? c. half the students in the class have had calculus, the other half have had no prior college math courses, and the exam emphasizes mathematical manipulation? Explain your reasoning in each case.
The accompanying frequency distribution summarizes data on the number of times smokers who had successfully quit smoking attempted to quit before their final successful attempt (Demographic Variables, Smoking Variables, and Outcome Across Five Studies, Health Psychology [2007]: 278-287). Assume that no one had made more than 10 unsuccessful attempts, so that the last entry in the frequency distribution can be regarded as 5-10 attempts. Summarize this data set using a histogram. Be careful: the class intervals are not all the same width, so a density scale should be used for the histogram. Also remember that for a discrete variable, the bar for 1 will extend from 0.5 to 1.5. Think about what this will mean for the bars for the 3-4 group and the 5-10 group.
Example 3.19 used annual rainfall data for Albuquerque, New Mexico, to construct a relative frequency distribution and cumulative relative frequency plot. The National Climate Data Center also gave the accompanying annual rainfall (in inches) for Medford, Oregon, from 1950 to 2008. a. Construct a relative frequency distribution for the Medford rainfall data. b. Use the relative frequency distribution of Part (a) to construct a histogram. Describe the shape of the histogram.
Use the relative frequency distribution constructed in the previous exercise to answer the following questions. a. Construct a cumulative relative frequency plot for the Medford rainfall data. b. Use the cumulative relative frequency plot of Part (a) to answer the following questions: i. Approximately what proportion of years had annual rainfall less than 15.5 inches? ii. Approximately what proportion of years had annual rainfall less than 25 inches? iii. 
Approximately what proportion of years had annual rainfall between 17.5 and 25 inches?
37E
Use the cumulative relative frequencies given in the previous exercise to complete the following: a. Calculate the relative frequencies for each class interval and construct a relative frequency distribution. b. Summarize the survival time data using a histogram. c. Based on the histogram, write a few sentences describing survival time of the stage 2 myeloma patients in this study. d. What additional information would you need in order to decide if it is reasonable to generalize conclusions about survival time from the group of patients in the study to all patients younger than 50 years old who are diagnosed with multiple myeloma and who receive high dose chemotherapy? 3.37 The authors of the paper Myeloma in Patients Younger than Age 50 Years Presents with More Favorable Features and Shows Better Survival (Blood [2008]: 4039-4047) studied patients who had been diagnosed with stage 2 multiple myeloma prior to the age of 50. For each patient who received high dose chemotherapy, the number of years that the patient lived after the therapy (survival time) was recorded. The cumulative relative frequencies in the accompanying table were approximated from survival graphs that appeared in the paper. a. Use the given information to construct a cumulative relative frequency plot. b. Use the cumulative relative frequency plot from Part (a) to answer the following questions: i. 
What is the approximate proportion of patients who lived fewer than 5 years after treatment?
Using the five class intervals 100 to 120, 120 to 140, ..., 180 to 200, devise a frequency distribution based on 70 observations whose histogram could be described as follows: a. symmetric b. bimodal c. positively skewed d. negatively skewed
The accompanying table gives data from a survey of new car owners conducted by J.D. Power and Associates (USA TODAY, usatoday.com, March 29, 2016). For each brand of car sold in the United States, data on a quality rating (defects per 100 cars, so lower numbers indicate higher quality) and a customer satisfaction rating (called the APEAL rating) are given in the accompanying table. The APEAL rating is a score between 0 and 1000, with higher values indicating greater satisfaction. a. Construct a scatterplot of x = quality rating and y = APEAL rating. (Hint: See Example 3.21.) b. Does customer satisfaction (as measured by the APEAL rating) appear to be related to car quality? Explain.
Consumer Reports Health (consumerreports.org) gave the accompanying data on saturated fat (in grams), sodium (in mg), and calories for 36 fast-food items. a. Construct a scatterplot using y = calories and x = fat. Does it look like there is a relationship between fat and calories? Is the relationship what you expected? Explain. b. Construct a scatterplot using y = calories and x = sodium. Write a few sentences commenting on the difference between the relationship of calories to fat and calories to sodium. c. Construct a scatterplot using y = sodium and x = fat. Does there appear to be a relationship between fat and sodium? d. Add a vertical line at x = 3 and a horizontal line at y = 900 to the scatterplot in Part (c). This divides the scatterplot into four regions, with some of the points in the scatterplot falling into each of the four regions. Which of the four regions corresponds to healthier fast-food choices? 
Explain.
Consumer Reports rated 29 fitness trackers (such as Fitbit and Jawbone) on factors such as ease of use and accuracy of step count to obtain an overall score (consumerreports.org, retrieved October 13, 2016). The accompanying table gives price and overall score for these 29 fitness trackers. a. Construct a scatterplot using y = overall score and x = price. b. Based on the scatterplot from Part (a), does there appear to be a relationship between price and overall score? Does the scatterplot support the statement that the more expensive fitness trackers tended to receive higher overall scores?
Consumer Reports (consumerreports.org) rated 37 different models of laptops that were for sale in 2015. An overall score was assigned to each model based on consideration of several factors, including performance, portability, and battery life. Data on price and overall score were used to construct the following scatterplot. Would you describe the relationship between price and overall score for these laptops as positive (overall score tends to increase as price increases) or negative (overall score tends to decrease as price increases)? Explain.
The Solid Waste Management section of the Environmental Protection Agency Report on the Environment (epa.gov/roe/, retrieved April 17, 2017) included a graph similar to the accompanying graph. The report also included the following statement: The last several decades have seen steady growth in recycling and composting, while the total amounts landfilled peaked in 1990 (145 MT) and have generally declined since then (134 MT in 2013). Explain how the time series plot is consistent or is not consistent with the given statement. EXHIBIT 1. 
Municipal solid waste generated and managed in the U.S., 1960-2013
The report Daily Cigarette Use: Indicators on Children and Youth (Child Trends Data Bank, childtrends.org/wp-content/uploads/2012/11/03_Smoking_new.pdf, retrieved April 17, 2017) included the accompanying data on the percentage of students who report smoking cigarettes daily, for students in grades 8, 10, and 12. a. Construct a time series plot for students in grade 12, and comment on any trend over time. b. Construct a time series plot that shows trends over time for each of the three grade levels. Graph each of the three time series on the same set of axes, using different colors to distinguish the different grade levels. Either label the time series in the plot or include a legend to indicate which time series corresponds to which grade level. c. Write a paragraph based on the plot from Part (b). Discuss the similarities and differences for the three grade levels.
The accompanying time series plot of movie box office totals (in millions of dollars) over 18 weeks of summer for both 2001 and 2002 is similar to one that appeared in USA TODAY (September 3, 2002): Patterns that tend to repeat on a regular basis over time are called seasonal patterns. Describe any seasonal patterns that you see in the summer box office data. (Hint: Look for patterns that seem to be consistent from year to year.)
The accompanying comparative bar chart is similar to one in the report More and More Teens on Cell Phones (Pew Research Center, pewresearch.org, August 19, 2009). Chart title: Older teens more likely to own cell phones. Percentage of teen cell phone owners by age, 2004-2008. All data based on teens ages 12-17. Source: Pew Internet and American Life Project, Gaming and Civic Engagement Survey of Teens/Parents, Nov. 2007-Feb. 2008. N = 1,102 and margin of error is 3%. Margin of error for teens in the Oct.-Nov. 2004 survey is 3% (n = 1,100), and margin of error for the Oct.-Nov. 2008 survey is 4% (n = 935). 
Source: Pew Internet and American Life Project. Suppose that you plan to include this graph in an article that you are writing for your school newspaper. Write a few paragraphs that could accompany the graph. Be sure to address what the graph reveals about how teen cell phone ownership is related to age and how it has changed over time.
The figure at the top left of the next page is from the Fall 2008 Census Enrollment Report at Cal Poly, San Luis Obispo. It uses both a pie chart and a segmented bar chart to summarize data on ethnicity for students enrolled at the university in Fall 2008. a. Use the information in the graphical display to construct a single segmented bar chart for the ethnicity data. b. Do you think that the original graphical display or the one you created in Part (a) is more informative? Explain your choice. c. Why do you think that the original graphical display format (combination of pie chart and segmented bar chart) was chosen over a single pie chart with 7 slices?
The figure at the top right of the next page is similar to one that appeared in USA TODAY (August 5, 2008). This graph is a modified comparative bar chart. Most likely, the modifications (incorporating hands and the earth) were made to try to make a display that readers would find more interesting. a. Use the information in the USA TODAY graph to construct a traditional comparative bar chart. b. Explain why the modifications made in the USA TODAY graph may make interpretation more difficult than with the traditional comparative bar chart.
The two graphical displays below are similar to ones that appeared in USA TODAY (June 8, 2009 and July 28, 2009). One is an appropriate representation and the other is not. For each of the two, explain why it is or is not drawn appropriately.
The following graphical display is similar to one that appeared in USA TODAY and is meant to be a comparative bar chart (USA TODAY, August 3, 2009). 
Do you think that this graphical display is an effective summary of the data? If so, explain why. If not, explain why not and construct a display that makes it easier to compare the ice cream preferences of men and women.
Explain why the following graphical display (similar to one appearing in USA TODAY, September 17, 2009) is misleading.
Each year, The Princeton Review conducts surveys of high school students who are applying to college and of parents of college applicants. The report 2016 College Hopes & Worries Survey Findings (princetonreview.com/cms-content/final_cohowo2016survrpt.pdf, retrieved April 15, 2017) included a summary of how 8347 high school students responded to the question Ideally how far from home would you like the college you attend to be? Students responded by choosing one of four possible distance categories. Also included was a summary of how 2087 parents of students applying to college responded to the question How far from home would you like the college your child attends to be? The accompanying relative frequency table summarizes the student and parent responses. a. Explain why relative frequencies should be used when constructing a comparative bar chart to compare ideal distance for students and parents. b. Construct a comparative bar chart for these data. c. Write a few sentences commenting on similarities and differences in the distributions of ideal distance for parents and students.
55CR
56CR
57CR
58CR
Does the size of a transplanted organ matter? A study that attempted to answer this question (Minimum Graft Size for Successful Living Donor Liver Transplantation, Transplantation [1999]: 1112-1116) included a graph similar to the accompanying scatterplot. Graft weight ratio is the weight of the transplanted liver relative to the ideal size liver for the recipient. a. Discuss interesting features of this scatterplot. b. 
Why do you think the overall relationship is negative?
60CR
The article Tobacco and Alcohol Use in G-Rated Children's Animated Films (Journal of the American Medical Association [1999]: 1131-1136) reported exposure to tobacco and alcohol use in all G-rated animated films released between 1937 and 1997 by five major film studios. The researchers found that tobacco use was shown in 56% of the reviewed films. Data on the total tobacco exposure time (in seconds) for films with tobacco use produced by Walt Disney, Inc., were as follows: Data for 11 G-rated animated films showing tobacco use that were produced by MGM/United Artists, Warner Brothers, Universal, and Twentieth Century Fox were also given. The tobacco exposure times (in seconds) for these films were as follows: a. Construct a comparative stem-and-leaf display for these data. b. Comment on the interesting features of this display.
62CR
63CR
Many nutritional experts have expressed concern about the high levels of sodium in prepared foods. The following data on sodium content (in milligrams) per frozen meal appeared in the article Comparison of Light Frozen Meals (Boston Globe, April 24, 1991): Two histograms for these data are shown below and on the next page. a. Do the two histograms give different impressions about the distribution of values? b. Use each histogram to determine the approximate proportion of observations that are less than 800, and compare to the actual proportion.
Americium 241 (241Am) is a radioactive material used in the manufacture of smoke detectors. The article Retention and Dosimetry of Injected 241Am in Beagles (Radiation Research [1984]: 564-575) described a study in which 55 beagles were injected with a dose of 241Am (proportional to each animal's weight). Skeletal retention of 241Am (in microcuries per kilogram) was recorded for each beagle, resulting in the following data: a. Construct a frequency distribution for these data, and draw the corresponding histogram. b. 
Write a short description of the important features of the shape of the histogram.
Does eating broccoli reduce the risk of prostate cancer? According to an observational study from the Fred Hutchinson Cancer Research Center (see the CNN.com web site article titled Broccoli, Not Pizza Sauce, Cuts Cancer Risk, Study Finds, January 5, 2000), men who ate more cruciferous vegetables (broccoli, cauliflower, brussels sprouts, and cabbage) had a lower risk of prostate cancer. This study made separate comparisons for men who ate different levels of vegetables. According to one of the investigators, at any given level of total vegetable consumption, as the percent of cruciferous vegetables increased, the prostate cancer risk decreased. Based on this study, is it reasonable to conclude that eating cruciferous vegetables causes a reduction in prostate cancer risk? Explain.
An article that appeared in USA TODAY (August 11, 1998) described a study on prayer and blood pressure. In this study, 2391 people 65 years or older were followed for 6 years. The article stated that people who attended a religious service once a week and prayed or studied the Bible at least once a day were less likely to have high blood pressure. The researcher then concluded that attending religious services lowers blood pressure. The headline for this article was Prayer Can Lower Blood Pressure. Write a few sentences commenting on the appropriateness of the researcher's conclusion and on the article headline.
Sometimes samples are composed entirely of volunteer responders. Give a brief description of the dangers of using voluntary response samples.
4CRE
More than half of California's doctors say they are so frustrated with managed care they will quit, retire early, or leave the state within three years. This conclusion from an article titled Doctors Feeling Pessimistic, Study Finds (San Luis Obispo Tribune, July 15, 2001) was based on a mail survey conducted by the California Medical Association. 
Surveys were mailed to 19,000 California doctors, and 2000 completed surveys were returned. Describe any concerns you have regarding the conclusion drawn.
Based on observing more than 400 drivers in the Atlanta area, two investigators at Georgia State University concluded that people exiting parking spaces did so more slowly when a driver in another car was waiting for the space than when no one was waiting (Territorial Defense in Parking Lots: Retaliation Against Waiting Drivers, Journal of Applied Social Psychology [1997]: 821-834). a. Describe how you might design an experiment to determine whether this phenomenon is true for your city. b. What is the response variable? c. What are some extraneous variables and how does your design control for them?
An article from the Associated Press (May 14, 2002) led with the headline Academic Success Lowers Pregnancy Risk. The article described an evaluation of a program that involved about 350 students at 18 Seattle schools in high crime areas. Some students took part in a program beginning in elementary school in which teachers showed children how to control their impulses, recognize the feelings of others, and get what they want without aggressive behavior. Others did not participate in the program. The study concluded that the program was effective because by the time young women in the program reached age 21, the pregnancy rate among them was 38%, compared to 56% for the women in the experiment who did not take part in the program. Explain why this conclusion is valid only if the women in the experiment were randomly assigned to one of the two experimental groups.
8CRE
9CRE
10CRE
The article Determination of Most Representative Subdivision (Journal of Energy Engineering [1993]: 44-55) gave data on various characteristics of subdivisions that could be used in deciding whether to provide electrical power using overhead lines or underground lines. 
Data on the variable x = total length of streets within a subdivision are as follows: a. Construct a stem-and-leaf display for these data using the thousands digit as the stem. Comment on the various features of the display. b. Construct a histogram using class boundaries of 0 to 1000, 1000 to 2000, and so on. How would you describe the shape of the histogram? c. What proportion of subdivisions has total length less than 2000? Between 2000 and 4000?
The paper Lessons from Pacemaker Implantations (Journal of the American Medical Association [1965]: 231-232) gave the results of a study that followed 89 heart patients who had received electronic pacemakers. The time (in months) to the first electrical malfunction of the pacemaker was recorded: a. Summarize these data in the form of a frequency distribution, using class intervals of 0 to 6, 6 to 12, and so on. b. Calculate the relative frequencies and cumulative relative frequencies for each class interval of the frequency distribution of Part (a). c. Show how the relative frequency for the class interval 12 to 18 could be obtained from the cumulative relative frequencies. d. Use the cumulative relative frequencies to give approximate answers to the following: i. What proportion of those who participated in the study had pacemakers that did not malfunction within the first year? ii. If the pacemaker must be replaced as soon as the first electrical malfunction occurs, approximately what proportion required replacement between 1 and 2 years after implantation? e. Construct a cumulative relative frequency plot, and use it to answer the following questions. i. What is the approximate time at which 50% of the pacemakers had failed? ii. What is the approximate time at which only 10% of the pacemakers initially implanted were still functioning?
How does the speed of a runner vary over the course of a marathon (a distance of 42.195 km)? 
Consider determining both the time (in seconds) to run the first 5 km and the time (in seconds) to run between the 35 km and 40 km points, and then subtracting the 5-km time from the 35-40-km time. A positive value of this difference corresponds to a runner slowing down toward the end of the race. The histogram below is based on times of runners who participated in several different Japanese marathons (Factors Affecting Runners' Marathon Performance, Chance [Fall 1993]: 24-30). a. What are some interesting features of this histogram? b. What is a typical difference value? c. Roughly what proportion of the runners ran the late distance more quickly than the early distance?
14CRE
One factor in the development of tennis elbow, a malady that strikes fear into the hearts of all serious players of that sport, is the impact-induced vibration of the racket-and-arm system at ball contact. It is well known that the likelihood of getting tennis elbow depends on various properties of the racket used. Consider the accompanying scatterplot of x = racket resonance frequency (in hertz) and y = sum of peak-to-peak accelerations (a characteristic of arm vibration, in meters per second per second) for n = 23 different rackets (Transfer of Tennis Racket Vibrations into the Human Forearm, Medicine and Science in Sports and Exercise [1992]: 1134-1140). Discuss interesting features of the data and of the scatterplot.
An article that appeared in USA TODAY (September 3, 2003) included a graph similar to the one shown here summarizing responses from polls conducted in 1978, 1991, and 2003 in which a sample of American adults were asked whether or not it was a good time or a bad time to buy a house. a. Construct a time series plot that shows how the percentage that thought it was a good time to buy a house has changed over time. b. Add a new line to the plot from Part (a) showing the percentage that thought it was a bad time to buy a house over time. Be sure to label the lines clearly. c. 
Which graph, the given bar chart or the time series plot, best shows the trend over time?
The following are the prices (in dollars) of the six all-terrain truck tires rated most highly by Consumer Reports in 2018 (consumerreports.org, retrieved February 22, 2018): a. Calculate the values of the mean and median. b. Why are these values so different? c. Which of the two, mean or median, appears to be better as a description of a typical value for this data set? (Hint: See Example 4.5.)
The article Caffeine Content of Drinks (caffeineinformer.com/the-caffeine-database, retrieved February 22, 2018) gave the following data on caffeine concentration (mg/ounce) for eight top-selling energy drinks: a. What is the value of the mean caffeine concentration for this set of top-selling energy drinks? b. Coca-Cola has 2.9 mg/ounce of caffeine and Pepsi Cola has 3.2 mg/ounce of caffeine. Write a sentence explaining how the caffeine concentration of top-selling energy drinks compares to that of these colas.
Consumer Reports Health (consumerreports.org/health) reported the accompanying caffeine concentration (mg/cup) for 12 brands of coffee: Use at least one measure of center to compare caffeine concentration for coffee with that of the energy drinks of the previous exercise. (Note: 1 cup = 8 ounces)
Consumer Reports Health (consumerreports.org/health) reported the sodium content (mg) per 2 tablespoon serving for each of 11 different peanut butters: a. Display these data using a dotplot. Comment on any unusual features of the plot. b. Calculate the mean and median sodium content for the peanut butters in this sample. c. The values of the mean and the median for this data set are similar. What aspect of the distribution of sodium content (as pictured in the dotplot from Part (a)) provides an explanation for why the values of the mean and median are similar? 
(Hint: See the discussion of Figure 4.4.)
The article The Wedding Industry's Pricey Little Secret (June 12, 2013, slate.com) stated that the widely reported average wedding cost is grossly misleading. The article reports that in 2012, the average wedding cost was $27,427 and the median cost was $18,086. a. What does the large difference between the mean cost and the median cost tell you about the distribution of wedding costs in 2012? b. Do you agree with the statement that the average wedding cost is misleading? Explain why or why not. c. The article also states the proportion of couples who spent the average or more was actually a minority. Do you agree with this statement? Explain why or why not using the reported values of the mean and median wedding cost.
The state of California defines family income groups in terms of median county income as follows: Extremely low income: below 30% of county median income; Very low income: between 30% and 50% of county median income; Low income: between 50% and 80% of county median income; Moderate income: between 80% and 120% of county median income. For San Luis Obispo County, the median household income in 2015 was $60,691 (slohealthcounts.org/indicators/index/view?indicatorld=3151ocaleld=277, retrieved March 18, 2018). a. Interpret the value of the median household income in 2015 for San Luis Obispo County. b. Each of the following statements is incorrect. For each statement, use the given information to explain why it is incorrect. Statement 1: 30% of the households in San Luis Obispo County would be classified as extremely low income. Statement 2: More than 50% of the households in San Luis Obispo County would be classified as extremely low income or very low income. 
Statement 3: There cannot be any households in San Luis Obispo County that would be classified as having an income that was greater than those in the moderate-income category.
The report State of the News Media 2015 (Pew Research Center, April 29, 2015) published the accompanying circulation numbers for 15 news magazines (such as Time and The New Yorker) for 2014: Explain why the average may not be the best measure of a typical value for this data set.
Each student in a sample of 20 seniors at a particular university was asked if he or she was registered to vote. With R denoting registered and N denoting not registered, the sample data are: a. If being registered to vote is considered a success, what is the value of the proportion of successes for this sample? b. When would it be reasonable to generalize from this sample to the population of all seniors at this university?
The U.S. Department of Transportation reported the number of speed-related crash fatalities for the 15 states that had the highest numbers of these fatalities in 2012 (Traffic Safety Facts 2012 Data, Speeding, May 2014). a. Calculate the mean number of speed-related fatalities for these 15 states. b. Calculate the median number of speed-related fatalities for these 15 states. c. Explain why it is not reasonable to generalize from this sample of 15 states to the other 35 states.
The Ministry of Health and Long-Term Care in Ontario, Canada, publishes information on its web site (health.gov.on.ca) on the time that patients must wait for various medical procedures. For two cardiac procedures completed in fall of 2005, the following information was provided: The median wait time for angioplasty is greater than the median wait time for bypass surgery but the mean wait time is shorter for angioplasty than for bypass surgery. 
What does this suggest about the distribution of wait times for these two procedures? Houses in California are expensive, especially on the Central Coast where the air is clear, the ocean is blue, and the scenery is stunning. The median home price in San Luis Obispo County reached a new high in July 2004, soaring to $452,272 from $387,120 in March 2004 (San Luis Obispo Tribune, April 28, 2004). The article included two quotes from people attempting to explain why the median price had increased. Richard Watkins, chairman of the Central Coast Regional Multiple Listing Services, was quoted as saying, "There have been some fairly expensive houses selling, which pulls the median up." Robert Kleinhenz, deputy chief economist for the California Association of Realtors, explained the volatility of house prices by stating: "Fewer sales means a relatively small number of very high or very low home prices can more easily skew medians." Are either of these statements correct? For each statement that is incorrect, explain why it is incorrect and propose a new wording that would correct any errors in the statement. Consider the following statement: More than 65% of the residents of Los Angeles earn less than the average wage for that city. Could this statement be correct? If so, how? If not, why not? A sample consisting of four pieces of luggage was selected from among the luggage checked at an airline counter, yielding the following data on x = weight (in pounds): x₁ = 33.5, x₂ = 27.3, x₃ = 36.7, x₄ = 30.5. Suppose that one more piece is selected and denote its weight by x₅. Find a value of x₅ such that x̄ = sample median. Suppose that 10 patients with meningitis received treatment with large doses of penicillin. Three days later, temperatures were recorded, and the treatment was considered successful if there had been a reduction in a patient's temperature. Denoting success by S and failure by F, the 10 observations are a. What is the value of the sample proportion of successes? b.
Replace each S with a 1 and each F with a 0. Then calculate x̄ for this numerically coded sample. How does x̄ compare to p̂? c. Suppose that it is decided to include 15 more patients in the study. How many of these would have to be S's to give p̂ = 0.80 for the entire sample of 25 patients? A study of the lifetime (in hours) for a certain brand of light bulb involved putting 10 light bulbs into operation and observing them for 1000 hours. Eight of the light bulbs failed during that period, and those lifetimes were recorded. The lifetimes of the two light bulbs still functioning after 1000 hours were recorded as 1000+. The resulting sample observations were An instructor has graded 19 exam papers submitted by students in a class of 20 students, and the average so far is 70. (The maximum possible score is 100.) How high would the score on the last paper have to be to raise the class average by 1 point? By 2 points? The following data are costs (in cents) per ounce for nine different brands of sliced Swiss cheese (consumerreports.org): a. Calculate the variance and standard deviation for this data set. (Hint: See Example 4.8.) b. If a very expensive cheese with a cost per slice of $1.50 (150 cents) was added to the data set, how would the values of the mean and standard deviation change? Cost per serving (in cents) for six high-fiber cereals rated very good and for nine high-fiber cereals rated good by Consumer Reports are shown below. Write a few sentences describing how these two data sets differ with respect to center and variability. Use summary statistics to support your statements. Cereals Rated Very Good Cereals Rated Good Combining the cost-per-serving data for high-fiber cereals rated very good and those rated good from the previous exercise gives the following data set: a. Calculate the quartiles and the interquartile range for this combined data set. (Hint: See Example 4.9.) b. Calculate the interquartile range for just the cereals rated good.
Is this value greater than, less than, or about equal to the interquartile range computed in Part (a)? 4.18 Cost per serving (in cents) for six high-fiber cereals rated very good and for nine high-fiber cereals rated good by Consumer Reports are shown below. Write a few sentences describing how these two data sets differ with respect to center and variability. Use summary statistics to support your statements. Cereals Rated Very Good Cereals Rated Good 20E The accompanying data are consistent with summary statistics that appeared in the paper Shape of Glass and Amount of Alcohol Poured: Comparative Study of Effect of Practice and Concentration (British Medical Journal [2005]: 1512-1514). Data represent the actual amount poured (in ml) into a tall, slender glass for individuals who were asked to pour 44.3 ml (1.5 ounces). Calculate and interpret the values of the mean and standard deviation. The paper referenced in the previous exercise also gave data on the actual amount poured (in ml) into a short, wide glass for individuals who were asked to pour 44.3 ml (1.5 ounces). a. Calculate and interpret the values of the mean and standard deviation. b. What do the values of the mean amount poured in the short, wide glass and the mean calculated in the previous exercise suggest about the shape of glasses used? The prices (in dollars) of the eight smart phones that were rated highest by Consumer Reports in 2018 (consumerreports.org, retrieved February 23, 2018) were a. Calculate the values of the variance and the standard deviation. b. The standard deviation is quite large. What does that tell you about the prices of these highly rated smart phones? In addition to the prices of the highly rated smart phones given in the previous exercise, Consumer Reports also gave the prices of the seven smart phones that received the lowest ratings.
Those prices (in dollars) were Comment on how the highest rated smart phones and the lowest rated smart phones differ with respect to price and price variability. 4.23 The prices (in dollars) of the eight smart phones that were rated highest by Consumer Reports in 2018 (consumerreports.org, retrieved February 23, 2018) were a. Calculate the values of the variance and the standard deviation. b. The standard deviation is quite large. What does that tell you about the prices of these highly rated smart phones? In an experiment to assess the effect of listening to audiobooks while driving, participants were asked to drive down a straight road in a driving simulator. The accompanying data on time (in milliseconds) to react when a pedestrian walked into the street for 10 drivers listening to an audiobook are consistent with summary statistics and graphs that appeared in the paper Good Distractions: Testing the Effect of Listening to an Audiobook on Driving Performance in Simple and Complex Road Environments (Accident Analysis and Prevention [2018]: 202-209). Calculate the variance and the standard deviation for this data set. The paper referenced in the previous exercise also gave summary statistics and graphs for the reaction time of drivers who were not listening to audiobooks. Data on reaction time (in milliseconds) consistent with those summary statistics for 10 drivers not listening to audiobooks are given here. a. Use the data given in this exercise and the data given in the previous exercise to construct dotplots that would allow you to compare the reaction times for the two groups. b. Based on the dotplots, do you think that the standard deviation of the reaction times for people who are not listening to audiobooks would be less than, about the same as, or greater than the standard deviation that you calculated in the previous exercise for drivers who were listening to audiobooks? Explain your thinking. c.
Calculate the standard deviation of the reaction times for the drivers who were not listening to audiobooks. Is the value of this standard deviation consistent with your answer in Part (b)? d. Describe how the distributions of reaction time differ for drivers who are listening to audiobooks and those who are not. 4.25 In an experiment to assess the effect of listening to audiobooks while driving, participants were asked to drive down a straight road in a driving simulator. The accompanying data on time (in milliseconds) to react when a pedestrian walked into the street for 10 drivers listening to an audiobook are consistent with summary statistics and graphs that appeared in the paper Good Distractions: Testing the Effect of Listening to an Audiobook on Driving Performance in Simple and Complex Road Environments (Accident Analysis and Prevention [2018]: 202-209). Calculate the variance and the standard deviation for this data set. The accompanying data on number of minutes used for cell phone calls in 1 month were generated to be consistent with summary statistics published in a report of a marketing study of San Diego residents (TeleTruth, March 2009): a. Calculate the values of the quartiles and the interquartile range for this data set. b. Explain why the lower quartile is equal to the minimum value for this data set. Will this be the case for every data set? Explain. Give two sets of five numbers that have the same mean but different standard deviations, and give two sets of five numbers that have the same standard deviation but different means. Morningstar is an investment research firm that publishes some online educational materials. The materials for an online course called Looking at Historical Risk (news.morningstar.com/classroom2/course.asp?docld=2927page=2CN=com, retrieved August 3, 2016) included the following paragraph referring to annual return (in percent) for investment funds: Using standard deviation as a measure of risk can have its drawbacks.
It's possible to own a fund with a low standard deviation and still lose money. In reality, that's rare. Funds with modest standard deviations tend to lose less money over short time frames than those with high standard deviations. For example, the one-year average standard deviation among ultrashort-term bond funds, which are among the lowest-risk funds around (other than money market funds), is a mere 0.64%. a. Explain why the standard deviation of percent return is a reasonable measure of unpredictability and why a smaller standard deviation for the percent return of an investment fund means less risk. b. Explain how a fund with a small standard deviation can still lose money. (Hint: Think about the average percent return.) The U.S. Department of Transportation reported the data in the accompanying table on the number of speed-related crash fatalities during holiday periods for the years from 1994 to 2003 (Traffic Safety Facts, July 20, 2005). a. Calculate the standard deviation for the New Year's Day data. b. Without calculating the standard deviation of the Memorial Day data, explain whether the standard deviation for the Memorial Day data would be larger or smaller than the standard deviation of the New Year's Day data. Data For Exercise 4.30 c. Memorial Day and Labor Day are holidays that always occur on Monday and Thanksgiving always occurs on a Thursday, whereas New Year's Day, July 4th, and Christmas do not always fall on the same day of the week every year. Based on the given data, is there more or less variability in the speed-related crash fatality numbers from year to year for same-day-of-the-week holiday periods than for holidays that can occur on different days of the week? Support your answer with appropriate measures of variability. The Ministry of Health and Long-Term Care in Ontario, Canada, publishes information on the time that patients must wait for various medical procedures on its web site (health.gov.on.ca).
For two cardiac procedures completed in fall of 2005, the following information was provided: a. Which of the following must be true for the lower quartile of the data set consisting of the 847 wait times for angioplasty? i. The lower quartile is less than 14. ii. The lower quartile is between 14 and 18. iii. The lower quartile is between 14 and 39. iv. The lower quartile is greater than 39. b. Which of the following must be true for the upper quartile of the data set consisting of the 539 wait times for bypass surgery? i. The upper quartile is less than 13. ii. The upper quartile is between 13 and 19. iii. The upper quartile is between 13 and 42. iv. The upper quartile is greater than 42. c. Which of the following must be true for the number of days for which only 5% of the bypass surgery wait times would be longer? i. It is less than 13. ii. It is between 13 and 19. iii. It is between 13 and 42. iv. It is greater than 42. In 1997, a woman sued a computer keyboard manufacturer, charging that her repetitive stress injuries were caused by the keyboard (Genessey v Digital Equipment Corporation). The jury awarded about $3.5 million for pain and suffering, but the court then set aside that award as being unreasonable compensation. In making this determination, the court identified a normative group of 27 similar cases and specified a reasonable award as one within 2 standard deviations of the mean of the awards in the 27 cases. The 27 award amounts were (in thousands of dollars) What is the maximum possible amount that could be awarded under the 2-standard deviations rule? The standard deviation alone does not measure relative variation. For example, a standard deviation of $1 would be considered large if it is describing the variability from store to store in the price of an ice cube tray. On the other hand, a standard deviation of $1 would be considered small if it is describing store-to-store variability in the price of a particular brand of freezer.
A quantity designed to give a relative measure of variability is the coefficient of variation. Denoted by CV, the coefficient of variation expresses the standard deviation as a percentage of the mean. It is defined by the formula CV = 100(s / x̄). Consider two samples. Sample 1 gives the actual weight (in ounces) of the contents of cans of pet food labeled as having a net weight of 8 ounces. Sample 2 gives the actual weight (in pounds) of the contents of bags of dry pet food labeled as having a net weight of 50 pounds. The weights for the two samples are a. For each of the given samples, calculate the mean and the standard deviation. b. Calculate the coefficient of variation for each sample. Do the results surprise you? Why or why not? Based on a large national sample of working adults, the U.S. Census Bureau reports the following information on travel time to work for those who do not work at home: lower quartile = 7 minutes, median = 18 minutes, upper quartile = 31 minutes. Also given was the mean travel time, which was reported as 22.4 minutes. a. Is the travel time distribution more likely to be approximately symmetric, positively skewed, or negatively skewed? Explain your reasoning based on the given summary quantities. b. Suppose that the minimum travel time was 1 minute and that the maximum travel time in the sample was 205 minutes. Construct a skeletal boxplot for the travel time data. (Hint: See Example 4.10.) c. Were there any mild or extreme outliers in the data set? How can you tell? (Hint: See Example 4.11.) The report Most Licensed Drivers Age 85+: States (bloomberg.com/graphics/best-and-worst/#most-licensed-drivers-age-85-plus-states, retrieved April 20, 2017) gives the percentage of drivers in each state and the District of Columbia in 2011 who were over 85 years of age. a. Find the values of the median, the lower quartile, and the upper quartile. b. The largest value in the data set is 5.10% (Connecticut). Is this state an outlier? (Hint: See Example 4.11.) c.
Construct a modified boxplot for this data set and comment on the interesting features of the plot. How would you describe the shape of the distribution if you don't consider the outlier? Data on the gasoline tax per gallon (in cents) in 2015 for the 50 U.S. states and the District of Columbia are shown below (eia.gov/tools/faqs/faq.cfm?id=10t=10, retrieved September 1, 2016). a. The smallest value in the data set is 9.0 (Alaska) and the largest value is 51.4 (Pennsylvania). Are these values outliers? Explain. (Hint: See Example 4.11.) b. Construct a boxplot of the data set and comment on the interesting features of the plot. The U.S. Department of Health and Human Services reported the estimated percentage of U.S. households with only wireless phone service (no landline) in 2014 for the 50 states and the District of Columbia (cdc.gov/nchs/data/nhis/earlyrelease/wireless_state_201602.pdf, retrieved April 20, 2017). In the accompanying data table, each state was also classified into one of three geographical regions: West (W), Middle states (M), and East (E). a. Construct a comparative boxplot that makes it possible to compare wireless percent for the three geographical regions. b. Does the graphical display in Part (a) reveal any striking differences, or are the distributions similar for the three regions? Fiber content (in grams per serving) for 18 high-fiber cereals (consumerreports.com) are shown below. Fiber Content a. Find the median, quartiles, and interquartile range for the fiber content data set. b. Explain why the minimum value for the fiber content data set and the lower quartile for the fiber content data set are equal. In addition to the fiber contents given in the previous exercise, sugar content (in grams per serving) for the same 18 high-fiber cereals were also given (consumerreports.com). a. Calculate the median, quartiles, and interquartile range for the sugar content data. b.
Are there any outliers in the sugar content data? Use the fiber content and sugar content data given in the previous two exercises to construct a comparative boxplot. Comment on the differences and similarities in the fiber and sugar content distributions. The article The Best and Worst Places to be a Working Woman (The Economist, Graphic Detail for March 3, 2016) reported values of what it calls the glass-ceiling index, which is designed to rate countries based on women's chances of equal treatment at work. The index weights factors that include participation of women in higher education, participation in the workforce by women, pay, childcare cost, and maternity benefits. The best possible value for this index is 100. Data for 29 countries are shown in the accompanying table. a. Are there outliers in this data set? If so, which observations are outliers? b. Draw a modified boxplot for this data set. c. The article points out that Nordic countries (Iceland, Sweden, Norway, and Finland) come out on top on this index. Where are the values for the Nordic countries located in terms of the boxplot? The average playing time of music albums in a large collection is 35 minutes, and the standard deviation is 5 minutes. a. What value is 1 standard deviation above the mean? 1 standard deviation below the mean? What values are 2 standard deviations away from the mean? (Hint: See Example 4.14.) b. Without assuming anything about the distribution of times, at least what percentage of the times are between 25 and 45 minutes? (Hint: See Example 4.15.) c. Without assuming anything about the distribution of times, what can be said about the percentage of times that are either less than 20 minutes or greater than 50 minutes? d. Assuming that the distribution of times is approximately normal, about what percentage of times are between 25 and 45 minutes? less than 20 minutes or greater than 50 minutes?
less than 20 minutes? In a study investigating the effect of car speed on accident severity, 5000 reports of fatal automobile accidents were examined, and the vehicle speed at impact was recorded for each one. For these 5000 accidents, the average speed was 42 mph and the standard deviation was 15 mph. A histogram revealed that the vehicle speed at impact distribution was approximately normal. a. Approximately what proportion of these vehicle speeds were between 27 and 57 mph? (Hint: See Example 4.17.) b. Approximately what proportion of these vehicle speeds exceeded 57 mph? The U.S. Census Bureau (2000 census) reported the following relative frequency distribution for travel time to work for a large sample of adults who did not work at home: a. Draw the histogram for the travel time distribution. In constructing the histogram, assume that the last interval in the relative frequency distribution (90 or more) ends at 200; so the last interval is 90 to 200. Be sure to use the density scale to determine the heights of the bars in the histogram because not all the intervals have the same width. (Hint: Histograms were covered in Chapter 3.) b. Describe the interesting features of the histogram from Part (a), including center, shape, and variability. c. Based on the histogram from Part (a), would it be appropriate to use the Empirical Rule to make statements about the travel time distribution? Explain why or why not. For the travel time distribution given in the previous exercise, the approximate mean and standard deviation for the travel time distribution are 27 minutes and 24 minutes, respectively. Based on this mean and standard deviation and the fact that travel time cannot be negative, explain why the travel time distribution could not be well approximated by a normal curve. The U.S. Census Bureau (2000 census) reported the following relative frequency distribution for travel time to work for a large sample of adults who did not work at home: a.
Draw the histogram for the travel time distribution. In constructing the histogram, assume that the last interval in the relative frequency distribution (90 or more) ends at 200; so the last interval is 90 to 200. Be sure to use the density scale to determine the heights of the bars in the histogram because not all the intervals have the same width. (Hint: Histograms were covered in Chapter 3.) b. Describe the interesting features of the histogram from Part (a), including center, shape, and variability. c. Based on the histogram from Part (a), would it be appropriate to use the Empirical Rule to make statements about the travel time distribution? Explain why or why not. Use the information given in the previous two exercises and Chebyshev's Rule to complete this exercise. a. Make a statement about i. the percentage of travel times that were between 0 and 75 minutes ii. the percentage of travel times that were between 0 and 54 minutes b. How well do the statements in Part (a) based on Chebyshev's Rule agree with the actual percentages for the travel time distribution? (Hint: You can estimate the actual percentages from the relative frequency distribution given in Exercise 4.44.) Mobile homes are tightly constructed for energy conservation. This can lead to a buildup of indoor pollutants. The paper A Survey of Nitrogen Dioxide Levels Inside Mobile Homes (Journal of the Air Pollution Control Association [1988]: 647-651) discussed various aspects of NO2 concentration in these structures. a. For one sample of mobile homes in the Los Angeles area, the mean NO2 concentration in kitchens during the summer was 36.92 ppb, and the standard deviation was 11.34. Making no assumptions about the shape of the NO2 distribution, what can be said about the percentage of observations between 14.24 and 59.60? b. Inside what interval can you be sure that at least 89% of the concentration observations will lie? c.
For a sample of mobile homes that were not in Los Angeles, the average kitchen NO2 concentration during the winter was 24.76 ppb, and the standard deviation was 17.20. Do these values suggest that the histogram of sample observations did not closely resemble a normal curve? (Hint: What is x̄ - 2s?) The article Impact of Berkeley Excise Tax on Sugar-Sweetened Beverage Consumption (American Journal of Public Health [2016]: 1865-1871) estimated the mean number of times that adults in Berkeley, California, drank regular soda per day to be 0.34. The standard deviation for the number of times per day was estimated to be 0.86. Would you use the Empirical Rule to approximate the proportion of adults who drink regular soda more than 1.20 times per day on average (i.e., the proportion of adults in Berkeley whose value exceeds the mean by more than 1 standard deviation)? Explain your reasoning. A student took two national aptitude tests. The national mean and standard deviation were 475 and 100, respectively, for the first test and 30 and 8, respectively, for the second test. The student scored 625 on the first test and 45 on the second test. Use z scores to determine on which exam the student performed better relative to the other test takers. (Hint: See Example 4.18.) Suppose that your younger sister is applying for entrance to college and has taken the SAT. She scored at the 83rd percentile on the verbal section of the test and at the 94th percentile on the math section of the test. Because you have been studying statistics, she asks you for an interpretation of these values. What would you tell her? (Hint: See Example 4.19.) The report Who Borrows Most?
Bachelor's Degree Recipients with High Levels of Student Debt (trends.collegeboard.org/content/who-borrows-most-bachelors-degree-recipients-high-levels-student-debt-april-2010, retrieved April 20, 2017) reported the following percentiles for amount of student debt for those graduating with a bachelor's degree in 2010: For each of these percentiles, write a sentence interpreting the value of the percentile. The paper Study of the Flying Ability of Rhynchophorus ferrugineus Adults Using a Computer-Monitored Mill (Bulletin of Entomological Research [2014]: 462-467) summarized data from a study of red palm weevils, a pest that is a threat to palm trees. The following frequency distribution from the paper was constructed using the longest flight (in meters) observed for 132 weevils. Estimate the approximate values of the following percentiles: a. 54th b. 80th c. 92nd Suppose that the manufacturer of a scale claims that its scale weighs items up to 110 pounds and provides accuracy to within 0.25 ounce. Suppose that a 50-ounce weight was repeatedly weighed on this scale and the weight readings recorded. The mean value was 49.5 ounces, and the standard deviation was 0.1. What can be said about the percentage of the time that the scale actually showed a weight that was within 0.25 ounce of the true value of 50 ounces? (Hint: Use Chebyshev's Rule.) Suppose that your statistics professor returned your first midterm exam with only a z score written on it. She also told you that a histogram of the scores was approximately normal. How would you interpret each of the following z scores? a. 2.2 b. 0.4 c. 1.8 d. 1.0 e. 0 The paper Answer Changing in Multiple Choice Assessment: Change that Answer When in Doubt and Spread the Word (BMC Medical Education [2007]: 28-32) reported that for a group of 72 students, the average number of responses changed from the correct answer to an incorrect answer on a test containing 78 multiple-choice items was 0.9.
The corresponding standard deviation was reported to be 1.0. Based on this mean and standard deviation, what can you tell about the shape of the distribution of the variable number of answers changed from right to wrong? What can you say about the number of students who changed at least three answers from correct to incorrect? Suppose that the average reading speed of students completing a speed-reading course is 450 words per minute (wpm). If the standard deviation is 70 wpm, find the z score associated with each of the following reading speeds. a. 320 wpm b. 475 wpm c. 420 wpm d. 610 wpm The following data values are 2014 per capita operating expenditures on public libraries for each of the 50 U.S. states and the District of Columbia (imls.gov/research-evaluation/data-collection/public-libraries-survey/explore-pls-data, retrieved April 20, 2017): a. Summarize this data set with a frequency distribution. Construct the corresponding histogram. b. Use the histogram in Part (a) to find approximate values of the following percentiles: i. 50th ii. 70th iii. 10th iv. 90th v. 40th The accompanying table gives the mean and standard deviation of reaction times (in seconds) for each of two different stimuli: If your reaction time is 4.2 seconds for the first stimulus and 1.8 seconds for the second stimulus, to which stimulus are you reacting relatively more quickly (compared with other individuals)? (Hint: See Example 4.18.) The authors of the paper Delayed Time to Defibrillation after In-Hospital Cardiac Arrest (New England Journal of Medicine [2008]: 9-16) described a study of how survival is related to the length of time it takes from the time of a heart attack to the administration of defibrillation therapy. The following is a statement from the paper: We identified 6789 patients from 369 hospitals who had in-hospital cardiac arrest due to ventricular fibrillation (69.7%) or pulseless ventricular tachycardia (30.3%).
Overall, the median time to defibrillation was 1 minute (interquartile range [was] 3 minutes). Data from the paper on time to defibrillation (in minutes) for these 6789 patients were used to produce the Minitab output and boxplot at the bottom of the page. a. Why is there no lower whisker in the given boxplot? b. How is it possible for the median, the lower quartile, and the minimum value in the data set to all be equal? (Note: this is why you do not see a median line in the box part of the boxplot.) c. The authors of the paper considered a time to defibrillation of greater than 2 minutes as unacceptable. Based on the given boxplot and summary statistics, is it possible that the percentage of patients having an unacceptable time to defibrillation is greater than 50%? Greater than 25%? Less than 25%? Explain. d. Is the outlier shown at 7 a mild outlier or an extreme outlier? Descriptive Statistics: Time to Defibrillation The paper Portable Social Groups: Willingness to Communicate, Interpersonal Communication Gratifications, and Cell Phone Use among Young Adults (International Journal of Mobile Communications [2007]: 139-156) describes a study of young adult cell phone use patterns. a. Comment on the following quote from the paper. Do you agree with the authors? "Seven sections of an Introduction to Mass Communication course at a large southern university were surveyed in the spring and fall of 2003. The sample was chosen because it offered an excellent representation of the population under study: young adults." b. Below is another quote from the paper. In this quote, the author reports the mean number of minutes of cell phone use per week for those who participated in the survey. What additional information would have been provided about cell phone use behavior if the author had also reported the standard deviation?
"Based on respondent estimates, users spent an average of 629 minutes (about 10.5 hours) per week using their cell phone on or off line for any reason." Acrylamide (a possible cancer-causing substance) forms in high-carbohydrate foods cooked at high temperatures, and acrylamide levels can vary widely even within the same type of food. An article appearing in the journal Food Chemistry (March 2014, pages 204-211) included the following acrylamide content (in nanograms/gram) for five brands of biscuits: a. Calculate the mean acrylamide level and the five deviations from the mean. b. Verify that, except for the effect of rounding, the sum of the deviations from the mean is equal to 0 for this data set. (If you rounded the sample mean or the deviations, your sum may not be exactly zero, but it should be close to zero if you have calculated the deviations correctly.) c. Calculate the variance and standard deviation for this data set. 62CR Because some homes have selling prices that are much higher than most, the median price is usually used to describe a typical home price for a given location. The three accompanying quotes are all from the San Luis Obispo Tribune, but each gives a different interpretation of the median price of a home in San Luis Obispo County. Comment on each of these statements. (Look carefully. At least one of the statements is incorrect.) a. "So we have gone from 23% to 27% of county residents who can afford the median priced home at $278,380 in SLO County. That means that half of the homes in this county cost less than $278,380 and half cost more." (October 11, 2001) b. "The county's median price rose to $285,170 in the fourth quarter, a 9.6% increase from the same period a year ago, the report said. (The median represents the midpoint of a range.)" (February 13, 2002) c. "Your median is going to creep up above $300,000 if there is nothing available below $300,000," Walker said.
(February 26, 2002)
Although bats are not known for their eyesight, they are able to locate prey (mainly insects) by emitting high-pitched sounds and listening for echoes. A paper appearing in Animal Behaviour (The Echolocation of Flying Insects by Bats [1960]: 141-154) gave the following distances (in centimeters) at which a bat first detected a nearby insect: a. Calculate the sample mean distance at which the bat first detects an insect. b. Calculate the sample variance and standard deviation for this data set. Interpret these values.
For the data in the previous exercise, subtract 10 from each sample observation. For the new set of values, calculate the mean and the deviations from the mean. How do these deviations compare to the deviations from the mean for the original sample? How does s² for the new values compare to s² for the old values? In general, what effect does subtracting (or adding) the same number to each observation have on s² and s? Explain. 4.64 Although bats are not known for their eyesight, they are able to locate prey (mainly insects) by emitting high-pitched sounds and listening for echoes. A paper appearing in Animal Behaviour (The Echolocation of Flying Insects by Bats [1960]: 141-154) gave the following distances (in centimeters) at which a bat first detected a nearby insect: a. Calculate the sample mean distance at which the bat first detects an insect. b. Calculate the sample variance and standard deviation for this data set. Interpret these values.
For the data of Exercise 4.64, multiply each data value by 10. How does s for the new values compare to s for the original values? More generally, what happens to s if each observation is multiplied by the same positive constant c? 4.64 Although bats are not known for their eyesight, they are able to locate prey (mainly insects) by emitting high-pitched sounds and listening for echoes. 
A paper appearing in Animal Behaviour (The Echolocation of Flying Insects by Bats [1960]: 141-154) gave the following distances (in centimeters) at which a bat first detected a nearby insect: a. Calculate the sample mean distance at which the bat first detects an insect. b. Calculate the sample variance and standard deviation for this data set. Interpret these values.
The Bloomberg web site included the data in the accompanying table on the number of movies made by 25 Saturday Night Live cast members as of 2014 (bloomberg.com/graphics/best-and-worst/#top-grossing-saturday-night-live-alumni, retrieved April 20, 2017). Also given was the top grossing movie made by each and the gross income for that movie adjusted for inflation. Construct a boxplot for the number of movies data and comment on what the boxplot tells you about the distribution of the number of movies data.
Refer to the data given in the previous exercise. c. Are there any outliers in the inflation-adjusted gross movie income data? If so, which data values are outliers? d. Construct a boxplot for the inflation-adjusted gross movie income data. e. For the inflation-adjusted gross movie income data, the mean is $351.8 million and the median is $322.0 million. What characteristic of the boxplot explains why the mean is greater than the median for this data set? 4.67 The Bloomberg web site included the data in the accompanying table on the number of movies made by 25 Saturday Night Live cast members as of 2014 (bloomberg.com/graphics/best-and-worst/#top-grossing-saturday-night-live-alumni, retrieved April 20, 2017). Also given was the top grossing movie made by each and the gross income for that movie adjusted for inflation. 
Construct a boxplot for the number of movies data and comment on what the boxplot tells you about the distribution of the number of movies data.
Age at diagnosis for each of 20 patients under treatment for meningitis was given in the paper Penicillin in the Treatment of Meningitis (Journal of the American Medical Association [1984]: 1870-1874). The ages (in years) were as follows: a. Calculate the values of the sample mean and the standard deviation. b. Compute the upper quartile, the lower quartile, and the interquartile range. c. Are there any mild or extreme outliers present in this data set? d. Construct the boxplot for this data set.
Suppose that the distribution of scores on an exam can be described by a normal curve with mean 100. The 16th percentile of this distribution is 80. a. What is the 84th percentile? b. What is the approximate value of the standard deviation of exam scores? c. What z score is associated with an exam score of 90? d. What percentile corresponds to an exam score of 140? e. Do you think there were many scores below 40? Explain.
For each of the scatterplots shown, answer the following questions: a. Does there appear to be a relationship between x and y? b. If so, does the relationship appear to be linear? c. If so, would you describe the linear relationship as positive or negative? [Scatterplots 1-4 not shown]
For each of the following pairs of variables, indicate whether you would expect a positive correlation, a negative correlation, or a correlation close to 0. Explain your choice. a. Maximum daily temperature and cooling costs b. Interest rate and number of loan applications c. Amount of fertilizer used per acre and crop yield (Hint: As the amount of fertilizer is increased, yield tends to increase for a while but then tends to start decreasing.)
For each of the following pairs of variables, indicate whether you would expect a positive correlation, a negative correlation, or a correlation close to 0. 
Explain your choice. a. Incomes of husbands and wives when both have full-time jobs b. Height and IQ c. Height and shoe size
For each of the following pairs of variables, indicate whether you would expect a positive correlation, a negative correlation, or a correlation close to 0. Explain your choice. a. Score on the math section of the SAT exam and score on the verbal section of the same test b. Time spent on homework and time spent watching television during the same day by elementary school children
Is the following statement correct? Explain why or why not. (Hint: See Example 5.3.) "A correlation coefficient of 0 implies that no relationship exists between the two variables under study."
Draw a scatterplot for which r = 1.
Draw a scatterplot for which r = -1.
Each year J.D. Power and Associates surveys new car owners 90 days after they purchase their cars. This data is used to rate auto brands (such as Toyota and Ford) on quality and customer satisfaction. USA TODAY (usatoday.com, March 29, 2016) reported a quality rating and a satisfaction score for all 33 brands sold in the United States. a. Construct a scatterplot of y = Satisfaction rating versus x = Quality rating. How would you describe the relationship between x and y? b. Calculate and interpret the value of the correlation coefficient.
The accompanying data are x = Cost (cents per serving) and y = Fiber content (grams per serving) for 18 high-fiber cereals rated by Consumer Reports (consumerreports.org/health). a. Calculate and interpret the value of the correlation coefficient for this data set. (Hint: See Example 5.1.) b. The serving size differed for the different cereals, with serving sizes varying from cup to 1 cups. Converting price and fiber content to per cup rather than per serving results in the accompanying data. 
Is the correlation coefficient for the per cup data greater than or less than the correlation coefficient for the per serving data?
The authors of the paper Flat-footedness Is Not a Disadvantage for Athletic Performance in Children Aged 11 to 15 Years (Pediatrics [2009]: e386-e392) studied the relationship between y = Arch height and scores on a number of different motor ability tests for 218 children. They reported the following correlation coefficients: a. Interpret the value of the correlation coefficient between average hopping height and arch height. What does the fact that the correlation coefficient is negative say about the relationship? Do higher arch heights tend to be paired with higher or lower average hopping heights? b. The title of the paper suggests that having a small value for arch height (flat-footedness) is not a disadvantage when it comes to motor skills. Do the given correlation coefficients support this conclusion? Explain.
The paper The Relationship Between Cell Phone Use, Academic Performance, Anxiety, and Satisfaction with Life in College Students (Computers in Human Behavior [2014]: 343-350) described a study of cell phone use among undergraduate college students at a public university. The paper reported that the value of the correlation coefficient between x = Cell phone use (measured as total amount of time in hours spent using a cell phone on a typical day) and y = GPA (cumulative GPA determined from university records) was r = -0.203. a. Interpret the given value of the correlation coefficient. Does the value of the correlation coefficient suggest that students who use a cell phone for more hours per day tend to have higher GPAs or lower GPAs? b. The study also investigated the correlation between texting (measured as the total number of texts sent and texts received per day) and GPA. 
The direction of the relationship between texting and GPA was the same as the direction of the relationship between cell phone use and GPA, but the relationship between texting and GPA was not as strong. Which of the following possible values for the correlation coefficient between texting and GPA could have been the one observed? r = -0.30, r = -0.10, r = 0.10, r = 0.30 c. The paper included the following statement: "Participants filled in two blanks, one for texts sent and one for texts received. These two texting items were nearly perfectly correlated." Do you think that the value of the correlation coefficient for texts sent and texts received was close to -1, close to 0, or close to +1? Explain your reasoning.
Data from the U.S. Federal Reserve Board (federalreserve.gov/releases/housedebt/, retrieved April 21, 2017) on consumer debt (as a percentage of personal income) and mortgage debt (also as a percentage of personal income) for the 10 years from 2006 to 2015 are shown in the following table: a. What is the value of the correlation coefficient for this data set? b. Is it reasonable to conclude in this case that there is no strong relationship between the variables (linear or otherwise)? Use a graphical display to support your answer.
The article $115K! The 13 Best Paying U.S. Companies (USA TODAY, August 11, 2015) gave the following data on median worker pay (in thousands of dollars) and the 1-year percent change in stock price for the 13 highest paying companies in the United States. a. Construct a scatterplot for these data. b. Calculate and interpret the value of the correlation coefficient. c. The article states that companies that pay more are seeing a payoff in their stock performance. Is this conclusion justified based on these data? Explain. d. Is it reasonable to generalize conclusions based on these data to the population of all companies in the United States? 
Explain why or why not.
It may seem odd, but one of the ways biologists can tell how old a lobster is involves measuring the concentration of a pigment called neurolipofuscin in the eyestalk of a lobster. (We are not making this up!) The authors of the paper Neurolipofuscin Is a Measure of Age in Panulirus argus, the Caribbean Spiny Lobster, in Florida (Biological Bulletin [2007]: 55-66) wondered if it was sufficient to measure the pigment in just one eyestalk, which would be the case if there is a strong relationship between the concentration in the right and left eyestalks. Pigment concentration (as a percentage of tissue sample) was measured in both eyestalks for 39 lobsters, resulting in the following summary quantities (based on data read from a graph that appeared in the paper): $n = 39$, $\sum x = 88.8$, $\sum y = 86.1$, $\sum xy = 281.1$, $\sum x^2 = 288.0$, $\sum y^2 = 286.6$. An alternative formula for calculating the correlation coefficient that is based on raw data and is algebraically equivalent to the one given in the text is
$$r = \frac{\sum xy - \frac{(\sum x)(\sum y)}{n}}{\sqrt{\sum x^2 - \frac{(\sum x)^2}{n}}\,\sqrt{\sum y^2 - \frac{(\sum y)^2}{n}}}$$
Use this formula to calculate the value of the correlation coefficient, and interpret this value.
An auction house released a list of 25 recently sold paintings. Eight artists were represented in these sales. The sale price of each painting also appears on the list. Would the correlation coefficient be an appropriate way to summarize the relationship between artist (x) and sale price (y)? Why or why not?
A sample of automobiles traversing a certain stretch of highway is selected. Each one travels at roughly a constant rate of speed, although speed does vary from auto to auto. Let x = speed and y = time needed to traverse this segment of highway. Would the sample correlation coefficient be closest to -0.9, -0.3, 0.3, or 0.9? Explain.
Two scatterplots are shown below. Explain why it makes sense to use the least-squares line to summarize the relationship between x and y for one of these data sets but not the other. (Hint: See Example 5.5.) 
[Scatterplots 1 and 2 not shown]
The authors of the paper Statistical Methods for Assessing Agreement Between Two Methods of Clinical Measurement (International Journal of Nursing Studies [2010]: 931-936) compared two different instruments for measuring a person's ability to breathe out air. (This measurement is helpful in diagnosing various lung disorders.) The two instruments considered were a Wright peak flow meter and a mini-Wright peak flow meter. Seventeen subjects participated in the study, and for each person air flow was measured once using the Wright meter and once using the mini-Wright meter. a. Suppose that the Wright meter is considered to provide a better measure of air flow, but the mini-Wright meter is easier to transport and to use. If the two types of meters produce different readings but there is a strong relationship between the readings, it would be possible to use a reading from the mini-Wright meter to predict the reading that the larger Wright meter would have given. Use the given data to find an equation to predict Wright meter reading using a reading from the mini-Wright meter. (Hint: See Example 5.4.) b. What would you predict for the Wright meter reading for a person whose mini-Wright meter reading was 500? c. What would you predict for the Wright meter reading for a person whose mini-Wright meter reading was 300? (Hint: See the discussion of extrapolation that follows Example 5.4.)
The accompanying data are a subset of data from the report Great Jobs, Great Lives (Gallup-Purdue Index 2015 Report, gallup.com/reports/197144/gallup-purdue-index-report-2015.aspx). The values are approximate values read from a scatterplot. Students at a number of universities were asked if they agreed that their education was worth the cost. One variable in the table is the percentage of students at the university who responded strongly agree. The other variable in the table is the U.S. News and World Report ranking of the university. a. 
Find the equation of the least-squares line that would allow you to predict the percentage of alumni who would strongly agree that their education was worth the cost, using ranking as the independent variable. b. Predict the percentage of alumni who would strongly agree that their education was worth the cost for a university with a ranking of 50. c. Explain why it would not be a good idea to use the least-squares line to predict the percentage of alumni who would strongly agree that their education was worth the cost for a university that had a ranking of 10.
The authors of the paper Evaluating Existing Movement Hypotheses in Linear Systems Using Larval Stream Salamanders (Canadian Journal of Zoology [2009]: 292-298) investigated whether water temperature was related to how far a salamander would swim and whether it would swim upstream or downstream. Data for 14 streams with different mean water temperatures where salamander larvae were released are given (approximated from a graph that appeared in the paper). The two variables of interest are x = Mean water temperature (°C) and y = Net directionality, which was defined as the difference in the relative frequency of the released salamander larvae moving upstream and the relative frequency of released salamander larvae moving downstream. A positive value of net directionality means a higher proportion were moving upstream than downstream. A negative value of net directionality means a higher proportion were moving downstream than upstream. a. Construct a scatterplot of the data. How would you describe the relationship between x and y? b. Find the equation of the least-squares line describing the relationship between y = Net directionality and x = Mean water temperature. c. 
What value of net directionality would you predict for a stream that had mean water temperature of 15 °C?
The authors of the paper referenced in the previous exercise state that when temperatures were warmer, more larvae were captured moving upstream, but when temperatures were cooler, more larvae were captured moving downstream. a. Do the scatterplot and least-squares line from the previous exercise support this statement? Explain. b. Approximately what mean temperature would result in a prediction of the same number of salamander larvae moving upstream and downstream?
A sample of 548 ethnically diverse students from Massachusetts were followed over a 19-month period from 1995 to 1997 in a study of the relationship between TV viewing and eating habits (Pediatrics [2003]: 1321-1326). For each additional hour of television viewed per day, the number of fruit and vegetable servings per day was found to decrease on average by 0.14 serving. a. For this study, what is the dependent variable? What is the independent variable? b. Would the least-squares line for predicting number of servings of fruits and vegetables using number of hours spent watching TV as a predictor have a positive or negative slope? Explain.
The relationship between hospital patient-to-nurse ratio and various characteristics of job satisfaction and patient care has been the focus of a number of research studies. Suppose x = Patient-to-nurse ratio is the independent variable. For each of the following potential dependent variables, indicate whether you expect the slope of the least-squares line to be positive or negative and give a brief explanation for your choice. a. y = Measure of nurses' job satisfaction (higher values indicate higher satisfaction) b. y = Measure of patient satisfaction with hospital care (higher values indicate higher satisfaction) c. 
y = Measure of patient quality of care
The report Airline Quality Rating 2016 (airlinequalityrating.com/reports/2016_AQR_Final.pdf) included the accompanying data on the on-time arrival percentage and the number of complaints filed per 100,000 passengers for U.S. airlines. The report did not include data on the number of complaints for two of the airlines. Use the given data on the other airlines to fit the least-squares line and use it to predict the number of complaints per 100,000 passengers for Spirit and Virgin America Airlines.
Acrylamide is a chemical that is sometimes found in cooked starchy foods and which is thought to increase the risk of certain kinds of cancer. The paper A Statistical Regression Model for the Estimation of Acrylamide Concentrations in French Fries for Excess Lifetime Cancer Risk Assessment (Food and Chemical Toxicology [2012]: 3867-3876) describes a study to investigate the effect of frying time (in seconds) and acrylamide concentration (in micrograms per kilogram) in French fries. The data in the accompanying table are approximate values read from a graph that appeared in the paper. a. If the goal is to learn how acrylamide concentration is related to frying time, which of these two variables is the dependent variable and which is the independent variable? b. Construct a scatterplot of these data. Describe any interesting features of the scatterplot.
Use the acrylamide data given in the previous exercise to answer the following questions. a. Find the equation of the least-squares line for predicting acrylamide concentration using frying time. b. Does the equation of the least-squares line support the conclusion that longer frying times tend to be paired with higher acrylamide concentrations? Explain. c. What is the predicted acrylamide concentration for a frying time of 225 seconds? d. Would you use the least-squares line to predict acrylamide concentration for a frying time of 500 seconds? If so, what is the predicted concentration? 
If not, explain why. 5.25 Acrylamide is a chemical that is sometimes found in cooked starchy foods and which is thought to increase the risk of certain kinds of cancer. The paper A Statistical Regression Model for the Estimation of Acrylamide Concentrations in French Fries for Excess Lifetime Cancer Risk Assessment (Food and Chemical Toxicology [2012]: 3867-3876) describes a study to investigate the effect of frying time (in seconds) and acrylamide concentration (in micrograms per kilogram) in French fries. The data in the accompanying table are approximate values read from a graph that appeared in the paper. a. If the goal is to learn how acrylamide concentration is related to frying time, which of these two variables is the dependent variable and which is the independent variable? b. Construct a scatterplot of these data. Describe any interesting features of the scatterplot.
Studies have shown that people who suffer sudden cardiac arrest have a better chance of survival if a defibrillator shock is administered very soon after cardiac arrest. How is survival rate related to the time between when cardiac arrest occurs and when the defibrillator shock is delivered? This question is addressed in the paper Improving Survival from Sudden Cardiac Arrest: The Role of Home Defibrillators (by J. K. Stross, University of Michigan, February 2002; available at heartstarthome.com). The accompanying data give y = Survival rate (percent) and x = Mean call-to-shock time (minutes) for a cardiac rehabilitation center (in which cardiac arrests occurred while victims were hospitalized and so the call-to-shock time tended to be short) and for four communities of different sizes: a. Construct a scatterplot for these data. How would you describe the relationship between mean call-to-shock time and survival rate? b. Find the equation of the least-squares line. c. 
Use the least-squares line to predict survival rate for a community with a mean call-to-shock time of 10 minutes.
The data given in the previous exercise on x = Call-to-shock time (in minutes) and y = Survival rate (percent) were used to compute the equation of the least-squares line, which was ŷ = 101.33 - 9.30x. The newspaper article FDA OKs Use of Home Defibrillators (San Luis Obispo Tribune, November 13, 2002) reported that every minute spent waiting for paramedics to arrive with a defibrillator lowers the chance of survival by 10 percent. Is this statement consistent with the given least-squares line? Explain.
An article on the cost of housing in California that appeared in the San Luis Obispo Tribune (March 30, 2001) included the following statement: "In Northern California, people from the San Francisco Bay area pushed into the Central Valley, benefiting from home prices that dropped on average $4000 for every mile traveled east of the Bay area." If this statement is correct, what is the slope of the least-squares line, ŷ = a + bx, where y = House price (in dollars) and x = Distance east of the Bay (in miles)? Explain.
The following data on sale price, size, and land-to-building ratio for 10 large industrial properties appeared in the paper Using Multiple Regression Analysis in Real Estate Appraisal (Appraisal Journal [2002]: 424-430): a. Calculate and interpret the value of the correlation coefficient between sale price and size. b. Calculate and interpret the value of the correlation coefficient between sale price and land-to-building ratio. c. If you wanted to predict sale price and you could use either size or land-to-building ratio as the basis for making predictions, which would you use? Explain. d. 
Based on your choice in Part (c), find the equation of the least-squares line you would use for predicting y = sale price.
Explain why it can be dangerous to use the least-squares line to obtain predictions for x values that are substantially larger or smaller than those contained in the sample.
The sales manager of a large company selected a random sample of n = 10 salespeople and determined for each one the values of x = Years of sales experience and y = Annual sales (in thousands of dollars). A scatterplot of the resulting (x, y) pairs showed a linear pattern. a. Suppose that the sample correlation coefficient is r = 0.75 and that the average annual sales is ȳ = 100. If a particular salesperson is 2 standard deviations above the mean in terms of experience, what would you predict for that person's annual sales? b. If a particular person whose sales experience is 1.5 standard deviations below the average experience is predicted to have an annual sales value that is 1 standard deviation below the average annual sales, what is the value of r?
Explain why the slope b of the least-squares line always has the same sign (positive or negative) as the sample correlation coefficient r.
34E
Does it pay to stay in school? The report Trends in Higher Education (The College Board, 2010) looked at the median hourly wage gain per additional year of schooling. The report states that workers with a high school diploma had a median hourly wage that was 10% higher than those who had only completed 11 years of school. Workers who had completed 1 year of college (13 years of education) had a median hourly wage that was 11% higher than that of the workers who had completed only 12 years of school. The added gain in median hourly wage for each additional year of school is shown in the accompanying table. The entry for 15 years of schooling has been intentionally omitted from the table. a. Use the given data to predict the median hourly wage gain for the 15th year of schooling. b. 
The actual wage gain for 15th year of schooling was 14%. How close was the actual value to the predicted wage gain percent from Part (a)?
The data in the accompanying table is from the paper Six-Minute Walk Test in Children and Adolescents (The Journal of Pediatrics [2007]: 395-399). Two hundred and eighty boys completed a test that measures the distance that the subject can walk on a flat, hard surface in 6 minutes. For each age group shown in the table, the median distance walked by the boys in that age group is also given. a. With x = Representative age and y = Median distance walked in 6 minutes, construct a scatterplot. Does the pattern in the scatterplot look linear? b. Find the equation of the least-squares line that describes the relationship between median distance walked in 6 minutes and representative age. c. Calculate the five residuals and construct a residual plot. (Hint: See Examples 5.7 and 5.8.) d. Are there any unusual features in the residual plot?
The paper referenced in the previous exercise also gave the 6-minute walk distances for 248 girls age 3 to 18 years. The median 6-minute walk times for girls for the five age groups were a. With x = Representative age and y = Median distance walked in 6 minutes, construct a scatterplot. b. How does the pattern in the scatterplot for girls differ from the pattern in the scatterplot for boys from the previous exercise? c. Find the equation of the least-squares line that describes the relationship between median distance walked in 6 minutes and representative age for girls. d. Calculate the five residuals and construct a residual plot. 5.36 The data in the accompanying table is from the paper Six-Minute Walk Test in Children and Adolescents (The Journal of Pediatrics [2007]: 395-399). Two hundred and eighty boys completed a test that measures the distance that the subject can walk on a flat, hard surface in 6 minutes. 
For each age group shown in the table, the median distance walked by the boys in that age group is also given. a. With x = Representative age and y = Median distance walked in 6 minutes, construct a scatterplot. Does the pattern in the scatterplot look linear? b. Find the equation of the least-squares line that describes the relationship between median distance walked in 6 minutes and representative age. c. Calculate the five residuals and construct a residual plot. (Hint: See Examples 5.7 and 5.8.) d. Are there any unusual features in the residual plot?
Consider the residual plot from the previous exercise. The authors of the paper decided to use a curve rather than a straight line to describe the relationship between median distance walked in 6 minutes and age for girls. What aspect of the residual plot supports this decision? (Hint: See Example 5.8.) 5.37 The paper referenced in the previous exercise also gave the 6-minute walk distances for 248 girls age 3 to 18 years. The median 6-minute walk times for girls for the five age groups were a. With x = Representative age and y = Median distance walked in 6 minutes, construct a scatterplot. b. How does the pattern in the scatterplot for girls differ from the pattern in the scatterplot for boys from the previous exercise? c. Find the equation of the least-squares line that describes the relationship between median distance walked in 6 minutes and representative age for girls. d. Calculate the five residuals and construct a residual plot.
The report Airline Quality Rating 2016 (airlinequalityrating.com/reports/2016_AQR_Final.pdf, retrieved April 22, 2017) included the data for 13 U.S. airlines given in the table below. a. With x = Airline quality rating and y = On-time arrival percentage, construct a scatterplot. Does the pattern in the scatterplot look linear? b. Find the equation of the least-squares line. c. Calculate the residuals and construct a residual plot. 
Are there any unusual features in the residual plot?
Acrylamide is a chemical that is sometimes found in cooked starchy foods and which is thought to increase the risk of certain kinds of cancer. The paper A Statistical Regression Model for the Estimation of Acrylamide Concentrations in French Fries for Excess Lifetime Cancer Risk Assessment (Food and Chemical Toxicology [2012]: 3867-3876) describes a study to investigate the effect of x = Frying time (in seconds) and y = Acrylamide concentration (in micrograms per kilogram) in French fries. The data in the accompanying table are approximate values read from a graph that appeared in the paper. a. Construct a scatterplot of these data. b. Find the equation of the least-squares line. Based on this line, what would you predict acrylamide concentration to be for a frying time of 270 seconds? What is the residual associated with the observation (270, 185)?
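
The least-squares and residual calculations that several of the exercises above ask for can be sketched in Python as follows. This is only an illustration under stated assumptions: the data values are made-up placeholders (the tables from the cited papers are not reproduced in this listing), and the names `least_squares_line` and `residual` are hypothetical helpers, not anything defined by the textbook.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of cross-products and squared deviations about the means
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx          # slope
    a = y_bar - b * x_bar    # intercept
    return a, b


def residual(a, b, x, y):
    """Residual = observed y minus the value predicted by the line."""
    return y - (a + b * x)


# Placeholder data, NOT taken from any exercise or paper
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 8]

a, b = least_squares_line(xs, ys)
prediction_at_3 = a + b * 3
residual_at_3 = residual(a, b, 3, 5)

print(f"y-hat = {a:.3f} + {b:.3f}x")
print(f"prediction at x = 3: {prediction_at_3:.3f}")
print(f"residual for (3, 5): {residual_at_3:.3f}")
```

To work an actual exercise, the placeholder lists would be replaced by the x and y columns of the accompanying table, and the residual would be evaluated at the observation named in the question.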