01 Jan 8 population versus sample

School

University Of Georgia

Course

6315

Subject

Statistics

Date

Apr 3, 2024

Type

pdf

Pages

10

Uploaded by DrZebra4092

So, what is Statistics?

Statistics is a way of reasoning, along with a collection of tools and methods, designed to help us understand the world.

The Statistical Method consists of three main components to answer a statistical question:

1. Design: This refers to how the data are obtained (using a survey, setting up an experiment). Valid inferences cannot be made without a good design.
2. Description: This means summarizing the data obtained (providing the average or a percentage, creating a bar graph or pie chart). Summarizing can help identify patterns in the data.
3. Inference: This means making decisions or predictions based on the data.

Example: Aspirin and Heart Attacks

Does regular aspirin intake reduce heart attacks? Harvard Medical School conducted a study to investigate. The study included about 22,000 male physicians. Half of these physicians regularly took an aspirin; the other half regularly took a placebo (a pill with no active ingredient). Whether a given individual would be assigned to take aspirin or the placebo was determined by flipping a coin. Of those who took aspirin, 1% had heart attacks during the study. Of those who took the placebo, 4% had heart attacks during the study. Based on these results, the study authors concluded that taking aspirin reduces the risk of having a heart attack.

1. Which aspect pertains to design?
2. Which aspect pertains to description?
3. Which aspect pertains to inference?

Answers:

1. Design: Setting up an experiment with control and experimental groups drawn from about 22,000 male physicians, with the participants in each group determined by a coin toss.
2. Description: Providing the percentage of physicians in each group that experienced a heart attack.
3. Inference: The authors concluded that taking aspirin reduces the risk of having a heart attack.
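The coin-toss assignment in the aspirin example can be sketched in a few lines of Python. This is only an illustration of the design step, not the study's actual procedure: the function name assign_by_coin_flip, the integer participant IDs, and the seed are all assumptions made for the example.

```python
import random


def assign_by_coin_flip(participant_ids, seed=None):
    """Assign each participant to 'aspirin' or 'placebo' with a fair coin flip.

    Hypothetical helper for illustration; not the study's real protocol.
    """
    rng = random.Random(seed)
    groups = {"aspirin": [], "placebo": []}
    for pid in participant_ids:
        # Heads (probability 0.5) -> aspirin, tails -> placebo.
        arm = "aspirin" if rng.random() < 0.5 else "placebo"
        groups[arm].append(pid)
    return groups


# About 22,000 physicians, as in the example; the IDs are made up.
groups = assign_by_coin_flip(range(22_000), seed=1)
print(len(groups["aspirin"]), len(groups["placebo"]))
```

Independent coin flips give groups that are close to, but usually not exactly, half each. The point of the design is that chance alone decides who gets aspirin, so the two groups should be similar in everything except the treatment.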
Our ability to answer questions and draw conclusions from data depends largely on our ability to understand variation. The key to learning from data is understanding the variation that is all around us.

Example: How many siblings do you have?

a) 0
b) 1
c) 2
d) 3 or more

Suppose all UGA students have exactly the same number of siblings. Then there would be no variability. How many students would we have to poll to estimate the average number of siblings for all UGA students?

Statistics, as a field of study, is the science of learning from data. This definition raises two questions:

(1) What is data? (Actually, what are data?)
(2) How do we learn from it?

The first question is easy. Data are information: they can be numbers (weight, income) or words (Covid vaccination status, race). Most of your professors collect data for their research. Sports data are readily available online. Cell phone companies keep data on their customers, doctors' offices keep data on patients, Kroger tracks purchases (Kroger Plus card system), and Netflix/Amazon/Google all keep data on their users (and their products).

The answer to the second question isn't as obvious. Learning from data involves analyzing data and making inferences. We'll talk more about statistical inference during the second half of this course. For now, let's just say inference is a component of the statistical method, outlined above.
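The sibling question can be explored with a small simulation. If every student had the same number of siblings, polling a single student would give the average exactly; once there is variation, a one-student estimate can miss, and larger samples are needed to pin the average down. The sketch below uses two invented populations of 10,000 students (not real UGA data), and the helper name sample_means is an assumption for the example.

```python
import random
import statistics

rng = random.Random(0)

# Two hypothetical populations of 10,000 students (made-up numbers).
no_variation = [2] * 10_000  # everyone has exactly 2 siblings
with_variation = [rng.choice([0, 1, 2, 3]) for _ in range(10_000)]


def sample_means(population, n, repeats=200):
    """Draw `repeats` random samples of size n and return each sample's mean."""
    return [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]


for label, pop in [("no variation", no_variation), ("with variation", with_variation)]:
    for n in (1, 50):
        # The spread of the estimates shows how far a poll of n students can stray.
        spread = statistics.pstdev(sample_means(pop, n))
        print(f"{label:15s} n={n:2d}  spread of estimates = {spread:.3f}")
```

With no variation the spread is zero at any sample size. With variation the spread shrinks as the sample grows, which is why understanding variation comes before deciding how many people to poll.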
Population versus Sample

The population is the set of all individuals we are interested in studying. Studying an entire population can be difficult (size, cost, time-consuming), so samples are often studied instead.

A sample is the subset of the population for whom we have data.

Subjects are the things or individuals who make up the sample. Who or what are we gathering data from? Ideally, the characteristics of the subjects in the sample will tell us something about the overall population.

Population                 Sample                          Subjects
The entire voting public   200 randomly selected voters    Each voter in the sample
Every MLB player           60 randomly selected players    Each player in the sample
All 159 counties in GA     40 randomly selected counties   Each county in the sample

A parameter is a numerical value summarizing the population data. The value of a parameter is usually unknown, and unknowable.

A statistic is a numerical value summarizing the sample data. If we have data, we KNOW the statistics. Sample statistics serve as estimates for population parameters.
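The difference between a parameter and a statistic can be made concrete with a short sketch modeled on the Georgia counties row: a population of 159 values and a random sample of 40. The county values here are invented for illustration (drawn from a normal distribution with an arbitrary mean and spread), so in this toy setting the parameter is known, which is rarely true in practice.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical numeric measurement for each of 159 counties (invented values).
population = [rng.gauss(50_000, 12_000) for _ in range(159)]

# Parameter: summarizes the whole population. Usually we cannot compute this.
parameter = statistics.mean(population)

# Sample: 40 randomly selected counties. Each sampled county is a subject.
sample = rng.sample(population, 40)

# Statistic: summarizes the sample. If we have the data, we know this value.
statistic = statistics.mean(sample)

print(f"parameter (population mean): {parameter:.0f}")
print(f"statistic (sample mean):     {statistic:.0f}")
```

Rerunning with a different seed gives a different sample and therefore a different statistic, while the parameter of a fixed population stays the same. The statistic serves as an estimate of the parameter.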
Example: SEC Teams

A college football fan is interested in learning about the average weight of football players in the SEC. The fan takes a random sample of 75 SEC football players and calculates the average weight.

Match these terms with the descriptions that follow.

a) population
b) sample
c) subjects
d) parameter
e) statistic

Descriptions:

the average weight of the 75 randomly selected SEC players
a single player from the sample
the average weight of all SEC football players
all SEC football players
75 randomly selected SEC football players

Answers:

the average weight of the 75 randomly selected SEC players: e) statistic
a single player from the sample: c) subjects
the average weight of all SEC football players: d) parameter
all SEC football players: a) population
75 randomly selected SEC football players: b) sample