Data 205 Test #1

School

CUNY Queens College

Course

205

Subject

Sociology

Date

Jan 9, 2024

Data 205 Test #1 (38 points)
Zander Guadalupe

Instructions

For this test, we’ll again use the data from the National Health and Nutrition Examination Survey (NHANES), a program of studies designed to assess the health and nutritional status of adults and children in the United States, conducted from the early 1960s to the present. This is an open-book and open-note test due at 11:59 PM on Monday, October 31, 2022, via Blackboard Turnitin. Late submissions will not be accepted.

DO NOT DISCUSS THE TEST WITH OTHER STUDENTS. All work on the test must be your own. Cheating will result in a grade of zero and reporting to the Office of Academic Integrity.

Exercise 1: Codebook, Variables, and Measurement (14 points)

NHANES strongly recommends that researchers combine data from multiple 2-year cycles to improve the statistical reliability and stability of estimates. Combining data from multiple cycles is particularly appropriate for rare events, estimates pertaining to detailed demographic subdomains, and measures that may have considerable geographic variation. I’ve downloaded the 2011-2012 NHANES demographics data, shown in the output below. You can find the full codebook for the data used in this test at: https://wwwn.cdc.gov/Nchs/Nhanes/2011-2012/DEMO_G.htm

Question 1. The screenshot below shows the code used to import the data into R and to print out the first ten observations along with some other information about the data. How many variables and how many respondents are in the data? (2 points)

48 variables and 9,756 respondents.

Question 2. The NHANES data include a variable named INDFMPIR according to the NHANES codebook shown below. (6 points)

a. Is the variable INDFMPIR a valid measure of the NHANES participants’ family economic conditions? Why or why not? (2 points; recall the definition of validity from Lecture 2)

It is valid because the values come out correctly.

b. Is the variable INDFMPIR a reliable measure of the NHANES participants’ family economic conditions? Why or why not? (2 points; recall the definition of reliability from Lecture 2)

It is not reliable because it only measures the ratio of family income to poverty, and economic conditions can change.

c. If we want to visualize the distribution of the variable INDFMPIR, which type of graph is more appropriate, a histogram or a bar chart? And why? (2 points)

A histogram is more appropriate because the value is based on a range, and a histogram would give you the general answer.

Question 3. The NHANES data also include a variable named INDFMIN2 according to the NHANES codebook shown below. (6 points)

a. Is the variable INDFMIN2 a valid measure of the NHANES participants’ family economic conditions? Why or why not? (2 points; recall the definition of validity from Lecture 2)

Yes, it is valid because the amount still equates to 9,756.

b. Is the variable INDFMIN2 a reliable measure of the NHANES participants’ family economic conditions? Why or why not? (2 points; recall the definition of reliability from Lecture 2)

Yes, it is reliable because the chart gives more detail about families’ total income.

c. If we want to visualize the distribution of the variable INDFMIN2, which type of graph is more appropriate, a histogram or a bar chart? And why? (2 points)

A bar chart, because there is a larger range of values and it needs to be more in depth.

Exercise 2: Data Distributions (16 points)

Question 1. I consider myself not only a sociologist but also a demographer. Demographers are interested in the age structure of a population because it is related to three fundamental demographic phenomena: birth, migration, and death. You may have heard some people (probably demographers) saying that “America is aging,” meaning that the proportion of older people in the population is increasing. Is it true? Let’s look at the data. Demographers would use a tool known as the population pyramid to visualize the age structure of a population, but here we will use histograms and boxplots instead. (8 points)

a. Look at the histogram of the age variable (named RIDAGEYR) in the NHANES data below. What is its bin width, and what does the bin width tell us in this case? (2 points)

The bin width is 5. It tells us the number of people within that age range.

b. Which measure of central tendency (mode, median, or mean) has the largest value, and which has the smallest value? And why? (4 points)

The median has the largest value and the mode has the smallest value, because the median is around 40 and the mode is around 5 to 10. The mean falls in between those values.

c. Based on the information from the histogram, do you think America has a population aging problem or not? Why or why not? (2 points)

No, because the majority of people are young.

Question 2. One important demographic characteristic for understanding American society is race/ethnicity. Below is a graph of the race/ethnicity variable (named RIDRETH3) in the NHANES data. (4 points)

a. What measure of central tendency does the graph show, and what is its value? (2 points)

The measure of central tendency the graph shows is the mode, and its value is white.

b. Based on the information from this graph, do you think America is a racially/ethnically diverse country and why? Frame your answers in terms of the dispersion of the distribution (high, medium, or low) as shown in the graph. (2 points)
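
The quantities these questions ask about (bin counts for a given bin width, the mode, median, and mean of a numeric variable, and the mode and dispersion of a categorical variable) can be sketched in a few lines. This is a minimal illustration in Python rather than the R used for the test. The ages and group labels are made up and are not the NHANES data, the helper names `histogram_counts` and `iqv` are hypothetical, and the index of qualitative variation (IQV) is just one common way to put a number on "high, medium, or low" dispersion for a nominal variable; it is assumed here, not taken from the course.

```python
from collections import Counter
from statistics import mean, median, multimode

# Hypothetical ages in years (made up for illustration; NOT the NHANES data).
ages = [2, 4, 7, 8, 8, 9, 12, 15, 21, 26, 33, 40, 47, 58, 66, 80]

BIN_WIDTH = 5  # each bin spans 5 years: [0, 5), [5, 10), ...


def histogram_counts(values, width):
    """Count how many values fall into each bin [k*width, (k+1)*width).

    Returns a dict mapping each bin's left edge to its count.
    """
    counts = Counter((v // width) * width for v in values)
    return dict(sorted(counts.items()))


age_bins = histogram_counts(ages, BIN_WIDTH)
modal_bin = max(age_bins, key=age_bins.get)  # left edge of the tallest bar
age_median = median(ages)
age_mean = mean(ages)

# Hypothetical labels for a nominal variable (stand-ins, not RIDRETH3 codes).
groups = ["A", "A", "A", "A", "B", "B", "C", "C", "D", "E"]
group_counts = Counter(groups)
group_mode = multimode(groups)  # list of the most frequent label(s)


def iqv(counts):
    """Index of qualitative variation: 0 = everyone in one category,
    1 = cases spread evenly across all observed categories."""
    k = len(counts)
    if k < 2:
        raise ValueError("IQV needs at least two categories")
    n = sum(counts.values())
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return k * (1 - sum_sq) / (k - 1)


print("bin counts:", age_bins)
print("modal bin starts at:", modal_bin)
print("median:", age_median, "mean:", age_mean)
print("categorical mode:", group_mode, "IQV:", round(iqv(group_counts), 3))
```

In this made-up sample the long tail of older ages pulls the mean above the median, while the modal bin sits at the young end; whether the same ordering holds for RIDAGEYR has to be read off the actual histogram.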