SOC 3811: Social Statistics
Final Assignment (Due: Dec-17-23; 11:59 PM)

Your final project will, once again, place you in the role of “Lead Data Scientist” at the Missouri Institute of Social Research (M-ISR). This time around, though, you’ll be going national: the United States Centers for Disease Control and Prevention (CDC) has noticed a troubling uptick in high blood pressure among young adults across the country and is desperate for answers. In this project, you’ll serve as a consultant, helping the CDC understand how to draw insights about blood pressure levels among all young adults across the country using just a sample. This task will give you the opportunity to apply and demonstrate inferential statistical tools in a real-world setting. Assignments should be uploaded to Canvas no later than 11:59 PM on Dec. 17th.

Overview

As narrative background, the CDC saw your careful report describing the social factors that shaped unequal life expectancy across Missouri communities. Policymakers at the CDC were so impressed by your analysis that they’ve asked you to serve as a consultant for a project that they’ve been working on.

In more detail, the CDC has collected data from nearly 5,000 U.S. adults between the ages of 25 and 34. In these data, they’ve noticed a troubling trend: compared to past years, blood pressure seems to be on the rise among young adults. They’ve conducted some data analyses to better understand this phenomenon, but could use your help in a few key areas to understand what their sample data suggest about blood pressure levels among the entire population of young adults in the US. Below are a number of questions and concerns that the CDC could use your expert guidance on.
Your task is to provide responses to each of these questions to help officials better understand the prevalence and determinants of systolic blood pressure levels among the U.S. population of young adults (i.e., people between 25 and 34 years old). As always, remember that your audience is not composed of statistical experts! Your aim should be to produce detailed responses to each of the CDC’s questions in a way that almost anyone can understand.

Instructions

On the Canvas website, I have provided you with the CDC’s sample of 4,693 young adults between the ages of 25 and 34. These data are located in a .csv file called ah data, under the Files/Data tab. The outcome variable sbp measures the systolic blood pressure level of each individual in the sample. For context, the CDC provides you with the following fact sheet on systolic blood pressure and what different levels of this outcome indicate about cardiovascular health: https://bit.ly/3nsO0ZM. The overarching idea is that higher systolic blood pressure levels indicate worse cardiovascular health.

Please use these data to address each of the following questions:

1. The CDC wants you to first provide a descriptive analysis of how blood pressure levels are distributed among the 4,693 young adults in their sample. To accomplish this, please provide a histogram and the median, range, and standard deviation of the sbp variable. Combine these descriptive statistical tools into a clear and intuitive narrative that helps CDC officials understand what systolic blood pressure looks like across the young adults within their sample.

2. Next, the CDC is interested in understanding if and how the statistics in their sample describe the population of young adults in the U.S. To assist them in this, use the ah data sample to estimate the population mean of systolic blood pressure, along with a 95% confidence interval. Describe what these statistics tell us about the average blood pressure level among the whole population of young adults in the U.S. According to the background information on systolic blood pressure provided to you above, is your analysis cause for concern? Be specific and as approachable as possible in your explanation.

3. A CDC official asks for more information here. In particular, they want you to use their sample data to provide an informed guess at the proportion of all young adults across the US who have systolic blood pressure levels high enough to be considered at least “Hypertension – Stage 1.” Calculate an estimate of this population proportion from your sample and a corresponding 95% confidence interval. [1] Describe what your calculations suggest about the proportion of young adults with dangerously high blood pressure levels in the US.

4. Prior to reaching out to you, the CDC reached out to another statistical consultant to assist them in their analysis of blood pressure among young adults. This researcher produced a simple linear regression model describing the relationship between an individual’s age (measured in years) and their systolic blood pressure level. The results of this analysis are presented in the regression table below:

| parameter      | estimate | 95% CI       | p-value |
|----------------|----------|--------------|---------|
| Intercept      | 111.35   | [104, 117]   | < 0.001 |
| Age (in years) | 0.516    | [0.29, 0.74] | < 0.001 |

Table 1: Linear regression model of the relationship between systolic blood pressure and age.

Before writing up a summary of this analysis, the other researcher quit in a fit of rage! The officials at the CDC aren’t statistical experts, so they can’t make sense of what this data analysis demonstrates. To help them out, please provide a detailed summary of what this regression model communicates about the relationship between age and blood pressure levels among the population of young adults in the US. Be as specific and detailed as possible in explaining the story conveyed by this analysis. Make sure to describe every piece of information presented in the regression table.

5. Before they quit, the same researcher also performed a simple linear regression analysis describing the relationship between race(-ism) [2] and systolic blood pressure levels. Here, the variable race measures an individual’s racial identity as either: White; Black; American Indian/Native Alaskan; Asian; Latinx; or Other. Below is a regression table that displays the results of this analysis:

| parameter           | estimate | 95% CI        | p-value |
|---------------------|----------|---------------|---------|
| Intercept           | 125.08   | [124, 126]    | < 0.001 |
| Race: White [ref.]  | -        | -             | -       |
| Race: AI/AN         | 4.07     | [-2.07, 10.2] | 0.202   |
| Race: Asian         | 0.02     | [-2.44, 2.47] | 0.991   |
| Race: Black         | 6.86     | [5.86, 7.83]  | < 0.001 |
| Race: Latinx        | 1.64     | [0.31, 2.98]  | 0.015   |
| Race: Other         | -1.55    | [-3.43, 0.33] | 0.111   |

Table 2: Linear regression model of the relationship between systolic blood pressure and race. Note: “ref.” stands for “reference category.” “AI/AN” stands for “American Indian/Native Alaskan.”

CDC officials are truly puzzled by this regression analysis. They know that it suggests something about racialized disparities in blood pressure among young adults, but are not sure what exactly it’s communicating. Part of the difficulty here is that these officials have no idea how to interpret a linear regression model where the predictor is a nominal categorical variable (like race). To help the CDC out here, please provide a careful summary of what this analysis suggests about racial disparities in blood pressure among the full population of young adults in the US. Point to specific evidence from the above regression model to support your explanation. [3]

6. A very eager (if somewhat unusual) CDC official has a strange theory of what explains why some young adults have higher blood pressure than others. Indeed, he believes that the single most important factor in determining someone’s systolic blood pressure level is the

[1] Note that the procedure for calculating the confidence interval for a proportion is a little different from calculating the confidence interval for a mean. See the following online reference for more detail: https://bit.ly/3x2Vxle.

[2] Always remember that “race” is a proxy for a broad system of unequal social experiences, not biology!

[3] Hint: read through this document, particularly the section “Categorical variables with two levels,” to help ground your thinking: https://bit.ly/2YZuRFq.
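As a rough illustration of the calculations behind questions 1 through 4, here is a minimal Python sketch. It is not the course's required workflow, and several things in it are assumptions rather than part of the assignment: the function names are hypothetical, the data are synthetic stand-ins for the real sbp column, the intervals use the large-sample normal approximation (z of about 1.96), and the cutoff of 130 for "Hypertension – Stage 1" is an assumed value that should be checked against the CDC fact sheet linked above. The regression helper simply plugs an age into the intercept and slope reported in Table 1.

```python
import math
import random
import statistics
from statistics import NormalDist

# Critical value for a 95% interval under the normal approximation (about 1.96).
Z95 = NormalDist().inv_cdf(0.975)


def describe(values):
    """Median, range as (min, max), and sample standard deviation (question 1)."""
    return {
        "median": statistics.median(values),
        "range": (min(values), max(values)),
        "sd": statistics.stdev(values),
    }


def mean_ci(values, z=Z95):
    """Sample mean and an approximate 95% CI for the population mean (question 2)."""
    mean = statistics.fmean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))  # standard error of the mean
    return mean, (mean - z * se, mean + z * se)


def proportion_ci(values, threshold, z=Z95):
    """Share of values at or above `threshold`, with an approximate 95% CI (question 3)."""
    n = len(values)
    p = sum(v >= threshold for v in values) / n
    se = math.sqrt(p * (1 - p) / n)  # standard error of a proportion
    return p, (p - z * se, p + z * se)


def predicted_sbp(age, intercept=111.35, slope=0.516):
    """Fitted value from Table 1: intercept + slope * age (question 4)."""
    return intercept + slope * age


if __name__ == "__main__":
    # SYNTHETIC placeholder values, NOT the CDC sample: swap in the real sbp column.
    random.seed(0)
    fake_sbp = [random.gauss(125, 14) for _ in range(4693)]

    # ASSUMPTION: 130 is used as the Stage 1 cutoff; verify it against the fact sheet.
    assumed_stage_1_cutoff = 130

    print(describe(fake_sbp))
    print(mean_ci(fake_sbp))
    print(proportion_ci(fake_sbp, assumed_stage_1_cutoff))
    print(predicted_sbp(25), predicted_sbp(34))
```

With the Table 1 estimates, the fitted line gives roughly 124.25 at age 25 and roughly 128.89 at age 34, which is one concrete way to show what a slope of 0.516 per year means across the sampled age range. With a sample of several thousand, the normal critical value and the t critical value are nearly identical, which is why the sketch uses the simpler one.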