Lab #2 - Data and Variables - Julia Smithers

School

Texas Tech University

Course

3314

Subject

Political Science

Date

Apr 3, 2024

POLS 3314 Lab 2
Data and Variables

Download the .Rdata file titled “Lab 2 Data” from Blackboard. Open it to load it into RStudio. You should see nine variables loaded in the right pane. All R commands are in brackets below { }. Do not use the brackets in the R command line (they’re just to separate the commands from the text here).

1. country – the unit of analysis
2. year – the dataset is a cross-section of countries for the year 2013
3. region – the geographic region where each country is located
4. life_expectancy – the total life expectancy at birth, measured in years
5. gdp_growth – annual growth of GDP, measured as a %
6. cellphone_subscriptions – the count of mobile phone subscriptions per 100 people
7. women_businesslaw_score – an additive score ranging from 1 to 10 recording women’s engagement in business and law industries
8. annual_precipitation – average precipitation, measured in depth by discrete millimeters
9. disaster_risk_reduction – a score ranging from 1 (worst) to 5 (best) tracking a country’s progress in reducing risks related to natural disasters

Remember always to include the libraries used at the beginning of your R script file:

{
# Install the packages (if necessary)
install.packages("questionr")
install.packages("ggplot2")
# Load the libraries
library(questionr)
library(ggplot2)
}

1. Identify the nominal variable in the list. What is the most appropriate measure of central tendency for this variable?

Nominal variable: region
Most appropriate measure of central tendency: mode

2. Run a frequency table for the nominal variable { freq(data$varname, cum = TRUE, total = TRUE) }. Which category are you most interested in? What percentage of cases falls into that category?

The North America region: 0.90% of cases fall into that category.

3. Generate a bar graph for the nominal variable:

{
ggplot(data = data, aes(x = region)) +
  geom_bar() +
  scale_x_discrete(limits = c(1, 2, 3, 4, 5, 6, 7),
                   labels = c('E. Asia / Pacific', 'Europe / C. Asia', 'S. America / Caribbean', 'M. East / N. Africa', 'N. America', 'S. Asia', 'Sub-Saharan Africa')) +
  theme(axis.text.x = element_text(angle = 45, vjust = .5, hjust = .5))
}

Copy / paste the barplot of the nominal variable here:

4. Identify the two ordinal variables in the list. Select one and describe the crucial junctures featured in the rank statistics (min, max, median, IQR). { summary(data$varname) }

1. Women business law score
2. Disaster risk reduction:
   Min: 1.00
   Max: 5.00
   Median: 3.00
   IQR: 1st Qu. 3.00, 3rd Qu. 4.00

5. Generate a bar graph for your ordinal variable. Copy / paste it into this document. Describe what kind of distribution (modality) you find. { ggplot(data, aes(x = varname)) + geom_bar() + scale_x_continuous(breaks = seq(min, max, 1)) }
The distribution is unimodal and negatively skewed.

6. Identify the numerical variables in the list. Select one and report its median and mean. { summary(data$varname) }

1. GDP growth:
   - Median: 3.360
   - Mean: 3.278
2. Life expectancy
3. Annual precipitation
4. Cell phone subscriptions

7. For the same numerical variable, report the variance and the standard deviation.

{
var(data$varname, na.rm = TRUE)
sd(data$varname, na.rm = TRUE)
}

GDP growth:
Variance: 25.23
SD: 5.02
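
The statistics asked for in questions 1, 4, and 7 can be sketched in base R on a small made-up data frame. This is only an illustration: the data frame `toy` and every value in it are hypothetical and are not taken from the “Lab 2 Data” file. The sketch shows one common way to find a mode (the most frequent category in a frequency table), the rank statistics for an ordinal score, and the link between variance and standard deviation.

```r
# Hypothetical toy data; these are NOT the values in the "Lab 2 Data" file.
toy <- data.frame(
  region = c(1, 1, 2, 3, 3, 3),                   # nominal (coded categories)
  disaster_risk_reduction = c(1, 3, 3, 4, 4, 5),  # ordinal score, 1 to 5
  gdp_growth = c(-1.2, 0.5, 2.0, 3.4, 4.1, 9.0)   # numerical, in %
)

# Mode of a nominal variable: the category with the highest frequency.
region_counts <- table(toy$region)
region_mode <- names(region_counts)[which.max(region_counts)]

# Rank statistics for an ordinal variable.
drr_median <- median(toy$disaster_risk_reduction, na.rm = TRUE)
drr_quartiles <- quantile(toy$disaster_risk_reduction, c(0.25, 0.75), na.rm = TRUE)
drr_iqr <- IQR(toy$disaster_risk_reduction, na.rm = TRUE)  # 3rd quartile minus 1st

# Spread of a numerical variable: the SD is the square root of the variance.
growth_var <- var(toy$gdp_growth, na.rm = TRUE)
growth_sd <- sd(toy$gdp_growth, na.rm = TRUE)

print(region_mode)
print(drr_median)
print(drr_quartiles)
print(drr_iqr)
print(all.equal(growth_sd, sqrt(growth_var)))
```

The same relationship gives a quick sanity check on the reported GDP growth figures: the square root of 25.23 is about 5.02, which matches the reported SD.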