Lab #2 - Data and Variables - Julia Smithers
School: Texas Tech University
Course: 3314
Subject: Political Science
Date: Apr 3, 2024
Pages: 5
POLS 3314
Lab 2
Data and Variables
Download the .Rdata file titled “Lab 2 Data” from Blackboard and open it to load it into RStudio.
You should see nine variables loaded in the right pane. All R commands below are enclosed in braces { }.
Do not type the braces at the R command line (they only separate the commands from the text here).
1. country – the unit of analysis
2. year – the dataset is a cross-section of countries for the year 2013
3. region – the geographic region where each country is located
4. life_expectancy – the total life expectancy at birth, measured in years
5. gdp_growth – annual growth of GDP, measured as a %
6. cellphone_subscriptions – the count of mobile phone subscriptions per 100 people
7. women_businesslaw_score – an additive score ranging from 1 to 10 recording women’s engagement in business and law industries
8. annual_precipitation – average precipitation, measured in depth by discrete millimeters
9. disaster_risk_reduction – a score ranging from 1 (worst) to 5 (best) tracking a country’s progress in reducing risks related to natural disasters
Remember always to include the libraries used at the beginning of your R script file:
{
# Install the packages (if necessary)
install.packages("questionr")
install.packages("ggplot2")
# Load the libraries
library(questionr)
library(ggplot2)
}
1. Identify the nominal variable in the list. What is the most appropriate measure of central tendency for this variable?
Nominal Variable: Region
Most appropriate measure of central tendency: mode
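Base R has no built-in function for the statistical mode, so one way to find it is to count the categories and pick the largest count. The sketch below uses a made-up vector (region_example is hypothetical, not the lab data); with the real data you would pass data$region instead.

```r
# Hypothetical category labels; the real values live in data$region.
region_example <- c("Europe", "Asia", "Europe", "Africa", "Europe", "Asia")

# table() counts how many times each category appears.
counts <- table(region_example)

# The mode is the category with the highest count.
# Note: which.max() returns only the first category if there is a tie.
mode_value <- names(counts)[which.max(counts)]
print(mode_value)
```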
2. Run a frequency table for the nominal variable { freq(data$varname, cum = TRUE, total = TRUE) }. Which category are you most interested in? What percentage of cases falls into that category?
The North America region: 0.90% of cases fall into that category.
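As a cross-check on the percentage column, the same kind of figure can be computed with base R's table() and prop.table(). This is only a sketch on made-up values (region_example is hypothetical); it is not the questionr output and does not reproduce the 0.90% figure.

```r
# Hypothetical category labels standing in for data$region.
region_example <- c("Europe", "Asia", "Europe", "Africa",
                    "Europe", "Asia", "Africa", "Europe")

# Count each category, then convert counts to percentages of all cases.
counts <- table(region_example)
percent <- round(100 * prop.table(counts), 2)
print(percent)

# Percentage of cases in a single category of interest.
europe_percent <- as.numeric(percent["Europe"])
print(europe_percent)
```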
3. Generate a bar graph for the nominal variable {
ggplot(data = data, aes(x = region)) +
geom_bar() +
scale_x_discrete(limits = c(1, 2, 3, 4, 5, 6, 7),
labels = c('E. Asia / Pacific', 'Europe / C. Asia', 'S. America / Caribbean', 'M. East / N. Africa', 'N. America', 'S. Asia', 'Sub-Saharan Africa')) +
theme(axis.text.x = element_text(angle = 45, vjust = .5, hjust = .5))
}.
Copy / paste the barplot of the nominal variable here:
4. Identify the two ordinal variables in the list. Select one and describe the crucial junctures featured in the rank statistics (min, max, median, IQR). { summary(data$varname) }
1. Women business law score
2. Disaster risk reduction:
Min: 1.00
Max: 5.00
Median: 3.00
IQR: 1st Qu.: 3.00, 3rd Qu.: 4.00
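summary() reports the first and third quartiles but not the IQR itself, which is the third quartile minus the first. A minimal sketch on made-up scores (score_example is hypothetical, not the lab data) shows how quantile() and IQR() relate:

```r
# Hypothetical scores on a 1-5 scale standing in for an ordinal variable.
score_example <- c(1, 2, 3, 3, 3, 4, 4, 4, 5, 5)

# First quartile, median, and third quartile.
quartiles <- quantile(score_example, probs = c(0.25, 0.5, 0.75), na.rm = TRUE)
print(quartiles)

# IQR() returns the third quartile minus the first quartile.
iqr_value <- IQR(score_example, na.rm = TRUE)
print(iqr_value)
```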
5. Generate a bar graph for your ordinal variable. Copy / paste it into this document. Describe what kind of distribution (modality) you find. {
ggplot(data, aes(x = varname)) +
geom_bar() +
scale_x_continuous(breaks = seq(min, max, 1)) }
The distribution is unimodal and negatively skewed.
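A rough numeric check can back up the reading of the bar graph. Counting the cases at each score shows where the single peak sits, and as a rule of thumb (not a guarantee) a negatively skewed distribution tends to have a mean below its median. The values below are made up (score_example is hypothetical), and treating an ordinal score as numeric for the mean is itself an assumption.

```r
# Hypothetical 1-5 scores with most cases piled up at the high end.
score_example <- c(1, 2, 3, 4, 4, 5, 5, 5, 5)

# Counts per score: one clear peak suggests a unimodal distribution.
counts <- table(score_example)
print(counts)
peak_score <- names(counts)[which.max(counts)]

# Rule of thumb: with a long left tail, the mean is pulled below the median.
mean_value <- mean(score_example)
median_value <- median(score_example)
print(mean_value < median_value)
```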
6. Identify the numerical variables in the list. Select one and report its median and mean. { summary(data$varname) }
1. GDP Growth:
- Median: 3.360
- Mean: 3.278
2. Life expectancy
3. Annual precipitation
4. Cell phone subscriptions
7. For the same numerical variable, report the variance and the standard deviation. {
var(data$varname, na.rm=TRUE)
sd(data$varname, na.rm=TRUE)}
GDP Growth: Variance: 25.23 SD: 5.02
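The standard deviation is the square root of the variance, so the two reported figures can be checked against each other. The sketch below uses the reported variance from the answer above, then confirms the same relationship on a small made-up vector (growth_example is hypothetical).

```r
# Reported variance for GDP growth, taken from the answer above.
reported_variance <- 25.23

# The standard deviation implied by that variance.
implied_sd <- sqrt(reported_variance)
print(round(implied_sd, 2))

# The same relationship on hypothetical data, with a missing value dropped.
growth_example <- c(2.1, -0.5, 3.4, NA, 6.8, 1.9)
sd_matches <- all.equal(sd(growth_example, na.rm = TRUE),
                        sqrt(var(growth_example, na.rm = TRUE)))
print(sd_matches)
```

The implied value rounds to the reported SD of 5.02, so the two answers are consistent.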