Data Visualization Week 5 Notes
School: Virginia Commonwealth University
Subject: Economics
Date: Feb 20, 2024
Facet Wrap:
- Facet wraps are a useful way to view individual categories in their own graph.
Example 1 - Obtain a scatterplot with ‘facet_wrap()’ function:
library(ggplot2)
library(gapminder)
View(gapminder)
ggplot(data = gapminder,
       mapping = aes(x = gdpPercap, y = lifeExp, size = pop, color = continent)) +
  geom_point(alpha = 0.7) +
  facet_wrap(~year)
This R code uses the ‘ggplot2’ package to create a scatter plot from the ‘gapminder’ dataset. The ‘gapminder’ dataset contains information about various countries, including their GDP per capita (‘gdpPercap’), life expectancy (‘lifeExp’), population (‘pop’), continent, and year.
Breakdown of the code:
library(ggplot2)
library(gapminder)
This code loads the necessary libraries: ‘ggplot2’ for creating plots and ‘gapminder’ for accessing the dataset.
View(gapminder)
This command opens the ‘gapminder’ dataset in a spreadsheet-style data viewer. Note the capital V: with only these two libraries loaded, the function is View(), not view().
ggplot(data = gapminder,
       mapping = aes(x = gdpPercap, y = lifeExp, size = pop, color = continent)) +
  geom_point(alpha = 0.7) +
  facet_wrap(~year)
ggplot():
Initiates the creation of a new ggplot object.
data=gapminder:
Specifies the dataset to be used, which is gapminder.
mapping=aes(...):
Defines the aesthetic mappings. Here:
x=gdpPercap:
GDP per capita on the x-axis.
y=lifeExp:
Life expectancy on the y-axis.
size=pop:
The size of points is determined by the population.
color=continent:
Points are colored based on the continent.
geom_point(alpha=0.7):
Adds points to the plot with alpha = 0.7 (70% opacity). The points are slightly transparent, so overlapping points remain visible.
facet_wrap(~year):
Creates multiple panels, each representing a different year. The tilde ~ introduces a formula, and the variable named after it (year) is the one used to split the data into panels.
So, the final result is a scatter plot where each point represents a country, with GDP per capita on the x-axis, life expectancy on the y-axis, point size based on population, point color based on continent, and separate panels for each year.
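To make the idea of "separate panels for each year" concrete, here is a minimal base R sketch. It does not draw anything. It only shows how rows are grouped by the faceting variable, which is the grouping that facet_wrap(~year) draws as panels. The data frame `toy` and its values are made up for illustration and are not taken from gapminder.

```r
# Hypothetical toy data: three observations spread over two years
toy <- data.frame(
  year    = c(1952, 1952, 2007),
  country = c("A", "B", "A"),
  lifeExp = c(40, 50, 70),
  stringsAsFactors = FALSE
)

# facet_wrap(~year) draws one panel per distinct value of year.
# split() performs the same grouping: one subset of rows per year.
panels <- split(toy, toy$year)

names(panels)         # one entry per panel
sapply(panels, nrow)  # number of points that would land in each panel
```

With the real gapminder data the same grouping yields one subset, and therefore one panel, for each distinct year in the dataset.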
Example 2: Using `scale_x_log10()` function to transform gdpPercap into log10 scale:
ggplot(data = gapminder,
       mapping = aes(x = gdpPercap, y = lifeExp, size = pop, color = continent)) +
  geom_point(alpha = 0.7) +
  facet_wrap(~year) +
  scale_x_log10()
1. ggplot() and geom_point(): These functions are the same as in the previous code and set up the basic structure of the scatter plot.
2. facet_wrap(~year): This part creates separate panels for each year, as in the previous example.
3. scale_x_log10(): This function applies a base-10 logarithmic scale to the x-axis. This is often used for data that spans several orders of magnitude, such as GDP per capita, to make the visualization more interpretable.
So, adding scale_x_log10() means that the x-axis (GDP per capita) is displayed on a logarithmic scale, which gives a clearer picture of the data when the values cover a wide range.
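The effect of a base-10 log scale can be checked with a few numbers. The sketch below uses made-up values (not gapminder data) that each differ by a factor of ten. It shows that they are unevenly spaced on a linear scale but evenly spaced after log10().

```r
# Made-up values spanning several orders of magnitude
gdp <- c(100, 1000, 10000, 100000)

# On a linear axis the gap between neighbours grows tenfold at each step
linear_gaps <- diff(gdp)
linear_gaps

# On a log10 axis the same values sit one unit apart
log_gdp  <- log10(gdp)
log_gaps <- diff(log_gdp)
log_gaps
```

This is why points that crowd together at the low end of a linear GDP axis spread out once the axis is logarithmic.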
Line Graphs (aka time series graphs):
Line graphs, also known as time series graphs, are a type of data visualization that represents data points over a continuous interval or time span. These graphs are particularly useful for showing trends, patterns, and relationships in data that evolve over time. Time series graphs typically have time on the x-axis (horizontal axis) and a variable of interest on the y-axis (vertical axis).
Example 1: Time series graph for lifeExp by year for the two countries in the continent Oceania
This R code uses the ggplot2 package to create a plot using the gapminder dataset, specifically focusing on the Oceania continent.
library(ggplot2)
library(gapminder)
help(gapminder, package = "gapminder")
library(dplyr)
plotdata <- gapminder %>% filter(continent == "Oceania")
plotdata
ggplot(data = plotdata,
       mapping = aes(x = year, y = lifeExp, color = country, size = pop)) +
  geom_point(alpha = 0.7) +
  geom_line(linewidth = 1)
1. library(ggplot2): Loads the ggplot2 library, which is a powerful data visualization package in R.
2. library(gapminder): Loads the gapminder library, which contains the gapminder dataset. This dataset includes information about countries, continents, population, life expectancy, and GDP per capita over time.
3. help(gapminder, package = "gapminder"): Displays help information for the gapminder dataset.
4. library(dplyr): Loads the dplyr library, which is used for data manipulation and filtering.
5. plotdata <- gapminder %>% filter(continent == "Oceania"): Creates a new data frame called plotdata by filtering the gapminder dataset to include only rows where the continent is "Oceania."
6. plotdata: Displays the filtered data frame, showing only the data for countries in the Oceania continent.
7. The ggplot function is used to create a plot:
   a. data = plotdata: Specifies the data frame to be used for plotting.
   b. mapping = aes(...): Maps the aesthetics (variables) to the visual elements of the plot.
      i. x = year: Maps the x-axis to the "year" variable.
      ii. y = lifeExp: Maps the y-axis to the "lifeExp" (life expectancy) variable.
      iii. color = country: Colors the points based on the "country" variable.
      iv. size = pop: Sizes the points based on the "pop" (population) variable.
8. geom_point(alpha = 0.7): Adds a scatter plot layer with points, where alpha controls the transparency of the points.
9. geom_line(linewidth = 1): Adds a line plot layer connecting the points, with a specified line width of 1.
The code creates a combined scatter and line plot of life expectancy over time for the two countries in Oceania (Australia and New Zealand). Each point represents one country in one year, colored by country and sized by that year's population. The points are drawn at 70% opacity (alpha = 0.7), and a line connects the points for each country.
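The filtering step can be illustrated without dplyr. The sketch below swaps in base R logical subsetting, which keeps the rows where a condition is TRUE and so mirrors what filter(continent == "Oceania") does. The data frame `toy` and its life expectancy values are hypothetical and are not rows from gapminder.

```r
# Hypothetical toy data (values are illustrative only)
toy <- data.frame(
  country   = c("Australia", "New Zealand", "Japan"),
  continent = c("Oceania", "Oceania", "Asia"),
  lifeExp   = c(81, 80, 82),
  stringsAsFactors = FALSE
)

# Logical vector: TRUE for rows whose continent is "Oceania"
keep <- toy$continent == "Oceania"

# Keep only the TRUE rows, all columns
oceania <- toy[keep, ]
oceania$country
```

Only the rows that pass the condition remain, so any plot built from the filtered data shows just those countries.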
Example 2: Time series graph of medianLifeExp by year for the five continents
This R code also uses the ggplot2, gapminder, and dplyr libraries to create a summarized dataset and display it.
library(ggplot2)
library(gapminder)
library(dplyr)
plotdata <- gapminder %>% group_by(year, continent) %>%
summarize(medianLifeExp = median(lifeExp))
plotdata
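As a rough illustration of what the grouped summary computes, the sketch below swaps in base R's aggregate() for group_by() and summarize(). It takes the median of lifeExp within each combination of year and continent. The data frame `toy`, the continent labels "X" and "Y", and all values are made up for the example.

```r
# Hypothetical toy data: two years, two continents
toy <- data.frame(
  year      = c(1952, 1952, 1952, 2007, 2007, 2007),
  continent = c("X", "X", "Y", "X", "X", "Y"),
  lifeExp   = c(40, 50, 60, 70, 80, 75),
  stringsAsFactors = FALSE
)

# One row per (year, continent) pair, holding the median life expectancy
res <- aggregate(lifeExp ~ year + continent, data = toy, FUN = median)

# Rename the summary column to match the name used in the notes
names(res)[names(res) == "lifeExp"] <- "medianLifeExp"
res
```

Each row of the result corresponds to one point on a time series line: one continent in one year.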