School: University of Oregon
Course: 102
Subject: Statistics
Date: Apr 27, 2024
hw4
April 26, 2024
[ ]:
import otter
grader = otter.Notebook()
1 Homework 4: Advanced operations in pandas
Due Date: 11:59PM on the date posted to Canvas
Collaboration Policy
Data science is a collaborative activity. While you may talk with others about the homework, we ask that you write your solutions individually. If you do discuss the assignments with other students, please include their names below.
Collaborators: list collaborators here
Grading
Grading is broken down into autograded answers and free response.
For autograded answers, the results of your code are compared to provided and/or hidden tests.
For autograded probability questions, the provided tests will only check that your
answer is within a reasonable range.
For free response, readers will evaluate how well you answered the question and/or fulfilled the
requirements of the question.
For plots, be as descriptive as possible: include titles, axis labels, and units wherever applicable.
[ ]:
import numpy as np
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns
'imports completed'
1.1 Introduction
The purpose of this module is to expand your pandas skillset by performing various new and old operations on pandas DataFrames. A lot of these operations will be things you've done before in the datascience package, so you should reference the included notebook to translate between the two if need be.
You are expected to answer all relevant questions programmatically, i.e. use indexing and functions/methods to arrive at your answers. Your answers don't need to fit on a single line; you may use as many intermediate steps as you need.
1.1.1 Question 1
Reading in data from a file is made easy in the pandas package. We have included two datasets in your assignment folder to read in, 'broadway.csv' and 'diseases.txt'.
Question 1.1 Read in broadway using pd.read_csv.
[ ]:
broadway = ...
broadway.head(6)
[ ]:
grader.check("q1_1")
Question 1.2 Now read in the diseases dataset. Diseases is not a .csv but a .txt file, i.e. a plain-text file. Because it's not a .csv, we can't assume that the values are comma separated. Fortunately, pd.read_csv can be called on any plain-text file. With the default comma separator it may not parse the data correctly, but the output may reveal the character that actually separates the entries.
Identify the separator used in diseases.txt and use it to successfully read in your data with pd.read_csv.
[ ]:
separator = ...
diseases = pd.read_csv("diseases.txt", sep = ...)
diseases.head(6)
[ ]:
grader.check("q1_2")
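One way to spot a separator, sketched on made-up data rather than the real diseases.txt: the semicolon, the column names, and the values below are all assumptions for illustration. Parsing with the default comma leaves everything in a single column whose name still shows the true separator.

```python
import io
import pandas as pd

# Hypothetical file contents; the real diseases.txt may use a different separator.
raw_text = "name;year;cases\nmeasles;1990;120\nmumps;1991;45\n"

# With the default comma separator, every row lands in one column,
# and the column name still contains the real separator character.
wrong = pd.read_csv(io.StringIO(raw_text))
print(wrong.columns.tolist())

# Passing the separator seen above splits the columns as intended.
right = pd.read_csv(io.StringIO(raw_text), sep=";")
print(right.shape)
```

Printing the first few raw lines of the file with plain Python is another option when the mis-parsed output is hard to read.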
Question 1.3 Read in the DataFrame called nst-est2016-alldata.csv from the course GitHub. The URL path to the repository is https://github.com/oregon-data-science/DSCI101/raw/main/data/. You should do this with pd.read_csv.
[ ]:
pop_census = ...
[ ]:
grader.check("q1_3")
This DataFrame gives census-based population estimates for each state on both July 1, 2015 and July 1, 2016. The last four columns describe the components of the estimated change in population during this time interval. For all questions below, assume that the word "states" refers to all 52 rows including Puerto Rico & the District of Columbia.
The data was taken from here. If you want to read more about the different column descriptions, click here!
The raw data is a bit messy - run the cell below to clean the DataFrame and make it easier to work with.
[ ]:
# Don't change this cell; just run it.
pop_sum_level = pop_census['SUMLEV'] == 40
pop = pop_census[pop_sum_level]
# grab a numbered list of columns to use
columns_to_use = pop.columns[[1, 4, 12, 13, 27, 34, 62, 69]]
pop = pop[columns_to_use]
pop = pop.rename(columns={'POPESTIMATE2015': '2015',
                          'POPESTIMATE2016': '2016',
                          'BIRTHS2016': 'BIRTHS',
                          'DEATHS2016': 'DEATHS',
                          'NETMIG2016': 'MIGRATION',
                          'RESIDUAL2016': 'OTHER'})
#pop['REGION'].unique()
pop['REGION'] = pop['REGION'].replace({'1': 1, '2': 2, '3': 3, '4': 4, 'X': 0})
pop.head(12)
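The cleaning cell combines three pandas operations: a boolean mask on rows, positional selection of columns, and a column rename. The sketch below repeats those steps on a small invented table so each one can be inspected in isolation. The table `toy` and its values are assumptions, not the census data.

```python
import pandas as pd

# Invented stand-in for the census table; only the column names echo the real file.
toy = pd.DataFrame({
    "SUMLEV": [10, 40, 40],
    "REGION": ["0", "1", "X"],
    "NAME": ["United States", "Maine", "Puerto Rico"],
    "POPESTIMATE2016": [300, 1, 3],
})

# Step 1: a boolean Series keeps only the rows where the condition is True.
is_state = toy["SUMLEV"] == 40
states = toy[is_state]

# Step 2: indexing .columns with a list of positions returns the matching labels.
wanted = states.columns[[1, 2, 3]]
subset = states[wanted]

# Step 3: rename maps old labels to new ones and returns a new DataFrame.
subset = subset.rename(columns={"POPESTIMATE2016": "2016"})
print(subset)
```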
1.1.2 Question 2 - Census data
Question 2.1 Assign us_birth_rate to the total US annual birth rate during this time interval. The annual birth rate for a year-long period is the total number of births in that period as a proportion of the population size at the start of the time period.
Hint: Which year corresponds to the start of the time period?
[ ]:
us_birth_rate = ...
us_birth_rate
[ ]:
grader.check("q2_1")
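The definition above is a ratio of two totals: all births in the period divided by the total population at the start of the period. A minimal sketch on invented numbers follows. The column name `START` and the values are assumptions chosen to keep the arithmetic easy to check by hand.

```python
import pandas as pd

# Invented data: two regions with a starting population and births in the period.
toy = pd.DataFrame({
    "NAME": ["A", "B"],
    "START": [1000, 3000],
    "BIRTHS": [12, 28],
})

# Sum each column first, then divide, so that larger regions carry more weight.
toy_birth_rate = toy["BIRTHS"].sum() / toy["START"].sum()
print(toy_birth_rate)
```

Note that averaging the per-row rates instead would generally give a different number, because it weights every row equally regardless of population.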
Question 2.2 Assign movers to the number of states for which the absolute value (np.abs) of the annual rate of migration was higher than 1%. The annual rate of migration for a year-long period is the net number of migrations (in and out) as a proportion of the population size at the start of the period. The MIGRATION column contains estimated annual net migration counts by state.
[ ]:
...
movers = ...
movers
[ ]:
grader.check("q2_2")
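Counting rows whose rate exceeds a threshold in either direction can be sketched as a per-row division, an absolute value, and a sum over the resulting booleans. The data and the `START` column name below are invented for illustration.

```python
import numpy as np
import pandas as pd

# Invented data: net migration can be negative (more people left than arrived).
toy = pd.DataFrame({
    "START": [1000, 2000, 4000],
    "MIGRATION": [-15, 10, 50],
})

# Per-row rate: element-wise division of one column by another.
rates = toy["MIGRATION"] / toy["START"]

# np.abs drops the sign; the comparison yields a boolean Series.
exceeds = np.abs(rates) > 0.01

# True counts as 1 when summed, so the sum is the number of matching rows.
toy_movers = int(exceeds.sum())
print(toy_movers)
```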
Question 2.3 Assign west_births to the total number of births that occurred in region 4 (the Western US).
Hint: Make sure you double check the type of the values in the region column, and appropriately filter (i.e. the types must match!).
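The hint about matching types can be seen on a small invented table: comparing an integer column against a string never matches, so the filter silently returns no rows. The values below are assumptions for illustration only.

```python
import pandas as pd

# Invented data with an integer-typed region column.
toy = pd.DataFrame({
    "REGION": [4, 1, 4],
    "BIRTHS": [10, 20, 30],
})

# Checking the dtype first shows which kind of value to compare against.
print(toy["REGION"].dtype)

# A string never equals an integer, so this filter keeps nothing.
as_string = toy[toy["REGION"] == "4"]

# Comparing against a value of the same type keeps the intended rows.
as_int = toy[toy["REGION"] == 4]
toy_total = as_int["BIRTHS"].sum()
print(len(as_string), toy_total)
```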