hw02_revised
December 7, 2023
1 Homework 2: Arrays, Table Manipulation, and Visualization
[37]:
# Don't change this cell; just run it.
# When you log-in please hit return (not shift + return) after typing in your email
import numpy as np
from datascience import *
# These lines do some fancy plotting magic.
import matplotlib
%matplotlib inline
import matplotlib.pyplot as plots
plots.style.use('fivethirtyeight')
Recommended Reading:
* Data Types
* Sequences
* Tables
Please complete this notebook by filling in the cells provided. Throughout this homework and all future ones, please be sure to not re-assign variables throughout the notebook! For example, if you use max_temperature in your answer to one question, do not reassign it later on.
Before continuing the assignment, select “Save and Checkpoint” in the File menu.
1.1 1. Creating Arrays
Question 1. Make an array called weird_numbers containing the following numbers (in the given order):
1. -2
2. the sine of 1.2
3. 3
4. 5 to the power of the cosine of 1.2
Hint: sin and cos are functions in the math module.
Note: Python lists behave differently from numpy arrays. In Data 8, we use numpy arrays, so please make an array, not a Python list.
[2]:
# Our solution involved one extra line of code before creating
# weird_numbers.
import math
weird_numbers = make_array(-2, math.sin(1.2), 3, 5**math.cos(1.2))
weird_numbers
[2]:
array([-2.        ,  0.93203909,  3.        ,  1.79174913])
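To illustrate the note about lists and arrays, here is a minimal sketch. It assumes plain `np.array` as a stand-in for `make_array`, and the names `as_list`, `as_array`, `doubled_list`, and `doubled_array` are made up for this example.

```python
import math
import numpy as np

# The same four values stored as a Python list and as a NumPy array.
as_list = [-2, math.sin(1.2), 3, 5 ** math.cos(1.2)]
as_array = np.array(as_list)

# Multiplying a list by 2 repeats the list;
# multiplying an array by 2 doubles every element.
doubled_list = as_list * 2
doubled_array = as_array * 2

print(len(doubled_list))   # 8: the list was concatenated with itself
print(len(doubled_array))  # 4: same length, each value doubled
```

This difference is one reason the assignment asks for an array rather than a list.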
Question 2. Make an array called numbers_in_order using the np.sort function.
[115]:
numbers_in_order = np.sort(weird_numbers)
numbers_in_order
[115]:
array([-2.        ,  0.93203909,  1.79174913,  3.        ])
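As a small aside, `np.sort` returns a new sorted array and leaves its input untouched. The sketch below uses made-up values and hypothetical names (`original`, `sorted_copy`) to show this.

```python
import numpy as np

original = np.array([3.0, -2.0, 1.5])

# np.sort returns a sorted copy in ascending order.
sorted_copy = np.sort(original)

print(sorted_copy)  # ascending order
print(original)     # still in its original order
```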
Question 3. Find the mean and median of weird_numbers using the np.mean and np.median functions.
[116]:
weird_mean = np.mean(weird_numbers)
weird_median = np.median(weird_numbers)
# These lines are provided just to print out your answers.
print('weird_mean:', weird_mean)
print('weird_median:', weird_median)
weird_mean: 0.930947052910613
weird_median: 1.361894105821226
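When an array has an even number of elements, `np.median` averages the two middle values after sorting, which is why the median above is not itself an element of the array. A minimal sketch with made-up values:

```python
import numpy as np

values = np.array([-2.0, 1.0, 3.0, 10.0])

mean_value = np.mean(values)      # (-2 + 1 + 3 + 10) / 4
median_value = np.median(values)  # average of the two middle values, 1 and 3

print(mean_value, median_value)
```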
1.2 2. Indexing Arrays
These exercises give you practice accessing individual elements of arrays. In Python (and in many programming languages), elements are accessed by index, so the first element is the element at index 0.
Note: Please don’t use bracket notation when indexing (i.e. arr[0]), as this can yield different data type outputs than what we will be expecting.
Question 1. The cell below creates an array of some numbers. Set third_element to the third element of some_numbers.
[117]:
some_numbers = make_array(-1, -3, -6, -10, -15)
third_element = some_numbers.item(2)
third_element
[117]:
-6
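The earlier note about bracket notation can be seen by comparing types. This sketch assumes `np.array` in place of `make_array`; `via_item` and `via_brackets` are names invented for the example.

```python
import numpy as np

some_numbers = np.array([-1, -3, -6, -10, -15])

via_item = some_numbers.item(2)   # a plain Python int
via_brackets = some_numbers[2]    # a NumPy integer scalar

# Same value, different types.
print(via_item == via_brackets)
print(type(via_item), type(via_brackets))
```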
Question 2. The next cell creates a table that displays some information about the elements of some_numbers and their order. Run the cell to see the partially-completed table, then fill in the missing information (the cells that say “Ellipsis”) by assigning blank_a, blank_b, blank_c, and blank_d to the correct elements in the table.
[118]:
blank_a = 'third'
blank_b = 'fourth'
blank_c = 0
blank_d = 3
elements_of_some_numbers = Table().with_columns(
    "English name for position", make_array("first", "second", blank_a, blank_b, "fifth"),
    "Index", make_array(blank_c, 1, 2, blank_d, 4),
    "Element", some_numbers)
elements_of_some_numbers
[118]:
English name for position | Index | Element
first                     | 0     | -1
second                    | 1     | -3
third                     | 2     | -6
fourth                    | 3     | -10
fifth                     | 4     | -15
Question 3. You’ll sometimes want to find the last element of an array. Suppose an array has 142 elements. What is the index of its last element?
[119]:
index_of_last_element = 141
More often, you don’t know the number of elements in an array, its length. (For example, it might be a large dataset you found on the Internet.) The function len takes a single argument, an array, and returns the length of that array (an integer).
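The relationship between length and last index can be checked directly. The sketch below builds a hypothetical 142-element array with `np.arange` (not part of the assignment) and compares the computed last index with the negative index `-1`.

```python
import numpy as np

example = np.arange(142)  # hypothetical array with 142 elements

# Indices start at 0, so the last index is one less than the length.
last_index = len(example) - 1
print(last_index)

# Index -1 refers to the same element without needing the length.
print(example.item(last_index) == example.item(-1))
```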
Question 4. The cell below loads an array called president_birth_years. Calling .column(...) on a table returns an array of the column specified, in this case the Birth Year column of the president_births table. The last element in that array is the most recent birth year of any deceased president. Assign that year to most_recent_birth_year.
[120]:
president_birth_years = Table.read_table("president_births.csv").column('Birth Year')
most_recent_birth_year = president_birth_years.item(-1)
most_recent_birth_year
[120]:
1917
Question 5. Finally, assign sum_of_birth_years to the sum of the first, tenth, and last birth year in president_birth_years.
[121]:
sum_of_birth_years = president_birth_years.item(0) + president_birth_years.item(9) + president_birth_years.item(-1)
sum_of_birth_years
[121]:
5433
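Because indices start at 0, the tenth element sits at index 9. The sketch below repeats the same pattern on a hypothetical array, `years`, which stands in for the CSV data and is not the real list of birth years.

```python
import numpy as np

# Hypothetical stand-in data: 20 consecutive years starting at 1900.
years = np.arange(1900, 1920)

first = years.item(0)    # first element
tenth = years.item(9)    # tenth element is at index 9
last = years.item(-1)    # last element

total = first + tenth + last
print(first, tenth, last, total)
```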
1.3 3. Basic Array Arithmetic
Question 1. Multiply the numbers 42, 4224, 42422424, and -250 by 157. Assign each variable below such that first_product is assigned to the result of $42 * 157$, second_product is assigned to the result of $4224 * 157$, and so on. For this question, don’t use arrays.
[3]:
first_product = 42*157
second_product = 4224*157
third_product = 42422424*157
fourth_product = -250*157
print(first_product, second_product, third_product, fourth_product)
6594 663168 6660320568 -39250
Question 2. Now, do the same calculation, but using an array called numbers and only a single multiplication (*) operator. Store the 4 results in an array named products.
[4]:
numbers = make_array(42, 4224, 42422424, -250)
products = numbers * 157
products
[4]:
array([      6594,     663168, 6660320568,     -39250])
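Multiplying an array by a number applies the multiplication to every element, so one `*` replaces the four separate products. The sketch below checks this, assuming `np.array` in place of `make_array`; the explicit `dtype=np.int64` is a precaution added here so the largest product fits on every platform.

```python
import numpy as np

# 64-bit integers so that 42422424 * 157 does not overflow.
numbers = np.array([42, 4224, 42422424, -250], dtype=np.int64)

# One multiplication, applied element by element.
products = numbers * 157

# The same four products computed one at a time with Python ints.
one_by_one = [42 * 157, 4224 * 157, 42422424 * 157, -250 * 157]

print(np.array_equal(products, one_by_one))
```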
Question 3. Oops, we made a typo! Instead of 157, we wanted to multiply each number by 1577. Compute the correct products in the cell below using array arithmetic. Notice that your job is really easy if you previously defined an array containing the 4 numbers.
[5]:
correct_products = numbers * 1577
correct_products
[5]:
array([      66234,     6661248, 66900162648,     -394250])
Question 4. We’ve loaded an array of temperatures in the next cell. Each number is the highest temperature observed on a day at a climate observation station, mostly from the US. Since they’re from the US government agency NOAA, all the temperatures are in Fahrenheit. Convert them all to Celsius by first subtracting 32 from them, then multiplying the results by $\frac{5}{9}$. Make sure to ROUND the final result after converting to Celsius to the nearest integer using the np.round function.
[6]:
max_temperatures = Table.read_table("temperatures.csv").column("Daily Max Temperature")
celsius_max_temperatures = np.round(5*(max_temperatures-32)/9)
celsius_max_temperatures
[6]:
array([-4., 31., 32., …, 17., 23., 16.])
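The conversion can be tried on a few familiar reference points. The values in `fahrenheit` below are made up for illustration and are not taken from temperatures.csv. The last two lines show one behavior worth knowing: `np.round` rounds exact halves to the nearest even integer.

```python
import numpy as np

# Hypothetical readings in Fahrenheit.
fahrenheit = np.array([32.0, 212.0, 50.0, 98.6])

# Subtract 32, multiply by 5/9, then round to the nearest integer.
celsius = np.round((fahrenheit - 32) * 5 / 9)
print(celsius)  # 0, 100, 10, 37

# Exact halves go to the nearest even integer: 0.5 -> 0, 1.5 -> 2, 2.5 -> 2.
halves = np.round(np.array([0.5, 1.5, 2.5]))
print(halves)
```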
Question 5. The cell below loads all the lowest temperatures from each day (in Fahrenheit). Compute the size of the daily temperature range for each day. That is, compute the difference between each daily maximum temperature and the corresponding daily minimum temperature. Pay attention to the units, give your answer in Celsius! Make sure NOT to round your answer for this question!
[7]:
min_temperatures = Table.read_table("temperatures.csv").column("Daily Min Temperature")
celsius_temperature_ranges = 5*(max_temperatures - min_temperatures)/9
celsius_temperature_ranges
[7]:
array([ 6.66666667, 10.        , 12.22222222, …, 17.22222222,
11.66666667, 11.11111111])
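The solution multiplies the Fahrenheit difference by 5/9 without subtracting 32. That works because the 32 offset cancels when two converted temperatures are subtracted. A minimal check, using made-up values and a hypothetical helper `to_celsius`:

```python
import numpy as np

# Hypothetical daily highs and lows in Fahrenheit.
max_f = np.array([50.0, 86.0, 104.0])
min_f = np.array([32.0, 68.0, 77.0])

def to_celsius(f):
    """Convert Fahrenheit to Celsius (no rounding)."""
    return (f - 32) * 5 / 9

# Scale the Fahrenheit range directly...
range_direct = (max_f - min_f) * 5 / 9
# ...or convert both temperatures first and then subtract.
range_converted = to_celsius(max_f) - to_celsius(min_f)

print(np.allclose(range_direct, range_converted))
```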
1.4 4. World Population
The cell below loads a table of estimates of the world population for different years, starting in 1950. The estimates come from the US Census Bureau website.
[8]:
world = Table.read_table("world_population.csv").select('Year', 'Population')
world.show(4)
<IPython.core.display.HTML object>
The name population is assigned to an array of population estimates.
[9]:
population = world.column(1)
population
[9]:
array([2557628654, 2594939877, 2636772306, 2682053389, 2730228104,
2782098943, 2835299673, 2891349717, 2948137248, 3000716593,
3043001508, 3083966929, 3140093217, 3209827882, 3281201306,
3350425793, 3420677923, 3490333715, 3562313822, 3637159050,
3712697742, 3790326948, 3866568653, 3942096442, 4016608813,
4089083233, 4160185010, 4232084578, 4304105753, 4379013942,
4451362735, 4534410125, 4614566561, 4695736743, 4774569391,
4856462699, 4940571232, 5027200492, 5114557167, 5201440110,
5288955934, 5371585922, 5456136278, 5538268316, 5618682132,
5699202985, 5779440593, 5857972543, 5935213248, 6012074922,
6088571383, 6165219247, 6242016348, 6318590956, 6395699509,
6473044732, 6551263534, 6629913759, 6709049780, 6788214394,
6866332358, 6944055583, 7022349283, 7101027895, 7178722893,
7256490011])
In this question, you will apply some built-in Numpy functions to this array. Numpy is a module that is often used in Data Science!
The difference function np.diff subtracts each element in an array from the element after it within the array. As a result, the length of the array np.diff returns will always be one less than the length of the input array.
The cumulative sum function np.cumsum outputs an array of partial sums. For example, the third element in the output array corresponds to the sum of the first, second, and third elements.
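Both functions are easy to check by hand on a short array. The values below are made up for illustration.

```python
import numpy as np

values = np.array([1, 4, 9, 16])

# Each element subtracted from the one after it: 4-1, 9-4, 16-9.
differences = np.diff(values)
print(differences)   # 3, 5, 7 (one element shorter than the input)

# Running totals: 1, 1+4, 1+4+9, 1+4+9+16.
partial_sums = np.cumsum(values)
print(partial_sums)  # 1, 5, 14, 30
```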
Question 1. Very often in data science, we are interested in understanding how values change with time. Use np.diff and np.max (or just max) to calculate the largest annual change in population between any two consecutive years.
[10]:
largest_population_change = np.max(np.diff(population))
largest_population_change
[10]:
87515824
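A natural follow-up is to ask which pair of years produced the largest change. The sketch below uses only the first five population estimates shown earlier (1950 to 1954), so its result describes that short slice and not the full dataset. Using `np.argmax` and the names `years`, `changes`, and `start_year` is an addition for illustration, not part of the assignment.

```python
import numpy as np

# First five estimates from the population array (years 1950-1954).
years = np.arange(1950, 1955)
population_slice = np.array(
    [2557628654, 2594939877, 2636772306, 2682053389, 2730228104],
    dtype=np.int64,  # values exceed the 32-bit integer range
)

changes = np.diff(population_slice)
largest_change = np.max(changes)

# Position i in changes is the change from years[i] to years[i + 1].
start_year = years.item(np.argmax(changes))
print(largest_change, start_year, start_year + 1)
```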
[11]:
np.diff(population)
[11]:
array([37311223, 41832429, 45281083, 48174715, 51870839, 53200730,
56050044, 56787531, 52579345, 42284915, 40965421, 56126288,
69734665, 71373424, 69224487, 70252130, 69655792, 71980107,
74845228, 75538692, 77629206, 76241705, 75527789, 74512371,
72474420, 71101777, 71899568, 72021175, 74908189, 72348793,
83047390, 80156436, 81170182, 78832648, 81893308, 84108533,
86629260, 87356675, 86882943, 87515824, 82629988, 84550356,
82132038, 80413816, 80520853, 80237608, 78531950, 77240705,
76861674, 76496461, 76647864, 76797101, 76574608, 77108553,
77345223, 78218802, 78650225, 79136021, 79164614, 78117964,
77723225, 78293700, 78678612, 77694998, 77767118])
Question 2. What do the values in the resulting array represent (choose one)?
[12]:
np.cumsum(np.diff(population))
[12]:
array([37311223, 79143652, 124424735, 172599450, 224470289,
277671019, 333721063, 390508594, 443087939, 485372854,
526338275, 582464563, 652199228, 723572652, 792797139,
863049269, 932705061, 1004685168, 1079530396, 1155069088,
1232698294, 1308939999, 1384467788, 1458980159, 1531454579,
1602556356, 1674455924, 1746477099, 1821385288, 1893734081,
1976781471, 2056937907, 2138108089, 2216940737, 2298834045,
2382942578, 2469571838, 2556928513, 2643811456, 2731327280,
2813957268, 2898507624, 2980639662, 3061053478, 3141574331,
3221811939, 3300343889, 3377584594, 3454446268, 3530942729,
3607590593, 3684387694, 3760962302, 3838070855, 3915416078,
3993634880, 4072285105, 4151421126, 4230585740, 4308703704,
4386426929, 4464720629, 4543399241, 4621094239, 4698861357])
1) The total population change between consecutive years, starting at 1951.
2) The total population change between 1950 and each later year, starting at 1951.