hw02_revised

.pdf

School

University of North Georgia, Dahlonega

Course

1001

Subject

Computer Science

Date

Dec 6, 2023

Type

pdf

Pages

18

Uploaded by ProfessorIron11938

hw02_revised
September 15, 2023

1 Homework 2: Arrays, Table Manipulation, and Visualization

[4]: # Don't change this cell; just run it.
     # When you log in, please hit return (not shift + return) after typing in your email.
     import numpy as np
     from datascience import *

     # These lines do some fancy plotting magic.
     import matplotlib
     %matplotlib inline
     import matplotlib.pyplot as plots
     plots.style.use('fivethirtyeight')

Recommended Reading:
* Data Types
* Sequences
* Tables

Please complete this notebook by filling in the cells provided.

Throughout this homework and all future ones, please be sure not to re-assign variables throughout the notebook! For example, if you use max_temperature in your answer to one question, do not reassign it later on.

Before continuing the assignment, select “Save and Checkpoint” in the File menu.

1.1 1. Creating Arrays

Question 1. Make an array called weird_numbers containing the following numbers (in the given order):
1. -2
2. the sine of 1.2
3. 3
4. 5 to the power of the cosine of 1.2

Hint: sin and cos are functions in the math module.

Note: Python lists behave differently from numpy arrays. In Data 8, we use numpy arrays, so please make an array, not a Python list.

[5]: # Our solution involved one extra line of code before creating
     # weird_numbers.
     ...
     weird_numbers = make_array(-2, np.sin(1.2), 3, 5 ** np.cos(1.2))
     weird_numbers

[5]: array([-2.        ,  0.93203909,  3.        ,  1.79174913])

Question 2. Make an array called numbers_in_order using the np.sort function.

[6]: # np.sort already returns an array; wrapping it in make_array would nest it
     # inside a second, outer array.
     numbers_in_order = np.sort(weird_numbers)
     numbers_in_order

[6]: array([-2.        ,  0.93203909,  1.79174913,  3.        ])

Question 3. Find the mean and median of weird_numbers using the np.mean and np.median functions.

[7]: weird_mean = np.mean(weird_numbers)
     weird_median = np.median(weird_numbers)

     # These lines are provided just to print out your answers.
     print('weird_mean:', weird_mean)
     print('weird_median:', weird_median)

weird_mean: 0.930947052910613
weird_median: 1.361894105821226

1.2 2. Indexing Arrays

These exercises give you practice accessing individual elements of arrays. In Python (and in many programming languages), elements are accessed by index, so the first element is the element at index 0.

Note: Please don’t use bracket notation when indexing (i.e. arr[0]), as this can yield different data type outputs than what we will be expecting.

Question 1. The cell below creates an array of some numbers. Set third_element to the third element of some_numbers.

[8]: some_numbers = make_array(-1, -3, -6, -10, -15)
     third_element = some_numbers.item(2)
     third_element

[8]: -6

Question 2. The next cell creates a table that displays some information about the elements of some_numbers and their order. Run the cell to see the partially completed table, then fill in the missing information (the cells that say “Ellipsis”) by assigning blank_a, blank_b, blank_c, and blank_d to the correct elements in the table.
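To illustrate the note above about bracket notation, here is a minimal sketch of how the two ways of indexing can differ in the type they return. It uses np.array in place of make_array so that it runs without the datascience package (an assumption made only for this example), and the names by_bracket and by_item are hypothetical.

```python
import numpy as np

# Same values as some_numbers, built with plain NumPy for self-containment.
some_numbers = np.array([-1, -3, -6, -10, -15])

by_bracket = some_numbers[2]     # a NumPy scalar type
by_item = some_numbers.item(2)   # a built-in Python int

# The values agree; only the types differ.
print(by_bracket == by_item)
print(type(by_bracket).__name__, type(by_item).__name__)
```

The two values compare equal, so the difference only matters where the exact data type is checked.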
[9]: blank_a = "third"
     blank_b = "fourth"
     blank_c = 0
     blank_d = 3
     elements_of_some_numbers = Table().with_columns(
         "English name for position", make_array("first", "second", blank_a, blank_b, "fifth"),
         "Index", make_array(blank_c, 1, 2, blank_d, 4),
         "Element", some_numbers)
     elements_of_some_numbers

[9]: English name for position | Index | Element
     first                     | 0     | -1
     second                    | 1     | -3
     third                     | 2     | -6
     fourth                    | 3     | -10
     fifth                     | 4     | -15

Question 3. You’ll sometimes want to find the last element of an array. Suppose an array has 142 elements. What is the index of its last element?

[10]: index_of_last_element = 141
      index_of_last_element

[10]: 141

More often, you don’t know the number of elements in an array, its length. (For example, it might be a large dataset you found on the Internet.) The function len takes a single argument, an array, and returns the length of that array (an integer).

Question 4. The cell below loads an array called president_birth_years. Calling .column(...) on a table returns an array of the column specified, in this case the Birth Year column of the president_births table. The last element in that array is the most recent birth year of any deceased president. Assign that year to most_recent_birth_year.

[11]: president_birth_years = Table.read_table("president_births.csv").column('Birth Year')
      # Use len so the index of the last element is not hard-coded.
      most_recent_birth_year = president_birth_years.item(len(president_birth_years) - 1)
      most_recent_birth_year

[11]: 1917

[12]: president_birth_years = Table.read_table("president_births.csv").column('Birth Year')
      president_birth_years
[12]: array([1732, 1735, 1743, 1751, 1758, 1767, 1767, 1773, 1782, 1784, 1790, 1791,
             1795, 1800, 1804, 1808, 1809, 1822, 1822, 1829, 1831, 1833, 1837, 1843,
             1856, 1857, 1858, 1865, 1872, 1874, 1882, 1884, 1890, 1908, 1911, 1913,
             1913, 1917])

Question 5. Finally, assign sum_of_birth_years to the sum of the first, tenth, and last birth year in president_birth_years.

[13]: # Indices 0 and 9 are the first and tenth elements; len gives the last index.
      sum_of_birth_years = (president_birth_years.item(0)
                            + president_birth_years.item(9)
                            + president_birth_years.item(len(president_birth_years) - 1))
      sum_of_birth_years

[13]: 5433

1.3 3. Basic Array Arithmetic

Question 1. Multiply the numbers 42, 4224, 42422424, and -250 by 157. Assign each variable below such that first_product is assigned to the result of 42 × 157, second_product is assigned to the result of 4224 × 157, and so on. For this question, don’t use arrays.

[14]: first_product = 42 * 157
      second_product = 4224 * 157
      third_product = 42422424 * 157
      fourth_product = -250 * 157
      print(first_product, second_product, third_product, fourth_product)

6594 663168 6660320568 -39250

Question 2. Now, do the same calculation, but using an array called numbers and only a single multiplication (*) operator. Store the 4 results in an array named products.

[15]: numbers = make_array(42, 4224, 42422424, -250)
      products = numbers * 157
      products

[15]: array([ 6594, 663168, 6660320568, -39250])

Question 3. Oops, we made a typo! Instead of 157, we wanted to multiply each number by 1577. Compute the correct products in the cell below using array arithmetic. Notice that your job is really easy if you previously defined an array containing the 4 numbers.

[16]: correct_products = numbers * 1577
      correct_products

[16]: array([ 66234, 6661248, 66900162648, -394250])

Question 4. We’ve loaded an array of temperatures in the next cell. Each number is the highest temperature observed on a day at a climate observation station, mostly from the US. Since they’re
from the US government agency NOAA, all the temperatures are in Fahrenheit. Convert them all to Celsius by first subtracting 32 from them, then multiplying the results by 5/9. Make sure to ROUND the final result after converting to Celsius to the nearest integer using the np.round function.

[17]: max_temperatures = Table.read_table("temperatures.csv").column("Daily Max Temperature")
      # Round after the full conversion, not before multiplying by 5/9.
      celsius_max_temperatures = np.round((max_temperatures - 32) * 5 / 9)
      celsius_max_temperatures

[17]: array([-4., 31., 32., …, 17., 23., 16.])

Question 5. The cell below loads all the lowest temperatures from each day (in Fahrenheit). Compute the size of the daily temperature range for each day. That is, compute the difference between each daily maximum temperature and the corresponding daily minimum temperature. Pay attention to the units and give your answer in Celsius! Make sure NOT to round your answer for this question!

[18]: min_temperatures = Table.read_table("temperatures.csv").column("Daily Min Temperature")
      celsius_temperature_ranges = (max_temperatures - 32) * 5 / 9 - (min_temperatures - 32) * 5 / 9
      celsius_temperature_ranges

[18]: array([ 6.66666667, 10.        , 12.22222222, …, 17.22222222, 11.66666667, 11.11111111])

1.4 4. World Population

The cell below loads a table of estimates of the world population for different years, starting in 1950. The estimates come from the US Census Bureau website.

[19]: world = Table.read_table("world_population.csv").select('Year', 'Population')
      world.show(4)

<IPython.core.display.HTML object>

The name population is assigned to an array of population estimates.

[20]: population = world.column(1)
      population

[20]: array([2557628654, 2594939877, 2636772306, 2682053389, 2730228104,
             2782098943, 2835299673, 2891349717, 2948137248, 3000716593,
             3043001508, 3083966929, 3140093217, 3209827882, 3281201306,
             3350425793, 3420677923, 3490333715, 3562313822, 3637159050,
             3712697742, 3790326948, 3866568653, 3942096442, 4016608813,
             4089083233, 4160185010, 4232084578, 4304105753, 4379013942,
             4451362735, 4534410125, 4614566561, 4695736743, 4774569391,
             4856462699, 4940571232, 5027200492, 5114557167, 5201440110,
             5288955934, 5371585922, 5456136278, 5538268316, 5618682132,
             5699202985, 5779440593, 5857972543, 5935213248, 6012074922,
             6088571383, 6165219247, 6242016348, 6318590956, 6395699509,
             6473044732, 6551263534, 6629913759, 6709049780, 6788214394,
             6866332358, 6944055583, 7022349283, 7101027895, 7178722893,
             7256490011])

In this question, you will apply some built-in Numpy functions to this array. Numpy is a module that is often used in Data Science!

The difference function np.diff subtracts each element in an array from the element after it within the array. As a result, the length of the array np.diff returns will always be one less than the length of the input array.

The cumulative sum function np.cumsum outputs an array of partial sums. For example, the third element in the output array corresponds to the sum of the first, second, and third elements.

Question 1. Very often in data science, we are interested in understanding how values change with time. Use np.diff and np.max (or just max) to calculate the largest annual change in population between any two consecutive years.

[21]: largest_population_change = np.max(np.diff(population))
      largest_population_change

[21]: 87515824

Question 2. What do the values in the resulting array represent (choose one)?

[22]: np.cumsum(np.diff(population))

[22]: array([  37311223,   79143652,  124424735,  172599450,  224470289,
              277671019,  333721063,  390508594,  443087939,  485372854,
              526338275,  582464563,  652199228,  723572652,  792797139,
              863049269,  932705061, 1004685168, 1079530396, 1155069088,
             1232698294, 1308939999, 1384467788, 1458980159, 1531454579,
             1602556356, 1674455924, 1746477099, 1821385288, 1893734081,
             1976781471, 2056937907, 2138108089, 2216940737, 2298834045,
             2382942578, 2469571838, 2556928513, 2643811456, 2731327280,
             2813957268, 2898507624, 2980639662, 3061053478, 3141574331,
             3221811939, 3300343889, 3377584594, 3454446268, 3530942729,
             3607590593, 3684387694, 3760962302, 3838070855, 3915416078,
             3993634880, 4072285105, 4151421126, 4230585740, 4308703704,
             4386426929, 4464720629, 4543399241, 4621094239, 4698861357])

1) The total population change between consecutive years, starting at 1951.
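As a small illustration of how np.diff and np.cumsum interact, the sketch below applies both to a short made-up array. The array values and the names changes, running_total, and change_since_start are hypothetical and are not taken from the population data.

```python
import numpy as np

# A tiny made-up series standing in for yearly measurements.
values = np.array([10, 13, 19, 20])

# Each element minus the one before it; one element shorter than the input.
changes = np.diff(values)

# Partial sums of those changes.
running_total = np.cumsum(changes)

# Summing the changes step by step gives each later value minus the first value.
change_since_start = values[1:] - values.item(0)

print(changes)
print(running_total)
print(np.array_equal(running_total, change_since_start))
```

On this toy input, the cumulative sum of the differences matches the change measured from the first element, which may help when interpreting the output of cell [22].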