Lab_Assignment_2 - 2

.html

School

Pennsylvania State University

Course

200

Subject

Computer Science

Date

Dec 6, 2023

Type

html

Pages

30

Uploaded by GeneralSummer13484

DS200: Introduction to Data Sciences

Lab Assignment 2: Loops, Tables, Visualizations (2 points)

First, let's import the Python modules needed for this assignment. Remember that you need to import the modules again every time you restart your kernel, runtime, or session.

In [1]:
    from datascience import *
    import matplotlib
    matplotlib.use('Agg')
    %matplotlib inline
    import matplotlib.pyplot as plots
    plots.style.use('fivethirtyeight')
    import numpy as np

Part 1: Random Choice (Chapter 9)

NumPy has a function np.random.choice(...) that picks one item at random from a given array. Every item in the array is equally likely to be picked. Here is an example.

Imagine that one day, when you get home after a long day, you see a hot bowl of nachos waiting on the dining table! Let's say that whenever you take a nacho from the bowl, it will have only cheese, only salsa, both cheese and salsa, or neither cheese nor salsa (a sad tortilla chip indeed).

Let's simulate taking nachos from the bowl at random using np.random.choice(...). Run the cell below three times, and observe how the results may differ between runs.

In [2]:
    nachos = make_array('cheese', 'salsa', 'both', 'neither')
    np.random.choice(nachos)
Out[2]: 'cheese'

In [3]:
    np.random.choice(nachos)
Out[3]: 'cheese'

In [4]:
    np.random.choice(nachos)
Out[4]: 'both'

To repeat this process multiple times, pass in an int n as the second argument. By default, np.random.choice samples with replacement and returns an array of items. Run the next cell to see an example of sampling with replacement 10 times from the nachos array.

In [5]:
    np.random.choice(nachos, 10)
Out[5]:
    array(['salsa', 'cheese', 'salsa', 'neither', 'salsa', 'cheese', 'both',
           'both', 'neither', 'cheese'], dtype='<U7')
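
To make the phrase "with replacement" concrete, here is a minimal sketch that contrasts the default behavior with sampling without replacement. It assumes plain NumPy: np.array stands in for make_array so the cell runs without the datascience module, and the fixed seed and the names with_replacement and without_replacement are illustrative choices, not part of the lab.

```python
import numpy as np

# np.array stands in for make_array (assumption: keeps the sketch self-contained)
nachos = np.array(['cheese', 'salsa', 'both', 'neither'])

# Assumption: a fixed seed, used only so repeated runs give the same draws
np.random.seed(0)

# Default behavior (replace=True): the same item can be drawn many times,
# so asking for 10 draws from a 4-item array is allowed
with_replacement = np.random.choice(nachos, 10)

# replace=False: each item can be drawn at most once,
# so the sample size cannot exceed the length of the array
without_replacement = np.random.choice(nachos, 4, replace=False)

print(with_replacement)
print(np.sort(without_replacement))
```

Drawing all four items without replacement always yields each nacho type exactly once, in a random order, whereas the ten draws with replacement necessarily repeat some types.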

Next, let's use np.random.choice to simulate one roll of a fair die. The following code cell simulates rolling a die once and records the number of spots in a variable x. You can run it multiple times to see how variable the results are.

In [6]:
    x = np.random.choice(np.arange(1, 7))
    x
Out[6]: 6

Problem 1: Rolling a Fair Die 10 Times (0.25 points)

Write an expression that rolls a die 10 times and returns the results in an array.

In [7]:
    # write code for Problem 1 in this cell
    x = np.random.choice(np.arange(1, 7), 10)
    x
Out[7]: array([6, 3, 1, 5, 5, 3, 4, 6, 2, 5])

Part 2: Python Loops (Chapter 9.2)

Iteration

In programming, especially when dealing with randomness, we often want to repeat a process multiple times. For example, consider the game of betting on one roll of a die with the following rules:

  • If the die shows 1 or 2 spots, my net gain is -1 dollar.
  • If the die shows 3 or 4 spots, my net gain is 0 dollars.
  • If the die shows 5 or 6 spots, my net gain is 1 dollar.

The function bet_on_one_roll takes no argument. Each time it is called, it simulates one roll of a fair die and returns the net gain in dollars.

In [8]:
    def bet_on_one_roll():
        """Returns my net gain on one bet"""
        x = np.random.choice(np.arange(1, 7))  # roll a die once and record the number of spots
        if x <= 2:
            return -1
        elif x <= 4:
            return 0
        elif x <= 6:
            return 1

Playing this game once is easy:

In [9]:
    bet_on_one_roll()
Out[9]: -1

To get a sense of how variable the results are, we have to play the game over and over again. We could run the cell repeatedly, but that is tedious, and running it a thousand or a million times by hand is out of the question. A more automated solution is to use a for statement to loop over the contents of a sequence. This is called iteration.
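
Before simulating many bets, it can help to see what the rules imply exactly. The following is a minimal sketch, assuming a vectorized restatement of the three rules applied to all six faces at once. The names faces, gains, p_nonzero, and expected_gain are hypothetical and do not appear in the lab.

```python
import numpy as np

# All six faces of a fair die, each equally likely
faces = np.arange(1, 7)

# Hypothetical vectorized restatement of the betting rules:
# 1 or 2 spots -> -1, 3 or 4 spots -> 0, 5 or 6 spots -> 1
gains = np.where(faces <= 2, -1, np.where(faces <= 4, 0, 1))

# Exact chance that money changes hands on one bet
p_nonzero = np.count_nonzero(gains) / len(gains)

# Exact average net gain per bet
expected_gain = gains.mean()

print(gains)          # [-1 -1  0  0  1  1]
print(p_nonzero)      # four of the six faces give a non-zero gain
print(expected_gain)  # 0.0
```

Because four of the six faces produce a non-zero gain, the exact probability is 2/3, and the gains of -1 and 1 cancel so the average gain is 0. Simulated results should fluctuate around these values.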

A for statement begins with the word for, followed by a name we want to give each item in the sequence, followed by the word in, and ending with an expression that evaluates to a sequence. The indented body of the for statement is executed once for each item in that sequence. The code cell below gives an example.

In [10]:
    for animal in make_array('cat', 'dog', 'rabbit'):
        print(animal)
cat
dog
rabbit

It is helpful to write code that exactly replicates a for statement, without using the for statement. This is called unrolling the loop. A for statement simply replicates the code inside it, but before each iteration, it assigns a new value from the given sequence to the name we chose. For example, here is an unrolled version of the loop above.

In [11]:
    animal = make_array('cat', 'dog', 'rabbit').item(0)
    print(animal)
    animal = make_array('cat', 'dog', 'rabbit').item(1)
    print(animal)
    animal = make_array('cat', 'dog', 'rabbit').item(2)
    print(animal)
cat
dog
rabbit

Notice that the name animal is arbitrary, just like any name we assign with =.

Here we use a for statement in a more realistic way: we print the results of betting five times on the die as described earlier. This is called simulating the results of five bets. We use the word simulating to remind ourselves that we are not physically rolling dice and exchanging money but using Python to mimic the process.

To repeat a process n times, it is common to use the sequence np.arange(n) in the for statement. It is also common to use a very short name for each item. In our code we will use the name i to remind ourselves that it refers to an item.

In [12]:
    for i in np.arange(5):
        print(bet_on_one_roll())
1
1
-1
1
0

In this case, we simply perform exactly the same (random) action several times, so the code in the body of our for statement does not actually refer to i.

The iteration variable i can also be used in the indented body of a loop. The code cell below gives an example.

In [13]:
    nums = np.arange(5)
    sum_nums = 0
    for i in nums:
        sum_nums = sum_nums + i
    print('Sum of the first four positive integers is: ' + str(sum_nums))
Sum of the first four positive integers is: 10

Problem 2A: Iterating over a Custom Array (0.25 points)

Create an array that contains the items "Apple", "Banana", "Kiwi", and "Orange". Write a for loop to iterate over the items in the array and print them one by one.

In [14]:
    # write code for Problem 2A in this cell
    fruits = make_array('Apple', 'Banana', 'Kiwi', 'Orange')
    for item in fruits:
        print(item)
Apple
Banana
Kiwi
Orange

Augmenting Arrays

While the for statement above does simulate the results of five bets, the results are simply printed and are not in a form that we can use for computation. An array of results would be more useful. Thus a typical use of a for statement is to create an array of results, by augmenting the array each time.

The append function in NumPy helps us do this. The call np.append(array_name, value) evaluates to a new array that is array_name augmented by value. When you use append, keep in mind that all the entries of an array must have the same type.

In [15]:
    pets = make_array('Cat', 'Dog')
    np.append(pets, 'Another Pet')
Out[15]: array(['Cat', 'Dog', 'Another Pet'], dtype='<U11')

This keeps the array pets unchanged:

In [16]:
    pets
Out[16]: array(['Cat', 'Dog'], dtype='<U3')

But often while using for loops it is convenient to update an array when augmenting it. Because np.append returns a new array rather than changing the original, this is done by assigning the augmented array to the same name as the original.

In [17]:
    pets = np.append(pets, 'Another Pet')
    pets
Out[17]: array(['Cat', 'Dog', 'Another Pet'], dtype='<U11')

Problem 2B: Creating a New Array by Augmenting (0.25 points)

Use np.append to create an array with letters A through E, by adding the letters one by one to the array, starting from an empty array.

In [18]:
    # write code for Problem 2B in this cell
    alphabets = make_array()
    alphabets = np.append(alphabets, 'A')
    alphabets = np.append(alphabets, 'B')
    alphabets = np.append(alphabets, 'C')
    alphabets = np.append(alphabets, 'D')
    alphabets = np.append(alphabets, 'E')

Example: Betting on 5 Rolls

We can now simulate five bets on the die and collect the results in an array that we will call the collection array. We start by creating an empty array, and then append the outcome of each bet. Notice that the body of the for loop contains two statements. Both statements are executed for each item in the given sequence.

In [19]:
    outcomes = make_array()
    for i in np.arange(5):
        outcome_of_bet = bet_on_one_roll()
        outcomes = np.append(outcomes, outcome_of_bet)
    outcomes
Out[19]: array([ 0., 1., 1., 1., 1.])

As the example above shows, the indented body of a for statement can contain multiple statements, and each of them is executed for each item in the given sequence.

By capturing the results in an array, we have given ourselves the ability to use array methods to do computations. For example, we can use np.count_nonzero to count the number of times money changed hands.

In [20]:
    np.count_nonzero(outcomes)
Out[20]: 4

Betting on 300 Rolls

Iteration is a powerful technique. For example, we can see the variation in the results of 300 bets by running bet_on_one_roll 300 times instead of five.

In [21]:
    outcomes = make_array()
    for i in np.arange(300):
        outcome_of_bet = bet_on_one_roll()
        outcomes = np.append(outcomes, outcome_of_bet)
    outcomes
Out[21]:
    array([ 1., 0., 0., 1., 1., 1., -1., 0., -1., 1., -1., 1., 1., 1., -1., 1., 1., 1., 0., 0., 1., -1., -1., 1., 0., 0., 0., -1., 0., 1., 0., -1., -1., 1., 0., 0., 1., 0., -1., 0., 0., -1., 1., 1., 0., 1., -1., -1., 1., 1., -1., -1., 0., 0., 1., -1., 1., 1., 1., 0., -1., 1., 1., -1., 0., 1., 1., -1., 0., 1., 0., -1., 0., 0., 0., 1., 0., 1., 0., 1., 0., 1., -1., -1., -1., 0., -1., 1., 1., -1., -1., -1., 0., 1., 0., -1., -1., -1., 1., 0., 1., 1., 0., -1., -1., -1., 1., 0., 1., -1., -1., 1., -1., -1., -1., 0., 1., -1., -1., 1., 1., -1., -1., -1., 0., 1., -1., -1., -1., 1.,
           1., 0., 1., 0., 1., 1., 1., -1., 1., 0., 1., -1., 0., 0., 1., -1., 0., 1., 0., 0., 1., 1., -1., 0., 1., 1., 0., 1., -1., 1., -1., 0., -1., -1., -1., -1., -1., -1., -1., 1., -1., -1., 1., -1., 1., 1., -1., 1., 0., 1., 1., 1., 0., -1., 0., 1., -1., -1., -1., 0., 0., 1., 0., 0., 0., 0., -1., 1., -1., 1., -1., 0., -1., 1., 0., 1., 0., 1., -1., 0., -1., 0., 0., 1., 1., -1., -1., -1., 0., -1., -1., 0., -1., 0., 1., -1., 1., -1., 1., 0., 1., -1., 1., 1., 0., 1., 0., 0., 1., -1., -1., 1., 0., -1., 1., 0., 1., 1., 0., 0., 0., 0., 1., 0., -1., 0., 1., 1., 1., -1., 0., 1., 1., -1., 0., 1., 0., 1., 0., 1., -1., 0., 1., 1., 1., -1., -1., 0., 1., 1., 0., 0., -1., 0., 1., 0., -1., 0., -1., 1., 1., -1., 0., 0., 1., 0., 1., 0., 1., 1.])

The array outcomes now contains the results of all 300 bets.

In [22]:
    len(outcomes)
Out[22]: 300

In [23]:
    for i in np.arange(5):
        print(i)
0
1
2
3
4

Problem 2C: Probability of Non-Zero Gain (0.25 points)

Write an expression that uses the 300 simulated outcomes to estimate the probability that our net gain for a bet is not zero.

In [24]:
    # write code for Problem 2C in this cell
    outcomes = make_array()
    for i in np.arange(300):
        outcome_of_bet = bet_on_one_roll()
        outcomes = np.append(outcomes, outcome_of_bet)
    probability = np.count_nonzero(outcomes) / 300
    print(probability)
0.6533333333333333

Part 3: Selecting Rows from Table (Chapter 6.2)

In Lab Assignment 1, we practiced the Table method take, which takes a specific set of rows from a table. Its argument is a row index or an array of indices, and it creates a new table consisting of only those rows.

For example, if we wanted just the first row of movies, we could use take as follows.

In [25]:
    movies = Table.read_table('IMDB_movies.csv')
    movies.take(0)
Out[25]: