Lab03-prelab.html

School

University of British Columbia

Course

119

Subject

Computer Science

Date

Feb 20, 2024


Lab 03 Prelab, Part 2 - Analysis preparation and initial data collection

Please complete Part 1 of the prelab on Canvas before working through this notebook.

In [41]:

```python
%reset -f # Clear all variables, start with a clean environment.
import numpy as np
import data_entry2
```

This prelab activity introduces a useful feature of our data_entry2 spreadsheet tool and then walks you through how to calculate, using Python, the quantities average, standard deviation and (standard) uncertainty of the mean. It starts by using a hypothetical example data set to guide you through the use of the relevant Python functions. The work done with the hypothetical data set will not be handed in directly; instead, it will set you up to perform these same calculations on some real data, also collected in this prelab.

Simple Calculations in data_entry2 cells

It is possible to do some simple calculations directly in the data_entry2 sheet. In general we want you to do calculations using Python, but for some tasks, most notably recording your uncertainties, it is very convenient to use this feature of the sheet. As an example, if you measure a mass of 497 g, and estimate a 95% confidence interval of 477 g to 516 g, your sheet could look like:

| m | u_m |
|---|---|
| g | g |
| 497 | `= (516-477)/4` |

Alternatively, if you have a rectangular PDF on a balance with a 10 g resolution, you might use something like:

| m | u_m |
|---|---|
| g | g |
| 142 | `= 10/(2 * np.sqrt(3))` |

Try it
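For reference, the two sheet expressions above can also be evaluated in ordinary Python. This is only a sketch for checking the arithmetic: the variable names are made up here, and in the lab the expressions would be typed into the uncertainty column of the sheet instead.

```python
import numpy as np

# 95% confidence interval estimate: the interval spans about 4 standard
# uncertainties (2 on each side of the value), so divide its width by 4.
ci_low, ci_high = 477, 516   # g, from the first example
u_m_ci = (ci_high - ci_low) / 4

# Rectangular PDF: the standard uncertainty is the full width of the
# rectangle (here the 10 g resolution) divided by 2*sqrt(3).
resolution = 10              # g, from the second example
u_m_rect = resolution / (2 * np.sqrt(3))

print("u_m from 95% CI     =", u_m_ci, "g")
print("u_m from resolution = {:.2f} g".format(u_m_rect))
```

The first expression evaluates to 9.75 g and the second to roughly 2.89 g.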

Use the sheet below to try out both of these styles of uncertainty.

• Enter a variable name, m (in grams), for the first column, and u_m in the second column.
• In the next two rows, enter the measurements and the expressions to calculate uncertainties as shown in the two examples above.
• To get rid of unused rows and columns, execute (Shift-Enter) the notebook cell that creates the sheet again.
• Notice that in the sheet interface you see the formulas you've entered, but that when you Generate Vectors, the expressions are evaluated and the generated uncertainty vector contains the results of the calculations.
• Alter one of the expressions in the uncertainty column so that it contains an error (perhaps add an extra ')' at the end of the expression) to see what happens.

In [42]:

```python
de0 = data_entry2.sheet("test_formulas")
```

Sheet name: test_formulas.csv

Summary of Part 1 of the prelab

Here is a summary of the statistics concepts covered or reviewed in Part 1 of this prelab:

a) Average is given by

$$x_{ave} = \frac{1}{N} \sum_{i=1}^N x_i$$

b) For variables that follow a Gaussian distribution, approximately 68% of the values lie in the range $x_{ave} - \sigma$ to $x_{ave} + \sigma$ (68% CI).

c) Approximately 95% of the values will lie within the range $x_{ave} - 2\sigma$ to $x_{ave} + 2\sigma$ (95% CI).

d) Standard deviation is given by

$$ \sigma = \frac{95\% \,\mathrm{CI}}{4} = \sqrt{\frac{1}{N-1}\sum_{i=1}^N \left(x_i - x_{ave}\right)^2} $$

e) We use the standard deviation as an indicator of the uncertainty (or the variability) in a single measurement, and it does not depend on the number of measurements taken.

f) Uncertainty of the mean (often called standard error of the mean) is given by

$$\sigma_m = u\_x_{ave} = \frac{\sigma}{\sqrt{N}}$$

We use uncertainty of the mean as an indicator of the uncertainty (or the variability) in the average of multiple measurements, and it does improve as we increase the number of measurements.

Developing your Python skills

Let's import a spreadsheet of our data, "prelab03_1".

In [43]:

```python
# Run me to import the spreadsheet, `prelab03_1`, which is found in the same directory as `Lab03-prelab.ipynb`
de = data_entry2.sheet('prelab03_1')
```

Sheet name: prelab03_1.csv

Below is a table of the hypothetical data in your imported spreadsheet.

Your turn #1: Double-check that you have the correct number of data points. It should be 25, but you need to recall that Python indexing starts at 0, so the last of 25 entries has index 24!

Hypothetical data

| d (mm) |
|---|
| 439.3 |
| 431.6 |
| 434.6 |
| 433.3 |
| 439.3 |
| 442.6 |
| 428.6 |
| 441.6 |
| 431.2 |
| 427.6 |
| 433.2 |
| 441.3 |
| 436 |
| 437.6 |
| 434.7 |
| 433.2 |
| 433.1 |
| 431.3 |
| 436 |
| 432.9 |
| 436.5 |
| 437.2 |
| 435.7 |
| 432.6 |
| 434.7 |

Calculating average and standard deviation using Python numpy functions

Your turn #2: Press the 'Generate Vectors' button at the top of your spreadsheet to transfer the data into the Python environment, and then calculate the average and standard deviation in the cell below using the np.mean() and np.std() functions, respectively. np.mean() has a single argument, which is the vector of values over which to calculate the average. We discuss the second argument of np.std() below.

Note: If it is not working correctly, double-check above that you have correctly titled the single spreadsheet column as 'd' and that there is a resulting generated vector 'dVec'.

In [44]:

```python
# Run me to calculate average and standard deviation.
# Note how we're able to include descriptive text and units in the print commands.
dAve = np.mean(dVec)
print("Average of d =", dAve, "mm")

dStd = np.std(dVec, ddof=1)
print("Standard deviation of d =", dStd, "mm")
```

Average of d = 435.028 mm
Standard deviation of d = 3.8362872676586677 mm

You should find that the average is 435.028 mm, which is consistent with our earlier estimate of 435 mm from the histogram. The standard deviation should be 3.8362872676586677 mm, which would be 3.8 mm if we were to round it to 2 significant figures when we report it. This is also consistent with our estimate of 4 mm using the 95% Confidence Interval with the histogram earlier.

Note that in np.std() we are supplying a second argument, ddof=1. This additional argument is needed because the np.std() function uses a general formula in its calculation, so that it can be used for a number of related calculations. In particular, the formula it uses is:

$$ \textrm{np.std()} = \sqrt{\frac{1}{N-\textrm{ddof}}\sum_{i=1}^N \left(x_i - x_{ave}\right)^2} $$

We want $N-1$ in the denominator as per our definition of standard deviation, so we need to use ddof = 1:

$$ \sigma = \sqrt{\frac{1}{N-1}\sum_{i=1}^N \left(x_i - x_{ave}\right)^2} $$

If you are interested, ddof is an abbreviation for 'delta degrees of freedom.' As discussed in Lab 01, we use one 'degree of freedom' from our dataset when we calculate the average. Since the average is used in the calculation of standard deviation, we control for this in the formula for standard deviation by dividing the sum of the squared differences between each data point and the mean by $N-1$ instead of $N$.

If you want to control the number of significant figures displayed, you can modify the print statement as follows. Within the curly braces, the ':.2' tells 'format()' to round the variable passed to it (in this case 'dStd', the standard deviation of 'd') to two significant digits.
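As a rough cross-check of the two ideas above (this is not part of the original notebook), the sketch below recomputes the standard deviation directly from the $N-1$ formula, compares it with np.std() for ddof=1 and ddof=0, and then shows how the ':.2' format specification behaves. It assumes the 25 hypothetical values are typed by hand into a plain NumPy array called `d_values`, a name chosen here for illustration, rather than read from the data_entry2 sheet.

```python
import numpy as np

# Hypothetical data from the table above, typed in by hand (assumption:
# in the notebook these values would come from dVec instead).
d_values = np.array([
    439.3, 431.6, 434.6, 433.3, 439.3,
    442.6, 428.6, 441.6, 431.2, 427.6,
    433.2, 441.3, 436.0, 437.6, 434.7,
    433.2, 433.1, 431.3, 436.0, 432.9,
    436.5, 437.2, 435.7, 432.6, 434.7,
])

n = len(d_values)
d_ave = np.mean(d_values)

# Standard deviation written out from the N-1 formula.
sum_sq = np.sum((d_values - d_ave) ** 2)
std_by_hand = np.sqrt(sum_sq / (n - 1))

std_ddof1 = np.std(d_values, ddof=1)  # divides by N-1
std_ddof0 = np.std(d_values)          # default ddof=0 divides by N

print("N =", n)
print("By hand (N-1):  ", std_by_hand)
print("np.std, ddof=1: ", std_ddof1)
print("np.std, ddof=0: ", std_ddof0)

# ':.2' keeps two significant digits; ':.2f' keeps two decimal places.
two_sig = "{:.2}".format(std_ddof1)
two_dec = "{:.2f}".format(std_ddof1)
print("':.2'  ->", two_sig)
print("':.2f' ->", two_dec)
```

With the default ddof=0 the result comes out slightly smaller, because the same sum is divided by 25 instead of 24. Also note that ':.2' counts significant digits, so a larger value such as the average (435.028) would be printed in scientific notation; ':.2f' is the usual choice when a fixed number of decimal places is wanted.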
In [45]:

```python
# Run me to print dStd with 2 significant figures "{:.2}"
print("Standard deviation to 2 sig figs = {:.2} mm".format(dStd))
```

Standard deviation to 2 sig figs = 3.8 mm

Let's step back for a moment and think about what the standard deviation represents. Twenty-five measurements were made using the same experimental procedure, so this standard deviation is a measure we can use to represent the variability in our measurements. In the language we are using in the lab, this standard deviation is the single-measurement standard uncertainty of the distance, $u\_d_1$. What does this mean? It means that if we wanted to report the value and uncertainty for one of our measurements of $d$, 434.7 mm for example, we would report it as:

$$ d_1 = (434.7 \pm 3.8) \, \mathrm{mm} $$

The subscript '1' is being used here to emphasize that we are talking about a single measurement and not the average. We will look at the uncertainty in the average later.

The variability (the standard deviation) in the 25 measurements that we made tells us how confident we
