Lab03-prelab.html
School
University of British Columbia
Course
119
Subject
Computer Science
Date
Feb 20, 2024
Lab 03 Prelab, Part 2 - Analysis preparation and initial data collection
Please complete Part 1 of the prelab on Canvas before working through this notebook.
In [41]:
# Clear all variables, start with a clean environment.
%reset -f
import numpy as np
import data_entry2
This prelab activity introduces a useful feature of our data_entry2 spreadsheet tool and then walks you through how to calculate, using Python, the average, standard deviation, and (standard) uncertainty of the mean. It starts by using a hypothetical example data set to guide you through the use of the relevant Python functions. The work done with the hypothetical data set will not be handed in directly; instead, it will set you up to perform these same calculations on some real data, also collected in this prelab.
Simple Calculations in data_entry2 cells
It is possible to do some simple calculations directly in the data_entry2 sheet. In general we want you to do calculations using Python, but for some tasks, most notably recording your uncertainties, it is very convenient to use this feature of the sheet.
As an example, if you measure a mass of 497 g, and estimate a 95% confidence interval of 477 -> 516 g, your sheet could look like:
| m | u_m |
| --- | --- |
| g | g |
| 497 | = (516-477)/4 |
Alternatively, if you have a rectangular PDF on a balance with a 10 g resolution, you might use something like:
| m | u_m |
| --- | --- |
| g | g |
| 142 | = 10/(2 * np.sqrt(3)) |
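As a rough check of what these two sheet expressions evaluate to, here is a minimal sketch in plain Python. The variable names are illustrative and not part of data_entry2, and `math.sqrt` stands in for the `np.sqrt` used in the sheet.

```python
import math

# 95% confidence interval from the first example, in grams.
ci_low, ci_high = 477, 516
# A 95% CI spans about four standard uncertainties (+/- 2 sigma).
u_m_ci = (ci_high - ci_low) / 4

# Balance resolution from the second example, in grams.
resolution = 10
# Standard uncertainty of a rectangular PDF whose full width is the resolution.
u_m_rect = resolution / (2 * math.sqrt(3))

print("u_m from 95% CI =", u_m_ci, "g")
print("u_m from rectangular PDF = {:.2f} g".format(u_m_rect))
```

The first expression gives 9.75 g and the second gives about 2.89 g.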
Try it
Use the sheet below to try out both of these styles of uncertainty.
• Enter a variable name, m (in grams), for the first column, and u_m for the second column.
• In the next two rows, enter the measurements and the expressions to calculate uncertainties as shown in the two examples above.
• To get rid of unused rows and columns, re-execute (Shift-Enter) the notebook cell that creates the sheet.
• Notice that in the sheet interface you see the formulas you've entered, but when you Generate Vectors the expressions are evaluated and the generated uncertainty vector contains the results of the calculations.
• Alter one of the expressions in the uncertainty column so that it contains an error (perhaps add an extra ')' at the end of the expression) to see what happens.
In [42]:
de0 = data_entry2.sheet("test_formulas")
Sheet name: test_formulas.csv
Summary of Part 1 of the prelab
Here is a summary of the statistics concepts covered or reviewed in part 1 of this prelab:
a) Average is given by $$x_{ave} = \frac{1}{N} \sum_{i=1}^N x_i$$
b) For variables that follow a Gaussian distribution, approximately 68% of the values lie in the range $x_{ave} - \sigma$ to $x_{ave} + \sigma$ (68% CI).
c) Approximately 95% of the values lie in the range $x_{ave} - 2\sigma$ to $x_{ave} + 2\sigma$ (95% CI).
d) Standard deviation is given by
$$ \sigma = \frac{95\% \,\mathrm{CI}}{4} = \sqrt{\frac{1}{N-1}\sum_{i=1}^N \left(x_i - x_{ave}\right)^2} $$
e) We use the standard deviation as an indicator of the uncertainty (or the variability) in a single measurement. It does not depend on the number of measurements taken.
f) Uncertainty of the mean (often called standard error of the mean) is given by
$$\sigma_m = u\_x_{ave} = \frac{\sigma}{\sqrt{N}}$$
We use the uncertainty of the mean as an indicator of the uncertainty (or the variability) in the average of multiple measurements. It does improve as we increase the number of measurements.
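The formulas in a), d) and f) can be written out step by step in plain Python. This is only a sketch: the five-point sample below is hypothetical, chosen so the arithmetic is easy to follow, and the variable names are illustrative.

```python
import math

# Hypothetical sample, not lab data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
N = len(x)

# a) Average: sum of the values divided by N.
x_ave = sum(x) / N

# d) Standard deviation: note the N - 1 in the denominator.
sum_sq = sum((xi - x_ave) ** 2 for xi in x)
sigma = math.sqrt(sum_sq / (N - 1))

# f) Uncertainty of the mean: shrinks as 1/sqrt(N).
sigma_m = sigma / math.sqrt(N)

print("average =", x_ave)
print("standard deviation = {:.4f}".format(sigma))
print("uncertainty of the mean = {:.4f}".format(sigma_m))
```

For this sample the average is 3.0, the standard deviation is about 1.5811, and the uncertainty of the mean is about 0.7071.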
Developing your Python skills
Let's import a spreadsheet of our data, "prelab03_1".
In [43]:
# Run me to import the spreadsheet, `prelab03_1`, which is found in the same directory as `Lab03-prelab.ipynb`
de = data_entry2.sheet('prelab03_1')
Sheet name: prelab03_1.csv
Below is a table of the hypothetical data in your imported spreadsheet
Your turn #1:
Double-check that you have the correct number of data points. It should be 25, but you need to recall that Python indexing starts at 0!
Hypothetical data
d (mm): 439.3, 431.6, 434.6, 433.3, 439.3, 442.6, 428.6, 441.6, 431.2, 427.6, 433.2, 441.3, 436, 437.6, 434.7, 433.2, 433.1, 431.3, 436, 432.9, 436.5, 437.2, 435.7, 432.6, 434.7
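One way to double-check the number of data points is with `len()`. The sketch below types the values from the table into a plain list named `d` (an illustrative name). In the notebook you would apply `len()` to the vector generated by the sheet instead.

```python
# Hypothetical data from the table above, in mm.
d = [439.3, 431.6, 434.6, 433.3, 439.3, 442.6, 428.6, 441.6, 431.2,
     427.6, 433.2, 441.3, 436, 437.6, 434.7, 433.2, 433.1, 431.3,
     436, 432.9, 436.5, 437.2, 435.7, 432.6, 434.7]

n_points = len(d)
# Indexing starts at 0, so the last valid index is one less than the count.
last_index = n_points - 1

print("Number of data points =", n_points)
print("Last index =", last_index, "with value", d[last_index], "mm")
```

With 25 points, the indices run from 0 to 24.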
Calculating average and standard deviation using Python numpy functions
Your turn #2:
Press the 'Generate Vectors' button at the top of your spreadsheet to transfer the data into the Python environment, and then calculate the average and standard deviation in the cell below using the np.mean() and np.std() functions, respectively. Here np.mean() takes a single argument, which is the vector of values over which to calculate the average. We discuss the second argument in np.std() below.
Note: If it is not working correctly, double-check above that you have correctly titled the single spreadsheet column as 'd' and that there is a resulting generated vector 'dVec'.
In [44]:
# Run me to calculate average and standard deviation. Note how we're able to include descriptive text and units in the print commands.
dAve = np.mean(dVec)
print("Average of d =", dAve, "mm")
dStd = np.std(dVec, ddof=1)
print("Standard deviation of d =", dStd, "mm")
Average of d = 435.028 mm
Standard deviation of d = 3.8362872676586677 mm
You should find that the average is 435.028 mm, which is consistent with our earlier estimate of 435 mm from the histogram. The standard deviation should be 3.8362872676586677 mm, which would be 3.8 mm if we were to round it to 2 significant figures when we report it. This is also consistent with our estimate of 4 mm using the 95% Confidence Interval with the histogram earlier.
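If you want an independent cross-check of these two numbers, one option is Python's standard `statistics` module, whose `stdev()` function also uses $N-1$ in the denominator. This is a sketch outside the lab workflow: the list `d` is typed in by hand from the table of hypothetical data rather than generated by the sheet.

```python
import statistics

# Hypothetical data from the table earlier in this notebook, in mm.
d = [439.3, 431.6, 434.6, 433.3, 439.3, 442.6, 428.6, 441.6, 431.2,
     427.6, 433.2, 441.3, 436, 437.6, 434.7, 433.2, 433.1, 431.3,
     436, 432.9, 436.5, 437.2, 435.7, 432.6, 434.7]

d_ave = statistics.mean(d)
# Sample standard deviation (N - 1 in the denominator).
d_std = statistics.stdev(d)

print("Average of d = {:.3f} mm".format(d_ave))
print("Standard deviation of d = {:.4f} mm".format(d_std))
```

Both values should agree with the numpy results above to within floating-point rounding.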
Note that in np.std() we are supplying a second argument, ddof=1. This additional argument is needed because np.std() uses a general formula that covers a number of related calculations, and its default is ddof=0. In particular, the formula it uses is:
$$ \textrm{np.std()} = \sqrt{\frac{1}{N-\textrm{ddof}}\sum_{i=1}^N \left(x_i - x_{ave}\right)^2} $$
We want $N-1$ in the denominator as per our definition of standard deviation, so we need to use ddof=1:
$$ \sigma = \sqrt{\frac{1}{N-1}\sum_{i=1}^N \left(x_i - x_{ave}\right)^2} $$
If you are interested, ddof is an abbreviation for 'delta degrees of freedom.' As discussed in Lab 01, we use one 'degree of freedom' from our dataset when we calculate the average. Since the average is used in the calculation of standard deviation, we account for this in the formula for standard deviation by dividing the sum of the squared differences between each data point and the mean by $N-1$ instead of $N$.
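To see the effect of ddof directly, the sketch below compares np.std() with its default ddof=0, np.std() with ddof=1, and the $N-1$ formula written out by hand. The eight-point array is hypothetical, chosen so that the ddof=0 result comes out to exactly 2.

```python
import numpy as np

# Hypothetical values, not lab data.
x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
N = len(x)

# Sum of squared differences from the average.
sum_sq = np.sum((x - np.mean(x)) ** 2)

std_ddof0 = np.std(x)                     # default: divides by N
std_ddof1 = np.std(x, ddof=1)             # divides by N - 1
manual_ddof1 = np.sqrt(sum_sq / (N - 1))  # the formula written out

print("ddof=0: {:.4f}".format(std_ddof0))
print("ddof=1: {:.4f}".format(std_ddof1))
print("manual N-1 formula: {:.4f}".format(manual_ddof1))
```

The ddof=1 result matches the hand-written formula and is slightly larger than the ddof=0 result. The difference shrinks as N grows.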
If you want to control the number of significant figures displayed, you can modify the print statement as follows.
Within the curly braces, the ':.2' tells 'format()' to round the variable passed to it (in this case 'dStd', the standard deviation of 'd') to two significant figures.
In [45]:
# Run me to print dStd with 2 significant figures "{:.2}"
print("Standard deviation to 2 sig figs = {:.2} mm".format(dStd))
Standard deviation to 2 sig figs = 3.8 mm
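The difference between significant figures and decimal places is easy to mix up in format strings. The short sketch below reuses the two values printed earlier, with illustrative variable names for the formatted strings, to show how a few common format specifiers behave.

```python
dStd = 3.8362872676586677
dAve = 435.028

two_sig = "{:.2}".format(dStd)        # 2 significant figures
two_dec = "{:.2f}".format(dStd)       # 2 decimal places
ave_two_sig = "{:.2}".format(dAve)    # 2 sig figs of a larger number
ave_one_dec = "{:.1f}".format(dAve)   # 1 decimal place

print(two_sig)      # 3.8
print(two_dec)      # 3.84
print(ave_two_sig)  # 4.4e+02
print(ave_one_dec)  # 435.0
```

Note that '{:.2}' switches to scientific notation when the number has more than two digits before the decimal point, so a fixed number of decimal places ('f') is often the more readable choice for values like the average.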
Let's step back for a moment and think about what the standard deviation represents. Twenty-five measurements were made using the same experimental procedure, so this standard deviation is one way to represent the variability in our measurements. In the language we are using in the lab, this standard deviation is the single-measurement standard uncertainty of the distance, $u\_d_1$. What does this mean? It means that if we wanted to report the value and uncertainty for one of our measurements of $d$, 434.7 mm for example, we would report it as:
$$ d_1 = (434.7 \pm 3.8) \, \textnormal{mm} $$
The subscript '1' is being used here to emphasize that we are talking about a single measurement and not the average. We will look at the uncertainty in the average later.
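A report line like the one above can be produced with a format string. This is a sketch with illustrative variable names. The uncertainty value is copied from the output of the earlier cell rather than recomputed.

```python
# One of the individual measurements, in mm.
d_1 = 434.7
# Single-measurement standard uncertainty (the standard deviation found above), in mm.
u_d_1 = 3.8362872676586677

# Round both the value and its uncertainty to one decimal place.
report = "d_1 = ({:.1f} ± {:.1f}) mm".format(d_1, u_d_1)
print(report)  # d_1 = (434.7 ± 3.8) mm
```

Here one decimal place is used for both numbers so that the value and its uncertainty are quoted to the same precision.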
The variability (the standard deviation) in the 25 measurements that we made tells us how confident we…