hw2.pdf

School

Cornell University

Course

2700

Subject

English

Date

Dec 6, 2023

Uploaded by CommodoreGiraffe3679

ENGRD 2700: Basic Engineering Probability and Statistics
Fall 2023
Homework 2
Due Monday, September 6th at 11:59pm.

Submit solutions to Gradescope by clicking the name of the assignment. See the syllabus for detailed submission instructions.

When completing this assignment (and all subsequent ones), keep in mind the following:
  • You must complete the homework individually and independently.
  • Provide evidence for each of your answers. If a calculation involves only very minor computation, explain the computation you performed and give the results. If a calculation involves more complicated steps on many records, hand in the calculations and formulas for the first few records only.
  • Write clearly and legibly. You are encouraged to type your work, although you do not have to. We may deduct points if your answers are difficult to read or disorganized.
  • For questions that you answer using Python, attach any code that you write, along with the relevant plots. You may use other software, but the same condition applies.
  • Submit your homework as a single PDF file on Gradescope.
  • Read Chapters 1 and 2 of the textbook.
1. The file quartet.csv contains four datasets of x and y values, side by side.
(a) Compute the sample mean, sample median, and sample standard deviation for each column of the dataset.
(b) Based solely upon the summary statistics you computed in part (a), how do the four datasets compare?
(c) Construct scatterplots for each of the four datasets.
(d) Based solely upon the plots you generated in part (c), how do the four datasets compare?
(e) What’s the moral of the story? (That is, what does this example suggest about what should be done when analyzing data?)

Why I am assigning this question: This question is assigned to reinforce your knowledge of sample statistics. It is also given to help you learn the Python programming language and get you working with real data sets.

a)
Column   Mean                Median   Standard deviation
x1       9                   9        3.3166247903554
y1       7.500909090909091   7.58     2.031568135925815
x2       9                   9        3.3166247903554
y2       7.500909090909091   8.14     2.0316567355016177
x3       9                   9        3.3166247903554
y3       7.5                 7.11     2.030423601123667
x4       9                   9        3.3166247903554
y4       7.500909090909091   7.04     2.0305785113876023
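The statistics for part (a) could be computed along the following lines. This is only a sketch using the Python standard library, not necessarily the code used for the submitted answer. The helper names `load_columns` and `summarize` are hypothetical, and the code assumes that quartet.csv has a header row naming its columns (for example x1, y1, ...), which the assignment does not state.

```python
import csv
import statistics


def load_columns(path):
    """Read a CSV file with a header row into {column name: list of floats}.

    Assumption: every cell below the header is numeric.
    """
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        columns = {name: [] for name in reader.fieldnames}
        for row in reader:
            for name, value in row.items():
                columns[name].append(float(value))
    return columns


def summarize(columns):
    """Return the sample mean, median, and sample standard deviation per column."""
    return {
        name: {
            "mean": statistics.mean(values),
            "median": statistics.median(values),
            # statistics.stdev divides by n - 1, i.e. the sample standard deviation
            "stdev": statistics.stdev(values),
        }
        for name, values in columns.items()
    }
```

With the file in the working directory, `summarize(load_columns("quartet.csv"))` would return one small dictionary of statistics per column.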
b) Based on the summary statistics alone, the four datasets look almost identical. The means are essentially equal, about (9, 7.50) for each (x, y) pair. All four x columns have the same median, 9, while the y medians differ, ranging from 7.04 to 8.14. All four x columns have the same standard deviation, 3.3166247903554; the y standard deviations differ slightly but are all close to 2.03.
c) [Scatterplots: all four datasets on one graph, followed by individual plots of the (x1, y1), (x2, y2), (x3, y3), and (x4, y4) datasets.]
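One way the plots for part (c) might be produced is sketched below with matplotlib. The function name `scatter_grid` and the 2x2 layout are assumptions made for illustration; the original plots are not available in this copy, so this is not a reproduction of them.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so the script also runs without a display
import matplotlib.pyplot as plt


def scatter_grid(pairs):
    """Draw one scatterplot per dataset on a 2x2 grid with shared axes.

    `pairs` maps a label to a tuple (x values, y values). Assumption: there
    are at most four datasets, matching the four in the assignment.
    """
    fig, axes = plt.subplots(2, 2, sharex=True, sharey=True, figsize=(8, 6))
    for ax, (label, (x, y)) in zip(axes.flat, pairs.items()):
        ax.scatter(x, y)
        ax.set_title(label)
        ax.set_xlabel("x")
        ax.set_ylabel("y")
    fig.tight_layout()
    return fig
```

Sharing the axes across panels keeps the scales identical, so differences in shape between the datasets are not hidden by differences in axis range. The returned figure can be written to disk with `fig.savefig("quartet.png")`.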
d) Even though the summary statistics suggest that the four datasets are similar or nearly equal, the scatterplots show that their patterns are very different. The (x1, y1) plot looks like two straight lines, the (x2, y2) plot looks like a parabola, the (x3, y3) plot looks like a single straight line, and the (x4, y4) plot looks like two parabolas.
e) The moral of the story is that comparing datasets using only summary statistics such as the mean, median, and standard deviation can be misleading. We should also plot the data and look at its overall pattern before drawing conclusions.
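The point in part (e) can be illustrated with a small made-up example; the numbers below are toy values, not taken from quartet.csv. Two y columns contain the same values in a different order, so their per-column statistics agree exactly, yet their relationship to x is entirely different.

```python
import statistics


def column_summary(values):
    """Return (mean, median, sample standard deviation) for one column."""
    return (
        statistics.mean(values),
        statistics.median(values),
        statistics.stdev(values),
    )


def is_increasing(values):
    """True if each value is strictly larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


# Toy data (illustrative only): same y values, paired with x differently.
x = [1, 2, 3, 4, 5]
y_line = [2, 4, 6, 8, 10]       # rises steadily with x
y_scrambled = [8, 2, 10, 4, 6]  # same values, no steady trend in x

same_summary = column_summary(y_line) == column_summary(y_scrambled)
print("identical column statistics:", same_summary)
print("y_line increases with x:", is_increasing(y_line))
print("y_scrambled increases with x:", is_increasing(y_scrambled))
```

Per-column statistics ignore how the x and y values are paired, which is exactly the information a scatterplot shows.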