School
University of Saskatchewan
Course
311
Subject
Statistics
Date
Apr 3, 2024
Type
docx
Pages
4
Uploaded by MagistrateStar1002
Assignment #6: Introduction to Working with R/RStudio
Submission Instructions
Due: Friday, April 6, 2018 at 11:59 PM.
Submit the following four files through Canvas>Assignments>To-Do:
(1) The completed, working R script that produced the analysis in Steps 1 through 9
(2) The output file – descriptivesOutput.txt
(3) Another output file – histogram.pdf
(4) The completed answer sheet provided on the last page and also as a separate Word file
If you do not follow the instructions, your assignment will be counted late.
o Late Assignment policy: Same as before.
Evaluation
Your submission will be graded based on the correctness of the completed answer sheet, with other files
as supporting documents.
Before you start
For this assignment, you’ll run simple analyses by modifying the R script you used in ICA #11 (Descriptives.r). You will also need a new data set – OnTimeAirport2017Dec.csv, which contains actual on-time flight statistics for 83,915 flights, by airline and airport, for December 2017, collected from the Bureau of Transportation Statistics (see footnote 1).
IMPORTANT! When downloading the .csv file, please make sure that its name doesn’t change, and that it is saved in the same folder as the Descriptives.r file that you are modifying.
The metadata for the OnTimeAirport2017Dec.csv spreadsheet is below:
Variable Name    Variable Description
FlightDate       The date of the flight (mm/dd/yyyy)
UniqueCarrier    The unique carrier code
CarrierlName     The name of the carrier
FlightNum        Flight number
Origin           The origin airport of the flight
OriginCity       The origin city of the flight
Dest             The destination airport of the flight
DestCity         The destination city of the flight
DepDelay         The delay in departing from the origin gate (in minutes)
TaxiOut          The minutes spent taxiing out to the runway at origin
TaxiIn           The minutes spent taxiing in from the runway at destination
ArrDelay         The delay in arriving at the destination gate (in minutes)
Cancelled        Whether the flight was cancelled (0 = no, 1 = yes)
AirTime          Flight time (in minutes)
Distance         The total distance of the flight (in miles)
1 https://www.transtats.bts.gov/DL_SelectFields.asp?Table_ID=236
Modifying the Descriptives.r script
To complete the assignment, modify the Descriptives.r script (used in ICA #11) to perform an analysis of departure delays by origin airport, following the instructions below, and complete the answer sheet on the last page.
1) Use OnTimeAirport2017Dec.csv as the input file.
HINT: In line 21 of the Descriptives.r script, it says:
INPUT_FILENAME <- "NBA14Salaries.csv"
Change that line to:
INPUT_FILENAME <- "OnTimeAirport2017Dec.csv"
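The filename constant only helps if the file is actually found and loaded. The sketch below shows one way to check this, assuming the script loads the file with read.csv() into a data frame called dataSet (the name used in the later hints; the read.csv() call itself is an assumption because the script is not reproduced here). The file ToyOnTime.csv and its three rows are made up so that the example runs on its own.

```r
# Toy stand-in for the real input file: three made-up rows in a temp folder.
# "ToyOnTime.csv" is a hypothetical name, not a file from the assignment.
INPUT_FILENAME <- file.path(tempdir(), "ToyOnTime.csv")
writeLines(c("UniqueCarrier,Dest,ArrDelay",
             "AA,ORD,12",
             "UA,DEN,-3",
             "AA,DEN,40"),
           INPUT_FILENAME)

# Fail early with a readable message if the file is not where R is looking.
if (!file.exists(INPUT_FILENAME)) {
  stop("Cannot find ", INPUT_FILENAME, " (working directory: ", getwd(), ")")
}

# Assumption: the script reads the CSV into a data frame named dataSet.
dataSet <- read.csv(INPUT_FILENAME)

# Quick sanity checks: row count and column names.
print(nrow(dataSet))
print(names(dataSet))
```

With the real file, the row count should match the 83,915 flights mentioned above if the download is intact.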
2) Present the number of flights, grouped by destination airport (using Dest).
HINT: In line 61, change the line to read:
summary(dataSet$Dest)
This presents the number of observations/rows (flights) by destination airport. You will need the output from this command to answer the first question on the answer sheet on the last page.
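One caveat, sketched with made-up data: summary() reports counts per value only when the column is a factor. Since R 4.0.0, read.csv() keeps text columns as character by default, in which case summary() prints only Length, Class and Mode. Whether this affects you depends on your R version and on how the script reads the file, which is not shown here. The calls below give per-group counts either way.

```r
# Made-up destinations; stringsAsFactors = FALSE mimics the R >= 4.0 default.
dataSet <- data.frame(
  Dest = c("ORD", "DEN", "DEN", "ORD", "SEA", "DEN"),
  stringsAsFactors = FALSE
)

# On a character column this prints Length/Class/Mode, not counts.
print(summary(dataSet$Dest))

# table() counts rows per value regardless of the column type.
destCounts <- table(dataSet$Dest)
print(destCounts)

# Converting to a factor first makes summary() return the counts as well.
factorCounts <- summary(as.factor(dataSet$Dest))
print(factorCounts)

# Sorting puts the busiest destinations first.
print(sort(destCounts, decreasing = TRUE))
```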
3) Present summary statistics for arrival delay (using ArrDelay).
HINT: In line 66, change the line by replacing Salary with ArrDelay:
describe(dataSet$ArrDelay)
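As a cross-check on the describe() output, the same kind of statistics can be computed with base R. This is a minimal sketch on made-up values, not the script's method; describe() is presumably provided by the psych package loaded in Descriptives.r, which is an assumption. The NA stands in for a missing delay, and na.rm = TRUE tells each function to skip it.

```r
# Made-up arrival delays in minutes; NA represents a missing value.
arrDelay <- c(12, -3, 40, NA, 0, 7)

# Number of non-missing observations.
n <- sum(!is.na(arrDelay))

# Centre and spread, ignoring the missing value.
delayMean   <- mean(arrDelay, na.rm = TRUE)
delayMedian <- median(arrDelay, na.rm = TRUE)
delaySd     <- sd(arrDelay, na.rm = TRUE)

# Smallest and largest observed delay.
delayRange <- range(arrDelay, na.rm = TRUE)

print(c(n = n, mean = delayMean, median = delayMedian, sd = delaySd))
print(delayRange)
```

A negative delay here means an early arrival, so a minimum below zero is not an error.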
4) Present summary statistics for arrival delay (using ArrDelay), grouped by airline carrier (using UniqueCarrier).
HINT: Check line 73 in the script:
describeBy(dataSet$Salary,dataSet$Position)
This presents summary statistics for salary by position (for the NBA salary data). Now that we are using a different data set, you should be able to figure out how to change line 73 to present summary statistics for arrival delay (ArrDelay), grouped by airline carrier (UniqueCarrier).
If you get that, you will now be able to answer questions 2 through 4 on the answer sheet!
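To see what "grouped by" means in this call, here is a minimal base-R sketch using the NBA-style pattern from the hint. The salary figures are made up, and tapply() is a stand-in for describeBy() (presumably from the psych package, which is an assumption). The first argument is the variable being summarized and the second is the variable that defines the groups.

```r
# Made-up salary data in the shape of the NBA example from the hint.
dataSet <- data.frame(
  Salary   = c(10, 20, 30, 5, 15),
  Position = c("PG", "PG", "SF", "SF", "C"),
  stringsAsFactors = FALSE
)

# tapply(values, groups, function): apply the function to each group's values.
groupMeans <- tapply(dataSet$Salary, dataSet$Position, mean)
groupN     <- tapply(dataSet$Salary, dataSet$Position, length)
groupSd    <- tapply(dataSet$Salary, dataSet$Position, sd)

print(groupMeans)
print(groupN)
print(groupSd)  # NA for a group with a single row
```

The same two-argument shape (what to summarize, then what to group by) carries over to any other pair of columns.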
5) Compare, using a t-test, the arrival delays for two airline carriers (using UniqueCarrier), American Airlines (AA) and United Airlines (UA).
HINT: Now please change line 87 and line 93 on your own. Hopefully the first few steps will get you started!
Check line 87:
subset <- dataSet[ which(dataSet$Position=='PG' | dataSet$Position=='SF'), ]
This creates a subset with only two positions: PG and SF (for the NBA salary data). Now that we are using a different data set, you should be able to figure out how to change this line to create a subset with only two airline carriers: AA and UA.
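The subsetting pattern, and one way a two-group t-test can follow from it, is sketched below on made-up NBA-style data. The content of line 93 is not shown here, so the t.test() call is an illustration of the technique rather than the script's actual line; the name twoGroups is also hypothetical.

```r
# Made-up salary data with three positions.
dataSet <- data.frame(
  Salary   = c(10, 20, 12, 30, 5, 25, 15),
  Position = c("PG", "PG", "PG", "SF", "SF", "SF", "C"),
  stringsAsFactors = FALSE
)

# Keep only rows whose Position is PG or SF. The row index goes before the
# comma; leaving the column index empty keeps all columns.
twoGroups <- dataSet[which(dataSet$Position == "PG" | dataSet$Position == "SF"), ]

# Confirm that exactly two groups remain before testing.
print(table(twoGroups$Position))

# Two-sample t-test of Salary between the two remaining groups
# (R's default is the Welch test, which does not assume equal variances).
result <- t.test(Salary ~ Position, data = twoGroups)
print(result)
```

The formula reads "outcome ~ grouping variable"; t.test() stops with an error if the grouping variable does not have exactly two levels, which is why the subset comes first.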
Check line 93: