104624234_Mendonca_ws2.pdf

School

Algoma University

Course

100

Subject

Statistics

Date

Apr 3, 2024

Type

pdf

Pages

8

Uploaded by AmbassadorCat5961

BAN140 Introduction to Data Visualization WorkShop2 Page 1 of 8

Question 1. Click on # in the header of the first column. What is the data type of “trip duration”?
Ans: The data type of trip duration is Number (whole).

Question 2. Now click on the ‘Abc’ above “start station name” column. What is the data type for this column?
Ans: The data type of start station name is String.

Question 3. How many variables do you have per each data type (e.g., integer, float, string, date)?
Ans: The variables fall into 4 data types: Number (whole), Number (decimal), Date & Time, and String.
Number (whole): trip duration, start station ID, BikeID, Birth year, Gender, End station ID (6 variables)
Number (decimal): Start station Latitude, Start station Longitude, End station Latitude, End station Longitude (4 variables)
Date & Time: start time, stop time (2 variables)
String: Start station name, End station name, User type (3 variables)

Question 4. Click on the sorting symbol for Bikeid to sort in ascending or descending order. What is the lowest Bikeid in this table? What is the highest Bikeid in this table?
Ans: The lowest BikeID is 14,648 and the highest BikeID is 39,949.

Question 5. Present a detailed data exploration that includes your own analysis of the data after loading the table. The goal of this question is for you to explain the data as much as you can before starting the exploration using visualizations. (Minimum 250 words)
Ans: The dataset shows that the trips were made in August 2019. We can analyse the number of trips by the longitude and latitude of the start and end stations. We can also see the number of subscribers and customers starting trips from various locations. Trips started from many different stations and ended at many different stations. The number of subscribers was greater than the number of customers. People born in 1969 had the highest trip duration compared to the other birth years.

Question 6. Does the dataset have null values?
Ans: No, there are no null values in the dataset.

Question 7. Do you see any column that might not be useful and you can drop it? Explain the reason.
Ans: The column “Birth year” might not be useful, because the rider's birth year is not an important criterion for a trip; age is just a number.
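The checks in Questions 1 to 6 (data type per column, lowest and highest BikeID, null values) were done by clicking through the Tableau interface. As a rough cross-check, the same checks can be sketched in plain Python. Everything below is an assumption made for illustration: the column names, the three sample rows, the timestamp format, and the helper names `all_parse` and `infer_type` are hypothetical and do not come from the workshop file.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Hypothetical sample rows; column names and values are assumptions, not the workshop data.
SAMPLE_CSV = """tripduration,starttime,start station name,start station latitude,bikeid,usertype
393,2019-08-01 00:00:01,Station A,40.718939,35305,Subscriber
627,2019-08-01 00:00:05,Station B,40.724561,38822,Subscriber
1132,2019-08-01 00:00:08,Station C,40.702551,18373,Customer
"""


def all_parse(values, parser):
    """Return True if every value can be converted by `parser` without a ValueError."""
    try:
        for value in values:
            parser(value)
    except ValueError:
        return False
    return True


def infer_type(values):
    """Classify a column as 'whole', 'decimal', 'datetime', or 'string'."""
    non_empty = [v for v in values if v != ""]
    if all_parse(non_empty, int):
        return "whole"
    if all_parse(non_empty, float):
        return "decimal"
    # Assumed timestamp layout: YYYY-MM-DD HH:MM:SS
    if all_parse(non_empty, lambda v: datetime.strptime(v, "%Y-%m-%d %H:%M:%S")):
        return "datetime"
    return "string"


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
columns = {name: [row[name] for row in rows] for name in rows[0]}

# Questions 1 to 3: data type of each column, and how many columns share each type.
column_types = {name: infer_type(values) for name, values in columns.items()}
type_counts = Counter(column_types.values())

# Question 4: lowest and highest bike ID.
bike_ids = [int(v) for v in columns["bikeid"]]
lowest_bike_id, highest_bike_id = min(bike_ids), max(bike_ids)

# Question 6: count empty cells across all columns.
null_count = sum(1 for values in columns.values() for v in values if v == "")

print(column_types)
print(type_counts)
print(lowest_bike_id, highest_bike_id, null_count)
```

The type inference here is deliberately crude: it only treats a column as numeric when every non-empty value parses, and its labels are not guaranteed to match Tableau's for every column.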
Question 8. Write the name of three fields from Dimensions.
Ans: Birth year, Start station name, and End station name are three fields from Dimensions.

Question 9. Write the name of three fields from Measures.
Ans: Start station ID, Start station Longitude, and Start station Latitude are three fields from Measures.

Question 10.

Question 11. What is the station name with the shortest total trip duration?
Ans: NYCBS depot - Delancey is the station with the shortest total trip duration.
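The answer to Question 11 rests on summing trip duration per station and picking the smallest total. A minimal sketch of that aggregation follows. It assumes the grouping is by start station (the question does not say whether start or end station is meant), and the station names and durations are made up for illustration.

```python
from collections import defaultdict

# Hypothetical (start station, trip duration in seconds) pairs; not the workshop data.
trips = [
    ("Station A", 393),
    ("Station B", 627),
    ("Station A", 1132),
    ("Station C", 250),
    ("Station B", 90),
    ("Station C", 300),
]

# Sum the trip duration for each station.
total_duration = defaultdict(int)
for station, seconds in trips:
    total_duration[station] += seconds

# The station whose summed duration is smallest.
shortest_station = min(total_duration, key=total_duration.get)

print(dict(total_duration))
print(shortest_station)
```

A station with very few trips will tend to have the smallest total, so a result like this often says more about how little a station is used than about how short its trips are.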
Question 12.

Question 13. What type of graph did you get?
Ans: The graph is a line graph, now laid out horizontally, with start time on the x-axis and trip duration on the y-axis.