Homework1Solution

.txt

School

Purdue University *

*We aren’t endorsed by this school

Course

348

Subject

Computer Science

Date

Apr 3, 2024

Type

txt

Pages

9

Uploaded by MegaTitanium11932

Report
CS 348 - Homework 1: SQL (Joins, Aggregations, Set Operators) (100 Points) Spring 2024 Due on: 1/26/2024 at 11:59 pm This assignment is to be completed by individuals. There will be a 10% penalty if the homework is submitted 24 hours after the due date, a 20% penalty if the homework is submitted 48 hours after the due date, or a 30% penalty if the homework is submitted 72 hours after the due date. The homework will not be accepted after 72 hours, as a solution will be posted by then. Submission Instructions: Submit your answers using the hw1.py skeleton file included with this assignment. The python file is read by the autograder. It is important to have a syntax-error free file. Your query result should have the same attributes(columns) exactly in the same order as the expected result shown in each question. However, the rows can be in any order unless a specific order is required by the question. Column headers can be different. Before submitting your Python file, execute the file using the command: Python3 hw1.py The output should contain all of your queries. Getting the correct output ensures that the autograder will be able to read your queries. The homework will be submitted in Gradescope. Make sure you submit the correct file (hw1.py). The Homework 1 lead TA will announce in Ed when the Gradescope submission is open. Meanwhile, you can test your queries using the database included with this homework. Database instructions: For this assignment, use SFBikeShare SQLite database included with this assignment. You can open the database by simply typing .open SFBikeShare in the SQLite shell. For a readable layout in SQLite shell, run the commands .mode column .headers on Grading: This assignment will be auto-graded using SQLite. You are highly encouraged to use SQLite shell to test your queries before submitting your assignment. Grading will use a different database instance than the one given to students to test their queries. Therefore, make sure your queries are robust and that they work on any data instance. There is no partial credit in this homework. =================================================================== Database: We will use the San Francisco Bike share database. Only part of the data is included with this homework. Database Schema: CREATE TABLE station ( // bike stations id INTEGER PRIMARY KEY, name TEXT, lat NUMERIC, long NUMERIC, dock_count INTEGER, city TEXT, install_date TEXT); CREATE TABLE station_status ( // the number of available bikes and docks in a station at a given date and time id integer primary key autoincrement,
station_id INTEGER, // foreign key to station.id bikes_available INTEGER, docks_available INTEGER, time TEXT); CREATE TABLE trip ( id INTEGER PRIMARY KEY, duration INTEGER, start_station_id INTEGER, // foreign key to station.id end_station_id INTEGER, // foreign key to station.id bike_id INTEGER, subscription_type TEXT, zip_code INTEGER, start_d Text, // start date and time of trip end_d Text // end date and time of trip ); CREATE TABLE daily_weather ( id integer primary key autoincrement, date TEXT, max_temperature_f INTEGER, mean_temperature_f INTEGER, min_temperature_f INTEGER, max_dew_point_f INTEGER, mean_dew_point_f INTEGER, min_dew_point_f INTEGER, max_humidity INTEGER, mean_humidity INTEGER, min_humidity INTEGER, max_sea_level_pressure_inches NUMERIC, mean_sea_level_pressure_inches NUMERIC, min_sea_level_pressure_inches NUMERIC, max_visibility_miles INTEGER, mean_visibility_miles INTEGER, min_visibility_miles INTEGER, max_wind_Speed_mph INTEGER, mean_wind_speed_mph INTEGER, max_gust_speed_mph INTEGER, precipitation_inches INTEGER, cloud_cover INTEGER, events TEXT, wind_dir_degrees INTEGER, zip_code INTEGER); Notes: - For all questions, include all columns shown in the expected result. - Use the round function to round to two decimal places (unless the expected result of a query has a different number of decimal places). All questions have the same number of points (100/13) ======================================= 1) List trips with id number is less than or equal 7450. Trips for id <= 7450 Select t.id, start_station_id, name, start_d, end_d, duration FROM trip t join station s on t.start_station_id = s.id WHERE t.id <= 7450;
Expected Result: id start_station_id name start_d end_d duration ---- ---------------- --------------------------------------- ---------------- ---------------- -------- 7442 75 Mechanics Plaza (Market at Battery) 2013-09-01 00:11 2013-09-01 00:36 1508 7443 75 Mechanics Plaza (Market at Battery) 2013-09-01 00:12 2013-09-01 00:37 1513 7444 66 South Van Ness at Market 2013-09-01 00:19 2013-09-01 00:41 1345 7445 75 Mechanics Plaza (Market at Battery) 2013-09-01 00:21 2013-09-01 00:36 931 7446 25 Stanford in Redwood City 2013-09-01 00:23 2013-09-01 00:25 159 7447 66 South Van Ness at Market 2013-09-01 00:24 2013-09-01 00:42 1033 7448 69 San Francisco Caltrain 2 (330 Townsend) 2013-09-01 00:29 2013-09-01 00:35 323 7449 56 Beale at Market 2013-09-01 00:38 2013-09-01 00:52 825 7450 56 Beale at Market 2013-09-01 00:38 2013-09-01 00:52 819 ====================================== 2) Modify the previous query to add the name of the destination station (end station). Note that you will need two joins (connections) with the station table. origin_station start_d duration destination_station --------------------------------------- ---------------- -------- --------------------------------------- Mechanics Plaza (Market at Battery) 2013-09-01 00:11 1508 Beale at Market Mechanics Plaza (Market at Battery) 2013-09-01 00:12 1513 Beale at Market South Van Ness at Market 2013-09-01 00:19 1345 Harry Bridges Plaza (Ferry Building) Mechanics Plaza (Market at Battery) 2013-09-01 00:21 931 Mechanics Plaza (Market at Battery) Stanford in Redwood City 2013-09-01 00:23 159 Franklin at Maple South Van Ness at Market 2013-09-01 00:24 1033 Harry Bridges Plaza (Ferry Building) San Francisco Caltrain 2 (330 Townsend) 2013-09-01 00:29 323 San Francisco Caltrain 2 (330 Townsend) Beale at Market 2013-09-01 00:38 825 Harry Bridges Plaza (Ferry Building) Beale at Market 2013-09-01 00:38 819 Harry Bridges Plaza (Ferry Building) Select s1.name as origin_station, start_d, duration, s2.name as destination_station FROM trip t join station s1 on t.start_station_id = s1.id join station s2 on t.end_station_id = s2.id WHERE t.id <= 7450;
====================================== 3) Count the number of trips between every pair of stations. Keep only pairs of stations with 10 or more trips. Show the average trip duration between a pair of stations as well. name name cnt avg_duration ------------------------------------ ---------------------------------------- --- ------------ Davis at Jackson Davis at Jackson 10 5404.80 Embarcadero at Bryant Embarcadero at Sansome 10 1349.60 Harry Bridges Plaza (Ferry Building) San Francisco Caltrain (Townsend at 4th) 10 1477.10 Powell at Post (Union Square) Powell at Post (Union Square) 11 7733.73 Embarcadero at Sansome Embarcadero at Sansome 12 8081.08 Harry Bridges Plaza (Ferry Building) Harry Bridges Plaza (Ferry Building) 12 5274.67 Powell at Post (Union Square) Harry Bridges Plaza (Ferry Building) 12 858.08 Harry Bridges Plaza (Ferry Building) Embarcadero at Vallejo 14 1157.21 Embarcadero at Vallejo Harry Bridges Plaza (Ferry Building) 16 3212.38 Harry Bridges Plaza (Ferry Building) Embarcadero at Sansome 18 1435.44 Select s1.name, s2.name, count(*) trip_count, round(avg(duration), 2) as avg_duration FROM trip t join station s1 on t.start_station_id = s1.id join station s2 on t.end_station_id = s2.id GROUP BY s1.name, s2.name HAVING trip_count >= 10 ORDER BY trip_count; ============================ 4) The trip table includes data for two days only. Compute the average max_tempreature and number of trips in these days. You will have to use the date() function to extract the date part in the date and start_d attributes in the trip and weather tables, respectively. Select date, avg(max_temperature_f) as avg_temp, count(distinct t.id) as number_trips from daily_weather w join trip t on date(w.date) = date(t.start_d) group by date order by avg_temp desc; date avg_temp number_trips ---------- -------- ------------ 2014-09-01 85.8 368 2013-09-01 79.2 706 =============================================== 5) The station_status table includes stats (bikes_available and docks_available)
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
  • Access to all documents
  • Unlimited textbook solutions
  • 24/7 expert homework help