zyBooks_exercise_2023f-1.pdf

School

University of British Columbia

Course

404

Subject

Computer Science

Date

Jan 9, 2024

Type

pdf

Pages

19


CPSC 404: zyBooks Exercise and Questions
“Database Systems with SQL”

Due Date: Sunday, November 12, 2023 before 23:59 (near midnight)
10% penalty per half day late (i.e., 10% penalty of the online component if you’re late for that; 10% penalty of the written component if you’re late for that)

Last Update: Nov. 4, 2023 @ 16:15
November 4, 2023 @ 16:15: Clarification for Part B’s Q2(a)(i)
October 16, 2023 @ 15:30: Created/Posted

You have the choice of doing either:
1. The zyBooks online exercise with its online participation and challenge activities (the two chapters we’re interested in are quite do-able and the activities are laid out like a tutorial), and you also need to answer the questions found in this document.
2. The SQL Server DBA lab exercise (posted on Canvas).
Don’t do both. Either of these will make up 10% of your course grade.

Check the course outline on Canvas for how to register for the zyBooks course. It costs about $64 USD (roughly $90 CAD on a credit card). Note that option (2) above is free, and is hosted by UBC’s Department of Computer Science.

This zyBooks exercise involves 2 parts: 5% of your overall grade will be for Part A (zyBooks online activities), and 5% will be for Part B (written answers to questions to be submitted on Canvas).

If you are using your name or CWL ID or CS ID when registering for zyBooks, that’s fine; we can easily identify you and we can download your zyBooks points, and don’t worry, we will send you e-mail if we have any doubt. zyBooks won’t lose a record of your work on their site. However, if you are using a pseudonym (fake name to maintain privacy), then when you upload your written answers to Canvas Assignments, include a note that says what pseudonym or e-mail address you used for zyBooks, so that we can credit you with the points for successfully completing the online exercises.

zyBooks will track your online completion of the Participation Activities and Challenge Activities for the various sections.
After the due date, we will transfer these points to the Canvas gradebook. In the following questions, where we have written “1-2 sentences” or “1 paragraph” (for example), you can always provide more sentences if you wish. Note that where we have written “1 paragraph”, we expect 2-4 sentences (or more, if you wish), and not just a few words.
There are 2 database papers that tie in to the zyBooks material that you will need to read. Both are found in the ACM Digital Library. If you’re on a UBC VPN (or accessing it from a UBC computer), you get them for free. Just click on the link to get access (it’ll pop up a CWL sign-in if you’re not already connected) and download the PDF copy of each. The 2 papers are:

1. [ACM Inroads paper about DB education, NoSQL, etc.] Goldweber, Mikey; Wei, Min; Aly, Sherif; Raj, Rajendra K.; and Mokbel, Mohamed. “The 2022 Undergraduate Database Course in Computer Science: What to Teach”. ACM Inroads, Volume 13, Number 3, September 2022, pp. 16-21. https://dl.acm.org/doi/10.1145/3549545

2. [CACM paper about The Seattle Report on Database Research] Abadi, Daniel; Ailamaki, Anastasia; et al. (actually 33 authors). “The Seattle Report on Database Research”, Communications of the ACM, Volume 65, Issue 8, August 2022, pp. 72-79. https://dl.acm.org/doi/10.1145/3524284

This strategic research report comes out every 5 years, and it is the output from a large panel of database researchers that discusses research trends, needs, and challenges in database systems and data management. Its authors include our textbook’s co-author Raghu Ramakrishnan, Turing Award winner Mike Stonebraker (a database person who won this Nobel-like prize), ARIES crash-recovery inventor C. Mohan of IBM, other database textbook authors, etc. The report discusses some of the topics mentioned in your zyBooks activities including data lakes, big data, machine learning, data science, data integration, cloud services for databases, key-value DBs, wide-column DBs, scaling, etc. This 2022 CACM paper is an update of the original 2018 Seattle Report that includes additional commentary and progress since then. So, it’s kind of a “greatest hits” type of paper, and it will let us work with the latest findings.
Part A: The zyBooks reading, participation activities, and challenge activities

You need to read the materials, follow the animations, and answer the multiple choice, short answer, drag-and-drop, etc. questions. There are no questions whose answers are not (reasonably) found within the zyBooks materials. After you answer, they even tell you which answers are wrong, and you get to repeat the questions. Correct your wrong answers to get full credit. We urge you to take the activities seriously because these are good topics in data management. A future employer may be happy that you have some understanding of these topics. We will track your activities, and award you points for successfully completing the activities. Those marks will be imported into the Canvas Gradebook.
We will only do 2 chapters in the zyBooks course materials. In particular, we will focus on these chapters and the following major sections within the chapters:

Chapter 7: Database Architectures (but only the following sections)
1. MySQL Architecture
2. Cloud Databases: a comparison of on-premise services, IaaS, PaaS, and SaaS
3. Distributed (and Parallel) Databases
4. Replicated Databases
5. n/a (skip Data Warehouses, now part of CPSC 304)
6. n/a (skip Data Warehouse Design)
7. Other Databases, and this includes: Data Lakes, Embedded Databases, Federated Databases, and In-Memory Databases

Chapter 9: NoSQL Databases and Big Data
1. Big Data Databases
2. Key-Value Databases
3. Wide-Column Databases
4. Document Databases
5. Graph Databases
6. MongoDB

Part B: Questions and Answers

There are a series of questions on the following pages. We’re looking for good answers to each, but they don’t have to be bullet-proof explanations (but they need to be correct, not just a “good effort”). You can answer these questions as you go along in zyBooks’ activities; or, you can save them for afterwards. Some questions require some additional reading, namely the ACM Inroads paper and the CACM paper about The Seattle Report. Be sure to answer the questions on your own, and do not copy someone else’s work. The use of ChatGPT is not allowed for this assignment.

Here are the questions:

1. Cloud Databases
a. Hand in screenshots of your completed Challenge Activity 7.2.1, for two different sets of questions. (You might have to make multiple attempts, and there are different sets of questions presented when you re-take it.)
This is actually a nice summary and comparison of on-premise services, IaaS, PaaS, and SaaS. You might want to refer to Participation Activity 7.2.2 when doing this. Be careful to read the instructions (they can change between attempts).
2. Parallel Databases
a. For this question, use the sailing database that we have been discussing in our lectures, pre-class exercises, in-class exercises, and sample questions from the assignments and exams. Provide one example of a query that falls into each of the 3 categories in zyBooks’ Participation Activity 7.3.1. So, there will be 3 queries in all. If you wish to provide the actual SQL, that’s fine (you don’t have to), but for this question, we want you to explain the query in words, and briefly justify why it fits into that category. (Use 1-2 paragraphs for each.)
i. In other words, there are 3 memory architectures described and we want you to provide/explain a sailing query that would perform best under that architecture, but not very well on the other two architectures.

1. Shared Memory Computer:
Query Example: "Find the average age of sailors who have reserved a specific type of boat."
Explanation:
In a shared memory computer, complex operations requiring frequent and fast memory accesses are efficient because all processors access the same memory. The given query involves joining Sailors and Reserves (with Boats, to filter by boat type) and then calculating the average. Shared memory enables fast, shared access to the intermediate data.
In a shared storage computer, although the processors can handle disk operations efficiently, the lack of shared memory can slow down operations that require frequent inter-processor communication or access to common data sets. Calculating an average over joined tables requires more coordination between processors because each processor uses its own memory instead of a common memory space.
In a shared-nothing computer, each processor operates independently using its own memory and storage. This independence can lead to inefficiencies in queries that require frequent access to, and combination of, data from multiple sources. The overhead of distributing, processing, and aggregating data across different nodes can be significant compared to a shared memory environment.

2. Shared Storage Computer:
Query Example: "List all sailors and their corresponding boat reservations, sorted by date."
Explanation:
In a shared storage computer, each processor has its own memory but shares storage with the others. This is ideal for tasks involving many disk read/write operations. This query requires accessing and merging large amounts of data from two tables, which a shared storage system can handle efficiently. Independent memory in this architecture avoids memory contention, while shared disk access enables efficient data retrieval.
In a shared memory computer, while processors can quickly access common memory, they may be bottlenecked by disk I/O operations. Reading large amounts of data from disk and writing it back becomes less efficient due to potential I/O contention between multiple processors.
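To make the two example queries concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. It is an illustration under stated assumptions, not part of the assignment: the schema Sailors(sid, sname, rating, age), Boats(bid, bname, color), Reserves(sid, bid, day) is assumed to match the course's sailing database, the sample rows are made up, and "a specific type of boat" is interpreted as a boat color. The last part imitates, very roughly, how shared-nothing nodes would each compute a partial (sum, count) that a coordinator then merges; the names partial_avg and merge_partials are hypothetical helpers.

```python
import sqlite3

# Assumed schema and made-up sample rows (not taken from the assignment).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL);
CREATE TABLE Boats (bid INTEGER PRIMARY KEY, bname TEXT, color TEXT);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT);
INSERT INTO Sailors VALUES (1, 'Dustin', 7, 45.0), (2, 'Lubber', 8, 55.0),
                           (3, 'Rusty', 10, 35.0);
INSERT INTO Boats VALUES (101, 'Interlake', 'blue'), (102, 'Clipper', 'red'),
                         (103, 'Marine', 'red');
INSERT INTO Reserves VALUES (1, 102, '2023-10-10'), (1, 103, '2023-10-08'),
                            (2, 102, '2023-11-10'), (3, 101, '2023-09-05');
""")

# Sailors who reserved a boat of the given color. The IN subquery lists each
# sailor once, even if that sailor has several matching reservations.
QUALIFYING_SQL = """
SELECT S.sid, S.age FROM Sailors S
WHERE S.sid IN (SELECT R.sid FROM Reserves R JOIN Boats B ON R.bid = B.bid
                WHERE B.color = ?)
"""

# Query 1: average age of sailors who reserved a red boat.
avg_age_red = conn.execute(
    f"SELECT AVG(age) FROM ({QUALIFYING_SQL})", ("red",)
).fetchone()[0]

# Query 2: all sailors with their reservations, sorted by date.
listing = conn.execute("""
SELECT S.sname, R.bid, R.day
FROM Sailors S JOIN Reserves R ON S.sid = R.sid
ORDER BY R.day
""").fetchall()

# Rough shared-nothing imitation: hash-partition the qualifying rows across
# "nodes", let each node compute a partial (sum, count), then merge them.
def partial_avg(ages):
    """Work done locally on one node: partial sum and row count."""
    return (sum(ages), len(ages))

def merge_partials(partials):
    """Coordinator step: combine per-node partials into one average."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count if count else None

rows = conn.execute(QUALIFYING_SQL, ("red",)).fetchall()
num_nodes = 2  # hypothetical node count
partitions = [[age for sid, age in rows if sid % num_nodes == n]
              for n in range(num_nodes)]
merged_avg = merge_partials([partial_avg(p) for p in partitions])

print(avg_age_red, merged_avg)
for row in listing:
    print(row)
```

The partition-then-merge step is the coordination cost that the shared-nothing explanation above refers to: the average cannot be computed by any single node and needs a final combining pass, whereas processors in a shared memory machine could update one common running total.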