Question
CONTEXT:
Consider two relations R(a,b) and S(c,d) that are both horizontally
partitioned across N = 3 nodes as shown in the diagram below. Each node locally
stores approximately 1/N of the tuples in R and 1/N of the tuples in S. The tuples
of R are randomly organized across machines (i.e., R is block partitioned across
machines) while the tuples of S are hash-partitioned on S.c.
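To make the two placement schemes concrete, here is a minimal Python sketch. It assumes round-robin placement as a stand-in for block partitioning and Python's built-in `hash` as the partitioning function; both choices, the helper names, and the sample tuples are illustrative and not part of the problem statement.

```python
from collections import defaultdict

def block_partition(tuples, n):
    """Spread tuples across n nodes without looking at any attribute value
    (round-robin here, as a stand-in for arbitrary block placement)."""
    nodes = defaultdict(list)
    for i, t in enumerate(tuples):
        nodes[i % n].append(t)
    return dict(nodes)

def hash_partition(tuples, n, key):
    """Send each tuple to node hash(key(t)) mod n, so equal keys are co-located."""
    nodes = defaultdict(list)
    for t in tuples:
        nodes[hash(key(t)) % n].append(t)
    return dict(nodes)

# Hypothetical sample data, not taken from the problem's diagram.
R = [(a, b) for a in range(4) for b in range(3)]   # R(a, b)
S = [(c, d) for c in range(3) for d in (1, 2)]     # S(c, d)

r_nodes = block_partition(R, 3)                      # placement ignores a and b
s_nodes = hash_partition(S, 3, key=lambda t: t[0])   # placement decided by S.c
```

With this layout, all S tuples that share a value of c sit on one node, whereas R tuples with the same b may be on any node.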
QUERY:
SELECT a, avg(d) as avg
FROM R, S
WHERE R.b = S.c
AND S.d > 0
GROUP BY a
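For reference, the query's meaning on a single machine can be sketched in Python as follows. This is only a plain hash-join-then-aggregate reading of the SQL, with made-up sample tuples; it is not a parallel plan and the function name `run_query` is hypothetical.

```python
from collections import defaultdict

def run_query(R, S):
    """SELECT a, avg(d) FROM R, S WHERE R.b = S.c AND S.d > 0 GROUP BY a."""
    # Apply the selection d > 0 first, then index the surviving d values by c.
    s_by_c = defaultdict(list)
    for c, d in S:
        if d > 0:
            s_by_c[c].append(d)
    # Join on R.b = S.c and keep a running [sum, count] for each group a.
    acc = defaultdict(lambda: [0, 0])
    for a, b in R:
        for d in s_by_c.get(b, ()):
            acc[a][0] += d
            acc[a][1] += 1
    return {a: total / count for a, (total, count) in acc.items()}

# Hypothetical sample data.
R = [(1, 10), (1, 20), (2, 10)]
S = [(10, 4), (10, -1), (20, 8)]
result = run_query(R, S)   # {1: 6.0, 2: 4.0}
```

A parallel plan has to produce the same answer, which means bringing matching R.b and S.c values together for the join and bringing equal values of a together for the final average.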
PLEASE SOLVE THE FOLLOWING:
Assume the same query as before, only now the data is distributed on
300 servers instead of just 3 servers. We expect a linear speedup, in other words
we expect the runtime to be about 100 times faster. However, if the values of some
attribute are skewed, then the performance of a parallel query plan can be far from
a linear speedup. Indicate with of the attributes below, if skewed, can significantly
prevent your query plan from achieving linear speedup.
• Skew on attribute R.a.
• Skew on attribute R.b.
• Skew on attribute S.c.
• Skew on attribute S.d.
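As a rough illustration of why skew can break linear speedup, the sketch below hash-partitions a list of key values over 300 nodes and compares the busiest node's load with the ideal share. The data sizes, the 50% skew, and the helper name `max_load_ratio` are assumptions made for the example. It shows the general effect for any attribute a plan partitions on, and does not by itself settle which of the four attributes matter for a particular plan.

```python
from collections import Counter

def max_load_ratio(keys, n_nodes):
    """Hash-partition keys over n_nodes; return busiest node's load / ideal load."""
    loads = Counter(hash(k) % n_nodes for k in keys)
    ideal = len(keys) / n_nodes
    return max(loads.values()) / ideal

N = 300
uniform = list(range(30_000))                 # every key value appears once
skewed = [0] * 15_000 + list(range(15_000))   # half of all tuples share key 0

uniform_ratio = max_load_ratio(uniform, N)    # 1.0: perfectly balanced
skewed_ratio = max_load_ratio(skewed, N)      # 150.5: one node is overloaded
```

Because a parallel step finishes only when its slowest node does, the skewed case runs about as fast as two nodes would, even though 300 are available. Whether a given attribute can cause this depends on whether the plan ever partitions data on that attribute.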