
Database System Concepts
7th Edition
ISBN: 9780078022159
Author: Abraham Silberschatz Professor, Henry F. Korth, S. Sudarshan
Publisher: McGraw-Hill Education
Question
1. The regulation of electric and gas utilities is an important public policy question affecting consumers' choice and cost of energy provider. To inform deliberation on public policy, data on eight numerical variables have been collected for a group of energy companies. To summarize the data, hierarchical clustering has been executed using Euclidean distance as the similarity measure and Ward's method as the clustering method. Based on the following dendrogram, what is the most appropriate number of clusters to organize these utility companies?
Answer:
What is the most appropriate number of clusters to organize these utility companies?

Transcribed Image Text: [Dendrogram. Vertical axis: Distance, from 0 to 11.0 in steps of 0.5. Horizontal axis: Cluster, with leaves in the order 10, 13, 4, 20, 2, 21, 1, 18, 14, 19, 3, 9, 6, 8, 16, 11, 5, 7, 12, 15, 17.]
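A common heuristic for reading a cluster count off a dendrogram is to cut it inside the widest vertical gap between successive merge heights, since that is where joining clusters becomes most costly. The sketch below illustrates that heuristic only. The function name `clusters_from_largest_gap` and the merge heights in the usage note are hypothetical and are not read from the figure above.

```python
def clusters_from_largest_gap(merge_heights):
    """Suggest a cluster count by cutting a dendrogram in its widest gap.

    merge_heights: the distance at which each merge occurs, one value per
    merge (n - 1 values for n observations), in any order.
    Returns (number_of_clusters, cut_height).
    """
    heights = sorted(merge_heights)
    if len(heights) < 2:
        raise ValueError("need at least two merges to compare gaps")

    n_obs = len(heights) + 1

    # gaps[i] is the vertical space between merge i and merge i + 1.
    gaps = [heights[i + 1] - heights[i] for i in range(len(heights) - 1)]

    # Index of the widest gap (the first one if there is a tie).
    i_best = max(range(len(gaps)), key=gaps.__getitem__)

    # A cut inside gap i_best keeps merges 0..i_best, i.e. i_best + 1 merges.
    # Each merge reduces the cluster count by one.
    k = n_obs - (i_best + 1)
    cut = (heights[i_best] + heights[i_best + 1]) / 2
    return k, cut


if __name__ == "__main__":
    # Hypothetical merge heights for six observations (illustration only).
    example = [0.8, 1.1, 1.5, 2.0, 6.5]
    print(clusters_from_largest_gap(example))
```

With the hypothetical heights above, the widest gap lies between 2.0 and 6.5, so the cut at 4.25 leaves two clusters. With SciPy, the heights would typically come from the third column of the matrix returned by `scipy.cluster.hierarchy.linkage(X, method="ward")`. The heuristic is a starting point, and the choice should also be checked against how interpretable the resulting groups are.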
Similar questions
- Give a well-reasoned answer of 2-3 paragraphs to Exercise 17 in Chapter 2 of Introduction to Data Mining by Pang-Ning Tan, Michael Steinbach, Vipin Kumar, and Anuj Karpatne.
- Give an example of when you would want to analyze the mean of a dataset.
- 2) Given a clustering task, how can you evaluate the performance on the test set, and how would we know if the clusters are correct? Explain any three possible solutions.
- Consider the advantages and disadvantages of integration based on decomposition.
- Assume that we use cosine similarity as the similarity measure. In hierarchical agglomerative clustering (HAC), we need to define a good way to measure the similarity of two clusters. One usual way is to use the group average similarity between documents in two clusters. Formally, for two clusters $C_i$ and $C_j$, let $C = C_i \cup C_j$ and $n = |C|$. We define $\mathrm{sim}(C_i, C_j) = \frac{1}{n(n-1)} \sum_{x, y \in C,\ x \neq y} s(x, y)$, where $s(x, y)$ is the cosine similarity between $x$ and $y$. Given a list of clusters $C_1, C_2, \ldots, C_m$, assume that their pairwise similarities are saved in a two-dimensional array of size $m^2$. Given three clusters $C_i$, $C_j$, and $C_k$, show that there is a way to compute $\mathrm{sim}(C_i \cup C_j, C_k)$ in constant time. Note that we ignore the dimensionality in time complexity.
- After learning about the k-means clustering algorithm in the big data course, some of your classmates tell you that they are not very enthusiastic about using it. The main reason they give is that, when applied to the same dataset, the algorithm seems to give different clusters every time it is run. What should you say to them?
  - You should explain to them that they are interpreting the computer output incorrectly. Even though k-means seems to give different clusters every time it is run on the same dataset, if they look more closely at those clusters, they will notice that they are really the same clusters, but with different labels.
  - You should explain to them that they are using the computer functions incorrectly. The k-means algorithm always results in the same clusters.
  - You should explain to them that they should run the k-means algorithm several times and then pick the clusters with the smallest objective function (all while warning them…
- How is the rise of graph databases impacting serialization techniques, given their unique data relationships?
- Please do this manually. a. What is the distance between the two farthest members (max or complete link)? Round to four decimal places here and in the next two problems. b. What is the distance between the two closest members (min or single link)? c. What is the average distance between all pairs? d. What is the center distance between the two clusters? e. Among the distances above, which one is robust to noise? Answer "complete", "single", "average", or "center".
- 2. Examine the dendrogram: How many clusters seem reasonable for describing the data?
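The four between-cluster distances named in the linkage question above (farthest members, closest members, average over all pairs, and center distance) can be sketched in a few lines. This is a minimal illustration that assumes Euclidean distance and takes "all pairs" to mean one point from each cluster. The function name `linkage_distances` and the sample points are hypothetical and do not come from any of the listed problems.

```python
import math
from itertools import product


def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    return tuple(sum(coord) / len(points) for coord in zip(*points))


def linkage_distances(cluster_a, cluster_b):
    """Return the four common between-cluster distances (Euclidean)."""
    # Distance for every pair made of one point from each cluster.
    pairwise = [math.dist(p, q) for p, q in product(cluster_a, cluster_b)]
    return {
        "complete": max(pairwise),                  # two farthest members
        "single": min(pairwise),                    # two closest members
        "average": sum(pairwise) / len(pairwise),   # mean over all pairs
        "center": math.dist(centroid(cluster_a), centroid(cluster_b)),
    }


if __name__ == "__main__":
    # Hypothetical 2-D points, chosen only to make the arithmetic easy.
    a = [(0, 0), (0, 2)]
    b = [(3, 0), (3, 2)]
    for name, value in linkage_distances(a, b).items():
        print(f"{name}: {value:.4f}")
```

`math.dist` requires Python 3.8 or later. Rounding is applied only when printing, so the returned values keep full precision for any further calculation.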
Recommended textbooks for you

Database System Concepts
Computer Science
ISBN:9780078022159
Author:Abraham Silberschatz Professor, Henry F. Korth, S. Sudarshan
Publisher:McGraw-Hill Education

Starting Out with Python (4th Edition)
Computer Science
ISBN:9780134444321
Author:Tony Gaddis
Publisher:PEARSON

Digital Fundamentals (11th Edition)
Computer Science
ISBN:9780132737968
Author:Thomas L. Floyd
Publisher:PEARSON

C How to Program (8th Edition)
Computer Science
ISBN:9780133976892
Author:Paul J. Deitel, Harvey Deitel
Publisher:PEARSON

Database Systems: Design, Implementation, & Manag...
Computer Science
ISBN:9781337627900
Author:Carlos Coronel, Steven Morris
Publisher:Cengage Learning

Programmable Logic Controllers
Computer Science
ISBN:9780073373843
Author:Frank D. Petruzella
Publisher:McGraw-Hill Education