Suppose we have a document of D distinct words and we want to return the N most frequently occurring words in the document. Assume N is much less than D.

Describe the fastest algorithm (clearly list each step of the algorithm) to solve this problem, starting with a hash table in which to store the frequencies. You may use additional structures, as long as they are among the structures covered in class. Make sure to specify the key and the value for every structure used.

Against each step of your algorithm, write down the big-O running time with an explanation of how you got it, and specify whether this running time is expected time or worst-case time. You may rely on the known running times of structures covered in class, **but you still have to write what the actual running time is**.


Use Java.

 




Expert Solution
Step 1

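In Java, an effective way to find the most frequent words is to begin with a HashMap that maps each word (the key) to its frequency (the value). Insertion, lookup, and deletion take O(1) expected time; in the worst case, when many keys collide in the same bucket, a single operation can take O(D) time for a table holding D distinct words.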
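The counting step could be sketched as follows. This is a minimal illustration rather than the original answer's code: the class name `WordCount`, the method `countWords`, and the assumption that the document has already been split into an array of words are all hypothetical. Each `merge` call is one hash-table update, so the loop costs O(1) expected time per word.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class, introduced only for illustration.
class WordCount {
    // Key: the word (String). Value: how many times it occurs (Integer).
    static Map<String, Integer> countWords(String[] words) {
        Map<String, Integer> freq = new HashMap<>();
        for (String w : words) {
            // merge stores 1 for an unseen word, otherwise adds 1 to its count
            freq.merge(w, 1, Integer::sum);
        }
        return freq;
    }

    public static void main(String[] args) {
        String[] words = {"to", "be", "or", "not", "to", "be"};
        Map<String, Integer> freq = countWords(words);
        System.out.println(freq.get("to"));  // 2
        System.out.println(freq.get("or"));  // 1
        System.out.println(freq.size());     // 4 distinct words
    }
}
```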

Step 2

HashMap:

HashMap<K, V> has been part of Java's Collections Framework since Java 1.2.

This class is found in the java.util package.

It provides the basic implementation of Java's Map interface.

It stores data in (key, value) pairs, and a value is retrieved through its key, which may be of a different type than the value (for example, an Integer).

One object is used as a key (index) to another object (the value). If you try to insert a duplicate key, the new value replaces the value already stored for that key.
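The duplicate-key behavior can be seen in a short sketch. The class `DuplicateKeyDemo` and method `putTwice` are hypothetical names used only for this illustration; `put` itself is the standard `java.util.Map` method, which returns the value it displaced.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, introduced only for illustration.
class DuplicateKeyDemo {
    // Puts the same key twice and returns the value that was displaced.
    static String putTwice(Map<Integer, String> map) {
        map.put(1, "first");
        // put() returns the previous value for the key, or null if there was none
        return map.put(1, "second");
    }

    public static void main(String[] args) {
        Map<Integer, String> map = new HashMap<>();
        String displaced = putTwice(map);
        System.out.println(displaced);   // first
        System.out.println(map.get(1));  // second
        System.out.println(map.size());  // 1
    }
}
```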

HashMap is similar to Hashtable, but it is unsynchronized. It also permits nulls: there can be at most one null key, and there can be any number of null values. The class makes no guarantees about the iteration order of the map.
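Because a HashMap has no useful ordering, the N most frequent words cannot be read off it directly; a second structure is needed. The sketch below is one possible way to do the selection, not necessarily the answer the course expects. It assumes a binary min-heap (`java.util.PriorityQueue`) is among the structures covered in class; the class `TopWords` and method `topN` are hypothetical names. The heap holds (word, count) entries ordered by count and never grows beyond N entries, so each of the D map entries costs O(log N) worst case, giving O(D log N) worst case for the selection on top of the expected O(1) per word spent counting.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical helper class, introduced only for illustration.
class TopWords {
    // Returns up to n words from freq, most frequent first.
    static List<String> topN(Map<String, Integer> freq, int n) {
        if (n <= 0) {
            return new ArrayList<>();
        }
        // Min-heap by count: the root is the least frequent word kept so far.
        PriorityQueue<Map.Entry<String, Integer>> heap =
            new PriorityQueue<Map.Entry<String, Integer>>(
                n, (a, b) -> Integer.compare(a.getValue(), b.getValue()));
        for (Map.Entry<String, Integer> e : freq.entrySet()) {
            if (heap.size() < n) {
                heap.offer(e);                       // O(log N)
            } else if (e.getValue() > heap.peek().getValue()) {
                heap.poll();                         // drop the current minimum
                heap.offer(e);                       // O(log N)
            }
        }
        // The heap pops in ascending order, so prepend to get descending order.
        LinkedList<String> result = new LinkedList<>();
        while (!heap.isEmpty()) {
            result.addFirst(heap.poll().getKey());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> freq = new HashMap<>();
        freq.put("a", 5);
        freq.put("b", 3);
        freq.put("c", 1);
        freq.put("d", 4);
        System.out.println(topN(freq, 2));  // [a, d]
    }
}
```

Words with equal counts come out in no particular order. An alternative with the same structures is to heapify all D entries into a max-heap in O(D) and delete the maximum N times, for O(D + N log D) worst case.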
