When an author produces an index for his or her book, the first step in this process is to decide which words should go into the index; the second is to produce a list of the pages where each word occurs. Instead of trying to choose words out of our heads, we decided to let the computer produce a list of all the unique words used in the manuscript and their frequency of occurrence. We could then go over the list and choose which words to put into the index. The main object in this problem is a "word" with associated frequency. The tentative definition of "word" here is a string of alphanumeric characters between markers where markers are white space and all punctuation marks; anything non-alphanumeric stops the reading. If we skip all un-allowed characters before getting the string, we should have exactly what we want. Ignoring words of fewer than three letters will remove from consideration such as "a", "is", "to", "do", and "by" that do not belong in an index. In this project, you are asked to write a program to read any text file and then list all the "words" in alphabetic order with their frequency together appeared in the article. The "word" is defined above and has at least three letters. Your result should be printed to an output file named YourUserID.txt. You need to create a Binary Search Tree (BST) to store all the word object by writing an insertion or increment function. Finally, a proper traversal print function of the BST should be able to output the required results. The BST class in the text can not be used directly to solve this problem. It is also NOT a good idea to modify the BST class to solve this problem. Instead, the following codes are recommended to start your program. //Data stored in the node type struct WordCount { string word; int count; };   //Node type:  struct TreeNode { WordCount info; TreeNode * left; TreeNode * right; };   // Two function's prototype // Increments the frequency count if the string is in the tree // or inserts the string if it is not there. void Insert(TreeNode*&, string);   // Prints the words in the tree and their frequency counts. void PrintTree(TreeNode* , ofstream&);   //Start your main function and the definitions of above two functions.

Database System Concepts
7th Edition
ISBN:9780078022159
Author:Abraham Silberschatz Professor, Henry F. Korth, S. Sudarshan
Publisher:Abraham Silberschatz Professor, Henry F. Korth, S. Sudarshan
Chapter1: Introduction
Section: Chapter Questions
Problem 1PE
icon
Related questions
Question

When an author produces an index for his or her book, the first step in this process is to decide which words should go into the index; the second is to produce a list of the pages where each word occurs. Instead of trying to choose words out of our heads, we decided to let the computer produce a list of all the unique words used in the manuscript and their frequency of occurrence. We could then go over the list and choose which words to put into the index.

The main object in this problem is a "word" with associated frequency. The tentative definition of "word" here is a string of alphanumeric characters between markers where markers are white space and all punctuation marks; anything non-alphanumeric stops the reading. If we skip all un-allowed characters before getting the string, we should have exactly what we want. Ignoring words of fewer than three letters will remove from consideration such as "a", "is", "to", "do", and "by" that do not belong in an index.

In this project, you are asked to write a program to read any text file and then list all the "words" in alphabetic order with their frequency together appeared in the article. The "word" is defined above and has at least three letters.

  1. Your result should be printed to an output file named YourUserID.txt.
  2. You need to create a Binary Search Tree (BST) to store all the word object by writing an insertion or increment function. Finally, a proper traversal print function of the BST should be able to output the required results.
  3. The BST class in the text can not be used directly to solve this problem. It is also NOT a good idea to modify the BST class to solve this problem. Instead, the following codes are recommended to start your program.

//Data stored in the node type struct WordCount { string word; int count; };   //Node type:  struct TreeNode { WordCount info; TreeNode * left; TreeNode * right; };   // Two function's prototype // Increments the frequency count if the string is in the tree // or inserts the string if it is not there. void Insert(TreeNode*&, string);   // Prints the words in the tree and their frequency counts. void PrintTree(TreeNode* , ofstream&);   //Start your main function and the definitions of above two functions.

You only need to write these two functions. Do not add anymore functions besides the ones given here. 

Expert Solution
trending now

Trending now

This is a popular solution!

steps

Step by step

Solved in 2 steps

Blurred answer
Knowledge Booster
Lists
Learn more about
Need a deep-dive on the concept behind this application? Look no further. Learn more about this topic, computer-science and related others by exploring similar questions and additional content below.
Similar questions
  • SEE MORE QUESTIONS
Recommended textbooks for you
Database System Concepts
Database System Concepts
Computer Science
ISBN:
9780078022159
Author:
Abraham Silberschatz Professor, Henry F. Korth, S. Sudarshan
Publisher:
McGraw-Hill Education
Starting Out with Python (4th Edition)
Starting Out with Python (4th Edition)
Computer Science
ISBN:
9780134444321
Author:
Tony Gaddis
Publisher:
PEARSON
Digital Fundamentals (11th Edition)
Digital Fundamentals (11th Edition)
Computer Science
ISBN:
9780132737968
Author:
Thomas L. Floyd
Publisher:
PEARSON
C How to Program (8th Edition)
C How to Program (8th Edition)
Computer Science
ISBN:
9780133976892
Author:
Paul J. Deitel, Harvey Deitel
Publisher:
PEARSON
Database Systems: Design, Implementation, & Manag…
Database Systems: Design, Implementation, & Manag…
Computer Science
ISBN:
9781337627900
Author:
Carlos Coronel, Steven Morris
Publisher:
Cengage Learning
Programmable Logic Controllers
Programmable Logic Controllers
Computer Science
ISBN:
9780073373843
Author:
Frank D. Petruzella
Publisher:
McGraw-Hill Education