Question

Incorporate or use the code below as a reference.

    #include <stdlib.h>  /* for malloc; the other two header names are missing from the source */

    typedef struct node_struct {
        int item;
        struct node_struct *next;
    } node;

    /**
     * 10->NULL
     * We want to insert 20
     * First call ([10], 20) [has not finished yet]
     * {10, {20, NULL}}
     *
     * Second call (NULL, 20) {20, NULL}
     */
    node* insert_unordered(node *temp_root, int item) {
        if (temp_root == NULL) {
            temp_root = malloc(sizeof(node));
            temp_root->item = item;
            temp_root->next = NULL;
        } else {
            temp_root->next = insert_unordered(temp_root->next, item);
        }
        return temp_root;
    }

    /**
     * This is an atypical linked list because
     * (i) it does not allow duplicates and
     * (ii) it is sorted
     * 10->20, I want to insert 15
     * 10->15->20
     */

Submission details

Make a directory and name it using your first and last names in camel notation, for example haniGirgis. Name the main file, where execution starts, hw2.cpp. Compress your directory with the zip utility and submit the compressed file using Blackboard.

Introduction

Suppose you had a document (perhaps a newspaper article, or an unattributed manuscript for a book), and you were interested in knowing who wrote it. One way to try to determine the authorship of the anonymous document is by comparing properties of the anonymous document with properties of known documents, and seeing if there is enough similarity to make a judgment of authorship.

Some simple properties one might use to distinguish different authors include:
• Vocabulary (i.e. the set of words an author uses).
• Word frequencies (i.e. the frequencies with which an author uses words).
• Bigram frequencies (i.e. the frequencies of two consecutive words).
• Bigram probabilities (i.e. the probability that one word follows another word).

Terminology

• A unigram is a sequence of words of length one (i.e. a single word).
• A bigram is a sequence of words of length two.
• The conditional probability of an event E2 given another event E1, written p(E2|E1), is the probability that E2 will occur given that event E1 has already occurred.

We write p(w(k)|w(k-1)) for the conditional probability of a word w in position k, w(k), given the immediately preceding word, w(k-1). You determine the conditional probabilities by determining unigram counts (the number of times each word appears, written c(w(k))), bigram counts (the number of times each pair of words appears, written c(w(k-1) w(k))), and then dividing each bigram count by the unigram count of the first word in the bigram:

    p(WORD(k)|WORD(k-1)) = c(WORD(k-1) WORD(k)) / c(WORD(k-1))

For example, if the word "time" occurs seven times in a text, and "time of" occurs three times, then the probability of "of" occurring after "time" is 3/7.

Project description

In this project, you will be determining conditional probabilities of bigrams. To do this, you will write a C program that reads in a file of text and produces three output files, as described below.

To compute the conditional probabilities, you need to determine unigram and bigram counts first (you can do this in a single pass through a file if you do things carefully) and store them in a Binary Search Tree (BST). After that, you can compute the conditional probabilities.

Input files

Test files can be found at http://www.gutenberg.org/ebooks/. For example, search for "Mark Twain", click on any of his books, and download the "Plain Text UTF-8" format.

In addition, you should test your program on other input files for which you can hand-compute the correct answer.

Output files

Your program must accept the name of an input file as a command line argument. Let's call the name of this file fn. Your program must then produce as output the following set of files:
• Your program must write the unigram counts to a file named fn.uni in which each unigram is listed on a separate line, and each line contains just the unigram and its count (an integer), separated by a single space.
• Your program must write the bigram counts to a file named fn.bi in which each bigram is listed on a separate line, and each line contains just the bigram and its count (an integer), separated by a single space.
• Your program must write the conditional probabilities to a file named fn.cp, reported in the form P(WORD(k)|WORD(k-1)) = p, where p is the conditional probability of WORD(k) given WORD(k-1).
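The conditional-probability formula and the output naming rule can be sketched in a few lines of C. This is only an illustration under assumptions: the helper names cond_prob, output_name and format_cp_line are hypothetical, the suffix is assumed to be appended to the full input file name, and the "%f" precision is a guess, since the assignment does not say how many digits to print.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: p(WORD(k)|WORD(k-1)) = c(WORD(k-1) WORD(k)) / c(WORD(k-1)). */
double cond_prob(int bigram_count, int first_word_count) {
    assert(first_word_count > 0);  /* a counted bigram implies its first word was counted */
    return (double)bigram_count / (double)first_word_count;
}

/* Hypothetical helper: builds "fn.uni", "fn.bi" or "fn.cp" from fn and a suffix.
 * Returns a heap-allocated string (NULL on allocation failure); the caller frees it. */
char *output_name(const char *fn, const char *suffix) {
    size_t len = strlen(fn) + strlen(suffix) + 1;
    char *name = malloc(len);
    if (name != NULL) {
        snprintf(name, len, "%s%s", fn, suffix);
    }
    return name;
}

/* Hypothetical helper: formats one fn.cp line as P(WORD(k)|WORD(k-1)) = p.
 * The "%f" format (six decimals) is an assumption, not part of the assignment. */
int format_cp_line(char *buf, size_t size, const char *prev, const char *word,
                   int bigram_count, int prev_count) {
    return snprintf(buf, size, "P(%s|%s) = %f", word, prev,
                    cond_prob(bigram_count, prev_count));
}
```

With the example from the text ("time" seven times, "time of" three times), format_cp_line(buf, sizeof buf, "time", "of", 3, 7) produces a line for P(of|time) with the value 3/7.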

Accepted Answer

*As per company norms and guidelines, we are providing an answer to the first question only; please repost…

Incorporate or use the code below as a reference.

    #include <stdlib.h>  /* for malloc; the other two header names are missing from the source */

    typedef struct node_struct {
        int item;
        struct node_struct *next;
    } node;

    /**
     * 10->NULL
     * We want to insert 20
     * First call ([10], 20) [has not finished yet]
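The recursive "return the subtree root" pattern of insert_unordered in the reference code carries over to a BST keyed on words. The following is a minimal sketch, not the full assignment: the names bst_node, bst_count, bst_get, bst_write, bst_free, count_words and xmalloc are hypothetical, words are assumed to be already tokenized into an array, and a bigram is assumed to be stored as a single key of the form "first second".

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct bst_node_struct {
    char *key;                       /* a unigram, or a bigram stored as "first second" */
    int count;
    struct bst_node_struct *left;
    struct bst_node_struct *right;
} bst_node;

/* malloc that exits on failure, to keep the sketch short. */
void *xmalloc(size_t n) {
    void *p = malloc(n);
    if (p == NULL) { perror("malloc"); exit(EXIT_FAILURE); }
    return p;
}

/* Inserts key with count 1, or increments its count if it is already present.
 * Returns the root of the updated subtree, like insert_unordered does. */
bst_node *bst_count(bst_node *root, const char *key) {
    assert(key != NULL);
    if (root == NULL) {
        root = xmalloc(sizeof(bst_node));
        root->key = xmalloc(strlen(key) + 1);
        strcpy(root->key, key);
        root->count = 1;
        root->left = root->right = NULL;
        return root;
    }
    int cmp = strcmp(key, root->key);
    if (cmp < 0)      root->left = bst_count(root->left, key);
    else if (cmp > 0) root->right = bst_count(root->right, key);
    else              root->count++;           /* duplicate key: count it, do not insert */
    return root;
}

/* Returns the count stored for key, or 0 if key is absent. */
int bst_get(const bst_node *root, const char *key) {
    while (root != NULL) {
        int cmp = strcmp(key, root->key);
        if (cmp == 0) return root->count;
        root = (cmp < 0) ? root->left : root->right;
    }
    return 0;
}

/* In-order traversal: writes "key count" lines in sorted key order. */
void bst_write(const bst_node *root, FILE *out) {
    if (root == NULL) return;
    bst_write(root->left, out);
    fprintf(out, "%s %d\n", root->key, root->count);
    bst_write(root->right, out);
}

void bst_free(bst_node *root) {
    if (root == NULL) return;
    bst_free(root->left);
    bst_free(root->right);
    free(root->key);
    free(root);
}

/* Single pass over the words: each word updates the unigram tree, and each
 * word after the first also updates the bigram tree with "previous current". */
void count_words(const char **words, size_t n, bst_node **uni, bst_node **bi) {
    for (size_t i = 0; i < n; i++) {
        *uni = bst_count(*uni, words[i]);
        if (i > 0) {
            size_t len = strlen(words[i - 1]) + strlen(words[i]) + 2;
            char *key = xmalloc(len);
            snprintf(key, len, "%s %s", words[i - 1], words[i]);
            *bi = bst_count(*bi, key);
            free(key);
        }
    }
}
```

Calling bst_write on the unigram tree with a file opened as fn.uni, and on the bigram tree with fn.bi, gives one "key count" line per entry. Reading and tokenizing the input file is left out here, and an unbalanced BST can become deep if the words arrive in sorted order.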