
Write a Python program to create a Markov model of order k, and then use that model to generate text. Our Markov model will be stored in a dictionary. The keys of the dictionary will be k-grams, and the value for each key will also be a dictionary, storing the number of occurrences of each character that follows the k-gram.
Note how, for instance, the key 'ga' in the dictionary has the value {'g': 4, 'a': 1}. That is because, following 'ga' in the input text, the letter 'g' appears four times and the letter 'a' appears one time.
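As a minimal sketch of how such a nested dictionary could be built (one possible approach, not necessarily the intended solution): the sample string below is an assumption chosen for illustration, not the input from the original problem, although it does produce the same counts for 'ga' as the ones quoted above.

```python
def get_grams(text, k):
    """Map each k-gram in text to a dict counting the characters that follow it."""
    grams = {}
    # Stop k characters before the end so every k-gram has a following character.
    for i in range(len(text) - k):
        gram = text[i:i + k]
        next_char = text[i + k]
        followers = grams.setdefault(gram, {})
        followers[next_char] = followers.get(next_char, 0) + 1
    return grams


sample = "gagggagaggcgagaaa"  # assumed sample text, for illustration only
model = get_grams(sample, 2)
print(model["ga"])  # {'g': 4, 'a': 1}
```

Each inner dictionary only stores raw counts; turning them into probabilities is left to the code that samples from the model.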
Write the following functions:
get_grams(text, k): Returns a dictionary of k-grams as described above, using the input string text and the given positive integer k. Do not form k-grams for the last k characters of the text.
combine_grams(grams1, grams2): Takes two k-gram dictionaries and combines them, returning the new combined dictionary. All key-value pairs from both dictionaries should be added into the new dictionary. If a key exists in both dictionaries, then combine the values (if the two values are dictionaries, then combine the dictionaries as above; if the two values are integers, then add them together). The input dictionaries should not be modified.
get_grams_from_files(filenames, k): This function takes a list of strings filenames and a positive integer k. It reads the files at the given filenames and creates a k-grams dictionary for each file. It then combines all of these k-grams dictionaries and returns the combined dictionary. Note: When opening the files, use the keyword argument encoding='utf-8', which tells Python to interpret the text file using that character encoding. Otherwise, Python may use a different encoding depending on your OS, which can result in errors.
generate_next_char(grams, cur_gram): This function returns a prediction of the next character to follow the k-gram cur_gram, given the dictionary of k-grams grams. To do so, it must determine the probability of each character that can follow cur_gram. Probability is beyond the scope of this course, so you may use a function from the random module called random.choices(). random.choices(population, weights) takes two lists as arguments. The first list holds the items to choose from at random (in this case, the characters that can follow the given k-gram). The second list holds the weight for each item in the first list. That is, the element (character) at population[i] is chosen with probability proportional to weights[i]. All you have to do is create these two lists, then call random.choices() to obtain the predicted character; note that it returns a list of chosen items (a single-element list by default). The weight for a character can be found by taking the number of occurrences of that character following the k-gram and dividing by the total number of occurrences of any character following the k-gram. For example, if k = 1, the current k-gram is 'a', and the k-gram dictionary is {'a': {'b': 3, 'c': 9}, 'c': {'d': 4}}, then either 'b' or 'c' could follow, with 'b' having weight 3/12 and 'c' having weight 9/12. (So the function would be much more likely to return 'c' than 'b'.) If cur_gram is not present in the grams dictionary, or if it has a different number of characters than the k-grams in the dictionary, then raise an AssertionError with an appropriate error message in each case.
generate_text(grams, start_gram, k, n): This function generates a piece of text of length n (a positive integer), given the k-grams dictionary grams, the positive integer k, and the starting k-gram start_gram. That is, starting with start_gram, continue generating characters until you have a text of length n. Then, cut off the text at the last whitespace character (space or newline), and return the text. (Cutting off the text may result in the text being a few characters shorter than n, but that’s OK.) Note: If the start_gram string is longer than k characters, then use only the first k characters (discard the rest).
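The functions described above could be sketched as follows. This is one possible reading of the specification rather than a reference solution. Several choices are assumptions: the length check runs before the membership check, the whitespace at the cut point is dropped, and the text is returned uncut when it contains no whitespace. A compact get_grams is included only so the block runs on its own.

```python
import random


def get_grams(text, k):
    """Count, for each k-gram in text, the characters that follow it."""
    grams = {}
    for i in range(len(text) - k):  # the last k characters start no k-gram
        followers = grams.setdefault(text[i:i + k], {})
        followers[text[i + k]] = followers.get(text[i + k], 0) + 1
    return grams


def combine_grams(grams1, grams2):
    """Return a new merged dictionary; neither input is modified."""
    combined = {}
    for source in (grams1, grams2):
        for key, value in source.items():
            if isinstance(value, dict):
                # Nested dictionaries are merged recursively into a fresh dict.
                combined[key] = combine_grams(combined.get(key, {}), value)
            else:
                # Integer counts are added together.
                combined[key] = combined.get(key, 0) + value
    return combined


def get_grams_from_files(filenames, k):
    """Build one k-grams dictionary from the contents of several files."""
    grams = {}
    for filename in filenames:
        with open(filename, encoding="utf-8") as f:
            grams = combine_grams(grams, get_grams(f.read(), k))
    return grams


def generate_next_char(grams, cur_gram):
    """Pick the next character at random, weighted by follower counts."""
    # Assumption: k is inferred from any key of the dictionary.
    k = len(next(iter(grams), cur_gram))
    if len(cur_gram) != k:
        raise AssertionError(f"cur_gram must have {k} characters")
    if cur_gram not in grams:
        raise AssertionError(f"{cur_gram!r} is not in the grams dictionary")
    followers = grams[cur_gram]
    population = list(followers)
    total = sum(followers.values())
    weights = [followers[ch] / total for ch in population]
    # random.choices returns a list; take its single element.
    return random.choices(population, weights)[0]


def generate_text(grams, start_gram, k, n):
    """Generate up to n characters, then cut at the last space or newline."""
    text = start_gram[:k]  # ignore anything beyond the first k characters
    while len(text) < n:
        # Raises AssertionError if the current k-gram was never seen.
        text += generate_next_char(grams, text[-k:])
    cut = max(text.rfind(" "), text.rfind("\n"))
    # Assumption: with no whitespace at all, the text is returned uncut.
    return text[:cut] if cut != -1 else text


# Every k-gram in this toy model has exactly one follower,
# so the generated text is deterministic.
grams = get_grams("abc abc abc abc ", 1)
print(generate_text(grams, "a", 1, 10))  # abc abc
```

With real input, generation can reach a k-gram that has no recorded follower (for example, the final k characters of the source text), in which case this sketch lets the AssertionError from generate_next_char propagate.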







