Assignent 11.docx

School

University of Missouri, Columbia

Course

8740

Subject

Computer Science

Date

Jan 9, 2024

Type

docx

Pages

5

Uploaded by sharukh95

Assignment 11

1.) The code explores topics in the Associated Press dataset using Latent Dirichlet Allocation (LDA) from the topicmodels library. It fits an LDA model with 2 topics and sets a seed so that the results are reproducible. The tidy function then converts the LDA results into a tidy data frame, from which the code extracts the top 10 terms for each topic based on their beta values.
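A minimal sketch of the fitting and tidying steps, assuming the topicmodels and tidytext packages are installed. The object names ap_lda and ap_topics and the seed value 1234 are assumptions for illustration; the answer does not state them.

```r
library(topicmodels)  # provides LDA() and the AssociatedPress dataset
library(tidytext)     # provides the tidy() method for LDA models

# Load the Associated Press document-term matrix shipped with topicmodels
data("AssociatedPress")

# Fit a 2-topic LDA model; the seed value is an assumed placeholder
ap_lda <- LDA(AssociatedPress, k = 2, control = list(seed = 1234))

# One row per topic-term pair; beta is the per-topic probability of the term
ap_topics <- tidy(ap_lda, matrix = "beta")
head(ap_topics)
```

Fixing the seed matters because the fitting procedure starts from a random initialization, so two runs without a seed can assign terms to topics differently.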

The code then creates a bar plot with ggplot2 that displays the top terms for each topic. Overall, the script fits a two-topic LDA model to the Associated Press dataset and visualizes the top terms for each topic. The tidy function comes from the tidytext library; it converts the LDA results into a tidy data frame with one row per topic-term pair, which makes the data easier to filter, sort, and plot.
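The top-term extraction and the faceted bar plot can be sketched as follows. This is an illustration under assumptions, not the assignment's exact code: the object names (ap_lda, ap_topics, ap_top_terms, p) and the seed value are made up here, and slice_max() requires dplyr 1.0.0 or later.

```r
library(topicmodels)
library(tidytext)
library(dplyr)
library(ggplot2)

# Refit the model so this block runs on its own (seed value is assumed)
data("AssociatedPress")
ap_lda <- LDA(AssociatedPress, k = 2, control = list(seed = 1234))
ap_topics <- tidy(ap_lda, matrix = "beta")

# Keep the 10 highest-beta terms within each topic, sorted by beta
ap_top_terms <- ap_topics %>%
  group_by(topic) %>%
  slice_max(beta, n = 10, with_ties = FALSE) %>%
  ungroup() %>%
  arrange(topic, desc(beta))

# One bar per term, colored by topic, with a separate facet per topic.
# reorder_within() and scale_y_reordered() order terms inside each facet.
p <- ap_top_terms %>%
  mutate(term = reorder_within(term, beta, topic)) %>%
  ggplot(aes(beta, term, fill = factor(topic))) +
  geom_col(show.legend = FALSE) +
  facet_wrap(~ topic, scales = "free") +
  scale_y_reordered()

print(p)
```

reorder_within() is needed because a plain reorder() would rank terms across both topics at once, so a term shared by the two topics could appear out of order in one facet.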

The code groups the tidy LDA results by topic, then extracts the top 10 terms for each topic based on their beta values (the estimated probability of each term within a topic, which indicates how strongly the term is associated with that topic). The result is sorted in descending order of beta. The code uses ggplot2 to create a bar plot in which each bar represents a term and the bars are colored by topic. The plot is faceted, so each topic gets its own panel. scale_y_reordered() ensures that the terms are ordered within each facet by their beta values.

#2.
