
Database System Concepts
7th Edition
ISBN: 9780078022159
Author: Abraham Silberschatz, Henry F. Korth, S. Sudarshan
Publisher: McGraw-Hill Education
Question
Write a function which takes in a pandas dataframe and returns a modified dataframe that includes two new columns that contain information about the municipality and hashtag of the tweet.
Function Specifications:
- Function should take a pandas dataframe as input.
- Extract the municipality from a tweet using the mun_dict dictionary given at the start of the notebook and insert the result into a new column named 'municipality' in the same dataframe.
- Use the entry np.nan when a municipality is not found.
- Extract a list of hashtags from a tweet into a new column named 'hashtags' in the same dataframe.
- Use the entry np.nan when no hashtags are found.
Hint: you will need the mun_dict variable defined at the top of this notebook.
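One possible way to meet the specification is sketched below. It is not the notebook's official solution. The notebook's mun_dict is not shown on this page, so the dictionary here is a hypothetical stand-in that maps Twitter handles to municipality names. The function name extract_municipality_hashtags and the tweet column name 'Tweets' are likewise assumptions and should be adjusted to match the actual notebook.

```python
import re

import numpy as np
import pandas as pd

# Hypothetical stand-in for the notebook's mun_dict (the real one is not
# shown on this page): maps a Twitter handle to a municipality name.
mun_dict = {
    "@CityofCTAlerts": "Cape Town",
    "@CityPowerJhb": "Johannesburg",
}


def extract_municipality_hashtags(df, text_col="Tweets"):
    """Add 'municipality' and 'hashtags' columns to df and return it.

    text_col is an assumed column name for the tweet text.
    """

    def find_municipality(tweet):
        # Check each whitespace-separated token against mun_dict,
        # ignoring trailing punctuation such as ':' or ','.
        for word in tweet.split():
            key = word.rstrip(".,:;!?")
            if key in mun_dict:
                return mun_dict[key]
        return np.nan  # no known municipality handle in this tweet

    def find_hashtags(tweet):
        # A hashtag is taken to be '#' followed by word characters.
        tags = re.findall(r"#\w+", tweet)
        return tags if tags else np.nan  # np.nan when no hashtags found

    # Columns are added to the same dataframe, as the specification asks.
    df["municipality"] = df[text_col].apply(find_municipality)
    df["hashtags"] = df[text_col].apply(find_hashtags)
    return df


# Small usage example with made-up tweets.
tweets = pd.DataFrame(
    {
        "Tweets": [
            "@CityPowerJhb outage again #loadshedding #Eskom",
            "nothing to see here",
        ]
    }
)
result = extract_municipality_hashtags(tweets)
print(result)
```

The sketch returns only the first matching municipality per tweet and keeps hashtags in their original case. If the notebook expects lowercase hashtags or a different matching rule, the two helper functions are the places to change.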