taset.csv) data set Tokenize

Np Ms Office 365/Excel 2016 I Ntermed
1st Edition
ISBN:9781337508841
Author:Carey
Publisher:Carey
Chapter5: Working With Excel Tables, Pivottables, And Pivotcharts
Section: Chapter Questions
Problem 15RA
icon
Related questions
Question

Thanks in advance.

Subject Artificial Intellegent. Please provide me the code to do the following tasks by using (twitter_sexism_parsed_dataset.csv) data set

  1. Tokenize
  2. Clean
  3. Vectorize using
    1. CountVectorization
    2. TF-IDF
  4. Create new features (feature engineering)
    1. Length of text field
    2. Percentage of characters that are punctuation in the text
  5. Using RandomForest, classify text into sexism or none

Complete the table below

 

Model

Precision

Recall

Accuracy

1

CountVectorizer + Length

 

 

 

2

CountVectorizer + Length + %punct

 

 

 

3

CountVectorizer + Length + %punct

 

 

 

4

TF-IDF + Length

 

 

 

5

TF-IDF + Length + %punct

 

 

 

6

TF-IDF + Length + %punct

 

 

 

  1. Show all graphs
  2. Write one page description of your methodology.
Expert Solution
trending now

Trending now

This is a popular solution!

steps

Step by step

Solved in 2 steps

Blurred answer
Knowledge Booster
Database Environment
Learn more about
Need a deep-dive on the concept behind this application? Look no further. Learn more about this topic, computer-science and related others by exploring similar questions and additional content below.
Similar questions
  • SEE MORE QUESTIONS
Recommended textbooks for you
Np Ms Office 365/Excel 2016 I Ntermed
Np Ms Office 365/Excel 2016 I Ntermed
Computer Science
ISBN:
9781337508841
Author:
Carey
Publisher:
Cengage