(a) Write down the model using the coefficients from the model fit. log_odds(spam) = ___ + ___ to_multiple + ___ winner + ___ format + ___ re_subj (b) Suppose we have an observation where to_multiple=0, winner=1, format=0, and re_subj=0. What is the predicted probability that this message is spam?

Spam filters are built on principles similar to those used in logistic regression. We fit a probability that
each message is spam or not spam. We have several variables for each email. Here are a few:
to_multiple=1 if there are multiple recipients, winner=1 if the word 'winner' appears in the subject line,
format=1 if the email is poorly formatted, re_subj=1 if "re" appears in the subject line. A logistic model was
fit to a dataset with the following output:
               Estimate   SE       Z         Pr(>|Z|)
(Intercept)    -0.8166    0.0883   -9.248
to_multiple    -2.6583    0.3023   -8.7936
winner          1.5705    0.3178    4.9418
format         -0.0799    0.1232   -0.6485   0.5166
re_subj        -2.807     0.3652   -7.6862
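As a sanity check on how the output is read, the third number in each row should equal Estimate / SE (the Wald z statistic), and the one p-value that survives (format, 0.5166) should match a two-sided normal tail probability. The sketch below assumes the first row of numbers belongs to (Intercept) and the second to to_multiple; the names ROWS, z_statistic, and two_sided_p are illustrative only.

```python
import math

# (estimate, standard error, reported z) per term, as read from the output.
# Assumption: the first row of numbers is the intercept, the second to_multiple.
ROWS = {
    "(Intercept)": (-0.8166, 0.0883, -9.248),
    "to_multiple": (-2.6583, 0.3023, -8.7936),
    "winner": (1.5705, 0.3178, 4.9418),
    "format": (-0.0799, 0.1232, -0.6485),
    "re_subj": (-2.807, 0.3652, -7.6862),
}


def z_statistic(estimate: float, se: float) -> float:
    """Wald z statistic: the estimate divided by its standard error."""
    return estimate / se


def two_sided_p(z: float) -> float:
    """Two-sided p-value under a standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Recompute z for every term and compare with the reported value.
recomputed = {name: z_statistic(est, se) for name, (est, se, _) in ROWS.items()}
for name, (_, _, reported) in ROWS.items():
    print(f"{name:12s} recomputed z = {recomputed[name]:8.4f}   reported z = {reported}")

# p-value for the format term, the only one legible in the output.
p_format = two_sided_p(ROWS["format"][2])
print(f"format two-sided p = {p_format:.4f}")
```

If the recomputed values agree with the reported ones to rounding, the column assignment (Estimate, SE, Z, p-value) is very likely the intended one.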
(a) Write down the model using the coefficients from the model fit.
log_odds(spam) = ___ + ___ to_multiple + ___ winner + ___ format + ___ re_subj
(b) Suppose we have an observation where to_multiple=0, winner=1, format=0, and re_subj=0. What is the
predicted probability that this message is spam?
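One way to work both parts is sketched below. It assumes the estimates are assigned to terms in the order they appear in the output (intercept first), plugs them into the log-odds equation for part (a), and then applies the inverse logit, p = 1 / (1 + exp(-log_odds)), for part (b). The names COEF, log_odds, and predicted_probability are illustrative, not part of the original problem.

```python
import math

# Part (a): coefficients read from the Estimate column.
# Assumption: the first estimate is the intercept, the rest follow the row labels.
COEF = {
    "intercept": -0.8166,
    "to_multiple": -2.6583,
    "winner": 1.5705,
    "format": -0.0799,
    "re_subj": -2.807,
}


def log_odds(x: dict) -> float:
    """Linear predictor: intercept plus coefficient times indicator for each term."""
    return COEF["intercept"] + sum(COEF[name] * value for name, value in x.items())


def predicted_probability(x: dict) -> float:
    """Inverse logit of the linear predictor."""
    return 1.0 / (1.0 + math.exp(-log_odds(x)))


# Part (b): the observation described in the question.
obs = {"to_multiple": 0, "winner": 1, "format": 0, "re_subj": 0}
eta = log_odds(obs)
p_spam = predicted_probability(obs)
print(f"log-odds = {eta:.4f}, predicted P(spam) = {p_spam:.4f}")
```

With only winner switched on, the log-odds reduce to the intercept plus the winner coefficient, -0.8166 + 1.5705 = 0.7539, which corresponds to a predicted spam probability of roughly 0.68.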