
Problem 2: Naïve Bayes Classification
To reduce my email load, I decide to use a classifier to decide whether I should read an email or simply file it away. To train my model, I obtain the following data set of binary-valued features about each email, including whether I know the author, whether the email is long or short, and whether it contains any of several key words, along with my final decision about whether to read it (y = +1 for “read”, y = −1 for “discard”). Help me build the classifier.
Know author? | Is long? | Has “research”? | Has “grade”? | Has “lottery”? | Read?
X1 | X2 | X3 | X4 | X5 | Y
0 | 0 | 1 | 1 | 0 | -1
1 | 1 | 0 | 1 | 0 | -1
0 | 1 | 1 | 1 | 1 | -1
1 | 1 | 1 | 1 | 0 | -1
0 | 1 | 0 | 0 | 0 | -1
1 | 0 | 1 | 1 | 1 | 1
0 | 0 | 1 | 0 | 0 | 1
1 | 0 | 0 | 0 | 0 | 1
1 | 0 | 1 | 1 | 0 | 1
1 | 1 | 1 | 1 | 1 | -1
In the case of any ties, predict class +1. Use a naïve Bayes classifier to make the decisions.
Compute all the probabilities necessary for a naïve Bayes classifier, i.e., the prior probability p(Y) for both the classes and all the likelihoods p(Xi | Y), for each class Y and feature Xi.
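The counting involved can be sketched in Python. This is a minimal sketch rather than an official solution: it assumes plain maximum-likelihood estimates (relative frequencies with no smoothing, since the problem does not mention any), and the names `DATA`, `FEATURES`, `fit`, and `predict` are illustrative choices that do not come from the problem statement. Exact fractions are used so the results can be checked by hand against the table.

```python
from fractions import Fraction

# Illustrative feature names for X1..X5 (not part of the original problem).
FEATURES = ["know_author", "is_long", "has_research", "has_grade", "has_lottery"]

# Rows copied from the data table: (x1, x2, x3, x4, x5, y).
DATA = [
    (0, 0, 1, 1, 0, -1),
    (1, 1, 0, 1, 0, -1),
    (0, 1, 1, 1, 1, -1),
    (1, 1, 1, 1, 0, -1),
    (0, 1, 0, 0, 0, -1),
    (1, 0, 1, 1, 1, 1),
    (0, 0, 1, 0, 0, 1),
    (1, 0, 0, 0, 0, 1),
    (1, 0, 1, 1, 0, 1),
    (1, 1, 1, 1, 1, -1),
]


def fit(data):
    """Estimate p(Y) and p(Xi = 1 | Y) by relative frequency (assumed: no smoothing)."""
    priors, likelihoods = {}, {}
    for y in sorted({row[-1] for row in data}):
        rows = [row[:-1] for row in data if row[-1] == y]
        priors[y] = Fraction(len(rows), len(data))
        # zip(*rows) iterates over feature columns; each sum counts the ones.
        likelihoods[y] = [Fraction(sum(col), len(rows)) for col in zip(*rows)]
    return priors, likelihoods


def predict(x, priors, likelihoods):
    """Return the class with the larger p(Y) * prod_i p(Xi | Y); ties go to +1."""
    scores = {}
    for y in priors:
        score = priors[y]
        for xi, p1 in zip(x, likelihoods[y]):
            # p(Xi = 0 | Y) is the complement of p(Xi = 1 | Y) for binary features.
            score *= p1 if xi == 1 else 1 - p1
        scores[y] = score
    return 1 if scores[1] >= scores[-1] else -1


priors, likelihoods = fit(DATA)
for y in (1, -1):
    print(f"p(Y={y:+d}) = {priors[y]}")
    for name, p1 in zip(FEATURES, likelihoods[y]):
        print(f"  p({name}=1 | Y={y:+d}) = {p1}")
```

Each p(Xi = 0 | Y) is one minus the corresponding printed value. Note that an unsmoothed estimate can be exactly zero: no email in the +1 class is long, so any long email gets a score of zero for +1 under this sketch.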






