Homework 3: Classification (CS 5810)

 March 22, 2021

Total Points: 100

Submission Deadline:

Submission Guideline: Submit your answers in a .docx or .pdf file (do not submit your answers in any other format). Insert tables and figures in your answer where appropriate. You may work on the homework only with your team member, but you must submit the work individually through Blackboard. Remember to put your team member’s name at the top of the document and to include a peer evaluation paragraph at the end. In the peer evaluation, describe your team member’s contribution to the homework. If you wish to work alone on this homework even though you have a team, you are welcome to do so. In that case, you do not need to add a peer evaluation.

Problem 2: Naïve Bayes Classification [40]

To reduce my email load, I decide to use a classifier to determine whether I should read an email or simply file it away. To train my model, I obtain the following data set of binary-valued features about each email, including whether I know the author or not, whether the email is long or short, and whether it has any of several key words, along with my final decision about whether to read it (y = +1 for “read”, y = −1 for “discard”). Help me build the classifier.

Know author? | Is long? | Has “research”? | Has “grade”? | Has “lottery”? | Read?

In the case of any ties, predict class +1. Use a naïve Bayes classifier to make the decisions.

  • Compute all the probabilities necessary for a naïve Bayes classifier, i.e., the prior probability p(Y) for both classes and the likelihoods p(Xi | Y) for each class Y and feature Xi. [15 points]
  • Which class would be predicted for X = (0 0 0 0 0)? What about for X = (1 1 0 1 0)? [10 points]
  • Compute the posterior probability that Y = +1 given the observation X = (1 1 0 1 0). [5 points]
  • Suppose that, before we make our predictions, we lose access to my address book, so that we cannot tell whether the email author is known. Should we re-train the model, and if so, how? (E.g., how do the model and its parameters change in this new situation?) Hint: what will the naïve Bayes model over only features X2 . . . X5 look like, and what will its parameters be? [10 points]
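
As a rough illustration of the computations the parts above ask for, here is a minimal Python sketch of a naïve Bayes classifier over binary features. It is only a sketch under stated assumptions: the rows in ROWS are made-up placeholders and are not the data set from this assignment, the function names (fit, joint, predict, posterior_pos) are hypothetical, and probabilities are estimated by plain relative frequency with no smoothing. Features are indexed from 0 in the code, so index 0 corresponds to X1 (“Know author?”).

```python
from fractions import Fraction

# Hypothetical placeholder rows, NOT the assignment's data set.
# Columns: know_author, is_long, has_research, has_grade, has_lottery, label
ROWS = [
    (1, 0, 1, 0, 0, +1),
    (1, 1, 0, 1, 0, +1),
    (0, 0, 1, 1, 0, +1),
    (0, 1, 0, 0, 1, -1),
    (0, 0, 0, 0, 1, -1),
    (1, 1, 0, 0, 0, -1),
]


def fit(rows, features):
    """Estimate p(Y) and p(X_i = 1 | Y) by relative frequency (no smoothing)."""
    prior, likelihood = {}, {}
    for y in (+1, -1):
        in_class = [r for r in rows if r[-1] == y]
        prior[y] = Fraction(len(in_class), len(rows))
        likelihood[y] = {
            i: Fraction(sum(r[i] for r in in_class), len(in_class))
            for i in features
        }
    return prior, likelihood


def joint(x, y, prior, likelihood):
    """p(Y = y) * prod_i p(X_i = x_i | Y = y) over the model's features."""
    p = prior[y]
    for i, p_one in likelihood[y].items():
        p *= p_one if x[i] == 1 else 1 - p_one
    return p


def predict(x, prior, likelihood):
    """Return the class with the larger joint probability; ties go to +1."""
    pos = joint(x, +1, prior, likelihood)
    neg = joint(x, -1, prior, likelihood)
    return +1 if pos >= neg else -1


def posterior_pos(x, prior, likelihood):
    """p(Y = +1 | X = x) by Bayes' rule; assumes the two joints are not both 0."""
    pos = joint(x, +1, prior, likelihood)
    neg = joint(x, -1, prior, likelihood)
    return pos / (pos + neg)


# Full model over all five features, and a reduced model that ignores
# feature 0 ("know author"), fitted from the same rows.
full_prior, full_lik = fit(ROWS, range(5))
reduced_prior, reduced_lik = fit(ROWS, range(1, 5))
```

With these estimates, predict((1, 1, 0, 1, 0), full_prior, full_lik) classifies an observation with the full model, and passing reduced_prior and reduced_lik instead scores the same observation while ignoring the first feature. Because each p(Xi | Y) is estimated separately in this sketch, the reduced model's priors and remaining likelihoods come out identical to those of the full model.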