Given the data sets shown in Figure 5.6, explain how the decision tree, naïve Bayes, and k-nearest neighbor classifiers would perform on these data sets.

(a) Both the decision tree and naïve Bayes (NB) will do well on this data
set, because the distinguishing attributes have better discriminating
power than the noise attributes in terms of entropy gain and conditional
probability. k-NN will not do as well because of the relatively large
number of noise attributes, which distort its distance computations.
(b) NB will not work at all with this data set because the attributes are
dependent, which violates its conditional independence assumption.
The other two classifiers will do better than NB.

(c) NB will do very well on this data set, because each discriminating
attribute has a higher conditional probability in one class than in the
other, and the overall classification is done by multiplying these
individual conditional probabilities. The decision tree will not do as
well because of the relatively large number of distinguishing attributes,
and it will tend to overfit. k-NN will do reasonably well.
(d) k-NN will do well on this data set. A decision tree will also work, but
it will be fairly large. Its first few splits will be quite random, because
the tree-growing algorithm may not find a good split at the start.
NB will not perform quite as well because of the attribute dependency.
(e) k-NN will do well on this data set. A decision tree will also work, but
it will be large. If the decision tree uses oblique splits instead of only
vertical and horizontal (axis-parallel) splits, the resulting tree will
be more compact and highly accurate. NB will not perform quite as well
because of the attribute dependency.
(f) k-NN works best. NB does not work well for this data set because of
attribute dependency. The decision tree will need to be large in order
to approximate the circular decision boundaries.
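
The attribute-dependency argument used in several of the answers above can be checked on a small example. The sketch below is only an illustration: it uses a hypothetical XOR-style toy data set (not the data from Figure 5.6), and the helper names `train_naive_bayes`, `naive_bayes_scores`, and `one_nn_predict` are made up for this example. It assumes two binary attributes whose combination determines the class, so that neither attribute is informative on its own.

```python
from collections import Counter, defaultdict

# Hypothetical XOR-style toy data (NOT the data from Figure 5.6):
# the class is 1 exactly when the two binary attributes differ.
DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)] * 5


def train_naive_bayes(data):
    """Estimate class priors and per-attribute value counts for each class."""
    class_counts = Counter(label for _, label in data)
    value_counts = defaultdict(Counter)  # (attribute index, label) -> value counts
    for x, label in data:
        for i, value in enumerate(x):
            value_counts[(i, label)][value] += 1
    priors = {c: n / len(data) for c, n in class_counts.items()}
    return priors, value_counts, class_counts


def naive_bayes_scores(model, x):
    """Return P(class) times the product of P(attribute value | class)."""
    priors, value_counts, class_counts = model
    scores = {}
    for c, prior in priors.items():
        score = prior
        for i, value in enumerate(x):
            score *= value_counts[(i, c)][value] / class_counts[c]
        scores[c] = score
    return scores


def one_nn_predict(data, x):
    """Predict the label of the closest training point (squared Euclidean distance)."""
    def dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(data, key=lambda item: dist(item[0], x))[1]


model = train_naive_bayes(DATA)
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, naive_bayes_scores(model, x), one_nn_predict(DATA, x))
```

On this toy data every per-attribute conditional probability is 0.5 in both classes, so the NB product is the same for each class and NB cannot separate them, while the nearest-neighbor rule recovers the label of every training point. The same reasoning, under these assumptions, is why NB is expected to suffer whenever the class depends on a combination of attributes rather than on each attribute individually.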

Computer Science & Information Technology
