The likelihood ratio statistic.

Consider a training set that contains 100 positive examples and 400 negative
examples. For each of the following candidate rules,
R1: A → + (covers 4 positive and 1 negative examples),
R2: B → + (covers 30 positive and 10 negative examples),
R3: C → + (covers 100 positive and 90 negative examples),
determine which is the best and worst candidate rule according to the likelihood ratio statistic.

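For R1, which covers 5 examples, the expected frequency for the positive class is 5 × 100/500 = 1 and the expected frequency for the negative class is 5 × 400/500 = 4. Therefore, the likelihood ratio for R1, computed as 2 × Σ f_i × log2(f_i/e_i) over the observed frequencies f_i and expected frequencies e_i of the two classes (taking logarithms to base 2), is

2 × [4 × log2(4/1) + 1 × log2(1/4)] = 2 × [8 − 2] = 12.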

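For R2, which covers 40 examples, the expected frequency for the positive class is 40 × 100/500 = 8 and the expected frequency for the negative class is 40 × 400/500 = 32. Therefore, the likelihood ratio for R2 (taking logarithms to base 2) is

2 × [30 × log2(30/8) + 10 × log2(10/32)] ≈ 80.85.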

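For R3, which covers 190 examples, the expected frequency for the positive class is 190 × 100/500 = 38 and the expected frequency for the negative class is 190 × 400/500 = 152. Therefore, the likelihood ratio for R3 (taking logarithms to base 2) is

2 × [100 × log2(100/38) + 90 × log2(90/152)] ≈ 143.09.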

Therefore, R3 is the best candidate rule and R1 is the worst according to the likelihood ratio statistic.
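
The arithmetic for the three rules can be checked with a short script. This is a minimal sketch, assuming the statistic is 2 × Σ f_i × log2(f_i/e_i) with base-2 logarithms; the function name likelihood_ratio and the rules dictionary are illustrative names, not part of the original exercise.

```python
import math


def likelihood_ratio(covered_pos, covered_neg, total_pos, total_neg):
    """Return 2 * sum(f_i * log2(f_i / e_i)) for a rule (base 2 is an assumption)."""
    covered = covered_pos + covered_neg
    total = total_pos + total_neg
    observed = [covered_pos, covered_neg]
    # Expected frequencies if the rule's predictions were random draws
    # from the training set's class distribution.
    expected = [covered * total_pos / total, covered * total_neg / total]
    # A class with zero observed examples contributes nothing to the sum.
    return 2 * sum(
        f * math.log2(f / e) for f, e in zip(observed, expected) if f > 0
    )


# (positive, negative) examples covered by each candidate rule.
rules = {"R1": (4, 1), "R2": (30, 10), "R3": (100, 90)}
scores = {
    name: likelihood_ratio(pos, neg, total_pos=100, total_neg=400)
    for name, (pos, neg) in rules.items()
}

best = max(scores, key=scores.get)
worst = min(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: {score:.2f}")
print(f"best: {best}, worst: {worst}")
```

Changing the logarithm base only rescales every score by the same positive constant, so the ranking of the rules would be unchanged under a different base.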
