Consider the data set shown in Table 7.8. The first attribute is continuous, while the remaining two attributes are asymmetric binary. A rule is considered to be strong if its support exceeds 15% and its confidence exceeds 60%. The data given in Table 7.8 supports the following two strong rules:

(i) {(1 ≤ A ≤ 2), B = 1} → {C = 1}

(ii) {(5 ≤ A ≤ 8), B = 1} → {C = 1}



(a) Compute the support and confidence for both rules.

(b) To find the rules using the traditional Apriori algorithm, we need to discretize the continuous attribute A. Suppose we apply the equal width binning approach to discretize the data, with bin-width = 2, 3, 4. For each bin-width, state whether the above two rules are discovered by the Apriori algorithm. (Note that the rules may not be in the same exact form as before because they may contain wider or narrower intervals for A.) For each rule that corresponds to one of the above two rules, compute its support and confidence.

(c) Comment on the effectiveness of using the equal width approach for classifying the above data set. Is there a bin-width that allows you to find both rules satisfactorily? If not, what alternative approach can you take to ensure that you will find both rules?

(a) s({(1 ≤ A ≤ 2), B = 1} → {C = 1}) = 1/6

c({(1 ≤ A ≤ 2), B = 1} → {C = 1}) = 1

s({(5 ≤ A ≤ 8), B = 1} → {C = 1}) = 1/6

c({(5 ≤ A ≤ 8), B = 1} → {C = 1}) = 1
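The support and confidence figures above can be checked mechanically. The sketch below is only an illustration: Table 7.8 is not reproduced in this text, so `RECORDS` is a hypothetical 12-row data set chosen to be consistent with the values quoted in the answer, and `rule_stats` is a helper name introduced here.

```python
from fractions import Fraction

# Hypothetical (A, B, C) records; NOT a reproduction of Table 7.8.
RECORDS = [
    (1, 1, 1), (2, 1, 1), (3, 1, 0), (4, 1, 0),
    (5, 1, 1), (6, 0, 1), (7, 0, 0), (8, 1, 1),
    (9, 0, 0), (10, 0, 0), (11, 0, 0), (12, 0, 1),
]


def rule_stats(records, lo, hi):
    """Support and confidence of {lo <= A <= hi, B = 1} -> {C = 1}."""
    # Records that satisfy the antecedent of the rule.
    antecedent = [r for r in records if lo <= r[0] <= hi and r[1] == 1]
    # Records that satisfy both antecedent and consequent.
    both = [r for r in antecedent if r[2] == 1]
    support = Fraction(len(both), len(records))
    # Confidence is undefined when no record matches the antecedent.
    confidence = Fraction(len(both), len(antecedent)) if antecedent else None
    return support, confidence


print(rule_stats(RECORDS, 1, 2))  # rule (i)
print(rule_stats(RECORDS, 5, 8))  # rule (ii)
```

Support is the fraction of all records that satisfy the whole rule; confidence is the fraction of antecedent-matching records that also satisfy the consequent.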


(b) When bin-width = 2, attribute A is discretized into the following intervals:

A1: 1 ≤ A ≤ 2; A2: 3 ≤ A ≤ 4;

A3: 5 ≤ A ≤ 6; A4: 7 ≤ A ≤ 8;

A5: 9 ≤ A ≤ 10; A6: 11 ≤ A ≤ 12
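The interval boundaries for any bin-width can be generated rather than written out by hand. This is a minimal sketch assuming A takes integer values from 1 to 12, as the intervals listed above suggest; `equal_width_bins` and `bin_index` are names introduced here for illustration.

```python
def equal_width_bins(lo, hi, width):
    """Return inclusive (start, end) intervals covering the integers lo..hi."""
    return [(s, min(s + width - 1, hi)) for s in range(lo, hi + 1, width)]


def bin_index(a, lo, width):
    """1-based index of the equal-width bin that contains the value a."""
    return (a - lo) // width + 1


for width in (2, 3, 4):
    print(width, equal_width_bins(1, 12, width))
```

With bin-width 2, the values A = 5 and A = 8 fall into different bins (A3 and A4), which is why the interval 5 ≤ A ≤ 8 of the second rule is split across two discretized rules.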


For the first rule, there is one corresponding rule:

{A1 = 1, B = 1} → {C = 1}

s({A1 = 1, B = 1} → {C = 1}) = 1/6

c({A1 = 1, B = 1} → {C = 1}) = 1

Since the support and confidence are greater than the thresholds, the rule can be discovered.


For the second rule, there are two corresponding rules:

{A3 = 1, B = 1} → {C = 1}

{A4 = 1, B = 1} → {C = 1}

For both rules, the support is 1/12 and the confidence is 1. Since the support is less than the threshold (15%), these rules cannot be discovered.
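The same check can be repeated for each bin-width in part (b). The sketch below is an illustration under stated assumptions: `RECORDS` is a hypothetical 12-row data set consistent with the values quoted above (Table 7.8 itself is not reproduced in this text), A is assumed to range over the integers 1 to 12, and `strong_bins` is a helper introduced here. It only evaluates the per-bin rule {A in bin, B = 1} → {C = 1} against the two thresholds; it is not an implementation of Apriori.

```python
from fractions import Fraction

MIN_SUPPORT = Fraction(15, 100)     # support must exceed 15%
MIN_CONFIDENCE = Fraction(60, 100)  # confidence must exceed 60%

# Hypothetical (A, B, C) records; NOT a reproduction of Table 7.8.
RECORDS = [
    (1, 1, 1), (2, 1, 1), (3, 1, 0), (4, 1, 0),
    (5, 1, 1), (6, 0, 1), (7, 0, 0), (8, 1, 1),
    (9, 0, 0), (10, 0, 0), (11, 0, 0), (12, 0, 1),
]


def strong_bins(records, width, lo=1, hi=12):
    """List the bins whose rule {A in bin, B = 1} -> {C = 1} is strong."""
    found = []
    for start in range(lo, hi + 1, width):
        end = min(start + width - 1, hi)
        antecedent = [r for r in records
                      if start <= r[0] <= end and r[1] == 1]
        if not antecedent:
            continue  # confidence is undefined for an empty antecedent
        hits = sum(1 for r in antecedent if r[2] == 1)
        support = Fraction(hits, len(records))
        confidence = Fraction(hits, len(antecedent))
        if support > MIN_SUPPORT and confidence > MIN_CONFIDENCE:
            found.append(((start, end), support, confidence))
    return found


for width in (2, 3, 4):
    print(width, strong_bins(RECORDS, width))
```

Narrow bins can split a rule's interval so that each piece falls below the support threshold, while wide bins can merge in records that lower the confidence, so the outcome depends on how the bin edges align with the rule intervals.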

Computer Science & Information Technology
