Consider the data set shown in Table 7.8. The first attribute is continuous, while the remaining two attributes are asymmetric binary. A rule is considered to be strong if its support exceeds 15% and its confidence exceeds 60%. The data given in Table 7.8 supports the following two strong rules:
(i) {(1 ≤ A ≤ 2), B = 1} → {C = 1}
(ii) {(5 ≤ A ≤ 8), B = 1} → {C = 1}
(a) Compute the support and confidence for both rules.
(b) To find the rules using the traditional Apriori algorithm, we need to
discretize the continuous attribute A. Suppose we apply the equal width
binning approach to discretize the data, with bin-width = 2, 3, 4. For
each bin-width, state whether the above two rules are discovered by
the Apriori algorithm. (Note that the rules may not be in the same
exact form as before because they may contain wider or narrower intervals
for A.) For each rule that corresponds to one of the above two rules,
compute its support and confidence.
(c) Comment on the effectiveness of using the equal width approach for
classifying the above data set. Is there a bin-width that allows you to
find both rules satisfactorily? If not, what alternative approach can you
take to ensure that you will find both rules?
(a) s({(1 ≤ A ≤ 2), B = 1} → {C = 1}) = 1/6
c({(1 ≤ A ≤ 2), B = 1} → {C = 1}) = 1
s({(5 ≤ A ≤ 8), B = 1} → {C = 1}) = 1/6
c({(5 ≤ A ≤ 8), B = 1} → {C = 1}) = 1
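The counts behind these values can be sketched in a few lines of Python. Table 7.8 is not reproduced here, so the `ROWS` list below holds placeholder (A, B, C) records chosen only to be consistent with the values reported above; the function name `rule_stats` is likewise an assumption made for illustration.

```python
from fractions import Fraction

# Placeholder (A, B, C) records for illustration only.
# They stand in for Table 7.8, which is not reproduced in this text.
ROWS = [
    (1, 1, 1), (2, 1, 1), (3, 1, 0), (4, 1, 0),
    (5, 1, 1), (6, 0, 1), (7, 0, 0), (8, 1, 1),
    (9, 0, 0), (10, 0, 0), (11, 0, 0), (12, 0, 1),
]

def rule_stats(rows, lo, hi):
    """Support and confidence of {lo <= A <= hi, B = 1} -> {C = 1}."""
    # Records that satisfy the antecedent of the rule.
    antecedent = [r for r in rows if lo <= r[0] <= hi and r[1] == 1]
    # Records that satisfy both antecedent and consequent.
    both = [r for r in antecedent if r[2] == 1]
    support = Fraction(len(both), len(rows))
    # Confidence is undefined when no record matches the antecedent.
    confidence = Fraction(len(both), len(antecedent)) if antecedent else None
    return support, confidence

print(rule_stats(ROWS, 1, 2))
print(rule_stats(ROWS, 5, 8))
```

Support divides by the total number of records, while confidence divides by the number of records matching the antecedent, which is why a rule can have confidence 1 and still have low support.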
(b) When bin-width = 2:
where
A1 = (1 ≤ A ≤ 2); A2 = (3 ≤ A ≤ 4);
A3 = (5 ≤ A ≤ 6); A4 = (7 ≤ A ≤ 8);
A5 = (9 ≤ A ≤ 10); A6 = (11 ≤ A ≤ 12);
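As a rough sketch of how such bins can be generated, the helper below maps a value of A to the bounds of its equal-width bin. It assumes A takes integer values and that the first bin starts at 1, as in the list above; the name `bin_bounds` and the `start` parameter are hypothetical.

```python
def bin_bounds(a, width, start=1):
    """Return (low, high) of the equal-width bin that contains integer a.

    Assumes integer-valued A with the first bin beginning at `start`.
    """
    index = (a - start) // width      # zero-based bin index
    low = start + index * width       # lower bound of that bin
    return low, low + width - 1       # inclusive upper bound

# With width 2, the values 1..12 fall into the six bins A1..A6.
print(sorted({bin_bounds(a, 2) for a in range(1, 13)}))
```

Changing `width` to 3 or 4 gives the coarser partitions asked for in part (b).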
For the first rule, there is one corresponding rule:
{A1 = 1, B = 1} → {C = 1}
s({A1 = 1, B = 1} → {C = 1}) = 1/6
c({A1 = 1, B = 1} → {C = 1}) = 1
Since the support and confidence are greater than the thresholds, the
rule can be discovered.
For the second rule, there are two corresponding rules:
{A3 = 1, B = 1} → {C = 1}
{A4 = 1, B = 1} → {C = 1}
For both rules, the support is 1/12 and the confidence is 1. Since
the support is less than the threshold (15%), these rules cannot be discovered.
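The threshold comparison used in this part can be sketched as follows. The 15% and 60% thresholds come from the problem statement; the function name `is_strong` is an assumption for illustration, and "exceeds" is read here as a strict inequality.

```python
from fractions import Fraction

MIN_SUPPORT = Fraction(15, 100)     # support must exceed 15%
MIN_CONFIDENCE = Fraction(60, 100)  # confidence must exceed 60%

def is_strong(support, confidence):
    """True if a rule passes both thresholds (strict inequalities)."""
    return support > MIN_SUPPORT and confidence > MIN_CONFIDENCE

# Rule on bin A1: support 1/6, confidence 1.
print(is_strong(Fraction(1, 6), Fraction(1)))
# Rules on bins A3 and A4: support 1/12, confidence 1.
print(is_strong(Fraction(1, 12), Fraction(1)))
```

Exact fractions are used so that the comparison with 15% is not affected by floating-point rounding.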