Consider the data set shown in Table 7.13. Suppose we are interested in extracting the following association rule:
{α1 ≤ Age ≤ α2, Play Piano = Yes} → {Enjoy Classical Music = Yes}
To handle the continuous attribute, we apply the equal-frequency approach
with 3, 4, and 6 intervals. Categorical attributes are handled by introducing
as many new asymmetric binary attributes as the number of categorical
values. Assume that the support threshold is 10% and the confidence threshold
is 70%.
(a) Suppose we discretize the Age attribute into 3 equal-frequency intervals.
Find a pair of values for α1 and α2 that satisfy the minimum support
and minimum confidence requirements.
(b) Repeat part (a) by discretizing the Age attribute into 4 equal-frequency
intervals. Compare the extracted rules against the ones you had
obtained in part (a).
(c) Repeat part (a) by discretizing the Age attribute into 6 equal-frequency
intervals. Compare the extracted rules against the ones you had
obtained in part (a).
(d) From the results in parts (a), (b), and (c), discuss how the choice of
discretization intervals will affect the rules extracted by association rule
mining algorithms.
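
The procedure in parts (a) through (c) can be sketched in Python as follows. This is only an illustration under stated assumptions: Table 7.13 is not reproduced here, so RECORDS is a made-up toy data set, and the function names (equal_frequency_bins, rule_stats, candidate_rules) are hypothetical helpers, not part of any library. The sketch assumes all Age values are distinct, so that equal-frequency bin boundaries are unambiguous, and it considers only ranges [α1, α2] formed by merging adjacent intervals.

```python
# Toy records (HYPOTHETICAL, not Table 7.13): age, plays piano, enjoys classical music.
RECORDS = [
    {"age": 10, "piano": True,  "classical": False},
    {"age": 15, "piano": False, "classical": False},
    {"age": 20, "piano": True,  "classical": True},
    {"age": 25, "piano": True,  "classical": True},
    {"age": 30, "piano": False, "classical": True},
    {"age": 35, "piano": True,  "classical": True},
    {"age": 40, "piano": True,  "classical": False},
    {"age": 45, "piano": False, "classical": False},
    {"age": 50, "piano": True,  "classical": True},
    {"age": 55, "piano": False, "classical": True},
    {"age": 60, "piano": True,  "classical": False},
    {"age": 65, "piano": False, "classical": False},
]


def equal_frequency_bins(values, k):
    """Return k (low, high) intervals holding (nearly) equal numbers of values.

    Assumes the values are distinct; ties across a boundary are not handled.
    """
    ordered = sorted(values)
    n = len(ordered)
    bins = []
    for i in range(k):
        start = i * n // k          # index of the first value in bin i
        end = (i + 1) * n // k      # one past the last value in bin i
        bins.append((ordered[start], ordered[end - 1]))
    return bins


def rule_stats(records, low, high):
    """Support and confidence of
    {low <= Age <= high, Play Piano = Yes} -> {Enjoy Classical Music = Yes}."""
    antecedent = [r for r in records if low <= r["age"] <= high and r["piano"]]
    both = [r for r in antecedent if r["classical"]]
    support = len(both) / len(records)
    confidence = len(both) / len(antecedent) if antecedent else 0.0
    return support, confidence


def candidate_rules(records, k, minsup=0.10, minconf=0.70):
    """List (low, high, support, confidence) for every range built from
    one or more adjacent equal-frequency intervals that meets both thresholds."""
    bins = equal_frequency_bins([r["age"] for r in records], k)
    found = []
    for i in range(k):
        for j in range(i, k):
            low, high = bins[i][0], bins[j][1]
            support, confidence = rule_stats(records, low, high)
            if support >= minsup and confidence >= minconf:
                found.append((low, high, support, confidence))
    return found


if __name__ == "__main__":
    for k in (3, 4, 6):
        print(k, candidate_rules(RECORDS, k))
```

Running the sketch with 3, 4, and 6 intervals on the real table gives the raw material for part (d): the set of qualifying (α1, α2) pairs generally changes with the interval width, since wide intervals can dilute confidence while narrow ones can fall below the support threshold.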