For the Partition algorithm, prove that any frequent itemset in the database must appear as a local frequent itemset in at least one partition.

What will be an ideal response?

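We can prove this by contradiction.
Assume the database has M transactions split into N disjoint partitions;
wlog each partition contains M/N transactions (the same argument works for unequal partition sizes).
Let I be a frequent itemset with support S,
where S * M = number of transactions containing I.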

Since I is a frequent itemset, S >= min_support,
or equivalently, S * M >= min_support * M.

Now assume that I is not frequent within any of the N partitions Pi,
i.e., the support Si within each partition Pi is < min_support, or
equivalently Si * M/N < min_support * M/N for every i.
Summing over all N partitions gives:
```
(S1 * M/N) + (S2 * M/N) + ... + (SN * M/N) < N * (min_support * M/N)

(S1 * M/N) + (S2 * M/N) + ... + (SN * M/N) < min_support * M
```

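Because the partitions are disjoint and together cover the whole database,
the left-hand side is the total number of transactions containing I, i.e., S * M.
So S * M < min_support * M, which contradicts the fact that the support of
itemset I is >= min_support, or equivalently that the number of transactions
containing I is >= min_support * M. Hence I must be frequent in at least one partition.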
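
The claim can also be checked by brute force on a small example. The following is only an illustrative sketch, not part of the Partition algorithm itself: the toy database, the helper names (support, split_into_partitions, check_partition_property), and the choice of 4 partitions are all assumptions made for this example. Exact fractions are used so that the comparisons against min_support are not affected by floating-point rounding.

```python
from fractions import Fraction
from itertools import combinations


def support(itemset, transactions):
    """Exact fraction of transactions that contain every item in itemset."""
    if not transactions:
        return Fraction(0)
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))


def split_into_partitions(transactions, n):
    """Split the list into n contiguous, disjoint chunks that cover all of it."""
    size, extra = divmod(len(transactions), n)
    parts, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        parts.append(transactions[start:end])
        start = end
    return parts


def check_partition_property(transactions, n, min_support):
    """Return the globally frequent itemsets, asserting that each one is
    locally frequent in at least one of the n partitions."""
    partitions = split_into_partitions(transactions, n)
    items = sorted(set().union(*transactions))
    frequent = []
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            itemset = frozenset(combo)
            if support(itemset, transactions) >= min_support:
                local = [support(itemset, p) for p in partitions]
                assert any(s >= min_support for s in local), itemset
                frequent.append(itemset)
    return frequent


# Hypothetical toy database, for illustration only.
database = [
    {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"},
    {"a"}, {"b"}, {"a", "b"}, {"c"},
]
min_support = Fraction(1, 2)

frequent_itemsets = check_partition_property(database, 4, min_support)
print(sorted(sorted(s) for s in frequent_itemsets))
```

A passing run is only a sanity check on this one dataset; the proof above is what establishes the property in general.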

Computer Science & Information Technology
