For the definition of SNN similarity provided by Algorithm 9.10, the
calculation of SNN distance does not take into account the position of shared
neighbors in the two nearest neighbor lists. In other words, it might be
desirable to give higher similarity to two points that share the same nearest
neighbors in the same or roughly the same order.
(a) Describe how you might modify the definition of SNN similarity to give
higher similarity to points whose shared neighbors are in roughly the
same order.
(b) Discuss the advantages and disadvantages of such a modification.
(a) This can be done by assigning weights to the points based on their
position in the nearest neighbor list. For example, we can weight the
ith point in a nearest neighbor list of length n by n − i + 1, so that
the closest neighbor receives the largest weight. For each shared
neighbor, we then take the sum or product of its weights on the two
lists. These values are then summed over all shared neighbors to compute
the similarity between the two objects. This approach was suggested by
Jarvis and Patrick [5].
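The weighting scheme described in (a) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name weighted_snn_similarity, the combine parameter, and the representation of each nearest neighbor list as a Python list ordered from closest to farthest are all hypothetical choices made here, not part of Algorithm 9.10 or of the original Jarvis and Patrick formulation.

```python
def weighted_snn_similarity(neighbors_p, neighbors_q, combine="product"):
    """Rank-weighted SNN similarity between two points (illustrative sketch).

    neighbors_p, neighbors_q: nearest neighbor lists of the two points,
    ordered from closest to farthest and of equal length n (assumption).
    combine: "sum" or "product", how the two weights of a shared
    neighbor are combined (hypothetical parameter name).
    """
    n = len(neighbors_p)
    if len(neighbors_q) != n:
        raise ValueError("both neighbor lists must have the same length")
    if combine not in ("sum", "product"):
        raise ValueError("combine must be 'sum' or 'product'")

    # The i-th neighbor (1-based) gets weight n - i + 1.
    # enumerate() is 0-based, so the weight is n - index.
    weight_p = {pt: n - idx for idx, pt in enumerate(neighbors_p)}
    weight_q = {pt: n - idx for idx, pt in enumerate(neighbors_q)}

    total = 0
    # Only neighbors that appear on both lists contribute.
    for pt in weight_p.keys() & weight_q.keys():
        if combine == "sum":
            total += weight_p[pt] + weight_q[pt]
        else:
            total += weight_p[pt] * weight_q[pt]
    return total


if __name__ == "__main__":
    same_order = weighted_snn_similarity(list("abcd"), list("abcd"))
    reversed_order = weighted_snn_similarity(list("abcd"), list("dcba"))
    print(same_order, reversed_order)
```

With the product rule, two lists that share all four neighbors in the same order score higher than two lists that share them in reversed order. With the sum rule, both cases score the same, because the sum only reflects how highly each shared neighbor is ranked on each list, not whether the two ranks agree. If agreement in order is the goal, the product is the more suitable of the two in this sketch.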
(b) Such an approach is more complex than simply counting shared
neighbors. However, it is advantageous when two objects should be
considered more similar if their shared neighbors have roughly the same
rank on both lists. Furthermore, it may also help to compensate for
arbitrariness in the choice of k.