Estimate the number of page transfers needed to compute r s using a sort-merge join, assuming the following: A=B
1. The size of r is 1000 pages, 10 tuples per page; the size of s is 500 pages, 20 tuples per page.
2. The size of the main memory bu?er for this join computation is 10 pages.
3. The Cartesian product of matching tuples in r and s (see Figure 10.7) is computed using a block-nested loops join.
4. r.A has 100 distinct values and s.B has 50 distinct values. These values are spread around the ?les more or less evenly, so the size of ?A=c(r), where c r.A, does not vary much with c.
1. External sort of relation r
2 × 1, 000(log9|1, 000|) = 8, 000 pages
2. External sort of relation s
2 × 500(log9|500|) = 3, 000 pages
3. The merge step
First, clearly, we need to scan both r and s, which costs us 1,500 page transfers.
For each value of the attribute A, the number of pages of r that contain tuples that match that value is 1,000/100=10. Similarly, for each value of B, the number of pages of s that contain tuples that match that value is 500/50=10. Thus, each time we have a match, we compute a Cartesian product of two 10-page relations. Using block-nested loops with 10 bu?er pages requires that we scan the r-part once and the s-part twice for a total of 30 pages. However, one scan of r and s needed to compute the produce is accounted for in the aforesaid initial scan (the one that takes 1,500 page transfers). So, the extra cost is only 10 pages needed to scan the s-part the second time.
The output has 10 10 = 100 pages, so for each match computing the Cartesian product takes 110 page transfers. The question is how many matches we can have. Since A has 100 values and B has 50, we can have at most 50 matches, so computing and outputting the joined tuples will thus take at most 50 × 110 = 5, 500 page transfers. Together we get:
1, 000 + 500 + 5, 500 = 7, 000 pages
4. Thus, the total cost is
6, 000 + 2, 000 + 7, 000 = 15, 000 pages
You might also like to view...
In order to install Active Directory on a Windows Server 2008 R2, there must be additional disk space of _________________________
a. 100 MB for the Active Directory database and SYSVOL folder, plus at least 100 MB for the transaction log files b. 200 MB for the Active Directory database and SYSVOL folder, plus at least 100 MB for the transaction log files c. 300 MB for the Active Directory database and SYSVOL folder, plus at least 100 MB for the transaction log files d. 500 MB for the Active Directory database and SYSVOL folder, plus at least 100 MB for the transaction log files.
Considering eigenvectors of the normalized Laplacian as the graph harmonics, answer the following:
(a) Plot graph signal f defined in Problem 23(c) for c = 0.1. Compute and plot the GFT coefficients by taking eigenvectors of the normalized Laplacian as the graph Fourier basis. Compare the result with that of eigenvectors of the Laplacian as graph Fourier basis. (b) Plot the graph signal translated to node 6. (Translation is defined considering normalized Laplacian as graph Fourier basis.)