Consider the following schema, where the keys are underlined (di?erent keys are underlined di?erently):
Professor(Id, Name, Department)
Course(CrsCode, Department, CrsName)
Teaching(ProfId, CrsCode, Semester)
Consider the following query:
SELECT C.CrsName, P.Name
FROM Professor P, Teaching T, Course C
WHERE T.Semester=’F1995’ AND P.Department=’CS’
AND P.Id = T.ProfId AND T.CrsCode=C.CrsCode
Assume the following statistical information:
1. 1000 tuples with 10 tuples per page in Professor relation
2. 20,000 tuples with 10 tuples per page in Teaching relation
3. 2000 tuples, 5 tuples per page, in Course
4. 5-page bu?er
5. 50 di?erent values for Department
6. 200 di?erent values for Semester
7. Indices
Professor relation
On Department: Clustered, 2-level B+ tree
On Id: Unclustered, hash
Course relation
On CrsCode: Sorted (no index)
On CrsName: Hash, unclustered
Teaching relation
On ProfId: Clustered, 2-level B+-tree
On Semester, CrsCode: Unclustered, 2-level B+ tree
a. First, show the unoptimized relational algebra expression that corresponds to the above SQL query. Then draw the corresponding fully pushed query tree.
b. Find the best execution plan and its cost. Explain how you arrived at your costs.
a. The unoptimized algebraic expression is:
?CrsName,Name(?Semester='F1995' ND Professor.Department='CS'
(Professor Id=ProfId Teaching CrsCode=CrsCode Course))
One fully-pushed query tree is depicted in Figure 11.2. A slightly di?erent tree results if we ?rst join Teaching with Course.
b. There are 100 pages in Professor, 2000 in Teaching, and 400 pages in Courses. With such large relations, doing straight joins is going to take many pages transfers, so we try selections ?rst.
Selection on Department in the Professor relation can use the clustered 2 level index and yields 100/50 = 2 data pages. The cost: 2 page transfers to search the index and 2 to fetch the data pages of Professor. In total, 4 page transfers.
It is tempting to select Teaching on Semester, yielding 20000/200 = 100 tuples (10 pages), but the index on Semester is unclustered, so it may take us 100 page transfers to ?nd the relevant tuples, not counting the cost of the index search.
So, instead, we will use index-nested loops to join the 2 selected pages of Professor with Teaching. Since the index on ProfId in Teaching is clustered, we can fetch all tuples in Teaching that match any given tuple in Professor in two page transfers (20000 teaching tuples per 1000 professors yields 20 tuples per professor, i.e., 2 pages). For each such fetch, we need to search the B+ tree index on ProfId. We can keep the top level of that index in main memory, so there is only one index page fetch per search. Thus, the join takes 1 (to fetch the top level of index) + 20 (professors) * 3 (index search plus fetching the matching tuples) = 61.
The intermediate join has 20 * 20 = 400 tuples, each tuple twice the size of the tuples in either Professor or Teaching. So, only 5 tuples ?t in a page for a total of 80 pages. However, we can select the result of the join on Semester on the ?y,
reducing the size to just 2 tuples, 1 page. We could also apply projection on Name and CrsCode, but this will not give us any more gains. Since we are talking about just a 1-page result, there is no need to write it on disk — we can keep it in main memory.
Now, since Course is sorted on CrsCode, we can use binary search on Course (for each of the two tuples in the join result) to ?nd the matches. This would take us log2 400 = 9 page transfers per search, 18 page transfers in total.
Thus, the total cost is 4+61+18=83 page transfers.
Note that joining Teaching with Course ?rst would require us to scan Course at least once (since there is no index on CrsCode in either relation). So the cost would be at least 400 pages — by far higher than the 83 pages that we came
up with.
Computer Science & Information Technology
You might also like to view...
Clear All Formatting is located under the ________ tab
A) DESIGN B) HOME C) INSERT D) REVIEW
Computer Science & Information Technology
You can save a document to ________, which is Word's Web-based storage site
A) DropBox B) OneDrive C) Google drive D) iCloud
Computer Science & Information Technology