A file has r = 20,000 fixed-length STUDENT records. Each record has the following fields: NAME (30 bytes), SSN (9 bytes), ADDRESS (40 bytes), PHONE (9 bytes), BIRTHDATE (8 bytes), SEX (1 byte), MAJORDEPTCODE (4 bytes), MINORDEPTCODE (4 bytes), CLASSCODE (4 bytes, integer), and DEGREEPROGRAM (3 bytes). An additional byte is used as a deletion marker. The file is stored on the disk whose parameters are given in Exercise 17.27.
(a) Calculate the record size R in bytes.
(b) Calculate the blocking factor bfr and the number of file blocks b, assuming an unspanned organization.
(c) Calculate the average time it takes to find a record by doing a linear search on the file if (i) the file blocks are stored contiguously and double buffering is used, and (ii) the file blocks are not stored contiguously.
(d) Assume the file is ordered by SSN; calculate the time it takes to search for a record given its SSN value by doing a binary search.
(a) R = (30 + 9 + 40 + 9 + 8 + 1 + 4 + 4 + 4 + 3) + 1 = 113 bytes
(b) bfr = floor(B / R) = floor(512 / 113) = 4 records per block
b = ceiling(r / bfr) = ceiling(20000 / 4) = 5000 blocks
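The arithmetic in (a) and (b) can be checked with a short Python sketch. The field sizes come from the exercise statement, and the block size B = 512 bytes is the value used in the solution above (taken from Exercise 17.27). The variable and dictionary names are illustrative only.

```python
import math

# Field sizes in bytes, copied from the exercise statement.
FIELD_SIZES = {
    "NAME": 30, "SSN": 9, "ADDRESS": 40, "PHONE": 9, "BIRTHDATE": 8,
    "SEX": 1, "MAJORDEPTCODE": 4, "MINORDEPTCODE": 4, "CLASSCODE": 4,
    "DEGREEPROGRAM": 3,
}
DELETION_MARKER = 1  # one extra byte per record
B = 512              # block size in bytes, as used in the solution
r = 20_000           # number of records in the file

R = sum(FIELD_SIZES.values()) + DELETION_MARKER  # record size
bfr = B // R             # unspanned: only whole records fit in a block
b = math.ceil(r / bfr)   # number of blocks needed for r records
unused = B - bfr * R     # bytes left unused in each block

print(R, bfr, b, unused)  # 113 4 5000 60
```

Under these assumptions, the unspanned organization leaves 60 of the 512 bytes in every block unused, because a fifth 113-byte record does not fit.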
(c) A linear search reads, on average, half of the file blocks: b/2 = 5000/2 = 2500 blocks.
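i. If the blocks are stored contiguously and double buffering is used, one seek and one rotational delay are followed by a continuous transfer at the bulk transfer rate btr, so the time to read 2500 consecutive blocks is
s + rd + (2500 * (B / btr)) = 30 + 12.5 + (2500 * (512 / 409.6))
= 3167.5 msec = 3.1675 sec
(A less accurate estimate, which uses the block transfer time btt instead, is s + rd + (2500 * btt) = 30 + 12.5 + 2500 * 1 = 2542.5 msec.)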
ii. If the blocks are scattered over the disk, a seek is needed for each block, so the time
is: 2500 * (s + rd + btt) = 2500 * (30 + 12.5 + 1) = 108750 msec = 108.75 sec
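The two linear-search cases in (c) can be compared with the following Python sketch. The disk parameters are the values that appear in the formulas above (s = 30 msec seek time, rd = 12.5 msec rotational delay, btt = 1 msec block transfer time, btr = 409.6 bytes/msec bulk transfer rate, B = 512 bytes); they are assumed here to match Exercise 17.27, and the variable names are illustrative.

```python
# Disk parameters as used in the solution (times in msec, sizes in bytes).
s = 30.0      # average seek time
rd = 12.5     # average rotational delay
btt = 1.0     # block transfer time
btr = 409.6   # bulk transfer rate in bytes/msec
B = 512       # block size
b = 5000      # number of file blocks from part (b)

blocks_read = b / 2  # a linear search reads half the file on average

# (i) Contiguous blocks with double buffering: one seek and one rotational
# delay, then a continuous transfer at the bulk transfer rate.
contiguous_ms = s + rd + blocks_read * (B / btr)

# Rougher variant of (i) that uses btt instead of B / btr.
contiguous_rough_ms = s + rd + blocks_read * btt

# (ii) Scattered blocks: every block pays seek + rotational delay + transfer.
scattered_ms = blocks_read * (s + rd + btt)

print(f"{contiguous_ms:.1f} {contiguous_rough_ms:.1f} {scattered_ms:.1f}")
# 3167.5 2542.5 108750.0
```

With these numbers the scattered layout is roughly 34 times slower than the contiguous one, because the per-block seek and rotational delay dominate the transfer time.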
(d) For binary search, the time to search for a record is estimated as:
ceiling(log2(b)) * (s + rd + btt)
= ceiling(log2(5000)) * (30 + 12.5 + 1) = 13 * 43.5 = 565.5 msec = 0.5655 sec
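The binary-search estimate in (d) can be reproduced with a minimal Python sketch, assuming the same disk parameters as in the formula above and that every probe is a random block access costing s + rd + btt. The variable names are illustrative.

```python
import math

# Disk parameters as used in the solution (times in msec).
s, rd, btt = 30.0, 12.5, 1.0
b = 5000  # number of file blocks from part (b)

# Binary search over b blocks needs about ceiling(log2(b)) block accesses.
accesses = math.ceil(math.log2(b))

# Each access is assumed to be a random read: seek + rotational delay + transfer.
binary_search_ms = accesses * (s + rd + btt)

print(accesses, binary_search_ms)  # 13 565.5
```

Because 2^12 = 4096 < 5000 <= 2^13 = 8192, 13 block accesses suffice, compared with the 2500 block reads of the average linear search in (c).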