Consider a STUDENT relation in a UNIVERSITY database with the following attributes (Name, SSN, Local_phone, Address, Cell_phone, Age, GPA). Note that the cell phone may be from a different city and state (or province) from the local phone. A possible tuple of the relation is shown below:
a. Identify the critical missing information from the Local_phone and Cell_phone attributes as shown in the example above. (Hint: How do you call someone who lives in a different state or province?)
b. Would you store this additional information in the Local_phone and Cell_phone attributes or add new attributes to the schema for STUDENT?
c. Consider the Name attribute. What are the advantages and disadvantages of splitting this field from one attribute into three attributes (first name, middle name, and last name)?
d. What general guideline would you recommend for deciding when to store information in a single attribute and when to split the information?
a. A combination of first name, last name, and home phone may address the issue, assuming that no two students with identical names share a home phone line. It also assumes that every student has a home phone number. Another solution may be to use first name, last name, and home zip code. This again has the potential for duplicates, although they would be very rare within one university. An extreme solution is to use a combination of characters from the last name, major, house number, etc.
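The composite-key idea in (a) can be sketched with Python's standard sqlite3 module. This is a minimal illustration, not a recommended schema: the table name, column names, sample values, and the function try_composite_key are all assumptions made up for the example.

```python
import sqlite3


def try_composite_key():
    """Show how a (first name, last name, home phone) key treats duplicates."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table: the three columns together form the primary key.
    conn.execute(
        """
        CREATE TABLE student (
            first_name TEXT NOT NULL,
            last_name  TEXT NOT NULL,
            home_phone TEXT NOT NULL,
            PRIMARY KEY (first_name, last_name, home_phone)
        )
        """
    )
    conn.execute("INSERT INTO student VALUES ('Ann', 'Lee', '555-0100')")
    # Same name but a different phone: a distinct key, so it is accepted.
    conn.execute("INSERT INTO student VALUES ('Ann', 'Lee', '555-0199')")
    try:
        # Same name and same phone: violates the composite primary key.
        conn.execute("INSERT INTO student VALUES ('Ann', 'Lee', '555-0100')")
        duplicate_rejected = False
    except sqlite3.IntegrityError:
        duplicate_rejected = True
    row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
    conn.close()
    return duplicate_rejected, row_count
```

The rejected insert is exactly the rare case the answer warns about: two genuinely different students who share a name and a phone line could not both be stored under this key.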
b. If we use the name in a primary key and the name changes, then the primary key changes. Changing the primary key is acceptable but can be inefficient, because every reference to this key in the database must be updated accordingly, and that can take a long time in a large database. Also, the new primary key must remain unique. [Footnote: Name change is an example of where our database must be able to model the natural world. In this case, we recognize that the name change can occur regardless of whether it is due to marriage, or a consequence of a religious and/or spiritual conversion, or for any other reason.]
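The cost described in (b) can be seen in a small sketch using Python's sqlite3 module. The student and enrollment tables, their columns, the sample names, and the function rename_student are hypothetical. ON UPDATE CASCADE is one way to keep references consistent, assumed here for illustration; the answer itself does not prescribe a mechanism.

```python
import sqlite3


def rename_student():
    """Change a name-based primary key and let the update cascade to referencing rows."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE student (name TEXT PRIMARY KEY)")
    conn.execute(
        """
        CREATE TABLE enrollment (
            student_name TEXT NOT NULL
                REFERENCES student(name) ON UPDATE CASCADE,
            course TEXT NOT NULL
        )
        """
    )
    conn.execute("INSERT INTO student VALUES ('Ann Lee')")
    conn.executemany(
        "INSERT INTO enrollment VALUES (?, ?)",
        [("Ann Lee", "CS101"), ("Ann Lee", "MA201")],
    )
    # One logical change to the key rewrites every referencing row as well.
    conn.execute("UPDATE student SET name = 'Ann Park' WHERE name = 'Ann Lee'")
    rows = conn.execute(
        "SELECT student_name, course FROM enrollment ORDER BY course"
    ).fetchall()
    conn.close()
    return rows
```

Here only two enrollment rows are rewritten, but in a large database the same single update could touch many rows across many tables, which is the inefficiency the answer points to.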
c. The challenge of choosing an invariant primary key from the natural data items leads to the concept of generated keys, also known as surrogate keys. Specifically, we can use surrogate keys instead of keys that occur naturally in the database. Some database professionals believe that it is best to use keys that are uniquely generated by the database; for example, each row may have a primary key that is generated in the order in which rows (tuples) are created. The advantages and disadvantages are often argued in design sessions. The main advantage is that a surrogate key gives us an invariant key without any worries about choosing a unique primary key. The main disadvantages are that surrogate keys have no business meaning (which makes some aspects of database management challenging) and that they are slightly less efficient (inserting a row requires another pass, because the generated key often needs to be returned to the application afterward).
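A surrogate key as described in (c) might look like the following sketch, again using Python's sqlite3 module. The table, the column student_id, the sample names, and the function surrogate_key_demo are assumptions for illustration. AUTOINCREMENT is SQLite's way of generating the key; other database systems use sequences or identity columns.

```python
import sqlite3


def surrogate_key_demo():
    """Insert rows under a database-generated key, then rename a student."""
    conn = sqlite3.connect(":memory:")
    # student_id is generated by the database and carries no business meaning.
    conn.execute(
        """
        CREATE TABLE student (
            student_id INTEGER PRIMARY KEY AUTOINCREMENT,
            name       TEXT NOT NULL
        )
        """
    )
    # The generated key has to be read back after the insert (the extra step
    # mentioned above); sqlite3 exposes it as cursor.lastrowid.
    first_id = conn.execute(
        "INSERT INTO student (name) VALUES ('Ann Lee')"
    ).lastrowid
    # A second student with an identical name is no longer a key conflict.
    second_id = conn.execute(
        "INSERT INTO student (name) VALUES ('Ann Lee')"
    ).lastrowid
    # A name change leaves the key, and anything referencing it, untouched.
    conn.execute(
        "UPDATE student SET name = 'Ann Park' WHERE student_id = ?", (first_id,)
    )
    name = conn.execute(
        "SELECT name FROM student WHERE student_id = ?", (first_id,)
    ).fetchone()[0]
    conn.close()
    return first_id, second_id, name
```

The trade-off is visible in the result: the keys are stable and unique, but a value such as 1 or 2 says nothing about the student it identifies.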