Preview (15 of 54 pages)

Chapter 6 Normalization of Database Tables Discussion Focus Why are some table structures considered to be bad and others good and how do you recognize the difference between good and bad structures? From an information management point of view, possibly the most vexing and destructive problems are created through uncontrolled data redundancies. Such redundancies produce update and delete anomalies that create data integrity problems. The loss of data integrity can destroy the usefulness of the data within the database. (If necessary, review Chapter 1, Section 1-6b, “Data Redundancy”, to make sure that your students understand the terminology and that they appreciate the dangers of data redundancy.) Table structures are poor whenever they promote uncontrolled data redundancy. For example, the table structure shown in Figure IM6.1 is poor because it stores redundant data. In this example, the AC_MODEL, AC_RENT_CHG, and AC_SEATS attributes are redundant. (For example, note that the hourly rental charge of $58.50 is stored four times, once for each of the four Cessna C-172 Skyhawk aircraft – check records 1, 2, 4, and 9.) Figure IM6.1 A Poor Table Structure If you use the AIRCRAFT_1 table as shown in Figure IM6.1, a change in hourly rental rates for the Cessna 172 Skyhawk must be made four times; if you forget to change just one of those rates, you have a data integrity problem. How much better it would be to have critical data in only one place! Then, if a change must be made, it need be made only once. In contrast to the poor AIRCRAFT_1 table structure shown in Figure IM6.1, table structures are good when they preclude the possibility of producing uncontrolled data redundancies. You can produce such a happy circumstance by splitting the AIRCRAFT_1 table shown in Figure IM6.1 into the AIRCRAFT and MODEL tables shown in Figures IM6.2 and IM6.3, respectively. To retain access to all of the data originally stored in the AIRCRAFT_1 table, these two tables can be connected through the AIRCRAFT table's foreign key, MOD_CODE. Figure IM6.2 The Revised AIRCRAFT Table Figure IM6.3 The MODEL Table Note that – after the revision -- a rental rate change need be made in only one place and the number of seats for each model is given in only one place. No more data update and delete anomalies -- and no more data integrity problems. The relational diagram in Figure IM6.4 shows how the two tables are related. Figure IM6.4 The Relational Diagram What does normalization have to do with creating good tables, and what's the point of having to learn all these picky normalization rules? Normalization provides an organized way of determining a table's structural status. Better yet, normalization principles and procedures provide a set of simple techniques through which we can achieve the desired and definable structural results. Without normalization principles and procedures, we lack evaluation standards and must rely on experience (and yes, some intuition) to minimize the probability of generating data integrity problems. The problem with relying on experience is that we usually learn from experience by making errors. While we're learning, who and what will be hurt by the errors we make? Relying on intuition may work reasonably well for some, but intuitive work habits seldom create design consistency. Worse, you can't teach intuition to those who follow in your database footsteps. In short, the normalization principles and rules drastically decrease the likelihood of producing bad table structures, they help standardize the process of producing good tables, and they make it possible to transmit skills to the next generation of database designers. NOTE Given the clear advantages of using normalization procedures to check and correct table structures, students sometimes think that normalization corrects all design defects. Unfortunately, normalization is only a part of the “good design to implementation” process. For example, normalization does not detect the presence of synonyms. Remind your students that normalization takes place in tandem with data modeling. The proper procedure is to follow these steps: 1. Create a detailed description of operations. 2. Derive all the appropriate business rules from the description of operations. 3. Model the data with the help of a good tool such as Visio’s Crow’s Foot option to produce an initial ERD. This ERD is the initial database blueprint. 4. Use the normalization procedures to remove data redundancies. This process may produce additional entities. 5. Revise the ERD created in step 3. 6. Use the normalization procedures to audit the revised ERD. If additional data redundancies are discovered, repeat steps 4 and 5. Also remind your students that some business rules cannot be incorporated in the ERD, regardless of the level of business rule detail or the completeness of the normalization process. For example, the business rule that specifies the constraint “A pilot may not perform flight duties more than 10 hours per 24-hour period.” cannot be modeled in the ERD. However, tools such a Visio do allow you to write “reminders” of such constraints as text. Because such constraints cannot be modeled, they must be enforced through the application software. Answers to Review Questions 1. What is normalization? Normalization is the process for assigning attributes to entities. Properly executed, the normalization process eliminates uncontrolled data redundancies, thus eliminating the data anomalies and the data integrity problems that are produced by such redundancies. Normalization does not eliminate data redundancy; instead, it produces the carefully controlled redundancy that lets us properly link database tables. 2. When is a table in 1NF? A table is in 1NF when all the key attributes are defined (no repeating groups in the table) and when all remaining attributes are dependent on the primary key. However, a table in 1NF still may contain partial dependencies, i.e., dependencies based on only part of the primary key and/or transitive dependencies that are based on a non-key attribute. 3. When is a table in 2NF? A table is in 2NF when it is in 1NF and it includes no partial dependencies. However, a table in 2NF may still have transitive dependencies, i.e., dependencies based on attributes that are not part of the primary key. 4. When is a table in 3NF? A table is in 3NF when it is in 2NF and it contains no transitive dependencies. 5. When is a table in BCNF? A table is in Boyce-Codd Normal Form (BCNF) when it is in 3NF and every determinant in the table is a candidate key. For example, if the table is in 3NF and it contains a nonprime attribute that determines a prime attribute, the BCNF requirements are not met. (Reference the text's Figure 6.8 to support this discussion.)This description clearly yields the following conclusions: • If a table is in 3NF and it contains only one candidate key, 3NF and BCNF are equivalent. • BCNF can be violated only if the table contains more than one candidate key. Putting it another way, there is no way that the BCNF requirement can be violated if there is only one candidate key. 6. Given the dependency diagram shown in Figure Q6.6, answer items 6a-6c: FIGURE Q5.6 Dependency Diagram for Question 6 a. Identify and discuss each of the indicated dependencies. C1  C2 represents a partial dependency, because C2 depends only on C1, rather than on the entire primary key composed of C1 and C3. C4  C5 represents a transitive dependency, because C5 depends on an attribute (C4) that is not part of a primary key. C1, C3  C2, C4, C5 represents a set of proper functional dependencies, because C2, C4, and C5 depend on the primary key composed of C1 and C3. b. Create a database whose tables are at least in 2NF, showing the dependency diagrams for each table. The normalization results are shown in Figure Q6.6b. Figure Q6.6b The Dependency Diagram for Question 6b c. Create a database whose tables are at least in 3NF, showing the dependency diagrams for each table. The normalization results are shown in Figure Q6.6c. Figure Q6.6c The Dependency Diagram for Question 6c 7. The dependency diagram in Figure Q6.7 indicates that authors are paid royalties for each book that they write for a publisher. The amount of the royalty can vary by author, by book, and by edition of the book. Figure Q6.7 Book royalty dependency diagram a. Based on the dependency diagram, create a database whose tables are at least in 2NF, showing the dependency diagram for each table. The normalization results are shown in Figure Q6.7a. Figure Q6.7a The 2NF normalization results for Question 7a. b. Create a database whose tables are at least in 3NF, showing the dependency diagram for each table. The normalization results are shown in Figure Q6.7a. Figure Q6.7b The 3NF normalization results for Question 7b. 8. The dependency diagram in Figure Q6.8 indicates that a patient can receive many prescriptions for one or more medicines over time. Based on the dependency diagram, create a database whose tables are in at least 2NF, showing the dependency diagram for each table. Figure Q6.8 Prescription dependency diagram The normalization results are shown in Figure Q6.8a. Figure Q6.8a The 2NF normalization results for Question 8. 9. What is a partial dependency? With what normal form is it associated? A partial dependency exists when an attribute is dependent on only a portion of the primary key. This type of dependency is associated with 1NF. 10. What three data anomalies are likely to be the result of data redundancy? How can such anomalies be eliminated? The most common anomalies considered when data redundancy exists are: update anomalies, addition anomalies, and deletion anomalies. All these can easily be avoided through data normalization. Data redundancy produces data integrity problems, caused by the fact that data entry failed to conform to the rule that all copies of redundant data must be identical. 11. Define and discuss the concept of transitive dependency. Transitive dependency is a condition in which an attribute is dependent on another attribute that is not part of the primary key. This kind of dependency usually requires the decomposition of the table containing the transitive dependency. To remove a transitive dependency, the designer must perform the following actions: • Place the attributes that create the transitive dependency in a separate table. • Make sure that the new table's primary key attribute is the foreign key in the original table. Figure Q6.11 shows an example of a transitive dependency removal. Figure Q6.11 Transitive Dependency Removal 12. What is a surrogate key, and when should you use one? A surrogate key is an artificial PK introduced by the designer with the purpose of simplifying the assignment of primary keys to tables. Surrogate keys are usually numeric, they are often automatically generated by the DBMS, they are free of semantic content (they have no special meaning), and they are usually hidden from the end users. 13. Why is a table whose primary key consists of a single attribute automatically in 2NF when it is in 1NF? A dependency based on only a part of a composite primary key is called a partial dependency. Therefore, if the PK is a single attribute, there can be no partial dependencies. 14. How would you describe a condition in which one attribute is dependent on another attribute when neither attribute is part of the primary key? This condition is known as a transitive dependency. A transitive dependency is a dependency of one nonprime attribute on another nonprime attribute. (The problem with transitive dependencies is that they still yield data anomalies.) 15. Suppose that someone tells you that an attribute that is part of a composite primary key is also a candidate key. How would you respond to that statement? This argument is incorrect if the composite PK contains no redundant attributes. If the composite primary key is properly defined, all of the attributes that compose it are required to identify the remaining attribute values. By definition, a candidate key is one that can be used to identify all of the remaining attributes, but it was not chosen to be a PK for some reason. In other words, a candidate key can serve as a primary key, but it was not chosen for that task for one reason or another. Clearly, a part of a proper (“minimal”) composite PK cannot be used as a PK by itself. More formally, you learned in Chapter 3, “The Relational Database Model,” Section 3-2, that a candidate key can be described as a superkey without redundancies, that is, a minimal superkey. Using this distinction, note that a STUDENT table might contain the composite key STU_NUM, STU_LNAME This composite key is a superkey, but it is not a candidate key because STU_NUM by itself is a candidate key! The combination STU_LNAME, STU_FNAME, STU_INIT, STU_PHONE might also be a candidate key, as long as you discount the possibility that two students share the same last name, first name, initial, and phone number. If the student’s Social Security number had been included as one of the attributes in the STUDENT table—perhaps named STU_SOCSECNUM—both it and STU_NUM would have been candidate keys because either one would uniquely identify each student. In that case, the selection of STU_NUM as the primary key would be driven by the designer’s choice or by end-user requirements. Note, incidentally, that a primary key is a superkey as well as a candidate key. 16. A table is in ___3rd___ normal form when it is in ___2nd normal form___ and there are no transitive dependencies. (See the discussion in Section 6-3c, “Conversion to Third Normal Form.” Problem Solutions 1. Using the descriptions of the attributes given in the figure, convert the ERD shown in Figure P6.1 into a dependency diagram that is in at least 3NF. An initial dependency diagram depicting only the primary key dependencies is shown in Figure P6.1a below. Figure P6.1a Initial dependency diagram for Problem 1. There are no composite keys being used, therefore, by definition, there is not an issue with partial dependencies and the entities are already in 2NF. Based on the descriptions of the attributes, it appears that the patient name, phone number, and address can be determined by the patient id number. Therefore, the following transitive dependency can be determined. App_PatientID  (App_Name, App_Phone, App_Street, App_City, App_State, App_Zip) As discussed in the chapter, ZIP_Codes can be used to determine a city and state; therefore, we also have the transitive dependency: App_Zip  App_City, App_State Figure P6.1b depicts the dependency diagram with these transitive dependencies included. Figure P6.1b Revised dependency diagram for Problem 1. Since the first transitive dependency completely encloses the second transitive dependency, it is appropriate to resolve the first transitive dependency before resolving the second. Figure P6.1c shows the results of resolving the first transitive dependency. Figure P6.1c Resolving the first transitive dependency Finally, the second and final transitive dependency can now be resolved as shown in the final dependency diagram in Figure P6.1d. Figure P6.1d Final dependency diagram for Problem 1 Note that at this time we have resolved all of the transitive dependencies. Decisions on whether or not to denormalize, and perhaps not remove the final transitive dependency, have yet to be made. Also, the structures have not yet had the benefit of additional design modifications such as achieving proper naming conventions for the attributes in the new tables. However, creating the fully normalized structures is an important set toward making informed decisions about the compromises in the design that we may choose to make. NOTE: Please note that we are making the assumption that a zip code only determines one city and state. Unfortunately, this is not true, there are a handful of zip codes that traverse states. In these cases, it would be appropriate not to use the [App_zip, App_City, App_State] relation and instead add these attributes to the previous relation. Hence, the relation would be: [App_PatiendID, App_Name, App_Phone, App_Street, App_City, App_Zip, App_State] 2. Using the descriptions of the attributes given in the figure, convert the ERD shown in Figure P6.2 into a dependency diagram that is in at least 3NF. An initial dependency diagram depicting only the primary key dependencies is shown in Figure P6.2a below. Figure P6.2a Initial dependency diagram for Problem 2. Based on the descriptions of the attributes given, the following partial dependency can be determined: Pres_SessionNum  (Pres_Date, Pres_Room) Also, the following transitive dependencies can be determined: Pres_AuthorID  (Pres_FName, Pres_LName) Figure P6.2b shows the revised dependency diagram including the partial and transitive dependencies. Figure P6.2b Revised dependency diagram for Problem 2 Resolving the partial dependency to achieve 2NF yields the dependency diagram shown in Figure P6.2c. Figure P6.2c 2NF dependency diagram for Problem 2 Finally, the transitive dependency is resolved to achieve the 3NF solution shown in the final dependency diagram in Figure P6.2d. Figure P6.2d Final dependency diagram for Problem 2 3. Using the INVOICE table structure shown in Table P6.3, do the following: Table P6.3 Sample INVOICE Records Attribute Name Sample Value Sample Value Sample Value Sample Value Sample Value INV_NUM 211347 211347 211347 211348 211349 PROD_NUM AA-E3422QW QD-300932X RU-995748G AA-E3422QW GH-778345P SALE_DATE 15-Jan-2018 15-Jan-2018 15-Jan-2018 15-Jan-2018 16-Jan-2018 PROD_LABEL Rotary sander 0.25-in. drill bit Band saw Rotary sander Power drill VEND_CODE 211 211 309 211 157 VEND_NAME NeverFail, Inc. NeverFail, Inc. BeGood, Inc. NeverFail, Inc. ToughGo, Inc. QUANT_SOLD 1 8 1 2 1 PROD_PRICE $49.95 $3.45 $39.99 $49.95 $87.75 a. Write the relational schema, draw its dependency diagram and identify all dependencies, including all partial and transitive dependencies. You can assume that the table does not contain repeating groups and that any invoice number may reference more than one product. (Hint: This table uses a composite primary key.) The solutions to both problems (3a and 3b) are shown in Figure P6.3a. NOTE We have combined the solutions to Problems 3a and 3b to let you illustrate the start of the normalization process within a single PowerPoint slide. Students generally seem to have an easier time understanding the normalization process if they can compare the normal forms directly. We will continue to use this technique for several of the initial normalization decompositions … if the available PowerPoint slide space permits it. b. Remove all partial dependencies, write the relational schema, and draw the new dependency diagrams. Identify the normal forms for each table structure you created. NOTE You can assume that any given product is supplied by a single vendor but a vendor can supply many products. Therefore, it is proper to conclude that the following dependency exists: PROD_NUM → PROD_DESCRIPTION, PROD_PRICE, VEND_CODE, VEND_NAME (Hint: Your actions should produce three dependency diagrams.) Figure P6.3a The Dependency Diagrams for Problems 3a and 3b c. Remove all transitive dependencies, write the relational schema, and draw the new dependency diagrams. Also identify the normal forms for each table structure you created. To illustrate the effect of Problem 3's complete decomposition, we have shown Problem 3a's dependency diagram again in Figure P6.3c. Figure P6.3c The Dependency Diagram for Problem 3c d. Draw the Crow’s Foot ERD. NOTE Emphasize that, because the dependency diagrams cannot show the nature (1:1, 1:M, M:N) of the relationships, the ER Diagrams remain crucial to the design effort. Complex design is impossible to produce successfully without some form of modeling, be it ER, Semantic Object Modeling, or some other modeling methodology. Yet, as the preceding decompositions demonstrate, the dependency diagrams are a valuable addition to the designer's toolbox. (Normalization is likely to suggest the existence of entities that may not have been considered during the modeling process.) And, if information or transaction management issues require the existence of attributes that create other than 3NF or BCNF conditions, the proper dependency diagrams will at least force awareness of these conditions. The invoicing ERD, accompanied by its relational diagram, is shown in Figure P6.3d. (The relational diagram only includes the critical PK and FK components, plus a few sample attributes, for space considerations.) Figure P6.3d The Invoicing ERD and Its (Partial) Relational Diagram 4. Using the STUDENT table structure shown in Table P6.4, do the following: Table P6.4 Sample STUDENT Records Attribute Name Sample Value Sample Value Sample Value Sample Value Sample Value STU_NUM 211343 200128 199876 199876 223456 STU_LNAME Stephanos Smith Jones Ortiz McKulski STU_MAJOR Accounting Accounting Marketing Marketing Statistics DEPT_CODE ACCT ACCT MKTG MKTG MATH DEPT_NAME Accounting Accounting Marketing Marketing Mathematics DEPT_PHONE 4356 4356 4378 4378 3420 COLLEGE_NAME Business Admin Business Admin Business Admin Business Admin Arts & Sciences ADVISOR_LNAME Grastrand Grastrand Gentry Tillery Chen ADVISOR_OFFICE T201 T201 T228 T356 J331 ADVISOR_BLDG Torre Building Torre Building Torre Building Torre Building Jones Building ADVISOR_PHONE 2115 2115 2123 2159 3209 STU_GPA 3.87 2.78 2.31 3.45 3.58 STU_HOURS 75 45 117 113 87 STU_CLASS Junior Sophomore Senior Senior Junior a. Write the relational schema, draw its dependency diagram, and identify all dependencies, including all transitive dependencies. The dependency diagram for problem 4a is shown in Figure P6.4a. Figure P6.4a The Dependency Diagram for Problem 4a As you discuss Figure 6.4a, note that the single attribute PK (STU_NUM) automatically places this table in 2NF, because it is not possible to have partial dependencies when the PK consists of a single attribute. The relational schema for the dependency diagram shown in Figure P6.4a is written as: STUDENT(STU_NUM, STU_LNAME, STU_MAJOR, DEPT_CODE, DEPT_NAME, DEPT_PHONE, ADVISOR_LNAME, ADVISOR_OFFICE, ADVISOR_BLDG, ADVISOR_PHONE, STU_GPA, STU_HOURS, STU_CLASS) b. Write the relational schema and draw the dependency diagram to meet the 3NF requirements to the greatest extent possible. If you believe that practical considerations dictate using a 2NF structure, explain why your decision to retain 2NF is appropriate. If necessary, add or modify attributes to create appropriate determinants and to adhere to the naming conventions. NOTE Although the completed student hours (STU_HOURS) do determine the student classification (STU_CLASS), this dependency is not as obvious as you might initially assume it to be. For example, a student is considered a junior if that student has completed between 61 and 90 credit hours. Therefore, a student who is classified as a junior may have completed 66, 72, or 87 hours or any other number of hours within the specified range of 61–90 hours. In short, any hour value within a specified range will define the classification. The normalized structure is shown in Figure P6.4b. The relational schemas are written as: STUDENT(STU_NUM, STU_LNAME, STU_MAJOR, DEPT_CODE, ADVISOR_NUM, STU_GPA, STU_HOURS, STU_CLASS) (Note that we have added the ADVISOR_NUM to serve as a FK to the advisor attributes.) MAJOR(MAJOR_CODE, DEPT_CODE, MAJOR_DESCRIPTION) BUILDING(BLDG_CODE, BLDG_NAME, BLDG_MANAGER) DEPARTMENT(DEPT_CODE, DEPT_NAME, DEPT_PHONE, COLLEGE_CODE) COLLEGE(COLL_CODE, COLL_NAME) (After studying Chapter 4, “Entity Relationship Modeling,” your students should know enough about database design to suggest many improvements in the design before it can be implemented.) Figure P6.4b The Normalized Dependency Diagrams for Problem 4b As you discuss Figure P6.4b, explain that, in this case, the STUDENT table structure indicates a 2NF condition because two transitive dependencies exist. If there is an information requirement to track the components of each major, we can break out a major code, store it in student, create a new entity named MAJOR, and relate it to its department in a 1:M relationship. (Each department offers many majors, but only one department offers each major.) Creating a new entity to eliminate the student classification-induced transitive dependency increases implementation complexity needlessly; student hours are updated each semester by application software and other application software can then use a look-up table to update the classification when necessary. Structure simplicity is a virtue. In any case, the normalization diagram may be modified as shown next. (We have added a few attributes, such as BLDG_MANAGER, to improve the database's ability to provide information.) Note that the assumptions inherent in the business rules also make an impact on normalization practices! If the room is numbered to reflect the building it is in – for example, HE105 indicates room 105 in the Heinz building – one might argue that the ADV_OFFICE value is the determinant of the ADV_BUILDING. (You will learn in Chapter 7 that you can create a query to find a building by looking at room prefixes.) However, if you define dependencies in strictly relational algebra terms, you might argue that partitioning the attribute value to “create” a dependency indicates that the partitioned attribute is not (in that strict sense) a determinant. Although we have indicated a transitive dependency from ADV_OFFICE to ADV_BUILDING, we have used a dotted line to indicate that there is room for argument in this set of transitive dependencies. In any case, the (arguable) dependency ADV_OFFICE  ADV_BUILDING does not create any problems in a practical sense, so it is acceptable to ignore this (arguable) transitive dependency. Keep in mind that the decomposition shown in Figure P6.8 is subject to many modifications, depending on information requirements and business rules. For example, both the department and the college may be tied to the building in which they are located. Additional modifications are discussed in the answer to Problem 9. c. Draw the Crow’s Foot ERD. NOTE This ERD constitutes a small segment of a university’s full-blown design. For example, this segment might be combined with the Tiny College presentation in Chapter 4. The Crow’s Foot ERD is shown in Figure P6.4c. Figure P6.4c The College ERD As you examine the ER diagrams in Figure P6.4c, note that we have made several assumptions that cannot be inferred directly from the dependency diagram in problem 4b. For example: • Apparently, some buildings do not house advisors. Some buildings may be used for storage, others for classrooms, and so on. • When a student is assigned to a department, that department must assign an advisor to that student. That is, a student must have an advisor. Therefore, ADVISOR is mandatory to STUDENT. • Evidently, some advisors do not (yet?) have students assigned to them. From an operational point of view, this optionality is desirable, because it enables us to create a new advisor without having to assign a student advisee to that new advisor. (The new advisor may have to receive some training before having students assigned to him or her.) • Some departments do not offer majors. For example, a department may offer service courses only. • Some colleges do not have departments. This condition is subject to a business rule that is not specified, nor can it be inferred from the dependency diagram. However, this characteristic is not unusual in a college environment. For example, some professional curricula are certified by special boards. Such boards may make certification conditional on the professional curriculum’s independence. (We have created the optionality for discussion purposes. This discussion should stress the importance of the business rules. You generate the business rules by asking detailed questions!) • All departments must be affiliated with a college. • STUDENT is optional to MAJOR. This optionality, too, is desirable from an operational point of view. For example, new majors may not (yet) have attracted students. Business rules may change the nature of the structures shown here. For example, an advisor is likely to be a professor ... who is an employee of the university. Therefore, you might introduce a superset/subset relationship between EMPLOYEE and PROFESSOR, while the need to distinguish between professors and advisors disappears. Similarly, EMPLOYEE may be the source of information concerning the BUILDING manager, thus creating a relationship between BUILDING and EMPLOYEE. Note also that the nature of the relationships (1:1, 1:M, M:N) is not revealed by the dependency diagrams. For example, the 1:M relationship between MAJOR and DEPARTMENT (a department can offer many majors, but each major is offered by only one department) cannot be inferred from the dependency diagram. Normalization and ER modeling are part of the same design process! Finally, note that we have also included several new entities, MAJOR and BUILDING, to reflect the preceding discussion. NOTE Remind your students that the order of the attribute listing in each entity is immaterial. Although it is customary to list the PK attribute first, there is no requirement to do so. Similarly, whether the STU_LNAME is listed before or after the STU_GPA has no effect on the STUDENT entity’s functionality. 5. To keep track of office furniture, computers, printers, and so on, the FOUNDIT company uses the table structure shown in Table P6.5. Table P6.5 Sample ITEM Records Attribute Name Sample Value Sample Value Sample Value ITEM_ID 231134-678 342245-225 254668-449 ITEM_LABEL HP DeskJet 895Cse HP Toner DT Scanner ROOM_NUMBER 325 325 123 BLDG_CODE NTC NTC CSF BLDG_NAME Nottooclear Nottoclear Canseefar BLDG_MANAGER I. B. Rightonit I. B. Rightonit May B. Next a. Given that information, write the relational schema and draw the dependency diagram. Make sure that you label the transitive and/or partial dependencies. The answers to this problem are shown in Figure P6.5a and the relational schema definition below the figure.. Figure P6.5a The FOUNDIT Co. Initial Dependency Diagram The dotted transitive dependency lines indicate that these transitive dependencies are subject to interpretation. We will address these dependencies in the discussion that accompanies Problem 5b’s solution. The relational schema may be written as follows: ITEM(ITEM_ID, ITEM_DESCRIPTION, BLDG_ROOM, BLDG_CODE, BLDG_NAME, BLDG_MANAGER) b. Write the relational schema and create a set of dependency diagrams that meet 3NF requirements. Rename attributes to meet the naming conventions, and create new entities and attributes as necessary. The dependency diagrams are shown in Figure P6.5b. We have added a sample relational diagram to illustrate the relationships at this point. The relational schemas are written below Figure 6.5b. The dependency diagrams in Figure P6.5b reflect the notion that one employee manages each building. Figure P6.5b FOUNDIT Co. 3NF and Its Relational Diagram The relational schemas are written as follows: EMPLOYEE(EMP_CODE, EMP_LNAME, EMP_FNAME, EMP_INITIAL) BUILDING(BLDG_CODE, BLDG_NAME, EMP_CODE) ITEM(ITEM_ID, ITEM_DESCRIPTION, ITEM_ROOM, BLDG_CODE) As you discuss the dependency diagrams in Figure P6.5b, remind the students that BLDG_CODE is not a determinant of BLDG_ROOM. A building can have many rooms, so knowing the building code will not tell you what the room in that building is. If the room is numbered to reflect the building it is in – for example, HE105 indicates room 105 in the Heinz building – one might argue that the BLDG_ROOM value is the determinant of the BLDG_CODE and the BLDG_NAME values. You will learn in Chapter 7, “Introduction to Structured Query Language (SQL),” that you can create a query to find a building by looking at room prefixes. However, if you define dependencies in strictly relational algebra terms, you might argue that partitioning the attribute value to “create” a dependency indicates that the partitioned attribute is not (in that strict sense) a determinant. Although we have indicated a transitive dependency from BLDG_ROOM to BLDG_CODE and BLDG_NAME, we have used a dotted line to indicate that there is room for argument in this set of transitive dependencies. In any case, the (arguable) dependency BLDG_ROOM  BLDG_CODE does not create any problems in a practical sense, so we have not identified it in the Problem 9 solution. Clearly, BLDG_CODE is a determinant of BLDG_NAME. Therefore, the transitive dependency is marked properly in the Problem 5b solution. c. Draw the Crow’s Foot ERD. Use Figure P6.5c to show that, in this case, the ER diagram reflects the business rule that one employee can manage many (or at least more than one) building. Because all employees are not required to manage buildings, BUILDING is optional to EMPLOYEE in the manages relationship. Once again, the nature of this relationship is not and cannot be reflected in the dependency diagram. NOTE We also assume here that each item has a unique item code and that, therefore, an item can be located in only one place at a time. However, we demonstrate in Appendixes B and C that inventory control requirements usually cover both durable and consumable items. Although durables such as tables, desks, lamps, computers, printers, etc. would be uniquely identified by an assigned inventory code, consumables such as individual reams of paper would clearly not be so identified. Therefore, a given inventory description such as "8.5 inch x 11 inch laser printer paper" could describe reams of paper located in many different buildings and in rooms within those buildings. We demonstrate in Appendixes B and C how such a condition may be properly handled. Figure P6.5c The FOUNDIT Co. ERD As you examine Figure P6.5c, note that the BLDG_ROOM is actually an ITEM entity attribute, so it is appropriate to rename it ITEM_ROOM. Also, keep in mind that a room may be related to the building in which it is located. (A BUILDING may contain many ROOMs. Each ROOM is located in a single building.) Therefore, you can expand the design shown in Figure P6.5b to the one shown in Figure P6.5c. This solution assumes that a room is directly traceable to a building. For example, room SC-508 would be located in the Science (SC) Building and room BA-305 would be located in the Business Administration (BA) building. Note that we have made ROOM optional to BUILDING to reflect the likelihood that some buildings – such as storage sheds -- may not contain designated (numbered) rooms. Although optionality’s make excellent default conditions, it is always wise to establish the optionality based on a business rule. In any case, the designer must ask about the nature of the room/building relationship. 6. The table structure shown in Table P6.6 contains many unsatisfactory components and characteristics. For example, there are several multivalued attributes, naming conventions are violated, and some attributes are not atomic. Table P6.6 Sample EMPLOYEE Records Attribute Name Sample Value Sample Value Sample Value Sample Value EMP_NUM 1003 1018 1019 1023 EMP_LNAME Willaker Smith McGuire McGuire EMP_EDUCATION BBA, MBA BBA BS, MS, Ph.D. JOB_CLASS SLS SLS JNT DBA EMP_DEPENDENTS Gerald (spouse), Mary (daughter), John (son) JoAnne (spouse) George (spouse) Jill (daughter) DEPT_CODE MKTG MKTG SVC INFS DEPT_NAME Marketing Marketing General Service Info. Systems DEPT_MANAGER Jill H. Martin Jill H. Martin Hank B. Jones Carlos G. Ortez EMP_TITLE Sales Agent Sales Agent Janitor DB Admin EMP_DOB 23-Dec-1968 28-Mar-1979 18-May-1982 20-Jul-1959 EMP_HIRE_DATE 14-Oct-1997 15-Jan-2006 21-Apr-2003 15-Jul-1999 EMP_TRAINING L1, L2 L1 L1, L3, L8, L15 EMP_BASE_SALARY $38,255.00 $30,500.00 $19.750.00 $127,900.00 EMP_COMMISSION_RATE 0.015 0.010 a. Given the structure shown in Table P6.6, write the relational schema and draw its dependency diagram. Label all transitive and/or partial dependencies. The dependency diagram is shown in Figure P6.6a. Note that the order of the attributes has been changed to make the transitive dependencies easier to mark. (In any case, the order in which the attributes are written into a relational database table is immaterial.) The relational schema is written below Figure P6.6a. Figure P6.6a The Dependency Diagram for Problem 6a The relational schema is written as: EMPLOYEE(EMP_CODE, EMP_LNAME, EMP_EDUCATION, JOB_CLASS, EMP_DEPENDENTS,DEPT_CODE, DEPT_NAME, DEPT_MANAGER, EMP_TITLE, EMP_DOB, EMP_HIRE_DATE, EMP_TRAINING, EMP_BASE_SALARY, EMP_COMMISSION_RATE) b. Draw the dependency diagrams that are in 3NF. (Hint: You might have to create a few new attributes. Also make sure that the new dependency diagrams contain attributes that meet proper design criteria; that is, make sure there are no multivalued attributes, that the naming conventions are met, and so on.) Dependency diagrams have no way to indicate multi-valued attributes, nor do they provide the means through which such attributes can be handled. Therefore, the solution to this problem requires a basic knowledge of modeling concepts, once again indicating that normalization and design are part of the same process. Given the sample data shown in Problem 6, EDUCATION, DEPENDENT and QUALIFICATION are multi-valued attributes whose values are stored as strings. We have created the appropriate entities to avoid the use of multi-valued attributes. (See Figure P6.6b.) Figure P6.6b The Dependency Diagrams for Problem 6b As you discuss Figure P6.6b, note that a real world design would have to include additional entities or additional attributes in the existing entities. For example, while the job description is likely to include a (job) base salary, employee experience – perhaps measured by time in the job classification and performance – is likely to add to the job’s base salary. Therefore, the EMPLOYEE table might include a salary or hourly wage adjustment attribute. Overall employment longevity is likely to be included, too … employers often find it useful to keep (expensive) job turnover rates low. And, of course, you might include year-to-date (YTD) earnings and taxes in each employee’s records, too. This problem is a great source of discussion material! The relational schemas are written as: EMPLOYEE(EMP_CODE, EMP_LNAME, DEPT_CODE, JOB_CLASS, EMP_DOB, EMP_HIREDATE) DEPENDENT(EMP_CODE, DEP_NUM, DEP_FNAME, DEP_TYPE) DEPARTMENT(DEPT_CODE, DEPT_NAME, EMP_CODE) JOB(JOB_CLASS, JOB_TITLE, JOB_BASE_SALARY) EDUCATION(EDUC_CODE, EDUC_DESCRIPTION) QUALIFICATION(EMP_CODE, EDUC_CODE, QUAL_DATE_EARNED) c. Draw the relational diagram. The relational diagram is shown in Figure P6.6c. Figure P6.6c The Relational Diagram for Problem 6c d. Draw the Crow’s Foot ERD. The Crow’s Foot solution is shown in Figure P6.6d. Figure P6.6d The Crow’s Foot ERD for Problem 6d 7. Suppose you are given the following business rules to form the basis for a database design. The database must enable the manager of a company dinner club to mail invitations to the club’s members, to plan the meals, to keep track of who attends the dinners, and so on. • Each dinner serves many members, and each member may attend many dinners. • A member receives many invitations, and each invitation is mailed to many members. • A dinner is based on a single entree, but an entree may be used as the basis for many dinners. For example, a dinner may be composed of a fish entree, rice, and corn. Or the dinner may be composed of a fish entree, a baked potato, and string beans. Because the manager is not a database expert, the first attempt at creating the database uses the structure shown in Table P6.7: Table P6.7 Sample RESERVATION Records Attribute Name Sample Value Sample Value Sample Value MEMBER_NUM 214 235 214 MEMBER_NAME Alice B. VanderVoort Gerald M. Gallega Alice B. VanderVoort MEMBER_ADDRESS 325 Meadow Park 123 Rose Court 325 Meadow Park MEMBER_CITY Murkywater Highlight Murkywater MEMBER_ZIPCODE 12345 12349 12345 INVITE_NUM 8 9 10 INVITE_DATE 23-Feb-2016 12-Mar-2016 23-Feb-2016 ACCEPT_DATE 27-Feb-2016 15-Mar-2016 27-Feb-2016 DINNER_DATE 15-Mar-2016 17-Mar-2016 15-Mar-2016 DINNER_ATTENDED Yes Yes No DINNER_CODE DI5 DI5 DI2 DINNER_DESCRIPTION Glowing sea delight Glowing sea delight Ranch Superb ENTREE_CODE EN3 EN3 EN5 ENTREE_DESCRIPTION Stuffed crab Stuffed crab Marinated steak DESERT_CODE DE8 DE5 DE2 DESERT_DESCRIPTION Chocolate mousse with raspberry sauce Cherries jubilee Apple pie with honey crust a. Given the table structure illustrated in Table P6.7, write its relational schema and draw its dependency diagram. Label all transitive and/or partial dependencies. (Hint: This structure uses a composite primary key.) The relational schema may be written as follows: MEMBER(MEMBER_NUM, MEMBER_NAME, MEMBER_ADDRESS, MEMBER_CITY, MEMBER_ZIP_CODE, INVITE_NUM, INVITE_DATE, ACCEPT_DATE, DINNER_DATE, DINNER_ATTENDED, DINNER_CODE, ENTRÉE_CODE, ENTRÉE_DESCRIPTION, DESSERT_CODE, DESSERT_DESCRIPTION) The dependency diagram is shown in Figure P6.7a. Note that DIN_CODE in Figure P6.7a does not determine DIN_ATTEND; just because a dinner is offered does not mean that it is attended. Note also that we have shortened the prefixes – for example, MEMBER_ADDRESS has been shortened to MEM_ADDRESS -- to provide sufficient space to include all the attributes. Figure P6.7a The Dependency Diagram for Problem 7a b. Break up the dependency diagram you drew in Problem 7a to produce dependency diagrams that are in 3NF and write the relational schema. (Hint: You might have to create a few new attributes. Also, make sure that the new dependency diagrams contain attributes that meet proper design criteria; that is, make sure that there are no multivalued attributes, that the naming conventions are met, and so on.) Actually, there is no way to prevent the existence of multi-valued attributes by merely following normalization rules. Instead, knowledge of E-R modeling concepts will help define the environment in which the multi-valued attributes are dealt with. Although we keep repeating the message, it is worth repeating: normalization and modeling fit within the same design spectrum and they take place concurrently as the definition of entities and their attributes take place. The design process can be described thus: • Define entities, attributes, and relationships and model them. • Normalize. • Redesign based on the normalization outcomes and the evaluation of the design's ability to meet transaction and information requirements. • Normalize the results and evaluate the normal forms until the process has yielded a stable design, implementation, and applications development environment. Such a process will yield the dependency diagrams shown in Figure P6.7b. In this case, it hardly seems practical to eliminate the 2NF condition displayed by MEMBER. After all, zip codes tend to be thought of as part of the address. Worse, the elimination of the MEMBER's 2NF condition would require the creation of a ZIPCODE table, with ZIP_CODE as the foreign key in the MEMBER table. Such a solution would merely add complexity without adding functionality. Figure P6.7b The Dependency Diagram for Problem 7b As you examine Figure P6.7b, note how easy it is to see the functionality of the decomposition. For example, the (composite) INVITATION and DINNER entities make it possible to track who was sent an invitation on what date (INVITE_DATE) to a dinner to be held at some specified date (DIN_DATE), what dinner (DIN_CODE) would be served on that date, who (MEM_NUM) accepted the invitation (INVITE_ACCEPT), and who actually attended (INVITE_ATTEND. The INVITE_ACCEPT attribute would be a simple Y/N, as would be the INVITE_ATTEND. To avoid nulls, the default values for INVITE_ACCEPT and INVITE_ATTEND could be set to N. Getting the number of acceptances for a given dinner by a given date would be simple, thus enabling the catering service to plan the dinner better. The relational schemas follow: MEMBER(MEM_NUM, MEM_NAME, MEM_ADDRESS, MEM_CITY, MEM_STATE, MEM_ZIP) INVITATION(INVITE_NUM, INVITE_DATE, DIN_CODE, MEM_NUM, INVITE_ACCEPT, INVITE_ATTEND) ENTRÉE(ENT_CODE, ENT_DESCRIPTION) DINNER(DIN_CODE, DIN_DATE, DIN_DESCRIPTION, ENT_CODE, DES_CODE) DESSERT(DES_CODE, DES_DESCRIPTION) Naturally, to tracks costs and revenues, the manager would ask you to add appropriate attributes in DESSERT and ENTRÉE. For example, the DESSERT table might include DES_COST and DES_PRICE to enable the manager to track net returns on each dessert served. One would also expect that the manager would want to track YTD expenditures of the members and, of course, there would have to be an invoicing module for billing purposes. And what about keeping track of member balances as the members charge meals and make payments on account? c. Using the results of Problem 7b, draw the Crow’s Foot ERD. The Crow’s Foot ERD is shown in Figure P6.7c. Figure P6.7c The Crow’s Foot ERD for Problem 7c 8. Use the dependency diagram shown in Figure 6.8 to work the following problems. FIGURE P6.8 Initial Dependency Diagram for Problem 8 a. Break up the dependency diagram in Figure 6.8 to create two new dependency diagrams, one in 3NF and one in 2NF. The dependency diagrams are shown in Figure P6.8a. Figure P6.8a The Dependency Diagram for Problem 8a b. Modify the dependency diagrams you created in Problem 8a to produce a collection of dependency diagrams that are all in 3NF. (Hint: One of your dependency diagrams will be in 3NF, but not in BCNF.) The solution is shown in Figure P6.8b. Figure P6.8b The Dependency Diagram for Problem 8b c. Modify the dependency diagrams in Problem 8b to produce a collection of dependency diagrams that are all in 3NF and BCNF. The solution is shown in Figure P6.8c. Note that the A, C, and E attributes in the first three structures can be used as foreign keys in the fourth structure. Figure P6.8c The Dependency Diagrams for Problem 8c 9. Suppose that you have been given the table structure and data shown in Table 6.9, which was imported from an Excel spreadsheet. The data reflect that a professor can have multiple advisees, can serve on multiple committees, and can edit more than one journal. Table P6.9 Sample PROFESSOR Records Attribute Name Sample Value Sample Value Sample Value Sample Value EMP_NUM 123 104 118 PROF_RANK Professor Asst. Professor Assoc. Professor Assoc. Professor EMP_NAME Ghee Rankin Ortega Smith DEPT_CODE CIS CHEM CIS ENG DEPT_NAME Computer Info. Systems Chemistry Computer Info. Systems English PROF_OFFICE KDD-567 BLF-119 KDD-562 PRT-345 ADVISEE 1215, 2312, 3233, 2218, 2098 3102, 2782, 3311, 2008, 2876, 2222, 3745, 1783, 2378 2134, 2789, 3456, 2002, 2046, 2018, 2764 2873, 2765, 2238, 2901, 2308 COMMITTEE_CODE PROMO, TRAF APPL, DEV DEV SPR, TRAF PROMO, SPR DEV JOURNAL_CODE JMIS, QED, JMGT JCIS, JMGT Given the information in Table 6.9: a. Draw the dependency diagram. The dependency diagram is shown in Figure P6.9a. Figure P6.9a The Dependency Diagram for Problem 9a Note that Figure P6.9a reflects several ambiguities. For example, although each PROF_OFFICE value shown in Table P6.9 is unique, does that limited information indicate that each professor has a private office? If so, the office number identifies the professor who uses that office. This condition yields a dependency. However, this dependency is not a transitive one, because a non-key attribute, PROF_OFFICE, determines the value of a key attribute, EMP_NUM. (We have indicated this potential transitive dependency through a dashed dependency line.) NOTE The assumption that PROF_OFFICE  EMP_CODE is a rather restrictive one, because it would mean that professors cannot share an office. One could safely assume that administrators at all levels would not care to be tied by such a restrictive office assignment requirement. Therefore, we will remove this restriction in the remaining problem solutions. Also, note that there is no reliable way to identify the effect of multivalued attributes on the dependencies. For example, EMP_NUM = 123 could identify any one of five advisees. Therefore, knowing the EMP_NUM does not identify a specific ADVISEE value. The same is true for the COMMITTEE_CODE and JOURNAL_CODE attributes. Therefore, these attributes are not marked with a solid arrow line. However, if you know that EMP_NUM = 123, you will also know all five advisees, all four committee codes, and all three journal codes for that employee number value. But you do not have a unique identification for each of those attribute values. Therefore, you cannot conclude that EMP_NUM  ADVISEE, nor can you conclude that EMP_NUM  COMMITTEE_CODE or that EMP_NUM  JOURNAL_CODE. b. Identify the multivalued dependencies. Table P6.9 shows several professor attributes – ADVISEE, COMMITTEE_CODE, and JOURNAL_CODE -- that represent multivalued dependencies. c. Create the dependency diagrams to yield a set of table structures in 3NF. The dependency diagrams are shown in Figure P6.9c. Note that we have assumed that it is possible that professors can share an office. Figure P6.9c The Dependency Diagram for Problem 9c d. Eliminate the multivalued dependencies by converting the affected table structures to 4NF. The structures shown in Figure 6.9d1 conform to the 4NF requirement. Yet this normalization does not yield a viable database design. Here is another opportunity to stress that normalization without data modeling is a poor way to generate useful databases. (Note that we have assumed that an advisee can have only one advisor, but that an advisor can have many advisees.) Figure P6.9d1 The Initial Dependency Diagrams for Problem 9d The dependency diagrams shown in Figure P6.9d1 constitute an attempt to eliminate the shortcomings of the “system” shown in Figure P6.9c. Unfortunately, while this solution meets the normalization requirements, it lacks the ability to properly link the professors to committees and journals. (That’s because the relationships between professors and journals and between professors and committees are M:N.) This solution would yield tables P6.9d1 and P6.9d2. (One would expect a professor to be an employee, so it’s reasonable to assume that – at some point -- we’ll have to create a supertype/subtype relationship between employee and professor. (To save space, we show only the first three EMP_NUM value sets from Table P6.9.) Table 6.9d1 Implementation of the M:N Relationship between EMP_NUM and COMMITTEE_CODE EMP_NUM COMMITTEE_CODE 123 PROMO 123 TRAF 123 APPL 123 JMGT 104 DEV 118 SPR 118 TRAF The PK of the table shown in Table P6.9d1 is EMP_NUM + COMMITTEE_CODE. Table 6.9d2 Implementation of the M:N Relationship between EMP_NUM and JOURNAL_CODE EMP_NUM JOURNAL_CODE 123 JMIS 123 QED 123 JMGT 118 JCIS 118 JMGT The PK of the table shown in Table P6.9d2 is EMP_NUM + JOURNAL_CODE. Because EMP_CODE = 104 does not show any entries in the JOURNAL_CODE, the employee code does not occur in Table P6.9d2. The preceding table structures create multiple redundancies. Therefore, this solution is not acceptable. Here is yet another indication that normalization, while very useful, is not always (usually?) capable of producing implementable solutions. For example, the preceding examples illustrate that multivalued attributes and M:N relationships cannot be effectively modeled without first using the ERD. (After the ERD has done its work, you should, of course, use dependency diagrams to check for data redundancies!) Figure P6.9e shows a more practical solution to the problem and its structures all conform to the normalization requirements. e. Draw the Crow’s Foot ERD to reflect the dependency diagrams you drew in Part c. (Note: You might have to create additional attributes to define the proper PKs and FKs. Make sure that all of your attributes conform to the naming conventions.) Given the discussion in the previous problem segment d, we have incorporated additional features in the Crow’s Foot ERD shown in Figure P6.9e. Note that we have eliminated the M:N relationships in this design by creating composite entities. This design is implementable and it meets design standards. Normalization was part of the process that led to this solution, but it was only a part of that solution. Normalization does not replace design! Figure P6.9e The Crow’s Foot ERD for Problem 9e 10. The manager of a consulting firm has asked you to evaluate a database that contains the table structure shown in Table P6.10. Table P6.10 Sample CLIENT Records Attribute Name Sample Value Sample value Sample Value CLIENT_NUM 298 289 289 CLIENT_NAME Marianne R. Brown James D. Smith James D. Smith CLIENT_REGION Midwest Southeast Southeast CONTRACT_DATE 10-Feb-2018 15-Feb-2018 12-Mar-2018 CONTRACT_NUMBER 5841 5842 5843 CONTRACT_AMOUNT $2,985,00.00 $670,300.00 $1,250,000.00 CONSULT_CLASS_1 Database Administration Internet Services Database Design CONSULT_CLASS_2 Web Applications Database Administration CONSULT_CLASS_3 Network Installation CONSULT_CLASS_4 CONSULTANT_NUM_1 29 34 25 CONSULTANT_NAME_1 Rachel G. Carson Gerald K. Ricardo Angela M. Jamison CONSULTANT_REGION_1 Midwest Southeast Southeast CONSULTANT_NUM_2 56 38 34 CONSULTANT_NAME_2 Karl M. Spenser Anne T. Dimarco Gerald K. Ricardo CONSULTANT_REGION_2 Midwest Southeast Southeast CONSULTANT_NUM_3 22 45 CONSULTANT_NAME_3 Julian H. Donatello Geraldo J. Rivera CONSULTANT_REGION_3 Midwest Southeast CONSULTANT_NUM_4 18 CONSULTANT_NAME_4 Donald Chen CONSULTANT_REGION_4 West Table P6.10 was created to enable the manager to match clients with consultants. The objective is to match a client within a given region with a consultant in that region, and to make sure that the client’s need for specific consulting services is properly matched to the consultant’s expertise. For example, if the client need help with database design and is located in the Southeast, the objective is to make a match with a consultant who is located in the Southeast and whose expertise is in database design. (Although the consulting company manage tries to match consultant and client locations to minimize travel expense, it is not always possible to do so.) The following basic business rules are maintained: • Each client is located in one region • A region can contain many clients. • Each consultant can work on many contracts • Each contract might require the services of many consultants. • A client can sign more than one contract, but each contract is signed by only one client. • Each contract might cover multiple consulting classifications. (For example, a contract may list consulting services in database and networking.) • Each consultant is located in one region. • A region can contain many consultants. • Each consultant has one or more areas of expertise (class). For example, a consultant might be classified as an expert in both database design and networking. • Each area of expertise (class) can have many consultants in it. For example, the consulting company might employ many consultants who are networking experts. a. Given that brief description of the requirements and the business rules, write the relational schema and draw the dependency diagram for the preceding (and very poor) table structure. Label all transitive and/or partial dependencies. Here is a perfect illustration of the value of business rules. If the business rules had not been available, the sample record would produce ambiguities. For example, if you only look at the sample data in the one available record, defining the relationships between client, contract, date, consultant, and expertise would have been difficult, at best. The business rules augment the original data and their use removes the ambiguities. The business rules help establish that a client can sign more than one contract, so you need more than the client number to identify the remaining attributes. Clearly, another client can sign a contract on the same date, so the CLIENT_NUM is not the determinant of the date. Also, the same client can sign multiple contracts on the same date or on different dates, using the same set of consultants for each contract or a different set of consultants for each contract. Remember also that the consultants have more than one area of expertise, so the same consultant may work on different contracts for the same client or for different clients. Given the combination of the business rules and the sample record in the original problem – or given the use of the two records provided in the first part of this discussion -- the dependencies show up in Figure P6.10a. Figure P6.10a The ConsultCo Dependency Diagram The relational schema is written as follows: CONTRACT(CLIENT_NUM, CLIENT_NAME, DATE, CONTRACT, CLASS_1, CLASS_2, CLASS_3, CLASS_4, REGION, CONS_NUM_1, CONS_NAME_1, CONS_NUM_2, CONS_NAME_2, CONS_NUM_3, CONS_NAME_3, CONS_NUM_4, CONS_NAME_4) Or, if you prefer that the PK be the first listed attribute, you can write the relational schema this way: CONTRACT(CONTRACT, CLIENT_NUM, CLIENT_NAME, DATE, CLASS_1, CLASS_2, CLASS_3, CLASS_4, REGION, CONS_NUM_1, CONS_NAME_1, CONS_NUM_2, CONS_NAME_2, CONS_NUM_3, CONS_NAME_3, CONS_NUM_4, CONS_NAME_4) In any case, remind your students that the order in which the attributes are listed is immaterial in a relational database environment. b. Break up the dependency diagram you drew in Problem 10a to produce dependency diagrams that are in 3NF and write the relational schema. (Hint: You might have to create a few new attributes. Also make sure that the new dependency diagrams contain attributes that meet proper design criteria; that is, make sure that there are no multivalued attributes, that the naming conventions are met, and so on.) To complete the structures, we have added the REGION_NAME and we have modified the attribute names to make them conform to our naming conventions. Although the normalization procedure has left us with the 3NF system shown in Figure P6.10b, it is not possible to see that some of the relationships between the entities are of the M:N variety. (It would be appropriate to point out that the multivalued attributes encountered in Problem 10's sample values are probably best handled through the use of composite entities. Similarly, the M:N relationship between contract and consultant would have to be handled through a composite entity, perhaps named ASSIGNMENT, to indicate the assignment of consultants to contracts. We will resolve those issues in the answers to subsequent problems.) Here is yet another indication that normalization, while very useful as a tool to help eliminate data redundancies, is incapable of serving as the sole source of good database design. Figure P6.10b The ConsultCo Dependency Diagrams in 3NF The relational schemas are written as follows: CLIENT(CLIENT_NUM, CLIENT_NAME, REGION_CODE) CLASS(CLASS_CODE, CLASS_DESCRIPTION) CONTRACT(CONTR_NUM, CLIENT_CODE, CONTR_DATE, REGION_CODE) CONSULTANT(CONS_NUM, CONS_NAME, REGION_CODE) REGION(REGION_CODE, REGION_NAME) Keep in mind that the preceding dependency diagrams and relational schemas do not (yet) define a practical design. For example, processing requirements usually dictate that the attributes be made more atomic. (Printing mailing labels, creating mailing lists and phone directories would mandate the decomposition of CLIENT_NAME into CLIENT_FNAME, CLIENT_LNAME, and CLIENT_INITIAL. The CONS_NAME must be similarly decomposed.) Also, remember that this simple system lacks many important entities and attributes. For instance, at this point there's no way to contact the clients, nor can clients contact the consultants. Clearly, we ought to add addresses and phone numbers. However, we have added some crucial relationships to enable us to track billing charges by class and to track billable hours by class, by consultant, and by class. (Note also that the ASSIGN_CHG_HOUR is written into the ASSIGNMENT table by the applications software from the CLASS table to ensure the historical accuracy of the charges. If the CLASS_CHG_HOUR changes, we must preserve the original charge per hour that was in effect when the assignment charge was made.) You can let your students use database software such as Microsoft Access to implement this system. Naturally, you can add tables and attributes to enable the system to handle invoicing and reporting of consulting activities by consultant, by type, by client, and so on. We have added a few of the appropriate entities and attributes in the answer to problem 10c. c. Using the results of Problem 10b, draw the Crow’s Foot ERD. The Crow’s Foot ERD is shown in Figure P6.10c. Figure P6.10c The ConsultCo ERD for Problem 10c The addition of the ASSIGNMENT entity addresses the problem of keeping track of billable hours and charges by consultant and that the addition of the SKILL entity enables the end user to track all consultant qualifications. Whether or not optionality’s are included in the ERD depends on the business rules and on the operational requirements. For example, you can infer from Figure P6.10c that the ASSIGNMENT entity does not necessarily contain a given CLASS code. (Perhaps there is a “customer support” classification that may not have been used – yet.) Similarly, you can infer that a given CONTRACT number has not (yet) been used in the ASSIGN entity. (It is again worth emphasizing that many optionality’s exist for operational reasons. That’s why the optionality is often used as the default condition. In any case, the database designer is obligated to develop precise business rules to make sure that the data environment is properly reflected in the design.) 11. Given the sample records in the CHARTER table shown in Table P6.11, do the following: Table P6.11 Sample CHARTER Records Attribute Name Sample Value Sample Value Sample Value Sample Value CHAR_TRIP 10232 10233 10234 10235 CHAR_DATE 15-Jan-2018 15-Jan-2018 16-Jan-2018 17-Jan-2018 CHAR_CITY STL MIA TYS ATL CHAR_MILES 580 1,290 524 768 CUST_NUM 784 231 544 784 CUST_LNAME Brown Hanson Bryana Brown CHAR_PAX 5 12 2 5 CHAR_CARGO 235 lbs. 18,940 lbs. 348 lbs. 155 lbs. PILOT Melton Chen Henderson Melton COPILOT Henderson Melton FLT_ENGINEER O’Shaski LOAD_MASTER Benkasi AC_NUMBER 1234Q 3456Y 1234Q 2256W MODEL_CODE PA31-350 CV-580 PA31-350 PA31-350 MODEL_SEATS 10 38 10 10 MODEL_CHG_MILE $2.79 $23.36 $2.79 $2.79 a. Write the relational schema and draw the dependency diagram for the table structure. Make sure that you label all dependencies. CHAR_PAX indicates the number of passengers carried. The CHAR_MILES entry is based on round-trip miles, including pickup points. (Hint: Look at the data values to determine the nature of the relationships. For example, note that employee Melton has flown two charter trips as pilot and one trip as copilot.) The dependency diagram is shown in Figure P6.11a. Figure P6.11a The Dependency Diagram for Problem 11a The relational schema is written as follows: CHARTER(CHAR_TRIP, CHAR_DATE, CHAR_CITY, CHAR_MILES, CUST_NUM, CUST_LNAME, CHAR_PAX, CHAR_CARGO, PILOT, COPILOT, FLT_ENGINEER, LOAD_MASTER, AC_NUMBER, MODEL_CODE, MODEL_SEATS, MODEL_CHG_MILE) b. Decompose the dependency diagram in Problem 11a to create table structures that are all in 3NF and write the relational schema. Make sure that you label all dependencies. The normalized dependency diagram is shown in Figure P6.11b. (Note the addition of MOD_CODE in the AIRCRAFT table to serve as the AIRCRAFT table’s FK to MODEL.) Figure P6.11b The Normalized Dependency Diagram for Problem 11b c. Draw the Crow’s Foot ERD to reflect the properly decomposed dependency diagrams you created in Problem 11b. Make sure that the ERD yields a database that can track all of the data shown in Problem 11. Show all entities, relationships, connectivities, optionalities, and cardinalities. The initial Crow’s Foot ERD is shown in Figure P6.11c. Figure P6.11c The Initial Crow’s Foot ERD for Problem 11c While the ERD shown in Figure P6.11c faithfully reflects the results generated by the normalization process, it has a major design flaw. This flaw has the following consequences: • If additional crewmembers such as copilots, loadmasters, and flight engineers are not assigned to the flight, the CHARTER table will include many nulls. (Many of the smaller aircraft that used in charter flying require only that a pilot and a functioning autopilot be used. In fact, the Federal Air Regulations (FARs) that govern charter aviation permit single pilot operations for aircraft that have less than a 12,500-lbs. gross take-off weight and that are not turbine-powered.) • The inclusion of COPILOT, FLT_ENGINEER, and LOAD_MASTER also produce synonyms in the CHARTER table. • As the aircraft used in the charter flights become larger and more complex, crews become larger, thus producing more synonyms and more potential nulls. (Not to mention that the CHARTER table will have to be modified to accept additional crew members such as flight attendants.) The problems associated with the ERD shown in Figure P6.11c are eliminated through the composite entity named CREW in Figure P6.11d. Note that this modification makes it possible to assign any number of crewmembers. To ensure that the crewmembers are properly qualified, a job attribute can be added to the EMPLOYEE entity and the applications software can then assign crewmembers based on job classifications such a pilot, loadmaster, flight attendant, etc. Because only some employees are qualified as crewmembers, CREW is optional to EMPLOYEE. But each crewmember must be an employee, so EMPLOYEE is mandatory to CREW. Figure P6.11d The Final Crow’s Foot ERD for Problem 11c Note that the application shown in Figure P6.11e -- based on the design shown in Figure P6.11d --enables the end user to input only those crew members that are required for the charter flight. (In this case, only two crew members are required, but the design permits the addition of many more crew members without making structural changes in the database tables. Such flexibility is the essence of good design.) Figure P6.11e Sample Charter Record Solution Manual for Database Systems: Design, Implementation, and Management Carlos Coronel, Steven Morris 9781337627900, 9781305627482

Document Details

Related Documents

person
Ethan Williams View profile
Close

Send listing report

highlight_off

You already reported this listing

The report is private and won't be shared with the owner

rotate_right
Close
rotate_right
Close

Send Message

image
Close

My favorites

image
Close

Application Form

image
Notifications visibility rotate_right Clear all Close close
image
image
arrow_left
arrow_right