Designing Geodatabases for Transportation. J. Allison Butler
Чтение книги онлайн.
Читать онлайн книгу Designing Geodatabases for Transportation - J. Allison Butler страница 10
Figure 2.3 Tables An application seeking to use the data stored in a relational table needs to know the name of the table and the name of each attribute it seeks, but not the physical manner of data storage. That job is performed by the RDBMS. The primary key uniquely identifies each row. One or more foreign keys can be established to provide connections to other tables. In this example, the Position Number attribute serves as a foreign key to a table storing position descriptions, where Position Number is the primary key. Foreign keys express association relationships. Cardinality is the ratio of rows for two tables. The number that comprises each half of the ratio is the table’s multiplicity. The cardinality of this one-to-many (1:m) association relationship says that a position number must be entered for each employee and that some position numbers may not be applicable to any employee.
Dr. Edgar Codd invented relational databases in the early 1970s at IBM, although it was several years later before a working product could be devised. Such a database management system is based on relational algebra, a kind of math that controls what can happen to data in such a storage structure. Relational algebra supports seven functions2:
• Retrieve (read) row
• Update (write) row
• Define virtual relations (table views)
• Create a snapshot relation
• Define and implement security rules
• Establish and meet stability requirements
• Operate under integrity rules
Relational tables are not actually stored in the row-and-column form we typically use to visualize them, but everything you need to take from this book can be accommodated with the rows-and-columns metaphor. Oracle, SQL Server, Sybase, and Informix are commonly used RDBMS platforms. Products like Microsoft Access have much of the functionality of an RDBMS but are actually database management systems that employ files.
Relationships in relational databases
The big advance offered by the relational database is its ability to represent and manage relationships between tables. Where files normally use a record’s position in the file to uniquely identify each member, an RDBMS cannot impose any ordering on its member records. Thus, an RDBMS requires that at least one column be an instance identifier, called a primary key.
The relationship that relational databases are most concerned about is the association of one table to another. An association is established by placing the same column or a set of columns in both tables. This connection is called a foreign key. For example, a foreign key may link a central table storing general roadway information with other tables containing information about speed limit, traffic volume, maintenance jurisdiction, and pavement condition.
There is considerable variety in the nature of primary keys. The duty of a primary key is to uniquely identify each row in a table, which means that there can only be one row with a given primary key value. For this reason, many database designers argue against using a primary key that is entered by the user. This guidance also means the primary key cannot have any implicit meaning other than service as the row identifier. A primary key with intrinsic meaning is called an intelligent key.
Users like intelligent keys because they are easier to remember and they can serve double duty as an attribute. Database designers hate intelligent keys because they are prone to error in data entry and duplication within the database. You may want to use route number, such as SR 98, as the primary key. The problem is that you might accidentally type “RS 98,” or SR 98 might be rerouted, resulting in confusion as to the version a record references. Instead, database designers populate primary keys with integer sequencers supplied by the RDBMS and large, globally unique identifiers created through various mathematical processes. These values are guaranteed to be unique within the table.
All those other potential primary keys—the ones that mean something—are candidate primary keys and, thus, potential foreign keys. They could be primary keys, except for the chance that they might be duplicated within the table, which is the one thing that must never happen to a primary key. Coded values that are used as shorthand for a larger meaning, like a functional class code of 11 that means rural interstate highway, are often candidate primary keys that are chosen to serve as foreign keys. Some foreign keys may also be useful outside the database. These are called public keys, and they include such things as driver’s license number, Social Security number, river-reach code, the three-letter airport abbreviation, the two-letter state and province abbreviation, and highway route number. All of these primary and candidate key concepts are used in this book to demonstrate specific database design solutions. Each has a number of useful applications.
While on the topic of table keys, it is important to acknowledge their two varieties. A simple key consists of a single field. A complex key is composed of more than one field. For a complex key, it is the arrangement of key values that must be unique, not each individual field’s value. Complex keys are useful when a combination of things is required to identify a single member. For example, instead of using a single functional class field to indicate rural/urban location and the type of roadway, you could split them into two fields, one for each aspect of highway functional class. A facility identifier in combination with a date field, such as to indicate the version of SR 98 that opened to traffic in July 2007, is another possible example you may find useful.
The two tables involved in an association relationship are called the origin and the destination. Both contain a field with the same data in the same form, although the number of instances with the same value may differ. The foreign key in the origin table is usually the primary key or a candidate primary key in the destination table. Association relationships are typically described as a ratio of the number of rows that can exist at each end of the relationship. Each number is called a multiplicity and the combination of the two multiplicities is called the relationship’s cardinality.
Multiplicity can be classified as one or many. Thus, cardinality can be the various combinations of these two values: one-to-one (1:1), one-to-many (1:m), and many-to-many (m:n). When the presence of rows at one end of the relationship is optional—in other words, the association doesn’t always happen—multiplicity can be zero, but that does not affect the cardinality. For example, if you designed a rail station database that contained a County table and a Station table, you must allow the number of Station table rows required for a given county to be zero, one, or more. It is, nevertheless, a one-to-many relationship because one county may have zero, one, or more rail stations. The upper bound in the multiplicity determines the cardinality.
In a one-to-one association, each row in one table may be related to one and only one row in the other table. This relationship is relatively rare because putting all the attributes in one table can often eliminate it. However, there are times when it is useful to split attributes of an entity into multiple tables. For instance, there may be a set of attributes that exist for only a small subset of entities or you may want to do different things with each subset of attributes.
The