 |
|
ss
Oracle Tips by Burleson |
All about Keys
All relational databases require unique keys to
identify primary table rows.
A key is required whenever we need to join tables
together with SQL. In most cases, the key will be a unique value
(e.g. social security number), but we may sometimes see non-unique key
values.
Sometimes the table will not have a unique value,
and your table will have a sequence to uniquely identify each table
row.
We need these matching keys to use SQL to join
tables, and we do this inside our SQL WHERE
clause. However, Oracle also has the ability to hard link
related rows together. This is Oracle’s object oriented feature,
where Object ID’s are used instead of keys:
Now that we see the basics of SQL, let’s take a
look at the normalization
steps, so that we can understand
table structure.
Un-normalized Form (0NF)
Essentially, an un-normalized relation is a
relation that contains repeating values. An un-normalized relation can
also contain relations nested within other relations, as well as all
kinds of transitive dependencies. Sometimes un-normalized relations
are signified by 0NF, but an un-normalized relation is not to be
confused with a denormalized relation.
The un-normalized relation is any relation in its
raw state, and they commonly contain repeating vales and other
characteristics that are not found in denormalized relations. The
process of denormalization is a very deliberate
attempt to introduce controlled redundant items into an already
normalized form.
First Normal Form (1NF)
In essence, any relation is in
first normal form if it does not contain any repeating values. Here,
we have taken our relations with repeating values and moved them to
separate relations. When the new relations are created, we carry the
primary key of the original relation into the new relation.
We start by creating a primary key to
uniquely identify each row in the table. Any repeating data is moved
to a new table. Finally, create a key for each of the rows of the new
table and a reference to that key in the original table. This is like
what we did with the FRIENDS and ADDRESS tables above. First normal
form removes repeating data to separate tables. Each of the new
tables may have repeating data that should be moved to another new
table. This process continues until there is no longer redundant
data.
Second Normal Form (2NF)
For second normal form, you must
start in first normal form. Then, each column in a table must be
dependent on the key for that table, or it should be move to a new
table.
The purpose of the second normal form (2NF) test
is to check for partial key dependencies. Partial key dependencies are
created when we break off an un-normalized relation into first normal
form by carrying the key thereby creating a concatenated key with
several data items. The formal definition of second normal form is as
follows:
A relation is in second
normal form if and only if the relation is in first normal form and
each non-key attribute is fully functionally dependent on the entire
concatenated key.
However, I prefer the following definition:
A relation is in second
normal form if each attribute depends on the key, the whole key, and
nothing but the key, so help me Codd.
Third Normal Form (3NF)
The third normal form (3NF) test
refers to transitive dependencies. A transitive dependency
is a circumstance where one non-key attribute is functionally
dependent on another non-key attribute. Whereas the 2NF test serves to
check for dependencies between key fields and attribute fields, the
3NF test serves to check for dependencies between non-key attributes.
First, your relation must already be in second
normal form. Any column that is dependent on another column that is
not the key must be placed in a new table. Any column that is derived
from another column (like a total) must be placed in a separate
table.
The process of going to 3NF is all about
eliminating redundant data inside tables. However, almost all
databases are not in 3NF, and your tables will almost always have
deliberately-introduced values in multiple tables! Let’s see why.
Denormalization
Now that you have an idea of how normalization is
used to segregate redundant and non-dependent data, I have to admit
that most databases are not all the way into third normal form. As
database software had become more powerful and hardware less expensive
(such as fast hard drives), many databases are denormalized to improve
performance. The process of denormalization has
to do with taking a normalized database and reintroducing selected
redundancy to improve performance.
When normalization
techniques were created, disks were very expensive, and you did not
want to have any redundant data because it was very expensive. Since
the 1970’s, disk have become thousands of times cheaper, and it is now
common to go to third normal form and then go back and reintroduce
redundant data!
Having data in more than one place can reduce the
amount of SQL table joins, making our SQL run faster. The rules for
denormalization
relate to the size of the data
item, and how frequently the data is updated:
In other words, a tiny data item that is seldom
updated is a great candidate for denormalization.
Below, we see a database model where redundant data items have been
re-added to eliminate SQL table joins:
The greater the amount of duplicated data, the
smaller the number of tables, and the faster the SQL will run.
The above book excerpt is from:
Easy Oracle
SQL
Get Started
Fast writing SQL Reports with SQL*Plus
ISBN 0-9727513-7-8
John Garmany
http://www.rampant-books.com/book_2005_1_easy_sql.htm |