Normalization: Building Database Relationships
Normalization (page 2, 1/4/2014)
You've been creating tables without giving much thought to them. And that's fine; they work. You can SELECT, INSERT, DELETE, and UPDATE with them. As you get more data, you start seeing things you wish you'd done to make your WHERE clauses simpler. What you need is to make your tables more normal.
Steps to Normalization
Making your data atomic: First Normal Form (1NF). Each row of data must contain atomic values. Each row of data must have a unique identifier, known as a primary key. (To make our tables completely normal, we need to give each record a primary key.) There are also a 2NF and a 3NF, which we will get to in the near future.
All about relationships
1. Pick your thing, the one thing you want your table to describe. (What's the main thing you want your table to be about?)
2. Make a list of the information you need to know about your one thing when you're using the table. (How will you use this table?)
3. Using the list, break down the information about your thing into pieces you can use for organizing your table. (How can you most easily query this table?)
Atomic Data
Atomic data has been broken down into the smallest pieces of data that can't or shouldn't be divided any further. Consider a pizza delivery guy. To get to where he's going, he just needs the street number and street name in a single column. For his purposes, that's atomic. He never needs to look for a street number on its own. In fact, if his data were broken into street number and street name, his queries would have to be longer and more complicated, and it would take him longer to get the pizza to your front door.
Atomic Data
Now consider a realtor. He might want a separate column for the street number. He may want to query on a given street to see all the houses for sale by street number. For him, street number and street name are each atomic.
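The difference between the driver's and the realtor's idea of "atomic" can be sketched as two table designs. This is only an illustration: the table names, column names, column sizes, and the street name in the query are all hypothetical and do not come from the slides.

```sql
-- Hypothetical table for the delivery driver:
-- the whole street address is one atomic value.
CREATE TABLE pizza_deliveries (
  customer_name VARCHAR(50),
  address       VARCHAR(100)  -- e.g. '123 Maple Street'
);

-- Hypothetical table for the realtor:
-- street number and street name are each atomic.
CREATE TABLE house_listings (
  street_number VARCHAR(10),
  street_name   VARCHAR(50),
  asking_price  INT
);

-- The realtor can now ask for every house for sale on one street.
SELECT street_number, asking_price
FROM house_listings
WHERE street_name = 'Maple Street';
```

With the single address column, the same question would need pattern matching on part of the address, which is the longer, more complicated kind of query the driver never has to write.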
The Benefits of Normal Tables
Normal tables won't have duplicate data, which will reduce the size of your database. With less data to search through, your queries will be faster. If you begin with a normalized table, you won't have to go back and change your table when your queries run too slowly.
Remember that Clown Table?
Clown tracking has become a nationwide craze, and our old table isn't going to cut it because the appearance and activities columns contain so much data. For our purposes, this table is not atomic.
First Normal Form – 1NF
To be 1NF, a table must follow these two rules:
1. Each row of data must contain atomic values.
2. Each row of data must have a unique identifier, known as a primary key. (To make our tables completely normal, we need to give each record a primary key.)
There are also a 2NF and a 3NF, which we will get to in the near future.
PRIMARY KEY rules
A primary key is a column in your table that makes each record unique. The column that will be your primary key has to be designated as such when you create the table, which means that the data in the primary key column can't be repeated. Consider a table with the columns shown below. Do you think any of those would make good primary keys?
PRIMARY KEY rules
A primary key can't be NULL. If it's NULL, it can't be unique, because other records can also be NULL.
The primary key must be given a value when the record is inserted. When you insert a record without a primary key, you run the risk of ending up with a NULL primary key and duplicate rows in your table, which violates First Normal Form.
The primary key must be compact. A primary key should contain only the information it needs to be unique and nothing extra.
The primary key values can't be changed. If you could change the value of your key, you'd risk accidentally setting it to a value you already used. Remember, it has to remain unique.
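A minimal sketch of how MySQL enforces the first two rules once a column is declared the primary key. The students table, its columns, and the sample values are hypothetical and are used only for illustration.

```sql
-- Hypothetical table with a compact integer primary key.
CREATE TABLE students (
  student_id INT NOT NULL,
  last_name  VARCHAR(30),
  PRIMARY KEY (student_id)
);

-- Succeeds: the key value 1 has not been used yet.
INSERT INTO students (student_id, last_name) VALUES (1, 'Lopez');

-- Rejected with a duplicate-key error: the key value 1 is already taken.
INSERT INTO students (student_id, last_name) VALUES (1, 'Nguyen');

-- Rejected: a primary key column cannot hold NULL.
INSERT INTO students (student_id, last_name) VALUES (NULL, 'Okafor');
```

The last two rules (compact and unchanging) are design habits rather than something the database checks for you.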
Primary Key Exercise
Table 2 – Student: Name, Address, City, State, Zip, Phone#, SS#
Primary Key Exercise
Table 3 – ZooAnimals: Type, Name, Gender, B_Day, Tag Number
Take care using SSNs as the primary keys for your records.
With identity theft only increasing, people don't want to give out SSNs, and with good reason. They're too important to risk. Can you absolutely guarantee that your database is secure? If it's not, all those SSNs can be stolen, along with your customers' identities.
The best primary key may be a new primary key.
When it comes to creating primary keys, your best bet may be to create a column that contains a unique number. Think of a table with people's info, but with an additional column containing a number. Let's call it id.
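As a rough sketch of the idea, here is a small table of people's info with an extra id column. The table name, the columns other than id, and the sample rows are made up for illustration.

```sql
-- Hypothetical table: id is a made-up number that exists only to be unique.
CREATE TABLE people (
  id         INT NOT NULL,
  last_name  VARCHAR(30),
  first_name VARCHAR(20),
  PRIMARY KEY (id)
);

-- Two different people can share a name and still be told apart.
INSERT INTO people (id, last_name, first_name) VALUES (1, 'Smith', 'John');
INSERT INTO people (id, last_name, first_name) VALUES (2, 'Smith', 'John');

-- The id picks out exactly one record.
SELECT id, last_name, first_name
FROM people
WHERE id = 2;
```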
Geek Bits
There's a big debate in the SQL world about using synthetic, or made-up, primary keys (like the id column above) versus using natural keys, which are data already in the table (like the VIN on a car or an SSN).
Getting to Normal
Creating a primary key is normally something we do when we write our CREATE TABLE code.
Let's look at Greg's table my_contacts
It's not atomic and it has no primary key. From what you've seen so far, this is how you'd have to fix Greg's table:
Step 1: SELECT all of your data and save it somehow.
Step 2: Create a new normal table.
Step 3: INSERT all that old data into the new table, changing each row to match the new table structure.
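The three steps might look roughly like this. It is a sketch only: the new table's name, my_contacts_new, is hypothetical, only two of Greg's columns are shown to keep it short, and the row values are made up.

```sql
-- Step 1: look at the existing data (and save the output somewhere).
SELECT * FROM my_contacts;

-- Step 2: create a new table that has a primary key.
-- my_contacts_new is a hypothetical name; only two columns are shown.
CREATE TABLE my_contacts_new (
  contact_id INT NOT NULL,
  last_name  VARCHAR(30),
  first_name VARCHAR(20),
  PRIMARY KEY (contact_id)
);

-- Step 3: re-enter every old row by hand, giving each a new key value.
-- The values below are invented for illustration.
INSERT INTO my_contacts_new (contact_id, last_name, first_name)
VALUES (1, 'Sample', 'Alex');
INSERT INTO my_contacts_new (contact_id, last_name, first_name)
VALUES (2, 'Sample', 'Robin');
```

Repeating Step 3 for every record is what makes this approach tedious for a table of any size.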
Let's look at Greg's table my_contacts
So now you can drop your old table. NO! We can add a primary key to Greg's table and make the columns more atomic using just one new command. Let's look at our original command:
CREATE TABLE my_contacts (
  last_name VARCHAR(30),
  first_name VARCHAR(20),
  email VARCHAR(50),
  gender CHAR(1),
  birthday DATE,
  profession VARCHAR(50),
  location VARCHAR(50),
  status VARCHAR(20),
  interests VARCHAR(100),
  seeking VARCHAR(100)
);
CREATE TABLE with PRIMARY KEY
At the top of the column list we added a contact_id column that we're setting to NOT NULL, and at the bottom of the list we added a PRIMARY KEY line, which sets our new contact_id column as the primary key.
CREATE TABLE my_contacts (
  contact_id INT NOT NULL,
  last_name VARCHAR(30) DEFAULT NULL,
  first_name VARCHAR(20) DEFAULT NULL,
  email VARCHAR(50) DEFAULT NULL,
  gender CHAR(1) DEFAULT NULL,
  birthday DATE DEFAULT NULL,
  profession VARCHAR(50) DEFAULT NULL,
  location VARCHAR(50) DEFAULT NULL,
  status VARCHAR(20) DEFAULT NULL,
  interests VARCHAR(100) DEFAULT NULL,
  seeking VARCHAR(100) DEFAULT NULL,
  PRIMARY KEY (contact_id)
);
Auto Increment
AUTO_INCREMENT: when used in your column declaration, that column will automatically be given a unique integer value each time an INSERT command is performed. Adding the keyword AUTO_INCREMENT to our contact_id column makes our SQL software automatically fill that column with a value that starts at 1 on the first row and goes up in increments of 1.
CREATE TABLE my_contacts (
  contact_id INT NOT NULL AUTO_INCREMENT,
  last_name VARCHAR(30) DEFAULT NULL,
  …
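To see AUTO_INCREMENT at work without touching Greg's data, here is a small self-contained sketch. The demo_contacts table and its rows are hypothetical; the numbering described in the comments assumes MySQL's default auto-increment settings.

```sql
-- Hypothetical scratch table. In MySQL an AUTO_INCREMENT column
-- must be part of a key, so it is also declared the primary key.
CREATE TABLE demo_contacts (
  contact_id INT NOT NULL AUTO_INCREMENT,
  last_name  VARCHAR(30) DEFAULT NULL,
  first_name VARCHAR(20) DEFAULT NULL,
  PRIMARY KEY (contact_id)
);

-- contact_id is left out of the column list; MySQL fills it in.
INSERT INTO demo_contacts (last_name, first_name) VALUES ('Doe', 'Jane');
INSERT INTO demo_contacts (last_name, first_name) VALUES ('Roe', 'Richard');

-- With default settings the two rows get contact_id 1 and 2.
SELECT contact_id, last_name, first_name
FROM demo_contacts
ORDER BY contact_id;
```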
Adding the PK to an existing table
Here's the code to add an AUTO_INCREMENT primary key to Greg's my_contacts table.
ALTER TABLE my_contacts
  ADD COLUMN contact_id INT NOT NULL AUTO_INCREMENT FIRST,
  ADD PRIMARY KEY (contact_id);
Our new SQL command, ALTER TABLE with ADD COLUMN, does just what it says: it adds a column to the table and names it contact_id. FIRST tells the software to make the new column the first one in the list. This is optional, but it's good form to put your primary key first. Let's try this!
To see what happened to your table:
SELECT * FROM my_contacts;
The contact_id column has been added first in the table, before all the other columns. Because we used AUTO_INCREMENT, the column was filled in as each record in the table was updated. The next time we INSERT a new record, the contact_id column will be given a value one higher than the highest contact_id in the table. If the last record has a contact_id of 23, the next one will be 24. Let's add a new record!
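Adding a record could look something like the sketch below. The name values are made up, and only two of the columns are filled in; the rest are left at their defaults.

```sql
-- contact_id is not listed, so AUTO_INCREMENT supplies the next value.
-- The name values are invented for illustration.
INSERT INTO my_contacts (last_name, first_name)
VALUES ('Sample', 'Pat');

-- LAST_INSERT_ID() returns the key MySQL generated for that INSERT
-- on this connection.
SELECT LAST_INSERT_ID();

-- Show the newest record, which has the highest contact_id.
SELECT contact_id, last_name, first_name
FROM my_contacts
ORDER BY contact_id DESC
LIMIT 1;
```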