Easy explanation of Normalization Relational Database Design for Beginners - 1NF, 2NF, 3NF

Jun 07, 2021

Hello and welcome. This video will explain normalization as it relates to relational databases. A full example of how to produce a normalized database will be shown, but first I'm going to cover the terminology so we understand what things are: primary keys, foreign keys, bridge tables, relationship types, and so on. If you don't understand those, it's very difficult to understand the database itself. If you know all that already, you can go straight to the normalization example later in the video. We're going to cover first normal form, second normal form, and third normal form. There are other normal forms beyond these: a fourth, a fifth, even a sixth normal form, which show up more in academic discussions and articles than in practice.

You can stop at third normal form; for probably more than 99% of database designs that's fine, so we'll cover up to third normal form. Let's get started, and first I'll cover some basics. The first thing I want to talk about is what a table is. This is a representation of a table. We use tables to store our data; a table is just a two-dimensional representation of columns and rows. Within each table the column names will be unique, and the column names you create at the top describe the type of data that will be in the column. Here the name of this column is product number, so obviously in this column I will store the product numbers, and in product name I will store the names of the products, and so on, so the names make sense.

Each table will have a name. These up here are known as the column names. Some people call columns fields; some people even call them attributes. Rows are also known as records or even tuples. I'll switch the words around: sometimes I call them rows and sometimes records, and for the columns, sometimes I call them columns and sometimes fields. These are the values: the intersection of a row and a column is a value. So what is the value of the product name in this row? It's apples. So these are the values.
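As a rough illustration of these terms, here is a minimal sketch using Python's built-in sqlite3 module. The table name product, the snake_case column names, and the product numbers are assumptions made up for the example; only the product name apples comes from the table on screen.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# A table is a named set of columns; each inserted tuple becomes a row (record).
conn.execute("CREATE TABLE product (product_number TEXT, product_name TEXT)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [("P-100", "Apples"), ("P-200", "Oranges")],  # hypothetical product numbers
)

def value_at(connection, product_number):
    """Return the value at the intersection of one row and the product_name column."""
    row = connection.execute(
        "SELECT product_name FROM product WHERE product_number = ?",
        (product_number,),
    ).fetchone()
    return row[0]

# The column (field) names of the table, read from the cursor metadata.
column_names = [d[0] for d in conn.execute("SELECT * FROM product").description]
print(column_names)
print(value_at(conn, "P-100"))
```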

Each column name in a table must be unique, so you can't have two columns with the same name in the same table. In different tables that's perfectly fine, no problem there, but in the same table you can't. Let's move on. Here I am showing two tables, and they are related to each other through this link. That's why they are called relational databases: all the tables are related to each other somehow. What I want to point out here is this column in the products table, the product ID column, and in the category table we have a column called category ID.

Notice that I've underlined the column names that indicate a primary key. Every table that you create will have a primary key, and a primary key has three characteristics. First, the values must be unique: no matter how many rows you have, the values cannot be repeated. I can never have another row in the products table that has a product ID of two, because I've already used it. The second characteristic is that it cannot contain a null value.

Null means no value. In other words, you must have a value here; you can't leave it blank. Any column that is part of the primary key must have a value. The third characteristic of a primary key, which most people don't talk about, is that it must use the minimum number of columns needed to create uniqueness. In this case I have a single column, but you will see later, when we do the normalization, that this last one is very important, or else you will end up with a bad design.

In this case, I have another column that I could use for uniqueness. You'll notice that this column here is a product number. If each product has a unique product number, this column could have been used as the primary key; this is known as a candidate key. So other columns could be the primary key, but you choose just one column, or one combination of columns, to form the primary key, as you'll see later. Any other column or combination of columns that creates uniqueness is a candidate key.
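A minimal sketch of the primary key and candidate key ideas, using Python's sqlite3 module. The column names and product numbers are assumptions for illustration. Here product_id is the chosen primary key, and product_number is a candidate key, declared UNIQUE and NOT NULL so that it has the same uniqueness and no-null properties.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE product (
        product_id     INTEGER PRIMARY KEY,   -- the chosen primary key
        product_number TEXT NOT NULL UNIQUE,  -- a candidate key
        product_name   TEXT NOT NULL
    )
    """
)
conn.execute("INSERT INTO product VALUES (1, 'P-100', 'Apples')")   # hypothetical data
conn.execute("INSERT INTO product VALUES (2, 'P-200', 'Oranges')")

def try_insert(connection, row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        connection.execute("INSERT INTO product VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

reused_id = try_insert(conn, (2, "P-300", "Chicken"))      # product_id 2 is already used
reused_number = try_insert(conn, (3, "P-100", "Sugar"))    # product number already used
missing_number = try_insert(conn, (3, None, "Sugar"))      # null in the candidate key
fresh_row = try_insert(conn, (3, "P-300", "Chicken"))      # new id and new number
print(reused_id, reused_number, missing_number, fresh_row)
```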

This column here, the category ID in the products table, is known as a foreign key. A foreign key is a column that links one table to another table, and it usually refers to the primary key of that other table. So in the category table the primary key is the category ID, and it is linked to a foreign key in another table. Notice the two columns have almost the same name; they don't have to. I'll talk about that later when I cover naming conventions. So the foreign key is used to link the tables. You will notice that the values of a foreign key can be repeated, no problem. Let's see how it works.

What does it mean? You can see this product here, which is apples. It has a category ID of 1. What does that mean? What category do apples fall into? You look for a 1 here: you follow the link, which joins these two columns, and you find the value 1 here, then you look at the category name and you see that it's the fruit category. This one here, the chicken, is a 3, so go down here and find the 3, which is the poultry category. Here is a 1 again, which is fruit. And this one here, the sugar, is a 4; go down here and that is the baking category, so that product falls in the baking category. What you'll notice when you finish your design is that each table is basically a noun: a person, place, thing, or event. This is a thing, a product; this is a thing, a category. You'll see that later.

I'll explain this in more detail later, but a foreign key links the tables. The values in the foreign key must exist in the table it links to, which is known as referential integrity. I couldn't put a product in this table with a category ID of five, because it wouldn't make any sense: there's no category with the number five. So these values have to exist in the linked table. What I am also showing here is that this table has a single column as its primary key, and this table called product has a single column as its primary key.
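Referential integrity can be sketched with sqlite3 as follows. This is an illustration under assumptions: the column names are made up, product 2 is assumed to be oranges, and the rejected product "Flour" is hypothetical. Note that SQLite only enforces foreign keys on a connection after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default

conn.execute(
    "CREATE TABLE category (category_id INTEGER PRIMARY KEY, category_name TEXT NOT NULL)"
)
conn.execute(
    """
    CREATE TABLE product (
        product_id   INTEGER PRIMARY KEY,
        product_name TEXT NOT NULL,
        category_id  INTEGER NOT NULL REFERENCES category (category_id)  -- foreign key
    )
    """
)
conn.executemany(
    "INSERT INTO category VALUES (?, ?)",
    [(1, "Fruit"), (3, "Poultry"), (4, "Baking")],
)
# Foreign key values may repeat: two products share category 1.
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [(1, "Apples", 1), (2, "Oranges", 1), (3, "Chicken", 3), (4, "Sugar", 4)],
)

def category_of(connection, product_name):
    """Follow the foreign key from product to category and return the category name."""
    row = connection.execute(
        """
        SELECT c.category_name
        FROM product AS p
        JOIN category AS c ON c.category_id = p.category_id
        WHERE p.product_name = ?
        """,
        (product_name,),
    ).fetchone()
    return row[0] if row else None

def insert_rejected(connection, row):
    """Return True if the database refuses the product row."""
    try:
        connection.execute("INSERT INTO product VALUES (?, ?, ?)", row)
        return False
    except sqlite3.IntegrityError:
        return True

print(category_of(conn, "Chicken"))
# Category 5 does not exist, so this row breaks referential integrity.
bad_category_rejected = insert_rejected(conn, (5, "Flour", 5))
print(bad_category_rejected)
```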

Note that this table has two underlined columns. When a table has two or more columns that make up the primary key, it is called a composite primary key. The table still contains only one primary key, but it is composed of two or more columns, in this case two. The same rules that we had for the primary key still apply; in other words, we need uniqueness. What do I mean by uniqueness when it is a composite primary key? You will notice that the invoice ID is repeated here, and you may even see that the product ID is repeated, but the combination of the two is never repeated.

When you look at a composite primary key, you have to look at the values in all the columns that make up the primary key to determine uniqueness; it is the combination that counts. In this case a 1 and 1 can never be repeated, and a 1 and 4 can never be repeated; no matter how many rows you add, that combination can never appear again. Let's see how this works. This one is what you will later see called a bridge table. I'll explain it later, but let's look at it for a second. This invoice ID, which is 1, links up here: just look for the 1, which is the invoice number, and you get the date of the invoice. There's also a column called customer ID, which is a 5; it is a foreign key to another table called customer, which I am not showing. Through here I can see the invoice number, the invoice date, the customer who bought these products, and which products they bought on this invoice.

Notice that the same invoice ID appears twice, which means there are two line items on this invoice. One product they bought is a 1, which is apples, and they also bought a 4, which is sugar, so they bought apples and sugar on this invoice. This other invoice has three items: product 1, product 2, and product 3, which are the apples, the oranges, and the chicken. You can also see how many they bought of each product, the cost to the store, and the price they were sold for. So this is a composite primary key.
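The uniqueness rule for a composite primary key can be sketched like this. The column names, the second invoice's number (assumed to be 2), and the quantities are assumptions for the example. The key columns are declared NOT NULL explicitly, since a primary key column must always hold a value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE invoice_detail (
        invoice_id INTEGER NOT NULL,
        product_id INTEGER NOT NULL,
        quantity   INTEGER NOT NULL,
        PRIMARY KEY (invoice_id, product_id)  -- one primary key made of two columns
    )
    """
)
# invoice_id repeats and product_id repeats, but no (invoice_id, product_id) pair repeats.
conn.executemany(
    "INSERT INTO invoice_detail VALUES (?, ?, ?)",
    [(1, 1, 3), (1, 4, 1), (2, 1, 5), (2, 2, 2), (2, 3, 1)],  # hypothetical quantities
)

def try_insert(connection, invoice_id, product_id, quantity):
    """Return True if the line item is accepted, False if the key combination already exists."""
    try:
        connection.execute(
            "INSERT INTO invoice_detail VALUES (?, ?, ?)",
            (invoice_id, product_id, quantity),
        )
        return True
    except sqlite3.IntegrityError:
        return False

duplicate_ok = try_insert(conn, 1, 1, 10)        # the pair (1, 1) is already used
new_combination_ok = try_insert(conn, 1, 2, 4)   # 1 and 2 each exist, but not together
print(duplicate_ok, new_combination_ok)
```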

Again, a composite primary key is a primary key that is composed of two or more columns. It is still considered one primary key, and you must look at the combination of all the values to determine uniqueness. Now, when you're designing, you have to figure out the business rules. Business rules are just a description of policies, or of what the customer wants, within an organization or whatever it is the client does. Once you figure out the business rules, they will help you define your entities, which is another name for tables. I forgot to say that before: some people call tables entities, and some people call them relations.

Business rules also tell you what your columns or fields are, your relationship types, and your constraints. I will explain those later. The main source of your business rules is your customer's end users: you talk to people and see what they want. You also look at the reports they have. For example, if you are asked to create an invoicing program and they already print invoices, get a copy of an actual printed invoice, because everything that is on that invoice has to be in your database; otherwise you can't regenerate the same invoice correctly. If they have two shipping addresses on the invoice, you know you need a table to accommodate the two shipping addresses. If it lists delivery methods, you would need a table to handle delivery methods. So whatever reports they have, get a copy of them, and talk to the end users, the people who will use the application, to find out what you need.

I mentioned relationship types earlier. In a relational database there are three relationship types: one-to-one, one-to-many, and many-to-many. I am going to read a description or sample of each one, and then I'll show you how it works. Suppose a retail company requires each of its stores to be managed by a single employee.

In turn, each store manager, who is an employee, manages only one store. The relationship between store and employee is one-to-one. I will say this: if you end up with a one-to-one between two tables, or you think you have, check it, because they are very rare. Most relationships are one-to-many or many-to-many, so if you think you've ended up with a one-to-one, just double check. I'm not saying it doesn't exist, just double check. One-to-many: an example of this is a painter.

A painter paints many paintings, but each painting is painted by only one painter, so the relationship between painter and painting is one-to-many. Many-to-many: a student can take many classes, and each class can have many students, so the relationship between a table called student and a table called class is many-to-many. Now, in a relational database, if you end up with a many-to-many, you have to get rid of it. The final design cannot have any many-to-many relationships in it. You split it into two one-to-many relationships, and I'm going to show you how to do that.
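The one-to-one and one-to-many examples above can be sketched with constraints in sqlite3. This is one possible way to model them, not the only one; all table names, column names, and data are assumptions for illustration. A UNIQUE foreign key keeps a manager from being assigned to a second store, while a plain foreign key lets many paintings point at the same painter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE employee (
        employee_id   INTEGER PRIMARY KEY,
        employee_name TEXT NOT NULL
    );
    -- One-to-one: UNIQUE means an employee can manage at most one store.
    CREATE TABLE store (
        store_id            INTEGER PRIMARY KEY,
        store_name          TEXT NOT NULL,
        manager_employee_id INTEGER NOT NULL UNIQUE REFERENCES employee (employee_id)
    );
    CREATE TABLE painter (
        painter_id   INTEGER PRIMARY KEY,
        painter_name TEXT NOT NULL
    );
    -- One-to-many: the foreign key is not unique, so a painter can have many paintings.
    CREATE TABLE painting (
        painting_id    INTEGER PRIMARY KEY,
        painting_title TEXT NOT NULL,
        painter_id     INTEGER NOT NULL REFERENCES painter (painter_id)
    );
    INSERT INTO employee VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO store VALUES (1, 'Downtown', 1);
    INSERT INTO painter VALUES (1, 'Painter A');
    INSERT INTO painting VALUES (1, 'First Painting', 1);
    """
)

def accepted(connection, sql, params):
    """Return True if the statement succeeds, False if a constraint rejects it."""
    try:
        connection.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

same_manager_twice = accepted(conn, "INSERT INTO store VALUES (?, ?, ?)", (2, "Uptown", 1))
other_manager = accepted(conn, "INSERT INTO store VALUES (?, ?, ?)", (2, "Uptown", 2))
second_painting = accepted(
    conn, "INSERT INTO painting VALUES (?, ?, ?)", (2, "Second Painting", 1)
)
print(same_manager_twice, other_manager, second_painting)
```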

Here's an example. We have our invoices table and a products table. It's very, very important that you get the relationship type right when you're designing a database. If you don't get it right, it will change the design. Let me explain a very simple way to do it. This is not the normalization process; it is a way to do it without normalization, and also a way to double check the final normalized design. So we have an invoice table and a product table, and what you have to do is write two business rules. You always, always have to write two business rules.

I call them two-way business rules, because you create one rule that goes from invoice to product and one rule that goes from product to invoice, in the opposite direction. Let's see what I mean by that, starting with invoice to product. What I want to find out is the type of relationship between these two tables. We know that they are related, because when you create an invoice you have products on it, and products appear on invoices, so they are related in some way; the question is how. I will start with the invoices table and go to the product table, and it is very simple: just make a sentence that starts with the word one.

I usually start with the word each, but I'll start with the word one here, because I told you the relationship types are one-to-one, one-to-many, or many-to-many. So the sentence is: one invoice can have how many products on it? One, or more than one? More than one means many, so the answer is either one or many. If you say one invoice can have only one product, that means on each invoice I can sell only one thing, which is probably not what you want from an invoice.

I can sell a lot of products on the same invoice. Think about a grocery store: you take a whole basket of food with different products in it, and your receipt is the invoice. Or if you go to a store like Home Depot, you go in and buy a lot of things, and each invoice you get has a lot of products on it. So it probably makes more sense that one invoice can have many products. Now you go in the opposite direction: start on the product side, go to the invoice side, and make another sentence starting with the word one again. One product can appear on how many invoices? If you say that one product can appear on only one invoice, that means if I sold apples on this invoice I can never sell apples again; they can't ever appear on any other invoice. So most likely what you want is that one product can appear on many invoices.

You can sell the same product on many different invoices. So the sentence says that one product can appear on many invoices. Now look here: you see many on the product side and many on the invoice side, so the relationship between these two tables is many-to-many. As I told you before, the final design can't have a many-to-many, so we have to get rid of it, and it's very simple. All we do is put a new table between these two tables. We call it a bridge table; some people call it a join table or a link table. So we put a bridge table in the middle, and the minimum set of columns in this table is the primary key of this table and the primary key of that table, so the invoice ID and the product ID. That is the minimum number of columns that should be in it, and the two together usually become the primary key of the bridge table.

Sometimes you add another column to the key, such as a date; this is common, and then all three would form the composite primary key. But right now this works fine, with the invoice ID and product ID combined as the primary key. That's the minimum; you could add other columns to accommodate what you need. Individually, the two columns are foreign keys. Now you see I wrote the type of relationship between the tables, with a 1 on this side and many on this side, and a 1 on this side and many on this side; the infinity sign means many. How do I know which side is which? I always remember where the foreign key is.

That is the many side. The invoice ID is the primary key on this side, so that is the one side; over here it is a foreign key on its own, so this is the many side. Combined, the two columns are the primary key, but individually they are foreign keys, and the many side always goes on the foreign key side. So that's a bridge table, and that's how you get rid of a many-to-many. Now I can accommodate this rule: each invoice can have many products. Look, this invoice has two products and this invoice has three products. And each product can appear on many invoices: look at product 1, which is apples; it appears on two different invoices, this one and this one. So I have accommodated both rules using a bridge table.
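A minimal sketch of the bridge table in sqlite3, showing that it supports both business rules. The table is named invoice_detail here; the column names, the second invoice's number (assumed to be 2), and the choice to leave out dates and quantities are assumptions made to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE invoice (
        invoice_id INTEGER PRIMARY KEY
    );
    CREATE TABLE product (
        product_id   INTEGER PRIMARY KEY,
        product_name TEXT NOT NULL
    );
    -- Bridge table: each column is a foreign key (the many side);
    -- together the two columns form the composite primary key.
    CREATE TABLE invoice_detail (
        invoice_id INTEGER NOT NULL REFERENCES invoice (invoice_id),
        product_id INTEGER NOT NULL REFERENCES product (product_id),
        PRIMARY KEY (invoice_id, product_id)
    );
    INSERT INTO invoice VALUES (1), (2);
    INSERT INTO product VALUES (1, 'Apples'), (2, 'Oranges'), (3, 'Chicken'), (4, 'Sugar');
    INSERT INTO invoice_detail VALUES (1, 1), (1, 4), (2, 1), (2, 2), (2, 3);
    """
)

def products_on_invoice(connection, invoice_id):
    """Rule 1: one invoice can have many products."""
    rows = connection.execute(
        """
        SELECT p.product_name
        FROM invoice_detail AS d
        JOIN product AS p ON p.product_id = d.product_id
        WHERE d.invoice_id = ?
        ORDER BY p.product_id
        """,
        (invoice_id,),
    ).fetchall()
    return [name for (name,) in rows]

def invoices_for_product(connection, product_id):
    """Rule 2: one product can appear on many invoices."""
    rows = connection.execute(
        "SELECT invoice_id FROM invoice_detail WHERE product_id = ? ORDER BY invoice_id",
        (product_id,),
    ).fetchall()
    return [invoice_id for (invoice_id,) in rows]

print(products_on_invoice(conn, 1))
print(products_on_invoice(conn, 2))
print(invoices_for_product(conn, 1))
```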

That's what the bridge table does, and I give it a name. I just call it invoice detail; I usually call a bridge table something-detail. Now let's look at naming conventions. Just like in any programming language, you should get used to using some type of naming convention when designing your databases. For object names, say the object is a table: some people prefix the table name with something like TBL to indicate a table. I don't; I'll just call it product. You should also make the table name singular and not plural.

Don't add an S to table names. Nothing says you can't, but usually people won't put an S; they won't pluralize it. For object names, like your database name and your table and column names, don't use reserved words. What are reserved words? They are words that denote some type of SQL command, function, or procedure, so words that are used in SQL itself. You wouldn't name a table select or a column select, because select is a keyword, a statement in SQL. You probably wouldn't call a column date either, because date is a reserved function or procedure name in most database systems. So you should be careful not to use reserved words, and they depend on the relational database system you use to build your database: you can use Oracle, you can use MySQL, you can use SQL Server,

Postgres, whatever the case may be; they have different procedures and different names for them. So are you going to memorize all the reserved words? I don't. What I do to avoid this problem is prefix all my column names with the table name. If I have a table called invoice and I need a date column in that table, I'll actually call it invoice date: invoice is the table name and date is the column name. So if I have an employee table and I want their first and last name,

I don't just make two columns called first name and last name. I'll call them employee first name and employee last name. I use the table name as my prefix, and then I never have problems with reserved words. It's also not a good idea to use spaces in your object names. Say you have a column called student, space, first, space, name. That's perfectly valid and you won't get an error when you create it. The problem comes when you are programming or writing your SQL statements: you have to remember to put the column name in quotes, like this: quote, student first name, quote. If you forget the quotes you will get an error message. To avoid having to remember that and type more, you can use underscores, or do what I do and just shorten it to something like StudentNameF.

I know that means the student's first name, and the last-name column would follow the same pattern. So figure out what you want to use. I just recommend that you don't use spaces; it gets cumbersome. Use only alphanumeric characters, 0 through 9 and A through Z (zee or zed). You can use other characters, but I would stick to standard characters. Also, always start a database, table, or column name with a letter and not a number. You can't start an object name like a table or column with a number; it has to be a letter. You can have a number in the name, but it can't be the first character. So try to remember these naming conventions.
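The naming problems above can be tried out in sqlite3. This sketch only shows SQLite's behaviour; other database systems have their own reserved-word lists and quoting rules. The table and column names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A column name with spaces is allowed, but only if it is quoted.
conn.execute('CREATE TABLE student ("student first name" TEXT, student_last_name TEXT)')
conn.execute("INSERT INTO student VALUES ('Jane', 'Doe')")

def statement_runs(connection, sql):
    """Return True if SQLite accepts the statement, False if it reports an error."""
    try:
        connection.execute(sql).fetchall()
        return True
    except sqlite3.OperationalError:
        return False

checks = {
    # Forgetting the quotes around a name with spaces is an error.
    "spaces, unquoted": statement_runs(conn, "SELECT student first name FROM student"),
    "spaces, quoted": statement_runs(conn, 'SELECT "student first name" FROM student'),
    # Underscores avoid the quoting problem entirely.
    "underscores": statement_runs(conn, "SELECT student_last_name FROM student"),
    # SELECT is a reserved word, so it cannot be used as a bare table name.
    "reserved word": statement_runs(conn, "CREATE TABLE select (select_id INTEGER)"),
    # An unquoted name cannot start with a digit.
    "leading digit": statement_runs(conn, "CREATE TABLE 1student (student_id INTEGER)"),
}
for label, ok in checks.items():
    print(label, "->", "accepted" if ok else "error")
```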

Now let's get into normalization, which is why we came here. We will divide it into three normal forms only: first normal form, second normal form, and third normal form. The definition of first normal form: each column value must be a single value; in other words, it has to be atomic. Each intersection of a column and a row can contain only a single value; that's what we mean by atomic. For example, if I have a student's name, technically that is a first name and a last name, so those are two different values. Normally you would split it into a first name column and a last name column. If you have a column called author, you don't store the names of two authors in the same intersection of column and row; you would have to split that up. You can't have two values in the same column of the same row. Now take a date, for example a student's date of birth, which consists of a month, day, and year. In relational databases we consider that perfectly valid; it is treated as a single value, the student's date of birth.

The dictionary definition of atomic is not divisible, and a date is technically divisible, so this is not the dictionary version of atomic. We don't divide the date into month, day, and year; we just keep it as the student's date of birth, and it holds the month, day, and year in the same row and column intersection. So don't mix those up. You also have to remove all repeating groups. That's hard to explain in words and easier to show, so I'll show you after explaining these definitions. All the values in a column should be of the same type: if they are students' dates of birth, they all have to be students' dates of birth and cannot be anything else. If they are company names, they all have to be company names; they can't be something like a person's name, unless the company name is a person's name. That's what that means. Each column name must be unique in a table.

I mentioned above that column names can be repeated in different tables, but not in the same table. No two rows in a table can be identical, which means no duplication; you cannot have duplicate rows in a relational database. Second normal form: the table must meet all the requirements of first normal form, so everything above is true, and you remove all the partial dependencies. That is, all non-key attributes depend on the entire primary key; in other words, all columns that are not part of the primary key depend on the whole primary key.

You will see that when we do the normalization. You can only get partial dependencies if you have a composite primary key; if you have only one column as the primary key, you cannot have partial dependencies. Third normal form: a table must meet all the requirements of second normal form and cannot contain transitive dependencies, so at this stage you remove your transitive dependencies. The rule is that all non-key columns, meaning all columns that are not part of the primary key, should depend only on the primary key and not on any of the other columns.
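To make the two kinds of dependency more concrete, here is a small sketch in plain Python. The helper determines and the sample rows are assumptions for illustration. It checks whether, in some sample rows, one set of columns always maps to a single value of another column. Sample data can only show that a dependency fails to hold; whether one really exists is decided by the business rules, so treat this as a way to spot suspects, not as a proof.

```python
def determines(rows, lhs, rhs):
    """Return True if every combination of the lhs columns maps to exactly one rhs value."""
    seen = {}
    for row in rows:
        key = tuple(row[column] for column in lhs)
        if key in seen and seen[key] != row[rhs]:
            return False
        seen[key] = row[rhs]
    return True

# An unnormalized invoice line table with composite key (invoice_id, product_id).
invoice_lines = [
    {"invoice_id": 1, "product_id": 1, "product_name": "Apples", "quantity": 3},
    {"invoice_id": 1, "product_id": 4, "product_name": "Sugar", "quantity": 1},
    {"invoice_id": 2, "product_id": 1, "product_name": "Apples", "quantity": 5},
    {"invoice_id": 2, "product_id": 2, "product_name": "Oranges", "quantity": 2},
    {"invoice_id": 2, "product_id": 3, "product_name": "Chicken", "quantity": 1},
]

# Partial dependency (a second normal form problem): product_name depends on
# product_id alone, which is only part of the primary key.
partial = determines(invoice_lines, ["product_id"], "product_name")

# quantity needs the whole key: neither part determines it on its own.
quantity_by_product = determines(invoice_lines, ["product_id"], "quantity")
quantity_by_invoice = determines(invoice_lines, ["invoice_id"], "quantity")
quantity_by_key = determines(invoice_lines, ["invoice_id", "product_id"], "quantity")

# A product table with primary key product_id that also stores the category name.
products = [
    {"product_id": 1, "category_id": 1, "category_name": "Fruit"},
    {"product_id": 2, "category_id": 1, "category_name": "Fruit"},
    {"product_id": 3, "category_id": 3, "category_name": "Poultry"},
    {"product_id": 4, "category_id": 4, "category_name": "Baking"},
]

# Transitive dependency (a third normal form problem): category_name depends on
# category_id, which is a non-key column.
transitive = determines(products, ["category_id"], "category_name")

print(partial, quantity_by_product, quantity_by_invoice, quantity_by_key, transitive)
```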

You will see that too. Those are the definitions of the three normal forms. Here I am showing you what people consider a repeating group. I have a simple table like this, with two columns for the author name, plus the book title and category. Look at this: Suzanne Collins wrote a book called The Hunger Games. Dan Brown wrote The Lost Symbol, and Dan Brown also wrote The Da Vinci Code. But here I have Dorothy Sayers and Robert Eustace, who wrote The Documents in the Case, and down here Malala and Christina wrote I Am Malala. So I have two authors for this book and two authors for that book. The problem with this design is this: say you gave it to a client, and you wrote the application, the front end, on top of the back end, the actual database that stores all the data. Then they come to you and say, you know, now we have a book that has three authors. What do you have to do?

You have to add a new column to this table to accommodate the third author. That means you have to redesign your database and change your front-end application, your reports, and your data entry screens to accommodate the third author, which is a lot of work. Then you fix that, and they come back and say, now we have a book with four authors, so you have to redesign everything again and add another column. The problem is that you are designing the table horizontally. You never design a table horizontally; a table should be designed vertically, which means you only have to add rows, or records, to the table to suit whatever situation the client has. You shouldn't have to go back and redesign your database and your application every time the client wants something like that. So this is what I call a repeating group. The other way to look at it is this: what if I just make one column for the author, like this?

I call it author, and now for this book here, The Documents in the Case, I have two authors, see? Dorothy Sayers and Robert Eustace. I separate them with a comma, or maybe a semicolon; I'll use something to separate them. Well, now you have broken the first normal form rule that says each column can contain only one value, that is, be atomic. You have two different values in this column, so you can't do that either. So you can look at a repeating group this way, with multiple columns, or this way, which some people also call a repeating group.
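One common way to lay out the book and author data vertically is sketched below in sqlite3. This is an assumption about the fix, modelled on the invoice detail bridge table from earlier; the table names, column names, and the third author are hypothetical. Adding another author to a book becomes a new row, with no change to the table design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE book (
        book_id    INTEGER PRIMARY KEY,
        book_title TEXT NOT NULL
    );
    CREATE TABLE author (
        author_id   INTEGER PRIMARY KEY,
        author_name TEXT NOT NULL
    );
    -- One row per (book, author) pair instead of author_1, author_2, ... columns.
    CREATE TABLE book_author (
        book_id   INTEGER NOT NULL REFERENCES book (book_id),
        author_id INTEGER NOT NULL REFERENCES author (author_id),
        PRIMARY KEY (book_id, author_id)
    );
    INSERT INTO book VALUES (1, 'The Da Vinci Code'), (2, 'The Documents in the Case');
    INSERT INTO author VALUES (1, 'Dan Brown'), (2, 'Dorothy Sayers'), (3, 'Robert Eustace');
    INSERT INTO book_author VALUES (1, 1), (2, 2), (2, 3);
    """
)

def authors_of(connection, book_title):
    """Return the authors of a book, one atomic name per row."""
    rows = connection.execute(
        """
        SELECT a.author_name
        FROM book AS b
        JOIN book_author AS ba ON ba.book_id = b.book_id
        JOIN author AS a ON a.author_id = ba.author_id
        WHERE b.book_title = ?
        ORDER BY a.author_id
        """,
        (book_title,),
    ).fetchall()
    return [name for (name,) in rows]

def add_author(connection, book_id, author_name):
    """Add one more author to a book by inserting rows, not columns."""
    cursor = connection.execute(
        "INSERT INTO author (author_name) VALUES (?)", (author_name,)
    )
    connection.execute(
        "INSERT INTO book_author VALUES (?, ?)", (book_id, cursor.lastrowid)
    )

print(authors_of(conn, "The Documents in the Case"))
add_author(conn, 2, "Hypothetical Third Author")  # made-up name for the example
print(authors_of(conn, "The Documents in the Case"))
```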

I see this one as non-atomic.

[...] So I like my primary keys to be just integers, numbers that don't have decimals. That's how I like mine, but the above is perfectly valid. I would just make this slight change at the end: notice that what I did was make the guest name column atomic, putting the first and last name in two separate columns. The reason I do it is that if you're looking for all the people who have the last name Doe, it's easier to search this column than one where the names are combined.

If the name is Jane Doe and you know both parts, you can type both and search without a problem. But if you are looking for something that has to do only with the last name, how do you find the last name? Is it the second word, or the third? Do they have a middle initial? It's hard to tell, so it's always helpful to break the names down; that's why we want them atomic. That's the final design, and that was the normalization. I hope you enjoyed it or learned some tips. If you like this video, please like it and subscribe to the channel, and if you want to see new videos when they are posted, click the little bell icon to get notified.

Everyone have a great day and take care of yourselves.
