YOLO Basic Introduction. | You Only Look Once. | Object Detection.

Apr 07, 2024

Hello everyone and welcome to Tatum learning. On this channel we talk about data and everything related to it. Today's video will be an introduction to the YOLO algorithm. I'm going to give you a very simple overview of how YOLO works and what it does. In the next few videos we will delve into this particular concept, but for this video I just want all the beginners to understand what YOLO is and how it performs detection and recognition, so let's get started.

In this particular lecture, first we will have a simple explanation of YOLO. Second, we will see the architecture very briefly, just one slide for you to understand what the architecture is and what the layers are. Just know the layers.

Okay, and the third one will be the explanation of the loss function. You'll see that the loss function in YOLO is actually very intuitive and very easy to understand, and that's why it's interesting. The paper is "You Only Look Once: Unified, Real-Time Object Detection", and these are the authors.

Okay, so what happens in YOLO? You give it an image, and it feeds that image to a deep neural network, actually a CNN, but we'll look at that in the architecture. Then the algorithm is supposed to predict eight values.


What do these eight values do? They basically give you the detected location and the detected object: the coordinates of the box and what the box contains. In this case the box is the gold one, and the box contains this. So this is what YOLO does: you give an image to the neural network, it predicts eight values, and these eight values carry the information about where the object is and which object it is. Okay, so far you saw that you had a deep neural network, and the network was supposed to predict eight values. So what are these eight values?

Actually, you can see here first something called a confidence score; I'll explain what this confidence score is. Then X and Y, which are basically the positions, and W and H. Together, X, Y, W and H give the coordinates of this box. And these are the probabilities that the object falls into some particular class. The probability of the face will be high because it is a face, and the probability of all the others will be lower. So this is what these eight values tell us. As I told you, P(object) × IOU is the confidence score, where IOU is intersection over union.
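To make the eight values concrete, here is a minimal sketch of how such an output could be unpacked. The ordering of the values, the class order, and the names `Prediction` and `decode` are assumptions for illustration only; the video does not fix a layout.

```python
from dataclasses import dataclass

# Hypothetical class order, based on the three classes named in the video.
CLASSES = ("face", "football", "bike")


@dataclass
class Prediction:
    confidence: float   # P(object) x IOU
    x: float            # box center, relative to the grid cell
    y: float
    w: float            # box width and height
    h: float
    class_probs: dict   # class name -> probability of that class


def decode(values):
    """Split a flat list of eight numbers into named fields (assumed order)."""
    if len(values) != 8:
        raise ValueError("expected exactly eight values")
    confidence, x, y, w, h, *probs = values
    return Prediction(confidence, x, y, w, h, dict(zip(CLASSES, probs)))


# Made-up numbers: a confident detection whose most likely class is "face".
pred = decode([0.9, 0.5, 0.6, 0.4, 0.7, 0.85, 0.05, 0.10])
best = max(pred.class_probs, key=pred.class_probs.get)
```

Here `best` picks whichever class has the highest probability, which is all that "the probability of the face will be high" needs to mean at this stage.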

I think it's one of the most important things in this whole YOLO paper. So what is intersection over union? So you were given this image, right? You only had this picture and this face in the middle. Then you came up with a black bounding box. What I did was split the entire image into nine cells, and this gold box is what you wanted, because you wanted the face to fit exactly in the box with almost equal margins on all sides. But this black box is the bounding box that the network predicts, and you can see that it is not as accurate as the gold one because it extends too far into the top left corner. So the black box is what you got from the network and the gold box is what you wanted.

Now, intersection over union. First is the predicted box region, represented by black, and second is the ground truth region, represented by gold. The IOU is equal to the area of intersection over the area of union. The intersection is the region common to both boxes, so this particular region; this part is not included and this part is also not included. That is divided by the union area, which means all the parts that are in either bounding box. So IOU is intersection area divided by union area.
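The intersection-over-union calculation described above can be sketched in a few lines. The box format `(x_min, y_min, x_max, y_max)` and the example coordinates are assumptions for illustration; they are not taken from the slide.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max); this format is an assumption.
    """
    # Corners of the overlapping rectangle, if the boxes overlap at all.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp at zero so boxes that do not overlap give an empty intersection.
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    intersection = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    # The common region is counted once in the union, not twice.
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


ground_truth = (2.0, 2.0, 6.0, 6.0)   # the "gold" box
predicted = (1.0, 1.0, 5.0, 5.0)      # the "black" box, shifted to the top left
score = iou(predicted, ground_truth)  # intersection 9, union 23
```

A perfect prediction gives an IOU of 1, and a prediction that misses the object entirely gives 0, so a higher IOU means the black box matches the gold box more closely.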

I hope this is clear. Now, what is the objective of this confidence score? The goal is for the confidence score to be high, or in other words, P(object) × IOU should be high. Why, and when it will be high, is something I explain in depth in the next video, which is coming maybe tomorrow or the day after. In that video you will see how we maximize the IOU score, or rather the confidence score. So we're done with the first part.
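As a small sketch of the confidence score defined above, the two functions below compute P(object) × IOU and the value the score is meant to reach. The function names are made up here, and the target convention (IOU when the cell contains an object, zero otherwise) is how the YOLO paper describes it as far as I recall, not something stated in this video.

```python
def confidence(p_object, iou_value):
    """Confidence score as described in the video: P(object) x IOU."""
    return p_object * iou_value


def confidence_target(object_in_cell, iou_value):
    """Value the predicted confidence should match (assumed convention).

    If the cell contains an object the target is the IOU between the
    predicted box and the ground truth box; otherwise it is zero.
    """
    return iou_value if object_in_cell else 0.0


with_object = confidence(1.0, 0.8)            # object present, decent box
without_object = confidence(0.0, 0.8)         # no object, so no confidence
target_empty = confidence_target(False, 0.8)  # empty cell should score zero
```

So the score is high only when there really is an object and the predicted box overlaps it well.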

I'm not really explaining much because I just want you to know what each and every part is, not how they are calculated, because that's a little complicated and we'll have it in the next video. So we had this picture, this was the bounding box, and these were the grid cells. Now I have just taken out the bounding box, which is this, and this particular grid cell, because this grid cell has the object. The object also extends into other grid cells, but this grid cell is the one that is supposed to find the object. Now what is this golden circle? It is actually the center of this bounding box.

Now I know it's not exactly in the center. The center of this bounding box should have been at the eye, but the problem is that if I put it there, everything will get messed up, so bear with me: this is the center of this bounding box. Okay, now what I've done is removed the bounding box but kept the center. Now this X is the distance from this particular edge. Well, sorry, not this edge, this particular corner, if the grid cell length is assumed to be 1. Okay, this was the grid cell, right? So we kept the bounding box's center here and just found the distances with respect to the grid cell. W and H will be explained in the next video.
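Here is a minimal sketch of measuring the box center relative to its grid cell, with the cell side treated as length 1. It assumes a 3 × 3 grid (the nine cells from earlier), pixel coordinates, and that distances are measured from the top-left corner of the cell; the video only says "this particular corner", so that choice and the name `center_to_cell` are assumptions.

```python
def center_to_cell(cx, cy, img_w, img_h, grid=3):
    """Locate a box center inside a grid and express it in cell units.

    Returns (row, col, x, y), where x and y are the distances from the
    top-left corner of the cell (assumed), each between 0 and 1.
    """
    cell_w = img_w / grid
    cell_h = img_h / grid

    # Which cell holds the center; clamp so a point on the far edge
    # still falls in the last cell.
    col = min(int(cx // cell_w), grid - 1)
    row = min(int(cy // cell_h), grid - 1)

    # Offset inside that cell, measured in units of one cell side.
    x = cx / cell_w - col
    y = cy / cell_h - row
    return row, col, x, y


# Made-up example: a 300 x 300 image with the box center at pixel (150, 180).
row, col, x, y = center_to_cell(150, 180, 300, 300)
```

In this example the center lands in the middle cell of the grid, halfway across it and most of the way down, which is the kind of X and Y the network is asked to predict.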

W and H are also found in a similar way. Now let's look at what these probabilities are. The probability of face is equal to the probability of face given object. Some of you may not be familiar with this type of probability. It basically means: if there is an object in the bounding box, what is the probability that the object is a face? In other words, if this event has occurred, what is the probability of that one happening? In very layman and simple terms, I think it is called conditional probability. Anyway, we will see it in the next video. So this is the probability that the object is a face, and you will also have a probability for football and for bike.
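One way to see how the conditional class probabilities and the confidence score fit together is to multiply them, which gives a score per class. As far as I recall this is how the YOLO paper combines them at test time, but the video does not show it, so treat the function name `class_scores` and the numbers below as assumptions.

```python
def class_scores(confidence, conditional_probs):
    """Combine P(class | object) with the box confidence P(object) x IOU.

    The product is P(class) x IOU: how likely the class is, scaled by
    how well the box is expected to fit.
    """
    return {name: p * confidence for name, p in conditional_probs.items()}


# Made-up numbers: given that there is an object, it is most likely a face.
scores = class_scores(0.8, {"face": 0.9, "football": 0.04, "bike": 0.06})
top_class = max(scores, key=scores.get)
```

The conditional probabilities only answer "which class, assuming there is an object", so they need the confidence score next to them to say whether there is an object at all.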

Now this is a face, so obviously the probability of face will be higher and the probability of football and bike will be lower. Now, I am not very good at the architecture, but it is useful to know that the architecture consists of several convolutional layers and two fully connected layers, and these two are responsible for giving the eight values as the output.

Okay, now we'll look at the loss function. The reason I've included it is that the loss function is really interesting, and it's interesting because it's basically based on the sum of squared errors, which means this is a regression problem. It may involve other kinds of problems as well.

But mainly, it gives us the intuition that this particular problem is a regression problem, and yes, YOLO actually has a linear activation function in its final output layer. But I'll explain all that in the next video. Over here, the components of the loss function are, number one, the classification loss, and second, the localization loss, basically a localization error. Why do we have these two parts? It makes sense: you are classifying between face probability, bike probability and football probability, so that part is a classification, while finding X, Y, W and H is actually a regression problem because you are trying to find real values. And there are multiple bounding boxes, yeah, so there will be multiple bounding boxes, and these were the eight values that we saw. Basically, this is the main thing.
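To show what "sum of squared errors" means for these eight values, here is a simplified sketch of the loss for a single grid cell that contains an object and predicts one box. The value order, the function names, the weight `lambda_coord = 5.0` and the square roots on W and H follow the YOLO paper as far as I recall and are assumptions here; the full loss also has terms for cells without objects, which are left out.

```python
import math


def sse(pred, target):
    """Sum of squared errors between two equally long sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))


def cell_loss(pred, target, lambda_coord=5.0):
    """Simplified loss for one cell that contains an object.

    pred and target are [confidence, x, y, w, h, p_face, p_football, p_bike]
    (assumed order). W and H are assumed to be non-negative.
    """
    # Localization part: errors in the box position and size.
    xy_loss = sse(pred[1:3], target[1:3])
    wh_loss = sse([math.sqrt(v) for v in pred[3:5]],
                  [math.sqrt(v) for v in target[3:5]])

    # Confidence part and classification part, also plain squared errors.
    conf_loss = (pred[0] - target[0]) ** 2
    class_loss = sse(pred[5:], target[5:])

    return lambda_coord * (xy_loss + wh_loss) + conf_loss + class_loss


target = [1.0, 0.5, 0.5, 0.4, 0.6, 1.0, 0.0, 0.0]
perfect = cell_loss(target, target)
# Same prediction, but X is off by 0.1, so only the localization term grows.
shifted = cell_loss([1.0, 0.6, 0.5, 0.4, 0.6, 1.0, 0.0, 0.0], target)
```

Even the class probabilities are scored with squared error here, which is why the whole thing can be read as one regression problem.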

Well, predicting these values and changing these values is actually the core of YOLO. Well, if you have reached the end, there are some things I want to tell you about this series. First, this will be a series, and this particular video is the introductory video. The next video will be a detailed explanation of YOLO, and then in the following videos I will show you a way to implement it in just five or six lines of Python code. I don't know the exact number of lines, but yes, fewer than ten. I think in just ten lines you should be able to do it, or rather, just five lines.

You will get to know how to implement the YOLO algorithm using an API; I'll tell you what it is. Secondly, and this is really interesting, I also plan to implement YOLO from scratch in Python, because you will learn more from that. Concepts like non-maximum suppression haven't been included here, meaning I haven't covered them yet, and all of that will also be required to implement YOLO. So yeah, and the best thing is you just need to sit back and relax, because I plan to make a full YOLO framework. Okay, so you will have a lot of things to learn, and I will try to keep all those videos short, but no promises. So yeah, we are doing it, and we have set, how to say, not a request, but a goal of 10 likes for this video.

I don't think it's too much. My maximum is 4, and this is the fourth video we have. I'm not asking for much: 10 is my goal, and it's more than I have ever achieved. Basically, if you like my video, just hit like; if not, it doesn't matter. And yes, I am aiming for 10 likes for this video, and it is really very motivating when I see those numbers. And lastly, please give me suggestions. This particular video was actually something someone asked me to do, but it isn't a requested video as such, because requested videos are basically one video in length.

This will be a complete series. However, if you like, request anything from me, or rather tell me something to do; if it is interesting I will definitely do it. Okay, thanks. Go to my description: I've written a lot of stuff there, like some hints, and you might have some questions, and those will be answered there, both related to the video and related to my channel. Okay, thank you.
