
Gaussian Mixture Models

Jun 04, 2021
Hello, my name is Luis Serrano, and this video is about Gaussian mixture models. Gaussian mixture models are very useful clustering models. Clustering has many applications, including audio classification, where it is used to tell sounds apart, for example to distinguish the different instruments in a song, or to separate your voice from background noise when you talk to your voice assistant. Another application is document classification: if you have a large collection of documents, you can use clustering to separate them by topic, for example sports, science, politics, and so on. Another useful application is image segmentation, for example when self-driving cars see an image and want to separate it into pedestrians, traffic signs, other cars, and so on. The clustering problem is as follows: we have a set of data points, and with luck they can look like this.
They are grouped into very well-defined clusters like these, and we proceed to group them: in this case we see the green cluster, the blue cluster, and the red cluster. There are some very useful clustering algorithms, such as k-means clustering and hierarchical clustering, which I explain in detail in another video on my channel; you can find the link in the comments. In that other video you can learn how to cluster data sets like this one. However, sometimes data sets are more complicated, like this one: it looks like there are two groups, this circle over here and this line, or elongated oval, over here, and the groups seem to intersect each other, which makes the problem very difficult for traditional clustering algorithms. Therefore, we need something different.
Notice one thing: in the previous clustering algorithms, each point belongs to exactly one group, whether yellow or blue. However, we can also think of points as belonging to both groups at the same time. This is going from hard assignments to soft assignments, from hard clustering to soft clustering. In a hard clustering, each point belongs to one particular group and only that one; in a soft clustering, a point can be 10 percent in one group, 25 percent in another, and so on. So in this case we have some points that are mostly blue, some that are mostly yellow, and some that belong to both clusters. The algorithm we will learn today is a very powerful and very popular soft clustering algorithm that does exactly this: the Gaussian mixture model. Why is it called a Gaussian mixture model? Because it is based on the Gaussian, or normal, distribution. This one here is a Gaussian distribution in two variables, and this is another one. They both look like bumps under a really big blanket, but since we are drawing everything in the plane, we will draw them from above.
What we are going to do is draw the shadow: these purple curves here are the level curves of the bump, so whenever you see these concentric ovals, just imagine a little mountain rising out of the screen with its peak toward you. Gaussian distributions are defined in any number of dimensions, and so is our data, but for clarity in this video I will mostly use two-dimensional data in the plane, so the Gaussians look three-dimensional over the plane, and, as I said, we just look at their shadow, or projection. Now we are ready to get into the clustering algorithm of the Gaussian mixture model, which has two main steps. The first is to color the points according to the Gaussians. How do we do this?
It works as follows. Let's say we have two Gaussians, a yellow one and a blue one. Each point in the plane can be colored with a combination of yellow and blue according to these two Gaussians. Before going into details, let's draw some points here. Notice that the leftmost point lives mostly on the yellow mountain, so we color it yellow, or mostly yellow. As we move to the right, the points become more half and half; this point here is half blue and half yellow, since it lives in the middle, and as we keep moving right the points get more and more blue, until we reach the point on the right that lives mainly on the blue mountain, so it is colored almost entirely blue. We can do this for every point in the plane: points close to the yellow mountain become mostly yellow, points close to the blue mountain become mostly blue, and the points in between get some combination, some ratio between yellow and blue. But how do we do this more exactly?
We do it in the following way, and for illustrative purposes I will do it in one dimension; you can imagine this as looking at the mountains from the side. So now we have a line instead of a plane, and two Gaussian distributions in one dimension: the yellow one, called f of x, and the blue one, called g of x. Now let's say we have three points on the line, and for each one we look at the heights of the yellow and blue curves at that point and take their proportion. For the first point, the yellow height is much greater than the blue height, so this point is, say, 75 percent yellow and 25 percent blue. The point in the middle is 50-50, because it sits where the blue and yellow heights are equal, so it is exactly half yellow and half blue. The last point is mostly blue, because the yellow curve is very low there and the blue curve is much higher, so this one is, say, 95 percent blue and 5 percent yellow. Imagine doing that for each point: we color every point on the line, and we can do the same in two or more dimensions; you are still coloring points in the plane according to the heights of each of the mountains on top
of that point. That was step one: how to color points according to the Gaussians. Now we go to step two, which is the last step and the opposite: given some points, how do we find the Gaussian? We do it in the following way. Let's say these are our points; the idea is to find the Gaussian that lifts them the highest. If our points look like this, then the Gaussian would be this one, and in Flatland, which is how we are seeing it from above, it looks like this. Here is how we find that Gaussian. I explain this in much more detail, with real numbers and formulas, in my covariance matrix video; you can find the link here or in the description, but here I will give you the essence. The first thing we have to do is find the center of mass of these points. Imagine that these points were weights: where would you put your finger to balance all of them? Well, you would put it somewhere around here, at the mean, or average, which is called mu.
The way you calculate it is: you take the x coordinates and average them, and that is the x coordinate of the mean; then you take the y coordinates and average them, and that is the y coordinate of the mean. This gives us good information about the data set, but we need more. We need to know the x variance, which is the horizontal spread: are these points all close together in the horizontal direction, or are they far apart? This is a single number. Similarly, we have the y variance, which tells us the spread in the vertical direction. That still doesn't tell us everything about the data set, because we don't know whether it looks like a rectangle, a line, an oval, or a circle, or what direction it points in, so we have something called the covariance. The covariance tells us a little more about where the data is pointing. Since this data is in two dimensions, the covariance matrix will be a two-by-two matrix; if your data were 100-dimensional, it would be a 100-by-100 matrix, so this is general. The covariance matrix looks like this: on the diagonal we have the variances, the x variance and the y variance, and off the diagonal we have the corresponding covariances. To get the Gaussian, we just plug into this formula, so don't worry too much about it; just know that mu goes into the formula, sigma goes into the formula, and x, the point, goes into the formula. For each point x you plug in, you get a number, and that number is the height of the mountain at that point, so if you plot this formula at every point, it looks like a bump on an infinite blanket, which is the Gaussian distribution. This formula can look a bit strange, but it may seem more familiar if you have seen Gaussian or normal distributions in one dimension.
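To make the "height of the mountain" idea concrete, here is a minimal sketch of how that formula could be evaluated in code; the function name and the example values are my own, not from the video, and I assume NumPy is available:

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Height of the Gaussian 'mountain' at the point x.

    x, mu: length-d arrays; sigma: d-by-d covariance matrix.
    """
    d = len(mu)
    diff = x - mu
    norm = 1.0 / np.sqrt((2 * np.pi) ** d * np.linalg.det(sigma))
    return norm * np.exp(-0.5 * diff @ np.linalg.inv(sigma) @ diff)

# A 2-D Gaussian centered at the origin, unit variances, zero covariance
mu = np.array([0.0, 0.0])
sigma = np.array([[1.0, 0.0],
                  [0.0, 1.0]])
print(gaussian_pdf(np.array([0.0, 0.0]), mu, sigma))  # the peak: 1/(2*pi), about 0.159
print(gaussian_pdf(np.array([2.0, 0.0]), mu, sigma))  # lower down the mountain
```

The diagonal entries of `sigma` are the x and y variances, and the off-diagonal entries are the covariances, exactly as described above.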
This is the one-dimensional version. However, we are still missing one thing. So far our points are whole points, but we may have a data set like this, where some points appear as a fraction: some points at the bottom are half points, and this point on the left is a quarter point, so you can have 10 percent of a point, or 97 percent of a point, in any proportion. For those data sets you can still find a Gaussian that fits them well. You can still find the center of mass; this time it will not be in the same place as before, because there are some heavy points on the top right pulling on it, so it will be here. We still have an x variance (maybe a different one), we still have a y variance, and we still have a covariance, so given these we can build a covariance matrix, plug everything into the formula, and get a Gaussian. The moral of the story is that if we have a bunch of points, or a bunch of proportions of points, percentages of points, we can always fit the best Gaussian: the mountain that lifts these points the highest and places all the other points at the bottom.
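This fitting step can be sketched in a few lines; the function name and the toy points are my own assumptions, not code from the video, and I assume NumPy:

```python
import numpy as np

def fit_gaussian(points, weights):
    """Fit a Gaussian to (possibly fractional) points.

    points: (n, d) array; weights: length-n point proportions
    (1.0 for a whole point, 0.5 for a half point, and so on).
    Returns the weighted mean mu and covariance matrix sigma.
    """
    w = np.asarray(weights, dtype=float)
    total = w.sum()
    mu = (w[:, None] * points).sum(axis=0) / total      # weighted center of mass
    diff = points - mu
    # weighted average of the outer products diff * diff^T
    sigma = (w[:, None, None] * diff[:, :, None] * diff[:, None, :]).sum(axis=0) / total
    return mu, sigma

pts = np.array([[0.0, 0.0], [2.0, 0.0], [2.0, 2.0], [0.0, 2.0]])
mu, sigma = fit_gaussian(pts, [1.0, 1.0, 1.0, 1.0])
print(mu)     # [1. 1.] -- the center of mass of the square
print(sigma)  # diagonal: x and y variances; off-diagonal: covariance

# With a quarter point, the center of mass shifts toward the whole points
mu2, sigma2 = fit_gaussian(pts, [1.0, 1.0, 1.0, 0.25])
```

With equal weights this reduces to the ordinary mean and covariance; the fractional case is what the algorithm below needs.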
We are going to use this step in our algorithm. Now that we have the two steps, we are ready to build the Gaussian mixture model, and here it goes. We will use it to separate this data set into the two groups we saw at the beginning, the circle here and the line, where some points belong mainly to the circle, some belong mainly to the line, and some to both. The algorithm is the following (I wrote it here in the corner): it simply consists of repeating steps one and two many times. Let me show you. First we start with random Gaussians, so let's start with one here and one here: each has a random mean, random variances, and a random covariance. We draw the yellow Gaussian and the blue one.
The next step is to enter a loop. How long will this loop run? We'll figure that out later; for now, run it many times and see what happens. First we go to step one, which is to color the points according to the Gaussians; this is where we color each of our points in the plane based on how close it is to the two Gaussians. Remember, we did this based on heights. When we do that, we notice that these two points here are much closer to the yellow Gaussian, so they belong to the yellow one; it looks like they are completely yellow, but think of them as a high percentage of yellow and a very small percentage of blue.
These points here are similar: they are mostly blue, and all the other points get some combination of yellow and blue. Now, these colors are a little rough because I colored them by eye; if you want to be exact, I recommend coding this up or taking a numerical example and calculating it by hand, but this is mainly for illustration. So we have colored each point in our data set according to these two Gaussians. Now we forget about the Gaussians and go to step two: from the colors, we create new Gaussians. Let's create two new Gaussians that fit the yellow portions and the blue portions. First we look only at the yellow portions and fit a Gaussian (again, this one I eyeballed), so we have this Gaussian for the yellow points; we keep it in mind for a second. Now we look at the blue points, or the blue portions, and fit a Gaussian to those too. Now we forget about the colors: we have two Gaussians, and with them we go back to step one and recolor our points. So you see what is happening: first we color the points, then we fit Gaussians; then we forget about the Gaussians and color the points again; then we forget about the colors and fit Gaussians again, back and forth. Using these two Gaussians we color the points again: these two are mostly yellow, these are mostly blue, and these are some ratio between blue and yellow. Again we go back to step two, forget about the Gaussians, and using the colors we create two new Gaussians: here is the yellow one and here is the blue one. With the new Gaussians we recolor the points: these are mostly blue, these are mostly yellow, and these are a mix. Now we go back to step two, but notice that the algorithm has settled down: it terminates, or at least converges, because it changes very little once it reaches some kind of equilibrium. If we ran it again, we would get the same Gaussians, or very similar ones, and if we ran it once more we would get the same or very similar colors, so we have reached an end point. So we can say that this cycle continues until the algorithm converges: as long as it is not converging, keep going. And that's it: that's the complete algorithm for building a Gaussian mixture model.
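Putting both steps together, the whole loop can be sketched compactly; the data set, the helper names, the fixed iteration count, and the two-cluster setup are all my own assumptions for illustration, not code from the video:

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data set: two overlapping blobs in the plane
data = np.vstack([
    rng.normal([0.0, 0.0], [1.0, 1.0], size=(100, 2)),
    rng.normal([3.0, 0.0], [0.5, 2.0], size=(100, 2)),
])

def pdf(x, mu, sigma):
    """Gaussian density (the mountain's height) at every row of x."""
    d = x.shape[1]
    diff = x - mu
    inv = np.linalg.inv(sigma)
    norm = 1.0 / np.sqrt((2 * np.pi) ** d * np.linalg.det(sigma))
    return norm * np.exp(-0.5 * np.einsum("ni,ij,nj->n", diff, inv, diff))

# Start with two random Gaussians: random means, identity covariances
mus = [data[rng.integers(len(data))].copy() for _ in range(2)]
sigmas = [np.eye(2), np.eye(2)]
proportions = np.array([0.5, 0.5])  # how much of the data each Gaussian claims

for _ in range(50):  # "many times"; a real stopping rule would test convergence
    # Step 1: color the points -- each point gets a yellow/blue proportion
    heights = np.stack([p * pdf(data, m, s)
                        for p, m, s in zip(proportions, mus, sigmas)])
    colors = heights / heights.sum(axis=0)  # columns sum to 1: soft assignments

    # Step 2: forget the Gaussians and refit them from the colors
    for k in range(2):
        w = colors[k]
        total = w.sum()
        mus[k] = (w[:, None] * data).sum(axis=0) / total  # weighted center of mass
        diff = data - mus[k]
        sigmas[k] = (w[:, None, None] * diff[:, :, None] * diff[:, None, :]).sum(axis=0) / total
        proportions[k] = total / len(data)

print("means:", mus[0], mus[1])
```

Step 1 is the height-ratio coloring and step 2 is the weighted Gaussian fit described above; going back and forth between them is the whole algorithm.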
Note that we got the grouping we expected. Now, this algorithm, like many others, is not exact; there was a bit of luck involved with our initial conditions and our data set, so we may not always reach the desired answer. There are many heuristics that help. One is to run the algorithm several times and choose the best of the solutions we obtain; another is to put good restrictions on the original Gaussians' means, variances, and covariances, so that they are random enough but also good enough candidates. So there are many ways to improve this algorithm. Here is a small summary of what we did. We started with our data set and picked two random Gaussians. We used these two Gaussians to color each point in our data set as a combination of yellow and blue, based on its position relative to each of the two Gaussians. Then we forgot about those two Gaussians, because we would find two new ones from the colors we had obtained: based on the colors, we got two new, better Gaussians. We then used these two Gaussians to recolor the entire data set, so we forgot the colors and recolored the points according to the two new Gaussians. We forgot about those two Gaussians again and, using the colors, fit two new, better Gaussians; notice how these are much better than the previous ones. Finally, we used these two Gaussians to recolor our data set. Notice that at each stage we are always doing step one and step two: color the points, recalculate the Gaussians, color the points, recalculate the Gaussians, improving a little each time, until we reach a point where the Gaussians do not move much, and that is where we finish. And that's all, folks; thank you very much for your attention.
I would like to remind you that I have a book called Grokking Machine Learning that I invite you to read. In this book I talk about supervised learning algorithms in detail, in a very conceptual way, using examples, and there is also a lot of code in Python. If you want to find it, go to this website, which is also in the description, and if you use the discount code serrano yt you can get 40% off the book. And if you enjoyed this video, subscribe for more content, like it, share it with your friends, and feel free to comment.
I love reading your comments. In particular, if you have ideas for future videos you would like to see, please let me know in the comments; I often make videos that come out of someone's suggestion. Feel free to tweet at me; this is my Twitter name, Luis likes math. And if you want to see all this information put together, I have a website called serrano.academy where you can find the book, blog posts, videos, podcasts, interviews, and more, so I invite you to check it out. Thank you very much for your attention, and see you in the next video.
