
Economics 421/521 - Econometrics - Winter 2011 - Lecture 1 (HD)

May 01, 2020
This is Economics 421, Econometrics. Unless I'm in the wrong place — one day I actually was, which was a little embarrassing. So let's go through the syllabus first, and then we'll move on to the first lecture. As you can see, I'm the guy you'll want to spend ten weeks with — or not; you'll find out. So again, the syllabus: I'm not going to hand out a paper copy. It can be found through Blackboard, and when you log in through Blackboard — I don't know if your interface looks like mine, but the class web page should appear when you go in. Tell it no when it asks that question; I've only tried it in IE, but this is the way you'll be able to access the syllabus, not today's visuals.
I'll move on to a different version of this, but you can access the syllabus through Blackboard. I'll put grades up there and most things like that, but most of my communication with you will be through this web page, so you'll want to get the address. Go to Blackboard to get the address — it's a typepad dot com, forward slash, economics 421 type of address — but if you go to Google and just type economics 421, this class will appear first in the list, and you'll be able to find it that way too. I also film this class.

I put the videos on YouTube — I'll talk about that in a minute. The last time I did it, they got over 100,000 views and were therefore ranking high on Google, making the class fairly easy to find. So, my name is Mark Thoma. My office is PLC 471: get off the elevator, turn right, and you'll find me pretty easily. My office hours are listed there. They'll be right after this class — it says 1:30, but usually people have questions and things take a little longer than that, so realistically it's 1:35 or 1:40, but I'll stay an hour no matter what. My next class is at three o'clock, right next to my office, so I'll be there the whole time between classes.
Any other time, it's fine to stop by. Honestly, you won't find me there at 8:00 a.m. — I'm a later-morning person — but in the evenings, late afternoon, late morning, I should be there most of the time. There's no guarantee if you just drop by, so the best thing to do is send an email and set up an appointment if you want to talk to me, and I'll do my best to meet with all of you. We're going to have projects in this class, and there are a hundred of you — or there were, last time I checked — and that's going to put a lot of pressure on us, so we'll do our best, and at some points, if it gets overwhelming, I'll try to push some of this onto the GTFs. I selected the GTFs.
I gave this class some good ones, or at least I tried, and I hope we can lean on them some. If I can't meet, I'll do the best I can, but don't be afraid to check with the GTFs, because they should be very helpful. Also, what else should I say? I talked about the website; the course description is on the class web page, and I'll say a lot about that in a few minutes, so I'll skip it. The text is this one, from Dougherty. Honestly, I'm not the biggest fan of this book in the world; I think it's a little simplistic in places.
Okay, we'll fill in the holes. The alternative was to make you buy another book for $150 or whatever they cost these days, and the marginal benefit you'd get from switching books seemed very small, so I decided to stick with this one. I'll follow it fairly closely, but sometimes I'll improve on what's here, because sometimes I don't think it's detailed enough in explaining the intuition, in giving you the things you need to do this right. So we'll use the same book — which you shouldn't have to buy again if you took this last quarter — and we'll fix any deficiencies in class.
What else? Econ 420 is the prerequisite, and you'll want to have it; I think this would be pretty difficult without it. The class has labs. The lab instructors are Holden and Tang. Those two also did this last quarter, and the person who ran the labs last quarter borrowed my notes and my problem sets and all that, so they've already been through all of this once, including all the problem sets, so you're well prepared there. We'll do a lot in the labs: we'll hand out assignments, hand back exams and assignments, go over procedures, all sorts of things, so it's something you'll probably want to attend.
I'm not going to take formal attendance, but the things we do in the lab are, of course, part of the class, so if you want to do well, you'll want to go. There's a midterm and a final. The midterm exam will be on February 3, the Thursday of week 5. I would have liked to do it in week 6, the Monday after the weekend, but that day I'll be at the New York Fed for a conference, so we'll have it on the Thursday. As for the final: I don't make the final schedule. I am NOT a morning person, and Friday is the worst day for finals, but it's Friday at 8 a.m., and I just don't give finals early. So if you have plane tickets, or someone is going to force you to return home, or you need to leave town early, this is not going to be the class for you; you're going to need to find other arrangements. If people are sick, fine — if I ever give a makeup, I give it later, never earlier. The reason is that one person who finds out what's on the final can tell fifty people, and that really contaminates the exam.
If one person takes it late, I have 1/100th of a problem, which is not that big of a problem, so having people take a makeup after the fact is much safer from a security standpoint, and that's my reason for late makeups. So the final is Friday at 8 a.m.; we'll deal with it. There will be an empirical project in this class, and there will also be some assignments along the way. The midterm is 30 percent, the final is a larger share, and the assignments count for part of the grade — I have the numbers in there somewhere. I'm looking at the empirical project making up 15 percent of the grade, and we'll have a lot more to say about that along the way.
What I've found doing this course over the years is that you tend to put the project off until the last week, so what I'm going to do is set benchmarks along the way: you're going to have to do certain things at certain times, and that will keep the project moving for everyone, so in the last week you won't be rushing to get it all done. The last week will still be a crush for all of us, but we'll do our best to have you ready beforehand by having some reference points and things due along the way. Again, I'll have handouts about the project: what we expect, what questions to answer, how long it should be, what the design should be — all those kinds of things will be handed out later, so by the time it's time to start writing, there shouldn't be any uncertainty about what you're supposed to do. In the computer labs we're going to use EViews again, which is fine; everyone should have access to it.
It's been installed since last quarter in the SSIL labs and elsewhere, so I don't think I need to say much about it; you should know how to run it pretty well by now. The assignments are 15 percent of your grade. There's a tentative outline of the course; it's brief. You'll want to familiarize yourself with this web page, because there are some useful things on the right sidebar: my office, my email, the GTF offices, their emails, their office hours. I don't have them all yet, but we'll add them soon. These categories are helpful — for example, if you want the old midterms, you can just click there, and it should show all the old midterms that I've ever given in this class, if you want to see what I've done in the past. You can get those.
Here's winter '09, for example. Oh, shoot — I'm on a 64-bit system and YouTube won't play. Sorry. Okay, we're fine. Every day I upload the materials for each class: what we're going to cover, what we'll cover next time, that kind of thing, and then I'll post the videos of each class through YouTube here. The assignments will be on the website. If you want to know what assignments you're going to get, you can go to the previous class's page; if you want to get ahead of the curve, I'm going to give you the same assignments I gave in the past. There's the project material I was talking about — this is one of the handouts. So all of this will be posted on the class website, and you'll want to get used to looking at it. There are also review materials there: if you want to know what we're going to cover in the class, you can look at the review when the final comes.
I'll post all the topics for the final, so there's a review and all kinds of stuff. Take a look at that sidebar and get used to the information there, because it will be useful to you. If you want a really detailed syllabus, check out the topics for the final from last time; that will tell you exactly everything we're going to cover in this course, in the order we'll cover it. Okay, anything else? There's a link to my blog. I have an economics blog — I just posted to it a moment ago.
If you want to see it, that's cool, but it's not part of the class; it's a lot of my personal opinions. It gets around 20,000 visitors a day, so it's pretty active, and it takes up a lot of my time. There's a link there, and that's all I'm going to say about it. Anything else here? There was something else I wanted to say. Okay, any questions about all that? I forgot to say that you've been very well behaved today. There is one thing I should warn you about: I'm a little demanding in class.
Notes for the exam? No, sorry. I can justify my reasoning for that: I think a lot of intuition can be drawn from the formulas, and a lot can be learned in the process of memorizing them, so I'm not at all a fan of note sheets. I think you learn a lot more, and retain a lot more, without them, and the questions themselves won't depend heavily on formulas, so it will be doable without your notes. On grades: a lot of people take this course, and every quarter about the same number of people seem to get A's, so you'll be fine. There's a curve — there's definitely a curve. In fact, the way I grade is I rank the scores and take a certain percentage for A, B, C. There are some guidelines on the A's and B's.
I draw the A line — we have a guideline in the department. Between those percentages I find a good gap and draw the line there, then draw another, so it's purely by rank order. Whether your GTF grades hard or easy doesn't matter much, because the averages can vary across GTFs by five or ten points as long as the distributions are right; the curve is applied within your own section. Sorry to disappoint you. Okay, question number two: will I write big? The biggest class I ever taught was at UC San Diego, one time with 300 people; I had to use big railroad chalk to write things.
I went in there last night and looked at the board from the back row; it's pretty hard to see from those corners where some of you are, so I'll try to write big, but if I don't write big enough for all of you to see, you'll have to let me know. I'll do my best, but I can forget that I don't have 30 people in my econometrics class, so we'll do our best, and the video should be clear if you don't catch it in class. I'll assume you have basic familiarity with the linear regression model, Y_i = β1 + β2 X2i + β3 X3i + ... + βK XKi + u_i, and that you learned about the assumptions that are necessary for the estimators of that model to be optimal — this property called BLUE, best linear unbiased estimators. We'll get to that in a moment. Basically, for now: you learned about this model, and then you learned how to estimate it. So in the two-variable case, where you just have Y_i = β1 + β2 X2i + u_i, there's a scatter of points, and what you want to do is find the line that best fits those points. You get a set of data on X and Y, and given that data set, you want to find the best possible fitting line, and you want to know about its properties: what are the properties of this estimator?
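The class itself uses EViews, but the two-variable fit described here can be sketched in a few lines of Python; the true coefficients, sample size, and random seed below are made up purely for illustration.

```python
import numpy as np

# Simulated data for the two-variable model Y_i = b1 + b2*X_i + u_i
# (all numbers here are invented for illustration)
rng = np.random.default_rng(0)
b1_true, b2_true = 2.0, 0.5
x = rng.uniform(0, 10, size=100)
y = b1_true + b2_true * x + rng.normal(0, 1, size=100)

# Closed-form OLS: slope = sum((x - xbar)(y - ybar)) / sum((x - xbar)^2)
b2_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b1_hat = y.mean() - b2_hat * x.mean()
print(b1_hat, b2_hat)  # close to the true 2.0 and 0.5
```

With 100 observations and unit-variance errors, the estimates land near the true values, which is the "best-fitting line" idea in action.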
Now, there are two uses of regression models — actually there's one more, but I'm going to talk about two. There are two main reasons we look at these models. One is to test hypotheses, to test theories about how the world works. We want to know how a change in the money supply affects GDP, affects interest rates, affects unemployment. We have theories about how that works — if you've been paying attention, we have competing theories about how that works — and presumably we can use tests to discriminate between competing theoretical models and find the ones that offer the better fit, the best possible explanation of the world. This one says it works this way; that one says it works that way. We estimate a model, we do some kind of test, we favor one model: ah, it's more likely the world works this way than that way. Academics are more interested in using regression models to learn about the world and how it works.
You can also use these models to forecast. People in the business community are more interested in using regression models to forecast: what will interest rates be in the future, what will the unemployment rate be, what will GDP be. They also need a model, so this group uses these models for a different purpose, and in fact we'll find out later that these people worry less about bias in an estimator. It's possible to have an estimator — a forecast — that is a little bit wrong, biased, but has a really low variance, and that's quite good compared to an estimator that is on the button on average but has a huge variance. One can be a little off but always very close to the truth; the other may be centered on the truth but with very high variance. Forecasters will often choose the biased estimator because it has a narrower forecast error. But academics wouldn't be interested in that at all, because they want to know what the multiplier is, and a biased estimate of a multiplier is not useful; it doesn't add to their knowledge of the world. If you want to know what the exact truth is, you'd rather use unbiased estimators to learn the real number instead of these biased ones. We'll come back to these things as we go through the course.
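The forecaster-versus-academic tradeoff can be made concrete with mean squared error. A quick simulation, with invented numbers: an unbiased but noisy estimator against one that's biased by 0.2 but much more precise.

```python
import numpy as np

rng = np.random.default_rng(1)
beta = 1.0        # the true value being estimated (invented for illustration)
reps = 100_000

# Estimator A: centered on the truth, but high variance.
# Estimator B: off-center by 0.2, but low variance.
est_a = rng.normal(loc=beta, scale=1.0, size=reps)
est_b = rng.normal(loc=beta + 0.2, scale=0.2, size=reps)

mse_a = np.mean((est_a - beta) ** 2)  # about 1.0
mse_b = np.mean((est_b - beta) ** 2)  # about 0.2^2 + 0.2^2 = 0.08
print(mse_a, mse_b)  # the biased estimator has the smaller forecast error
```

A forecaster judging by typical forecast error would take B; an academic who needs the true multiplier would still want A.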
The main thing for now is just to realize that there are two separate uses of models. We'll focus mainly on the first one — using models to test how the world works — but keep in mind that there is also this forecasting step. Tim teaches Economics 422, which has to do with forecasting; that's the next step after developing these models. Okay. To get the best answers to these questions, we need the best possible model, and there are three steps involved. The first step is to specify the model; I call it model specification. I'm not sure how much Jeremy did on this, but up to now you've probably assumed you had the right model. You did a little on omitted variables, a little on the inclusion of extraneous variables.
I believe in Jeremy's class your model had no specification issues, but I don't think you've really gotten too far into specification errors. If you use lags of a variable, how many lags should you include? Should you take the log of the variable? Should you square it? Should you use an X2-times-X3 interaction term? Should you use 1 over X as your variable, or X itself — does it make any difference? Why do we use the logarithm of X so much? What's the advantage of the log specification? What about squared variables? We will run wage regressions in this class.
We'll find that if you leave out age squared, you'll actually get very bad estimates; you're not going to learn about the world, because you've started with a poorly specified model. How do you know age squared is supposed to be there? How do you find out? How did people discover that age squared belongs in a wage regression? This is something you'll face head on when you try to do your projects. You'll try to figure out what variables belong and what data you have. You can have the perfect model in mind, but you have to go find data for it, and sometimes all you can get is something close to the true variable — something measured with error. Suppose I need some variable, say the ex-ante real interest rate, which is what matters in economics, but all I can get from the data is the ex-post real rate.
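The age-squared point can be sketched with simulated data. Everything below — the hump-shaped wage equation, its coefficients, the noise level — is invented for illustration; the point is only that omitting the squared term leaves a lot of systematic error in the residuals.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical log-wage equation with a hump in age (peaks near age 40)
age = rng.uniform(20, 65, size=300)
log_wage = 1.0 + 0.08 * age - 0.001 * age**2 + rng.normal(0, 0.1, size=300)

def ssr(X, y):
    """Sum of squared residuals from an OLS fit of y on X."""
    bhat, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.sum((y - X @ bhat) ** 2))

ones = np.ones_like(age)
ssr_lin = ssr(np.column_stack([ones, age]), log_wage)           # age^2 omitted
ssr_quad = ssr(np.column_stack([ones, age, age**2]), log_wage)  # correct spec
print(ssr_lin, ssr_quad)  # the misspecified model fits far worse
```

The linear-only regression is forced to approximate a curve with a straight line, so its residual sum of squares is several times larger than the correctly specified model's.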
That's a problem: how does measurement error affect the estimates? If I specify the model incorrectly, what kinds of problems do I have? We'll want to talk about that. How do you arrive at the best-fitting model? How do you write down the right model, and how do you test that it's a suitable model? We'll look, for example, at the Akaike information criterion, one way to evaluate a specification, and we'll get a way to test whether we should include a variable in the model or not — to test whether M2 should have three or four lags. If I need to use lags of money because there is persistence, how do I do it?
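Lag selection by the Akaike information criterion can be sketched as follows. The AR(2) process, the candidate lag range, and the AIC formula used (T·ln(SSR/T) + 2k, a common textbook version) are my choices for illustration, not something from the lecture.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulate an AR(2) series: y_t = 0.5*y_{t-1} + 0.3*y_{t-2} + e_t
n = 500
e = rng.normal(0, 1, size=n)
y = np.zeros(n)
for t in range(2, n):
    y[t] = 0.5 * y[t - 1] + 0.3 * y[t - 2] + e[t]

pmax = 6
T = n - pmax
yt = y[pmax:]  # common estimation sample so AICs are comparable
lags = np.column_stack([y[pmax - j : n - j] for j in range(1, pmax + 1)])

def aic(p):
    """OLS of y_t on a constant and p lags; AIC = T*ln(SSR/T) + 2*(p+1)."""
    X = np.column_stack([np.ones(T), lags[:, :p]])
    bhat, *_ = np.linalg.lstsq(X, yt, rcond=None)
    ssr = np.sum((yt - X @ bhat) ** 2)
    return T * np.log(ssr / T) + 2 * (p + 1)

best_p = min(range(1, pmax + 1), key=aic)
print(best_p)  # should land at or just above the true two lags
```

The second lag carries a large coefficient here, so the AIC essentially never chooses one lag; like any such criterion it can occasionally pick a lag or two more than the truth.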
How do I know if I should use three, four, five, six, or seven lags? Where do I stop? All those kinds of questions we'll have to face head on as we look at this specification step. We'll also have to worry about the error term. What's the correct way to specify the error term? What if it's not normally distributed, and follows some other distribution? You've assumed a normally distributed error, which gives you t's and F's and all sorts of nice distributions. What if your error is distributed some other way? That's something we can handle through the model specification. So there are a ton of issues we'll need to sort out here as we talk about specification.
Once we have the model specified, the next step is to estimate it. So let's assume for now that we have the correct model — we've managed to write down a decent model — and now we estimate it. We're going to have to learn all kinds of different estimators here, so let's briefly talk about estimation. Remember, our goal is to fit a line to a set of data. We have this scatterplot of data and we fit a line. Now, if I pick a data point and measure this distance — this is the true line, the true relationship; this is the observation in my data, (X_i, Y_i) — the distance between the observation and the true line is the error in the relationship. So Y_i equals β1, the constant, plus β2 X2i, plus u_i: the true line is β1 + β2 X2i, and then we have this error u_i. Now, what you learned to do was minimize over all your observations.
Jeremy probably ended with time series, and they probably used capital letters; we'll use betas for our purposes. You choose β1 and β2 — or all the betas, if it's a larger model — to minimize the sum of squared errors: choose the betas to minimize Σ(Y_i − β1 − β2 X2i)². What is this? When you had x² + y² back in algebra, that was a distance measure; that's how distance is measured. So what are we really minimizing? Forget the math — what do we minimize? The distance between the line and the points.
We're finding the line with the minimum squared distance between the line and the Y's at each point. So we're just minimizing distance: add all these distances up, square them, and you get the least squares estimator you talked about. But that's not the only estimator. Why not look at the absolute value of the error instead of its square? Why not minimize the absolute deviation instead of the square? Why not minimize the deviation to the fourth power, which penalizes outliers even more, or the sixth, or the eighth, whatever you want? This one is the least squares estimator; if you minimize the sum of the absolute deviations instead — we won't do this, but — that's called the least absolute deviation estimator.
The point is that when you do these things, there is more than one estimator. You learned about the OLS estimator; the OLS estimator minimizes these sums of squares, but that's just one of a lot of estimators out there. There's also a least absolute deviation estimator, and others — you can use any distance measure here and just minimize that distance. Fortunately, it turns out that we already know a lot about this one: this one is BLUE, the best linear unbiased estimator. There's no linear unbiased estimator with a lower variance than this one, and that's why we use it: it has some very, very good properties. But these other estimators might be useful in other situations. Anyway, the point is that there are many different estimators; we'll still focus mainly on least squares.
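Least squares versus least absolute deviations can be compared directly by minimizing each criterion numerically. This sketch uses SciPy's general-purpose Nelder-Mead optimizer and plants one large outlier (all numbers invented) to show why the two estimators can disagree.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)

# Invented data: true line y = 1 + 2x, with one big outlier planted at x = 10
x = rng.uniform(0, 10, size=50)
x[0] = 10.0
y = 1.0 + 2.0 * x + rng.normal(0, 1, size=50)
y[0] += 50.0

def sse(b):  # least squares criterion: sum of squared deviations
    return np.sum((y - b[0] - b[1] * x) ** 2)

def sad(b):  # least absolute deviations criterion
    return np.sum(np.abs(y - b[0] - b[1] * x))

b_ols = minimize(sse, x0=[0.0, 0.0], method="Nelder-Mead").x
b_lad = minimize(sad, x0=[0.0, 0.0], method="Nelder-Mead").x
print(b_ols, b_lad)  # the LAD slope stays much closer to the true 2.0
```

Squaring penalizes the outlier heavily, so the OLS slope gets dragged toward it; the absolute-deviation criterion largely ignores it. That's the sense in which different distance measures give different estimators.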
I just want you to be aware of the fact that there are other estimators, and we're going to look at other things: instead of LAD, we'll do things called GLS, for example. We'll look at the different types of estimators that exist, so in this class we'll see a variety of different ways to estimate this line in addition to what you've already learned, the OLS estimator. So why do we have to do this? The reason we're going to need all these different estimators is that there are assumptions that make OLS correct.
There are ten of them; I'll go through them in a few minutes. If those assumptions are met, we know we have the best linear unbiased estimator: as long as they hold, the OLS estimator is the best linear estimator you can get, and if your errors are normal, it's the best estimator overall, linear or not. But if one of those assumptions is violated — and we'll talk about each of them — then that estimator may no longer be the best; there may be other estimators that are better. If you have correlation between a right-hand-side variable and the error, if you have serial correlation, if you have heteroskedasticity — all kinds of problems we'll talk about during the course — this estimator is no longer the best. So we'll have to find other estimators; we'll have to come up with different estimators for different violations, and we'll say more about that. So: when the Gauss-Markov assumptions are satisfied, OLS estimators are BLUE. Let's go through this letter by letter, and we'll get to the assumptions in a second.
Let's assume the assumptions are satisfied. What does BLUE mean? What is the B? Best. What do we mean by the best estimator? It's actually lowest variance: best means minimum variance. What we can do is take estimators and draw their distributions — an estimator has a distribution; if the errors are normal, the estimator is normal — and center it on the truth. Maybe we're flipping a coin and you don't know if it's fair, and you're trying to estimate the probability of heads. You flip the coin twenty times; if it's fair, this count should average ten — call heads one and tails zero, so the number of heads should average ten. But in any given twenty tosses we could get six, or nine, or twelve, or nineteen (I played Yahtzee this morning and got over 300 — stupid iPhone). So estimators have a distribution that we can derive theoretically or map out empirically by flipping the coin and recording the frequencies. You might have two different estimators, both centered on the truth — both unbiased — but this one has a smaller variance than that one. Which is better? This one. Both will be right on average; both will give you the correct answer on average, but that one will be wrong by more, time after time. You'll have to bear with my lousy art; I'll do my best.
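The coin-flip example can be simulated: two estimators of the probability of heads, both unbiased, one with a tighter distribution. The particular estimators below (use all 20 flips versus use only the first 5) are my invention to make the "best means smallest variance" point concrete.

```python
import numpy as np

rng = np.random.default_rng(5)

# 100,000 repetitions of the experiment "flip a fair coin 20 times"
flips = rng.integers(0, 2, size=(100_000, 20))

est_a = flips.mean(axis=1)          # estimator A: fraction of heads in all 20
est_b = flips[:, :5].mean(axis=1)   # estimator B: fraction in the first 5 only

# Both are unbiased -- centered on the truth, p = 0.5 ...
print(est_a.mean(), est_b.mean())
# ... but A has the tighter distribution, so A is the better estimator
print(est_a.std(), est_b.std())
```

Both averages sit on 0.5, but estimator B wanders twice as far from the truth in a typical experiment, which is exactly why "best" picks the smaller variance.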
This one is better because on average it's closer to the truth — smaller variance. Best means there's no tighter distribution, and the OLS estimator is as tight as it gets. As N gets bigger and bigger — as there's more data — these distributions get tighter and tighter; with an infinite amount of data they collapse down onto the truth, but there's a limit, for any given N, on how tight they can be. For a given N, you want to choose the estimator that has the smallest variance: that's "best." L is linear. What do we mean by that?
It's a little hard to explain, so stay with me for a second. When you minimize the sum of squares — I'm going to use the two-variable case, just to keep it simple and show what we mean by linear, but you can do the same thing with a multivariate model using linear algebra; I'm going to use regular algebra, so I need a small, easy example — you learned the estimator for beta. If I have Y_i = β1 + β2 X2i + u_i, you learned that β2-hat is the sum of (X_i − X̄)(Y_i − Ȳ) over the sum of (X_i − X̄)². That's OLS — we're going to use different estimators later, GLS and others, so I'll start writing OLS on most things; for now it's not that important. Now, you can also rewrite this, and that will make things easier.
I don't have to do this, but it shows the linearity property, and it simplifies things. I can write this as β2-hat = Σ α_i Y_i, where α_i = (X_i − X̄) / Σ(X_j − X̄)². So the OLS β2-hat is a sum of the α_i times the Y_i, and that's what we mean by linearity: this estimator is linear in Y. Why are there no squares of Y here? There aren't; it's just a sum of coefficients times the Y's, and a weighted sum like that is a linear operator. There's a mathematical definition of linearity we could go through, but when it's in that form, it's linear. Sometimes there are nonlinear estimators that are better. When we do OLS, this is the best linear estimator — the best estimator of this form — but sometimes there are estimators of another form that are better. We'll see something later called autoregressive conditional heteroskedasticity, ARCH — Engle, at UC San Diego, got the Nobel Prize for it. It simply means that the variance changes over time and fluctuates persistently. When the variance moves over time like that, OLS is still the best linear estimator for the model, but there's a nonlinear estimator that can do much better. So it's not always the case that OLS is the best estimator, period; it's the best linear one. There could be one with squared Y's in it — that would be an estimator that's not linear in Y. If the errors are normally distributed, then the best linear estimator is also best overall; but in an ARCH model, because the variance moves, the errors aren't normal, and there's a different, better estimator. Okay?
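The weighted-sum form of the slope can be checked numerically. A small sketch (data invented): compute the textbook formula, then the same slope as Σ α_i Y_i with weights that depend only on the X's.

```python
import numpy as np

rng = np.random.default_rng(6)
x = rng.uniform(0, 10, size=30)
y = 1.0 + 2.0 * x + rng.normal(0, 1, size=30)

# The usual OLS slope formula ...
b2_formula = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)

# ... rewritten as a weighted sum of the Y's: b2 = sum(alpha_i * y_i),
# with weights alpha_i = (x_i - xbar) / sum((x_j - xbar)^2)
alpha = (x - x.mean()) / np.sum((x - x.mean()) ** 2)
b2_linear = np.sum(alpha * y)

print(b2_formula, b2_linear)  # identical: the OLS slope is linear in y
```

The two numbers agree because the weights sum to zero, so subtracting Ȳ changes nothing; the estimator is literally a fixed linear combination of the Y's, which is what "L" means.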
U is unbiased. Okay, what do we mean by unbiased? We already talked a little about this. You may have an unbiased estimator — there's your unbiased estimator, centered on the truth. Unbiased mathematically means the expected value of β-hat equals β: E(β2-hat) = β2. Now, many of you see this math and don't want to be here; you'd rather take it on faith, and I'm going to do my best not to make the math an obstacle. If you're really afraid of these things, when you see this E, forget "expectations" — it just says that on average, β-hat equals β. That's it; that's all it means: if I do this enough times, on average I get the truth.
And when you measure a variance, it's just the sum of (Y_i − Ȳ)² over n — the mean squared deviation; that's all a variance is. Now, many of you are not going to be using these things in the future. Someday you'll get a real job and you'll ask yourself, why did I sit through this? Why did they make me learn all these stupid formulas, without even a cheat sheet, that I'm never going to use in my life? What we hope is that even if the technical apparatus abandons you in the future, the ideas will remain. When someone at some point presents a graph in a sales meeting — look, here's our advertising versus sales, here's the data, here's a line through it, look how wonderful this is — you'll say to yourself: you know, the errors in this model probably have persistence in them.
And remember, you can't trust these models blindly: if there is persistence in the error terms, then it may not make any sense to claim this is an important relationship, and you'll be able to ask questions about data presentations that you might not otherwise be able to ask, even if you've forgotten all the technical parts. Many of you will be in those types of meetings, so this should give you a focus on the data and the ability to ask the kinds of questions you need to ask when you're trying to do analytical work, understand analytical presentations, or make decisions in a business. If you're running a business, or doing some job, and you see that two variables are related, you want to know how much you can trust the strength of that relationship. Well, you'll get an idea of how much you can trust it from the things we're going to do.
Hopefully, even if you forget all the technical stuff, the rest will still be useful. But anyway, back to unbiased. Here's an unbiased estimator; here's a biased estimator. This is the forecaster versus the academic again. As a forecaster, you might prefer that biased one, because it's so close to the truth most of the time, even though you know it's going to be off-center. When you're placing bets in a financial market, you may want to be as close as possible, even if there's a small systematic error; so a forecaster might accept the biased estimator.
For us, though: if we restrict ourselves to unbiased estimators, OLS is the best unbiased linear estimator. That doesn't say there isn't a better biased estimator — sometimes there is — but OLS is the best within this class. And what is an estimator? It's just a rule for processing the data; a rule that tells you what to do with the data. Okay, so what assumptions are required for all of this to work? A lot of what the course is about is going through them and saying: look, that's probably not the case; can you expect these assumptions to be true, and what if they're not?
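Before going through the assumptions, the unbiasedness claim — "on average you get β" — can be checked with a small Monte Carlo. The setup below (fixed X's, fresh normal errors every replication, invented true betas) mirrors the classical assumptions discussed next.

```python
import numpy as np

rng = np.random.default_rng(7)
beta1, beta2 = 1.0, 2.0              # invented true values
x = rng.uniform(0, 10, size=40)      # X's held fixed across samples
xc = x - x.mean()

# 20,000 fresh samples: same X's each time, new errors each time
u = rng.normal(0, 1, size=(20_000, 40))
y = beta1 + beta2 * x + u            # each row is one sample

# OLS slope for every sample at once: sum(xc_i * y_i) / sum(xc_i^2)
# (demeaning y is unnecessary because the xc_i sum to zero)
estimates = y @ xc / np.sum(xc**2)
print(estimates.mean())  # "on average you get beta": close to the true 2.0
```

Any single sample's slope misses by a little, but the average over many samples sits right on 2.0, which is all E(β̂) = β says.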
How do you get the best estimator when Gauss-Markov no longer holds? Are these assumptions always true? If they were, there would be no need for this class; they rarely are. It takes about ten of these assumptions to guarantee the Gauss-Markov result, so let's go through them. The first one: the regression model is linear in the parameters. What does that mean? Here's an example: a model where a parameter enters nonlinearly is not a good model for OLS. What about y = β1 + β2·(1/X2) + u? That model is fine, because the parameters enter linearly; each one is just a coefficient multiplied by some value. The linearity is about the betas, not the X's. If you had a parameter entering nonlinearly, say squared or multiplied by another parameter, that is not linear in the parameters and it's not okay: it's parameter times parameter times x, not parameter times value of x. But y = β1 + β2·X2 + β3·X3 + u is fine.
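To make the "linear in parameters" point concrete, here's a quick sketch of my own (the numbers and setup are made up, not from the lecture): we generate data from y = β1 + β2·(1/x) + u, a model that is nonlinear in x but linear in the betas, and recover the coefficients with ordinary least squares on the transformed regressor.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.uniform(1.0, 10.0, n)          # regressor, kept away from zero
u = rng.normal(0.0, 0.1, n)            # mean-zero error
beta1, beta2 = 2.0, 3.0
y = beta1 + beta2 * (1.0 / x) + u      # nonlinear in x, linear in the betas

# OLS on the transformed regressor 1/x: still a linear model in the parameters
X = np.column_stack([np.ones(n), 1.0 / x])
b_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b_hat)  # close to [2.0, 3.0]
```

The point of the sketch is that OLS never needs x itself to enter linearly; it only needs the model to be linear in β1 and β2, so transforming x to 1/x costs nothing.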
Adding an interaction term between X2 and X3 is fine too; linearity has nothing to do with the X's. The next assumption: the X's are not random. The X's are chosen; they are not random variables. Someone chooses the X's. Classically, in a regression setup you would choose the treatments, the X's, in advance: pick, say, ten treatment values, apply each treatment repeatedly, and observe what happens to the Y's. So I would choose, for example, x = 4, run the experiment at x = 4, observe the outcome, repeat it 20 times, then take the next value, and so on. There are whole classes on how to choose the X's in experimental design; you can spend a whole term practically just on how to choose these X's optimally. But the point is that they are chosen, they are not random variables, and for us that is a very bad assumption.
Often in economics our X variables really are random; they are not fixed. What do we do then? Sometimes OLS is still fine, and sometimes it isn't, and we'll see which is which as we go through the course. In some cases you'll have to use what are called instrumental variables estimators. So the assumption that the X's are fixed is one that is often not true.
A good example of when the X's are not fixed is when they are measured with error. Suppose X_i = X_i* + e_i, where e_i is a measurement error. Maybe X_i* is the true length of this board; I get out my tape measure and round to the nearest inch. I don't have an exact measurement of the true length of the table; there's a small error in there, and when I round like that the error is pretty random. In fact, I think rounding gives you roughly a uniform distribution. That's a case where the observed X has a random error associated with it: when you round to the nearest inch, sometimes what you write down is a little bigger than the truth, sometimes a little smaller, and no matter how fine my measurement is, half inch, quarter inch, I'm going to have some error. If this error is really small relative to the length of the table, it won't matter much for most applications; with a little randomness in X, OLS will still be fine.
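A quick simulation, my own illustration with made-up numbers, shows what measurement error in X does to OLS: as noise is added to the true regressor, the estimated slope is pulled toward zero, the classic attenuation result.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000
beta1, beta2 = 1.0, 2.0
x_star = rng.normal(0.0, 1.0, n)                  # true regressor X*
y = beta1 + beta2 * x_star + rng.normal(0.0, 1.0, n)

def ols_slope(x, y):
    """Slope from a simple bivariate OLS regression of y on x."""
    x_dev = x - x.mean()
    return np.sum(x_dev * (y - y.mean())) / np.sum(x_dev ** 2)

clean = ols_slope(x_star, y)                            # X measured exactly
noisy = ols_slope(x_star + rng.normal(0.0, 1.0, n), y)  # X measured with error

# Attenuation: the noisy slope converges to
# beta2 * var(X*) / (var(X*) + var(e)) = 2 * 1/(1+1) = 1.0 here
print(clean, noisy)
```

With a small measurement error the bias is small, matching the tape-measure example; here the error variance equals the variance of X*, so the slope is cut roughly in half.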
But as that error gets bigger and bigger, you can have serious problems using OLS. That's just one simple example of how you get a random X; there are many other ways your X's can end up random, and when that happens OLS isn't necessarily best. Okay, next: the error term has mean zero. This isn't a major assumption; for the most part it's easy to overcome. We generally assume the errors are distributed around the true line, so picture a distribution at each of these X's. I choose x = 2, run it 20 times, and get different results each time; those outcomes should follow a distribution centered on the truth, and they will be centered on the true line if the error has mean zero. So if the errors have zero mean, on average the data will be centered on the true line. Sometimes that's not what happens; you get errors with a non-zero mean.
So suppose the errors have non-zero means. Here is your true line, but your error distributions look like this: they are centered up here instead of on the true value. If you run OLS on data like that, it will estimate this shifted line instead of the true line, because that's where the errors are now centered. As long as the expected value of u_i, this distance, is the same constant for every observation, what exactly is this going to mess up?
The slope will be fine, but the constant will be biased. A non-zero error mean simply shifts the line up or down by the amount of the mean; you won't mess up the slope, just the constant. Most of the time in economics we are interested in the slope and not the constant, and if we only care about the slope this is not a big deal. Sometimes, however, the mission is to estimate the constant, and then we'll have to worry about this. So here's a question: could you fix the constant's estimate in this regression? If you knew the mean of the errors, you could.
But if I only know that there is probably some non-zero mean, without knowing its value, then I can't. You may know the sign but not the magnitude, so you can bound things, say it's at least this big or at most this small, but you can't really do much more than that. And there's no way to learn it from the data, because the residuals around the estimated line average to zero by construction; the data can't tell you whether the true errors have zero mean or not. So this is only a problem if you care about the constant. Okay, homoskedasticity. What does "homoskedasticity" mean? It means "same variance."
It is the assumption that the variance of your errors is constant. Together with the zero mean, we are saying that u_i is distributed with mean zero and constant variance σ²; the variance is the same for every error. When I drew those distributions, I tried to give them the same spread, the same variance everywhere. That's not always true. If I'm measuring sales and I put Walmart and small shops in the same regression, which do you think has the larger variance? The day-to-day variation in Walmart's sales is probably larger than the total sales of the small stores. So if you were running a regression on data like that, you'd probably see something like this: the true line with a small spread at the low end, and the spread growing as the scale gets bigger. But whatever the reason, if that variance tends to grow, or shrink, or move around in any systematic way, as long as it is not constant...
...I have a problem: OLS won't be the best estimator. Here's why. OLS treats each of these observations as equally informative. But if someone said you can only use half the data to estimate this line, which half would you choose? I'd choose the half where the variance is very low, because those observations are much more informative about the true line. So when I estimate, I want to give them more weight, because they tell me much more about the truth. OLS gives everyone the same weight. Essentially, what you do to fix it is take each observation and divide by its error standard deviation: if I take X_i and divide by σ_i, a small σ_i makes that observation more important, a large σ_i makes it less important. So we're going to have to take the OLS estimator and somehow build in those weights.
We need to involve the correct weights. OLS weights all observations equally when it shouldn't, and that is the fundamental problem with the OLS estimator when you have heteroskedastic errors. Note that OLS is still fine in one sense: the mean is fine, it's not biased, so it will give you the correct betas on average; the central tendency is still right. Instead, the problem is that the variance of the estimator gets much larger than it needs to be. You don't have a non-zero mean problem; it's the variance that changes. You're still going to hit the right line on average, but with a lot more uncertainty than you need.
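This claim can be checked with a small Monte Carlo of my own (the parameters are made up, and this is a sketch rather than the course's problem set): with heteroskedastic errors, the OLS slopes are still centered on the truth across replications, but a weighted estimator that divides each observation by its error standard deviation has a visibly smaller spread.

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 200, 500
beta1, beta2 = 1.0, 2.0
x = np.linspace(1.0, 10.0, n)
sigma = 0.2 * x                  # error std dev grows with x: heteroskedasticity

def slope(xv, yv):
    """Bivariate OLS slope."""
    xd = xv - xv.mean()
    return np.sum(xd * (yv - yv.mean())) / np.sum(xd ** 2)

ols, wls = [], []
for _ in range(reps):
    y = beta1 + beta2 * x + rng.normal(0.0, sigma)
    ols.append(slope(x, y))
    # WLS: divide every column (including the constant's) by sigma_i
    w = 1.0 / sigma
    X = np.column_stack([w, x * w])
    b = np.linalg.lstsq(X, y * w, rcond=None)[0]
    wls.append(b[1])

ols, wls = np.array(ols), np.array(wls)
print(ols.mean(), wls.mean())    # both near 2.0: OLS is still unbiased
print(ols.std(), wls.std())      # WLS spread is smaller: OLS is not efficient
```

The design choice here is exactly the one described in the lecture: dividing each row of the regression by σ_i makes the transformed errors homoskedastic, so equal weighting is appropriate again.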
If you have this pattern, you really want to lean on the low-variance end of the data, wherever it is, and weighted least squares handles that very well. Our solution for heteroskedasticity will essentially be to divide by the error standard deviation: you divide each observation by σ_i, and you can show that the transformed error variance becomes a constant. You end up weighting each observation by 1/σ_i, and that fixes the problem. The next assumption is no autocorrelation, and we'll spend a lot of time on this. Heteroskedasticity is mainly a problem with cross-sectional data; this is mainly a problem with time series, data over time. Our model is again y_t = β1 + β2·X2t + u_t, the same model as before, but now suppose u_t = ρ·u_{t-1} + v_t, where v_t is a random innovation. With ρ = 1/2, say, today's error is half of yesterday's error plus some new innovation, so your errors are persistent over time: if you have a high error today, you're likely to have a high error tomorrow. So when ρ is positive, your errors generally tend to follow a smooth pattern like this, and it's not that different from real data.
I've exaggerated the shape, but the tendency is high errors followed by high errors, low errors followed by low errors. If GDP is below its long-run trend today, what do you expect next month? It will probably be below trend again; there is persistence in GDP. If this is the long-run trend and we are below trend today, it's more likely we'll be below trend tomorrow. There is persistence; we are not jumping all over the place, above one period and below the next. That's not the way the economy works; it tends to move smoothly. So your errors tend to follow this kind of pattern, and when that happens, if you estimate this model with those errors, here's what you get. Your first problem set will give you an example of this.
You'll get t-statistics that are huge, absolutely huge. It will look like you have a model that fits beautifully. You'll write home: Mom, I got a t-statistic of 300, I'm going to be famous. And the relationship can be completely insignificant; you may have no relationship at all. This is the sales example I was talking about: you often get this kind of persistence in sales data, and if someone in a sales meeting was really impressed that they had a great fitted line between advertising and sales, you should raise your hand and say, hold on; I can give you all the reasons later.
It's not the right conclusion to draw from that data, and if you're betting the company on the belief that you have a significant relationship there, boy, I think you might be making a mistake. So at least you'll understand how to recognize the problem. On the technical side, the persistence can be more elaborate than that. What I wrote is called first-order autocorrelation. You can also have second order: with time series, usually written with t for time, u_t = ρ1·u_{t-1} + ρ2·u_{t-2} + v_t. That would be second-order autocorrelation.
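A second-order error process like this can be simulated directly. This is a hypothetical example of my own, with coefficients I've chosen to keep the process stationary, just to show how persistent the resulting errors are:

```python
import numpy as np

rng = np.random.default_rng(3)
T = 5_000
rho1, rho2 = 0.8, 0.1            # illustrative AR(2) coefficients (stationary)
v = rng.normal(0.0, 1.0, T)      # fresh innovations

# u_t = rho1 * u_{t-1} + rho2 * u_{t-2} + v_t
u = np.zeros(T)
for t in range(2, T):
    u[t] = rho1 * u[t - 1] + rho2 * u[t - 2] + v[t]

def autocorr(series, lag):
    """Sample autocorrelation of a series at a given lag."""
    s = series - series.mean()
    return np.sum(s[lag:] * s[:-lag]) / np.sum(s ** 2)

print(autocorr(u, 1), autocorr(u, 2))  # both large and positive: persistence
```

With these coefficients the lag-1 autocorrelation works out to about 0.89 and the lag-2 to about 0.81, so a high error today really does tell you a lot about tomorrow's error.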
Today's error depends on yesterday's error and the day before yesterday's. Say normally you and your roommate get along, but not today, and ρ1 is positive. We're measuring how angry you are: you're very angry, and u_{t-1} is all the stupid things your roommate did yesterday, u_{t-2} is all the stupid things they did the day before. When you go back three days, you're fine, you're over it, so that term drops out: today has nothing to do with the stupid things your roommate did three days ago. But two days ago might still carry a coefficient of, say, 0.1.
So maybe ρ1 is 0.8: for what happened yesterday, you're still very angry. And ρ2 is 0.1: the stupid things from two days ago still matter a little. The innovations v_t are all the stupid things your roommate does today, which are pretty erratic, pretty random. So the length of the memory, the persistence, determines how you model the autocorrelation. You can see the same thing in weather: if it has been raining or cold for a couple of weeks, if temperature is above its average, it tends to stay there for a while. Not forever; eventually you move back toward the average. But if there is a lot of persistence in the errors in these models, and you don't take that persistence into account, you will make a big mistake.
The problem here is that OLS treats each observation as a completely new observation, completely new information, fully informative. But if the value is 80 today and 79 tomorrow, and you know about this persistence, the fact that tomorrow is also high doesn't add much to what you already knew. Your observation tomorrow is not as informative as it would be in a model without this persistence. OLS gives it full weight and counts it all as new, but that's not right: because of the persistence, only part of each observation is genuinely new information, and you already knew the rest. So OLS makes mistakes, and we're going to have to fix this model somehow. What you do is subtract ρ times the lagged model: you just transform the data, and you can fix it pretty easily. But again, if you have that problem and ignore it, OLS just won't work; it won't give you the right answers.
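The "subtract ρ times the lagged model" transformation can be sketched as follows. This is my own illustration, and it assumes ρ is known; in practice you'd estimate it. Quasi-differencing y_t − ρ·y_{t−1} against x_t − ρ·x_{t−1} turns the AR(1) error back into a fresh innovation, so OLS on the transformed data behaves properly again:

```python
import numpy as np

rng = np.random.default_rng(4)
T, rho = 2_000, 0.9
beta1, beta2 = 1.0, 2.0
x = rng.normal(0.0, 1.0, T)

# Build AR(1) errors: u_t = rho * u_{t-1} + v_t
v = rng.normal(0.0, 1.0, T)
u = np.zeros(T)
for t in range(1, T):
    u[t] = rho * u[t - 1] + v[t]
y = beta1 + beta2 * x + u

# Quasi-difference:
# y_t - rho*y_{t-1} = beta1*(1 - rho) + beta2*(x_t - rho*x_{t-1}) + v_t
y_star = y[1:] - rho * y[:-1]
x_star = x[1:] - rho * x[:-1]
X = np.column_stack([np.ones(T - 1), x_star])
b = np.linalg.lstsq(X, y_star, rcond=None)[0]
print(b[0] / (1 - rho), b[1])    # recover beta1 and beta2
```

Note that the transformed intercept is β1·(1 − ρ), which is why the code divides it back out; the slope comes through directly.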
The next one is harder to explain. It says that the X's and the u's are uncorrelated no matter what: E(u_i | X_j) = 0 for all i and j. This is a very common problem, and when it fails, OLS is no longer the best estimator; OLS is no longer BLUE, and it will be biased. Here's what OLS does. Take the model y_i = β1 + β2·X2i + β3·X3i + u_i. To estimate β2, OLS looks at the data for movement in X2 holding X3 constant: when X2 moves up, sometimes y responds a little more than 2·β2's worth, sometimes a little less, and OLS averages those responses and says, okay, that's the effect of X2. It finds the independent movement in each of your X's and averages the responses. That works if u stays put when X moves. But suppose that's not true; suppose every time X2 moves up, u tends to move with it because of correlation between them. Then OLS attributes u's movement to X, and you get an absolutely wrong answer. When X is random, that's likely to happen. Take the most basic case, supply and demand: quantity demanded depends on price, Q_i = α1 + α2·P_i + u_i, but P_i and u_i are determined together, so they're going to be correlated. If you try to estimate this model one equation at a time, you have a problem; these are simultaneous equations, both involving P and Q, and in this model P_i is a random variable correlated with the error. We'll come back to that later. Okay, we have five minutes or less; the next assumption is one the book doesn't really emphasize.
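Before moving on, the simultaneity bias just described can be seen in a toy supply-and-demand simulation. This is entirely my own construction, with made-up coefficients, just to show that OLS on the equilibrium data does not recover the demand slope:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 50_000
a, b = 10.0, 1.0     # demand: Q = a - b*P + u   (true demand slope is -1)
c, d = 2.0, 1.0      # supply: Q = c + d*P + e
u = rng.normal(0.0, 1.0, n)   # demand shocks
e = rng.normal(0.0, 1.0, n)   # supply shocks

# Equilibrium: a - b*P + u = c + d*P + e  =>  P = (a - c + u - e) / (b + d)
# So P is built from u: price and the demand error are correlated.
P = (a - c + u - e) / (b + d)
Q = a - b * P + u

# OLS of Q on P does NOT recover -b; with these variances it blends the
# demand slope (-1) and the supply slope (+1) and lands near zero.
Pd = P - P.mean()
ols_slope = np.sum(Pd * (Q - Q.mean())) / np.sum(Pd ** 2)
print(ols_slope)
```

The equilibrium condition is exactly what makes P a random variable correlated with u, so the single-equation regression traces out neither curve; this is the problem instrumental variables are designed to fix.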
I want it in there anyway: the number of observations must be greater than the number of parameters. If that's not true, you can't even estimate the model; if you want to estimate four things, you need at least four data points, and really at least one more than that. The next assumption, number eight, is that there must be variability in the X's, and the more variability the better. Think about estimating two lines: in one case we have a group of X's here and another group way over there; in the other, we have all the X's bunched right next to each other. Which gives the better estimator?
Think about the limit. Suppose there is no variability in the X's at all: you have 100 observations, but they are all at the same X. Now try to fit a line to that; what slope should it have? It's like balancing a line on a single point; there is no way to know what the slope should be. And the closer together your observations are, the more uncertainty you'll have about what the true line should be; the further apart they are, the better it looks. So with no variability in the X's you can't estimate the line at all, but we can say more than that.
More variability is better than less, and in fact you'll see it in the formula: for the model y_i = β1 + β2·X2i + u_i, the variance of the slope estimator is Var(β̂2) = σ² / Σ(x_i − x̄)². What is that denominator measuring in this model? The variability of the X's around their mean. Suppose every X is 4: the mean is 4, every x_i − x̄ is 0, the sum is 0, and the variance of β̂2 is infinite. You have no idea what the truth is in that model. The larger the spread, the further apart the X's are, the bigger this denominator, and the smaller the variance of your estimator.
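The formula Var(β̂2) = σ²/Σ(x_i − x̄)² can be checked with a short simulation of my own (the two X designs are hypothetical): the same number of observations with more spread in X gives a visibly tighter slope estimate, and the simulated spread matches the formula.

```python
import numpy as np

rng = np.random.default_rng(6)
reps, n, sigma = 2_000, 50, 1.0
beta1, beta2 = 1.0, 2.0

def simulated_slope_sd(x):
    """Std dev of the OLS slope across many redraws of the errors."""
    xd = x - x.mean()
    slopes = []
    for _ in range(reps):
        y = beta1 + beta2 * x + rng.normal(0.0, sigma, n)
        slopes.append(np.sum(xd * (y - y.mean())) / np.sum(xd ** 2))
    return np.std(slopes)

def theory_sd(x):
    """sigma / sqrt(sum (x - xbar)^2), the Gauss-Markov formula."""
    return sigma / np.sqrt(np.sum((x - x.mean()) ** 2))

x_narrow = np.linspace(4.0, 5.0, n)    # little variability in X
x_wide = np.linspace(0.0, 9.0, n)      # lots of variability in X

sd_narrow = simulated_slope_sd(x_narrow)
sd_wide = simulated_slope_sd(x_wide)
print(sd_narrow, theory_sd(x_narrow))  # simulation matches the formula
print(sd_wide, theory_sd(x_wide))      # wide design: much smaller variance
```

The wide design spans nine units instead of one, so the denominator Σ(x − x̄)² is about 81 times larger and the slope's standard deviation is about 9 times smaller.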
I will finish this class on time. I've had some problems today with people coming and going; I get emails about it, and it happens especially in a class like this. When you stand up in the middle and walk out, you annoy everyone behind you, in front of you, and everyone else. We have videos, so if you can't stay here for an hour and twenty minutes, I'm wondering why you're in college to begin with, but stay home and watch the videos, and let the people who are here pay attention.
