
This is How Easy It Is to Lie With Statistics

Apr 09, 2020
Around 2002, Target went to a statistician with a question whose answer could potentially make the company millions of dollars. They asked: "Using computers alone, can you determine which customers are pregnant, even if they don't want us to know?" From then on, statistician Andrew Pole worked on an algorithm to do just that. He analyzed the shopping patterns of pregnant women and noticed some common behaviors, like an increase in lotion purchases, loading up on vitamins, and more things I know nothing about. He used this information to determine not only which customers were likely pregnant, but also their expected due date. After developing his mathematical model, the statistician had a list of hundreds of thousands of women who were likely pregnant, along with their due dates and what trimester they were in. From then on, Target could send coupons at just the right time during the following months, and even after the baby was born.
Now, although Target was careful to follow privacy laws, it could still creep women out if they suddenly started receiving coupons for things like cribs, diapers, and other related items when they hadn't actually told the company they were pregnant. So what Target did was simply mix these items in with other, unrelated products when the coupons arrived, so it would seem more natural. But about a year after creating this algorithm, something happened, and this is where it gets interesting. One day, a man walked into a Minnesota Target demanding to see a manager. He was very angry: apparently Target had been sending coupons for things like diapers, cribs, and other related items to his high school daughter, and he was very upset about this.

He was saying things like, "Are you guys trying to encourage her to get pregnant?" The manager didn't really know what was going on; of course, he apologized, and a few days later he called the dad back to apologize again. But this time the dad was not so angry, just a little more embarrassed. I think you know where this is going. On the phone, the father said: "In fact, I owe you an apology. There have been some things happening around here that I haven't been fully aware of. My daughter is, in fact, pregnant and due in August."
So, yes, this statistical algorithm figured out that the girl was pregnant even before her father knew it. That's the power of statistics, and we're just getting started. In 1964, an elderly woman was walking home from shopping when she was suddenly pushed to the ground and her purse was stolen. She caught a glimpse of the thief and saw a blonde woman with a ponytail, who then fled the scene. There was also a man nearby who heard the screams and saw the woman get into a yellow car driven by a black man who had a beard and a mustache.
And yes, by the way, all of these details are necessary for the story. A few days after the incident, the police caught Janet Collins and her husband Malcolm, who matched all the descriptions given by the witnesses. They were charged with the crime and put in front of a jury. Now, since most of the evidence was just the testimony of the victim and the man who saw the event, the prosecution also brought in a mathematician to help prove the couple's guilt. This mathematician calculated the probability of randomly selecting a couple who was innocent but who also shared all of the characteristics observed by the witnesses.
Based on data, the mathematician came up with a frequency for each characteristic and, assuming independent events, multiplied them all together to find the joint probability that they would all apply to a random couple. It turned out there was less than a 1 in 12 million chance that a random couple who happened to fit all of those descriptions was innocent, so the jury returned a guilty verdict. This is actually a very famous case in terms of the use of statistics in the courtroom.
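To make the multiplication concrete, here is a minimal sketch of that calculation. The individual frequencies below are the ones commonly cited in accounts of People v. Collins; treat them as illustrative assumptions, since they are not listed in this video.

```python
from functools import reduce

# Characteristic frequencies commonly cited in accounts of People v. Collins
# (illustrative assumptions, not figures given in this video).
frequencies = {
    "partly yellow automobile": 1 / 10,
    "man with mustache": 1 / 4,
    "woman with ponytail": 1 / 10,
    "woman with blonde hair": 1 / 3,
    "black man with beard": 1 / 10,
    "interracial couple in a car": 1 / 1000,
}

# Treating the traits as independent (the flawed assumption made at trial),
# the joint probability is just the product of the individual frequencies.
joint = reduce(lambda a, b: a * b, frequencies.values())

print(f"Joint probability: {joint:.3e}")     # ~8.333e-08
print(f"Roughly 1 in {round(1 / joint):,}")  # roughly 1 in 12,000,000
```

Note that the independence assumption is itself shaky: men with beards very often have mustaches, for example, so simply multiplying the frequencies overstates how rare the combination is.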
Another quick example is that of Sally Clark, who was convicted of murdering her two young children in the 1990s. Her first child died suddenly in 1996 of unknown causes, so it was assumed to be a case of SIDS, sudden infant death syndrome. About a year later she gave birth to her second child, who was also found dead, 8 weeks after his birth, for unknown reasons. After this happened and was reported, the police ended up arresting her and her husband on suspicion of murder. During the trial, a professor of pediatrics testified that the probability of two babies in the same family dying from SIDS was approximately 1 in 73 million, and 1 in 73 million is beyond a reasonable doubt. So it was concluded that this was most likely a shaking or suffocation event, and Sally Clark was found guilty and sentenced to life in prison.
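The 1 in 73 million figure came from squaring a single-family SIDS probability, which assumes the two deaths were independent events. A minimal sketch, using the roughly 1 in 8,543 input commonly reported for this case (an assumption here, not a figure stated in the video):

```python
# The expert testimony squared a single-SIDS probability, which assumes
# the two deaths were independent events. The ~1 in 8,543 input is the
# figure commonly reported for this case (assumed here).
p_one_sids = 1 / 8543

p_two_sids_if_independent = p_one_sids ** 2
print(f"About 1 in {round(1 / p_two_sids_if_independent):,}")  # ~1 in 73 million
```

If genetic or environmental factors make a second SIDS death more likely once a first has occurred, the true joint probability is far higher than the square, which is exactly the objection raised later.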
So you can see, statistics have a lot of power in our world, be it in advertising, criminal cases, and so on. But what is also really powerful, and much easier to do, is to lie, deceive, and misinform using statistics, and you don't even have to use bad data to do it. I mean, I've already done it several times in this video, and I'm going to talk about that soon. So yeah, that's a heads-up for all the people who comment on videos before watching them in full, because I'm going to say more. But let's start light. In 2007, in the United Kingdom, there was a Colgate advertisement with the classic claim that "80% of dentists recommend Colgate." It wasn't long before the UK's Advertising Standards Authority ordered them to abandon this claim because, although it was true, they knew people wouldn't really understand what it meant. The study that was conducted allowed dentists to recommend more than one toothpaste. So dentist one could say, "I recommend Colgate, Crest, and Oral-B." Dentist two might say Colgate, Crest, and Sensodyne, and likewise for dentists three, four, and five. In this scenario, 80% of dentists recommend Colgate. That's true. But 100% of dentists recommend Crest in this hypothetical case, and 80% also recommend Oral-B. All of these numbers are real, and you could make an ad with any of these claims. But again, we know people wouldn't understand what they really meant.
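Here is a tiny sketch of a five-dentist survey like the hypothetical one above, showing how letting each respondent name several brands makes claims like "80% recommend X" true for several brands at once (the individual responses are invented for illustration):

```python
from collections import Counter

# Hypothetical survey where each dentist may recommend several brands,
# mirroring the five-dentist example in the text (responses invented).
responses = [
    {"Colgate", "Crest", "Oral-B"},     # dentist 1
    {"Colgate", "Crest", "Sensodyne"},  # dentist 2
    {"Colgate", "Crest", "Oral-B"},     # dentist 3
    {"Colgate", "Crest", "Oral-B"},     # dentist 4
    {"Crest", "Oral-B", "Sensodyne"},   # dentist 5
]

counts = Counter(brand for r in responses for brand in r)
for brand, n in counts.most_common():
    print(f"{n / len(responses):.0%} of dentists recommend {brand}")
# Crest: 100%, Colgate: 80%, Oral-B: 80%, Sensodyne: 40%
```

Every one of those percentages is simultaneously true, so the advertiser just picks the one that sells.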
Now, for the next part, I will ask you a question. If we hypothetically say that high school dropout rates in a certain country go from 5% annually to 10%, is that a 5% increase or a 100% increase? Because if you're at 5% and you add five percentage points, you obviously get 10%. But if you're making, say, $5 an hour and you get a 100% raise, you'll be making $10 an hour. So which one is it? I'm sure many of you are saying that seems like a pointless question: yes, you add five to get to 10, but the actual number of people dropping out of school increases by 100%.
Well, in the spirit of this video, let's ask something else. Which of these paints a more accurate picture, if one of them were published in the New York Times or Forbes or wherever? Which tells people more about what's going on? I'm genuinely curious what you have to say about it, because I think we'd hear different answers from different people. But for the next part, I think we'll all agree. What if, hypothetically, the dropout rate were one in a million people, and then the following year it reached two in a million? That's 0.0001% to 0.0002%, a difference, again, of 0.0001 percentage points.
But that's also a 100% increase in the actual number of people dropping out of school. So which of these two headlines do you think offers a better picture? Again, we might hear different answers, but I think we can agree that "a 100% increase" makes it seem like a worse problem than it is. If five people in the entire nation drop out of school, and the next year ten people drop out, I wouldn't necessarily call that an epidemic yet. Now, using numbers like this in a misleading way is not actually hypothetical, because it happened a few decades ago in the UK.
It wasn't with school dropout rates, though, but with a birth control pill. In 1995, the UK's Committee on Safety of Medicines issued a warning that a certain type of birth control pill increased the risk of life-threatening blood clots by 100%. What that really meant is that with the old, second-generation pill, about 1 in 7,000 women developed a blood clot, while with the new pill, about 2 in 7,000 women did. So yes, the actual number of women who developed a blood clot increased by one hundred percent. That's true. But if we dig a little deeper, we see that with the old pill the risk was about 0.014%, while with the new pill it was about 0.028%, which doesn't seem worthy of a breaking-news alert.
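A quick sketch of the two framings, using the pill numbers above:

```python
old_risk = 1 / 7000  # second-generation pill
new_risk = 2 / 7000  # newer pill

relative_increase = (new_risk - old_risk) / old_risk
absolute_increase = new_risk - old_risk

print(f"Relative increase: {relative_increase:.0%}")              # 100%
print(f"Absolute increase: {absolute_increase * 100:.4f} points") # ~0.0143 percentage points
print(f"Old risk: {old_risk:.3%}, new risk: {new_risk:.3%}")      # 0.014% vs 0.029%
```

Both numbers are arithmetically correct; the headline writer just picks the framing that fits the story.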
But articles were published about this misleading statistic, and as a result, naturally, tens to hundreds of thousands of women stopped taking this birth control pill. A year later, that scare was attributed to 13,000 unwanted pregnancies, many of which were teenage pregnancies... lots of teen pregnancy stories in this video... moving on. Did you know that lice are good for your health? That sounds pretty stupid, right? But people actually thought this at one point, and that brings us to the part of this video titled "correlation, or causation, or both." Remember, it is usually very easy to determine that two things are correlated from a statistical test, but causality is a completely different thing that is not as easy to detect.
However, people are very quick to assume that A causes B just because A is correlated with B. Sometimes the logic is stupidly obvious: fast-moving wind turbines are positively correlated with fast wind, and as one goes up, the other goes up. But does that mean fast-moving wind turbines cause fast winds? Obviously not; it's the other way around. In many cases, though, it is not so obvious, like saying that children who watch more violent television programs are more violent. Does this mean that these programs make children more violent? That could be possible, and it would definitely be an immediate thought for a lot of people.
But what if children who are more violent simply watch more violent television shows? That also seems perfectly reasonable. So we can't jump to conclusions too soon, which is what a lot of people would probably do. Or take the Middle Ages: Europeans noticed that people who had lice were usually healthy, while people who were sick rarely had lice. As a result, they assumed that lice made people healthier, when in reality lice are very sensitive to body temperature, so when people came down with a fever or something similar, the lice would find another host. On that subject, we have the third-cause fallacy, in which two correlated events do not actually cause each other at all, but rather a third thing causes both. For example, ice cream sales do not cause an increase in heat stroke, or vice versa, although the two are correlated; instead, warm weather is the cause of both. Or consider that over the last few decades, atmospheric CO2 has increased along with obesity levels. So does one cause the other? No: richer populations tend to eat more and also produce more CO2.
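Here is a minimal simulation of the third-cause setup using the ice cream and heat stroke example: temperature drives both variables, so they correlate strongly with each other even though neither causes the other. All numbers are made up for illustration.

```python
import random

random.seed(0)

# Daily temperature is the hidden third cause (made-up numbers).
temps = [random.uniform(10, 35) for _ in range(365)]

# Both quantities depend on temperature plus independent noise;
# neither depends on the other.
ice_cream_sales = [2.0 * t + random.gauss(0, 5) for t in temps]
heat_strokes = [0.5 * t + random.gauss(0, 3) for t in temps]

def corr(xs, ys):
    """Pearson correlation, written out so no external library is needed."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx ** 0.5 * vy ** 0.5)

print(f"corr(ice cream, heat stroke) = {corr(ice_cream_sales, heat_strokes):.2f}")
# Strong positive correlation, created entirely by the shared cause.
```

A statistical test run on these two series alone would happily report a significant correlation; nothing in the numbers tells you a third variable is pulling the strings.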
And sometimes it may not be very clear what is causing what. A while ago, it was found that students who smoke cigarettes get lower grades. That could mean that smoking causes lower grades, or maybe it means that getting bad grades causes smoking: perhaps the extra stress that comes with lower grades increases the chance that a student will pick up that first cigarette. That also seems like a reasonable explanation. Or it could be that a variety of third factors are actually responsible for both. So even when you're looking at a statistical test with precise numbers, it's pretty crazy how far from the truth you can be. Next we have a story that will probably make people jump to conclusions. In the early 1970s, someone noticed that Berkeley accepted 44% of male applicants to its graduate school, but only 35% of female applicants. Now, right there, half of the internet goes: "Wait, only 35% of female applicants?!" That was a real, unedited clip of everyone on the internet. Now, these numbers are true, but very misleading. When you look at how male and female applicants applied to each program within the graduate school, the assumed bias not only disappears but in some ways reverses. Looking closely at the top row of the admissions table, you'd see a program with a high acceptance rate; in fact, women had a much higher acceptance rate there, though it was still high for everyone. However, many more men applied to that program, while more women applied to the programs further down,
the ones with much lower acceptance rates. So, since a higher percentage of women were applying to the programs with higher rejection rates, the overall acceptance rate for women was guaranteed to be lower, even though they were, in fact, slightly favored in a couple of departments. So either of these headlines could be published with statistics to back it up; all you have to do is choose which one you want to use and throw away the other. Include it in an article, put it in bold at the top, put carefully selected statistics below to back it up, and you have a story. This was an example of Simpson's paradox, where looking at the data as a whole tells a totally different story than grouping the data appropriately, which I'm sure many of you are aware of, but I had to include it here.
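A minimal sketch of Simpson's paradox with invented admission numbers (not the real Berkeley data): women do better within each department, yet worse overall, because they apply more often to the selective department.

```python
# Hypothetical (admitted, applied) counts per department and sex.
# The numbers are invented to illustrate the paradox, not the real
# Berkeley data.
data = {
    "Dept A (easy to get into)": {"men": (80, 100), "women": (18, 20)},
    "Dept B (hard to get into)": {"men": (4, 20), "women": (25, 100)},
}

totals = {"men": [0, 0], "women": [0, 0]}
for dept, groups in data.items():
    for sex, (admitted, applied) in groups.items():
        totals[sex][0] += admitted
        totals[sex][1] += applied
        print(f"{dept}, {sex}: {admitted / applied:.0%} admitted")

for sex, (admitted, applied) in totals.items():
    print(f"Overall, {sex}: {admitted / applied:.0%} admitted")
# Women lead in both departments (90% vs 80%, and 25% vs 20%),
# yet trail overall (36% vs 70%), purely because of where they applied.
```

The aggregate numbers and the per-department numbers are both true; they just answer different questions.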
Remember the story of the blonde woman who robbed the elderly woman? Well, as I said, this is a famous case, but not because of the use of statistics; rather, because of the misuse of statistics in the courtroom. This was a classic example of the prosecutor's fallacy. This fallacy arises when people assume that the probability of A given B is the same as the probability of B given A,
which, as I'm sure many of you know, is not usually true. Say that behind this curtain there is an animal with four legs; that's the given. What is the probability that it is a dog? Well, you'd probably think something like: it could be a dog, it could be a cat, it could be a cheetah, it could be many other things, and if you had to come up with a number, you might say one in a hundred, one in a thousand, or whatever. But instead, suppose I said that behind this curtain there is a dog.
Now that's the given; what is the probability that it has four legs? Well, that's almost a guarantee, because most dogs have four legs. So you see, swap the given and the question at hand, and the probability can change a lot. Now let's revisit what I said before: "It turned out there was less than a 1 in 12 million chance that a random couple who happened to fit all of those descriptions was innocent, so the jury returned a guilty verdict." That was wrong. What the statistics actually showed was that, given an innocent, randomly selected couple, the odds of them fitting all the descriptions were 1 in 12 million.
But then I said what the jury had also assumed: that if you swap the given and the question, the probability stays the same, which, as we just saw, can be very wrong. The first statement should make sense: pull a random couple out of a mall, and there is a very slim chance that all of those characteristics apply to them. But equating it with the second is a false assumption, and that is the prosecutor's fallacy. What we are actually given is: here is a couple that fits all of those descriptions. If, say, ten couples in the entire city fit all of them, then for a given one of those couples there is a 1 in 10 chance of being guilty, or a 9 in 10 chance of being innocent.
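A sketch of that base-rate arithmetic: what matters is how many couples fit the description, not the 1 in 12 million match probability on its own. The population figure below is invented for illustration.

```python
# If, as in the example above, ten couples in the whole city fit every
# description, then picking one of them at random:
matching_couples = 10  # hypothetical count from the example
p_guilty = 1 / matching_couples
print(f"P(guilty | fits description) = {p_guilty:.0%}")  # 10%, not 1 in 12 million

# More generally, the expected number of innocent matches drives the answer:
p_match_if_innocent = 1 / 12_000_000
n_couples = 24_000_000  # invented population of couples in the region
expected_innocent_matches = n_couples * p_match_if_innocent
print(f"Expected innocent matches: {expected_innocent_matches:.0f}")  # ~2
```

With millions of couples around, even a 1 in 12 million description is expected to fit several innocent people, so fitting the description is nowhere near proof of guilt.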
Not 1 in 12 million. And remember Sally Clark, who was convicted of murdering her two children? That is also a famous case of the misuse of statistics. It turned out that bacterial test results had actually been withheld that would have revealed more specific information, and a simple multiplication of two probabilities didn't tell the full story at all, since the two events were assumed to be independent of each other when genetic or environmental factors definitely could have been at play. As I said, Sally was convicted and sentenced to life in prison, but she only served three years, as her convictions were finally overturned in early 2003.
Until then, Sally Clark was widely vilified in the press as a child murderer, and she was never able to mentally recover from the false accusation. After her release, she developed serious psychiatric problems, and she died in her home from alcohol poisoning in 2007. I'll say it again for everyone who wasn't following: a woman lost two of her children to natural causes, was accused of murdering them, was tried and found guilty through a misuse of statistics, spent three years in prison, and even after her release was unable to recover mentally, dying about four years later. If you don't find that story crazy, then I don't know what to tell you.
And in fact, the outcome of this case led the Attorney General to order a review of hundreds of other similar cases. Now, I don't want to end this video with that, so let's look at one more classic misuse of statistics. This has to do with how data is presented, and it often involves bar graphs that don't have zero as their baseline. For example, Fox News once showed a chart detailing what would happen to the top tax rate if the Bush tax cuts expired. Do you see the problem? Yes, the axis starts at 34 percent at the bottom, which makes an increase of not even 15 percent look like a few hundred percent.
With the axis starting at zero, the graph looks far less dramatic. Or take the Terri Schiavo case from about two decades ago, which involved a debate over whether a feeding tube should be removed from a woman in an irreversible vegetative state. During that time, CNN published a graphic detailing which political parties agreed with the court's decision. It appeared that Democrats were significantly more supportive of the decision, but because the baseline was not zero, the chart looked much different than it should have; with a zero baseline, the differences look modest. Or, in 2015, the White House posted a tweet about the increase in students receiving high school diplomas, with an extremely misleading graph.
They made an increase of about ten percent look like almost 200%. Or in music: a chart was released showing views among top artists, making it look like Drake was ahead by a wide margin, when in reality he had about a five percent lead.
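A minimal matplotlib sketch of the truncated-axis trick, in the spirit of the Fox News example above (the 35% and 39.6% values are assumptions based on the top tax rates in effect at the time, not figures from the video):

```python
import matplotlib.pyplot as plt

# Top tax rate now vs. if the cuts expired. The values are assumed from
# the rates in effect at the time; the point here is the axis, not the data.
labels = ["Now", "If cuts expire"]
rates = [35.0, 39.6]

fig, (ax_trunc, ax_zero) = plt.subplots(1, 2, figsize=(8, 3))

ax_trunc.bar(labels, rates)
ax_trunc.set_ylim(34, 42)  # truncated baseline: visually exaggerates the gap
ax_trunc.set_title("Baseline at 34% (misleading)")

ax_zero.bar(labels, rates)
ax_zero.set_ylim(0, 42)    # zero baseline: the honest picture
ax_zero.set_title("Baseline at 0%")

for ax in (ax_trunc, ax_zero):
    ax.set_ylabel("Top tax rate (%)")

plt.tight_layout()
plt.show()
```

The underlying bars are identical; only the y-axis changes, and with it the story the chart appears to tell.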
They can make very trivial events seem very trivial. seriously and vice versa. You don't even need to use bad numbers for this. But I hope this shows how easy and simple math and statistics can be in the real world outside of a school environment. Especially and with that I'll end that video. there, if you liked it, make sure to LIKE and subscribe. Don't forget to follow me on Twitter and try them on the Facebook group for updates on everything. Press the bell if you don't receive notifications. Share comments and all those other YouTube forums and I.
We'll see them in the next video.
