
Statistics made easy ! ! ! Learn about the t-test, the chi square test, the p value and more

Jun 08, 2021
Learning statistics doesn't have to be difficult. Instead of bombarding you with complicated formulas and statistical theory, I will guide you through a way of thinking that will allow you to address the most common statistical questions. When we analyze sample data, for the most part we see two things: differences between groups (for example, men are taller than women) and relationships between variables (for example, taller people weigh more than shorter people). The big question is whether those differences and relationships are real, and I'm going to explain what we mean by the term "real". In the next few minutes, we're going to look at a very simple data set and see how, by looking at various combinations of variables and variable types, we can identify very specific differences between groups and very specific relationships between variables. I will explain when and how to use statistical tests and how to interpret their results.
Now let's imagine that we have a research question about the height and weight of people living in Ireland. Of course, we cannot measure the height and weight of the entire population, so we take a random sample of the population and measure the weight and height of the people in that sample. We also collect some additional information, like the gender and age group of each person in our sample, and organize this data in a spreadsheet or data set with the various attributes in columns. These are called variables, and these variables will be the object of our investigation.

Now, most of the data sets you work with will contain two types of variables: categorical and numerical. Categorical variables like gender contain categories, as the name suggests. Think of them as groups or buckets into which data can be organized, in this case men and women. Numerical variables such as height contain numbers, as the name suggests, and can be organized on a number line. To better understand our data and make sense of it, we summarize and visualize it. In the case of categorical data, we can count the number of observations in any given category and represent them in a table and in a bar graph. To summarize numerical data, we are first interested in the distribution of the data, so we could describe the range of the data and the interquartile range, and we could also include the standard deviation.

To get an idea of the center of the data, we use the median, which divides the data into two equal halves, and the mean, which is the average. The mean is probably the most commonly used summary value to represent this type of data. We can visualize the data using a boxplot, which is a visual representation of the range, the interquartile range and the median, and of course we can create a histogram, which gives us the shape of the data. So I hope you can see that this process of summarizing and visualizing the data takes it from just being numbers and words on a spreadsheet and turns it into something that has meaning for us, something we can understand, something we can think about.

Now, in this very simple data set, we have two categorical and two numerical variables, and things start to get interesting when we look at combinations of variables. For example, we can take a categorical and a numerical variable, like gender and height, group the data by gender (the categorical variable) and create a summary of the numerical variable (in this case height) separated into those two groups. Looking at the summary, we can see that in our sample data men are on average taller than women. What I want you to see here is that we have analyzed a combination of a categorical and a numerical variable but, as you can imagine, there are other possible combinations of variables that we could have analyzed.
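As a rough illustration of the summaries described above, here is a minimal Python sketch that uses only the standard library. The eight-row sample and its column layout are invented for the example; they are not the data set shown in the video.

```python
import statistics
from collections import Counter, defaultdict

# Hypothetical sample rows: (gender, age_group, height_m, weight_kg).
sample = [
    ("male", "18-39", 1.80, 82.0),
    ("male", "40-59", 1.76, 85.5),
    ("male", "60+", 1.72, 78.0),
    ("male", "18-39", 1.84, 90.0),
    ("female", "18-39", 1.65, 60.0),
    ("female", "40-59", 1.62, 64.5),
    ("female", "60+", 1.58, 58.0),
    ("female", "40-59", 1.68, 66.0),
]

# Categorical variable: count the observations in each category.
gender_counts = Counter(row[0] for row in sample)

# Numerical variable: spread (range, IQR, standard deviation) and center.
heights = [row[2] for row in sample]
q1, _, q3 = statistics.quantiles(heights, n=4)
height_summary = {
    "min": min(heights),
    "max": max(heights),
    "iqr": q3 - q1,
    "sd": statistics.stdev(heights),
    "median": statistics.median(heights),
    "mean": statistics.mean(heights),
}

# Categorical + numerical: group heights by gender, then summarize each group.
heights_by_gender = defaultdict(list)
for gender, _age_group, height, _weight in sample:
    heights_by_gender[gender].append(height)
mean_height_by_gender = {
    gender: statistics.mean(values) for gender, values in heights_by_gender.items()
}

print(gender_counts)
print(height_summary)
print(mean_height_by_gender)
```

The grouped means only describe this sample. Whether the difference says anything about the wider population is what the statistical tests are for.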
We could have looked at height and weight, which are both numerical. We could have looked at gender and age group, which are both categorical. In each case we can see differences between groups or relationships between variables, and in each case there are specific statistical tests that we can apply to see whether what we are seeing in the sample data has implications for what we think about the general population. Can we infer that something is statistically significant?

So let's take a quick look at the five main combinations of data types and, first, see what we could observe in our sample data given that combination and, second, which statistical test we could apply to determine whether or not we can infer something about the general population. For a single categorical variable like gender, we could do a one-sample proportion test. For two categorical variables, we would do a chi-square test. For a single numerical variable, we would do a t-test. If we have a categorical and a numerical variable, we do a t-test, or an analysis of variance (ANOVA) if there are more than two categories in the categorical variable. And for two numerical variables, we do a correlation test.
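The five combinations listed above amount to a small lookup table. The sketch below writes it out in Python; the function name `choose_test` and its arguments are hypothetical and exist only for this illustration.

```python
def choose_test(variable_types, n_categories=2):
    """Return the test suggested in the transcript for a combination of
    variable types. `variable_types` holds one or two of the strings
    "categorical" and "numerical"; `n_categories` only matters when one
    categorical and one numerical variable are combined."""
    kinds = tuple(sorted(variable_types))
    if kinds == ("categorical",):
        return "one-sample proportion test"
    if kinds == ("categorical", "categorical"):
        return "chi-square test"
    if kinds == ("numerical",):
        return "one-sample t-test"
    if kinds == ("categorical", "numerical"):
        return "t-test" if n_categories <= 2 else "ANOVA"
    if kinds == ("numerical", "numerical"):
        return "correlation test"
    raise ValueError(f"unsupported combination: {variable_types!r}")


print(choose_test(["categorical"]))                  # gender
print(choose_test(["categorical", "categorical"]))   # gender and age group
print(choose_test(["numerical"]))                    # height
print(choose_test(["numerical", "categorical"]))     # height by gender
print(choose_test(["numerical", "categorical"], 3))  # height by age group
print(choose_test(["numerical", "numerical"]))       # height and weight
```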
Now, I'm going to come back to each of these scenarios and each of these tests, so don't panic at this point. What I want you to see is how the data can be split up. In just a few minutes we're going to take each of these scenarios and look at exactly what questions you can ask, how you can apply a statistical test and, most importantly, how to interpret the results.

Before we continue, I just want to say a big thank you to BioMed Central, or BMC, for sponsoring this video. BMC is a publisher of open access journals, which means that the full text of all published articles is available free of charge to anyone in the world. I'm editor-in-chief of one of the journals they publish, called Globalization and Health, and I'm really impressed with them as a company. I think they have integrity and I honestly believe they are making the world a better place. They have a portfolio of over 300 journals, so check them out at biomedcentral.com. I'll put a link in the description below.

At this point, I want to say that it's not good science to take a data set and just explore it blindly, hoping you find something that is statistically significant. Before you interrogate the data, start by defining your question and your hypothesis, define your null hypothesis, identify the alpha value that you are going to use, and only then analyze the data. So let's see what we can do.
With just a categorical variable like gender, we might ask whether there is a difference in the number of men and women in the population. We could state that as a hypothesis (there is a difference between the number of men and women in the population) and test whether that's the case. When we take a good look at our sample data, we actually see that there is a difference in the proportion of men and women. So should we get excited? Well, no, not yet. Remember that this is just sample data. We could have selected, by chance, a sample that simply happened to show a difference.

So let's consider the possibility that there is actually no difference in the number of men and women in the population. We call that our null hypothesis. If it were true, what is the probability that we would see the difference we have observed, or a greater difference? If we can show that this probability is low, then we can have a degree of confidence that the null hypothesis is incorrect and we can reject it. But before calculating this probability, which we call the p-value, we must be clear about how small is small enough: below which p-value would we reject the null hypothesis? We need to decide on that threshold before we calculate the p-value. We call that threshold the alpha value, and for the rest of the examples in this video we are going to use an alpha value of 0.05, or five percent.

So we really have two scenarios: the null hypothesis, which is that there is no difference, and the alternative hypothesis, which is that there is a difference. The next step is to apply a statistical test, in this case a one-sample proportion test, which generates a p-value. If p is less than alpha, we can reject the null hypothesis and state that the difference we observe is statistically significant.

If we add another categorical variable, in this case age group, we can ask a research question such as whether the proportion of men and women differs between the age groups. Our hypothesis is that the proportion of men and women we observe depends on the age category. We collect our sample data, look at it, and see that yes, in fact, the proportions change between the age groups. In other words, in our sample data, the proportions depend on the age category. Now, is that due to chance? Well, let's test the idea that the proportions are all the same and that they are independent of the age category. That is our null hypothesis.
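Going back to the single categorical variable for a moment, the one-sample proportion test described earlier can be sketched as follows. This version uses an exact binomial test, which is one common way to test a single proportion; the video does not say which method it uses, and the counts (60 men in a sample of 100) are made up.

```python
import math

ALPHA = 0.05  # threshold chosen before looking at the p-value


def binomial_two_sided_p(successes, n, p0=0.5):
    """Exact two-sided p-value: total probability, under the null proportion
    p0, of every outcome that is no more likely than the observed one."""
    probabilities = [
        math.comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(n + 1)
    ]
    observed = probabilities[successes]
    # The small tolerance keeps symmetric outcomes from being lost to rounding.
    tail = sum(p for p in probabilities if p <= observed * (1 + 1e-9))
    return min(1.0, tail)


# Hypothetical counts: 60 men in a random sample of 100 people.
men, sample_size = 60, 100
p_value = binomial_two_sided_p(men, sample_size)
reject_null = p_value < ALPHA
print(f"p-value = {p_value:.4f}, reject null hypothesis: {reject_null}")
```

With these made-up counts the p-value comes out just above 0.05, so a 60/40 split in a sample of 100 is not enough to reject the null hypothesis at an alpha of 0.05, even though the sample proportions look different.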
Here we can perform a chi-square test, which gives us a p-value. If the p-value is less than alpha, we can reject the null hypothesis and state that our observation is statistically significant.

If we want to look at just one numerical variable on its own, like height, then we don't have any groups to look for differences between, and we don't have another numerical variable with which to look for some kind of association or relationship. So what questions can we ask? We might have some theoretical value that we want to compare our data to. For example, in the case of average height, we might have some historical data.
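For the chi-square test on gender and age group described above, a minimal sketch might look like this. The counts are invented, and the p-value helper is a deliberate simplification: it uses a closed form that only holds when the degrees of freedom are even, which covers a table with two genders and three age groups (2 degrees of freedom).

```python
import math


def chi_square_independence(table):
    """Chi-square statistic and degrees of freedom for a table of counts
    (rows = one categorical variable, columns = the other)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


def chi_square_p_value_even_df(statistic, df):
    """Upper-tail probability of the chi-square distribution.
    Closed form valid for even degrees of freedom only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles even degrees of freedom")
    half = statistic / 2
    return math.exp(-half) * sum(
        half**i / math.factorial(i) for i in range(df // 2)
    )


# Hypothetical counts: rows are men and women, columns are three age groups.
counts = [
    [30, 40, 50],
    [50, 40, 30],
]
statistic, df = chi_square_independence(counts)
p_value = chi_square_p_value_even_df(statistic, df)
print(f"chi-square = {statistic:.2f}, df = {df}, p-value = {p_value:.4f}")
```

For these counts every expected cell is 40, so the statistic is 10 with 2 degrees of freedom and the p-value is exp(-5), about 0.0067, which is below an alpha of 0.05.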
We might ask whether the current population is significantly different from that historical data. Our question might be: is the average height different from a previously established height? Let's imagine that the previously established height was 1.4 meters. We want to know whether the average height in our current population is different from that, so our hypothesis is that there is a difference. Again, we collect some sample data and find that the average height is in fact different from the historical height. Is it statistically significant? Well, if there were no difference, what would be the chances of observing the difference we did, or a larger one? We perform a t-test comparing the means, and if the p-value is less than alpha, we can reject the null hypothesis and state that the observed difference is statistically significant.

Now let's consider a categorical and a numerical variable, and the question becomes: is there a difference between the average height of men and women?
In this case, our hypothesis is that there is a difference. In our sample, we observe a difference. Let's assume there is no difference; that is our null hypothesis. We carry out a t-test, which gives us a p-value. If p is less than alpha, we reject the null hypothesis and state that the observation is statistically significant. If we had a categorical variable with more than two categories, such as an age group with three categories, then instead of a t-test we would do an analysis of variance, or ANOVA.

Now let's look at the last combination of variable types in our data set: two numerical variables, height and weight. Here we could start with the question: is there a relationship between height and weight? Our hypothesis is that there is a relationship. We collect sample data, look at it, and see what looks like some kind of relationship. Is it real? Suppose it is not; that is, suppose there is no correlation between the two variables. If that were the case, what are the chances that we would see the relationship that we see? Here we perform a correlation test.
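The two t-tests described above (one sample against a reference value, and two groups against each other) can be sketched in Python as follows. Several things here are assumptions: the heights are made up, the two-group version is Welch's t-test (which does not assume equal variances; the video does not say which variant it uses), and the p-value comes from numerically integrating the t density with Simpson's rule because the standard library has no t distribution.

```python
import math
import statistics


def t_two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value for Student's t distribution, obtained by
    integrating the density from 0 to |t| with Simpson's rule."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )

    def density(x):
        return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

    h = upper / steps
    total = density(0.0) + density(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    area = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * area)


def one_sample_t_test(sample, reference_mean):
    """Compare a sample mean with a previously established value."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    t_stat = (statistics.mean(sample) - reference_mean) / standard_error
    return t_stat, t_two_sided_p(t_stat, n - 1)


def welch_t_test(a, b):
    """Compare the means of two independent groups (Welch's variant)."""
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t_stat = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va**2 / (len(a) - 1) + vb**2 / (len(b) - 1))
    return t_stat, t_two_sided_p(t_stat, df)


# Hypothetical heights in meters.
heights = [1.62, 1.70, 1.75, 1.68, 1.80, 1.66, 1.73, 1.78]
men = [1.78, 1.82, 1.75, 1.80, 1.85, 1.79]
women = [1.62, 1.66, 1.60, 1.68, 1.64, 1.65]

t_one, p_one = one_sample_t_test(heights, reference_mean=1.4)
t_two, p_two = welch_t_test(men, women)
print(f"one-sample: t = {t_one:.2f}, p = {p_one:.4g}")
print(f"two-sample: t = {t_two:.2f}, p = {p_two:.4g}")
```

In both cases the decision rule is the same as before: reject the null hypothesis only if the p-value is below the alpha chosen in advance. Because the p-value is computed as one minus an integral, very small p-values are only accurate to roughly ten decimal places in this sketch.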
A correlation test gives us two things. First, it gives us a correlation coefficient, which tells us something about the nature of the association between the two variables, and I'm going to talk about that in just a minute. It also, of course, gives us a p-value, and again, if the p-value is less than alpha, we can reject the null hypothesis and state that the correlation we see is statistically significant.

The correlation we see can be represented by a number that we call the correlation coefficient, so let's talk about that for a second. The correlation coefficient is a number between -1 and 1 that describes the relationship between two numerical variables. If, as the variable on the x-axis increases, the variable on the y-axis also increases, the correlation is positive, and a perfectly positive correlation gives a coefficient of 1.

Of course, we have barely been able to scratch the surface in terms of what there is to learn about statistical analysis.
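To make the correlation test described above concrete, here is a small Python sketch. It computes the Pearson correlation coefficient directly and, as a swapped-in technique, estimates the p-value with a permutation test instead of the t-distribution-based test that statistical packages usually report. The heights and weights are invented.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient, a number between -1 and 1."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(x, y, n_permutations=5000, seed=0):
    """Share of random re-pairings of x and y whose correlation is at least
    as strong as the observed one. Shuffling y breaks any real association,
    which is what the null hypothesis of no correlation assumes."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    at_least_as_strong = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed:
            at_least_as_strong += 1
    # Adding 1 to both counts avoids reporting a p-value of exactly zero.
    return (at_least_as_strong + 1) / (n_permutations + 1)


# Hypothetical heights (m) and weights (kg) for ten people.
heights = [1.55, 1.60, 1.63, 1.68, 1.70, 1.74, 1.78, 1.82, 1.85, 1.90]
weights = [52, 58, 57, 65, 68, 70, 77, 80, 84, 92]

r = pearson_r(heights, weights)
p_value = permutation_p_value(heights, weights)
print(f"r = {r:.3f}, permutation p-value = {p_value:.4f}")
```

A coefficient close to 1 means weight rises almost in step with height in this sample; the p-value then says how often an association that strong would appear if height and weight were unrelated.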
If you want to learn more, go to learnmore365.com; I have some courses there that you will love. If you want to learn about R programming (R is a programming language used for statistical analysis; it's free, it's very powerful, it's easy to use, it's absolutely fantastic), I have a YouTube channel that focuses specifically on that, called R Programming 101. I'll put a link in the description below, so go check it out. Otherwise, subscribe to this channel, hit the notification bell if you want to receive notifications of future videos, leave your comments below and share this video with whoever you think might find it useful. Until next time, take care.
