ANOVA – Analysis of Variance

What is ANOVA?

ANOVA is a statistical test used to check whether there is a significant difference between the means of 2 or more groups. The test uses 2 types of variables:

1. Independent – always a categorical variable with groups. Example – gender is a categorical variable with 2 groups, Male and Female.

2. Dependent – always a numeric variable. Example – salary of an employee.

Why do we use ANOVA?

We use it to check whether groups differ significantly by comparing their means.
Example – We can check whether the salaries of male and female employees are significantly different or the same.
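To make "comparing the means" concrete, here is a minimal sketch of the F statistic that ANOVA is built on: the variation between group means divided by the variation within groups. The function name f_statistic and the salary figures are invented for illustration; they are not from any real organization.

```python
from statistics import mean

def f_statistic(groups):
    """Return the one-way ANOVA F statistic for a list of numeric groups."""
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = mean([x for g in groups for x in g])

    ss_between = 0.0  # how far each group mean sits from the grand mean
    ss_within = 0.0   # how far each value sits from its own group mean
    for g in groups:
        group_mean = mean(g)
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        ss_within += sum((x - group_mean) ** 2 for x in g)

    ms_between = ss_between / (k - 1)    # mean square between groups
    ms_within = ss_within / (n - k)      # mean square within groups
    return ms_between / ms_within

# Hypothetical salaries in thousands (assumed values, for illustration only)
male = [50, 52, 54]
female = [51, 53, 55]

f_value = f_statistic([male, female])
print(f_value)  # 0.375
```

An F statistic close to 0 means the group means are close together relative to the spread inside each group. A large F statistic suggests the group means differ by more than chance alone would explain.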

If there is a significant difference, we reject the null hypothesis.
If there is no significant difference, we do not reject the null hypothesis.

There are typically two types of hypotheses:

1. Null Hypothesis – there is no change; the group means are the same.

2. Alternate Hypothesis – there is a change; at least one group mean is different.

The significance level is chosen before the test is run and is often decided by the client. In this tutorial we use a significance level of 0.05: if the p-value returned by the test (referred to in this tutorial as the significant value) is below 0.05, we reject the null hypothesis and say there is a significant difference between the groups.

When we use 1 independent variable and 1 dependent variable, we are doing One-Way ANOVA.
When we use multiple independent variables and 1 dependent variable, we are doing n-Way ANOVA, where n is the number of independent variables.
For example, with 2 independent variables and 1 dependent variable, we are doing Two-Way ANOVA.
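The decision rule and the naming convention above can be sketched as two small helpers. The function names anova_decision and anova_type are hypothetical and written only for this tutorial. The default of 0.05 is the significance level used here; a client may choose a different one.

```python
def anova_decision(p_value, alpha=0.05):
    """Compare the p-value from the test with the chosen significance level."""
    if p_value < alpha:
        return "reject the null hypothesis"
    return "do not reject the null hypothesis"

def anova_type(n_independent):
    """Name the ANOVA by the number of independent variables used."""
    names = {1: "One-Way ANOVA", 2: "Two-Way ANOVA"}
    return names.get(n_independent, f"{n_independent}-Way ANOVA")

print(anova_decision(0.45))   # do not reject the null hypothesis
print(anova_decision(0.003))  # reject the null hypothesis
print(anova_type(2))          # Two-Way ANOVA
```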

Example of One-Way ANOVA:

Problem Statement: Is there a significant difference between the salaries of male and female employees in an organization named ‘Y’?

Here, the dependent variable is salary, which is numeric. The independent variable is gender, with 2 groups: male and female.

Let’s say that, after running the test, the significant value (p-value) comes out as 0.45. This is above 0.05, so we do not reject the null hypothesis, and we say there is no significant difference between the salaries of males and females.
In other words, the data give no evidence that employees in organization ‘Y’ are paid differently on the basis of their gender.
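One way to run this kind of test is with the f_oneway function from SciPy. This is a sketch, not the only approach. The salary figures are invented for illustration, so the p-value it prints will not be the 0.45 used in the example above.

```python
from scipy.stats import f_oneway

# Hypothetical salaries in thousands (assumed values, for illustration only)
male = [52, 48, 61, 55, 47, 58]
female = [50, 53, 57, 49, 60, 54]

alpha = 0.05                       # significance level used in this tutorial
result = f_oneway(male, female)    # One-Way ANOVA: 1 independent variable (gender)

print(f"F = {result.statistic:.3f}, p-value = {result.pvalue:.3f}")
if result.pvalue < alpha:
    print("Reject the null hypothesis: salaries differ significantly by gender.")
else:
    print("Do not reject the null hypothesis: no significant difference found.")
```

Each group is passed as a separate argument, so the same call works unchanged when the independent variable has 3 or more groups.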

Example of Two-Way ANOVA:

Problem Statement: Is there a significant difference between marks of students on the basis of subjects and books they refer to?

Here, the dependent variable is marks, which is numeric. The independent variables are subjects and books. Notice that there are 2 independent variables.

Let’s say that, after running the test, the significant value (p-value) comes out as 0.003. This is below 0.05, so we reject the null hypothesis, and we say there is a significant difference between marks scored on the basis of subjects and books. In practice, Two-Way ANOVA reports a separate p-value for each independent variable, so the same comparison against 0.05 is made for subjects and for books individually.
In other words, the subjects and the books students refer to play a significant role in the marks scored.
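The following is a minimal sketch of Two-Way ANOVA computed by hand with NumPy, using SciPy only for the F distribution. It assumes a balanced design, meaning the same number of students in every subject and book combination. The subject names, book names and marks are all invented for illustration. The sketch also tests the interaction between subjects and books, which goes one step beyond the example above.

```python
import numpy as np
from scipy.stats import f

# Hypothetical marks (assumed values).
# axis 0 = subject, axis 1 = book, axis 2 = students in that combination
marks = np.array([
    [[78, 82, 80], [65, 70, 68]],   # Maths:   Book A, Book B
    [[88, 85, 90], [72, 75, 70]],   # Science: Book A, Book B
], dtype=float)

a, b, r = marks.shape               # subjects, books, students per cell
grand_mean = marks.mean()
subject_means = marks.mean(axis=(1, 2))
book_means = marks.mean(axis=(0, 2))
cell_means = marks.mean(axis=2)

# Sums of squares for each source of variation
ss_subject = b * r * np.sum((subject_means - grand_mean) ** 2)
ss_book = a * r * np.sum((book_means - grand_mean) ** 2)
ss_cells = r * np.sum((cell_means - grand_mean) ** 2)
ss_interaction = ss_cells - ss_subject - ss_book
ss_error = np.sum((marks - cell_means[:, :, None]) ** 2)

# Degrees of freedom
df_subject, df_book = a - 1, b - 1
df_interaction = df_subject * df_book
df_error = a * b * (r - 1)
ms_error = ss_error / df_error

def p_value(ss, df):
    """p-value of the F test for one source of variation."""
    f_stat = (ss / df) / ms_error
    return f.sf(f_stat, df, df_error)

p_subject = p_value(ss_subject, df_subject)
p_book = p_value(ss_book, df_book)
p_interaction = p_value(ss_interaction, df_interaction)

alpha = 0.05
for name, p in [("subject", p_subject), ("book", p_book),
                ("subject x book", p_interaction)]:
    verdict = "significant" if p < alpha else "not significant"
    print(f"{name}: p-value = {p:.4f} ({verdict})")
```

Each p-value is compared with 0.05 separately, so it is possible for one independent variable to be significant while the other is not.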