A. Overview of chi-squared tests
B. Step-by-step instructions for creating pivot tables in Excel

C. Step-by-step instructions for doing a chi-squared test in Excel
D. Interpreting your Excel output


We often need to know whether multiple groups are equally likely to do something, buy something, etc...Moreover, if two or more groups are not equally likely to do something, which group or population is more likely to do it?

So, like in all statistics, we take a sample from each group and try to see if their outcomes are similar. Of course, simply because the results from each sample aren't exactly equal doesn't mean that this observed difference is "statistically significant."  The differences we observe in our sampling could simply be the result of random sampling error. For example, we might just randomly get a lot of men in our sample who really like to shop, even if, in reality, women are more likely to shop. So, regardless of the sample results we obtain, we have to test to see if the differences we see could simply be due to random sampling error or if they are really indicative of a true difference between our two groups. 


Cross tabulations are an easy, convenient way to summarize data, especially when you have a lot of categorical responses (Yes/No, Coke/Pepsi, etc..). Excel can generate these kind of tables by creating what is called a Pivot Table.

Take an example from Dr. Paul's Marketing Analysis course several years ago. One group of students surveyed 57 women, between the ages of 18 to 25. The respondents were divided into 2 age groups, 18-21(=1) and 22-25(=2). The following question was asked: "On a scale from 1 to 5 (5="very important", 1="not important at all"), How important is it to you that you buy name brand clothes?" A portion of their Excel spreadsheet appears below:

Suppose they want to construct a table that shows the relationship between the age group and the relative importance of name brands.

Step 1: A cross tabulation can be easily constructed in Excel by making a Pivot Table. Go to Data, Pivot Table and hit next. Excel then asks you to input the range. Make sure that the entire range of the data is selected. In this example, the data runs from row 2 to 58, and is found in columns A, B and C. Hit next.


Step 2: Now, you are ready to set up the table. The labels of your series are in boxes to the right. You must always put at least one label as a heading (row or column). In addition, you must always put in the data field "what you want Excel to measure."


Step 3: You need to put the "how important" box as a heading field also. This is because you are going to be asking Excel to count the number of women in each age group who answered "1", "2", "3", "4" and "5". Be sure to double-click on your data box to make certain that you have asked Excel to "count" the number of women answering each response:

Follow the prompts and you table is placed on another page:

At first glance, you can see that in your sample, 12 out of 38 (31.5%) of women aged 18-21 do not consider brand name to be important at all, whereas only 2 of 19 (10.5%) of the older women made that claim. Clearly, within this sample of 57 women, the younger ones are less concerned with the brand name.

But (again, a HUGE but) can you infer that in the entire population women aged 22-25 on average are more concerned with brand names than women aged 18-21? Again, to make such an inference requires a test. In this case, the proper test is called a Chi-Squared test.


Step 1: State your null hypothesis. The hypothesis is that "the two age groups are equally likely to respond "1", "2", "3", "4" and "5" to the question". In other words, the probability that  women in the 18-21 age group rates the importance of name brands as a "1", "2", equal to the probability that women in the 22-25 age group rate name brands equally important: 

H0: pages 18-21 = pages 22-25

Notice that this data is categorical (not numeric).  As a result, it would not be appropriate to do a "t-test" to test this hypothesis. In this case, the proper test is what we call a Chi-squared test. 

Step 2:  Choose a critical level for the test and find the critical value.  The sampling distribution of a statistic which compares the "expected" frequency of a sample with the actual, or "observed" frequency is called Chi-Squared. For a sample this statistic is distributed like a Chi-square with (rows-1)*(columns-1) df (here, df = (2-1)*(5-1)=4).  See the Chi-Squared distribution for critical values. At the 10% level (90% confidence level) the critical Chi-Squared is 7.77.

Step 3: Calculate the test statistic.  The key to calculating the chi-squared statistic and testing this hypothesis is to compare the actual, or "observed" values in each or the table's cells with the "expected" frequency of response that would have occurred if the hypothesis were true.

The test statistic is computed by the formula:

where O = observed frequency in the sample in this class and E = expected frequency in the sample in this class. Remember, you can do a Chi-squared test with any dimension of table. You can have more than two groups and you can have any number of survey categories that you are comparing. It does change the df, though.

The expected frequency, E, is found by multiplying the relative frequency of this class in the hypothesized population (57) by the sample size. For example, the expected relative frequency of women who say that name brands are not important at all (1) is 14/57 (# who answered "1" divided by the total # surveyed). This gives you the number in that class in the sample if the relative frequency distribution across the classes in the sample exactly matches the distribution in the population. Notice that Chi-square is always greater than 0 and equals 0 only if the observed is equal to the expected in each class. Look at the equation and make sure that you see that a larger value of goes with samples with large differences between the observed and expected frequencies.

For example, the expected frequencies for each of the 5 categories, 2 age groups are given as follows (the first number in the subscript represents the age group (1=18-21, 2=22-25) and the second subscript is how they rated name brands on the 1-5 scale):

E11 = 14/57*38 = 9.33

E12 = 17/57*38 = 11.33

E13 = 9/57*38 = 6

E14 = 15/57*38 = 10

E15 = 2/57*38 = 1.33

E21 = 14/57*19 = 4.67

E22 = 17/57*19 = 5.67

E23 = 9/57*19 = 3

E24 = 15/57*19 = 5

E25 = 2/57*19 = 0.67

Doing this in the Excel spreadsheet give the following:

Notice that the table of "expected values" has the exact same dimensions as the original (observed) table. Now that we have calculated the expected values, we are ready to compare them to the observed values. Notice that, if the two age groups were equally likely to rate the importance of name brand as "1" (highly important), we should have seen 9.33 women in the younger age group mark "1" and 4.67 of women in the older age group mark "1". but, in our sample 12 (not 9.33) younger women chose "1". Only 2 (not 4.67) of the older group chose "1". In our sample, the younger women were more likely to rate name brands as very important.  However, is this result due to a true underlying difference in the two populations, or was it simply a result of random sampling error? To answer that question, we have to test it. We need to calculate the Chi-squared test statistic using the formula below and compare it to the critical chi-squared number we get from the chi-squared table.  Again, the formula is:   

The Chi-Squared Statistic =

(12-9.33)2/9.33 + (11-11.33)2/11.33 + (6-6)2/6 + (10-9)2/10 + (0-1.33)2/1.33
+ (2-4.67)2/4.67 + (6-5.67)2/5.67 + (3-3)2/3 + (6-5)2/5 + (2-0.67)2/0.67  

= 6.59


As we noted above, at the 10% level (90% confidence level) the critical Chi-Squared is 7.77 (see the Chi-Squared distribution). So, the hypothesis that the two groups are equally likely to respond that name brands are important, very important, etc... is not rejected at the 10% level.

In this case, we cannot claim with much confidence that the differences we see in this sample are generalizable to the population at large (at least on this one question).

So, why did we observe these differences in our samples there really aren't any? All this test result tells us is that we are not at least 90% confident that there is a difference between these two age groups in their preferences towards purchasing name brand clothes.  In other words, we don't have strong enough evidence that the sample differences are due to anything but random sampling error.  We should not, in this case, make the claim that the younger women care more about name brands. In the future, a larger ample size might allow us to make a stronger claim.

Elon University 
Campus Box 2700
Elon, NC 27244 (800) 334-8448

Last Modified: 08/07/03
Copyright Elon University