Are the averages of two groups reliably different?
An independent samples t-test compares the average scores of two groups to determine whether there is a statistically reliable difference between them. The test is appropriate for two independent samples, that is, two groups that do not overlap or influence each other. It evaluates whether an observed difference in sample averages is likely to exist in the populations from which the samples were drawn, under the assumption that the populations have equal variances.
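To make the mechanics concrete, here is a minimal sketch of the equal-variance (pooled) version of the test, written with only the Python standard library. The function names (`pooled_t_test`, `two_sided_p_value`, `t_pdf`) and the ratings in the usage lines are hypothetical and chosen for illustration; they are not taken from our research data. The p-value is approximated by numerically integrating the t distribution, which is adequate for illustration but is not a substitute for a statistics package.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p_value(t_stat, df, steps=2000):
    """Approximate two-sided p-value using Simpson's rule on [0, |t|]."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1 - 2 * area)


def pooled_t_test(group_a, group_b):
    """Independent samples t-test assuming equal population variances."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_stat = (mean(group_a) - mean(group_b)) / std_err
    return t_stat, df, two_sided_p_value(t_stat, df)


# Hypothetical usefulness ratings (1 to 5) for two non-overlapping groups.
group_one = [5, 4, 5, 4, 4, 5, 3, 5]
group_two = [4, 4, 3, 5, 4, 3, 4, 4]
t_stat, df, p_value = pooled_t_test(group_one, group_two)
print(f"t = {t_stat:.3f}, df = {df}, p = {p_value:.3f}")
```

The returned p-value is then compared with the chosen cut-off (conventionally .05) to decide whether the difference counts as statistically reliable.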
Example of a Statistically Reliable Difference Between Two Groups
In our B2B Buyer Experience Research, we compared how useful buyers who work in IT found seller interactions with how useful buyers who do not work in IT found them. If the independent samples t-test returns a statistically reliable result (p < .05), a difference this large would be unlikely to appear if the two populations did not actually differ. We can therefore reasonably expect to find the difference replicated in other samples of B2B buyers, and in the population of B2B buyers generally.
For this study, our t-test showed that there was indeed a reliable difference (p = .02). IT buyers rated the usefulness of seller interactions 4.40 out of 5, while non-IT buyers rated it 4.24. This suggests that IT buyers reliably found these interactions slightly more useful than their non-IT counterparts.
Example of No Statistically Reliable Difference Between Two Groups
A company with two BDR (business development representative) teams, one in Europe and one in North America, wants to measure and compare the productivity of the two teams. The measure they are interested in is opportunities produced per BDR per month. Each team has 10 BDRs. To get a good sample, the organization measures opportunity production for both teams over three months, then calculates the average number of opportunities per BDR per month for each team.
These were the results observed:
- Europe: The average number of opportunities produced per BDR per month is approximately 12.13.
- North America: The average number of opportunities produced per BDR per month is approximately 14.10.
To test whether this difference is reliable, we conducted an independent samples t-test, which returned a p-value of .133. This p-value is above the standard cut-off for statistical reliability of .05 (see here for an explanation of p-values and their meaning). As a result, we cannot conclude that the difference we saw over those three months is statistically reliable. This does not show that the two teams are equally productive; it means only that, for now, the data do not provide enough evidence that their productivity differs.