Learn how to test if there are statistically significant differences in your data.
While anyone can see the difference between two numbers, finding out whether that difference is statistically significant can take more work.
Let’s say you’ve run a customer satisfaction survey at work. Your boss wants to analyze if men give your company a lower Net Promoter Score℠ (NPS) than women.
In the data, you see that the average rating from male respondents was 9, compared to an average score of 12 from female respondents. How can you determine whether 9 is significantly different from 12? This is where t-tests come in.
In this article, we’ll define t-tests and their use cases, share examples of t-tests, and explain how to interpret your results.
A t-test compares two means to determine whether the difference between them is statistically significant or could plausibly be due to chance. It is especially useful with small sample sizes.
The test calculates a t-value, which is compared to a critical value from the t-distribution to determine statistical significance. Often used in hypothesis testing, t-tests can be highly valuable for determining if two groups differ in a measured characteristic.
Testing for statistical significance is common in concept testing and product testing. In concept testing, A/B tests are commonly used to determine if one ad concept performs better than another. Similarly, product testing can determine if a product will hold its own when launched into the market.
Various types of t-tests use specific formulas to calculate statistically significant differences, with the two-sample t-test being the most common. The formula for this t-test is:
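One standard way to write the two-sample t-statistic is sketched below. It assumes the equal-variance (pooled) form of the test, and the symbol names are chosen here for illustration rather than taken from a specific source.

```latex
% \bar{x}_1, \bar{x}_2 : means of group 1 and group 2
% s_1, s_2             : sample standard deviations of each group
% n_1, n_2             : number of respondents in each group
% s_p                  : pooled standard deviation
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}},
\qquad
s_p^2 = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2}
```

If the two groups have clearly different variances, the unequal-variance (Welch) version replaces the denominator with the square root of s_1^2/n_1 + s_2^2/n_2.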
Here are the formulas for the one-sample t-test and paired t-test:
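In their usual textbook form, the two statistics can be written as follows. The symbols are illustrative notation, not necessarily the notation used elsewhere.

```latex
% One-sample: \bar{x} is the sample mean, \mu_0 the value you compare against,
% s the sample standard deviation, n the number of respondents.
t_{\text{one-sample}} = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}

% Paired: \bar{d} is the mean of the per-respondent differences (second score
% minus first score), s_d their standard deviation, n the number of pairs.
t_{\text{paired}} = \frac{\bar{d}}{s_d / \sqrt{n}}
```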
In both the one-sample and paired t-tests, the calculated t-value is compared to a critical value from the t-distribution to assess significance.
How do you know when to use a t-test? There are several situations in which you might use a t-test after gathering data from a survey or questionnaire.
There are three types of t-tests commonly used by researchers. These t-tests serve different purposes that we’ll explain below.
The one-sample test looks at whether the mean (aka average) of data from one group (in this case, the overall NPS) is different from a value you specify.
Example: Your company’s goal is to have an NPS significantly higher than the industry standard of 5. Your company’s latest survey puts its NPS at 10. Is an NPS of 10 significantly higher than the industry standard of 5?
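As a rough illustration, the one-sample calculation could be sketched in Python as below. The ratings are made-up values chosen so that their mean is 10, and the function name `one_sample_t` is hypothetical; only the standard library is used.

```python
import math
import statistics


def one_sample_t(ratings, benchmark):
    """Return (t_statistic, degrees_of_freedom) for a one-sample t-test."""
    n = len(ratings)
    mean = statistics.mean(ratings)
    std_dev = statistics.stdev(ratings)  # sample standard deviation (divides by n - 1)
    standard_error = std_dev / math.sqrt(n)
    return (mean - benchmark) / standard_error, n - 1


# Hypothetical ratings (assumption: not real survey data); their mean is 10.
ratings = [8, 12, 10, 9, 11]
t_stat, df = one_sample_t(ratings, benchmark=5)
print(f"t = {t_stat:.2f}, df = {df}")
```

The resulting t-statistic is then compared with the critical value for the same degrees of freedom.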
Two-sample t-tests examine whether the means of two independent groups are significantly different from one another.
Example: Your hypothesis is that men give your company a lower NPS than women. The average NPS from male respondents is 9, while the average score from women is 12. Is 9 significantly different from 12?
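A minimal Python sketch of the two-sample calculation follows. It assumes the equal-variance (pooled) form of the test, uses invented ratings whose means are 9 and 12, and the function name `two_sample_t` is hypothetical.

```python
import math
import statistics


def two_sample_t(group_a, group_b):
    """Return (t_statistic, degrees_of_freedom) for a pooled two-sample t-test."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (divides by n - 1)
    var_b = statistics.variance(group_b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    return mean_diff / standard_error, df


# Hypothetical ratings (assumption: not real survey data).
men = [7, 9, 11, 9]        # mean 9
women = [10, 12, 14, 12]   # mean 12
t_stat, df = two_sample_t(men, women)
print(f"t = {t_stat:.2f}, df = {df}")
```

The sign of the t-statistic only reflects which group was listed first; the comparison with the critical value uses its absolute value.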
This test is for when you give one group of people the same survey twice. A paired t-test lets you know if the mean changed between the first and second surveys.
Example: You surveyed the same group of customers twice: once in April and a second time in May, after they had seen an ad for your company. Did your company’s NPS change after customers saw the ad?
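The paired calculation works on the per-customer differences rather than on the two sets of scores separately. The sketch below assumes each position in the two lists belongs to the same customer; the scores are invented and the function name `paired_t` is hypothetical.

```python
import math
import statistics


def paired_t(before, after):
    """Return (t_statistic, degrees_of_freedom) for a paired t-test."""
    if len(before) != len(after):
        raise ValueError("Each respondent needs a score in both surveys.")
    # Difference for each respondent: second survey minus first survey.
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1


# Hypothetical scores from the same five customers (assumption: not real data).
april = [8, 10, 9, 11, 7]
may = [10, 11, 9, 14, 9]
t_stat, df = paired_t(april, may)
print(f"t = {t_stat:.2f}, df = {df}")
```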
There are four steps to performing a t-test.
This section walks through the four steps using the NPS ratings example from the beginning:
Your hypothesis is that men give a lower NPS to your company than women. The average NPS from men is 9, while the average score for women is 12. Is 9 significantly different from 12? Because the two groups are independent, this calls for a two-sample t-test.
Let’s dive into the steps and t-test example.
Each type of t-test has a different formula for calculating the t-statistic. For this example, we’ll use the two-sample t-test formula, which divides the difference between the two group means by the standard error of that difference.
You’ll probably be conducting the t-tests in a spreadsheet or statistical program (like Excel or SPSS). However, if you’d like to do the math by hand, the formulas for the other two types of t-tests are included below.
Degrees of freedom are the number of values that remain free to vary once the statistics used in the calculation (here, the group means) have been fixed. In practice, they depend on how many NPS ratings you have in each group of respondents. Similar to the t-statistic, the formula for degrees of freedom will vary depending on the type of t-test you perform.
This formula must be used to determine degrees of freedom in two-sample t-tests.
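For the classic equal-variance two-sample test, the degrees of freedom are usually written as shown below; this is the textbook form, given here as an assumption about which variant is intended.

```latex
% n_1 and n_2 are the number of respondents in each group.
df = n_1 + n_2 - 2

% For comparison, the one-sample and paired tests use df = n - 1.
```

Under this form, 41 degrees of freedom would correspond to 43 respondents across the two groups.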
The critical value is the threshold at which the difference between two numbers is considered statistically significant.
According to a standard t-distribution table, for a two-tailed test with an alpha level of 0.05 at 41 degrees of freedom, the critical value is 2.02. Note that most analysts use a two-tailed test instead of a one-tailed test because it’s more conservative.
For more information on the differences between one-tailed and two-tailed tests, check out this video from Khan Academy.
If the absolute value of your t-statistic is larger than your critical value, your difference is significant. If it is smaller, then your two numbers are, statistically speaking, indistinguishable.
In our example, the absolute value of the t-statistic is 0.86, which is not larger than the critical value of 2.02, so you can conclude that men do not give significantly lower NPS ratings than women.
When running a t-test, you will need to interpret the results using three components: the t-value, the degrees of freedom (df), and the p-value.
Let’s explore these values and how to interpret the results of your t-test.
T-value: The t-value is the size of the difference relative to the variation in the data. A high t-value (in absolute terms) indicates a greater difference between the group means relative to the variability. A low t-value indicates a smaller difference between the group means, one that could plausibly be due to chance.
Degrees of freedom (df): The degrees of freedom refer to the values that are free to vary when calculating a statistic. This number of values is used to reference the right critical values.
P-value: The p-value indicates whether an observed difference is statistically significant. It is the probability of observing a test statistic at least as extreme as the one you calculated, assuming the null hypothesis (i.e., no difference between groups) is true. The p-value threshold is typically 0.05 (5%); a p-value below the threshold is considered statistically significant.
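To make the link between the t-value, the degrees of freedom, and the p-value concrete, here is a rough standard-library sketch that approximates a two-tailed p-value by numerically integrating the t-distribution density. The function names `t_pdf` and `two_tailed_p` are hypothetical; in practice a spreadsheet or statistics package would report this number for you.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t_stat, df, steps=2000):
    """Approximate the two-tailed p-value using Simpson's rule (steps must be even)."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    # The density is symmetric, so the area between -|t| and |t| is twice this.
    return max(0.0, 1 - 2 * area_zero_to_t)


# Values from the worked example: |t| = 0.86 with 41 degrees of freedom.
p_example = two_tailed_p(0.86, 41)
# At the critical value of 2.02, the p-value should sit close to 0.05.
p_at_critical = two_tailed_p(2.02, 41)
print(f"p (t = 0.86) = {p_example:.3f}, p (t = 2.02) = {p_at_critical:.3f}")
```

A p-value well above 0.05 for t = 0.86 matches the conclusion reached by comparing the t-statistic with the critical value.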
Additionally, t-test results may also present a confidence interval (CI) for the difference between the means. The confidence interval gives a range of values that the true difference is likely to fall within. A significant difference between groups is indicated when the CI does not include zero.
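A confidence interval for the difference is typically built as the observed difference plus or minus the critical value times the standard error. The sketch below uses illustrative inputs (a difference of -3, a standard error of about 1.1547, and the two-tailed 0.05 critical value of 2.447 for 6 degrees of freedom); the function name `mean_difference_ci` is hypothetical.

```python
def mean_difference_ci(mean_diff, standard_error, critical_value):
    """Return (lower, upper) bounds of the CI for a difference in means."""
    margin = critical_value * standard_error
    return mean_diff - margin, mean_diff + margin


# Illustrative inputs (assumption: not taken from real survey data).
lower, upper = mean_difference_ci(-3.0, 1.1547, 2.447)
excludes_zero = not (lower <= 0 <= upper)
print(f"95% CI: ({lower:.2f}, {upper:.2f}); excludes zero: {excludes_zero}")
```

Because this interval lies entirely below zero, it agrees with a significant result at the 0.05 level for the same inputs.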
Done correctly, your data analysis could guide marketing efforts or product development. However, you must get buy-in from stakeholders first. For this reason, take extra care in presenting your data so that stakeholders understand the significance of your findings and the next steps.
You should include these five components in your t-test results presentation:
T-tests are used to determine if the difference in the means of two sample groups is statistically significant. You can use t-tests during survey data analysis to help show how reliable the differences in your data are.
SurveyMonkey allows you to streamline the process of creating and sending surveys to sample groups for your organization’s research needs. With SurveyMonkey, you can build market research surveys and questionnaires from scratch or tap into our broad selection of over 400 survey templates.
Get started collecting survey data for analysis today to help your organization make better decisions for growth. Create a free account today.
Net Promoter Score is a trademark of Bain & Company, Inc., Satmetrix Systems, Inc. and F. Reichheld.