R
T-test
We do this by using the function t.test() on a formula telling the t-test to compare the values of reaction time depending on gender:
t.test(data$reaction_time ~ data$gender)
However, as the t-test is parametric, you first need to verify that the data meets its conditions. These conditions are:
- The sample groups should be normally distributed
- The two populations from which the samples are taken should have the same variance
- The two sets of data were sampled independently from the two populations, not in clusters
Normality can be assessed visually with a Q-Q plot, created by the function qqnorm(). Calling qqnorm(data[data$gender == "male",]$reaction_time) will create the Q-Q plot for the reaction times of men. Calling qqline(data[data$gender == "male",]$reaction_time) will add a reference line to the plot to help you with the evaluation: if the data is normally distributed, the data points should form an approximately straight line that stays close to the line added by qqline(). If the data points deviate from the line too much, the data is not normally distributed.
For a more precise evaluation, you can use the Shapiro-Wilk test (the function shapiro.test()).
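The normality check described above can be sketched as follows. This is only an illustration: the data frame is simulated, and the "female" label is an assumption, since the original data set is not shown here.

```r
# Hypothetical data: column names follow the text, values are simulated.
set.seed(1)
data <- data.frame(
  gender = rep(c("male", "female"), each = 30),
  reaction_time = rnorm(60, mean = 500, sd = 50)
)

male_rt <- data[data$gender == "male", ]$reaction_time

qqnorm(male_rt)  # Q-Q plot of the men's reaction times
qqline(male_rt)  # reference line for a normal distribution

# Shapiro-Wilk test: the null hypothesis is that the sample comes from a
# normal distribution, so a small p-value speaks against normality.
sw <- shapiro.test(male_rt)
sw$p.value
```

The same steps would then be repeated for the second group, because both groups have to be checked.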

In our case, the data for men is approximately normally distributed. However, the test requires both of the groups to be normally distributed.
The equality of variance can be roughly assessed by adding the line created by qqline(data[data$gender == "male",]$reaction_time) to the plot you just created for women. If the two samples have approximately equal variance, the slopes of the two lines should not differ too much.
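A minimal sketch of this visual comparison, again on simulated data with an assumed "female" label. The standard deviations and the F test via var.test() are added here only as a numeric companion to the plot; they are not part of the procedure described in the text.

```r
# Hypothetical data: column names follow the text, values are simulated.
set.seed(1)
data <- data.frame(
  gender = rep(c("male", "female"), each = 30),
  reaction_time = rnorm(60, mean = 500, sd = 50)
)

female_rt <- data[data$gender == "female", ]$reaction_time
male_rt   <- data[data$gender == "male", ]$reaction_time

qqnorm(female_rt)         # Q-Q plot for women
qqline(female_rt)         # reference line for women (solid)
qqline(male_rt, lty = 2)  # reference line for men (dashed), for comparing slopes

# Numeric companion to the plot: ratio of the two standard deviations
sd_ratio <- sd(female_rt) / sd(male_rt)
sd_ratio

# One formal option (an addition, not the text's method): F test of equal variances
vt <- var.test(female_rt, male_rt)
vt$p.value
```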

In our case, the variance is also approximately equal. Since the data for our example was also sampled independently, we can proceed with the t-test. However, if any of these conditions were not met, a non-parametric alternative would have to be used, in this case the Wilcoxon-Mann-Whitney test (wilcox.test() in R).
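For the case where the conditions are not met, the non-parametric alternative could be called roughly like this. The data is simulated and the group labels are assumptions.

```r
# Hypothetical data: column names follow the text, values are simulated.
set.seed(1)
data <- data.frame(
  gender = rep(c("male", "female"), each = 30),
  reaction_time = rnorm(60, mean = 500, sd = 50)
)

# Wilcoxon-Mann-Whitney (rank sum) test, using the same formula as the t-test
wmw <- wilcox.test(reaction_time ~ gender, data = data)
wmw$p.value
```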
Now you can call the t.test() function as presented earlier. The result should look like this:

Here, R reminds you which test you used and which data you performed it on. It also states the alternative hypothesis: in this case, that there is a difference between the means of the distributions from which the two samples came. The number most users of R are interested in here is the p-value. The p-value is the probability of obtaining a difference at least as extreme as the observed one if there were in fact no difference between the two groups. Here, it is 0.3615. If you were to report the result in your paper, you would also note the t-value and the number of degrees of freedom (df).
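The values mentioned above can also be read directly from the object returned by t.test(). The sketch below uses simulated data with assumed group labels, so its numbers will not match the 0.3615 reported in the text.

```r
# Hypothetical data: column names follow the text, values are simulated.
set.seed(1)
data <- data.frame(
  gender = rep(c("male", "female"), each = 30),
  reaction_time = rnorm(60, mean = 500, sd = 50)
)

# By default t.test() runs Welch's version of the test (var.equal = FALSE).
result <- t.test(data$reaction_time ~ data$gender)

result$statistic  # t-value
result$parameter  # degrees of freedom (df)
result$p.value    # p-value
```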
Even though there is no universally binding threshold for the p-value to consider the difference significant, a number of values are usually accepted, depending on the field and importance of the study. For regular studies in linguistics, p < 0.05 is viewed as a threshold of reasonable confidence.
NOTE: The size of the p-value does not by itself tell you the size of the effect. The difference between two means (100 and 150) may have a p-value of 0.55 if each mean comes from only two values, even though the difference itself is large. On the other hand, the difference between another two means (99.8 and 100.8) may have a p-value of less than 0.001 if they are calculated from samples of 1000 values.
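The point of this note can be reproduced with a small constructed example. The values below are made up for illustration and were chosen so that the sample means match the ones in the note.

```r
# Two values per group: means 100 and 150, but very little evidence
small_a <- c(50, 150)   # mean 100
small_b <- c(100, 200)  # mean 150
p_small <- t.test(small_a, small_b)$p.value
p_small  # about 0.55

# 1000 values per group: means 99.8 and 100.8, small spread
large_a <- 99.8 + rep(c(-2, 2), times = 500)   # mean 99.8
large_b <- 100.8 + rep(c(-2, 2), times = 500)  # mean 100.8
p_large <- t.test(large_a, large_b)$p.value
p_large  # far below 0.001
```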
As the p-value returned by the t-test is larger than 0.05, you can conclude that your data does not suggest any significant difference in the reaction time of men and women.
The optional parameters of the t.test() function allow you to specify the type of t-test you want to perform. The most important decisions are whether to perform a paired or an unpaired test and whether your hypothesis is one-tailed or two-tailed.
An unpaired test is performed when there is no relation between the two samples compared, e.g. when one sample contains the scores of men and the other the scores of women. A paired test is used if the values in one sample have their counterparts in the other one. For example, in a study where participants take a language test before and after trying a new language teaching approach, each participant has two scores: one from the entry test and one from the end test.
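A minimal sketch of the paired case, assuming two hypothetical vectors entry_score and end_score in which the i-th elements belong to the same participant. The scores are simulated.

```r
# Hypothetical scores of 20 participants before and after the course
set.seed(2)
entry_score <- rnorm(20, mean = 60, sd = 10)
end_score   <- entry_score + rnorm(20, mean = 3, sd = 5)

# Paired test: works on the per-participant differences
paired_result <- t.test(end_score, entry_score, paired = TRUE)

# Unpaired test on the same numbers, ignoring who produced which score
unpaired_result <- t.test(end_score, entry_score, paired = FALSE)

paired_result$p.value
unpaired_result$p.value
```

The paired test is equivalent to a one-sample t-test on the differences end_score - entry_score, which is why it has one degree of freedom fewer than the number of participants.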
A one-tailed t-test assumes that your hypothesis has a direction: for example, you are testing whether the score of women is higher than the score of men. A two-tailed test is performed if you do not have any expectation about the direction of the difference and are only testing whether there is a difference between the scores of the two genders.
NOTE: The expectation about the direction of the trend has to be formed before inspecting the data. If, before seeing the data, you only expect some difference without a particular direction, you must use the two-tailed hypothesis.
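With a formula, the direction of a one-tailed test depends on the order of the group levels, which is easy to get wrong. The sketch below uses simulated data with assumed labels "female" and "male"; since R sorts factor levels alphabetically, alternative = "greater" here tests whether the mean for women is greater than the mean for men.

```r
# Hypothetical data: column names follow the text, values are simulated.
set.seed(1)
data <- data.frame(
  gender = rep(c("male", "female"), each = 30),
  reaction_time = rnorm(60, mean = 500, sd = 50)
)
data$gender <- factor(data$gender)
levels(data$gender)  # "female" "male": the first level is the reference group

# One-tailed: is the mean of the first level (female) greater than that of the second?
one_tailed <- t.test(reaction_time ~ gender, data = data, alternative = "greater")

# Two-tailed: is there any difference, in either direction?
two_tailed <- t.test(reaction_time ~ gender, data = data, alternative = "two.sided")

one_tailed$p.value
two_tailed$p.value
```

When the observed difference lies in the predicted direction, the one-tailed p-value is half of the two-tailed one, which is why the direction has to be fixed before looking at the data.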
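These two options are set with the arguments paired and alternative. paired can be TRUE or FALSE, while alternative can be one of "two.sided", "greater" or "less". E.g.:
t.test(data$reaction_time ~ data$gender, paired=FALSE, alternative="two.sided")
Newer versions of R may refuse the paired argument in combination with a formula; if that happens, leave it out for an unpaired test (the default) or pass the two samples as two separate vectors.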