Question #105996
Independent-samples t-test practice problems
Expert's answer
2020-03-20T12:20:48-0400

Independent Samples t-test

The t-test is used to compare the values of the means from two samples and test whether it is likely that the samples are from populations having different mean values.

Independent samples consist of two groups of individuals who are randomly selected from two different populations.

The term “independent” is used because the individuals in one sample must be completely unrelated to the individuals in the other sample.

When two samples are taken from the same population, it is very unlikely that the means of the two samples will be identical. When two samples are taken from two populations with very different mean values, it is likely that the means of the two samples will differ. Our problem is how to distinguish between these two situations using only the data from the two samples.
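A small simulation can make the first point concrete. The sketch below draws two samples from the same population and compares their means; the population (Normal, mean 50, standard deviation 10) and the sample size of 20 are hypothetical values chosen only for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: Normal with mean 50 and standard deviation 10.
# Both samples are drawn from this same population.
sample_a = [random.gauss(50, 10) for _ in range(20)]
sample_b = [random.gauss(50, 10) for _ in range(20)]

mean_a = statistics.mean(sample_a)
mean_b = statistics.mean(sample_b)

print(f"mean of sample A: {mean_a:.2f}")
print(f"mean of sample B: {mean_b:.2f}")
print(f"difference:       {mean_a - mean_b:.2f}")
```

Even though both samples share one population mean, the two sample means differ, which is why a formal test is needed to judge whether an observed difference is larger than chance alone would produce.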

The parameters of interest are the population means, which are denoted $\mu_1$ and $\mu_2$.

The sample means are denoted as $\bar{x}_1$ and $\bar{x}_2$.

The sample sizes are denoted as $n_1$ and $n_2$.


$$\def\arraystretch{1.5} \begin{array}{c:c:c} \text{Population mean} & \text{Sample mean} & \text{Sample size} \\ \hline \mu_1 & \bar{x}_1 & n_1 \\ \hdashline \mu_2 & \bar{x}_2 & n_2 \end{array}$$

Conditions for Inference Comparing Two Means

Before conducting any statistical analyses, two assumptions must be met:

1) The two samples are random and they come from two distinct populations. The samples are independent. That is, one sample has no influence on the other.

Additionally, the same response variable must be measured for both samples. 

2) Both populations are Normally distributed. The means and standard deviations of the populations are unknown. In practice, it is enough that the distributions have similar shapes and that the data have no strong outliers.

The Two-Sample t Statistic

When data come from two random samples or two groups in a randomized experiment, the difference between the sample means $(\bar{x}_1-\bar{x}_2)$ is the best estimate of the difference between the population means $(\mu_1-\mu_2)$.

In other words, since the population means ($\mu_1$ and $\mu_2$) are unknown, the sample means ($\bar{x}_1$ and $\bar{x}_2$) must be used to make inferences.

The inferences that are being made are based on the difference between the sample means: $\bar{x}_1-\bar{x}_2$.

When the Independent condition is met, the standard deviation of the difference $\bar{x}_1-\bar{x}_2$ is

$$\sigma_{\bar{x}_1-\bar{x}_2}=\sqrt{\dfrac{\sigma_1^2}{n_1}+\dfrac{\sigma_2^2}{n_2}}$$

However, this formula requires the population standard deviations to be known. If these are unknown, $t$ procedures must be used to make inferences.

If the values of the parameters $\sigma_1$ and $\sigma_2$ (the population standard deviations) are unknown, they can be replaced with the sample standard deviations $s_1$ and $s_2$. The result is the standard error of the difference $\bar{x}_1-\bar{x}_2$:

$$\sqrt{\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}}$$
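The standard error can be computed directly from the summary statistics. This is a minimal sketch; the function name `standard_error` and the numbers passed to it are hypothetical and serve only to illustrate the formula.

```python
import math

def standard_error(s1, n1, s2, n2):
    """Standard error of the difference between two sample means.

    s1, s2 are the sample standard deviations; n1, n2 are the sample sizes.
    """
    return math.sqrt(s1**2 / n1 + s2**2 / n2)

# Hypothetical summary statistics, chosen only for illustration:
# s1^2/n1 = 16/16 = 1 and s2^2/n2 = 9/9 = 1, so the result is sqrt(2).
se = standard_error(s1=4.0, n1=16, s2=3.0, n2=9)
print(f"standard error: {se:.4f}")
```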

Degrees of Freedom

The shape of the $t$ distribution is different for different sample sizes.

Therefore, when making inferences about the difference between two population means, the size of the two samples must be taken into account. 

This is because the $t$ distribution is used to make these inferences.

Computing degrees of freedom (equal-variances case; the formula for unequal variances is given in the Two-Sample $t$ Test section):

$$df=n_1+n_2-2$$

Confidence Interval for $\mu_1-\mu_2$

Two-sample $t$ interval for a difference between means

When the random, normal and independent conditions are met, a level $C$ confidence interval for $\mu_1-\mu_2$ is

$$CI=(\bar{x}_1-\bar{x}_2)\pm t^*\sqrt{\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}}$$

where $t^*$ is the critical value of the $t$ distribution for confidence level $C$.


$$\text{standard error}=\sqrt{\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}}$$

$$\text{margin of error}=t^*\sqrt{\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}}$$

Use the $t$-table or a calculator to look up the two-tailed critical value $t^*$ for $df$ degrees of freedom and a significance level of $\alpha=1-C$.
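The interval can be sketched in a few lines once $t^*$ has been read from a table. In this sketch the function name `two_sample_t_interval` and all summary statistics are hypothetical. The critical value is passed in as an argument: with $n_1=n_2=10$ we get $df=18$, for which a standard $t$-table gives $t^*\approx 2.101$ at the 95% level.

```python
import math

def two_sample_t_interval(mean1, s1, n1, mean2, s2, n2, t_star):
    """Confidence interval for mu1 - mu2 from summary statistics.

    t_star is the critical value looked up in a t-table for the
    chosen confidence level and degrees of freedom.
    """
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)  # standard error
    margin = t_star * se                     # margin of error
    diff = mean1 - mean2
    return diff - margin, diff + margin

# Hypothetical data: df = 10 + 10 - 2 = 18, t* = 2.101 for 95% confidence.
low, high = two_sample_t_interval(
    mean1=52.0, s1=4.0, n1=10,
    mean2=48.0, s2=5.0, n2=10,
    t_star=2.101,
)
print(f"95% CI for mu1 - mu2: ({low:.2f}, {high:.2f})")
```

For these made-up numbers the interval contains 0, so a difference of zero between the population means would not be ruled out at this confidence level.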

Two-Sample $t$ Test

This test is used when it is necessary to test whether the difference between two independent groups of individuals is statistically significant.

The null hypothesis for this test is that the groups have equal means, or that there is no significant difference between the average scores of the two groups in the population.

$$H_0:\mu_1-\mu_2=\text{hypothesized value}$$

The alternative hypothesis can be one-sided, stating that the mean of one of the groups is higher or lower than the mean of the other group.

If there is no information to justify a one-sided alternative hypothesis, a two-sided alternative hypothesis, which states that the two means are significantly different, could be formulated.

$$H_1:\mu_1-\mu_2\neq\text{hypothesized value}\quad\text{(two-sided)}$$

Suppose the random, normal and independent conditions are met. To test the hypothesis $H_0:\mu_1-\mu_2=\text{hypothesized value}$, compute the $t$ statistic for the applicable case below.

Then use the $t$-table or a calculator to look up the critical value $t^*$ for a two-tailed test with $df$ degrees of freedom and a significance level of $\alpha$.

Two independent samples from two normal distributions with unequal variances

$$t=\dfrac{\bar{x}_1-\bar{x}_2-\text{hypothesized value}}{\sqrt{\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}}}$$

$$df=\dfrac{\left(\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(\dfrac{s_1^2}{n_1}\right)^2}{n_1-1}+\dfrac{\left(\dfrac{s_2^2}{n_2}\right)^2}{n_2-1}}$$

If $df$ is not an integer, we take the integer part of $df$.
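The unequal-variances statistic and its degrees of freedom can be computed together. This is a minimal sketch: the function name `unequal_variance_t` and the summary statistics are hypothetical, chosen so the arithmetic is easy to follow by hand.

```python
import math

def unequal_variance_t(mean1, s1, n1, mean2, s2, n2, hypothesized=0.0):
    """t statistic and degrees of freedom when variances are not assumed equal."""
    v1 = s1**2 / n1  # s1^2 / n1
    v2 = s2**2 / n2  # s2^2 / n2
    t = (mean1 - mean2 - hypothesized) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    # Keep only the integer part of df, as described above.
    return t, math.floor(df)

# Hypothetical data: v1 = 16/16 = 1 and v2 = 9/9 = 1, so
# t = 3 / sqrt(2) and df = 4 / (1/15 + 1/8), whose integer part is 20.
t_stat, df = unequal_variance_t(mean1=20.0, s1=4.0, n1=16,
                                mean2=17.0, s2=3.0, n2=9)
print(f"t = {t_stat:.4f}, df = {df}")
```

The resulting $t$ would then be compared with the critical value $t^*$ from the table row for this $df$.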


Two independent samples from two normal distributions with equal variances

$$t=\dfrac{\bar{x}_1-\bar{x}_2-\text{hypothesized value}}{s_p\sqrt{\dfrac{1}{n_1}+\dfrac{1}{n_2}}}$$

where

$$s_p^2=\dfrac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2}$$

$$df=n_1+n_2-2$$
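The pooled (equal-variances) version can be sketched the same way. The function name `pooled_t` and the numbers below are hypothetical and only illustrate the formulas.

```python
import math

def pooled_t(mean1, s1, n1, mean2, s2, n2, hypothesized=0.0):
    """t statistic and degrees of freedom assuming equal population variances."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    sp = math.sqrt(sp2)
    t = (mean1 - mean2 - hypothesized) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return t, df

# Hypothetical data: sp^2 = (15*16 + 8*9) / 23 = 312/23 and
# sqrt(1/16 + 1/9) = 5/12.
t_stat, df = pooled_t(mean1=20.0, s1=4.0, n1=16,
                      mean2=17.0, s2=3.0, n2=9)
print(f"t = {t_stat:.4f}, df = {df}")
```

Which version to use depends on whether the equal-variances assumption is reasonable for the data at hand; the pooled form gives a whole-number $df$ directly.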



Two-sample $t$ procedures are more robust than one-sample $t$ procedures, particularly when the distributions are not symmetric.


