1. In an attempt to find the mean number of hours his tutorial classmates spent per day preparing for tutorials, John collected data from 10 of his friends in the tutorial group and found that the mean is 2.4 hours with a standard deviation of 0.8 hours. However, a day later he felt that the sample size is too small. John collected data from another 5 of his friends and found that the mean of this additional dataset is 2.0 hours with a standard deviation of 1.2 hours. Find the mean and standard deviation when these 2 sets of data are combined.
Let m be the mean of n data values x_i. Then, by definition of the mean:
"m = \\frac{1}{n}\\sum_{i=1}^{n}x_i" then "\\sum_{i=1}^{n}x_i =n*m"
So, for two samples of sizes n1 and n2 with means m1 and m2:
"\\sum_{i=1}^{n_1+n_2}x_i = \\sum_{i=1}^{n_1}x_i + \\sum_{i=n_1+1}^{n_1+n_2}x_i = n_1*m_1 + n_2*m_2"
And finally, the combined mean (in hours) is
"m =\\frac{1}{n_1+n_2} \\sum_{i=1}^{n_1+n_2}x_i = \\frac{n_1*m_1+n_2*m_2}{n_1+n_2} = \\frac{10*2.4+5*2}{10+5} = \\frac{34}{15} \\approx 2.27"
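Similarly for the standard deviation. Treating the given values as population standard deviations (dividing by n rather than n - 1), the combined variance with n = n1 + n2 is:
"\\sigma^2 = \\frac{1}{n} \\sum_{i=1}^{n_1+n_2}(x_i - m)^2 = \\frac{1}{n}\\sum_{i=1}^{n_1+n_2}x_i^2 - m^2 = \\frac{\\sum_{i=1}^{n_1}x_i^2 + \\sum_{i=n_1+1}^{n_1+n_2}x_i^2}{n_1+n_2}-m^2"
Applying the same identity to each sample separately gives
"\\sum_{i=1}^{n_1}x_i^2 = n_1*(\\sigma_1^2 +m_1^2)" and "\\sum_{i=n_1+1}^{n_1+n_2}x_i^2 = n_2*(\\sigma_2^2 +m_2^2)", therefore
"\\sigma^2 = \\frac{n_1*(\\sigma_1^2 + m_1^2) +n_2*(\\sigma_2^2 + m_2^2)}{n_1+n_2} - m^2 = \\frac{10*(0.8^2 + 2.4^2)+5*(1.2^2+2^2)}{10+5}- \\left(\\frac{34}{15}\\right)^2=\\frac{64+27.2}{15}-5.14 \\approx 0.94"
Taking the square root, the combined standard deviation is "\\sigma = \\sqrt{0.94} \\approx 0.97" hours.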