The human resource manager at a car dealership wants to know if the ages of its employees are related to the department that they work in. Data was compiled and tabulated in a 2-way contingency table. The employees were classified according to their age and department.

Expected counts are printed below observed counts

          Sales  Accounts  Marketing  Repairs  Total
20-29         8        10         27       43     88
          17.74       ***      26.20    23.62
30-39        29        26         38       22      *
          23.18     26.72      34.24    30.86
40-49        33        32         72       82    219
          44.15     50.88      65.20    58.77
50-59        81       106         86       54    327
          65.92     75.97      97.36    87.75
Total       151        **        223      201    749

Chi-Sq = 5.348 + **** + 0.024 + 15.912 +
         1.459 + 0.019 + 0.413 + 2.544 +
         2.816 + 7.003 + 0.709 + 9.182 +
         3.448 + 11.875 + 1.325 + 12.983 = 80.395
DF = *****, P-Value = ******

No cells with expected counts less than 5.

c) Fill in the gaps marked by ‘*’, ‘**’, ‘***’, ‘****’, ‘*****’ and ‘******’. [8]
d) What is the assumption on which these calculations are based? [1]
‘*’ is the total for row 2 (age group 30-39) and it is given as $R_2 = 29 + 26 + 38 + 22 = 115$.
‘**’ is the total for column 2 (Accounts) and it is given as $C_2 = 10 + 26 + 32 + 106 = 174$.
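‘***’ is the expected count for row 1, column 2. To find its value, we use the formula $E_{12} = \frac{R_1 C_2}{n}$, where $R_1 = 88$ is the total for row 1, $C_2 = 174$ is the total for column 2 and $n = 749$ is the sample size. Therefore, $E_{12} = \frac{88 \times 174}{749} \approx 20.44$. Thus, ‘***’ $= 20.44$.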
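The margin totals and the expected counts can be checked with a short R sketch. The variable names (`observed`, `row_totals`, `col_totals`, `expected`) are illustrative choices, not part of the original output; the counts are copied from the table in the question.

```r
# Observed counts from the question's table
# (rows: age group, columns: department)
observed <- matrix(
  c( 8,  10, 27, 43,
    29,  26, 38, 22,
    33,  32, 72, 82,
    81, 106, 86, 54),
  nrow = 4, byrow = TRUE,
  dimnames = list(age  = c("20-29", "30-39", "40-49", "50-59"),
                  dept = c("Sales", "Accounts", "Marketing", "Repairs"))
)

row_totals <- rowSums(observed)  # element 2 is the gap '*'
col_totals <- colSums(observed)  # element 2 is the gap '**'
n <- sum(observed)               # sample size

# Expected count for every cell: (row total * column total) / n
expected <- outer(row_totals, col_totals) / n
round(expected, 2)               # entry [1, 2] is the gap '***'
```

Using `outer()` computes all sixteen expected counts at once, so the printed values can also be compared against the expected counts already shown in the table.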
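‘****’ is one of the sixteen terms that add up to the test statistic above, and it is found using the formula $\frac{(O_{ij} - E_{ij})^2}{E_{ij}}$, where $O_{ij}$ is the observed count and $E_{ij}$ is the expected count in row $i$, column $j$. For this case, this value is in row 1 and column 2. Therefore, using the unrounded expected count, $\frac{(O_{12} - E_{12})^2}{E_{12}} = \frac{(10 - 20.4433)^2}{20.4433} \approx 5.335$. This agrees with the total: the other fifteen terms sum to 75.060, and $80.395 - 75.060 = 5.335$.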
For this test, the degrees of freedom are $df = (r - 1)(c - 1)$, where $r$ is the number of rows and $c$ is the number of columns. Therefore, $df = (4 - 1)(4 - 1) = 9$, so ‘*****’ $= 9$.
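As a rough check of the cell contribution, the test statistic and the degrees of freedom, the following R sketch recomputes them from the observed counts. The names `contributions`, `chi_sq` and `df` are assumptions made for illustration.

```r
# Observed counts from the question's table, entered row by row
observed <- matrix(
  c( 8,  10, 27, 43,
    29,  26, 38, 22,
    33,  32, 72, 82,
    81, 106, 86, 54),
  nrow = 4, byrow = TRUE
)

# Expected counts: (row total * column total) / sample size
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Contribution of each cell to the statistic: (O - E)^2 / E
contributions <- (observed - expected)^2 / expected
round(contributions, 3)          # entry [1, 2] is the gap '****'

# Test statistic and degrees of freedom (r - 1)(c - 1)
chi_sq <- sum(contributions)
df <- (nrow(observed) - 1) * (ncol(observed) - 1)
```

Because this works with unrounded expected counts, `chi_sq` may differ from the printed 80.395 in the last decimal place.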
To determine the p-value for this test, we find $P(\chi^2_9 > 80.395)$. To find this probability, we take the degrees of freedom (9) and the test statistic of 80.395 and use the following command in R.
> pv=pchisq(80.395,9,lower.tail=FALSE)
and the output is,
> pv
[1] 1.348953e-13
The output is the desired P-Value.
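The whole analysis can also be cross-checked in one call with R's built-in `chisq.test()`. This is a sketch for verification only; the name `result` is an illustrative choice, and the function works from unrounded expected counts, so its statistic and p-value may differ very slightly from the hand calculation.

```r
# Observed counts from the question's table, entered row by row
observed <- matrix(
  c( 8,  10, 27, 43,
    29,  26, 38, 22,
    33,  32, 72, 82,
    81, 106, 86, 54),
  nrow = 4, byrow = TRUE
)

# Pearson chi-square test of independence on the 4 x 4 table
result <- chisq.test(observed)

result$statistic   # test statistic, close to 80.395
result$parameter   # degrees of freedom
result$p.value     # upper-tail probability, of the order of 1e-13
```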
In summary, ‘*’ = 115, ‘**’ = 174, ‘***’ = 20.44, ‘****’ = 5.335, ‘*****’ = 9 and ‘******’ = 1.348953e-13.
The assumption on which these calculations are based is that the expected frequency count for each cell of the table is at least 5. The output confirms that this holds here: no cells have expected counts less than 5.
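The expected-count condition can be checked directly in R. In this sketch the names `smallest` and `assumption_holds` are hypothetical; `chisq.test()` returns the matrix of expected counts in its `expected` component.

```r
# Observed counts from the question's table, entered row by row
observed <- matrix(
  c( 8,  10, 27, 43,
    29,  26, 38, 22,
    33,  32, 72, 82,
    81, 106, 86, 54),
  nrow = 4, byrow = TRUE
)

# Expected counts under independence, as computed by chisq.test()
expected <- chisq.test(observed)$expected

smallest <- min(expected)              # smallest expected count in the table
assumption_holds <- all(expected >= 5) # TRUE when every cell is at least 5
```

The smallest expected count belongs to the 20-29 / Sales cell, which the table reports as 17.74.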