The following table refers to two properties: age and traffic violations of residents in Gauteng over 12 months.
(i) If a Gauteng resident is selected at random, what is the probability that
(a) he is under 21 and has one or more violations during the past 12 months?
(b) he is under 21?
(c) he has one or more violations during the past 12 months?
(ii) Are events “under 21” and “one or more violations during the past 12 months” independent?
(iii) Test at the 5% level of significance whether age and traffic violations are independent of each other.
The sample size is "n=15+25+8+12+30+10=100".
The table has 3 rows (age groups) and 2 columns (no violations; one or more violations).
The row totals are "r_1=40,\\space r_2=20,\\space r_3=40" and the column totals are "c_1=53,\\space c_2=47".
i)
a.
The probability that he is under 21 and has one or more violations during the past 12 months is 25/100=1/4=0.25
b.
The probability that he is under 21 is (15+25)/100=40/100=2/5=0.4
c.
The probability that he has one or more violations during the past 12 months is (25+12+10)/100=47/100=0.47
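The three probabilities in part (i) can be checked with a short Python sketch. The counts are read off the calculations above; the layout of `counts` (row 0 taken to be "under 21", columns ordered as "no violations" then "one or more violations") and all variable names are assumptions made for illustration.

```python
from fractions import Fraction

# Observed counts. Assumed layout: rows = age groups (row 0 = under 21),
# columns = (no violations, one or more violations).
counts = [[15, 25], [8, 12], [30, 10]]

n = sum(sum(row) for row in counts)  # total sample size

# (a) under 21 AND one or more violations: a single cell of the table
p_under21_and_violation = Fraction(counts[0][1], n)

# (b) under 21: the first row total
p_under21 = Fraction(sum(counts[0]), n)

# (c) one or more violations: the second column total
p_violation = Fraction(sum(row[1] for row in counts), n)

print(n, p_under21_and_violation, p_under21, p_violation)
# prints: 100 1/4 2/5 47/100
```

Using `Fraction` keeps the results exact, so they can be compared directly with the hand-computed values.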
ii)
Let A denote the event that the resident is under 21 and B denote the event that he has one or more violations.
A and B are independent if and only if "p(A\\cap B)=p(A)*p(B)"
We have,
"p(A)=2\/5=0.4", "p(B)=0.47" and "p(A\\cap B)=0.25"
Now,
"p(A)*p(B)=0.4*0.47=0.188\\not=0.25=p(A\\cap B)"
Therefore the events “under 21” and “one or more violations during the past 12 months” are not independent.
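As a quick numerical check of the product rule used above, the following sketch compares p(A and B) with p(A)*p(B). The variable names are hypothetical, and the conditional probability at the end is an extra illustration that is not part of the original solution.

```python
import math

p_a = 40 / 100        # P(A): resident is under 21
p_b = 47 / 100        # P(B): one or more violations
p_a_and_b = 25 / 100  # P(A and B)

# Under independence P(A and B) would equal P(A) * P(B).
product = p_a * p_b
independent = math.isclose(p_a_and_b, product)

# Equivalent view: under independence P(B | A) would equal P(B).
p_b_given_a = p_a_and_b / p_a

print(round(product, 3), independent, round(p_b_given_a, 3))
# prints: 0.188 False 0.625
```

Since 0.625 differs from 0.47, knowing that a resident is under 21 changes the probability of a violation in this table, which is another way of seeing that the two events are dependent.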
iii)
The hypotheses tested are,
"H_0:" age and traffic violations are independent,
against
"H_1:" age and traffic violations are not independent.
To conduct this test, we use the chi-square test for independence.
We first determine the expected count for each cell under "H_0" using the formula,
"E_{ij}=(r_i*c_j)\/n" where row "i=1,2,3" and column "j=1,2"
They are given as,
"E_{11}=(53*40)\/100=21.2"
"E_{12}=(47*40)\/100=18.8"
"E_{21}=(53*20)\/100=10.6"
"E_{22}=(47*20)\/100=9.4"
"E_{31}=(53*40)\/100=21.2"
"E_{32}=(47*40)\/100=18.8"
The test statistic is given as,
"\\chi^2_c=\\displaystyle\\sum_{j=1}^2\\displaystyle\\sum_{i=1}^3(O_{ij}-E_{ij})^2\/E_{ij}"
"\\chi^2_c=(15-21.2)^2\/21.2+(25-18.8)^2\/18.8+(8-10.6)^2\/10.6+(12-9.4)^2\/9.4+(30-21.2)^2\/21.2+(10-18.8)^2\/18.8=12.987(3\\space dp)"
"\\chi_c^2" is compared with the table value at "\\alpha=0.05" with "(r-1)(c-1)=(3-1)(2-1)=2" degrees of freedom.
The table value is,
"\\chi^2_{\\alpha,2}=\\chi^2_{0.05,2}=5.991"
The null hypothesis is rejected if "\\chi^2_c\\gt \\chi^2_{0.05,2}"
Since "\\chi^2_c=12.987\\gt \\chi^2_{0.05,2}=5.991", we reject the null hypothesis and conclude that, at the 5% level of significance, there is sufficient evidence that age and traffic violations are not independent.
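The whole test in part (iii) can be reproduced with the standard library only. The sketch below assumes the same table layout as the solution (rows = age groups, columns = no violations and one or more violations); the variable names are hypothetical. It relies on the fact that a chi-square distribution with 2 degrees of freedom has upper-tail probability exp(-x/2), so the closed forms for the critical value and the p-value apply only to this number of degrees of freedom.

```python
import math

# Observed counts. Assumed layout: rows = age groups,
# columns = (no violations, one or more violations).
observed = [[15, 25], [8, 12], [30, 10]]

row_totals = [sum(row) for row in observed]        # r_i
col_totals = [sum(col) for col in zip(*observed)]  # c_j
n = sum(row_totals)

# Expected count under independence: E_ij = r_i * c_j / n
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Test statistic: sum over all cells of (O - E)^2 / E
chi2_stat = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(row_totals) - 1) * (len(col_totals) - 1)

# Closed forms below are valid ONLY for df == 2, where the
# upper-tail probability of the chi-square distribution is exp(-x / 2).
alpha = 0.05
critical_value = -2 * math.log(alpha)
p_value = math.exp(-chi2_stat / 2)
reject_h0 = chi2_stat > critical_value

print(df, round(chi2_stat, 3), round(critical_value, 3), reject_h0)
# prints: 2 12.987 5.991 True
```

The p-value comes out at roughly 0.0015, well below 0.05, which agrees with the decision to reject the null hypothesis. For tables with other dimensions, a statistics library would be needed for the critical value, for example `scipy.stats.chi2.ppf`.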