Goodness of Fit Examples
Calculations by
Professor S. Gramlich
(updated 3/22/09)
Example 1
Testing for Uniformly Distributed Industrial Accidents
(from Triola, “Elementary Statistics,” 8th ed,
©2001, p. 586)
10-2 #6 (10-2, #9 from 9th ed)
A study was made of 147 industrial accidents that required medical attention.
Among those accidents, 31 occurred on Monday, 42 on Tuesday, 18 on Wednesday,
25 on Thursday, and 31 on Friday (based on results from "Counted Data CUSUM'S,"
by Lucas, Technometrics, Vol. 27, No. 2). Test the claim that accidents occur
with equal proportions on the five workdays. If the proportions are not the same,
what factors might explain the differences?
H0: the data fit a uniform distribution (all probabilities equal) (original claim)
H1: the data do not fit a uniform distribution (at least one probability differs)
degrees of freedom = k - 1:
df = 5 - 1 = 4
critical value: $\chi^2 = 9.488$ (assuming a 0.05 significance level)
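The critical value is normally read from a chi-square table. As a cross-check, it can be recovered numerically. The sketch below is an illustration only, not part of the original calculations. It assumes df = 4, for which the chi-square right-tail area has the closed form exp(-x/2)(1 + x/2), and it finds by bisection the x whose right-tail area is 0.05. The function names are hypothetical.

```python
import math

def chi2_right_tail_df4(x):
    """Right-tail area P(X > x) for a chi-square variable with 4 degrees of freedom."""
    return math.exp(-x / 2) * (1 + x / 2)

def critical_value_df4(alpha, lo=0.0, hi=100.0, tol=1e-9):
    """Find x with right-tail area alpha by bisection (the tail area decreases in x)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if chi2_right_tail_df4(mid) > alpha:
            lo = mid   # tail area still too large, so move right
        else:
            hi = mid
    return (lo + hi) / 2

critical_value = critical_value_df4(0.05)
print(round(critical_value, 3))
```

The closed form holds only for df = 4. Other degrees of freedom need a different formula or a statistics library.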
expected value:
E=n/k= 147/5=29.4
test statistic (Pearson chi-square approximation):
$\chi^2 = \sum \frac{(O - E)^2}{E}$
$= \frac{(31-29.4)^2}{29.4} + \frac{(42-29.4)^2}{29.4} + \frac{(18-29.4)^2}{29.4} + \frac{(25-29.4)^2}{29.4} + \frac{(31-29.4)^2}{29.4}$
$= 10.6531$
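The arithmetic for the test statistic can be checked with a few lines of Python. This is a minimal sketch using only the counts given in the exercise. The variable names are chosen here for illustration.

```python
# Observed accident counts, Monday through Friday (from the exercise).
observed = [31, 42, 18, 25, 31]

n = sum(observed)          # 147 accidents in total
k = len(observed)          # 5 workdays
expected = n / k           # 29.4 per day under the uniform hypothesis

# Sum of (O - E)^2 / E over the five days.
chi_square = sum((o - expected) ** 2 / expected for o in observed)
print(round(chi_square, 4))
```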
Decision:
Traditional: Since 10.653 > 9.488, the test statistic falls inside the critical
region, so reject the null hypothesis.
P-value: .05 > p-value > .025 for df = 4 (from Table A-4).
Since p-value < .05, reject the null hypothesis.
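Table A-4 only brackets the p-value. A rough sketch of an exact calculation follows, assuming df = 4, where the right-tail area of the chi-square distribution is exp(-x/2)(1 + x/2). This shortcut is specific to df = 4 and is not part of the original calculations.

```python
import math

chi_square = 10.6531   # test statistic for the five workdays
alpha = 0.05

# Exact right-tail area for 4 degrees of freedom (closed form valid for df = 4 only).
p_value = math.exp(-chi_square / 2) * (1 + chi_square / 2)

print(round(p_value, 4))
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```

The result should land between .025 and .05, in agreement with the table lookup.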
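Conclusion:
There is sufficient evidence to warrant rejection of the claim that accidents
occur with equal proportions on the five workdays.
Tuesday (42 observed vs. 29.4 expected) and Wednesday (18 vs. 29.4) have the biggest differences.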
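To see which days drive the result, each day's term (O - E)^2 / E can be listed separately. The sketch below is illustrative only. The day labels and variable names are assumptions made for the example.

```python
days = ["Mon", "Tue", "Wed", "Thu", "Fri"]
observed = [31, 42, 18, 25, 31]
expected = sum(observed) / len(observed)   # 29.4 under the uniform hypothesis

# Each day's term (O - E)^2 / E in the chi-square sum.
contributions = {d: (o - expected) ** 2 / expected for d, o in zip(days, observed)}

# Print the days from largest to smallest contribution.
for day, value in sorted(contributions.items(), key=lambda item: item[1], reverse=True):
    print(f"{day}: {value:.3f}")
```

The two largest terms come from Tuesday and Wednesday, which together account for most of the test statistic.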
Example 2
Do Industrial Accidents Fit the Claimed Distribution?
(from Triola, “Elementary Statistics,” 8th ed,
©2001, p. 586)
10-2 #7
Use a 0.05 significance level and the industrial accident data from Exercise 6 (above)
to test the claim of a safety expert that accidents are distributed on workdays as
follows: 30% on Monday, 15% on Tuesday, 15% on Wednesday, 20% on Thursday, and
20% on Friday.
H0: the data fit the claimed distribution. (original claim)
H1: the data do not conform to the claimed distribution
degrees of freedom = k - 1:
df = 5 - 1 = 4
critical value: $\chi^2 = 9.488$
expected values:
$E_i = n p_i$: 147(0.30) = 44.1 for Monday, 147(0.15) = 22.05 for Tuesday and for Wednesday,
147(0.20) = 29.4 for Thursday and for Friday
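The expected counts for all five days can be tabulated in one pass. This is a minimal sketch: the dictionary layout and names are assumptions made for illustration, and the proportions are the safety expert's claimed values from the exercise.

```python
n = 147   # total accidents
claimed = {"Mon": 0.30, "Tue": 0.15, "Wed": 0.15, "Thu": 0.20, "Fri": 0.20}

# The claimed proportions must describe a full distribution.
assert abs(sum(claimed.values()) - 1.0) < 1e-9

# Expected count for each day: E_i = n * p_i.
expected = {day: n * p for day, p in claimed.items()}
for day, e in expected.items():
    print(f"{day}: {e:.2f}")
```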
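test statistic (Pearson chi-square approximation):
$\chi^2 = \sum \frac{(O - E)^2}{E}$
$= \frac{(31-44.1)^2}{44.1} + \frac{(42-22.05)^2}{22.05} + \frac{(18-22.05)^2}{22.05} + \frac{(25-29.4)^2}{29.4} + \frac{(31-29.4)^2}{29.4}$
$= 23.431$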
p-value < .005 (from Table A-4 for df = 4)
Decision: Since 23.431 > 9.488 and the p-value < .05, reject the null hypothesis.
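The whole test for the claimed distribution can be reproduced in a short script. This is a sketch under stated assumptions: the p-value uses the closed form exp(-x/2)(1 + x/2), which is valid only for df = 4, and the variable names are illustrative.

```python
import math

observed = [31, 42, 18, 25, 31]             # Monday through Friday
claimed = [0.30, 0.15, 0.15, 0.20, 0.20]    # safety expert's proportions
n = sum(observed)

expected = [n * p for p in claimed]
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Right-tail area for df = 4 (closed form; an assumption of this sketch).
p_value = math.exp(-chi_square / 2) * (1 + chi_square / 2)

print(round(chi_square, 3))
print("p-value below .005:", p_value < 0.005)
```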
Conclusion: There is sufficient evidence to warrant rejection of the claim that accidents
are distributed as given.
Example 3
Car Crashes and Age Brackets
(from Triola, “Elementary Statistics,” 8th ed,
©2001, p. 586)
10-2 #11 (10-2 #16 from 9th ed)
Among drivers who have had a car crash in the last year, 88 are randomly selected
and categorized by age, with the results listed in the accompanying table. If all ages
have the same crash rate, we would expect (because of the age distribution of licensed
drivers) the given categories to have 16%, 44%, 27%, and 13% of the subjects, respectively.
At the 0.05 significance level, test the claim that the distribution of crashes conforms
to the distribution of ages. Does any age group appear to have a disproportionate
number of crashes?
Age     | Under 25 | 25-44 | 45-64 | Over 64 |
Drivers |    36    |  21   |  12   |   19    |
H0: the data fit (conform to) the given distribution. (original claim)
H1: the data do not conform to the given distribution
degrees of freedom = k - 1:
df = 4 - 1 = 3
critical value: $\chi^2 = 7.815$ (from Table A-4, alpha = .05)
expected values:
$E_i = n p_i$
88(0.16) = 14.08, 88(0.44) = 38.72, 88(0.27) = 23.76, 88(0.13) = 11.44
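The expected counts can be verified quickly. The sketch below also checks that every expected count is at least 5, a common rule of thumb for the chi-square approximation. The handout does not state that check, so treat it as an added assumption. Names are illustrative.

```python
n = 88                                   # drivers sampled
proportions = [0.16, 0.44, 0.27, 0.13]   # age groups in table order, youngest first

# Expected count in each age group: E_i = n * p_i.
expected = [n * p for p in proportions]
print([round(e, 2) for e in expected])

# Rule-of-thumb check (an assumption of this sketch, not stated in the handout).
all_large_enough = all(e >= 5 for e in expected)
print("All expected counts at least 5:", all_large_enough)
```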
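test statistic (Pearson chi-square approximation):
$\chi^2 = \sum \frac{(O - E)^2}{E}$
$= \frac{(36-14.08)^2}{14.08} + \frac{(21-38.72)^2}{38.72} + \frac{(12-23.76)^2}{23.76} + \frac{(19-11.44)^2}{11.44}$
$= 53.05$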
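Decision: Since the test statistic exceeds the critical value (53.05 > 7.815), reject the null hypothesis.
P-value < .005 (from Table A-4); since the p-value < alpha = .05, reject the null hypothesis.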
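The test statistic and p-value for the four age groups can be checked as follows. This is a sketch, not the handout's method. It assumes df = 3, for which the right-tail area of the chi-square distribution can be written as erfc(sqrt(x/2)) + sqrt(2x/pi) exp(-x/2). The function name is hypothetical.

```python
import math

observed = [36, 21, 12, 19]              # age groups in table order, youngest first
proportions = [0.16, 0.44, 0.27, 0.13]
n = sum(observed)                        # 88

expected = [n * p for p in proportions]
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi2_right_tail_df3(x):
    """Right-tail area for 3 degrees of freedom (closed form valid for df = 3 only)."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

p_value = chi2_right_tail_df3(chi_square)
print(round(chi_square, 2))
print("p-value below .005:", p_value < 0.005)
```

As a sanity check on the formula, `chi2_right_tail_df3(7.815)` should come out close to 0.05, the tail area at the tabled critical value.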
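Conclusion: There is sufficient evidence to warrant rejection of the claim that the
distribution of crashes conforms to the distribution of ages. The under-25 group appears
to have a disproportionate number of crashes (36 observed vs. 14.08 expected).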