Goodness of Fit Examples

Calculations by Professor S. Gramlich

(updated 3/22/09)

 

Example 1

Testing for Uniformly Distributed Industrial Accidents

(from Triola, “Elementary Statistics,” 8th ed, ©2001, p. 586)

10-2 #6 (10-2, #9 from 9th ed)

 

A study was made of 147 industrial accidents that required medical attention.

Among those accidents, 31 occurred on Monday, 42 on Tuesday, 18 on Wednesday,

25 on Thursday, and 31 on Friday (based on results from "Counted Data CUSUM'S,"

by Lucas, Technometrics, Vol. 27, No. 2).  Test the claim that accidents occur with

equal proportions on the five workdays.  If the proportions are not the same, what

factors might explain the differences?

 

H0: fits uniform distribution (probabilities equal) (original claim)

H1: does not fit uniform distribution (at least 1 of the probabilities not equal)

 

degrees of freedom= k-1:

df = 5-1 =4

critical value:  χ² = 9.488 (assuming .05 significance level)

 

expected value:

E=n/k= 147/5=29.4

 

test statistic:

Pearson χ² approx = Σ((O-E)^2/E)

= (31-29.4)^2/29.4 + (42-29.4)^2/29.4 + (18-29.4)^2/29.4 + (25-29.4)^2/29.4 + (31-29.4)^2/29.4

= 10.6531
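
The arithmetic for the uniform case can be checked with a short Python sketch. The variable names here are illustrative only and are not part of the textbook exercise.

```python
# Observed accident counts, Monday through Friday (from the exercise).
observed = [31, 42, 18, 25, 31]

n = sum(observed)          # 147 accidents in total
k = len(observed)          # 5 categories (workdays)
expected = n / k           # uniform claim: same expected count for every day

# Pearson chi-square statistic: sum of (O - E)^2 / E over all categories.
chi_square = sum((o - expected) ** 2 / expected for o in observed)

print(f"E = {expected:.1f}, df = {k - 1}, chi-square = {chi_square:.4f}")
```

Because every category shares the same expected count, E is computed once and reused for all five terms.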

 

Decision:

Traditional:  Since 10.653 > 9.488, the test statistic is inside the critical region, so Reject Null.

P-value:  .05 > p-value > .025  for df = 4 (from Table A-4)

Since p-value < .05, Reject Null.
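
Table A-4 only brackets the p-value between two tabled areas. As a rough check, when the degrees of freedom are even the chi-square upper-tail area has a simple closed form, so a number can be computed with the standard library alone. The helper name `chi2_sf_even_df` is hypothetical, and the shortcut does not apply to odd df.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with even df.

    Uses the closed form exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!,
    which is valid only when df is a positive even integer.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this shortcut only covers positive even df")
    half = x / 2.0
    terms = (half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * sum(terms)

alpha = 0.05
p_value = chi2_sf_even_df(10.6531, 4)   # test statistic and df from this example

print(f"p-value = {p_value:.4f}, reject H0: {p_value < alpha}")
```

The result should land inside the bracket read from the table, which is a useful sanity check on the table lookup.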

 

Conclusion:

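There is sufficient evidence to warrant rejection of the claim that accidents occur with equal proportions on the five workdays.

Tuesday (42 observed vs. 29.4 expected) and Wednesday (18 observed vs. 29.4 expected) have the biggest differences.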

 

 

Example 2

Do Industrial Accidents Fit the Claimed Distribution?

(from Triola, “Elementary Statistics,” 8th ed, ©2001, p. 586)

10-2 #7

 

Use a 0.05 significance level and the industrial accident data from Exercise 6 (above)

to test the claim of a safety expert that accidents are distributed on workdays as follows:

30% on Monday, 15% on Tuesday, 15% on Wednesday, 20% on Thursday, and

20% on Friday.

 

H0:  the data fits the claimed distribution. (original claim)

H1:   the data does not conform to the given distribution

 

degrees of freedom= k-1

df= 5-1 =4

critical value:  χ² = 9.488 (from Table A-4, alpha = .05)

 

expected values:

Ei = n*pi:  147*.30 = 44.1 (Mon), 147*.15 = 22.05 (Tues), 147*.15 = 22.05 (Wed), 147*.20 = 29.4 (Thurs), 147*.20 = 29.4 (Fri)

 

test statistic:

Pearson χ² approx = Σ((O-E)^2/E)

= ((31-44.1)^2)/44.1 + ((42-22.05)^2)/22.05 + ... + ((31-29.4)^2)/29.4  = 23.431
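
When the claimed proportions differ by category, each expected count is n times its own proportion. A minimal Python sketch of that calculation follows; the list names are illustrative, not from the textbook.

```python
days = ["Mon", "Tues", "Wed", "Thurs", "Fri"]
observed = [31, 42, 18, 25, 31]              # counts from the exercise
claimed = [0.30, 0.15, 0.15, 0.20, 0.20]     # safety expert's claimed proportions

n = sum(observed)                            # 147
expected = [n * p for p in claimed]          # Ei = n * pi for each day

# Pearson chi-square statistic: sum of (O - E)^2 / E over all categories.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

for day, o, e in zip(days, observed, expected):
    print(f"{day:<6} O = {o:>3}  E = {e:6.2f}")
print(f"chi-square = {chi_square:.3f}")
```

Pairing each observed count with its own expected count through `zip` avoids accidentally using one day's expected value for another day.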

 

p-value < .005 (from Table A-4 for df=4)

 

Decision:  Reject null.

 

Conclusion:  There is sufficient evidence to warrant rejection of the claim that accidents

are distributed as given.

 

 

Example 3

Car Crashes and Age Brackets

(from Triola, “Elementary Statistics,” 8th ed, ©2001, p. 586)

10-2 #11 (10-2 #16 from 9th ed)

 

Among drivers who have had a car crash in the last year, 88 are randomly selected

and categorized by age, with the results listed in the accompanying table.  If all ages

have the same crash rate, we would expect (because of the age distribution of licensed drivers)

the given categories to have 16%, 44%, 27%, and 13% of the subjects, respectively.

At the 0.05 significance level, test the claim that the distribution of crashes conforms

to the distribution of ages. Does any age group appear to have a disproportionate

number of crashes?

 

Age:      Under 25   25-44   45-64   Over 64
Drivers:  36         21      12      19

 

 

H0:  data fits (conforms to) the given distribution. (original claim)

H1:  data does not conform to the given distribution

 

degrees of freedom= k-1:

df = 4-1 =3

critical value:  χ² = 7.815 (from Table A-4, alpha = .05)

 

expected values:

Ei = n*pi

88*.16 = 14.08, 88*.44 = 38.72, 88*.27 = 23.76, 88*.13 = 11.44

 

test statistic:

Pearson χ² approx = Σ((O-E)^2/E)

= ((36-14.08)^2)/14.08 + ((21-38.72)^2)/38.72 + ((12-23.76)^2)/23.76 + ((19-11.44)^2)/11.44 = 53.05

 

Decision:  Since the test statistic 53.05 > critical value 7.815, Reject Null.

P-value < .005 (from Table A-4), and since p-value < alpha = .05, Reject Null.
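
To see which age group drives the result, it can help to look at each category's individual (O-E)^2/E term. The Python sketch below does that and also estimates the p-value using a closed form that holds only for exactly 3 degrees of freedom. The helper name `chi2_sf_df3` is hypothetical, and the third bracket is labeled 45-64 on the assumption that it does not overlap "Over 64".

```python
import math

ages = ["Under 25", "25-44", "45-64", "Over 64"]   # third label is an assumption
observed = [36, 21, 12, 19]                        # crash counts from the table
proportions = [0.16, 0.44, 0.27, 0.13]             # age distribution of licensed drivers

n = sum(observed)                                  # 88
expected = [n * p for p in proportions]            # Ei = n * pi

# Each category's contribution to the Pearson chi-square statistic.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(contributions)

def chi2_sf_df3(x):
    """Upper-tail probability for a chi-square variable with exactly 3 df."""
    half = x / 2.0
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 * x / math.pi) * math.exp(-half)

p_value = chi2_sf_df3(chi_square)
largest = ages[contributions.index(max(contributions))]

for age, o, e, c in zip(ages, observed, expected, contributions):
    print(f"{age:<9} O = {o:>2}  E = {e:5.2f}  contribution = {c:6.2f}")
print(f"chi-square = {chi_square:.2f}, largest contribution: {largest}")
print(f"p-value below .005: {p_value < 0.005}")
```

The category with the largest contribution is the one whose observed count departs most from its expected count relative to that expected count.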

 

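Conclusion: There is sufficient evidence to warrant rejection of the claim that the distribution of crashes conforms to the distribution of ages. The under-25 group appears to have a disproportionate number of crashes (36 observed vs. 14.08 expected).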