Replaces the contents of the result variable with the chi-square statistic computed from the first two vectors (the observed values and the expected values, in that order). The statistic is the sum of N terms, where N is the length of the input vectors. Each term is the square of the difference between an element of the observed-values vector and the corresponding element of the expected-values vector, divided by that expected value.

If the input vectors have different lengths, the shorter vector is extended to the length of the longer by repeating its last element as many times as necessary. This extension is done internally only; it does not change the actual size or content of the shorter vector.

If any element is a missing value (represented by "." or "NaN"), the result of CHISQUARE is NaN. To avoid this, CLEAN the input vector(s) before applying the CHISQUARE command.
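Written as a formula, the description above amounts to the following. The symbols O_i (observed value), E_i (expected value) and N (vector length after any internal extension) are notation introduced here for illustration; they are not part of the command's syntax.

```latex
\chi^2 = \sum_{i=1}^{N} \frac{(O_i - E_i)^2}{E_i}
```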
    COPY (30 29 16 12 33 5) observedData
    COPY (25.2 37.2 12.6 16.8 24.8 8.4) expectedValues
    CHISQUARE observedData expectedValues chiSquare
    PRINT chiSquare

The above program produces the following output:

    chiSquare: 9.098182283666157
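As an independent cross-check outside Statistics101, the same arithmetic can be sketched in Python. This is only an illustration of the behaviour described for CHISQUARE, not its actual implementation: the function name `chi_square` is hypothetical, missing values are modelled as `float("nan")`, and both inputs are assumed to be non-empty.

```python
import math

def chi_square(observed, expected):
    """Sum of (O - E)^2 / E, following the behaviour described for CHISQUARE."""
    n = max(len(observed), len(expected))
    # Extend the shorter list by repeating its last element.
    # New lists are built, so the caller's inputs are left unchanged.
    obs = list(observed) + [observed[-1]] * (n - len(observed))
    exp = list(expected) + [expected[-1]] * (n - len(expected))
    # Assumption: a missing value is represented as NaN and makes the result NaN.
    if any(math.isnan(v) for v in obs + exp):
        return math.nan
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp))

observed = [30, 29, 16, 12, 33, 5]
expected = [25.2, 37.2, 12.6, 16.8, 24.8, 8.4]
print(chi_square(observed, expected))  # approximately 9.098182
```

Calling `chi_square([3, 5], [4])` exercises the extension rule: the expected list is treated as `[4, 4]`, giving 0.25 + 0.25 = 0.5.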
The next program is a realistic example. It computes the expected values with a subroutine, generates the appropriate chi-square distribution by simulation, and then determines whether the null hypothesis is rejected.

    'From CliffsQuickReview Statistics, Page 110:
    'Suppose 125 children are shown three TV commercials
    'A, B, and C, for breakfast cereal and are asked to
    'pick which they liked best. The results are:
    '
    '          A    B    C   Totals
    'Boys     30   29   16     75
    'Girls    12   33    5     50
    'Totals   42   62   21    125
    '
    'Is the choice of favorite commercial related to
    'whether the child is a boy or a girl?
    'Null hypothesis: the commercial choice is not
    'related to the sex of the child. This can be
    'restated as: How often (or what is the
    'probability that) the contents of the six inner
    'cells would be as far or farther than they
    'currently are from their expected values?
    'Compare results for alpha = 0.05 vs. alpha = 0.01.
    '
    COPY (30 29 16) boysData
    COPY (12 33 5) girlsData
    COPY boysData girlsData observedData

    'Subroutine to compute the expected values given
    'the rows of a two-row data table.
    NEWCMD EXPECTED_VALUES row1 row2 expectedVals
       SUM row1 row1Sum
       SUM row2 row2Sum
       ADD row1 row2 columnSums
       SUM columnSums grandTotal
       MULTIPLY row1Sum columnSums row1Products
       MULTIPLY row2Sum columnSums row2Products
       DIVIDE row1Products grandTotal row1ExpectedVals
       DIVIDE row2Products grandTotal row2ExpectedVals
       COPY row1ExpectedVals row2ExpectedVals expectedVals
    END

    EXPECTED_VALUES boysData girlsData expectedValues
    PRINT expectedValues

    'Now, compute and print the chi-square value
    'for the given table data.
    CHISQUARE observedData expectedValues chiSquare
    PRINT chiSquare

    'Construct a "universe" of ads in the same proportions
    'as in the original data table:
    NAME (1 2 3) adA adB adC
    COPY 42#adA 62#adB 21#adC ads

    'Compute and record (SCORE) chi-square values for
    'many simulated table cell entries derived from
    'the ad universe.
    COPY 5000 rptCount
    REPEAT rptCount
       SHUFFLE ads ads
       TAKE ads 1,75 boys
       TAKE ads 76,125 girls
       COUNT boys = adA boysAdA
       COUNT boys = adB boysAdB
       COUNT boys = adC boysAdC
       COUNT girls = adA girlsAdA
       COUNT girls = adB girlsAdB
       COUNT girls = adC girlsAdC
       'Rebuild a new table with the simulated data
       COPY boysAdA boysAdB boysAdC girlsAdA girlsAdB girlsAdC observedData$
       CHISQUARE observedData$ expectedValues chiSquare$
       SCORE chiSquare$ chiSquareScores
    END

    'Just for information purposes, print out the
    'chi-square distribution we have generated:
    HISTOGRAM chiSquareScores

    'The fraction of simulated chi-square values at least as
    'large as the observed one estimates the probability asked
    'about in the null hypothesis above.
    COUNT chiSquareScores >= chiSquare chiCount
    DIVIDE chiCount rptCount significanceLevel
    PRINT significanceLevel

    OUTPUT "Conclusions:\n"
    OUTPUT "The null hypothesis is "
    IF significanceLevel >= 0.05
       OUTPUT "NOT "
    END
    OUTPUT "rejected at the 0.05 significance level.\n"
    OUTPUT "The null hypothesis is "
    IF significanceLevel >= 0.01
       OUTPUT "NOT "
    END
    OUTPUT "rejected at the 0.01 significance level.\n"
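For readers who want to follow the logic outside Statistics101, the expected-value calculation and the shuffle-based simulation can be sketched in Python. This is a rough illustration under stated assumptions, not a translation of how Statistics101 works internally: the function names `expected_values`, `chi_square` and `simulate_p_value`, the `seed` parameter, and the use of Python's `random` module are all choices made here. Because the simulation is random, the estimated probability varies from run to run and will not match the Statistics101 output digit for digit.

```python
import random

def expected_values(row1, row2):
    """Expected counts for a two-row table: row total * column total / grand total."""
    column_sums = [a + b for a, b in zip(row1, row2)]
    grand_total = sum(column_sums)
    # Row 1 cells first, then row 2 cells, matching the layout of observedData.
    return [sum(row) * c / grand_total for row in (row1, row2) for c in column_sums]

def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over equal-length lists."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def simulate_p_value(row1, row2, repetitions=5000, seed=None):
    """Estimate how often a shuffled table is at least as extreme as the observed one."""
    rng = random.Random(seed)
    expected = expected_values(row1, row2)
    actual = chi_square(list(row1) + list(row2), expected)
    column_sums = [a + b for a, b in zip(row1, row2)]
    n_columns = len(column_sums)
    # Universe of choices in the observed proportions (column index repeated by its total).
    universe = [col for col, n in enumerate(column_sums) for _ in range(n)]
    n_row1 = sum(row1)
    hits = 0
    for _ in range(repetitions):
        rng.shuffle(universe)
        first, second = universe[:n_row1], universe[n_row1:]
        simulated = ([first.count(c) for c in range(n_columns)]
                     + [second.count(c) for c in range(n_columns)])
        if chi_square(simulated, expected) >= actual:
            hits += 1
    return hits / repetitions

boys = [30, 29, 16]
girls = [12, 33, 5]
print(expected_values(boys, girls))
p = simulate_p_value(boys, girls, repetitions=5000, seed=1)
print("reject at 0.05:", p < 0.05, " reject at 0.01:", p < 0.01)
```

The decision rule at the end mirrors the program above: the null hypothesis is rejected at a given alpha only when the estimated probability is smaller than that alpha.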