The Educational Achievement of Indian Children

CHAPTER III

Statistical Treatment of the Test Data

INTRODUCTION

In order to make the report more meaningful to the reader, it was decided to describe the statistical treatment of the test data in some detail. The discussion and interpretation of the results depend directly upon this treatment. Then too, because of the great amount of data gathered, certain assumptions had to be made in order to facilitate the treatment of the data and the discussion of the results. An understanding of these assumptions is necessary to interpret and qualify the results properly.

GENERAL PROCEDURES

The test scores for each test for each student were entered on a code sheet, transferred to Hollerith cards, and sorted by IBM equipment. Since there were nine geographic areas and six types of schools and twenty-four tests, the sorting yielded 360 distributions. The following distribution for grade eight for day schools for the Pressey Vocabulary Test was typical:

[Table: Pressey Vocabulary Test, grade eight, day schools. The score distribution is not reproduced here.]

The following values were calculated for each of the 360 distributions: (1) mean, (2) standard deviation, (3) plus one standard deviation, and (4) minus one standard deviation. For the distribution shown above, the calculations were as follows:

Mean = 50.77
Standard deviation = 14.09

Plus one standard deviation = 50.77 + 14.09 = 64.86
Minus one standard deviation = 50.77 - 14.09 = 36.68
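The four values can be sketched in a few lines of Python. This is a minimal illustration, not the original IBM procedure: the function name `summarize` and the frequency table `example` are hypothetical, and the division by N (rather than N - 1) is an assumption, since the report does not say which form of the standard deviation was used.

```python
import math

def summarize(freq):
    """Return (mean, sd, mean + sd, mean - sd) for a {score: count} table."""
    n = sum(freq.values())
    mean = sum(score * count for score, count in freq.items()) / n
    # Assumption: population form of the variance (divide by N).
    variance = sum(count * (score - mean) ** 2 for score, count in freq.items()) / n
    sd = math.sqrt(variance)
    return mean, sd, mean + sd, mean - sd

# Hypothetical frequency table, NOT the Pressey Vocabulary distribution.
example = {30: 2, 40: 5, 50: 8, 60: 5, 70: 2}
mean, sd, upper, lower = summarize(example)

# The two limits reported in the text, from the stated mean and sd.
text_upper = round(50.77 + 14.09, 2)
text_lower = round(50.77 - 14.09, 2)
```

Applied to a real distribution, `upper` and `lower` would be the end points of the vertical line drawn for that group.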

These values were used to draw the vertical lines that appear on the twenty-four figures in Chapter IV. There are nine of these lines for the nine geographic areas and six of these lines for the six types of schools.

These values were also used in computing the percentage of overlap between grade eight and grade twelve for the various types of schools where the tests given were common to both grade levels. For example, what percentage of the students in the mission schools in grade eight exceeded the mean of the students in the mission schools in grade twelve on the Use of Resources Test? The mean of the pupils in grade twelve was 47.45 and the mean for the pupils in the eighth grade was 37.91. The standard deviation for the eighth grade was 8.40. In terms of the eighth grade distribution, how many standard deviations above the mean would a score of 47.45 fall? The procedure is as follows:

(47.45 - 37.91) / 8.40 = 9.54 / 8.40 = 1.14

Assuming a normal distribution of test scores for grade eight, 1.14 standard deviations above the mean would have 12.7 per cent of the area above this point. Therefore, it can be said, assuming a normal distribution of scores, that 12.7 per cent of the eighth grade exceeded the mean score of the twelfth grade on the Use of Resources Test.
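The overlap calculation can be checked with a short Python sketch. The function name `percent_exceeding` is hypothetical. The deviation is rounded to two decimals before the normal area is taken, on the assumption that the original figure came from a printed table entered at 1.14.

```python
from statistics import NormalDist

def percent_exceeding(lower_mean, lower_sd, upper_mean, decimals=2):
    """Per cent of the lower grade above the upper grade's mean, assuming normality."""
    # Round z as one would when entering a printed normal table (assumption).
    z = round((upper_mean - lower_mean) / lower_sd, decimals)
    return z, 100 * (1 - NormalDist().cdf(z))

# Use of Resources Test, mission schools: grade eight versus grade twelve mean.
z, pct = percent_exceeding(37.91, 8.40, 47.45)
print(f"z = {z:.2f}, per cent above = {pct:.1f}")
# prints: z = 1.14, per cent above = 12.7
```

Without the rounding step the unrounded deviation (about 1.136) gives a figure nearer 12.8 per cent, so the two approaches differ only in the last digit.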

FURTHER ANALYSIS

Did the Indian children located in a particular geographic area achieve significantly more on a particular test than the Indian children in the other eight areas? In order to answer this question for the twenty-four tests, it would have been necessary to run twenty-four tests of significance by means of the technique of analysis of variance. Since this technique is based on the assumption of homogeneity of variances, this would have necessitated twenty-four tests for homogeneity of variances prior to the application of the F test or test of significance. If on a particular test the geographic areas had been homogeneous with respect to variances and if the F test had proved to be significant, thirty-six t tests would have had to be run in order to locate the significant differences between particular geographic areas. The required probability for the selected difference to be significant would be 1.39 in 1000 at the 5 per cent level, or the critical ratio would have to be equal to or greater than about 3.22. Even a spot check here and there would, at best, be sketchy because the number of cases in each area varies considerably. Thus, in two different comparisons the one critical ratio might be significant because of the large N's involved and the other critical ratio might not be significant because of the small N's involved, even though the differences in means were the same or nearly so. Needless to say, these calculations would have involved an immense amount of labor. Therefore, any conclusions regarding differences in achievement between geographic areas on the various tests will have to be drawn from the line graphs appearing in the various figures.
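The dependence of the critical ratio on the number of cases can be illustrated with a brief Python sketch. The report does not print its critical ratio formula, so the usual large-sample form (difference in means divided by the standard error of the difference) is assumed here, and all means, standard deviations, and N's below are hypothetical.

```python
import math

def critical_ratio(mean1, sd1, n1, mean2, sd2, n2):
    """Difference in means over its standard error (assumed large-sample form)."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error

# Hypothetical areas: the same 5-point difference and the same spread,
# but ten times as many cases in the first comparison.
large = critical_ratio(55.0, 14.0, 400, 50.0, 14.0, 400)
small = critical_ratio(55.0, 14.0, 40, 50.0, 14.0, 40)

print(f"large N: {large:.2f}, small N: {small:.2f}")
```

With these made-up figures the first ratio clears the value of about 3.22 named above and the second does not, although the difference in means is identical.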

A word of explanation regarding the line graphs is in order. The distribution for a particular test for a particular area or type of school is graphically portrayed by a straight vertical line, the lowest point of the line being one standard deviation below the mean (a horizontal mark at the center of the vertical line), and the highest point of the line being one standard deviation above the mean. These limits, assuming a normal distribution of scores, mark the range of the middle two-thirds of the students on a particular test. The achievement of Indian children in the various geographic areas or in the various types of schools can be compared by locating the mean point. In addition, the amount of overlap in achievement on a particular test can be noted by comparing the two vertical lines for two geographic areas or for two types of schools.

These vertical lines were also drawn for the five types of schools which Indian children attend. These five types of schools are: reservation boarding schools, day schools, mission schools, non-reservation boarding schools, and public schools.* Since the Indian Service is more interested in the achievement of Indian children in the various schools they attend than in the comparison between these types of schools and the schools which white children attend, it was decided to go beyond the graphical treatment for these comparisons even though the amount of labor involved was great.

Since the results of the IBM work yielded distributions of scores rather than the sums of the scores and the sums of the scores squared, it would have been awkward to use the technique of analysis of variance mentioned previously. An examination of the line graphs made it almost certain that significant differences based on comparisons between types of schools would have been obtained for all twenty-four of the tests. Later calculations proved this examination to be correct. For the moment, let us assume that F values had been calculated and that all of the values were significant. Ordinarily one would not calculate the F values unless the test of homogeneity of variances had established that the variabilities of the groups under comparison were essentially the same. Were the variabilities for the six types of schools on the twenty-four tests essentially the same or did they differ significantly from each other? An examination of the data indicated that on some of the tests the six types of schools were homogeneous with respect to variances and on other tests they were not. What effect would a significant difference in variances have on the value obtained by running a t test on the scores obtained from the two types of schools? It may result in a somewhat larger value of t, but it is unlikely that a significant value of t would be produced only by a difference in variances.¹⁶ Since the data yielded by the IBM work lent themselves better to the calculation of critical ratios than of t ratios, fifteen critical ratios were calculated for each of the twenty-four tests. A comparison was made of the values yielded by the critical ratio formula with those yielded by the t formula or by the use of the Behrens-Fisher formula when the variances were not homogeneous.

In the case of the Pressey Vocabulary Test for grade eight, did the Indian children in the public schools achieve significantly more than the Indian children in the day schools? A critical ratio of 5.81 was obtained in favor of the Indian children in the public schools. On the face of it, this would seem to indicate that Indian children in the public schools achieved significantly more than did the Indian children in the day schools. The application of the t test yielded a value of 5.86, slightly higher than that obtained by the critical ratio formula. However, the variances of the two groups were not homogeneous and the t test was not the proper tool to use. The application of the Behrens-Fisher d test yielded a value of 5.86 which exceeded the table value at the 1 per cent level. This definitely established the superiority of the Indian children in the public schools.
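Why the methods agree so closely with large groups can be seen in a short Python sketch. It compares an unpooled ratio with the pooled-variance t. The Behrens-Fisher d statistic, as usually defined, has the same form as the unpooled ratio and differs mainly in the table it is referred to. The day-school mean and standard deviation are the ones quoted earlier in this chapter; the public-school figures and both N's are hypothetical, so the results will not reproduce 5.81 or 5.86.

```python
import math

def unpooled_ratio(m1, s1, n1, m2, s2, n2):
    """Difference in means over a standard error built from separate variances."""
    return (m1 - m2) / math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)

def pooled_t(m1, s1, n1, m2, s2, n2):
    """Student's t with a pooled variance (assumes a common variance)."""
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

public = (57.0, 12.0, 250)   # hypothetical mean, sd, N
day = (50.77, 14.09, 300)    # mean and sd from the text; N is hypothetical

cr = unpooled_ratio(*public, *day)
t = pooled_t(*public, *day)
print(f"unpooled ratio = {cr:.2f}, pooled t = {t:.2f}")
```

With several hundred cases in each group the two values differ only in the first decimal place, which is the pattern the comparison above reports.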

Although the procedure used with the comparison of achievement of Indian children in public and day schools is correct where the variances differ significantly, this procedure involves a great amount of unnecessary work, since the values obtained by all three methods will not differ greatly when the number of cases involved is large. Edwards has this to say: "With still larger samples, in the neighborhood of 30 cases each, a variance which is 2.0 times as large as the other will be sufficient to reject the hypothesis of a common variance."¹⁷ Since the number of cases in grade eight ranged from about 70 to 450 and since the number of cases in grade twelve ranged from about 30 to 350, the consideration of homogeneity of variances for this study did not seem to be important, provided the reader is mindful that significant differences in variances did exist in a minority of the comparisons. Therefore, to facilitate the drawing of conclusions, it was assumed that each of the twenty-four sets of comparisons yielded significant F values and that the variances for each set were homogeneous. Thus, the calculation of fifteen critical ratios for each of the twenty-four sets of comparisons would be necessary to ascertain which group or groups achieved significantly more than the others. Since for each comparison there were six types of schools necessitating the calculation of fifteen critical ratios, the required probability for the selected difference to be significant is not 1 in 100 but 1 in (15)(100), or .6 in 1000 at the 1 per cent level and 3.3 in 1000 at the 5 per cent level. Thus, any critical ratio above 3.40 would have a probability value less than .6 in 1000 and would be considered significant at the 1 per cent level. Any critical ratio from 2.94 to 3.40 would have a probability value less than 3.3 in 1000 and would be considered significant at the 5 per cent level.
The twenty-four tables in Chapter IV, each containing fifteen critical ratio values, are to be read with the above discussion and reservations in mind.
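The adjusted thresholds can be recovered with a small Python sketch. The function names are hypothetical, and a two-tailed normal reference is assumed because it reproduces the 3.40 and 2.94 values; the report itself does not state which tail convention was used.

```python
from math import comb
from statistics import NormalDist

def adjusted_threshold(groups, alpha):
    """Pairs among the groups, per-comparison probability, and critical ratio."""
    pairs = comb(groups, 2)
    per_comparison = alpha / pairs
    # Assumption: two-tailed lookup in the normal distribution.
    z = NormalDist().inv_cdf(1 - per_comparison / 2)
    return pairs, per_comparison, z

def significance_label(ratio, groups=6):
    """Classify a critical ratio against the adjusted 1 and 5 per cent levels."""
    _, _, z01 = adjusted_threshold(groups, 0.01)
    _, _, z05 = adjusted_threshold(groups, 0.05)
    if abs(ratio) >= z01:
        return "1 per cent"
    if abs(ratio) >= z05:
        return "5 per cent"
    return "not significant"

pairs, p01, z01 = adjusted_threshold(6, 0.01)
_, p05, z05 = adjusted_threshold(6, 0.05)
print(pairs, round(p01 * 1000, 2), round(p05 * 1000, 2))
print(significance_label(5.81))
```

Six types of schools give 15 pairs, and the per-comparison probabilities come to roughly 0.67 and 3.3 in 1000, which the chapter states as .6 and 3.3.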

SUMMARY

This chapter has described the general procedures used in the treatment of the test data. Two statistics, the mean and the standard deviation, were calculated for each category for each of the twenty-four tests. These statistics were used in the construction of line graphs showing the range of scores for the middle two-thirds of the students in a particular category on a particular test. It was explained that these line graphs could be used for rough comparisons of achievement of children located in the different geographic areas or in the different types of schools. In addition, the means and standard deviations were used to determine the percentage of Indian children in the eighth grade that exceeded the mean of the Indian children in the twelfth grade where the tests given were common to both grades. Since the comparisons involving Indian children in the various types of schools they attend as well as comparisons with white children in public schools were considered of greater importance than the comparisons between geographic areas, considerable discussion was devoted to the calculation of critical ratios and their interpretation as to significance. It was decided that a critical ratio of 3.40 or greater would be significant at the 1 per cent level and that a critical ratio of 2.94 to 3.40 would be significant at the 5 per cent level. It was pointed out that the above statements are based upon certain assumptions and reservations, and that the reader must be mindful of these assumptions and reservations when considering the data at hand.

* Indian children in the public schools will in this report be called Public Indian. White children in the public schools will be called Public White.

16 R. A. Fisher. Statistical Methods for Research Workers. (6th ed.) Edinburgh: Oliver and Boyd, 1936, p. 129.

17 Allen L. Edwards. Experimental Design in Psychological Research. New York: Rinehart and Company, Inc., 1950, p. 163.
