INTERNATIONAL JOURNAL OF POPULATION GEOGRAPHY
Int. J. Popul. Geogr. 7, 129-148 (2001)
DOI: 10.1002/ijpg.213

Space-time Evolution of the Fertility Transition in India, 1961-1991

F. Balabdaoui, J.-P. Bocquet-Appel*, C. Lajaunie and S. Irudaya Rajan

1 Centre de Geostatistique, Ecole des Mines de Paris, 35 rue Saint Honore, 77305 Fontainebleau, France
2 CNRS, UPR 2147, 44 rue de l'Amiral Mouchez, 75014 Paris, France
3 Centre for Development Studies, Trivandrum 695 011, India

* Correspondence to: Jean-Pierre Bocquet-Appel, CNRS, UPR 2147, 44 rue de l'Amiral Mouchez, 75014 Paris, France. E-mail: bocquet-appel@ivry.cnrs.fr

ABSTRACT

A fertility index has been estimated which takes into account the space-time variations of mortality rates of children and women and the different fertility rates of women across age groups. The basic data are the recorded frequencies at district level for four decennial census dates (1961 to 1991) of children aged 0-4 years and women aged 15-49 years. On that basis, we have 326, 349, 400 and 454 data points on the map for each decennial census, respectively. To construct fertility surfaces on the map, the kriging technique of spatial interpolation was applied. To highlight on these surfaces the areas of regional homogeneity and zones of major change, on a subcontinental scale, a technique of image analysis and the wombling procedure of detection of the zones of abrupt change were used. The fertility maps and the variogram surfaces show that the transition took place by an expansion of heterogeneity starting from the southwestern zones. This expansion reduced the extent of the pre-transition area of homogeneity in such a way that its original transverse SW-NE direction turned into a longitudinal direction by 1991 and its size became appreciably smaller. In 1961 a long zone of abrupt change detected by wombling was observed crossing the Indian subcontinent transversely. From west to east, it extended longitudinally for a length of nearly 2500 km, including Bangladesh. This zone of abrupt fertility change divided the subcontinent into two regions, one of high fertility in the north and the other of lower fertility in the south. This latter zone is separated partially by State boundaries and geographical landmarks. A procedure of regression trees (classic AID tree) applied to the co-variables shows that this zone also corresponds to a major regional change in the female literacy rate. From 1961 to 1991, with a general fall in fertility rates all over the country, this zone of abrupt change has progressively disappeared.

Copyright © 2001 John Wiley & Sons, Ltd.

Received 26 August 2000; revised 16 November 2000; accepted 25 January 2001

Keywords: fertility transition; surface variable; generalised wombling; India

INTRODUCTION

The study intends to highlight the geographical patterns of fertility and their changes in India every ten years from 1961 to 1991. Fertility transition is taking place in most parts of the country (Bhat et al., 1984; Visaria and Visaria, 1997; Bhat and Zavier, 1999; Visaria and Irudaya Rajan, 1999). What are the geographical patterns of fertility since the pre-transition period? Which are the zones in transition and those which remain unchanged? Which are the demographic and sociocultural co-variables available in the censuses at the district level which are associated with the transition? The present paper relates mainly to the data and the demographic techniques of estimation and spatial analysis which were used to reconstitute and analyse, from the regional to the subcontinental scale, the fertility map on the different census dates since 1961. In order to reconstruct (interpolate) surfaces of fertility, as continuous variables on a map, the kriging technique (Wackernagel, 1998) was applied.
To highlight on these surfaces the areas of regional homogeneity and the zones of major change at the subcontinental scale, a technique of image analysis and the wombling procedure for detecting zones of abrupt change (Barbujani et al., 1989; Bocquet-Appel and Bacro, 1994) were used. The ecological correlations between fertility and the demographic, socioeconomic and cultural co-variables of the censuses (1981 and 1991) are already known. This article is mainly of a descriptive nature. Its goal is to provide material for further analysis of demographic changes and their correlates.

DATA

Census Figures of Children and Women

The basic information for estimating fertility rates at the district level is the census figures of children aged 0-4 years and women aged 15-49 years for the years 1961, 1971, 1981 and 1991. The districts are located by the geographical coordinates of their capitals. From 1961 to 1991, they represent for the four decennial censuses 326, 349, 400 and 454 data points on the map, respectively. The list of the administrative subdivisions (states and districts) varies chronologically. When one refers to a district (represented by the coordinates of its capital, historically a stable geographical unit), it will be in the current administrative state division (1991) and not in the administrative subdivision at the time of the earlier censuses.

In the censuses, there are two known sources of error: under-enumeration of persons in different age groups, and mis-reporting of age. The corrections proposed by post-enumeration checks (Census of India, 1982, 1994; Bhat et al., 1984; Bhat, 1998), undertaken in order to take account of the missing individuals supposedly not counted in large regions (North, West, South, East, South-West and North-East), have only a negligible influence on the distributions of the enumerated population (about 0.2%). Therefore, these corrections were not made.
More serious seems to be the problem of the mis-reporting of ages of children, an error which would produce an underestimate for the age group 0-4 years and an overestimate for the age group 5-9 years (Bhat, 1995). Percentages of under- and over-estimation, estimated from smoothing of distributions, have been proposed at the all-India level; such estimates do not exist, however, for district and state levels (Dyson, 1976; Census of India, 1977; Chandra, 1980). These percentages are reported to be almost invariant between censuses (Bhat, 1998). From these percentages, application of a constant correction factor to the number of children aged 0-4 years enumerated in the districts does not modify the relative distribution of the number among the districts. The geographical pattern of fertility, with or without correction, remains unchanged. For lack of local information, we have therefore not attempted to correct this under-enumeration.

Probability of Deaths and Age-Specific Fertility Rates

The calculation of fertility indices also rests on the probability of deaths and on the age-specific fertility rates. For the probabilities of death, we used the two following categories of information:
• for the censuses 1981 and 1991, the death probability for children 0-4 years at the district level (Registrar General of India, 1988, 1997);
• for the periods 1970-75, 1976-80, 1981-85 and 1986-90, life tables at the level of 15 States for the first two periods and 17 States for the last two, as well as at the all-India level for all four periods, but with no life tables at district level (Registrar General, India, 1989, 1994).

Three usual problems of missing data thus arise: to estimate (i) mortality at district level when information is available only at State level; (ii) mortality of particular States and Union Territories for which no data are available; (iii) mortality over the period 1961-70 for all States and Union Territories. The standard approach was used: (i) when the mortality of a lower level of geographical aggregation (the district) was not available, one applied the mortality of the higher aggregation level; thus we assigned to the district the life table (considered as averaged) of the State to which it belongs, and to particular States the life table (considered as averaged) of India; (ii) one applied the mortality of the period 1970-75 to the previous period 1961-71. This method of filling in the missing data tends to homogenise mortality as one moves back in time, and consequently blurs the local differences in fertility in the earlier censuses, but one can hardly find an alternative.

With regard to the co-variables, they are those available in the censuses of 1981 and 1991. They may be classified under three categories:
• demographic: population density (per km²), urban population, infant mortality;
• cultural: average household size, which represents the number of members in a nuclear family, literacy rate and religious distribution (Hindu, Muslim and Christian) of the population;
• socioeconomic: work participation.

The frequencies in the various categories involving large numbers (hundreds of thousands of individuals) were converted into percentages, for which the standard error of estimation is negligible.

A Fertility Estimator at the Localities

To take account of the space-time variation of mortality of children and women and the difference between the relative fertility of women across age groups, an index of fertility was estimated (see Appendix A).
The parameters are:
• f(x, T): age-specific fertility rate, from age class x to age class x + 4 (x = 15, ..., 45);
• P(0, X, T): enumerated children aged 0-4 years, at locality X;
• P(x, X, T): enumerated women belonging to age class x to age class x + 4;
• q(x, T): the death probability for women;
• q(0, X, T): the death probability for children belonging to the age class 0-4 years.

The starting point of a period, the fifth year preceding a census, is set to t = 0. The end of the period, corresponding to the census, is set to T = 5. The estimator at the locality X, at time T, is:

$$\lambda(X,T) = \frac{P(0,X,T)}{\displaystyle\sum_{x} \frac{f(x,T)\,P(x,X,T)}{q(0,X,T)-q(x,X,T)}\left(1-e^{\,T\,[q(x,X,T)-q(0,X,T)]}\right)} \qquad (1)$$

One can show that the expectation of λ(X, T) is E(λ(X, T)) = 1 (see Appendix B). It is thus an estimator without dimension. Within the same census T, the point-values account rightly for the hierarchy of fertility between localities. But it is hardly possible to compare these values between the censuses because the expectation of the difference of λ(X, T) between two censuses, at T and T', is zero: E(λ(X, T)) - E(λ(X, T')) = 1 - 1 = 0. To compare the values between the censuses, we may either take the estimated fertility for a single age class considered as reference, or calculate λ(X, T) by taking a standard set of age-specific fertility rates f(x, T) common to all the censuses. In this last case we find, in the absence of mortality (q(x, X, T) - q(0, X, T) → 0), an estimator identical to the Ig of Coale and Treadway (1986), if the children aged 0-4 years in the numerator are replaced by those born during the year before the census:

$$\lambda(X,T) = \frac{P(0,X,T)}{T\displaystyle\sum_{x} f(x,T)\,P(x,X,T)} \qquad (2)$$

In the remaining part of the paper, we follow the second approach and use the fertility estimator λ(X, T), which we call simply λ(X). The age-specific fertility rates introduced into λ(X) were those for the all-India level for the year 1971, taken as standard reference.
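As a rough illustration of how such an index could be computed for one locality, the Python sketch below divides the enumerated children by the children expected from a standard fertility schedule. It assumes that each age class is weighted by (1 - e^{-T d})/d, with d = q(0) - q(x), a weight that tends to T when the mortality difference vanishes. The function name, the dictionary layout, the treatment of q as annual rates and every numerical value are illustrative assumptions, not material from the paper.

```python
import math

def fertility_index(children_0_4, women_by_age, std_rates,
                    q_child=None, q_women=None, T=5.0):
    """Ratio of enumerated children to children expected from std_rates.

    women_by_age and std_rates are dicts keyed by the lower bound of the
    five-year age class (15, 20, ..., 45). q_child and q_women are
    hypothetical annual mortality rates; if omitted, mortality is ignored.
    """
    expected = 0.0
    for x, women in women_by_age.items():
        if q_child is None or q_women is None:
            weight = T  # no mortality adjustment
        else:
            d = q_child - q_women[x]
            # (1 - exp(-T*d)) / d tends to T as d tends to 0
            weight = T if abs(d) < 1e-12 else (1.0 - math.exp(-T * d)) / d
        expected += std_rates[x] * women * weight
    return children_0_4 / expected

# Hypothetical standard schedule (annual births per woman) and one district.
rates = {15: 0.10, 20: 0.25, 25: 0.25, 30: 0.20, 35: 0.12, 40: 0.06, 45: 0.02}
women = {x: 1000 for x in rates}

lam_plain = fertility_index(3000, women, rates)
lam_adjusted = fertility_index(3000, women, rates, q_child=0.03,
                               q_women={x: 0.005 for x in rates})
print(round(lam_plain, 3), round(lam_adjusted, 3))
```

When child mortality exceeds that of women, the adjusted weight is smaller than T, so the same child count yields a higher index than the unadjusted ratio.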
At the time of the census, on average, the women of the age class x (to the age x + a) were in the preceding age class, x - a - 1, at the birth of the enumerated children aged a years. To take account of the effective fertility of these women during the period preceding the census, the age varying linearly, one considered that such was also the case for the fertility between two successive age classes and that the births were distributed uniformly over the period. Thus, the effective fertility of the women of age class x is roughly, during the period, the average of the fertility of the age class x and that of the age class x - a - 1: $\bar{f}(x) = \tfrac{1}{2}\,[f(x) + f(x - a - 1)]$.

Estimated Fertility at the Localities and the Censuses

The source data suffer from many uncertainties, which have undergone thorough review (Bhat et al., 1984; Rele, 1987; for a summary, see Bhat 1996, 1998). The correlation coefficients between the fertility index λ(X) for 1981 and 1991 and the total fertility rate (TFR) provided by Bhat (1996) for 1974-80 and 1984-90, estimated with the districts common to the study of Bhat (1996) and to ours, are respectively 0.924 (n = 269) and 0.957 (n = 295). Thus, although their measuring units differ, the two fertility indices, TFR and λ(X), provide closely correlated information. It is not easy to attribute their difference to any single factor, given the anomalies present in the source data in terms of the quantity of information and its integration by the estimators. The relationship of λ(X) to TFR, as estimated by Bhat (1996), may be expressed by a simple linear regression of the form TFR = a + b λ(X) + e, with a and b being coefficients and e a residual (for 1981: a = -0.9533, b = 7.2720, e = ± 0.4053, r² = 0.882; for 1991: a = -0.9084, b = 7.1628, e = ± 0.2595, r² = 0.953).
It is found, roughly, by taking the average of the two regressions, that TFR = 2 children gives λ(X) = 0.39, and that TFR = 6 children gives λ(X) = 0.94.

TECHNIQUES OF SPATIAL ANALYSIS

The main techniques of spatial analysis used are summarised below.

Quasi-continuous Estimate of Fertility on the Map: Kriging

The map of India is contained in spatial fields subdivided into 130 x 121 pixels, with a Lambert projection, to take into account the relative size of the areas when the curved surface of the Earth is projected on to a map. This is important when one is interested in the analysis of the distributions of demographic variables. The simple geographical projection which corresponds to a Cartesian coordinate system does not provide for such an understanding. For each period, from the observed λ(X) values at district level, the reconstitution of the surface of the fertility index was obtained by using the procedure of spatial interpolation called kriging. Briefly, let us recall that the kriging procedure is the transposition to a spatial context of multiple linear regression, by considering that the data are outcomes of identically distributed correlated random variables. Essentially, the estimates rest on the variation of the information redundancy with the geographical distance. The measurement of this spatial redundancy is expressed in the form of a variogram model, fitted to the experimental variogram which is calculated from the data. The variogram model is injected into the regression, from the data points, on any point of the studied geographical space. As an interpolator based on a statistical model, the kriging technique is the best linear unbiased estimator (BLUE) (for a summary, see Wackernagel, 1998). This procedure has already been used in demography to detect and represent the spatial diffusion of contraception in Victorian Britain from the data of the European fertility project (Bocquet-Appel and Jakobi, 1998).
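To make the regression analogy concrete, the following Python sketch shows one common textbook form of the technique: ordinary kriging with a spherical variogram model and a search radius. It is not the authors' software. The function names, the toy coordinates and values, and the reading of the sill as the total plateau of the variogram are assumptions made for illustration; the nugget, sill and range passed in the example are the 1961 values listed in Table 1.

```python
import numpy as np

def spherical(h, nugget, sill, rng):
    """Spherical variogram; `sill` is taken as the total plateau (assumption)."""
    h = np.asarray(h, dtype=float)
    r = np.minimum(h / rng, 1.0)
    g = nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)
    return np.where(h == 0.0, 0.0, g)  # the variogram is zero at zero distance

def ordinary_krige(xy, z, target, nugget, sill, rng, radius=500.0):
    """Estimate the value at `target` from data closer than `radius` (km)."""
    xy = np.asarray(xy, dtype=float)
    z = np.asarray(z, dtype=float)
    target = np.asarray(target, dtype=float)
    d0 = np.linalg.norm(xy - target, axis=1)
    keep = d0 < radius
    if not keep.any():
        raise ValueError("no data point inside the search radius")
    pts, vals, d0 = xy[keep], z[keep], d0[keep]
    n = len(vals)
    dists = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    # Kriging system: variogram between data points, plus the Lagrange row
    # and column that force the weights to sum to one (unbiasedness).
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = spherical(dists, nugget, sill, rng)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = spherical(d0, nugget, sill, rng)
    w = np.linalg.solve(A, b)[:n]
    return float(w @ vals), w

# Hypothetical district capitals (km, projected) and index values.
xy = [[0, 0], [100, 0], [0, 100], [300, 300], [900, 900]]
z = [0.90, 0.80, 0.85, 0.60, 0.40]
estimate, weights = ordinary_krige(xy, z, [50.0, 50.0], 0.005, 0.030, 2100.0)
print(estimate, weights.sum())
```

The last data point lies beyond the 500 km radius and is ignored, and the weights of the four retained points sum to one, which is what makes the estimator unbiased under the model.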
The observed and modelled variograms of λ(X) used in the kriging procedure of spatial interpolation are represented in Fig. 1. The values of the parameters of the models are given in Table 1 and the maps are given in Plate 1. To incorporate the local variation to the extent possible, the estimates provided by the kriging procedure were obtained from the localities at a distance of < 500 km from the point of estimation. At this distance all the variograms show that they are located in a zone of homogeneity of fertility, characterised by a reasonably important spatial redundancy. In fact, kriging in single neighbourhoods gives very satisfactory maps.

Plate 1. Spatial distribution of the fertility index λ(X) in the four censuses.

Plate 2. Observed variogram surface of the fertility index λ(X) for the 1961 and 1991 censuses.

Figure 1. Observed and modelled variogram of the fertility index λ(X) in the four censuses, 1961, 1971, 1981 and 1991.
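For readers unfamiliar with the experimental variogram, the sketch below computes it with the classical estimator, half the mean squared difference between pairs of points grouped by distance class. The paper does not state which estimator or binning was used, so both are assumptions here, as are the function name and the toy transect.

```python
import numpy as np

def empirical_variogram(xy, z, bin_edges):
    """Half the mean squared difference of z over point pairs, per distance bin."""
    xy = np.asarray(xy, dtype=float)
    z = np.asarray(z, dtype=float)
    i, j = np.triu_indices(len(z), k=1)        # every unordered pair once
    h = np.linalg.norm(xy[i] - xy[j], axis=1)  # pair separation distances
    sq = (z[i] - z[j]) ** 2
    n_bins = len(bin_edges) - 1
    gamma = np.full(n_bins, np.nan)            # NaN marks empty bins
    counts = np.zeros(n_bins, dtype=int)
    for k in range(n_bins):
        in_bin = (h >= bin_edges[k]) & (h < bin_edges[k + 1])
        counts[k] = in_bin.sum()
        if counts[k] > 0:
            gamma[k] = 0.5 * sq[in_bin].mean()
    return gamma, counts

# Toy transect: four points one unit apart with a linear trend in z.
gamma, counts = empirical_variogram(
    [[0, 0], [1, 0], [2, 0], [3, 0]], [0.0, 1.0, 2.0, 3.0],
    bin_edges=[0.5, 1.5, 2.5, 3.5])
print(gamma, counts)
```

A model such as the spherical one would then be fitted to the resulting points up to the fitting distance.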
Table 1. Parameter values of the modelled variograms to estimate the fertility index λ(X), from 1961 to 1991, by kriging.

Parameters               1961       1971       1981       1991
Model                    Spherical  Spherical  Spherical  Spherical
Nugget*                  0.005      0.008      0.008      0.013
Sill**                   0.030      0.031      0.028      0.058
Fitting distance***      2100       900        1100       1900
Interpolation distance   500        500        500        500

* Variance at nil distance. ** Maximum variance at the fitting distance. *** Distance in km.

Detecting the Homogeneity Areas and Zones of Abrupt Change: Edge Enhancement Filter and Wombling

From the maps represented in continuous form (Plate 1) we sought, on the one hand, to reduce the data while emphasizing the regional areas of homogeneity, and on the other hand, to detect the zones of major change at the scale of the Indian subcontinent. To emphasize the regional areas of homogeneity on fertility surfaces, we have (i) applied the edge enhancement filter (Marr and Hildreth, 1980), and (ii) regrouped the filtered values in class groups at steps of 0.1. Briefly, this filter consists of withdrawing from the pixel value the Laplacian computed from the eight surrounding pixels in fields of 3 x 3. Its effect on the map is to increase the intensity of contrasts between contiguous areas of homogeneity. The map appears then in the form of homogeneity areas, of which the arbitrary number depends on the amplitude of the state variable and the increment of the value classes. To highlight the zones of major change at the subcontinental scale, the wombling procedure of detection of the zones of abrupt change (also called 'barriers'; Barbujani et al., 1989; Bocquet-Appel and Bacro, 1994) was used.
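Steps (i) and (ii) of the edge enhancement described above can be sketched as follows in Python. This is a simplified reading: the 8-neighbour Laplacian is subtracted from each pixel and the result is binned in steps of 0.1, without the Gaussian smoothing associated with the Marr and Hildreth operator. The function names, the border handling by edge replication and the toy surface are assumptions.

```python
import numpy as np

def laplacian8(img):
    """Sum of the eight neighbours minus eight times the centre pixel."""
    img = np.asarray(img, dtype=float)
    rows, cols = img.shape
    p = np.pad(img, 1, mode="edge")  # replicate border pixels (assumption)
    neighbours = sum(
        p[1 + di:1 + di + rows, 1 + dj:1 + dj + cols]
        for di in (-1, 0, 1) for dj in (-1, 0, 1) if (di, dj) != (0, 0))
    return neighbours - 8.0 * img

def edge_enhance(img):
    """Subtract the Laplacian from each pixel to sharpen contrasts."""
    img = np.asarray(img, dtype=float)
    return img - laplacian8(img)

def to_classes(values, step=0.1):
    """Integer class index for bins of width `step`."""
    return np.floor(np.asarray(values, dtype=float) / step + 1e-9).astype(int)

# Toy surface: a low-fertility block (0.4) next to a high-fertility block (0.8).
surface = np.full((4, 6), 0.4)
surface[:, 3:] = 0.8
enhanced = edge_enhance(surface)
print(enhanced[0])
print(to_classes([0.39, 0.40, 0.94]))
```

Inside each block the values are unchanged, while the two columns on either side of the step are pushed apart, which is the contrast-sharpening effect the text describes.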
Briefly, the wombling procedure evaluates the gradient at all points of the surface variable, which represents the rate of change between contiguous pixels, and selects the pairs of contiguous pixels whose gradient values (the rate of change) are at a maximum relative to the gradient distribution on the surface. This technique has already been used in demography to detect the wave of the fertility transition in Europe (Bocquet-Appel and Jakobi, 1996).

DATA ANALYSIS

Univariate Distribution of the Fertility Index λ(X) and the Fertility Transition

Theoretically, on a horizontal axis representing the range of the possible values of a fertility index, one must expect that the pre-transition distribution (with the vertical axis as the frequency and the horizontal axis the index) of the local values of this index is located in the upper part of the values, and that the post-transition distribution lies in the lower part. The transition must be expressed by a displacement from the upper towards the lower part of the distribution (Bhat, 1996; Bocquet-Appel and Jakobi, 1997). If the geographical area of the generating variables of fertility (primarily socioeconomic and anthropological) is homogeneous, the pre-transition distribution should be unimodal and regular. If this area is not homogeneous, the distribution could be bimodal or even multi-modal, each mode corresponding to a regrouping of localities presenting some regional differences. The post-transition distribution, whose values of the index fluctuate around or below the replacement level of the generations, should be unimodal after a flattening of the regional heterogeneity during the transition.

The histograms of the λ(X) values of the four censuses are represented in Fig. 2 and the main parameters of the distributions in Table 2. The year 1961 exhibits indications of bimodality. As indicated above, the years 1961 and 1971 have approximations of missing values which could have influenced the distributions.
But the bimodal aspect of a distribution, such as that of 1961, in a space as geographically and culturally heterogeneous as the Indian subcontinent, is not necessarily the mark of an anomaly in the data. It could be an indication of the presence of two great zones of homogeneity. Such a bimodality, if it is not an artefact, must disappear with the transition, which plays a homogenising role. The distribution in 1981 appears regular, as already noted by Bhat (1996), and without any noticeable asymmetry. It moves towards the low fertility values. The distribution for 1991 is asymmetrical on the left, towards the low values of the fertility index. It clearly represents a distribution in the course of transition. From 1961 to 1991, at each census, the proportions of the districts having a low fertility index (λ(X) < 0.39, i.e. TFR < 2) and of those having a high fertility index (λ(X) ≥ 0.94, i.e. TFR ≥ 6), indicated on the histograms, account for 0% and 72%, 0% and 71.8%, 0.3% and 21.1%, and 3.1% and 9.1% of the distributions, respectively. Since 1981 (in fact, from 1976-80), one can observe a substantial fertility reduction in the districts.

The whole of the transition kinetics in progress is well represented by the kernel distributions of the λ(X) values in the four censuses (Fig. 3). In 1971 (in fact, from 1966-70), the distribution of λ(X) appears unimodal and regular. Perhaps this distribution was bimodal in 1961 (1956-60). But, if the distribution in 1971 is correct, then the absorption speed of the bimodality of 1961, which corresponds to an important demographic change, could have been quite rapid. Then, after 1971, one can observe the predicted shift of the distribution of λ(X) towards lower values. Another possible approach to measuring the

|0.5|.
Variables: LR_T, LR_F, LR_M, LR_RT, LR_UT are the literacy rates for the total, for females, males, rural and urban, respectively; GROUP_FEC represents the homogeneity area of the fertility index λ(X); INFANT_MORT is Q(5); LAT, LONG and LOLA, the latitude, longitude and their product; WPR_T, WPR_F, WPR_M are the worker participation rates in total, and for the females and males, respectively; AHS_TOTAL represents the average household size; ST_PCT and SC_PCT are the percentages of Scheduled Tribes and Castes, respectively; MUSLIM_PCT, HINDU_PCT, CHRIST_PCT are the percentages of Muslim, Hindu and Christian religions, respectively; POP_DENSITY is the density of the population; URBAN_PCT is the percentage of urban population in the district.

Table 5. PCA of the fertility variables and their correlations (rotated loading matrix) with the co-variables after varimax rotation (1991).

Variable                         1        2        3        4
LR_T                          -0.959    0.011    0.056   -0.082
LR_F                          -0.939    0.059    0.167   -0.138
LR_M                          -0.939   -0.028   -0.083    0.010
LR_RT                         -0.886   -0.084    0.045    0.299
LAMBDA_91                      0.837   -0.272    0.019    0.222
GROUP_FEC                      0.828   -0.256    0.021    0.233
INFANT_MORT                    0.743    0.147   -0.090    0.215
LR_UT                         -0.729    0.071    0.055    0.091
LAT                            0.568   -0.478    0.083    0.249
LOLA                           0.566   -0.509    0.284    0.212
WPR_T                          0.050    0.854    0.142    0.423
WPR_M                          0.109    0.825   -0.117   -0.142
WPR_F                         -0.018    0.718    0.203    0.547
AHS_TOTAL                      0.443   -0.693   -0.116   -0.013
HINDU_PCT                      0.074    0.101   -0.881    0.115
CHRIST_PCT                    -0.235   -0.021    0.847    0.124
ST_PCT                         0.169    0.316    0.807    0.275
LONG                           0.214   -0.282    0.687   -0.038
SC_PCT                         0.148   -0.274   -0.634    0.027
POP_DENS                      -0.105    0.058    0.076   -0.736
URBAN_PCT                     -0.383    0.076   -0.096   -0.646
MUSLIM_PCT                     0.097   -0.303   -0.057   -0.566
Variance explained by
rotated components             7.118    3.437    3.266    2.264
% total variance explained    32.355   15.621   14.844   10.292

Note: Numbers in boldface are correlations >|0.5|. Variables as in Table 4.
homogeneity areas of the maps, labelled GROUP_FEC), we find that they are positively correlated with infant mortality and negatively with all the variables expressing the literacy rate (labelled LR_T, LR_F, LR_M, LR_RT, LR_UT: see Tables 4 and 5 for the variable labelling definitions). Fertility is also positively correlated with latitude. The three other factorial axes, which represent mixtures of socioeconomic and cultural information, are without apparent link to fertility. The correlations of these axes will therefore not be commented upon.

Barriers and Co-variables

The oldest co-variables that we have are those for the 1981 census. Thus, the relationship between what appears as the vestiges, observed in 1981, of the barrier of the pre-transition fertility 1961-71 and the co-variables of the 1981 census was analysed. In order to capture the effect of the pre-transition barrier only, and not of the new zones of change which appear subsequently on the map, the four transverse fragments located roughly on the barrier zone of 1961 were retained, excluding the easternmost fragment, located in East India, where the co-variables are missing. Then, the districts located up to a maximum distance of 100 km from the fragment limits of the pre-transition barrier were categorised into two groups. The north-south limit of the fragments was determined relative to the longest axis (longitudinal) of each fragment (see Fig. 5, 1981). The first group includes the districts located north of the barrier, and the second those located to the south. The two central fragments present a difficulty: they are almost parallel, so that the districts located between these two fragments are simultaneously south of one fragment and north of the other. They can be classified in both groups.
Thus, the districts located between these two fragments were eliminated, to retain only those located north of the northern fragment (north of the barrier), and those located south of the southern fragment (south of the barrier). The districts for which the co-variables were missing were also eliminated. The list of the districts retained (n = 28) is given in Table 6.

Table 6. List of districts located north and south of the fertility barrier in 1981.

Fragment   North                                      South
West       Barmer, Jalor, Pali, Udaipur, Dungarpur    Gandhinagar, Kheda, Ahmadabad, Surendranagar, Rajkot
Centre     Sagar, Damoh, Satna, Rewa                  Akola, Amravati, Yavatmal, Wardha, Nagpur, Bhandara
East       Hazaribagh, Giridih, Murshidabad           Ranchi, Purulia, Bankura, Medinipur, Haora, Calcutta

Note: see text for explanation.

Figure 6. Regression trees obtained by 'automatic interaction detection' (classic AID) of the 17 socioeconomic and cultural co-variables, sampled at 28 districts, located north and south of the zone of abrupt change of the fertility index λ(X) in 1961. Note: to the left are the districts located south of the zone of abrupt change, to the right those located to the north. The threshold criterion <24.2 of the LR_F variable (female literacy rate) correctly classifies the districts between the north and south of the zone, with one exception. The other variables do not carry additional information.

To observe the behaviour of the co-variables on both sides of the barrier, by taking account of the information redundancy between variables, the procedure of regression trees called 'automatic interaction detection' (classic AID tree: see Breiman et al., 1984) was used. Briefly, at each split, districts are classified using a loss function which is least squares, in which the within-group sum of squares about the group mean is as small as possible.
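The least-squares splitting rule can be sketched in a few lines of Python. The sketch searches, for each co-variable, the cut point that minimises the within-group sum of squares of a north/south indicator, and keeps the best co-variable. It only illustrates a single split, not the AID software used in the study. The function names and all data values are hypothetical, chosen so that the female literacy rate separates the groups.

```python
def sum_sq(values):
    """Sum of squared deviations about the group mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(predictor, response):
    """Return (within-group SSE, cut point) of the best binary split."""
    pairs = sorted(zip(predictor, response))
    best = None
    for k in range(1, len(pairs)):
        if pairs[k][0] == pairs[k - 1][0]:
            continue  # no cut point between tied predictor values
        left = [r for _, r in pairs[:k]]
        right = [r for _, r in pairs[k:]]
        sse = sum_sq(left) + sum_sq(right)
        cut = 0.5 * (pairs[k - 1][0] + pairs[k][0])
        if best is None or sse < best[0]:
            best = (sse, cut)
    return best

def best_predictor(covariables, response):
    """Pick the co-variable whose best split has the smallest SSE."""
    results = {name: best_split(vals, response)
               for name, vals in covariables.items()}
    results = {k: v for k, v in results.items() if v is not None}
    name = min(results, key=lambda k: results[k][0])
    return name, results[name]

# Hypothetical districts: 1 = north of the barrier, 0 = south.
north = [1, 1, 1, 1, 0, 0, 0, 0]
covariables = {
    "LR_F": [12.0, 15.5, 18.0, 22.0, 26.5, 30.0, 35.0, 41.0],
    "POP_DENSITY": [150, 300, 90, 400, 120, 350, 200, 80],
}
name, (sse, cut) = best_predictor(covariables, north)
print(name, sse, cut)
```

With these made-up values the literacy variable yields a split with zero residual sum of squares, while the density variable cannot separate the two groups.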
For the actual split of the districts between those above (north) and below (south) the barrier, we chose the predictor and cut point which yielded the smallest overall within-cluster sum of squares. Only one co-variable makes it possible to subdivide the district sample correctly, LR_F (female literacy rate), with values < 24.2% giving districts north of the barrier, in the high fertility region, and values ≥ 24.2% giving districts to the south, in the low fertility region. Using this criterion, only one district is wrongly located (in the north; see Fig. 6). This classification criterion of the districts on the fertility barrier, although close to the result of the usual ecological correlation analyses of the co-variables, is noteworthy. It makes it possible to underline, from the subcontinental to the regional geographical scale, the remarkable coincidence of results obtained by the two independent approaches. One is the correlation between the co-variables, at the national scale, which exhibits the well-known negative relationship between literacy rate and fertility. The other is the selection of a sample of districts, on the criterion of their locations on each side of the fertility barrier as detected by wombling; this barrier also corresponds to a major regional change for the female literacy rate.

CONCLUSION

Compared with other fertility indices, the λ(X) we have used takes mortality into account. If this information is missing, λ(X) is then identical to the Ig index. The comparison of the pairs of simple isotropic variograms shows the expected temporal increase in spatial variance of fertility with geographical distance, but not between 1971 and 1981. There is an anomaly of variance between the censuses in 1971 and 1981, towards an excess in 1971 or a deficit in 1981, but perhaps not an anomaly in the expected shape of these variograms.
The directional variograms, as well as the comparison of the homogeneity areas of the fertility surfaces, show that the transition takes place by an increase in deviation from the southwestern regions, which correspond to regions of relatively low fertility. The expansion of these regions reduces the homogeneity areas, historically at high fertility levels, localised obliquely in the centre of India, in such a way that the original transverse direction of this area, SW-NE, became longitudinal by 1991, and its size was cut by a half. The expansion of the areas of moderate or low fertility (λ < 0.6), resulting in a reduction of the high fertility areas, is observed mainly in the southern and coastal regions of the subcontinent. A study of the geographical variation of fertility and its change, using the total fertility rate (TFR) variable, recently carried out by Bhat (1996) for the years 1981 and 1991, indicated that the fertility pattern strongly differed between the north and the south of the country, and is 'reminiscent of the Indian monsoon'. Its stability was maintained over the period.

Figure 7. Map of Indian States.

Could the barriers detected by the wombling technique be due to an artefact produced by the filling procedure of the missing data, in particular the use of the life tables of the states? The use of this information might contribute to the generation of patches of homogeneity corresponding to the limits of the states, and thus also to zones of abrupt changes among the states. If this is in fact the case, the barriers detected at the state frontiers could be the outcome of this statistical artefact.
On the other hand, the inter-state frontiers generally denote the limits of several cultural and socioeconomic variables which could exert a demographic influence. To check the assumption of a possible statistical artefact, the long barrier of 1961, which is the best candidate, was examined. A statistical test for the states concerned, which takes account of the cross-distributions of the districts in and out of the barriers, and along and out of the frontier zones, is not easy to construct, in particular because of the geographical configuration of certain states (for example, Assam, where 19 districts out of 23 are frontier districts). By visual inspection, is the long barrier of 1961 systematically more localised along the frontier zones of states than within states? Its western border is on the frontier zone separating, longitudinally, the state of Maharashtra to the south, and the states of Gujarat and Madhya Pradesh to the north. For each of these states, life tables exist. Nevertheless, this barrier continues transversely within the state of Madhya Pradesh, as far as the district of Surguja, reaching across the eastern border of Madhya Pradesh into Bihar. Moreover, the eastern pillar of this barrier (excluding East India) is entirely within the state of West Bengal, not touching any frontier district in the rest of India. One does not perceive an obvious influence of a data artefact on the identified barriers.

From 1961 to 1991, one also observes the progressive disappearance of the long transverse zone of abrupt change of fertility, dividing the subcontinent into two vast regions, one in the north with high fertility, and the other in the south with lower fertility. In 1961, the western part of this zone of abrupt change corresponded to a geographical obstacle: the southern bank of the Narmada river, following the Satpura Ranges to eastern Madhya Pradesh.
The available co-variables have shown their usual ecological correlation with fertility, in particular the literacy rate, globally on an all-India scale, and regionally at the district level, for the districts located directly on both sides of this zone of abrupt change. If this barrier of fertility is pre-transitional - i.e. had been in existence historically for a long period - one may wonder, beyond the nominal co-variable 'literacy rate', which other anthropological variables, captured today under the rubric 'literacy rate' and contributing to an increase in fertility in the north and a decrease in the south, have been in operation. In-depth investigations in selected districts concerning the anthropological factors that influence fertility, independently of the recent outcomes of official policies and westernisation (market economy, mass media), could provide meaningful answers (for example, in the district of Barmer, Rajasthan, $\lambda = 1.1$, and in that of Ahmadabad, Gujarat, $\lambda = 0.7$, although they are separated from each other by only about 300 km). These districts are among those for which the variance of fertility is at a maximum and the geographical distance at a minimum. It should be the same for the underlying anthropological factors.

ACKNOWLEDGEMENTS

Christophe Guilmoto, French Institute of Pondicherry, has drawn our attention to the Indian transition and has provided the geographical coordinates of the districts. This project has received financial support from the University Grants Commission (New Delhi), CNRS (PSIG 98, Scientific Department of SHS), and the Indo-French programme of Cooperation in Social Sciences (responsible: Jean-Luc Racine, Maison des Sciences de l'Homme, Paris). The constructive comments of an anonymous referee are also acknowledged.
APPENDIX A

A Fertility Estimator at the Localities

In order to take account of the space-time variation of the mortality of children and women, and of the differences in the relative fertility of women across age groups, an index of fertility was estimated under the following assumptions. The migrant population does not live separated from its children (the children are enumerated at the same time as their parents). The only consequence of migration is that births which took place in the locality of origin are assigned to the locality of destination, inducing a (very weak) loss of contrast. Because of the Indian demographic regime, there is little distinction between marital and general fertility. The fertility rate of the age class from age $x$ to age $x + a$, at locality $X$, written $\tau(x,X,t)$, may be represented as the product of a term which takes into account the profile of the age effect, $f(x,t)$, and a geographical factor, $\lambda(X,t)$. Thus, under this assumption, we have:

$$\tau(x,X,t) = f(x,t)\,\lambda(X,t)$$

In this model, the profile of the age effect is thus, at a given time, the same in all the localities; only the fertility intensity varies spatially. In this form, it is clear that the allocation of fertility between the terms $f$ and $\lambda$ is arbitrary. A natural choice consists in taking for $f$ the average fertility over the country - the age-specific fertility rate - so that the spatial average of $\tau(x,X,t)$ equals $f(x,t)$. With this choice, the spatial average of $\lambda(X,t)$ is 1 at every time $t$. Let us stress that only the profile of $f$, and not its level, exerts an influence on the estimated fertility $\tau$. That means that a multiplicative error in the level of $f$ would be automatically compensated for by an opposite modification of the geographical coefficient, producing correct values of fertility.
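The normalisation of the geographical factor and the compensation of a multiplicative error in the level of the age profile can be illustrated on a toy example. The sketch below is not the authors' computation: the rates, the number of age classes and localities, the function name `split_rates` and the use of an unweighted mean over localities as the "spatial average" are all assumptions made for illustration.

```python
# Illustrative sketch only: all numbers are made up, and the unweighted mean
# over localities is an assumed reading of the "spatial average".

def split_rates(tau):
    """Split tau[x][X] into an age profile f[x] and a spatial factor lam[X].

    f[x] is the unweighted mean of tau over localities. lam[X] is read off
    from the first age class, which suffices when the product model holds
    exactly (every age class would give the same ratio).
    """
    n_loc = len(tau[0])
    f = [sum(row) / n_loc for row in tau]
    lam = [tau[0][j] / f[0] for j in range(n_loc)]
    return f, lam


# Hypothetical exact product model: 3 age classes, 4 localities.
f_true = [0.05, 0.20, 0.10]
lam_true = [1.3, 1.1, 0.9, 0.7]  # unweighted mean equals 1
tau = [[fx * l for l in lam_true] for fx in f_true]

f, lam = split_rates(tau)
mean_lam = sum(lam) / len(lam)  # close to 1 by construction

# A multiplicative error c in the level of f is absorbed by lam / c,
# so the rebuilt fertility rates are unchanged.
c = 2.5
f_scaled = [c * fx for fx in f]
lam_scaled = [l / c for l in lam]
tau_rebuilt = [[fx * l for l in lam_scaled] for fx in f_scaled]
```

Only the product of the two terms is identified by the rates, which is why the level of the profile can be fixed by convention without affecting the fertility values.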
The geostatistical approach consists of building a stochastic model in which the observed variables (i.e. the frequencies) are interpreted as a particular outcome of this model (Matheron, 1989). In this model, the geographical factor $\lambda$ is regarded as a random function. We begin with a given locality, so that the spatial dimension does not intervene. Let us assign each individual to a cohort, according to his/her age at the time of the census. An individual who died before this date is assigned according to the age which he/she would have had if he/she had survived. Let $P(0,X,t)$, $P(x,X,t)$, $q(x,X,t)$ and $q(0,X,t)$ be the random variables representing, respectively, the enumerated children of the local population aged 0-4 years, the enumerated women belonging to the age class from age $x$ to age $x + a$, the instantaneous probability of death for these women, and the probability of death of the children belonging to the age class 0-4 years, at time $t$. The starting point of a period, the fifth year preceding a census, is set to $t = 0$. The end of the period, corresponding to the census, is set to $t = 5$. The evolution of the female population and that of the children is given by:

$$E[dP(x,X,t) \mid F(t)] = -q(x,X,t)\,P(x,X,t)\,dt \qquad (x = 15, 20, \ldots, 45)$$

$$E[dP(0,X,t) \mid F(t)] = \sum_x \tau(x,X,t)\,P(x,X,t)\,dt - q(0,X,t)\,P(0,X,t)\,dt$$

where $F(t)$ is the history of the process until time $t$. To introduce mortality, we write $\bar{P}(x,X,t) = E[P(x,X,t)]$. From the first equation we deduce, after lifting the conditioning, that:

$$d\bar{P}(x,X,t) = -q(x,X,t)\,\bar{P}(x,X,t)\,dt$$

The coefficients $\tau(x,X,t)$ and $q(x,X,t)$ may vary over time under the effect, for example, of population ageing and the improvement of living conditions. To simplify the solving of the equation, we will admit that the use of average coefficients (at the mid-date of the period considered, and for the age of the population concerned at this date) gives a good approximation of the solution.
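The behaviour of the two mean equations under constant coefficients can be explored numerically. The following is a minimal sketch, assuming hypothetical cohort sizes and rates and a simple explicit Euler scheme; the function name `integrate_means` and all numerical values are illustrative and do not come from the census data.

```python
import math


def integrate_means(p_women0, tau, q_women, q_child, t_end=5.0, steps=50000):
    """Explicit Euler integration of the mean equations, constant coefficients.

    p_women0[i]: mean number of women in age class i at t = 0
    tau[i], q_women[i]: fertility and death rates of class i (per year)
    q_child: death rate of children aged 0-4 (per year)
    Returns (mean women per class at t_end, mean children at t_end),
    with the children's population starting from zero.
    """
    dt = t_end / steps
    women = list(p_women0)
    children = 0.0
    for _ in range(steps):
        births = sum(t * w for t, w in zip(tau, women))
        # Children gain births and lose deaths; women only lose deaths.
        children += (births - q_child * children) * dt
        women = [w - q * w * dt for q, w in zip(q_women, women)]
    return women, children


# Hypothetical inputs: three age classes of women, one locality.
p0 = [1000.0, 1200.0, 900.0]
tau = [0.10, 0.25, 0.08]
qw = [0.004, 0.005, 0.007]
qc = 0.03

women_T, children_T = integrate_means(p0, tau, qw, qc)

# With a constant death rate, each women's cohort decays exponentially.
expected_women = [p * math.exp(-q * 5.0) for p, q in zip(p0, qw)]
```

The step count is chosen large enough that the Euler result for the women's cohorts agrees closely with the exponential decay; a higher-order scheme would serve equally well.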
Under this approximation, $\lambda(X,t)$, $q(x,X,t)$ and $f(x,t)$ are piecewise constant (constant over each period of integration). This approximation is perhaps somewhat less suitable for children under 5 years of age. Therefore the evolution of the parental population of age class $x$, on average, follows the law:

$$\bar{P}(x,X,t) = \bar{P}(x,X,0)\,e^{-q(x,X,t)\,t}$$

Let us now introduce fertility, by calculating the average of $P(0,X,t)$, which we denote $\bar{P}(0,X,t)$. According to the second differential equation given above, one obtains:

$$d\bar{P}(0,X,t) = \sum_x \tau(x,X,t)\,\bar{P}(x,X,t)\,dt - q(0,X,t)\,\bar{P}(0,X,t)\,dt$$

Taking account of the initial condition $\bar{P}(0,X,0) = 0$, the differential equation is easily solved, and the solution is:

$$\bar{P}(0,X,t) = \sum_x \frac{\tau(x,X,t)\,\bar{P}(x,X,t)}{q(0,X,t) - q(x,X,t)} \left(1 - e^{t\,[q(x,X,t) - q(0,X,t)]}\right)$$

Of course, this expression must be replaced by its limit if the two mortality coefficients $q(x,X,t)$ and $q(0,X,t)$ are equal, which is not the case in general, since the mortality of children is higher than that of women aged 15-49 years. Replacing $\tau(x,X,t)$ by $\lambda(X,t)\,f(x,t)$, as suggested by the model, and taking $P(0,X,T)$ as an estimator of its average, where $T$ corresponds to the end of the period, one obtains:

$$\hat{\lambda}(X,T) = \frac{P(0,X,T)}{\displaystyle\sum_x \frac{f(x,T)\,P(x,X,T)}{q(0,X,T) - q(x,X,T)} \left(1 - e^{T\,[q(x,X,T) - q(0,X,T)]}\right)}$$

This estimator takes into account the frequencies missing because of mortality during the period, by weighting the enumerated frequencies of the age classes by the corresponding probabilities of survival. When the probabilities of death are negligible ($q(x,X,T) - q(0,X,T) \to 0$), the natural estimator is recovered:

$$\hat{\lambda}(X,T) = \frac{P(0,X,T)}{T \sum_x f(x,T)\,P(x,X,T)}$$

APPENDIX B

The Expectancy of the Estimator for the Space Factor

Let us notice that by definition of the average fertility $f$ at the moment $t$:

[equation not recoverable from the source]

but

$$E[P(0,X,T) \mid P(x,X,T)] = T \sum_x \lambda(X,T)\,f(x,T)\,P(x,X,T)$$

where, conditionally on the function $\lambda$:

$$E[\hat{\lambda}(X,T) \mid \lambda] = \lambda(X,T)$$

It follows that the expectancy of the spatial average of the estimator $\hat{\lambda}(X,T)$ is also unity.

REFERENCES

Barbujani G, Oden N, Sokal RR. 1989. Detecting regions of abrupt change in maps of biological variables. Systematic Zoology 38: 376-389.
Bhat M. 1995. Age misreporting and its impact on adult mortality estimation in South Asia. Demography India 24: 59-80.
Bhat M. 1996. Contours of fertility decline in India: a district level study based on the 1991 census. In Population Policy and Reproductive Health, Srinivasan K (ed.). Hindustan Publ: New Delhi; 96-117.
Bhat M. 1998. Demographic estimates for post-independence India: A new integration. Demography India 27: 23-57.
Bhat M, Preston S, Dyson T. 1984. Vital Rates in India, 1961-1981. Report no. 24. Committee on Population and Demography. National Academy Press: Washington DC.
Bhat M, Zavier F. 1999. Findings of the National Family Health Survey: Regional analysis. Economic and Political Weekly XXXIV: 3008-3033.
Bocquet-Appel JP, Bacro JN. 1994. Generalized wombling. Systematic Biology 3: 316-329.
Bocquet-Appel JP, Jakobi L. 1996. Barriers to the spatial diffusion for the demographic transition in Western Europe. In Spatial Analysis of Biodemographic Data, Bocquet-Appel JP, Courgeau D, Pumain D (eds). Congresses and Colloquium 16. Eurotext, John Libbey/Ined: Montrouge; 117-129.
Bocquet-Appel JP, Jakobi L. 1997. Diffusion spatiale de la contraception en Grande-Bretagne, à l'origine de la transition. Population 4: 977-1004.
Bocquet-Appel JP, Jakobi L. 1998. Evidence for a spatial diffusion of contraception at the onset of the fertility transition in Victorian Britain. Population, Special Issue, New Methodological Approaches in the Social Sciences: 181-204.
Breiman L, Friedman J, Olshen R, Stone C. 1984. Classification and Regression Trees. CRC Press.
Census of India. 1977. Age Tables. Census of India 1971, Office of the Registrar General and Census Commission 3: New Delhi.
Census of India. 1982. Report on Post Enumeration Check. Office of the Registrar General and Census Commission 4: New Delhi.
Census of India. 1994. Report on Post Enumeration Check. Office of the Registrar General and Census Commission 1: New Delhi.
Chandra NK. 1980. Adjustment of age data for India's census population. Demography India IX: 274-285.
Coale AJ, Treadway R. 1986. A summary of the changing distribution of overall fertility, marital fertility, and the proportion married in the provinces of Europe. In The Decline of Fertility in Europe: The Revised Proceedings of a Conference on the Princeton European Fertility Project, Coale AJ, Watkins SC (eds). Princeton University Press: Princeton; 31-181.
Dyson T. 1976. Analysis and adjustment of the 1971 Indian age distribution, and a reappraisal of mortality and fertility estimates. Demography India 1: 71-92.
Dyson T, Murphy M. 1985. The onset of mortality transition. Population and Development Review 11: 399-440.
Marr D, Hildreth E. 1980. Theory of edge detection. Proceedings, Royal Society of London B207: 187-217.
Matheron G. 1989. Estimating and Choosing. Springer Verlag: Berlin.
Pannatier Y. 1996. Variowin, Software for Spatial Data Analysis in 2D. Springer Verlag: Berlin.
Registrar General, India. 1989. SRS Based Abridged Life Tables 1981-85. Occasional Paper No. 1 of 1989. Office of the Registrar General: New Delhi.
Registrar General, India. 1994. SRS Based Abridged Life Tables, 1986-90. Occasional Paper No. 1 of 1994. Office of the Registrar General: New Delhi.
Registrar General of India. 1988. Child Mortality Estimates of India. Occasional Paper No. 5 of 1988. Office of the Registrar General: New Delhi.
Registrar General of India. 1997. District Level Estimates of Fertility and Child Mortality for 1991 and their Interrelations with Other Variables. Occasional Paper No. 1 of 1997. Office of the Registrar General of India: New Delhi.
Rele JR. 1987. Fertility levels and trends in India, 1951-81. Population and Development Review 13: 513-530.
Visaria P. 1971. The provisional population totals of the 1971 census: some questions and research issues. Economic and Political Weekly 6: 1459-1465.
Visaria P, Visaria L. 1997. Demographic transition: Accelerating fertility decline in the 1980s. In India's Demographic Transition. A Reassessment, Irudaya Rajan S (ed.). MD Publications: New Delhi; 245-279.
Visaria P, Irudaya Rajan S. 1999. National Family Health Survey: A landmark in Indian surveys. Economic and Political Weekly XXXIV: 3002-3008.
Wackernagel H. 1998. Multivariate Geostatistics. Springer Verlag: Berlin.