in I. Balderjahn, R. Mathar and M. Schader eds.: Data Highways and Information Flooding, a Challenge for Classi cation and Data Analysis, Springer 1997. 21st Annual Meeting of the Gesellschaft fur Klassi kation, GfKl'97, Potsdam, March 12 14, 1997. Unsupervised Fuzzy Classi cation of Multispectral Imagery Using Spatial-Spectral Features Rafael Wiemker II. Institut fur Experimentalphysik, Universitat Hamburg, KOGS Informatik, Vogt-Kolln-Str. 30, D-22527 Hamburg, Germany, http: kogs-www.informatik.uni-hamburg.de wiemker Abstract: Pixel-wise spectral classi cation is a widely used technique to produce thematic maps from remotely sensed multispectral imagery. It is commonly based on purely spectral features. In our approach we additionally consider additional spatial features in the form of local context information. After all, spatial context is the de ning property of an image. Markov random eld modeling provides the assumption that the probability of a certain pixel to belong to a certain class depends on the pixel's local neighborhood. We enhance the ICM algorithm of Besag 1986 to account for the fuzzy class membership in the fuzzy clustering algorithm of Bezdek 1973. The algorithm presented here was tested on simulated and real remotely sensed multispectral imagery. We demonstrate the improvement of the clustering as achieved by the additional spatial fuzzy neighborhood features. 1 Introduction Spectral classi cation is a widely used technique to produce thematic maps from remotely sensed multispectral imagery. The classi cation or labeling of each pixel relies on its spectrum or spectral signature, which consists of n spectral values of the spectral bands i = 1:::n of wavelength i. The similarity between an observed spectral vector x = :::; xi; ::: t 2 lRn and a given reference spectrum m is commonly evaluated by the spectral distance d = kx,mk in the feature space. The n-dimensional spectral feature space is spanned by the radiance or re ectance signals xi as received in the n spectral bands of the sensor or camera. Given the set of all observed spectral vectors x corresponding to the pixels of a particular multispectral image, unsupervised clustering algorithms are employed to nd k cluster centers in the spectral feature space lRn around which the observed spectra are scattered. Each pixel of the image can then be classi ed labeled to the class to which its spectral distance is minimal. Most classi cation techniques applied in multispectral remote sensing Richards 1993 rely on purely spectral features and consider only one pixel at a time. More recently, a method for utilizing additional contextual information from neighboring pixels has been derived from Markov random eld modeling Besag 1986. This `ICM-algorithm' has been shown to improve classi cation results on multispectral imagery Solberg et al. 1996. So far, this spectral-spatial labeling approach has been used in conjunction with supervised classi cation only, i.e., the reference classes were established from training data by the analyst. In this paper we describe the e ects of incorporating spatial context information into unsupervised clustering techniques such as the hard and fuzzy k-means algorithms. 2 Hard and Fuzzy Clustering with Spectral and Spatial Features All k-means algorithms or `migrating means'-algorithms work iteratively. Also, all k-means algorithms, as used in multispectral image classi cation, determine for each pixel the Euclidean distance dxjmc = qPixi ,mc;i2 between the spectral vector x and the respective mean spectrum mc of each class !c c = 1;:::;k in the spectral feature space which is spanned by the n spectral bands i of the imaging sensor i = 1;:::;n, where xi = xi. The hard k-means algorithm Ball and Hall 1967 assigns each pixel to the class !c to which the spectral distance dxjmc is minimal. Then the k cluster centers mc are recomputed as the means of all pixels which currently belong to the respective class. The process is repeated until convergence. The fuzzy k-means algorithm originally `c-means', Bezdek 1973 relies on a fuzzy membership pspecxj!c which is inversely proportional to the spectral distance dxjmc : pspecxj!c = d,1 xjmc Pc0 d,1 xjmc0  ; kX c=1 pspecxj!c = 1 : 1 The algorithm iterates two alternating steps: a updating the membership weights pspecxj!c, and b re-estimating the cluster centers mc =Px pspecxj!cx=Px pspecxj!c. In contrast to the hard k-means, all pixels are used for the computation of each cluster center, weighted with their respective fuzzy membership pspecxj!c. Convergence to a local minimum of a global objective function has been shown Bezdek 1981. Considering contextual image information,we use Markov random eld modelling, and assume that the conditional spatial probability pspatxj!c of pixel x depends only on the pixels x0 in its spatial neighborhood Nx Li 1995. As the neighborhood Nx we here use a ll window around pixel x except the pixel itself. For the interaction between neighboring pixels, Besag 1986 has suggested a neighborhood potential Ux = Px0 2Nx 1 , x;x0 , with x;x0 = 1 for equal classes !x = !x0, and 0 otherwise. In this paper we use a re ned `fuzzy' neighborhood potential Uxj!c, based on the current memberships P!cjx0 de ned in Eq. 4 of the neighboring pixels x0. The potential Uxj!c and then the spatial membership pspatxj!c are de ned as Uxj!c = X x02Nx 1 ,P!cjx0 2 pspatxj!c = 1 Ze, Uxj!c ; kX c=1 pspatxj!c = 1 3 where 0 is a factor to weight the in uence of the spatial context. The spatial membership pspatxj!c for class !c is large if the neighboring pixels x0 have large memberships P!cjx0 for the same class !c, and small if they tend to belong to other classes !c0 . Computation of the normalization constant Z is unnecessary here, as it cancels out in Eq. 4. The joint spectral-spatial membership P!cjx of pixel x to belong to class !c then is de ned to be : P!cjx = pspecxj!c  pspatxj!c Pc0 pspecxj!c0   pspatxj!c0  ; kX c=1 P!cjx = 1 : 4 Using the additionalspatial features, we again can perform hard and fuzzy kmeans clustering, depending on whether the cluster centers are re-estimated from only those pixels currently assigned to each cluster, or from all pixels using their current memberships as weights. 3 Results on Simulated and Remotely Sensed Multispectral Imagery In order to evaluate the e ect of the additional contextual memberships Eq. 3, we have simulated a test image with n = 2 spectral bands and k = 2 spectral classes !1 and !2 Fig. 1. The spectral vectors x = x1;x2 t of each class are scattered randomly around the two cluster centers m!1 and m!2, forming uncorrelated Gaussian distributions. In the original spectral feature space the two clusters Fig. 1, right, true cluster centers marked by crosses are almost indistinguishable by eye appraisal due to the extreme scatter. The root mean square scatter of both clusters is equal to the Euclidean distance between the cluster centers. A number of runs was performed with four di erent algorithms. The number of clusters k = 2 and random initial centers seed coordinates for the cluster centers were supplied. The resulting classi cation accuracies i.e., the relative number of correctly labeled pixels and the cluster center estimation accuracy mean relative deviation of coordinates between true and estimated class centers are given in Table 1. Typical classi cation results of each method are shown in Fig. 2. We observe that the fuzzy k-means performs slightly better in the estimation of the class center coordinates, but not in the classi cation. Also, the hard k-means is improved by the additional spatial features in classi cation, but not in cluster center accuracy. For fuzzy k-means with additional contextual memberships, however, the results indicate clearly that not only the classi cation results are improved, but that also the coordinates of the cluster centers are estimated with signi cantly improved accuracy. Good convergence of the iterative algorithm can be achieved by starting the iteration with = 0, i.e., without spatial in uence, and then increasing gradually towards = 1. For each intermediate -value, convergence is waited for before the spatial in uence is increased Fig. 4. Another interesting observation is that the classi cation and cluster center accuracy does not deteriorate when the number of classes k is over-estimated. We performed another series of runs with k = 3 and k = 4 and random cluster seeds provided. With a fuzzy k-means on purely spectral features this decreases the classi cation accuracy as well as the cluster center estimation Fig. 3. In contrast, with contextual fuzzy k-means the super uous classes are basically not populated at all, and the centers of the actually existing Gaussian distributions are found correctly. Classi cation and cluster center accuracy with over-estimated k = 4 clusters is indeed the same as for the correct k = 2 case Table 1, bottom line. The fuzzy clustering with purely spectral features on the one hand, and with spectral-spatial features on the other hand, was applied to remotely sensed multispectral scanner imagery  ight altitude 300 m with n = 10 spectral bands. The airborne line scanner of DAEDALUS, Inc., is operated on board of a Do 228 aircraft by the German Aerospace Research Establishment DLR. Some examplary classi cation results can be inspected in Fig. 5. The classi cation results which utilize both spectral and spatial features appear smoother and without grainy, pixel-size noise. Note that not only the pixelwise classi cation but also the cluster centers di er when utilizing additional spatial features. The such established clusters are clearly better suitable for thematic image segmentation see e.g. the forest areas in Fig. 5, top. test image 1. band true class image original spectral space 0 50 100 150 200 250 0 50 100 150 200 250 Figure 1: Simulated test image left with n = 2 spectral bands and k = 2 spectral classes center with rms scatter equal to the distance between the cluster centers in the spectral feature space right. hard k-means fuzzy k-means fuzzy k-means hard w context fuzzy w context fuzzy w context Figure 2: Typical classi cation results of the various algorithms left and center. On the right-hand side, estimated cluster centers are depicted in the spectral feature space. The path of the migrating means from random seeds is indicated by solid lines. For improved visualization, the original feature space was magni ed and the scatter reduced. classi cation accuracy cluster center accuracy algorithm correctly classi ed pixels deviation from true centers hard k-means 75 1 5 1.5 with context 91 1 5 1.5 fuzzy k-means 75 1 3.5 1 with context 99 1 0.3 0.1 Table 1: Results on simulated data. Accuracy of cluster center estimation and classi cation for various algorithms. Error margins are estimated from several runs with random cluster seeds. spectral spectral-spatial Figure 3: Typical classication results top, and cluster centers found bottom, with over-estimated k = 4 instead of the correct k = 2 for fuzzy kmeans clustering, on purely spectral left and spatial-spectral features right. On the right-hand side, estimated cluster centers are depicted in the spectral feature space. The path of the migrating means from random seeds is indicated by solid lines. For improved visualization, the original feature space was magni ed and the scatter reduced. 0 20 40 60 0.00 0.05 0.10 0.15 0.20 iterations 0.2 β = 0.0 0.4 0.6 0.8 1.0 deviationfromtrueclustercenters Accuracy vs. Rising Spatial Influence Figure 4: Typical iteration process. Convergence is achieved for each level of spatial in uence . The spatial inuence is raised gradually and yields increasing accuracy of the cluster center estimation. image =980 nm spectral spectral-spatial Figure 5: Multispectral aerial imagery of Nurnberg with n = 10 spectral bands 1 = 435nm, 10 = 2215nm, unsupervisedlyclassi ed by fuzzy k-means clustering into k = 4 classes recorded in 1995, 300m altitude, atmospherically corrected, various scales. 4 Conclusions We have tested the e ects of using spatial context features in addition to spectral features for unsupervised clustering and classi cation of multispectral imagery. Our observations with simulated and remotely sensed multispectral imagery can be summarized as follows:  The additional use of spatial features can signi cantly improve the classi cation labeling results of unsupervised clustering. The full bene ts of additional spatial features are experienced when used in conjunction with fuzzy clustering in contrast to hard clustering.  Moreover, also the accuracy of the cluster center coordinates as estimated by unsupervised clustering is signi cantly improved when using additional spatial features in conjunction with fuzzy k-means.  The additional spatial features avoid e ectively the deteriorating effect of over-estimating the number of clusters k. The cluster centers are found accurately by fuzzy k-means clustering even if the spectral feature space in fact contains fewer than k clusters of Gaussian distri- bution. References BALL, G. and HALL, D. 1967: A Clustering Technique for Summarizing Multivariate Data. Behavioral Sciences, 12, 153 155. BESAG, J. 1986: On the statistical analysis of dirty pictures. Journal of the Royal Statistical Society B, 48, 3, 259 302. BEZDEK, J.C. 1973: Fuzzy Mathematics in Pattern Classi cation. PhD-thesis, Applied Math Center, Cornell University, Ithaca. BEZDEK, J.C. 1981: Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum Press, New York. LI, S.Z. 1995: Markov Random Field Modeling in Computer Vision, Springer, Tokyo. RICHARDS, J.A. 1993: Remote Sensing Digital Image Analysis. Springer, Heidelberg, New York. SOLBERG, A.H.S., TAXT, T., and JAIN, A.K. 1996: A Markov Random Field Model for Classi cation of Multisource Satellite Imagery. IEEE Transactions on Geoscience and Remote Sensing, 34, 1, 100 134. spectral spectral-spatial Figure 5b: Multispectral aerial imagery of Nurnberg with n = 10 spectral bands 1 = 435nm, 10 = 2215nm, classi ed by fuzzy k-means clustering into k = 4 classes recorded in 1995, 300 m altitude, atmospherically corrected.