A comparison of pixel-based and object-based image analysis with selected machine learning algorithms for the classification of agricultural landscapes using SPOT-5 HRG imagery

Dennis C. Duro a,⁎, Steven E. Franklin a,b, Monique G. Dubé c

a School of Environment and Sustainability, University of Saskatchewan, Saskatoon, Saskatchewan, Canada S7N 5C8
b Environmental and Resource Studies/Geography Department, Trent University, 1600 West Bank Drive, Peterborough, Ontario, Canada K9J 7B8
c Total E&P Canada Limited, Sustainability Division, #2900, 240-4th Ave SW, Calgary, Alberta, Canada T2P 4H4

Article history: Received 11 February 2011; Received in revised form 17 November 2011; Accepted 24 November 2011; Available online 28 December 2011

Keywords: Comparison; Object-based; Decision tree; Random forest; Support vector machine

Abstract

Pixel-based and object-based image analysis approaches for classifying broad land cover classes over agricultural landscapes are compared using three supervised machine learning algorithms: decision tree (DT), random forest (RF), and the support vector machine (SVM). Differences in overall classification accuracy between pixel-based and object-based classifications were not statistically significant (p>0.05) when the same machine learning algorithms were applied. Using object-based image analysis, there was a statistically significant difference in classification accuracy between maps produced using the DT algorithm and maps produced using either the RF (p=0.0116) or SVM (p=0.0067) algorithms. Using pixel-based image analysis, there was no statistically significant difference (p>0.05) between results produced using different classification algorithms. Classifications based on the RF and SVM algorithms provided a more visually adequate depiction of wetland, riparian, and crop land cover types than DT-based classifications, using either object-based or pixel-based image analysis.
In this study, pixel-based classifications utilized fewer variables (15 vs. 300), achieved similar classification accuracies, and required less time to produce than object-based classifications. Object-based classifications produced a visually appealing, generalized depiction of land cover classes. Based exclusively on overall accuracy reports, there was no advantage to preferring one image analysis approach over the other for mapping broad land cover types in agricultural environments using medium spatial resolution earth observation imagery.

© 2011 Elsevier Inc. All rights reserved.

1. Introduction

The classification of land use and land cover (LULC) from remotely sensed imagery can be divided into two general image analysis approaches: i) classifications based on pixels, and ii) classifications based on objects. While pixel-based analysis has long been the mainstay approach for classifying remotely sensed imagery, object-based image analysis has become increasingly commonplace over the last decade (Blaschke, 2010). Whether pixels or objects are used as the underlying units for classifying remotely derived imagery, the information contained within - and among - these units (e.g., spectral, textural, etc.) can be subjected to a variety of classification algorithms. Previous comparative studies have examined the relative performance of different classification algorithms using pixel-based and/or object-based image analysis. A brief summary of selected comparisons is provided below.

1.1. Algorithm comparisons using pixel-based or object-based classifications

Using pixel-based image analysis on Landsat Thematic Mapper (TM) data, Huang et al. (2002) compared thematic mapping accuracies produced using four different classification algorithms: support vector machines (SVMs), decision trees (DTs), a neural network classifier, and the maximum likelihood classifier (MLC).
Their results suggested that SVM-based classifications generally outperformed the other three classification algorithms. Pal (2005) compared the accuracies of two supervised classification algorithms using Landsat Enhanced Thematic Mapper (ETM+) data, SVMs and random forests (RFs) (Breiman, 2001), and found that they performed equally well. Gislason et al. (2006) compared an RF approach to a variety of decision tree-like algorithms using pixel-based image analysis of Landsat MSS data. They found that the selected tree-like algorithms performed similarly, and that the RF algorithm outperformed the standard implementation of Breiman et al.'s (1984) DTs; however, their findings also showed that the RF algorithm performed slightly less well than a modified DT algorithm (boosted 1R).

Remote Sensing of Environment 118 (2012) 259–272. doi:10.1016/j.rse.2011.11.020
⁎ Corresponding author at: School of Environment and Sustainability, University of Saskatchewan, Room 323, Kirk Hall, 117 Science Place, Saskatoon, Canada SK S7N 5C8. Tel.: +1 705 748 1011x6111. E-mail address: dennis.duro@usask.ca (D.C. Duro).

Carreiras et al. (2006) examined several classification algorithms, including standard DTs, quadratic discriminant analysis, probability-bagging classification trees (PBCT), and k-nearest neighbors (K-NN), using pixel-based analysis of spatially coarse (1 km pixels) SPOT-4 VEGETATION imagery. Their results, verified by 10-fold cross-validation, showed that the PBCT algorithm produced the best overall classification accuracy. Brenning (2009) compared eleven classification algorithms using pixel-based image analysis and Landsat ETM+ imagery for the detection of rock glaciers.
This extensive study found that penalized linear discriminant analysis (PLDA) yielded significantly better mapping results than all other classifiers, including both SVMs and RFs. Using Landsat TM and ETM+ data, Otukei and Blaschke (2010) compared the MLC, SVM, and DT algorithms in a pixel-based approach, and found that DTs performed better than MLC and SVM. In an earlier study, Laliberte et al. (2006) used an object-based approach on Quickbird imagery to compare K-NN with DT algorithms. Their study found that DTs produced better overall classification accuracies than the K-NN algorithm, but were more difficult to implement.

1.2. Algorithm comparisons between pixel-based and object-based classifications

Relatively recent comparisons between the results of pixel-based and object-based image analysis have also been conducted. For example, Yan et al. (2006) compared pixel-based image analysis using MLC with object-based image analysis using K-NN on Terra Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) imagery. In their study, the authors reported that the overall accuracy of the object-based K-NN classification drastically outperformed the pixel-based MLC classification (83.25% vs. 46.48%). Yu et al. (2006) used high spatial resolution digital airborne imagery and compared a pixel-based classification based on MLC with an object-based classification using K-NN, using a DT as a mechanism for feature selection in both cases. Their study showed that the 1-NN object-based classification outperformed the pixel-based MLC classification by 17%, although the average classification accuracy across the 48 vegetation classes listed was only 51% for the object-based K-NN classification and 61.8% for the pixel-based classification using MLC.
Platt and Rapoza (2008) compared K-NN and MLC for both pixel-based and object-based classifications, with and without the addition of expert-based knowledge, using multispectral IKONOS imagery. Their results revealed that the object-based K-NN classification using expert knowledge had the best overall accuracy (78%), while the best pixel-based classification using MLC (without expert knowledge) achieved an overall accuracy of 64%. Castillejo-González et al. (2009) compared pixel-based and object-based classifications in agricultural environments using multispectral Quickbird imagery and a variety of classification algorithms. The best pixel-based classification used non-pan-sharpened imagery and the MLC algorithm, while the best purely object-based classification used pan-sharpened imagery and MLC, with the two approaches achieving high overall accuracies of 89.6% and 93.69%, respectively. Their study also revealed that, using non-pan-sharpened imagery and MLC, there was only a small difference in classification accuracy between pixel-based and object-based image analysis (89.60% and 90.66%, respectively); however, the difference between these same approaches grew considerably when using pan-sharpened imagery (82.55% and 93.69%, respectively). Myint et al. (2011) used Quickbird imagery to classify urban land cover. They compared results from an MLC pixel-based classification with an object-based classifier using K-NN and a series of fuzzy membership functions. The object-based classification (90.4%) outperformed the pixel-based classification (67.6%) in overall accuracy for their original image; however, in their test image, the difference between the object-based and pixel-based approaches was reduced to less than 10% (95.2% and 87.8%, respectively).
Finally, in a recent study, Dingle Robertson and King (2011) compared pixel-based and object-based image analysis for classifying broad agricultural land cover types over two time periods (1995 and 2005) using Landsat-5 TM imagery. They compared land cover maps produced using MLC (pixel-based) and K-NN (object-based) algorithms and found that the difference in overall accuracy between these classification approaches was not statistically significant. Despite these findings, an intensive visual inspection of their post-classification analysis revealed that the object-based classification using K-NN depicted areas of change more accurately than the pixel-based classification using MLC.

In general, the above comparisons between pixel-based and object-based classifications reveal that the latter typically outperform the former in overall classification accuracy, across a variety of remotely sensed imagery and in settings ranging from agricultural to urban land cover classes. However, unlike the studies examining either pixel-based or object-based classifications in isolation, many comparison studies rely on relatively simple classification algorithms (e.g., K-NN) for the object-based classification and probabilistic algorithms (e.g., MLC) for the pixel-based classification, the latter of which is less suited to datasets that are non-normally distributed or that contain categorical data (Franklin & Wulder, 2002). The present study aims to bridge the gap between these previous comparisons by examining both pixel-based and object-based classification approaches with a selection of relatively modern and robust supervised machine learning algorithms: decision trees (DTs), random forests (RFs), and support vector machines (SVMs). We conduct a visual and statistical assessment of the classification outputs using medium spatial resolution (10 m) multi-spectral imagery from the SPOT-5 HRG sensor.
For the purposes of this study, six broad land cover classes were mapped in a riparian area undergoing intensive agricultural development in western Canada. We assessed each image analysis approach, and each of the selected machine learning algorithms, for their ability to accurately portray these selected land cover types. Recommendations are made in the context of operational land cover mapping and monitoring in agricultural environments using medium spatial resolution earth observation imagery.

2. Study area

The study area is located along the South Saskatchewan River approximately 90 km east of the border between Alberta and Saskatchewan (Fig. 1). Covering approximately 80 sq. km, the study area is a subset of a much larger drainage basin selected for a long-term study of land cover change and land use practices typical of the southern half of the western Prairie Provinces of Canada. Similar large drainage areas have been selected by others to assess potential impacts of development on aquatic ecosystems over time (e.g., Squires et al., 2009), and represent an appropriate scale and unit of measurement for conducting cumulative environmental effects assessments on aquatic ecosystems (Dubé, 2003; Duinker & Greig, 2006; Noble, 2008; Seitz et al., 2011). Indeed, over the past century, agricultural development in the region has replaced much of the native vegetation and filled an estimated 40% of small wetland areas (Huel, 2000), facilitating the gradual introduction of the crops and improved pasture lands that dominate much of the prairies today. The selected study area is typical of agricultural activity conducted near riparian and wetland environments in the region.
Such environments have been linked to a range of species and environmental processes, including the flow of nutrients between terrestrial and aquatic ecosystems, and are the focus of best management practices for protecting water quality in agricultural environments (Cooper et al., 1995; Gordon et al., 2008; Gregory et al., 1991; Naiman & Décamps, 1997; Thompson & Hansen, 2001; US EPA, 2005). Climate in the Prairie Ecozone is characterized by long, cold winters and relatively short, but often very warm, summers. The region receives little precipitation and is relatively dry as a result, with semi-arid regions existing in the southern portions of the province (e.g., the Great Sand Hills).

3. Methods

3.1. Data sets and processing

3.1.1. Ancillary datasets

Several tiles of the Canadian Digital Elevation Data (CDED) digital elevation model (DEM) were downloaded from the GeoBase online spatial data portal (www.geobase.ca). At latitudes below 68° N, the CDED DEM has a horizontal post spacing of approximately 23 m (north–south) × 11–16 m (east–west). After projection into Albers Equal Area Conic and nearest-neighbor resampling, the CDED DEM was converted to square 16 × 16 m pixels. Albers Equal Area Conic was selected as the final projection for all data used in this study due to the known area- and shape-preserving characteristics of this projection, and because a standard Universal Transverse Mercator projection would have spanned multiple zones, introducing potential projection-related errors into the final map output. Together with elevation above sea level, slope, and aspect, topographic features (e.g., ridge, channel, plane) (Pike, 2000) were calculated from the CDED DEM and included as variables in the classification process.

Fig. 1. Study area situated over the South Saskatchewan River (Saskatchewan, Canada).
Inset shows SPOT-5 10 m HRG false color image of the study area (R = NIR, G = Red, B = Green).

Other ancillary datasets (e.g., road networks, geodetic monuments, administrative boundaries, etc.) were downloaded from the GeoSask online spatial data portal (www.geosask.ca) and used as reference layers for geometric and orthographic corrections of the satellite imagery.

3.1.2. Remotely sensed imagery

Panchromatic (2.5 m) and multispectral (10 m) imagery from the Système Pour l'Observation de la Terre (SPOT-5) satellite was obtained from the Alberta Terrestrial Imaging Corporation (www.imagingcenter.ca). The SPOT-5 imagery was collected on August 28, 2005. High resolution digital color aerial orthoimagery (60 cm pixels) acquired in the same year as the SPOT-5 imagery was downloaded from the Saskatchewan Geospatial Imagery Collaborative (www.flysask.ca) online data portal. The panchromatic imagery was orthorectified using a rational polynomial coefficient model of the SPOT-5 sensor and the CDED DEM mosaic, in conjunction with ground control points obtained from ancillary layers (road network and geodetic monuments). Image-to-map registration yielded a root-mean-square error (RMSE) of 0.3 pixels using a 1st order polynomial transformation. The multispectral imagery was then registered to the panchromatic imagery, achieving an RMSE of less than 0.5 pixels using a 1st order polynomial transformation. A visual assessment confirmed that all image sources were aligned with ancillary data layers of higher spatial accuracy (e.g., road network, quarter section plots, etc.). The multispectral SPOT-5 scene was examined for a suitably representative study area, and a 630 × 553 pixel subset (348,390 pixels) of the full SPOT-5 scene was then selected for analysis (Fig. 1).
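The terrain variables described in Section 3.1.1 (slope and aspect derived from the CDED DEM) can be computed with standard finite-difference kernels. The paper does not state which algorithm was used; the sketch below assumes Horn's method, a common default in GIS packages, with the 16 m cell size of the resampled CDED DEM. The aspect convention (degrees clockwise from north, pointing downslope) is one common GIS choice, not necessarily the study's.

```python
import numpy as np

def slope_aspect(dem, cellsize=16.0):
    """Slope and aspect (degrees) from a DEM using Horn's 3x3
    finite-difference kernels (a common GIS implementation)."""
    # Pad edges so the output matches the DEM's shape.
    z = np.pad(dem.astype(float), 1, mode="edge")
    a, b, c = z[:-2, :-2], z[:-2, 1:-1], z[:-2, 2:]
    d, f = z[1:-1, :-2], z[1:-1, 2:]
    g, h, i = z[2:, :-2], z[2:, 1:-1], z[2:, 2:]
    dzdx = ((c + 2 * f + i) - (a + 2 * d + g)) / (8 * cellsize)
    dzdy = ((g + 2 * h + i) - (a + 2 * b + c)) / (8 * cellsize)
    slope = np.degrees(np.arctan(np.hypot(dzdx, dzdy)))
    # Aspect: 0-360 degrees, clockwise from north, pointing downslope.
    aspect = np.mod(90.0 - np.degrees(np.arctan2(dzdy, -dzdx)), 360.0)
    return slope, aspect
```

For a plane rising 16 m per 16 m cell toward the east, interior cells report a 45° slope with a west-facing (270°) aspect under this convention.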
Radiometric processing was applied to the SPOT-5 multispectral imagery, and a Normalized Difference Vegetation Index (NDVI) layer was computed and included in the analysis (Song et al., 2001). Calibrated digital numbers (DNs) were first converted to top-of-atmosphere reflectance following procedures outlined by Chander et al. (2009), with updated sensor calibration coefficients for both SPOT-5 HRG sensors provided by the Centre National d'Études Spatiales (CNES, 2009), and updated exoatmospheric solar irradiance coefficients based on the Thuillier spectrum (Thuillier et al., 2003) provided by G. Chander (personal communication, Sept. 2010). Absolute atmospheric correction of the imagery was not performed due to the lack of simultaneously acquired ground-based spectral data or appropriate meteorological data for the study area. Instead, a relative correction using the Dark-Object Subtraction (DOS) method was used to alleviate atmospheric scattering effects (Chavez, 1988). The angular second moment texture measure, derived from computed co-occurrence matrices, was calculated for each of the SPOT-5 multispectral bands and the NDVI layer. Texture measures have been found to increase overall classification accuracies using SPOT imagery (Franklin & Peddle, 1990), and have been shown to improve the quality of the image segmentation process (Ryherd & Woodcock, 1996). The four bands of SPOT-5 multispectral imagery were placed in a single data set along with the calculated NDVI layer, texture measures, DEM, and related landscape variables. This combined data set, or "image stack", consisted of 15 individual layers, or predictor variables (Table 1). Pixel-based variables were selected from this stack based on previous experience in classifying land cover types in our study area.
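The radiometric steps above can be sketched as follows. The TOA reflectance formula follows Chander et al. (2009); the SPOT convention of radiance = DN / gain is assumed, and the gain, irradiance, and sun-elevation values in the usage note are placeholders, not the study's actual coefficients.

```python
import numpy as np

def toa_reflectance(dn, gain, esun, sun_elev_deg, d=1.0):
    """DN -> top-of-atmosphere reflectance (after Chander et al., 2009).
    gain: absolute calibration gain (SPOT convention: radiance = DN / gain);
    esun: band exoatmospheric solar irradiance (W m-2 um-1);
    d: Earth-Sun distance in astronomical units."""
    radiance = dn.astype(float) / gain
    cos_sz = np.cos(np.radians(90.0 - sun_elev_deg))  # solar zenith from elevation
    return np.pi * radiance * d ** 2 / (esun * cos_sz)

def dark_object_subtraction(band):
    """Relative atmospheric correction (Chavez, 1988): subtract the band's
    'dark object' (minimum) value, clamping at zero."""
    return np.clip(band - band.min(), 0.0, None)

def ndvi(nir, red):
    """Normalized Difference Vegetation Index, guarding against divide-by-zero."""
    denom = nir + red
    return np.where(denom == 0, 0.0, (nir - red) / np.where(denom == 0, 1.0, denom))
```

For example, `ndvi(np.array([0.5]), np.array([0.1]))` yields 0.4/0.6 ≈ 0.667 for a healthy vegetation pixel.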
The object-based classification used several layers from the pixel-based image stack as input to the image segmentation process, and as input layers for the calculation of "object features" (see Section 3.1.3 for details).

3.1.3. Image segmentation and object feature selection

Image segmentation represents a fundamental first step in object-based image analysis, as the image objects (sensu stricto, "image segments") resulting from this process form the basis of an object-based image classification (Castilla & Hay, 2008). In this study, image segmentation was performed using the multi-resolution segmentation (MRS) algorithm found in the 64-bit version of eCognition Developer 8 (Trimble, 2010a). The MRS algorithm uses a "bottom-up" image segmentation approach that begins with pixel-sized objects, which are iteratively grown through pair-wise merging of neighboring objects based on several user-defined parameters (scale, color/shape, smoothness/compactness) that are weighted together to define a homogeneity criterion; together, these parameters define a "stopping threshold" of within-object homogeneity based on the underlying input layers, and thus determine the size and shape of the resulting image objects (Baatz & Schäpe, 2000; Benz et al., 2004; Trimble, 2010b). Of the parameters used by the MRS algorithm, the selection of an appropriate value for the "scale" parameter is considered the most important, as this value controls the relative size of the image objects, which has a direct effect on the classification accuracy of the final map (Kim et al., 2008; Myint et al., 2011; Smith, 2010). In general, smaller values of the scale parameter produce relatively smaller image objects, while larger values produce correspondingly larger objects.
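The merge rule described above can be illustrated with the "color" heterogeneity term of the Baatz and Schäpe (2000) criterion. This is a sketch, not eCognition's proprietary implementation: the shape term and per-layer weighting are omitted, and the squared-scale stopping rule is the published form of the criterion.

```python
def color_merge_cost(n1, sd1, n2, sd2, sd_merged, weight=1.0):
    """Change in area-weighted spectral heterogeneity if two objects merge,
    for one input layer -- the 'color' term of the Baatz & Schape (2000)
    homogeneity criterion used by multi-resolution segmentation.
    n1, n2: object sizes in pixels; sd*: standard deviations of the layer."""
    return weight * ((n1 + n2) * sd_merged - (n1 * sd1 + n2 * sd2))

def should_merge(cost, scale):
    """MRS merges a pair only while the fusion cost stays below the squared
    scale parameter -- hence larger scale values yield larger objects."""
    return cost < scale ** 2
```

Two moderately heterogeneous 50-pixel objects whose union doubles the standard deviation incur a cost of 100: too high to merge at scale 5, but acceptable at scale 15, which is why coarser scales in Table 2 produce far fewer, larger objects.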
An examination of the available literature reveals that quantitative, semi-automated approaches for selecting optimal image segmentation parameter values exist (e.g., genetic algorithms; Bhanu et al., 1995), but that such methods are not yet fully implemented in mainstream image segmentation software (e.g., Definiens' eCognition; but see Costa et al., 2008; Drăgut et al., 2010). In this study, the selection of appropriate input layers and parameter values for the MRS algorithm was guided by previous experience and by an iterative "trial-and-error" approach often employed by others conducting object-based image analysis (Dingle Robertson & King, 2011; Yan et al., 2006; Mathieu et al., 2007; Myint et al., 2011; Yu et al., 2006). The image segmentation parameter values used in this study are listed in Table 2. The image segmentation process was considered complete once the image objects produced visually corresponded to meaningful real-world objects of interest. Image objects produced using the smallest scale parameter (Fig. 2B) were small enough to delineate fine-scale features of interest within the study area, such as narrow channels of riparian vegetation or fringes of wetland vegetation located around pools of water. The two additional, coarser image segmentation scales (Fig. 2C and D) were included in the object-based classification to depict larger objects of interest (e.g., crop fields).

Table 1. Image layers used in pixel-based classifications.

  Spectral bands   Vegetation index   Landscape variables   Texture measure (a)
  Green            NDVI               Elevation             Green
  Red                                 Slope (degrees)       Red
  NIR                                 Aspect (degrees)      NIR
  SWIR                                Topographic class (b) SWIR
                                                            NDVI
                                                            DEM

a – "Angular second moment" texture calculated for the listed image layers.
b – Topographic classes: plain, ridge, channel (Pike, 2000).

Table 2. Parameter values used in the multi-resolution segmentation (MRS) algorithm (a).

  Scale   Color/shape   Smoothness/compactness   # of objects   Median area of objects (sq. m)
  5       0.9/0.1       0.5/0.5                  6583           9401
  15      0.9/0.1       0.5/0.5                  937            69,243
  30      0.9/0.1       0.5/0.5                  273            241,434

a – Image layers used: NDVI, DEM, and slope (weighted equally).

The use of image object information derived from multiple image segmentation scales has been shown elsewhere to produce better overall classification accuracies (Smith, 2010), and better classification accuracies for individual land cover classes (Myint et al., 2011). Image objects produced at the finest image segmentation scale served as the underlying building blocks, or "image segments" (Castilla & Hay, 2008), for the object-based classification, although information obtained from image objects produced at all three image segmentation scales (Fig. 2B–D) was utilized in the object-based classifications. Following the image segmentation process, variables were selected for use in the object-based classification. The object-based image analysis software used in this study refers to such variables as "object features" (Trimble, 2010a), a term adopted throughout the rest of the text when referring to variables used by object-based classifications. Object features allow contextual relationships between image objects to be incorporated into the object-based image analysis. For example, relationships between several smaller sub-objects (e.g., groups of individual crops) contained within a single image object (e.g., a crop field) produced using a larger image segmentation scale can be used to discriminate between land cover types (Myint et al., 2011). In such cases, the information being considered represents an "object texture feature" (see Table 3). Several types of object features are available within the Definiens eCognition software and are described elsewhere (Trimble, 2010a).

Fig. 2.
Comparison of image segmentation levels used in the object-based classification: A) SPOT-5 10 m HRG false color image of the study area (R = NIR, G = Red, B = Green); B) Image segmentation (MRS scale 5); C) Image segmentation (MRS scale 15); D) Image segmentation (MRS scale 30).

Table 3. Object features used in object-based classifications (adapted from Trimble, 2010a) (a).

Object layer features:
  Mean: mean value of an image object.
  Standard deviation: standard deviation of an image object.
  Mean difference to neighbors: the difference between the mean values of an image object and its neighboring image objects.
  Mean difference to scene: the difference between the mean input layer value of an image object and the mean input layer value of the scene.
  Mean difference to super-object: the difference between the mean input layer value of an image object and that of its super-object. Distance of 1.
  Std. dev. difference to super-object: the difference between the std. dev. input layer value of an image object and that of its super-object. Distance of 1.

Object texture features:
  Mean of sub-objects: standard deviation of the input layer mean values of the sub-objects. Distance of 1.
  Average mean difference to neighbors of sub-objects: the contrast inside an image object, expressed as the average mean difference of all its sub-objects for a specific input layer. Distance of 1.

a – Object features listed were calculated for each of the 15 image layers listed in Table 1.

Selecting object features for use in object-based image analysis can be a subjective process based on past experience and user knowledge (e.g., Laliberte et al., 2007), or one driven by a feature selection algorithm prior to final classification (e.g., Yu et al., 2006; Van Coillie et al., 2007).
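Object layer features such as "mean" and "mean difference to scene" (Table 3) reduce to simple per-segment statistics once a label raster is available. A minimal sketch, assuming a NumPy array of object labels produced by some segmentation step:

```python
import numpy as np

def object_layer_features(layer, labels):
    """Per-object 'mean', 'std', and 'mean difference to scene' features
    (cf. Table 3) for one input layer, given a raster of object labels."""
    feats = {}
    scene_mean = float(layer.mean())
    for obj_id in np.unique(labels):
        vals = layer[labels == obj_id]       # pixels belonging to this object
        feats[int(obj_id)] = {
            "mean": float(vals.mean()),
            "std": float(vals.std()),
            "mean_diff_to_scene": float(vals.mean() - scene_mean),
        }
    return feats
```

Neighbor- and super-object-based features (also in Table 3) additionally require the segmentation's adjacency and scale hierarchy, which eCognition tracks internally.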
In this study, we relied on our past experience conducting object-based classifications in the study area to guide our selection of object features (Table 3). The total number of object features considered in a multi-scale object-based classification can be considerable, since information is calculated per image object and can be calculated at each segmentation scale for each of the input layers. In this study, information based on all 15 input layers (Table 1), 3 image segmentation scales (Table 2), and 8 object features (Table 3) was used in the object-based classification. The total number of object features considered (360) in the object-based image analyses was reduced to 300 because the calculation of values for certain object features requires that certain conditions be met. For example, in this study, "object texture features" (Table 3) were selected that calculate values for an individual image object based on its underlying sub-objects, which are created at finer image segmentation scales. However, image objects produced at the finest image segmentation scale represent the finest level of detail, and therefore cannot be used to calculate sub-object information. The total number of object features available to the object-based classifications greatly outnumbers the number of variables used in the pixel-based classifications (300 versus 15, respectively). The ability to utilize and link information from image objects delineated at the multiple scales inherent in the underlying imagery is often presented as one of the advantages of object-based image analysis (Blaschke, 2010). As such, multiple image segmentation scales were used for the object-based classifications, an approach adopted in several other recent studies comparing pixel-based and object-based classifications (e.g., Yan et al., 2006; Myint et al., 2011; Whiteside et al., 2011).
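The reduction from 360 candidate object features to 300 can be reproduced arithmetically. The paper states only that the sub-object texture features are unavailable at the finest scale; the accounting below additionally assumes the two super-object features are unavailable at the coarsest scale, which is one way to arrive at exactly 300.

```python
layers, scales, features = 15, 3, 8
total = layers * scales * features  # 15 layers x 3 scales x 8 features = 360

# Features whose preconditions fail (assumed accounting; the paper names
# only the sub-object texture features as scale-restricted):
texture_at_finest = 2 * layers        # 2 sub-object texture features, finest scale
superobject_at_coarsest = 2 * layers  # 2 super-object features, coarsest scale

usable = total - texture_at_finest - superobject_at_coarsest
print(total, usable)  # 360 300
```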
While utilizing disparate numbers of potential predictor variables may hamper a strict comparison between image analysis approaches, it nonetheless represents a more typical comparison, as object-based classifications often utilize multiple image segmentation scales even if a single object feature type is utilized (e.g., mean layer value; see Table 3).

3.1.4. Sampling data, accuracy assessment, and map comparison

In this study, high spatial resolution aerial orthophotos and panchromatic satellite imagery were used to collect ground reference data, as contemporaneous field-based samples were not available within the selected study area. A stratified random sampling approach was utilized in order to adequately sample land cover classes of interest (e.g., narrow channels of riparian vegetation) that were relatively underrepresented within the study area. An initial land cover map produced using an unsupervised ISODATA clustering algorithm was created to provide an initial stratification of the study area. The four multispectral bands from the SPOT-5 imagery were used to produce this initial stratified classification using 20 spectral classes. Six broad land cover classes were selected for the purposes of this comparison study: crop land, mixed grassland, exposed rock/soil, riparian, wetland vegetation, and water (cloud and shadow were not present in the study area). The 20 spectral classes produced by the ISODATA algorithm were grouped into the six selected land cover types. Spectral classes remaining from the ISODATA classification that did not clearly fit into the selected six land cover types were classified as "no data" and excluded from further analysis. The generalized ISODATA classification was then converted into a polygon-based map and imported into a GIS for further analysis.
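ISODATA is essentially iterative k-means clustering extended with rules for splitting and merging clusters. A minimal sketch of the k-means core (not the full ISODATA algorithm) applied to pixel spectra:

```python
import numpy as np

def kmeans(pixels, k, iters=20, seed=0):
    """Minimal k-means -- the iterative core that ISODATA extends with
    cluster splitting/merging. pixels: (n, bands) array of spectra."""
    rng = np.random.default_rng(seed)
    centers = pixels[rng.choice(len(pixels), k, replace=False)].astype(float)
    for _ in range(iters):
        # Assign each pixel to its nearest spectral cluster center.
        d = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster empties.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pixels[labels == j].mean(axis=0)
    return labels, centers
```

In the study's workflow, 20 such spectral clusters were subsequently grouped by an analyst into the six broad land cover types.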
Using image objects produced at the finest segmentation scale (Table 2) and the polygon-based ISODATA classification, a stratified random sample of image objects within the six land cover types was performed. A total of 690 image objects were selected (115 per land cover type). Image objects produced using the MRS algorithm – even using small image segmentation scale values – can vary considerably in size (see Table 2), and may contain more than a single land cover type. As such, image objects were visually examined using a combination of SPOT-5 panchromatic and multispectral data, along with color aerial orthoimagery, to assess the homogeneity of the land cover types present within individual image objects. Image objects that contained more than one of the six broad land cover types were rejected, leaving 679 samples in total. These samples were then split into training and testing sets using proportional stratified random sampling, which allowed both sets to retain the overall class distributions of the six selected land cover types present in the original data set. Approximately two-thirds of the samples (437) were used to train the machine learning algorithms, reserving approximately one-third (242) as a "hold-out" test set used exclusively for accuracy assessment and statistical comparisons between classifications. To clarify further, the test set was not used to train or tune parameters associated with the machine learning algorithms examined in this study. Model building and tuning of the individual parameters used by the machine learning algorithms was accomplished through repeated k-fold cross-validation based on the training data set only (see Section 3.2). In order to obtain training and testing samples for the pixel-based classification commensurate with the training and testing image objects, a single point within each of the selected image objects was randomly selected.
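The proportional stratified split described above (roughly two-thirds training, one-third hold-out, with each class keeping its share in both sets) can be sketched as follows; the 2/3 fraction matches the study, while the seed is illustrative.

```python
import random
from collections import defaultdict

def stratified_split(samples, train_frac=2 / 3, seed=42):
    """Proportional stratified random split: each class keeps approximately
    the same share in the training and hold-out test sets.
    samples: iterable of (sample_id, class_label) pairs."""
    by_class = defaultdict(list)
    for sample_id, label in samples:
        by_class[label].append(sample_id)
    rng = random.Random(seed)
    train, test = [], []
    for label, ids in by_class.items():
        rng.shuffle(ids)                      # randomize within each stratum
        cut = round(len(ids) * train_frac)    # proportional allocation
        train += [(i, label) for i in ids[:cut]]
        test += [(i, label) for i in ids[cut:]]
    return train, test
```

With 30 "water" and 60 "crop" samples, this yields 20/40 training and 10/20 test samples per class, preserving the 1:2 class ratio in both sets.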
As each of the image objects used for training and testing was visually screened for land cover homogeneity, any point within an image object would correspond to the underlying land cover type already identified for that image object. This procedure ensured that both the object-based and pixel-based classifications used training and testing data gathered from the same locations. Two measures for assessing the accuracy of thematic maps classified from remotely sensed imagery are commonly reported: i) overall accuracy and ii) the Kappa coefficient of inter-rater agreement (Congalton, 1991; Congalton & Green, 1998). Overall accuracy has the advantage of being directly interpretable as the proportion of correctly classified samples, and corresponds to probabilities related to a given thematic map's reported commission and omission accuracy (Stehman, 1997), while the Kappa coefficient has been used to assess statistical differences between classifications (Congalton, 1991). Studies often assess the performance of multiple classification algorithms utilizing the same testing and training samples (Foody, 2004). In such cases, the assumption that each classification was independently assessed is violated (Cohen, 1960) – i.e., the samples being compared are not independent – and therefore a statistical comparison using Kappa coefficient values is unwarranted (Foody, 2004). In such circumstances, it has been recommended that either a Monte Carlo permutation test of related κ coefficient values (McKenzie et al., 1996), or McNemar's test for paired-sample nominal scale data (Agresti, 2002; Zar, 2009), be used to assess whether statistically significant differences between classifications exist (Foody, 2004). The latter approach has been used by others to statistically compare object-based and pixel-based classifications (e.g., Dingle Robertson & King, 2011; Yan et al., 2006; Whiteside et al., 2011), and is therefore adopted here for comparability.
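McNemar's test operates on the discordant pairs of the two classifiers' predictions on the shared test set. A minimal sketch, assuming the common chi-square form with one degree of freedom (without Yates' correction, as used in this study):

```python
import math

def mcnemar(b, c):
    """McNemar's test without Yates' continuity correction.

    b = samples correct under classifier 1 but wrong under classifier 2;
    c = the reverse. Returns (chi-square statistic, two-sided p-value),
    using the chi-square distribution with 1 degree of freedom.
    """
    stat = (b - c) ** 2 / (b + c)
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p
```

Only the discordant counts matter; samples on which both classifiers agree (both right or both wrong) carry no information about which classifier is more accurate.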
For each classification, a confusion matrix is presented, along with its overall accuracy (i.e., the percentage of correctly classified land cover types), and user's and producer's accuracy (Congalton & Green, 1998). As recommended by others, overall accuracy measures are reported using exact 95% confidence intervals (Morissette & Khorram, 1998; Foody, 2009). The McNemar test was used to assess the following goals of comparison: 1) whether a statistically significant difference exists between pixel-based and object-based classifications that utilize the same machine learning algorithm; and 2) whether a statistically significant difference exists between different machine learning algorithms when using either pixel-based or object-based image analysis. The McNemar test was run without Yates' correction for continuity for small sample sizes, as this correction is generally not recommended (Foody, 2004; Zar, 2009). Both the individual accuracy assessments and statistical comparisons are based on the “hold-out” test set.

D.C. Duro et al. / Remote Sensing of Environment 118 (2012) 259–272

3.2. Tuning of machine learning algorithm parameters

Model building, tuning, and accuracy assessments were performed using the 64-bit version of R 2.12, a multiplatform, open-source language and software environment for statistical computing (R Development Core Team, 2010). Several add-on packages were used within R to build each of the machine learning algorithms used in this study: decision tree (DT), random forest (RF), and the support vector machine (SVM). Classifications based on DT models used the Recursive PARTitioning or “rpart” package (Therneau & Ripley, 2010), which is largely based on the classification and regression tree (CART) algorithm originally developed by Breiman et al. (1984).
The classifications built with the RF algorithm used the “randomForest” package (Liaw & Wiener, 2002), which is based on the original RF algorithm and software code developed by Breiman and Cutler (Breiman, 2001; Breiman & Cutler, 2007). Classifications using models based on the SVM algorithm (Cortes & Vapnik, 1995; Vapnik, 1998) used the “kernlab” package (Karatzoglou et al., 2004). All classification models were developed using the “caret” package within R (Kuhn, 2008), which allowed for a single consistent environment for training each of the machine learning algorithms and tuning their associated parameters. A repeated k-fold cross-validation resampling technique was used to create and optimize classification models for both pixel-based and object-based classifications using all three machine learning algorithms. Resampling by k-fold cross-validation begins by partitioning a sample into k subsamples of roughly equal size, with k-1 subsamples used as a training set and a single subsample left out as a test set. Using this approach, a classification model using each of the three machine learning algorithms is built using the training set and assessed against the single leftover test set. This process is repeated k times (“folds”), whereby each of the k subsamples serves a turn as a test set, ensuring that all subsamples are used as part of the training and testing sets. Results for each fold are then combined to select the model with the highest average accuracy. Similar cross-validation techniques have been used by others to compare the performance of multiple classifiers using earth observation imagery (e.g., Friedl & Brodley, 1997; Huang et al., 2002; Brenning, 2009, 2010). Several adjustable “tuning parameters” used by each of the machine learning algorithms to optimize classification performance were examined using 10-fold cross-validation, which is the number of folds recommended when comparing the performance of machine learning algorithms (Kohavi, 1995).
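The fold-generation logic described above can be sketched as follows; this is a generic illustration of repeated k-fold resampling, not the "caret" package's internal implementation:

```python
import random

def repeated_kfold(n, k=10, repeats=3, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated k-fold cross-validation:
    each repeat reshuffles the n sample indices and partitions them into k
    folds, with each fold serving once as the held-out test set."""
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        folds = [idx[i::k] for i in range(k)]  # k near-equal-sized folds
        for i in range(k):
            test = folds[i]
            train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
            yield train, test
```

With k=10 and three repeats, as used in this study, each candidate parameter value is therefore evaluated on 30 train/test partitions of the training data.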
“Optimal” values for tuning parameters were selected using three repetitions of a 10-fold cross-validation based on the original training data set, with the original test set removed completely from the cross-validation process (i.e., the original test set was not used for training or tuning any of the classification models). Tuning parameters were considered optimized for classification models that achieved the highest overall classification accuracy during the cross-validation process. Specific details on the tuning parameters used by the three machine learning algorithms examined in this study are listed in the following sections.

3.2.1. Decision Tree based models

For DT based classifications, several values were examined for the “maximum depth” tuning parameter, which controls the maximum depth of any single node in the tree. When using the “caret” package, an initial DT model is fit to all of the training data to obtain the maximum depth of any node; this value is then used to obtain an upper bound on values considered during subsequent model building using cross-validation (Kuhn, 2011). In general, using a larger maximum depth value will allow for a relatively complex tree to be built, with a potential increase in overall classification accuracy, whereas lower maximum depth values tend to build less complex trees, with potentially lower overall classification accuracies. By increasing the number of branching nodes (i.e., decision rules), the DT algorithm is capable of grouping a larger number of distinct observations present within a dataset. By default, the “rpart” package uses 10-fold cross-validation of the training data to internally obtain classification error rates (Therneau & Ripley, 2010). When using “rpart”, the appropriate sized tree is obtained using the “1 SE rule” established by Breiman et al. (1984), whereby the smallest-sized tree whose cross-validation error is within 1 standard error of the minimum cross-validation error is selected.
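The “1 SE rule” can be sketched as a small selection function, assuming candidate trees are ordered from smallest to largest and each has a cross-validation error and an associated standard error:

```python
def one_se_rule(cv_errors, cv_ses):
    """Breiman's '1 SE rule': among candidate models ordered from smallest
    (simplest) to largest, pick the smallest one whose cross-validation error
    lies within one standard error of the minimum observed error."""
    best = min(range(len(cv_errors)), key=lambda i: cv_errors[i])
    threshold = cv_errors[best] + cv_ses[best]
    for i, err in enumerate(cv_errors):
        if err <= threshold:
            return i  # index of the smallest qualifying model
```

The rule deliberately trades a statistically negligible amount of accuracy for a simpler, more interpretable tree that is less likely to overfit.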
The tree is then pruned using the “cost complexity” (cp) value that corresponds to the size of tree found using the “1 SE rule”. The cp parameter controls the condition at which noninformative splits are pruned from the tree (Therneau & Ripley, 2010). Using the “caret” package, the default cp value (0.01) used by the “rpart” package was maintained, and only the maximum depth parameter was tuned for DT based classifications.

3.2.2. Random Forest based models

For random forest (RF) based classifications, the default number of trees (500) was selected, since values larger than the default are known to have little influence on overall classification accuracy (Breiman & Cutler, 2007). The other adjustable RF tuning parameter, mtry, controls the number of variables randomly considered at each split in the tree building process, and is believed to have a “somewhat sensitive” influence on the performance of the RF algorithm (Breiman & Cutler, 2007). For categorical classifications based on the RF algorithm, the default value of the mtry parameter is √p, where p equals the number of predictor variables within a dataset (Liaw & Wiener, 2002).

3.2.3. Support Vector Machine based models

Classifications based on the support vector machine (SVM) algorithm used the radial basis function (RBF) kernel. Other kernels were not considered in this study. The parameters used by the SVM algorithm have been shown to influence overall classification accuracy (Burges, 1998). The two model tuning parameters for SVM models using the RBF kernel in the “kernlab” package are “cost” (C) and “sigma” (σ). Increasing the former leads to larger penalties for prediction errors, which may produce an over-fitted model (Alpaydin, 2004); increasing the latter affects the shape of the separating hyperplane (Huang et al., 2002), which may also influence overall classification accuracy.
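The role of σ can be seen directly in the RBF kernel, which kernlab parameterizes as k(x, y) = exp(-sigma * ||x - y||^2). A minimal sketch of that kernel function:

```python
import math

def rbf_kernel(x, y, sigma):
    """RBF kernel in the kernlab parameterization:
    k(x, y) = exp(-sigma * ||x - y||^2).
    Larger sigma makes similarity fall off faster with distance, producing a
    more tightly curved decision boundary (and a greater risk of overfitting)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sigma * sq_dist)
```

Note that some libraries (e.g., scikit-learn's `gamma`) use the same parameterization under a different name, while others use exp(-||x - y||^2 / (2 * sigma^2)); checking which convention applies matters when porting tuned values between packages.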
An analytical method for directly estimating σ from the training data has been implemented in the kernlab package using the “sigest” function (Karatzoglou et al., 2004). The “caret” package estimates an appropriate value for the σ parameter using the sigest function by default; therefore, only the C parameter was tuned when running the SVM algorithm with the RBF kernel (Kuhn, 2011).

4. Results

4.1. Tuning of machine learning algorithm parameters

For DT-based classifications, values ranging from 1 to 8 were examined for the “maximum depth” tuning parameter. Based on the highest overall classification accuracy (i.e., the percentage of correctly classified samples) achieved by pixel-based and object-based models (85.4% and 83.3%, respectively), a maximum depth value of 8 was selected for both pixel-based and object-based classification models. Several values for the mtry tuning parameter (2–4, 6–8, 10–12, 14) were examined for the pixel-based RF classification. For the pixel-based RF classification, the highest classification accuracy value (91.1%) was obtained with an mtry value of 7. A total of 10 mtry parameter values (2, 35, 68, 101, 134, 167, 200, 233, 266, and 300) were examined for the object-based RF classifications. Based on the highest classification accuracy obtained (93.1%), an mtry value of 68 was selected for the object-based RF classification. For the pixel-based and object-based classifications using the SVM algorithm, a total of 10 values for the C parameter (0.25, 0.5, 1, 2, 4, 8, 16, 32, 64, and 128) were examined. The value for the σ parameter was held constant at 0.0928 for pixel-based classifications, and at 0.00361 for object-based classifications. The best pixel-based and object-based classifications using the SVM algorithm (overall accuracy of 89.8% and 91.4%, respectively) were obtained using C parameter values of 8 and 1, respectively.
Models with optimized tuning parameter values were used to produce the subsequent image classifications, associated accuracy assessments, and map comparisons.

4.2. Visual examination of thematic maps

Pixel-based and object-based image classifications using the three examined machine learning algorithms are depicted in Figs. 3 and 4, respectively. Post-classification clean up (e.g., pixel-based filtering, GIS-based adjustment of classes, etc.) of the final thematic maps was not performed. A visual overview of the pixel-based classifications is presented first, followed by the object-based classifications, and a comparison of outputs produced using both image analysis approaches and all three machine learning algorithms.

4.2.1. Pixel-based classifications

Fig. 3. Comparison of pixel-based classifications: A) SPOT-5 10 m HRG false color image of study area (R—NIR, G—Red, B—Green); B) Decision tree based classification; C) Random forest based classification; D) Support vector machine based classification.

For the pixel-based classifications (Fig. 3), the major visual difference interpreted between thematic maps produced by the three different algorithms was the amount of wetland or riparian land cover depicted in the southern quarter of the study area. For tree-based classifications (Fig. 3B and C), the south-western corner of the study area depicts riparian vegetation, whereas the map produced by the SVM algorithm (Fig. 3D) depicts this area as dominated by mixed grasslands dotted primarily with wetlands. A visual inspection of this area using available high spatial resolution imagery and color orthoimagery revealed that this area is predominantly covered in vegetation typical of a mixed grasslands land cover type, although small stream channels can be seen filled with vegetation, indicating the presence of a riparian land cover class.
Small areas of wetland vegetation are also present in the high resolution imagery. Two predominant patches of exposed rock/soil, shown as blue-white patches on the left portion of Fig. 3A, are best classified by the SVM algorithm, while both the RF and DT algorithms depict these areas with patches of crop land. In general, while all three pixel-based classifications produced a similarly speckled “salt-and-pepper” appearance, the DT and RF based classifications showed noticeably less of this speckle in the depiction of large crop land areas (e.g., see the north-eastern corner of Fig. 3C). Overall, the pixel-based classification using the SVM algorithm (Fig. 3D) appears to contain less speckle compared to the DT and RF classifications. The classification based on the SVM algorithm also appears to show fewer errors of commission in the classification of mixed grassland vegetation along the north-western area, especially along channels containing riparian vegetation on the north side of the river.

4.2.2. Object-based classifications

Fig. 4. Comparison of object-based classifications: A) SPOT-5 10 m HRG false color image of study area (R—NIR, G—Red, B—Green); B) Decision tree based classification; C) Random forest based classification; D) Support vector machine based classification.

As with the pixel-based classifications, the major visual difference interpreted between thematic maps produced using object-based image analysis (Fig. 4) is in the relative amount of wetland, riparian, and mixed grassland land cover depicted in the southern half of the study area. For tree-based classifications (Fig. 4B and C), the southern half of the study area depicts larger patches of riparian vegetation, whereas the SVM algorithm (Fig. 4D) depicts this area as predominantly mixed grassland. The thematic maps based on DT and SVM algorithms (Fig. 4B and C) show several noticeable errors of commission, namely the misclassification of riparian land cover as wetland within the main river channel. All three object-based classifications misclassify small areas of riparian and exposed rock/soil land cover located along the riverbank as mixed grasslands. The two object-based classifications using the RF and SVM algorithms show little indication of commission error when classifying crop land alongside riparian channels on the northern slope of the river channel, whereas several patches of misclassified crop land are present in this area of the object-based DT classification map. Wetland vegetation present in the northern part of the study area appears well defined by all three object-based classification algorithms, although several errors of commission are noticeable in large inundated fields.

4.2.3. Visual comparison of pixel-based and object-based classifications

In general, all land cover maps show a reasonably accurate visual depiction of the broad land cover types of interest in this area. When the same machine learning algorithm is compared, both pixel-based and object-based classifications showed similar patterns. For example, the predominance of mixed grassland areas in the southern portion of the study area was noticeably higher in pixel-based and object-based classifications that utilized the SVM algorithm when compared to classifications based on tree-based algorithms. Wetland and riparian areas were generally well defined, although different algorithms and image analysis approaches differed slightly in their specific depictions of these land cover types.
Wetland areas appeared to be best represented by the SVM based classifications, particularly when using the object-based approach, which accurately portrayed vegetation encircling areas of open water, although this quality is present to varying degrees when using tree-based classifications. Likewise, the depiction of riparian vegetation was relatively consistent across approaches and algorithms, with pixel-based classifications producing the most visually accurate depictions along steep ridges and narrow channels. Crop land was best depicted by object-based classifications due to their generalized appearance; however, the less speckled appearance of crop land in pixel-based RF and DT based classifications was also considered adequate. Pixel-based classifications based on RF and SVM algorithms produced more visually accurate depictions of sand bars (exposed rock/soil land cover type) in riparian areas than any of the object-based classifications.

4.3. Accuracy assessment and statistical comparisons

An accuracy assessment was performed for each classification produced in this study to evaluate how well predictions based on the optimized models, generated using repeated k-fold cross-validation, compared against the “hold-out” test data. Table 4 contains detailed confusion matrices of classification accuracies based on the test data. Overall, both pixel-based and object-based classifications performed similarly with respect to overall classification accuracy. In general, all land cover types achieved over 80% user's accuracy, with the exception of wetland land cover types, which scored below 80% when using pixel-based image analysis, or object-based image analysis using the DT algorithm. Producer's accuracy for several land cover types was relatively consistent for both pixel-based and object-based classifications, but specific differences between machine learning algorithms were apparent.
For example, producer's accuracy for the crop land cover type was consistently over 80% for both pixel-based and object-based classifications, except when using the SVM classifier, where it decreased to 75% for both image analysis approaches. All pixel-based classifications achieved a producer's accuracy of 77.27% for the wetland land cover type, while object-based classifications using the RF and SVM algorithms achieved over 95% for this class. Pixel-based classifications that utilized the DT algorithm had the lowest overall classification accuracy (87.6%), followed by SVM (89.26%) and RF (89.67%) classifications (Fig. 5). The same general trend was observed for object-based classifications, with the DT algorithm obtaining the lowest overall classification accuracy (88.84%), followed by RF (93.39%) and SVM (94.21%) algorithms. Exact 95% confidence limits, calculated on the results obtained with the “hold-out” test data set, reveal a wide variability and overlap in the overall accuracy reported between pixel-based and object-based classifications. Based on these results, the lowest performing classification model (pixel-based DT) potentially scored within the range of the best performing RF and SVM classifications (Fig. 5). Based on a comparison between predictions made with optimized classification models built using repeated k-fold cross-validation (see Section 4.1) and the “hold-out” test data, the McNemar test indicated that the observed difference between pixel-based and object-based classifications was not statistically significant (p>0.05) when the same machine learning algorithm was used (e.g., a DT classification model using pixel-based or object-based image analysis). With pixel-based image analysis, the observed difference in classification accuracy between all three machine learning algorithms was not statistically significant (p>0.05).
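The exact 95% confidence limits referred to above are exact (Clopper-Pearson) binomial intervals. A sketch that recovers such limits by bisection on the binomial CDF, using only the standard library (an illustration of the method; the paper does not describe how its intervals were computed):

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def exact_ci(k, n, alpha=0.05):
    """Exact (Clopper-Pearson) confidence interval for a binomial proportion,
    with k successes out of n trials, found by bisection on the binomial CDF."""
    def solve(f):
        lo, hi = 0.0, 1.0
        for _ in range(60):  # bisection; 60 halvings is ample precision
            mid = (lo + hi) / 2
            lo, hi = (mid, hi) if f(mid) else (lo, mid)
        return (lo + hi) / 2
    # Lower limit: the p at which P(X >= k; n, p) rises to alpha/2
    lower = 0.0 if k == 0 else solve(lambda p: 1 - binom_cdf(k - 1, n, p) <= alpha / 2)
    # Upper limit: the p at which P(X <= k; n, p) falls to alpha/2
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p) > alpha / 2)
    return lower, upper
```

For the pixel-based DT classification, 212 of the 242 test samples were correctly classified (the diagonal sum of its confusion matrix in Table 4), which reproduces the reported 82.78–91.48% interval.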
For object-based classifications, a statistically significant difference at the 5% level was observed in classification accuracy between models using DT and RF algorithms (p=0.01162), and between DT and SVM algorithms (p=0.006714). The difference in overall classification accuracy between object-based classifications utilizing the RF and SVM algorithms was not statistically significant (p>0.05).

5. Discussion

In general, classifications produced using either pixel-based or object-based image analysis created similar and visually acceptable depictions of the broad land cover classes present within the study area. As expected, compared to the pixel-based classifications, the object-based classifications offered a more generalized visual appearance and a more contiguous depiction of land cover, which perhaps better represents how land cover interpreters and analysts actually perceive the landscape (Stuckens et al., 2000). In some cases, the generalized depiction of land cover classes produced by object-based image analysis may account for an apparent preference for object-based classifications over slightly better performing pixel-based classifications (e.g., Dorren et al., 2003). Nevertheless, additional processing of pixel-based imagery, either prior to or after classification, can also produce similar generalized representations of land cover, so such differences may in fact be largely trivial, at least when considering the use of medium spatial resolution imagery (10–30 m pixels). When comparing overall classification accuracy (percentage of classes correctly predicted), there is an apparently consistent, but small (1–4%), improvement when using object-based image analysis over pixel-based image analysis (see Table 4 and Fig. 5).
However, the large variability depicted by the exact 95% confidence intervals suggests that the sample size of the “hold-out” test data set (242) was too small for assessing such differences; therefore, any apparent trend reported here should be considered tentative. Balancing a sampling effort that is economically feasible and logistically possible with one that allows for statistically rigorous comparisons is a major consideration in operational settings where resources are often limited (Congalton, 1991). A sample size that is too large can waste valuable resources by providing unnecessary precision, whereas a sampling effort that is too small may not be capable of resolving any statistically meaningful differences when comparing classification accuracies (Foody, 2009). Despite the low sample size of the test set and the associated wider confidence limits, the McNemar test revealed that, when utilizing the same machine learning algorithm, the observed difference between pixel-based and object-based classification accuracy was not significant at the 5% level. The findings in this study suggest that, on the basis of achieving better overall classification accuracy for the application described in this study, there is no statistical basis for preferring pixel-based to object-based image analysis when utilizing the same machine learning algorithm.

Table 4
Confusion matrices and associated classifier accuracies based on test data. A = crop land, B = mixed grasslands, C = exposed rock/soil, D = riparian, E = water, F = wetland; Oa = overall classification accuracy, Pa = producer's accuracy, Ua = user's accuracy, CI = confidence interval.
Pixel-based, decision tree (Oa: 87.60%; 95% CI: 82.78–91.48%)
          A      B      C      D      E      F   Total      Ua
A        27      3      0      0      0      2      32  84.38%
B         1     60      1      5      0      3      70  85.71%
C         1      0     13      0      0      0      14  92.86%
D         3      4      0     72      0      0      79  91.14%
E         0      1      0      1     23      0      25  92.00%
F         0      1      0      4      0     17      22  77.27%
Total    32     69     14     82     23     22     242
Pa   84.38% 86.96% 92.86% 87.80% 100.00% 77.27%

Object-based, decision tree (Oa: 88.84%; 95% CI: 84.18–92.52%)
          A      B      C      D      E      F   Total      Ua
A        26      0      1      0      0      0      27  96.30%
B         1     63      1      1      1      3      70  90.00%
C         1      0     12      0      1      1      15  80.00%
D         3      4      0     80      1      2      90  88.89%
E         0      0      0      0     18      0      18  100.00%
F         1      2      0      1      2     16      22  72.73%
Total    32     69     14     82     23     22     242
Pa   81.25% 91.30% 85.71% 97.56% 78.26% 72.73%

Pixel-based, random forest (Oa: 89.67%; 95% CI: 85.13–93.20%)
          A      B      C      D      E      F   Total      Ua
A        27      2      0      0      0      0      29  93.10%
B         1     61      1      0      0      3      66  92.42%
C         1      1     13      0      0      0      15  86.67%
D         3      3      0     80      0      2      88  90.91%
E         0      0      0      0     19      0      19  100.00%
F         0      2      0      2      4     17      25  68.00%
Total    32     69     14     82     23     22     242
Pa   84.38% 88.41% 92.86% 97.56% 82.61% 77.27%

Object-based, random forest (Oa: 93.39%; 95% CI: 89.49–96.17%)
          A      B      C      D      E      F   Total      Ua
A        27      1      0      0      1      0      29  93.10%
B         0     65      1      0      0      1      67  97.01%
C         1      0     13      0      0      0      14  92.86%
D         3      3      0     82      0      0      88  93.18%
E         0      0      0      0     18      0      18  100.00%
F         1      0      0      0      4     21      26  80.77%
Total    32     69     14     82     23     22     242
Pa   84.38% 94.20% 92.86% 100.00% 78.26% 95.45%

Pixel-based, support vector machine (Oa: 89.26%; 95% CI: 84.66–92.86%)
          A      B      C      D      E      F   Total      Ua
A        24      2      1      1      0      1      29  82.76%
B         4     63      2      0      1      1      71  88.73%
C         1      1     11      0      0      0      13  84.62%
D         2      1      0     81      0      3      87  93.10%
E         0      0      0      0     20      0      20  100.00%
F         1      2      0      0      2     17      22  77.27%
Total    32     69     14     82     23     22     242
Pa   75.00% 91.30% 78.57% 98.78% 86.96% 77.27%

Object-based, support vector machine (Oa: 94.21%; 95% CI: 90.40–96.80%)
          A      B      C      D      E      F   Total      Ua
A        24      0      1      0      0      0      25  96.00%
B         3     68      1      0      0      0      72  94.44%
C         1      0     11      0      0      0      12  91.67%
D         3      1      0     82      0      0      86  95.35%
E         0      0      0      0     21      0      21  100.00%
F         1      0      1      0      2     22      26  84.62%
Total    32     69     14     82     23     22     242
Pa   75.00% 98.55% 78.57% 100.00% 91.30% 100.00%
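The accuracy figures in Table 4 can be reproduced from any one of the confusion matrices; a sketch, following the table's layout (rows are classified labels, columns are reference labels):

```python
def accuracies(matrix):
    """Overall, user's (row-wise), and producer's (column-wise) accuracy from a
    square confusion matrix whose rows are classified (predicted) classes and
    whose columns are reference classes."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))  # diagonal = agreements
    overall = correct / total
    users = [matrix[i][i] / sum(matrix[i]) for i in range(n)]
    producers = [matrix[i][i] / sum(row[i] for row in matrix) for i in range(n)]
    return overall, users, producers
```

For example, the pixel-based DT matrix yields an overall accuracy of 212/242 ≈ 87.60%, a user's accuracy of 27/32 = 84.38% for crop land, and a producer's accuracy of 23/23 = 100% for water, matching the table.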
In addition, when using pixel-based image analysis, there was no statistically significant difference observed at the 5% level of significance between classification accuracies achieved by any of the machine learning algorithms. These findings are largely corroborated by the large overlap in confidence intervals depicted in Fig. 5. Nonetheless, when using object-based image analysis, statistically significant differences (p<0.05) were observed for classification accuracies achieved by SVM and RF algorithms when compared to DT-based classifications. Unfortunately, the McNemar test as implemented here cannot be used for one-sided hypothesis testing (Foody, 2004), and the wide degree of overlap in the 95% confidence intervals for overall accuracy (Fig. 5) suggests that definitively asserting which classification algorithm or image analysis approach is capable of producing higher classification accuracies would be problematic based on the “hold-out” test set used in this study. Other studies have indicated that both RF and SVM algorithms can achieve similar overall classification accuracies, which are typically greater than those obtained using DT based algorithms. For example, Pal (2005) found that both SVM and RF algorithms produced similar classification accuracies. Gislason et al. (2006) reported that RF based models achieved higher classification accuracies than those produced by standard DTs (i.e., DTs that did not utilize bagging or boosting algorithms). These results differed from those reported by Otukei and Blaschke (2010), who found that DTs generally performed better than classifications produced using SVM.
As with this study, the previous examples were based on medium- and relatively coarse-spatial resolution imagery (Landsat MSS, TM, ETM+) and used similar broad land cover classes; however, these comparisons relied on comparing overall classification accuracy values (i.e., the percentage of correctly classified samples) rather than on statistical comparison as employed here and elsewhere (e.g., Foody, 2009). When comparing overall accuracies between object-based and pixel-based classifications of Landsat-5 TM imagery, Dingle Robertson and King (2011) found no statistical difference between approaches. However, two studies (Yan et al., 2006; Whiteside et al., 2011) found that overall classification accuracies produced using object-based image analysis were significantly higher (p=0.001 and p=0.01, respectively) than those produced using pixel-based image analysis, with both studies using medium spatial resolution EO imagery (ASTER and SPOT-5 HRG, respectively). Contrary to the side-by-side comparison conducted in this study, these previous studies compared different classifiers (e.g., MLC and K-NN) and image analysis methods, making direct comparisons difficult. Furthermore, as illustrated in this study, examination of confidence intervals around the overall classification accuracy assessments can reveal significant overlap in overall accuracies between image analysis approaches, confounding the interpretation of two-sided tests of significance such as McNemar's test (Foody, 2009), which have also been used in previous comparisons (e.g., Dingle Robertson & King, 2011; Yan et al., 2006; Whiteside et al., 2011). Potential remedies include collecting a larger “hold-out” test sample to assess whether the large overlap in confidence intervals would remain, along with an appropriate means of testing a one-sided hypothesis for such a comparison.
Unfortunately, the collection and use of an adequately sized “hold-out” test set might be prohibitive to assemble for logistical or financial reasons, and would represent an “inefficient use of data”, as these data are, by definition, not utilized by the classifier (Kohavi, 1995). Implementing a repeated k-fold cross-validation, as illustrated in this study, with a larger dataset may provide statistically rigorous results without “wasting” data, while at the same time allowing for one-sided hypothesis testing to be performed (e.g., Kuhn, 2008). From a practical production standpoint, the setup and execution of the object-based classifications were more labor intensive than their pixel-based counterparts. Much of the difference in execution time encountered was due to a lack of commercially available software for image analysis that implements the machine learning algorithms examined in this study. This lack of a streamlined production environment multiplied the number of software packages needed and the amount of data transfers required. In addition, many of the present comparisons between pixel-based and object-based classifications of EO imagery in the available literature to date appear to rely on commercially available software solutions that provide relatively outdated and/or less advanced classification methods. The present study, along with others (e.g., Brenning, 2009, 2010), fills this void by providing a methodological basis for conducting statistically rigorous comparisons between classification outputs generated from EO imagery using freely available open-source software (e.g., R Development Core Team, 2010). Regardless of which software packages are used, differences in execution time between pixel-based and object-based image analysis still remain.
For example, the time spent selecting object-based variables (i.e., “object features”) is roughly similar to that involved in selecting variables for a pixel-based classification; however, the additional time needed to select appropriate parameters for the underlying image segmentation is not trivial, especially if the tasks include mapping large overlapping scenes of imagery in an operational setting. Future development and adoption of more quantitative approaches for selecting optimal image segmentation parameters (e.g., Costa et al., 2008; Drăgut et al., 2010) will hopefully reduce the time required for this important step, while at the same time producing superior results to the qualitative trial-and-error methods that are typically practiced now. In addition, faced with potentially hundreds of object features from which to select, the use of more advanced feature selection algorithms in object-based image analysis is gaining increasing attention (e.g., Yu et al., 2006; Chan & Paelinckx, 2008). Considered together, object-based image analysis will likely remain more labor intensive compared to pixel-based image analysis, which is a factor that should be evaluated carefully when conducting image analysis of EO imagery in an operational environment. While classification accuracy is an important attribute to consider, in circumstances where there are few overall statistical differences between image analysis approaches, other preferences may take precedence.

Fig. 5. Comparison of overall classification accuracy (percent correct) of pixel-based and object-based classifications using three supervised machine learning algorithms: Decision Tree (DT), Random Forest (RF), and Support Vector Machine (SVM). Results based on “hold-out” test set. Exact 95% confidence intervals plotted.
For example, the node-based decision logic diagrams of DT-based models may prove preferable to users over the potentially higher overall classification accuracies achievable with the RF algorithm. While statistically significant differences in overall classification accuracy were not observed in this study between pixel-based and object-based image analysis when utilizing the same machine learning algorithm, there may be other compelling reasons for selecting one image analysis approach over another. For example, object-based image analysis may prove more appropriate in situations that rely on the logic of updating and backdating image objects within a versatile GIS environment (e.g., Linke et al., 2009; Linke & McDermid, 2011). As previously mentioned, end users may prefer the generalized appearance of object-based classification maps as compared to pixel-based classification maps, even when pixel-based accuracy assessments are shown to be superior (Dorren et al., 2003). Such examples illustrate that the selection of an image analysis approach, or of an individual classification algorithm, may not always be driven by overall classification accuracy. 6. Conclusions Classification of EO imagery using pixel-based and object-based image analysis was performed using three machine learning algorithms. No statistical difference between object-based and pixel-based classifications was found when the same machine learning algorithms were compared. When conducting object-based image analysis, RF or SVM algorithms produced classification accuracies that were statistically different from those of DT-based algorithms. No statistically significant differences between pixel-based classifications were found. Based on visual assessments and interpretation of land cover distribution, all classifications were capable of depicting the broad land cover types selected for this study with similar, and acceptable, classification accuracies.
More visually adequate overall depictions of riparian, wetland, and crop land cover types were attributed to RF- and SVM-based classifications, whereas DT-based classifications contained noticeably more omission and commission errors in these classes. Object-based classifications were comparatively more time consuming to produce than their pixel-based counterparts. Based solely on overall classification accuracy, there appeared to be no advantage in selecting a particular image analysis approach. Funding This research was supported by the Government of Saskatchewan's Go Green Fund awarded to Dr. Monique Dubé, and by Dr. Steven Franklin's Natural Science and Engineering Research Council of Canada Discovery Grant. Acknowledgments The authors gratefully acknowledge the assistance of Gyanesh Chander (SGT Inc.) for providing the Thuillier solar spectrum calculations for the SPOT-5 HRG-1 and HRG-2 sensors; Claire Tinel (CNES, France) for advice concerning the derivation of radiometric calibration coefficients for the SPOT-5 HRG sensors; researchers at Agriculture and Agri-Food Canada (AAFC), the Saskatchewan Ministry of the Environment (MoE), and the Saskatchewan Research Council (Flysask.ca) for providing various data sets used in this study; and the anonymous peer reviewers, whose constructive comments and recommendations contributed greatly to improving the final manuscript. References Agresti, A. (2002). Categorical data analysis. John Wiley and Sons. Alpaydin, E. (2004). Introduction to machine learning. MIT Press. Baatz, M., & Schäpe, A. (2000). Multiresolution segmentation—an optimization approach for high quality multi-scale image segmentation. In J. Strobl, T. Blaschke, & G. Griesebner (Eds.), Angewandte Geographische Informationsverarbeitung XII (pp. 12–23). Heidelberg: Wichmann-Verlag. Benz, U. C., Hofmann, P., Willhauck, G., Lingenfelder, I., & Heynen, M. (2004).
Multiresolution, object-oriented fuzzy analysis of remote sensing data for GIS-ready information. ISPRS Journal of Photogrammetry and Remote Sensing, 58(3–4), 239–258. Bhanu, B., Lee, S., & Ming, J. (1995). Adaptive image segmentation using a genetic algorithm. IEEE Transactions on Systems, Man, and Cybernetics, 25(12), 1543–1567. Blaschke, T. (2010). Object based image analysis for remote sensing. ISPRS Journal of Photogrammetry and Remote Sensing, 65(1), 2–16. Breiman, L. (2001). Random Forests. Machine Learning, 45, 5–32. Breiman, L., & Cutler, A. (2007). Random forests — Classification description. Available at: http://stat-www.berkeley.edu/users/breiman/RandomForests/cc_home.htm [Accessed January 12, 2011] Breiman, L., Friedman, J., Stone, C., & Olshen, R. (1984). Classification and regression trees. Belmont, California: Chapman & Hall/CRC. Brenning, A. (2009). Benchmarking classifiers to optimally integrate terrain analysis and multispectral remote sensing in automatic rock glacier detection. Remote Sensing of Environment, 113(1), 239–247. Brenning, A. (2010). Land cover classification by multisource remote sensing: Comparing classifiers for spatial data. In H. Locarek-Junge, & C. Weihs (Eds.), Classification as a tool for research (pp. 435–443). Berlin, Heidelberg: Springer. Available at: http://www.springerlink.com/content/x55n3g1314766146/ [Accessed October 4, 2011] Burges, C. (1998). A tutorial on Support Vector Machines for Pattern Recognition. Data Mining and Knowledge Discovery, 2(2), 121–167. Carreiras, J. M. B., Pereira, J. M. C., Campagnolo, M. L., & Shimabukuro, Y. E. (2006). Assessing the extent of agriculture/pasture and secondary succession forest in the Brazilian Legal Amazon using SPOT VEGETATION data. Remote Sensing of Environment, 101(3), 283–298. Castilla, G., & Hay, G. J.
(2008). Image objects and geographic objects. In T. Blaschke, S. Lang, & G. J. Hay (Eds.), Object-based image analysis (pp. 91–110). Berlin, Heidelberg: Springer. Available at: http://www.springerlink.com/content/g403k30318784w36/ [Accessed October 1, 2011] Castillejo-González, I. L., López-Granados, F., García-Ferrer, A., Peña-Barragán, J. M., Jurado-Expósito, M., de la Orden, M. S., et al. (2009). Object- and pixel-based analysis for mapping crops and their agro-environmental associated measures using QuickBird imagery. Computers and Electronics in Agriculture, 68(2), 207–215. Chan, J., & Paelinckx, D. (2008). Evaluation of Random Forest and Adaboost tree-based ensemble classification and spectral band selection for ecotope mapping using airborne hyperspectral imagery. Remote Sensing of Environment, 112(6), 2999–3011. Chander, G., Markham, B. L., & Helder, D. L. (2009). Summary of current radiometric calibration coefficients for Landsat MSS, TM, ETM+, and EO-1 ALI sensors. Remote Sensing of Environment, 113(5), 893–903. Chavez, P. S., Jr. (1988). An improved dark-object subtraction technique for atmospheric scattering correction of multispectral data. Remote Sensing of Environment, 24(3), 459–479. CNES (2009). SPOT image quality performances. Available at: http://www.spotimage.com/automne_modules_files/standard/public/p551_29f05cbeaf21f085aab8a439d6fb4e14Performance_QI_Spot2009.pdf [Accessed January 11, 2011] Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20(1), 37–46. Congalton, R. G. (1991). A review of assessing the accuracy of classifications of remotely sensed data. Remote Sensing of Environment, 37(1), 35–46. Congalton, R. G., & Green, K. (1998). Assessing the accuracy of remotely sensed data: Principles and practices (1st ed.). CRC Press. Cooper, A. B., Smith, C. M., & Smith, M. J. (1995).
Effects of riparian set-aside on soil characteristics in an agricultural landscape: Implications for nutrient transport and retention. Agriculture, Ecosystems & Environment, 55(1), 61–67. Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20(3), 273–297. Costa, G. A. O. P., Feitosa, R. Q., Cazes, T. B., & Feijó, B. (2008). Genetic adaptation of segmentation parameters. In T. Blaschke, S. Lang, & G. J. Hay (Eds.), Object-based image analysis (pp. 679–695). Berlin, Heidelberg: Springer. Available at: http://www.springerlink.com/content/l7367p41j61715w5/ [Accessed July 27, 2011] Dingle Robertson, L., & King, D. J. (2011). Comparison of pixel- and object-based classification in land cover change mapping. International Journal of Remote Sensing, 32(6), 1505–1529. Dorren, L. K. A., Maier, B., & Seijmonsbergen, A. C. (2003). Improved Landsat-based forest mapping in steep mountainous terrain using object-based classification. Forest Ecology and Management, 183(1), 31–46. Drăgut, L., Tiede, D., & Levick, S. R. (2010). ESP: a tool to estimate scale parameter for multiresolution image segmentation of remotely sensed data. International Journal of Geographical Information Science, 24(6), 859. Dubé, M. G. (2003). Cumulative effect assessment in Canada: a regional framework for aquatic ecosystems. Environmental Impact Assessment Review, 23(6), 723–745. Duinker, P., & Greig, L. (2006). The impotence of cumulative effects assessment in Canada: Ailments and ideas for redeployment. Environmental Management, 37(2), 153–161. Foody, G. M. (2004). Thematic map comparison: Evaluating the statistical significance of differences in classification accuracy. Photogrammetric Engineering and Remote Sensing, 70(5), 627–634. Foody, G. M. (2009). Sample size determination for image classification accuracy assessment and comparison.
International Journal of Remote Sensing, 30(20), 5273. Franklin, S. E., & Peddle, D. R. (1990). Classification of SPOT HRV imagery and texture features. International Journal of Remote Sensing, 11(3), 551–556. Franklin, S. E., & Wulder, M. A. (2002). Remote sensing methods in medium spatial resolution satellite data land cover classification of large areas. Progress in Physical Geography, 26(2), 173–205. Friedl, M. A., & Brodley, C. E. (1997). Decision tree classification of land cover from remotely sensed data. Remote Sensing of Environment, 61(3), 399–409. Gislason, P. O., Benediktsson, J. A., & Sveinsson, J. R. (2006). Random Forests for land cover classification. Pattern Recognition Letters, 27(4), 294–300. Gordon, L. J., Peterson, G. D., & Bennett, E. M. (2008). Agricultural modifications of hydrological flows create ecological surprises. Trends in Ecology & Evolution, 23(4), 211–219. Gregory, S., Swanson, F., McKee, W., & Cummins, K. (1991). An ecosystem perspective of riparian zones. BioScience, 41(8). Huang, C., Davis, L. S., & Townshend, J. R. G. (2002). An assessment of support vector machines for land cover classification. International Journal of Remote Sensing, 23(4), 725–749. Huel, D. (2000). Managing Saskatchewan Wetlands: A landowner's guide. Available at: http://www.swa.ca/Publications/Documents/ManagingSaskatchewanWetlands.pdf [Accessed November 11, 2011] Karatzoglou, A., Smola, A., Hornik, K., & Zeileis, A. (2004). kernlab – An S4 Package for Kernel Methods in R. Journal of Statistical Software, 11(9), 1–20. Kim, M., Madden, M., & Warner, T. (2008). Estimation of optimal image object size for the segmentation of forest stands with multispectral IKONOS imagery. Object-based image analysis (pp. 293–307). Berlin, Heidelberg: Springer. Available at: http://www.springerlink.com/content/u7741201m568u327/ [Accessed January 11, 2011] Kohavi, R. (1995).
A study of cross-validation and bootstrap for accuracy estimation and model selection. International joint conference on artificial intelligence (pp. 1137–1143). Available at: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.48.529 Kuhn, M. (2008). Building predictive models in R using the caret package. Journal of Statistical Software, 28(5), 1–26. Kuhn, M. (2011). The caret package. Available at: http://cran.r-project.org/web/packages/caret/vignettes/caretTrain.pdf [Accessed October 3, 2011] Laliberte, A. S., Fredrickson, E. L., & Rango, A. (2007). Combining decision trees with hierarchical object-oriented image analysis for mapping arid rangelands. Photogrammetric Engineering and Remote Sensing, 73(2), 197–207. Laliberte, A. S., Koppa, J. S., Fredrickson, E. L., & Rango, A. (2006). Comparison of nearest neighbor and rule-based decision tree classification in an object-oriented environment. In: IEEE International Geoscience and Remote Sensing Symposium Proceedings, July 31–August 4, 2006, Denver, Colorado. Liaw, A., & Wiener, M. (2002). Classification and regression by randomForest. R News, 2(3), 18–22. Linke, J., & McDermid, G. J. (2011). A conceptual model for multi-temporal landscape monitoring in an object-based environment. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 4(2), 265–271. Linke, J., McDermid, G. J., Laskin, D. N., McLane, A. J., Pape, A., Cranston, J., et al. (2009). A disturbance-inventory framework for flexible and reliable landscape monitoring. Photogrammetric Engineering and Remote Sensing, 75(8), 981–995. Mathieu, R., Aryal, J., & Chong, A. K. (2007). Object-based classification of Ikonos imagery for mapping large-scale vegetation communities in urban areas. Sensors, 7(11), 2860–2880. McKenzie, D. P., Mackinnon, A. J., Péladeau, N., Onghena, P., Bruce, P. C., Clarke, D. M., et al. (1996). Comparing correlated kappas by resampling: Is one level of agreement significantly different from another?
Journal of Psychiatric Research, 30(6), 483–492. Morissette, J. T., & Khorram, T. A. (1998). Exact binomial confidence interval for proportions. Photogrammetric Engineering and Remote Sensing, 64. Myint, S. W., Gober, P., Brazel, A., Grossman-Clarke, S., & Weng, Q. (2011). Per-pixel vs. object-based classification of urban land cover extraction using high spatial resolution imagery. Remote Sensing of Environment, 115(5), 1145–1161. Naiman, R., & Décamps, H. (1997). The ecology of interfaces: Riparian zones. Annual Review of Ecology and Systematics, 28(1), 621–658. Noble, B. F. (2008). Strategic approaches to regional cumulative effects assessment: a case study of the Great Sand Hills, Canada. Impact Assessment and Project Appraisal, 26, 78–90. Otukei, J. R., & Blaschke, T. (2010). Land cover change assessment using decision trees, support vector machines and maximum likelihood classification algorithms. International Journal of Applied Earth Observation and Geoinformation, 12(Supplement 1), S27–S31. Pal, M. (2005). Random forest classifier for remote sensing classification. International Journal of Remote Sensing, 26(1), 217. Pike, R. J. (2000). Geomorphometry — Diversity in quantitative surface analysis. Progress in Physical Geography, 24(1), 1–20. Platt, R. V., & Rapoza, L. (2008). An evaluation of an object-oriented paradigm for land use/land cover classification. The Professional Geographer, 60(1), 87. R Development Core Team (2010). R: A language and environment for statistical computing. Vienna, Austria. Available at: http://www.R-project.org/ Ryherd, S., & Woodcock, C. (1996). Combining spectral and texture data in the segmentation of remotely sensed images. Photogrammetric Engineering and Remote Sensing, 62(2), 181–194. Seitz, N. E., Westbrook, C. J., & Noble, B. F. (2011). Bringing science into river systems cumulative effects assessment practice. Environmental Impact Assessment Review, 31(3), 172–179. Smith, A. (2010).
Image segmentation scale parameter optimization and land cover classification using the Random Forest algorithm. Journal of Spatial Science, 55(1), 69. Song, C., Woodcock, C. E., Seto, K. C., Lenney, M. P., & Macomber, S. A. (2001). Classification and change detection using Landsat TM data: When and how to correct atmospheric effects? Remote Sensing of Environment, 75(2), 230–244. Squires, A. J., Westbrook, C. J., & Dubé, M. G. (2009). An approach for assessing cumulative effects in a model river, the Athabasca River Basin. Integrated Environmental Assessment and Management (pp. 1). Stehman, S. V. (1997). Selecting and interpreting measures of thematic classification accuracy. Remote Sensing of Environment, 62(1), 77–89. Stuckens, J., Coppin, P. R., & Bauer, M. E. (2000). Integrating contextual information with per-pixel classification for improved land cover classification. Remote Sensing of Environment, 71(3), 282–296. Therneau, T. M., & Ripley, B. A. (2010). rpart: Recursive Partitioning. Available at: http://CRAN.R-project.org/package=rpart Thompson, W. H., & Hansen, P. L. (2001). Classification and management of riparian and wetland sites of the Saskatchewan prairie ecozone and parts of adjacent subregions. Available at: http://www.swa.ca/Publications/Documents/ClassificationManagementRiparianWetlandSites.pdf [Accessed July 1, 2011] Thuillier, G., Hersé, M., Labs, D., Foujols, T., Peetermans, W., Gillotay, D., et al. (2003). The solar spectral irradiance from 200 to 2400 nm as measured by the SOLSPEC spectrometer from the ATLAS and EURECA missions. Solar Physics, 214(1), 1–22. Trimble (2010). eCognition® Developer 8.64.0 reference book. Available at: http://www.definiens.com/ [Accessed January 11, 2011] Trimble (2010). eCognition® Developer 8.64.0 user guide. Available at: http://www.definiens.com/ [Accessed January 11, 2011] US EPA (2005).
National management measures to protect and restore wetlands and riparian areas for the abatement of nonpoint source pollution. Available at: http://water.epa.gov/polwaste/nps/wetmeasures/index.cfm [Accessed July 26, 2011] Van Coillie, F. M. B., Verbeke, L. P. C., & De Wulf, R. R. (2007). Feature selection by genetic algorithms in object-based classification of IKONOS imagery for forest mapping in Flanders, Belgium. Remote Sensing of Environment, 110(4), 476–487. Vapnik, V. (1998). Statistical learning theory. Wiley-Interscience. Whiteside, T. G., Boggs, G. S., & Maier, S. W. (2011). Comparing object-based and pixel-based classifications for mapping savannas. International Journal of Applied Earth Observation and Geoinformation, 13(6), 884–893. Yan, G., Mas, J. F., Maathuis, B. H. P., Xiangmin, Z., & Van Dijk, P. M. (2006). Comparison of pixel-based and object-oriented image classification approaches: A case study in a coal fire area, Wuda, Inner Mongolia, China. International Journal of Remote Sensing, 27, 4039–4055. Yu, Q., Gong, P., Clinton, N., Biging, G., Kelly, M., & Schirokauer, D. (2006). Object-based detailed vegetation classification with airborne high spatial resolution remote sensing imagery. Photogrammetric Engineering and Remote Sensing, 72(7), 799–811. Zar, J. H. (2009). Biostatistical analysis (5th ed.). Prentice Hall.