Journal of Statistics Education, Volume 18, Number 3, (2010)

Fun with the R Grid Package

Lutong Zhou
University of Western Ontario

W. John Braun
University of Western Ontario

Journal of Statistics Education Volume 18, Number 3 (2010)
www.amstat.org/publications/jse/v18n3/zhou.pdf

Copyright © 2010 by Lutong Zhou and W. John Braun all rights reserved. This text may be freely shared among individuals, but it may not be republished in any medium without express written consent from the authors and advance notification of the editor.

Key Words: Cave plot; Fire data; grob; gTree; gList; Viewport

Abstract

The increasing popularity of R is leading to an increase in its use in undergraduate courses at universities (R Development Core Team 2008). One of the strengths of R is the flexible graphics provided in its base package. However, students often run up against its limitations, or they find the amount of effort to create an interesting plot may be excessive. The grid package (Murrell 2005) has a wealth of graphical tools which are more accessible to such R users than many people may realize. The purpose of this paper is to highlight the main features of this package and to provide some examples to illustrate how students can have fun with this different form of plotting and to see that it can be used directly in the visualization of data.

1. Introduction

There is increasing interest in teaching R at the undergraduate, and even early undergraduate level. Graphics is (or at least should be) featured prominently in elementary statistical computing courses. Standard plotting using base graphics is relatively straightforward to learn, but students can find themselves running up against difficulties fairly quickly. For example, placement of titles and labels in standard formats is easy, but placing a label at an oblique angle might require some ingenuity.
Displaying several plots on one page is also easy, using par(mfrow), and with effort, the margins around the panels can be controlled using par() settings such as oma, etc. Controlling the size, shape and location of the panels requires even more effort, if it is possible at all. These are among the many situations that students might face. An experienced R programmer might be able to handle them, but not beginners.

We believe that grid provides a convenient way of producing certain kinds of plots and pictures. Perhaps more importantly, it provides the statistics student with a new perspective on graphics, and we hope to demonstrate in this article that producing plots in this new way can be an enjoyable experience as well.

We have found that the grid package is an interesting and surprisingly simple way to do a lot of things that are either difficult or impossible using the more traditional base graphics. Initially, grid poses more of a challenge than base graphics, because there are a few key concepts which must be absorbed first. However, once those key concepts (or even just one of those key concepts, the viewport) are understood, students can construct graphs in new and surprising ways.

The grid package is an R package, developed by Paul Murrell in the 1990s. Among other things, it gives us an easier way to produce plots at specified locations of the plotting region. Customized multiple plots can be produced more easily using grid.

The first purpose of this paper is to provide a brief but sufficiently complete introduction to grid that could be used in an introductory statistical computing course. We will describe the very basic ideas of the grid package, introducing viewports, editing graphic objects (grobs), and demonstrating how grid can be an alternative to traditional graphics.
The second purpose of the paper is to demonstrate that, in addition to what is already available in the lattice package, the grid package offers flexibility which can be exploited in a variety of ways to visualize data.

1.1 Using Viewports

The grid package is loaded into R as follows:

    library(grid)

The viewport is the central feature of the grid package. It gives us a rectangular region which is used to orient a plot. Figure 1 exhibits an example of a viewport inside the plotting region:

    vp <- viewport(x=0.5, y=0.5, width=0.9, height=0.9)

The vp object contains rules for how a viewport can be created. This one is centered at (.5, .5), with width .9 and height .9 relative to the graphics window. Nothing would actually appear on the computer screen, but we display the outline of the viewport in the left panel of Figure 1.

Figure 1. Left panel: An empty viewport outlined by dashed lines, which do not actually appear if only the given code is used. Right panel: a circle plotted in the viewport vp using grid.circle.

After we construct a viewport, we need to tell R to use it. This is done with the pushViewport() function. After a viewport has been “pushed”, a graphics window is created on the graphics device, if it wasn’t already there. When we type

    pushViewport(vp)

the viewport, vp, becomes the focus of our current plotting. Note that the dashed lines which outline the viewport would normally not appear. By default, the coordinates of the lower left corner of the viewport are (0,0), and the upper right corner has coordinates (1,1).
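The claim that vp spans 0.9 of the graphics window can be checked numerically. What follows is a minimal sketch, not part of the original article: it assumes an off-screen PDF device of 7 × 7 inches (an arbitrary choice made here) and uses grid's convertWidth() and convertHeight() to express one full viewport width and height, unit(1, "npc"), in inches.

```r
library(grid)

# Assumption: an off-screen 7 x 7 inch device, chosen only for illustration.
pdf(NULL, width = 7, height = 7)
grid.newpage()

vp <- viewport(x = 0.5, y = 0.5, width = 0.9, height = 0.9)
pushViewport(vp)

# Inside vp, (0,0) is the lower left corner and (1,1) the upper right,
# so 1 npc is the full width (or height) of vp.
vp_width_in  <- convertWidth(unit(1, "npc"), "inches", valueOnly = TRUE)
vp_height_in <- convertHeight(unit(1, "npc"), "inches", valueOnly = TRUE)

upViewport()
invisible(dev.off())

print(c(width = vp_width_in, height = vp_height_in))
```

Under the stated assumption, both values should come out close to 6.3 inches, that is, 0.9 of the 7 inch device.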
As a first example, we will draw a circle of radius 0.3 centered at (0.6, 0.4), after outlining the viewport with a rectangle whose sides are drawn with dashed line segments:

    pushViewport(vp)
    # a rectangle (with dashed lines) on the border of the viewport:
    grid.rect(gp=gpar(lty="dashed"))
    # a circle centered at (.6,.4) with radius .3:
    grid.circle(x=0.6, y=0.4, r=0.3)

The result is shown in the right panel of Figure 1.

Another example is displayed in Figure 2. This can be drawn by calling the function stickperson() whose code is displayed here:

    stickperson <- function() {
        grid.circle(x=.5, y=.8, r=.1, gp=gpar(fill="yellow"))
        grid.lines(c(.5,.5), c(.7,.2))   # vertical line for body
        grid.lines(c(.5,.7), c(.6,.7))   # right arm
        grid.lines(c(.5,.3), c(.6,.7))   # left arm
        grid.lines(c(.5,.65), c(.2,0))   # right leg
        grid.lines(c(.5,.35), c(.2,0))   # left leg
    }

The code uses the gp argument when drawing the circle; this argument takes a large number of graphical parameters which are specified by gpar(). This is the grid version of the par() function used in base graphics. Many, but not all, of the parameters are the same as in base graphics. Here, we have used the fill parameter in order to colour the interior of the circular head yellow. The built-in grid.lines() function is also used; this constructs line segments in the usual 1×1 viewport. The first segment is drawn from (.5,.7) to (.5,.2), the second segment is drawn from (.5,.6) to (.7,.7), and so on.

Figure 2. A simple stick-person.

Besides plotting within a viewport, we can construct and push additional viewports within a previously pushed viewport.
For example, we can create a second (smaller) viewport within the existing viewport as shown in Figure 3:

    vp1 <- viewport(x=0.5, y=0.75, width=0.6, height=0.3)
    pushViewport(vp1)

The specifications on the new viewport (vp1) are relative to the viewport that it has been pushed into (i.e. vp). For example, the width is .6 units relative to vp, but relative to the original plotting region (the 1×1 box), the width is .6×.9 = .54. We will refer to vp1 as the child of vp, and vp as the parent of vp1.

Figure 3. A viewport within a viewport.

We can change our focus from one viewport to another and back, using the pushViewport() and upViewport() functions.

Figure 4. Relationships among parent and child viewports.

Figure 4 shows a few of the possibilities. A parent viewport can have several child viewports. By using pushViewport(), our focus moves from the parent viewport to the child viewport. upViewport() moves the focus back to the parent viewport.

The commands downViewport() and popViewport() also change our focus. For example, after returning to vp, the command downViewport(vp1) moves us back from vp to vp1. It is also possible to push viewports repeatedly, allowing a given viewport to have not only child viewports, but grandchildren and great-grandchildren and so on. More details can be found in Murrell (1999).

Having pushed vp1 (after pushing vp), we can construct a plot in vp1. In the left panel of Figure 5, we display a blue circle which is centered in vp1 (by default) and which has radius 0.5 (by default). Code for this is:

    grid.circle(gp=gpar(col="blue"))
    # plot the outline of vp1:
    grid.rect()

Figure 5. Left Panel: a circle within the child viewport vp1, which was pushed inside the parent viewport vp; Right Panel: adding a circle in the parent viewport vp, after moving up from the child viewport vp1.
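Both the relative sizing of a child viewport and the movement of focus between parent and child can be checked without looking at a plot. The following is a minimal sketch under stated assumptions, not the authors' code: it uses an off-screen 10 × 10 inch device (chosen here so that fractions of the plotting region read directly as inches), and it gives the two viewports explicit names, because downViewport() locates a viewport by its name.

```r
library(grid)

# Assumption: a 10 x 10 inch off-screen device, so that fractions of the
# plotting region translate directly into inches.
pdf(NULL, width = 10, height = 10)
grid.newpage()

# The names "vp" and "vp1" are supplied here so that the viewports can be
# looked up later; they are an addition made for this sketch.
vp  <- viewport(x = 0.5, y = 0.5, width = 0.9, height = 0.9, name = "vp")
vp1 <- viewport(x = 0.5, y = 0.75, width = 0.6, height = 0.3, name = "vp1")

pushViewport(vp)
pushViewport(vp1)

# Width of the child in inches: 0.6 of the parent, which is 0.9 of the device.
child_width_in <- convertWidth(unit(1, "npc"), "inches", valueOnly = TRUE)

focus_after_push <- current.viewport()$name   # focus is on the child
upViewport()
focus_after_up <- current.viewport()$name     # back in the parent
downViewport("vp1")
focus_after_down <- current.viewport()$name   # in the child again

upViewport(2)
invisible(dev.off())

print(child_width_in)
print(c(focus_after_push, focus_after_up, focus_after_down))
```

On the assumed device the child's width should be close to 5.4 inches, which matches .6×.9 = .54 of the plotting region, and the recorded names trace the focus from vp1 up to vp and back down to vp1.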
To return our focus to vp, we type

    upViewport()

As before, we draw another circle, this time in purple, as shown in the right panel of Figure 5.

    grid.circle(gp=gpar(col="purple"))
    # centered at (0.5, 0.5) with radius of 0.5 as default.

This confirms that our focus has been moved back to the parent viewport. Note that if we had pushed a viewport twice in a row, we could use the upViewport() function twice to return to the original viewport, or we could use

    upViewport(2)

The argument in brackets determines the number of generations to move up the viewport tree.

The following example shows that nesting viewports can be done repeatedly, and seemingly, indefinitely. Again, grid.lines() is used to draw the diagonal line segments: the first segment is drawn from (.05,.95) to (.95,.05) and the second segment is drawn from (.05,.05) to (.95,.95). Next, a for loop is used to create a sequence of nested viewports, all of which have heights and widths which are 90% of the lengths of the heights and widths of their parents. A simple rectangle is drawn at each stage, resulting in the “tunnel” appearance displayed in the left panel of Figure 6.

    pushViewport(viewport())
    grid.lines(c(.05, .95), c(.95, .05))
    grid.lines(c(.05, .95), c(.05, .95))
    for (i in 1:100) {
        vp <- viewport(h=.9, w=.9)
        pushViewport(vp)
        grid.rect()
    }

Figure 6. Left Panel: The result of nesting 100 viewports within each other. Right Panel: Three stick-people walking through the tunnel, each scaled automatically corresponding to their location in the tunnel.

The right panel of Figure 6 shows 3 stick-people walking through the tunnel at various distances. Note that in order to draw the figure on the left, we have to push 100 viewports. Therefore, we can get back to the original plotting region by applying:

    upViewport(100)

From here, we push 5 viewports before drawing the “nearest” stick-person.
By pushing a viewport at x=.8, the person is drawn on the right side of the tunnel. The second stickperson is drawn at the 20th viewport, and on the left side of the tunnel. The last person is drawn at the 30th viewport in the center.

    for (i in 1:30) {
        vp <- viewport(h=.9, w=.9)
        pushViewport(vp)
        # person 1:
        if(i == 5) {
            pushViewport(viewport(x=.8))
            stickperson()
            upViewport()
        }
        # person 2:
        if(i == 20) {
            pushViewport(viewport(x=.2))
            stickperson()
            upViewport()
        }
        # person 3:
        if(i == 30) stickperson()
    }

1.2 Using Viewports to Display Data

Figure 7 displays information about escape fires, based on a realistic, but hypothetical, data set. Every year, a certain proportion of wildfires cannot be controlled immediately and continue to grow, often rapidly. The following vector gives fairly realistic values for certain parts of North America.

    escape_prop

    cutoff], main=maintitle2, type="count", xlab=xlab)
    print(latticePlot2,newpage=FALSE)
    }

A.3 Cave Plots

The following function can be used to draw a single cave plot:

    caveplot <- function(a,b,atime,btime){
        xrange <- range(c(atime, btime))
        vp1 <- viewport(x=0.5, y=0.5, width=.9, height=.9,
                        xscale=xrange, yscale=c(0, max(a)+max(b)))
        pushViewport(vp1)
        grid.rect()
        n <- length(a)
        m <- length(b)
        grid.segments(unit(atime,"native"), rep(0,n),
                      unit(atime,"native"), unit(a, "native"),
                      gp=gpar(col="blue"))
        grid.segments(unit(btime,"native"),
                      unit(rep(max(a)+max(b),m),"native"),
                      unit(btime,"native"),
                      unit(max(a)+max(b)-b,"native"),
                      gp=gpar(col="orange"))
    }

The first two arguments specify the nonnegative time series to be compared in the plot. These are drawn as vertical segments protruding upward from the bottom and downward from the top, respectively. The last two arguments specify the horizontal locations of the respective segments (i.e. the time indices for the two series).
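The vertical layout used by caveplot() can be examined without drawing anything. The sketch below is an illustration only: cave_segments() is a hypothetical helper introduced here (it is not part of the article's code), and the two short series are invented values, not fire data. The helper returns the y-coordinates of the segment ends on the native scale c(0, max(a)+max(b)) that caveplot() sets up.

```r
# Hypothetical helper: endpoints of the cave plot segments on the native
# y-scale c(0, max(a) + max(b)) used by caveplot().
cave_segments <- function(a, b) {
  top <- max(a) + max(b)
  list(
    top      = top,
    lower_y0 = rep(0, length(a)),    # first series starts at the floor ...
    lower_y1 = a,                    # ... and rises to its own value
    upper_y0 = rep(top, length(b)),  # second series starts at the ceiling ...
    upper_y1 = top - b               # ... and hangs down by its own value
  )
}

# Invented illustrative values (not real data).
a <- c(0, 2.5, 1, 4)
b <- c(3, 0, 1.5, 2)

seg <- cave_segments(a, b)

# The tallest upward segment reaches max(a); the longest downward segment
# ends at top - max(b) = max(a), so the two sets can touch but never cross.
print(seg$upper_y1)
print(max(seg$lower_y1) <= min(seg$upper_y1))
```

Choosing the scale height as max(a)+max(b) is what guarantees the separation: no upward segment can rise above max(a), and no downward segment can fall below it.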
In Figure 18, we have used the above function to plot lightning and rain, but we have added in additional information on duff moisture code (DMC) and number of fires as well. The following function allows us to construct such a plot for a single district from the data set we are using. The width and height arguments in the following function control the size of the viewport that the cave plot will be drawn in.

    smallcaveplot <- function(district, width, height){
        DIS <- subset(swf, District==district)
        DIS <- DIS[complete.cases(DIS),]
        vp1 <- viewport(x=unit(DIS$LONGITUDE[1], "native"),
                        y=unit(DIS$LATITUDE[1], "native"),
                        width=unit(width, "native"),
                        height=unit(height, "native"))
        pushViewport(vp1)
        caveplot(DIS$RAIN, sqrt(DIS$NumStrikes), DIS$julian, DIS$julian)
        DIS1 <- subset(DIS, NumFiresRep!=0)
        # only draw a circle if there is
        # at least one fire reported
        if (length(DIS$julian)!=0){
            if (dim(DIS1)[1] > 0){
                grid.points(unit(DIS1$julian, "native"), DIS1$NumFiresRep,
                            gp=gpar(col=2))}}
        grid.lines(unit(DIS$julian, "native"), unit(DIS$DMC,"native"),
                   gp=gpar(col=3, lty=1))
        grid.text(district, x=0.5, y=0.5)
        upViewport(2)
    }

The multiple cave plots which are pictured in Figure 19 can be constructed using repeated calls to the above function.

    # call the function to produce the plots
    smallcaveplot("FOR", 4.5, 2.7)
    smallcaveplot("SAU", 4.5, 2.7)
    smallcaveplot("NOR", 4.5, 2.7)
    smallcaveplot("RED", 4.5, 2.7)
    smallcaveplot("HEA", 4.5, 2.7)

A.4 The Ellipse Plots

The code for Figure 21 is given in this subsection.
First, we need the map:

    data(ONTbound)
    xrange <- c(-98, -70)
    yrange <- c(40, 58)
    vp <- viewport(x=.5, y=.5, width=0.8, height=0.8,
                   xscale=xrange, yscale=yrange)
    pushViewport(vp)
    grid.rect(gp=gpar(lty="dashed"))
    grid.xaxis()
    grid.yaxis()
    upViewport()
    pushViewport(viewport(x=.5, y=.5, width=0.8, height=0.8,
                          xscale=xrange, yscale=yrange, clip="on"))
    grid.lines(unit(ONTbound$V1, "native"), unit(ONTbound$V2, "native"),
               gp=gpar(col="purple"))

The following function draws an ellipse at a weather station’s longitude and latitude.

    Ellipseloc <- function(Dis){
        swf070185 <- subset(swf, MONTH==7&DAY==1)  # pick one day from the data set
        swf71 <- subset(swf070185, District==Dis)  # choose the weather station
        c1 <- swf71$LONGITUDE
        c2 <- swf71$LATITUDE
        vp1 <- viewport(x=unit(c1, "native"), y=unit(c2, "native"),
                        width=.1, height=.1)
        pushViewport(vp1)
        ellipse(h=swf71$WIND_SPEED/10, angle=swf71$WIND_DIR,
                gp1=gpar(col="red"))
        grid.text(Dis, x=0.5, y=0.5)  # label the ellipse with
                                      # the weather station name
        upViewport()
    }

Multiple calls to the above function allow us to draw the ellipses at the weather station locations:

    Ellipseloc("RED")
    Ellipseloc("FOR")
    Ellipseloc("THU")
    Ellipseloc("HEA")
    Ellipseloc("NOR")
    Ellipseloc("SAU")
    grid.text("Wind Speed", x=0.2, y=0.2)

The following function allows us to draw a curve and its mirror image. This function is the basis of the ellipse() function used to construct the wind speed and direction plot described above. Syntax is similar to the gCurve() function (and the curve() function).
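The reflection step at the heart of this function can be sketched on its own. In the function's listing, the second curve is drawn with y-values 2 * min(y) - y, which is the first curve reflected in the horizontal line through its lowest point. The minimal sketch below assumes a simple parabola, chosen only for illustration, and the name "mirrorpair" is likewise an assumption made here. It builds the two lines grobs and bundles them in a gTree, as the listing does, without drawing them.

```r
library(grid)

# Illustrative curve (an assumption made here): y = x^2 on [0, 1].
x <- seq(0, 1, length.out = 11)
y <- x^2

# Mirror image: reflect y in the horizontal line through min(y).
y_mirror <- 2 * min(y) - y

l1 <- linesGrob(x = x, y = y, default.units = "native")
l2 <- linesGrob(x = x, y = y_mirror, default.units = "native")

# "mirrorpair" is a hypothetical name used only in this sketch.
pair <- gTree(children = gList(l1, l2), name = "mirrorpair")

# Each point and its image are equidistant from the line y = min(y).
print(all.equal((y + y_mirror) / 2, rep(min(y), length(y))))
print(length(childNames(pair)))
```

Returning the pair as a single gTree means that the curve and its image can later be handled as one graphical object.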
    "grid.mirror" 0 with log=\"x\"")
            exp(seq(log(from), log(to), length = n))
        } else seq(from, to, length = n)
        y <- eval(expr, envir = list(x = x), enclos = parent.frame())
        require(grid)
        l1 <- linesGrob(x = x, y = y, default.units = default.units,
                        gp = gp1, vp = vp)
        l2 <- linesGrob(x = x, y = 2 * min(y) - y,
                        default.units = default.units, gp = gp2, vp = vp)
        if (draw) {
            grid.draw(l1)
            grid.draw(l2)
        }
        if (default.units == "native" & return == TRUE)
            upViewport(1)
        invisible(gTree(children=gList(l1, l2), name=name))
    }

A.5 grid.identify()

The grid.identify() function depends on the built-in function grid.locator() and is as follows:

    grid.identify <- function (x, y, labels, n=length(x), color=TRUE,
                               col=seq(1,n)) {
        labelresult <- NULL
        if (length(col) < n) {col <- rep(col, length=n)}
        for (i in 1:n) {
            nx <- length(x)
            locatedpoint <- grid.locator()
            distance2 <- (as.numeric(locatedpoint[[1]]) - x)^2 +
                         (as.numeric(locatedpoint[[2]]) - y)^2
            obsno <- seq(1,nx)[min(distance2)==distance2]
            if (is.factor(labels)) {labels <- as.character(labels)}
            if (color) {grid.points(x[obsno], y[obsno],
                                    gp=gpar(col=col[i]), pch=3)}
            labelresult <- c(labelresult, unique(labels[obsno]))
        }
        print(labelresult)
    }

References

Becker, R.A., Clark, L.A., and Lambert, D. (1994) “Cave plots: a graphical technique for comparing time series,” Journal of Computational and Graphical Statistics, 3, 277–283.

R Development Core Team (2008) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org.

Murrell, P. (2005) R Graphics. Chapman and Hall/CRC, Boca Raton, Florida.

Murrell, P. (1999) “A mechanism for arranging plots on a page,” Journal of Computational and Graphical Statistics, 8, 121–134.
Lutong Zhou
University of Western Ontario
carly zhou@hotmail.com

W. John Braun
University of Western Ontario