R绘图基础

2021-10-30 23:01:17

一，布局

Ｒ绘图所占的区域，被分成两大部分，一是外围边距，一是绘图区域。

外围边距可使用par()函数中的oma来进行设置。比如oma=c(4,3,2,1)，就是指外围边距分别为下边距：4行，左边距3行，上边距2行，右边距1行。很明显这个设置顺序是从x轴开始顺时针方向。这里的行是指可以显示1行普通字体。所以当我们使用mtext中的line参数时，设置的大小就应该是[0,行数)的开区间。当我们使用mtext在外围边距上书写内容时，设置mtext中的outer=TRUE即可。

绘图区域可使用par()函数中的mfrow, mfcol来进行布局。mfrow和mfcol可以使用绘图区域被区分为多个区域。默认值为mfrow(1,1)。

比如mfrow(2,3)就是指将绘图区域分成2行3列，并按行的顺序依次绘图填充；
比如mfcol(3,2)就是指将绘图区域分成3行2列，并按列的顺序依次绘图填充；

我们将每一个细分的绘图区域分为两个部分，一是绘图边距，一是主绘图。

绘图边距需要容纳的内容有坐标轴，坐标轴标签，标题。通常来讲，我们都只需要一个x轴，一个y轴，所以在设置时，一般的下边距和左边距都会大一些。如果多个x轴或者y轴，才考虑将上边距或者右边距放大一些。绘图边距可以使用par()函数中mar来设置。比如mar=c(4,3,2,1)，与外围边距的设置类似，是指绘图边距分别为下边距：4行，左边距3行，上边距2行，右边距1行。很明显这个设置顺序是从x轴开始顺时针方向。行的概念与之前的相同。也可以使用mai来设置。mai与mar唯一不同之处在于mai不是以行为单位，而是以inch为单位。

SOUTH<-1; WEST<-2; NORTH<-3; EAST<-4;

 

GenericFigure <- function(ID, size1, size2)

{

  plot(0:10, 0:10, type="n", xlab="X", ylab="Y")

  text(5,5, ID, col="red", cex=size1)

  box("plot", col="red")

  mtext(paste("cex",size2,sep=""), SOUTH, line=3, adj=1.0, cex=size2, col="blue")

  title(paste("title",ID,sep=""))

}

 

MultipleFigures <- function()

{

  GenericFigure("1", 3, 0.5)

  box("figure", lty="dotted", col="blue")

 

  GenericFigure("2", 3, 1)

  box("figure", lty="dotted", col="blue")

 

  GenericFigure("3", 3, 1.5)

  box("figure", lty="dotted", col="blue")

 

  GenericFigure("4", 3, 2)

  box("figure", lty="dotted", col="blue")

}

 

par(mfrow=c(2,2),mar=c(6,4,2,1),oma=c(4,3,2,1))

 

MultipleFigures()

 

box("inner", lty="dotted", col="green")

box("outer", lty="solid", col="green")

 

mtext("Outer Margin Area (oma) of South: 6", SOUTH, line=1, cex=1, outer=TRUE)

 

plotline<-function(n,direc){

  for(i in 0:n){

    mtext(paste("line",i,sep=""), direc, line=i, cex=1, col="black", adj=1, outer=TRUE)

  }

}

plotline(4,SOUTH)

Ｒ绘图布局

但是，使用mfrow,mfcol只能是矩阵似的布局，如果我们需要简单地实际不规则的布局，那该怎么办呢？还有强大的layout()专门用于布局。

layout(mat, widths = rep(1, ncol(mat)), heights = rep(1,nrow(mat)),respect = FALSE)

其中，mat就是一个距阵，假设我们要画N个图，那么，mat就要描述每个图所在的位置，其中1…N-1都必须至少出现过一次。比如有三个图，我们希望的布局是第一排有一个图，第二排有两个图，那么mat<-matrix(c(1,1,2,3), nrow=2, ncol=2, byrow = TRUE)；如果希望第一排有两个图，第二排有一个图，那么就是mat<-matrix(c(1,2,3,3),nrow=2,ncol=2,byrow=TRUE)。很明显，还是将画布分成许多小格矩阵，这里就是2Ｘ2的，如果希望第1个图放入第一排的1，2格，那就在矩阵的第1，2的位置写入1，如果是希望第2个图放在第一排的1，2格，那就在矩阵的第1，2的位置写入2。

> attach(mtcars)

> layout(matrix(c(1,1,2,3), 2, 2, byrow = TRUE))

> hist(wt)

> hist(mpg)

> hist(disp)

layout不规则布局0

那么参数中的widths和heights又是做什么用的呢？它们就是用来指定每行或者每列的宽度和高度的。我们可以做以下的比较：

> attach(mtcars)

> layout(matrix(c(1,1,1,1,2,2,2,3,2,2,2,3),3,4,byrow=TRUE))

> hist(wt)

> hist(mpg)

> hist(disp)

> layout(matrix(c(1,1,2,3), 2, 2, byrow = TRUE),

+   	 widths=c(3,1), heights=c(1,2))

> hist(wt)

> hist(mpg)

> hist(disp)

layout不规则布局1

我们发现，这两种方法得到的是相同的结果，在代码的书写上，第2种方法要更清晰一些。

想象一下，似乎每个图的位置必须是连在一起的小矩阵，如果，不构成连在一起小矩阵，会成为怎么样的一个结果呢？

> attach(mtcars)

> layout(matrix(c(1,1,2,1,1,1),nrow=2,ncol=3,byrow=T))

> hist(wt)

> hist(mpg)

layout内嵌图布局

如果需要大图，小图，内嵌图这layout()虽然很容易实现，但多少有点麻烦，不是那么随心所欲。还是回过头来使用par()吧。使用par()的中fig=在画布任意位置上画图。在使用fig参数时，需要把画布理解成左下角为坐标(0,0)，右上角为(1,1)的一个坐标系。fig=c(x1,x2,y1,y2)来设置该参，x1<x2,y1<y2，x1,y1定位绘图区的左下角，x2,y2定位绘图区的右上角。使用new=TRUE参数来确认是否在原画布上继续画，还重新在一张新画布上开始画。

> par(fig=c(0,0.85,0,0.85), new=F)

> plot(mtcars$wt, mtcars$mpg, xlab="Miles Per Gallon",

+   ylab="Car Weight")

> par(fig=c(0,0.8,0.55,1), new=TRUE)

> boxplot(mtcars$wt, horizontal=TRUE, axes=FALSE)

> par(fig=c(0.65,1,0,0.8),new=TRUE)

> boxplot(mtcars$mpg, axes=FALSE)

> mtext("Enhanced Scatterplot", side=3, outer=TRUE, line=-3)

> par(fig=c(0.4,0.75,0.4,0.7),new=TRUE,mar=c(2,2,0,0),mgp=c(1,.4,0),cex=1,cex.lab=0.7,cex.axis=0.7)

> hist(mtcars$mpg, main="")

par实现内嵌图

二，颜色

计算机保存及还原颜色时有多种方案，较为常用的是两个，RGB和HSV。R预设了657种颜色，可以通过colors()函数调用（或者英式拼写colours()）。比如我们常用的红，绿，蓝，

> colors()[c(552,254,26)]

[1] "red"   "green" "blue"

我们可以使用grep来调取我们感兴趣的颜色，

> grep("red",colors())

[1] 100 372 373 374 375 376 476 503 504 505 506 507 524 525 526 527 528 552 553

[20] 554 555 556 641 642 643 644 645

 

> colors()[grep("red",colors())]

[1] "darkred" "indianred" "indianred1" "indianred2"

[5] "indianred3" "indianred4" "mediumvioletred" "orangered"

[9] "orangered1" "orangered2" "orangered3" "orangered4"

[13] "palevioletred" "palevioletred1" "palevioletred2" "palevioletred3"

[17] "palevioletred4" "red" "red1" "red2"

[21] "red3" "red4" "violetred" "violetred1"

[25] "violetred2" "violetred3" "violetred4"

 

> colors()[grep("sky",colors())]

[1] "deepskyblue" "deepskyblue1" "deepskyblue2" "deepskyblue3"

[5] "deepskyblue4" "lightskyblue" "lightskyblue1" "lightskyblue2"

[9] "lightskyblue3" "lightskyblue4" "skyblue" "skyblue1"

[13] "skyblue2" "skyblue3" "skyblue4"

 

> SetTextContrastColor <- function(color)

+ {

+   ifelse( mean(col2rgb(color)) > 127, "black", "white")

+ }

> # Define this array of text contrast colors that correponds to each

> # member of the colors() array.

> TextContrastColor <- unlist( lapply(colors(), SetTextContrastColor) )

 

> # 1a. Plot matrix of R colors, in index order, 25 per row.

> # This example plots each row of rectangles one at a time.

> colCount <- 25 # number per row

> rowCount <- 27

> plot( c(1,colCount), c(0,rowCount), type="n", ylab="", xlab="",

+   axes=FALSE, ylim=c(rowCount,0))

> title("R colors")

>

> for (j in 0:(rowCount-1))

+ {

+   base <- j*colCount

+   remaining <- length(colors()) - base

+   RowSize <- ifelse(remaining < colCount, remaining, colCount)

+   rect((1:RowSize)-0.5,j-0.5, (1:RowSize)+0.5,j+0.5,

+     border="black",

+     col=colors()[base + (1:RowSize)])

+   text((1:RowSize), j, paste(base + (1:RowSize)), cex=0.7,

+     col=TextContrastColor[base + (1:RowSize)])

+ }

Ｒ颜色表

对于大多数理工出身的人来讲，理解颜色并不难，难的是如何选择一种或者多种理想的颜色，让绘图很漂亮。Ｒ当中有一个包RColorBrewer就可以为我们解决这个难题，其中预设了很多种颜色组合以供我们使用。

> library(RColorBrewer)

> display.brewer.all()

RColorBrewer中预设的颜色表

我们看到其中Set3和Paired设定了12种颜色，其余的有多有少，以9种居多。在使用其颜色时，使用brewer.pal(n, name)调用即可，其中n最小值为3最大值为每组预设值数组的长度。比如brewer.pal(12,Paired)

在绘图时，有如下颜色参数

参数	描述
col	绘图使用的颜色，许多函数接受一组颜色，并对不同的数据依次使用颜色。
col.axis	坐标轴字符颜色
col.lab	x,y坐标标记颜色
col.main	标题颜色
col.sub	副标题颜色
fg	绘图前景色，包括坐标轴，各类boxes
bg	绘图背景色

> require(graphics)

> par(col.axis="green",col.lab="blue",col.main="darkred",fg="white",bg="black")

> plot(cars,main="speed vs dist")

颜色设置示例

三，字体

字体参数如下：

参数	描述
font	字体描述，1正常，2加粗，3斜体，4加粗，斜体，5符号
font.axis	坐标轴字符描述
font.lab	坐标轴标记字体描述
font.main	标题字体描述
font.sub	副标题字体描述
ps	字体点阵大小，大约为1/72英寸。在使用时text size=ps*cex
cex	字体放大或者缩小多少倍
cex.axis	坐标轴字符大小
cex.lab	坐标轴标记字体大小
cex.main	标题字体大小
cex.sub	副标题字体大小
family	绘图字体。标准字体是”serif”,”sans”,”mono”,”symbol”。当然可以指定任何自己已有的字体库。但它是设备依赖的。

> fonts<-names(pdfFonts())

> cnt<-length(fonts)

> x<-cnt+10

> x<-1:x

> y<-x

> pdf("font.pdf")

> plot(x,y,type="n")

> for(i in 1:cnt){text(i,i*1.3,family=fonts[i],paste("font",fonts[i]),adj=0)}

> dev.off()

字体

四，符号与线形

画图中的符号由pch参数来控制。其描边色由col控制，填充色由bg控制。

> pchShow <-

+   function(extras = c("*",".", "o","O","0","+","-","|","%","#"),

+            cex = 3, ## good for both .Device=="postscript" and "x11"

+            col = "red3", bg = "gold", coltext = "brown", cextext = 1.2,

+            main = paste("plot symbols :  points (...  pch = *, cex =",

+                         cex,")"))

+   {

+     nex <- length(extras)

+     np  <- 26 + nex

+     ipch <- 0:(np-1)

+     k <- floor(sqrt(np))

+     dd <- c(-1,1)/2

+     rx <- dd + range(ix <- ipch %/% k)

+     ry <- dd + range(iy <- 3 + (k-1)- ipch %% k)

+     pch <- as.list(ipch) # list with integers & strings

+     if(nex > 0) pch[26+ 1:nex] <- as.list(extras)

+     plot(rx, ry, type="n", axes = FALSE, xlab = "", ylab = "",

+          main = main)

+     abline(v = ix, h = iy, col = "lightgray", lty = "dotted")

+     for(i in 1:np) {

+       pc <- pch[[i]]

+       ## 'col' symbols with a 'bg'-colored interior (where available) :

+       points(ix[i], iy[i], pch = pc, col = col, bg = bg, cex = cex)

+       if(cextext > 0)

+           text(ix[i] - 0.3, iy[i], pc, col = coltext, cex = cextext)

+     }

+   }

>

> pchShow()

符号

线形主要由lty和lwd来控制。lty: line type. lwd: line width.顾名思意，lty控制线的形状，而lwd控制线的粗细，默认值为1。设计成2表示两倍线宽。

线形

五，坐标轴，图例，标记与标题

六，各类图型

点，线，面

点：points，线：abline,lines,segments，面：box,polygon,polypath,rect，特殊的：arrows,symbols

points不仅仅可以画前文中pch所设定的任意一个符号，还可以以字符为符号。

> ## ------------ test code for various pch specifications -------------

> # Try this in various font families (including Hershey)

> # and locales.  Use sign=-1 asserts we want Latin-1.

> # Standard cases in a MBCS locale will not plot the top half.

> TestChars <- function(sign=1, font=1, ...)

+ {

+    if(font == 5) { sign <- 1; r <- c(32:126, 160:254)

+    } else if (l10n_info()$MBCS) r <- 32:126 else r <- 32:255

+    if (sign == -1) r <- c(32:126, 160:255)

+    par(pty="s")

+    plot(c(-1,16), c(-1,16), type="n", xlab="", ylab="",

+         xaxs="i", yaxs="i")

+    grid(17, 17, lty=1)

+    for(i in r) try(points(i%%16, i%/%16, pch=sign*i, font=font,...))

+ }

> TestChars()

> try(TestChars(sign=-1))

画点

abline可以由斜率和截距来确定一条直线，lines可以连接两个或者多个点，segments可以按起止位置画线。

> require(stats)

> sale5 <- c(6, 4, 9, 7, 6, 12, 8, 10, 9, 13)

> plot(sale5,new=T)

> abline(lsfit(1:10,sale5))

> abline(lsfit(1:10,sale5, intercept = FALSE), col= 4)

> abline(h=6, v=8, col = "gray60")

> text(8,6, "abline( h = 6 )", col = "gray60", adj = c(0, -.1))

> abline(h = 4:8, v = 6:12, col = "lightgray", lty=3)

> abline(a=1, b=2, col = 2)

> text(5,11, "abline( 1, 2 )", col=2, adj=c(-.1,-.1))

> segments(6,4,9,5,col="green")

> text(6,5,"segments(6,4,9,5)")

> lines(sale5,col="pink")

线段

box画出当前盒子的边界,polygon画多边形,polypath画路径,rect画距形。

> x <- c(1:9,8:1)

> y <- c(1,2*(5:3),2,-1,17,9,8,2:9)

> op <- par(mfcol=c(3,1))

> for(xpd in c(FALSE,TRUE,NA)) {

+   plot(1:10, main = paste("xpd =", xpd))

+   box("figure", col = "pink", lwd=3)

+   polygon(x,y, xpd=xpd, col="orange", lty=2, lwd=2, border="red")

+ }

> par(op)

> plotPath <- function(x, y, col = "grey", rule = "winding") {

+     plot.new()

+     plot.window(range(x, na.rm = TRUE), range(y, na.rm = TRUE))

+     polypath(x, y, col = col, rule = rule)

+     if (!is.na(col))

+         mtext(paste("Rule:", rule), side = 1, line = 0)

+ }

>

> plotRules <- function(x, y, title) {

+     plotPath(x, y)

+     plotPath(x, y, rule = "evenodd")

+     mtext(title, side = 3, line = 0)

+     plotPath(x, y, col = NA)

+ }

>

> op <- par(mfrow = c(5, 3), mar = c(2, 1, 1, 1))

>

> plotRules(c(.1, .1, .9, .9, NA, .2, .2, .8, .8),

+           c(.1, .9, .9, .1, NA, .2, .8, .8, .2),

+           "Nested rectangles, both clockwise")

> plotRules(c(.1, .1, .9, .9, NA, .2, .8, .8, .2),

+           c(.1, .9, .9, .1, NA, .2, .2, .8, .8),

+           "Nested rectangles, outer clockwise, inner anti-clockwise")

> plotRules(c(.1, .1, .4, .4, NA, .6, .9, .9, .6),

+           c(.1, .4, .4, .1, NA, .6, .6, .9, .9),

+           "Disjoint rectangles")

> plotRules(c(.1, .1, .6, .6, NA, .4, .4, .9, .9),

+           c(.1, .6, .6, .1, NA, .4, .9, .9, .4),

+           "Overlapping rectangles, both clockwise")

> plotRules(c(.1, .1, .6, .6, NA, .4, .9, .9, .4),

+           c(.1, .6, .6, .1, NA, .4, .4, .9, .9),

+           "Overlapping rectangles, one clockwise, other anti-clockwise")

>

> par(op)

> require(grDevices)

> ## set up the plot region:

> op <- par(bg = "thistle")

> plot(c(100, 250), c(300, 450), type = "n", xlab="", ylab="",

+      main = "2 x 11 rectangles; 'rect(100+i,300+i,  150+i,380+i)'")

> i <- 4*(0:10)

> ## draw rectangles with bottom left (100, 300)+i

> ## and top right (150, 380)+i

> rect(100+i, 300+i, 150+i, 380+i, col=rainbow(11, start=.7,end=.1))

> rect(240-i, 320+i, 250-i, 410+i, col=heat.colors(11), lwd=i/5)

> ## Background alternating  ( transparent / "bg" ) :

> j <- 10*(0:5)

> rect(125+j, 360+j,   141+j, 405+j/2, col = c(NA,0),

+      border = "gold", lwd = 2)

> rect(125+j, 296+j/2, 141+j, 331+j/5, col = c(NA,"midnightblue"))

> mtext("+  2 x 6 rect(*, col = c(NA,0)) and  col = c(NA,\"m..blue\"))")

多边形

路径

矩形

arrows用于画箭头，symbols用于画符号

> ## Note that  example(trees)  shows more sensible plots!

> N <- nrow(trees)

> with(trees, {

+ ## Girth is diameter in inches

+ symbols(Height, Volume, circles = Girth/24, inches = FALSE,

+         main = "Trees' Girth") # xlab and ylab automatically

+ ## Colours too:

+ op <- palette(rainbow(N, end = 0.9))

+ symbols(Height, Volume, circles = Girth/16, inches = FALSE, bg = 1:N,

+         fg = "gray30", main = "symbols(*, circles = Girth/16, bg = 1:N)")

+ palette(op)

+ })

符号

画圆

> library(plotrix)

> plot(1:5,seq(1,10,length=5),type="n",xlab="",ylab="",main="Test draw.circle")

>  draw.circle(2,4,c(1,0.66,0.33),border="purple",

+   col=c("#ff00ff","#ff77ff","#ffccff"),lty=1,lwd=1)

>  draw.circle(2.5,8,0.6,border="red",lty=3,lwd=3)

>  draw.circle(4,3,0.7,border="green",lty=1,lwd=1)

>  draw.circle(3.5,7,0.8,border="blue",lty=2,lwd=2)

圆

散点图及趋势线

一维点图使用dotchart函数。

> # Dotplot: Grouped Sorted and Colored

> # Sort by mpg, group and color by cylinder

> x <- mtcars[order(mtcars$mpg),] # sort by mpg

> x$cyl <- factor(x$cyl) # it must be a factor

> x$color[x$cyl==4] <- "red"

> x$color[x$cyl==6] <- "blue"

> x$color[x$cyl==8] <- "darkgreen"

> dotchart(x$mpg,labels=row.names(x),cex=.7,groups= x$cyl,

+   	 main="Gas Milage for Car Models\ngrouped by cylinder",

+    xlab="Miles Per Gallon", gcolor="black", color=x$color)

一维点图

二维散点图使用plot函数。直趋势线使用abline函数，拟合曲线在拟合后使用line函数绘制。

> attach(mtcars)

> plot(wt, mpg, main="Scatterplot Example",

+   	 xlab="Car Weight ", ylab="Miles Per Gallon ", pch=19)

> # Add fit lines

> abline(lm(mpg~wt), col="red") # regression line (y~x)

> lines(lowess(wt,mpg), col="blue") # lowess line (x,y)

点阵图

曲线

曲线使用lines函数。其type参数可以使用”p”,”l”,”o”,”b,c”,”s,S”,”h”,”n”等。

> x <- c(1:5); y <- x # create some data

> par(pch=22, col="blue") # plotting symbol and color

> par(mfrow=c(2,4)) # all plots on one page

> opts = c("p","l","o","b","c","s","S","h")

> for(i in 1:length(opts)){

+   heading = paste("type=",opts[i])

+   plot(x, y, main=heading)

+   lines(x, y, type=opts[i])

+ }

曲线

柱状图

普通的柱状图使用barplot函数。其参数horiz=TRUE表示水平画图，beside=TRUE表示如果是多组数据的话，在并排画图，否则原位堆叠画图。

> par(mfrow=c(1,2))

> counts <- table(mtcars$vs, mtcars$gear)

> barplot(counts, main="Car Distribution by Gears and VS",

+   xlab="Number of Gears", col=c("darkblue","red"),

+  	 legend = rownames(counts),horiz=TRUE)

> barplot(counts, main="Car Distribution by Gears and VS",

+   xlab="Number of Gears", col=c("darkblue","red"),

+  	 legend = rownames(counts), beside=TRUE)

bar图

柱状统计图使用hist函数。其breaks参数设置每组的范围。使用density函数可以拟合曲线。

> hist(mtcars$mpg, breaks=12)

> dens<-density(mtcars$mpg)

> lines(dens$x,dens$y*100,col="red")

柱状图

饼图

饼图使用pie函数。

> x<-table(mtcars$gear)

> pie(x,label=paste("gear=",rownames(x),sep=""))

饼图

3维饼图使用plotrix库中的pie3D函数。

> x<-table(mtcars$gear)

> pie3D(x,labels=paste("gear=",rownames(x),sep=""),explode=0.1)

3Ｄ饼图

箱线图

箱线图使用boxplot函数。boxplot中参数x为公式。R中的公式如何定义呢？最简单的 y ~ x 就是y是x的一次函数。好了，下面就是相关的符号代表的意思：

符号	示例	意义
+	+x	包括该变量
–	-x	不包括该变量
:	x:z	包括两变量的相互关系
*	x*z	包括两变量，以及它们之间的相互关系
/	x/z	nesting: include z nested within x
\|	x\|z	条件或分组：包括指定z的x
^	(u+v+w)^3	include these variables and all interactions up to three way
poly	poly(x,3)	polynomial regression: orthogonal polynomials
Error	Error(a/b)	specify the error term
I	I(x*z)	as is: include a new variable consisting of these variables multiplied
1	-1	截距：减去该截距

> boxplot(mpg~cyl,data=mtcars, main="Car Milage Data",

+   	 xlab="Number of Cylinders", ylab="Miles Per Gallon")

> boxplot(len~supp*dose, data=ToothGrowth, notch=TRUE,

+   col=(c("gold","darkgreen")),

+   main="Tooth Growth", xlab="Suppliment and Dose")

箱线图

如果我想在箱线图上叠加样品点，即所谓的蜂群图，如何做呢？

> source("http://bioconductor.org/biocLite.R")

> biocLite(c("beeswarm","ggplot2"))

> library(beeswarm)

> library(ggplot2)

> data(breast)

> beeswarm <- beeswarm(time_survival ~ event_survival,

+             data = breast, method = 'swarm',

+             pwcol = ER)[, c(1, 2, 4, 6)]

> colnames(beeswarm) <- c("x", "y", "ER", "event_survival")

>

> beeswarm.plot <- ggplot(beeswarm, aes(x, y)) +

+   xlab("") +

+   scale_y_continuous(expression("Follow-up time (months)"))

> beeswarm.plot2 <- beeswarm.plot + geom_boxplot(aes(x, y,

+   group = round(x)), outlier.shape = NA)

> beeswarm.plot3 <- beeswarm.plot2 + geom_point(aes(colour = ER)) +

+   scale_colour_manual(values = c("black", "red")) +

+   scale_x_continuous(breaks = c(1:2),

+ labels = c("Censored", "Metastasis"), expand = c(0, 0.5))

> print(beeswarm.plot3)

蜂群图

> require(beeswarm)

> data(breast)

>

> beeswarm(time_survival ~ event_survival, data = breast,

+         method = 'swarm',

+         pch = 16, pwcol = as.numeric(ER),

+         xlab = '', ylab = 'Follow-up time (months)',

+         labels = c('Censored', 'Metastasis'))

>

> boxplot(time_survival ~ event_survival,

+         data = breast, add = T,

+         names = c("",""), col="#0000ff22")

蜂群图

分枝树

> require(graphics)

> opar<-par(mfrow=c(2,1),mar=c(4,3,0.5,0.5))

> hc <- hclust(dist(USArrests), "ave")

> plot(hc,main="")

> plot(hc, hang = -1,main="")

> par(opar)

分枝树

文氏图

> library(limma)

> Y <- matrix(rnorm(100*6),100,6)

> Y[1:10,3:4] <- Y[1:10,3:4]+3

> Y[1:20,5:6] <- Y[1:20,5:6]+3

> design <- cbind(1,c(0,0,1,1,0,0),c(0,0,0,0,1,1))

> fit <- eBayes(lmFit(Y,design))

> results <- decideTests(fit)

> a <- vennCounts(results)

> print(a)

     x1 x2 x3 Counts

[1,]  0  0  0     89

[2,]  0  0  1     11

[3,]  0  1  0      0

[4,]  0  1  1      0

[5,]  1  0  0      0

[6,]  1  0  1      0

[7,]  1  1  0      0

[8,]  1  1  1      0

attr(,"class")

[1] "VennCounts"

> vennDiagram(a)

> vennDiagram(results,include=c("up","down"),counts.col=c("red","green"))

文氏图

七，综合应用

点阵图加两方向柱状图

> def.par <- par(no.readonly = TRUE) # save default, for resetting...

>

> x <- pmin(3, pmax(-3, rnorm(50)))

> y <- pmin(3, pmax(-3, rnorm(50)))

> xhist <- hist(x, breaks=seq(-3,3,0.5), plot=FALSE)

> yhist <- hist(y, breaks=seq(-3,3,0.5), plot=FALSE)

> top <- max(c(xhist$counts, yhist$counts))

> xrange <- c(-3,3)

> yrange <- c(-3,3)

> nf <- layout(matrix(c(2,0,1,3),2,2,byrow=TRUE), c(3,1), c(1,3), TRUE)

> #layout.show(nf)

>

> par(mar=c(3,3,1,1))

> plot(x, y, xlim=xrange, ylim=yrange, xlab="", ylab="")

> par(mar=c(0,3,1,1))

> barplot(xhist$counts, axes=FALSE, ylim=c(0, top), space=0)

> par(mar=c(3,0,1,1))

> barplot(yhist$counts, axes=FALSE, xlim=c(0, top), space=0, horiz=TRUE)

>

> par(def.par)

点阵＋柱状

带误差的点线图

> # Clustered Error Bar for Groups of Cases.

> # Example: Experimental Condition (Stereotype Threat Yes/No) x Gender (Male / Female)

> # The following values would be calculated from data and are set fixed now for

> # code reproduction

>

> means.females <- c(0.08306698, -0.83376319)

> stderr.females <- c(0.13655378, 0.06973371)

>

> names(means.females) <- c("No","Yes")

> names(stderr.females) <- c("No","Yes")

>

> means.males <- c(0.4942997, 0.2845608)

> stderr.males <- c(0.07493673, 0.18479661)

>

> names(means.males) <- c("No","Yes")

> names(stderr.males) <- c("No","Yes")

>

> # Error Bar Plot

>

> library (gplots)

>

> # Draw the error bar for female experiment participants:

> plotCI(x = means.females, uiw = stderr.females, lty = 2, xaxt ="n", xlim = c(0.5,2.5), ylim = c(-1,1), gap = 0, ylab="Microworld Performance (Z Score)", xlab="Stereotype Threat", main = "Microworld performance over experimental conditions")

>

> # Add the males to the existing plot

> plotCI(x = means.males, uiw = stderr.males, lty = 1, xaxt ="n", xlim = c(0.5,2.5), ylim = c(-1,1), gap = 0, add = TRUE)

>

> # Draw the x-axis (omitted above)

> axis(side = 1, at = 1:2, labels = names(stderr.males), cex = 0.7)

>

> # Add legend for male and female participants

> legend(2,1,legend=c("Male","Female"),lty=1:2)

误差图

误差绘图plotCI加强版是plotmeans,具体可使用?plotmeans来试用它。这里就不多讲了。

带误差的柱状图。还是使用gplots包

> library(gplots)

> # Example with confidence intervals and grid

> hh <- t(VADeaths)[, 5:1]

> mybarcol <- "gray20"

> ci.l <- hh * 0.85

> ci.u <- hh * 1.15

> mp <- barplot2(hh, beside = TRUE,

+         col = c("lightblue", "mistyrose",

+                 "lightcyan", "lavender"),

+         legend = colnames(VADeaths), ylim = c(0, 100),

+         main = "Death Rates in Virginia", font.main = 4,

+         sub = "Faked 95 percent error bars", col.sub = mybarcol,

+         cex.names = 1.5, plot.ci = TRUE, ci.l = ci.l, ci.u = ci.u,

+         plot.grid = TRUE)

> mtext(side = 1, at = colMeans(mp), line = 2,

+       text = paste("Mean", formatC(colMeans(hh))), col = "red")

> box()

带误差的柱状图

漂亮的箱线图

> require(gplots) #for smartlegend

>

> data(ToothGrowth)

> boxplot(len ~ dose, data = ToothGrowth,

+         boxwex = 0.25, at = 1:3 - 0.2,

+         subset= supp == "VC", col="yellow",

+         main="Guinea Pigs' Tooth Growth",

+         xlab="Vitamin C dose mg",

+         ylab="tooth length", ylim=c(0,35))

> boxplot(len ~ dose, data = ToothGrowth, add = TRUE,

+         boxwex = 0.25, at = 1:3 + 0.2,

+         subset= supp == "OJ", col="orange")

>

> smartlegend(x="left",y="top", inset = 0,

+             c("Ascorbic acid", "Orange juice"),

+             fill = c("yellow", "orange"))

箱线图

基因芯片热图，具体参考http://www2.warwick.ac.uk/fac/sci/moac/students/peter_cock/r/heatmap/

> library("ALL")

> data("ALL")

> eset <- ALL[, ALL$mol.biol %in% c("BCR/ABL", "ALL1/AF4")]

> library("limma")

> f <- factor(as.character(eset$mol.biol))

> design <- model.matrix(~f)

> fit <- eBayes(lmFit(eset,design))

> selected  <- p.adjust(fit$p.value[, 2]) <0.005

> esetSel <- eset [selected, ]

> color.map <- function(mol.biol) { if (mol.biol=="ALL1/AF4") "#FF0000" else "#0000FF" }

> patientcolors <- unlist(lapply(esetSel$mol.bio, color.map))

> library("gplots")

> heatmap.2(exprs(esetSel), col=redgreen(75), scale="row", ColSideColors=patientcolors,

+            key=TRUE, symkey=FALSE, density.info="none", trace="none", cexRow=0.5)

点柱图(dothistogram)

英文名称	中文名称
bar	条形图
line	线图
area	面积图
pie	饼图
high-low	高低图
pareto	帕累托图
control	控制图
boxplot	箱线图
error bar	误差条图
scatter	散点图
P-P	P-P正态概率图
Q-Q	Q-Q正态概率图
sequence	序列图
ROC Curve	ROC分类效果曲线图
Time Series	时间序列图

好了，言归正传。那么什么又是点柱图(dot histogram)呢？之前我又称之为蜂群图(beeswarm)。还有称之为抖点图(jitter plots)。总之无论如何，在糗世界里我都称之为点柱图吧。

我们先看点柱图效果：

点柱图

以下是代码

> require(beeswarm)

> data(breast)

> head(breast)

            ER      ESR1     ERBB2 time_survival event_survival

100.CEL.gz neg  8.372133 13.085894            39              1

103.CEL.gz pos 10.559356  9.491683            97              0

104.CEL.gz pos 12.299905  9.599574            11              1

105.CEL.gz pos 10.776632  9.681747            99              0

106.CEL.gz pos 10.505124  9.436763            40              1

107.CEL.gz neg 10.377741  8.695576            94              0

> require(plotrix)

> cluster<-cluster.overplot(breast$event_survival, breast$time_survival)

> png("dothist.png",width=1000,height=1000)

> opar<-par(mfrow=c(3,3))

> plot (breast$event_survival, breast$time_survival, main="Multiple points on coordinate",col=as.numeric(breast$ER),xaxt="n",xlim=c(-1,2))

> axis(1,at=c(0,1),labels=c("Censored","Metastasis"))

> plot(jitter(breast$event_survival), breast$time_survival, main="Using Jitter on x-axis",col=as.numeric(breast$ER),xaxt="n",xlim=c(-0.5,1.5))

> axis(1,at=c(0,1),labels=c("Censored","Metastasis"))

> plot(jitter(breast$event_survival), jitter(breast$time_survival), main="Using Jitter on x and y-axis",col=as.numeric(breast$ER),xaxt="n",xlim=c(-0.5,1.5))

> axis(1,at=c(0,1),labels=c("Censored","Metastasis"))

> sunflowerplot(breast$event_survival, breast$time_survival, main="Using Sunflowers",xaxt="n",xlim=c(-0.5,1.5))

> axis(1,at=c(0,1),labels=c("Censored","Metastasis"))

> plot(cluster, main="Using cluster.overplot",col=as.numeric(breast$ER),xaxt="n",xlim=c(-0.5,1.5))

> axis(1,at=c(0,1),labels=c("Censored","Metastasis"))

> count.overplot(jitter(breast$event_survival), jitter(breast$time_survival), main="Using cout.overplot",col=as.numeric(breast$ER),xaxt="n")

> axis(1,at=c(0,1),labels=c("Censored","Metastasis"))

> sizeplot(breast$event_survival, breast$time_survival, main="Using sizeplot",col=as.numeric(breast$ER),xaxt="n",xlim=c(-0.5,1.5))

> axis(1,at=c(0,1),labels=c("Censored","Metastasis"))

> beeswarm(time_survival ~ event_survival, data = breast,

+          method = 'swarm',

+          pch = 16, pwcol = as.numeric(ER),

+          xlab = '', ylab = 'Follow-up time (months)',

+          labels = c('Censored', 'Metastasis'))

> dev.off()

quartz

     2

> par(opar)

以下是解释

在很多情况下，我们画散点图的时候，有许多点拥有相同的横坐标，如果我们简单的使用plot(x,y)的方式，会显得这些点拥挤在一起，象图中左上角一样，非常的不舒服。我们需要把这些点分散开。

最基本的思路是，把横坐标抖散(jitter)，使本来都拥有相同坐标的点的横坐标稍有不同。jitter是基类函数{base}，无需调用任何包。

> plot(jitter(breast$event_survival), breast$time_survival, main="Using Jitter on x-axis",col=as.numeric(breast$ER),xaxt="n",xlim=c(-0.5,1.5))

> axis(1,at=c(0,1),labels=c("Censored","Metastasis"))

> plot(jitter(breast$event_survival), jitter(breast$time_survival), main="Using Jitter on x and y-axis",col=as.numeric(breast$ER),xaxt="n",xlim=c(-0.5,1.5))

> axis(1,at=c(0,1),labels=c("Censored","Metastasis"))

我们比较图中上边靠右的两个图，我们发现，如果只抖散x坐标的话，还是有些点会粘在一起，所以同时抖散y坐标会好一些。我们可以使用factor参数来控制jitter抖散的强度。

> plot(rep(c(1,5,10),each=5),c(jitter(rep(100,5),factor=1),jitter(rep(100,5),factor=5),jitter(rep(100,5),factor=10)),col=c("red","blue","green","gray","black"),xlim=c(-2,13),xlab="",ylab="y",xaxt="n",main="jitter(rep(100,5)) with different factor")

> axis(1,at=c(1,5,10),labels=c(paste("factor=",c(1,5,10),sep="")))

不同强度的jitter

在graphics包中提供了一个sunflowerplot的函数。它的目的是用花瓣数目多少来显示落在同一坐标上的点的数目。但是从中左图看来，点多的时候效果并非总是那么好。

在plotrix包中提供了一些有意思的函数来解决点挤在一起的这个问题，它们分别是cluster.overplot, count.overplot, sizeplot。这三个函数的效果如图中及下靠左的两个。cluster.overplot的方法类似抖散，count overplot的方法是使用数字来显示落在同一坐标上的点的数目，sizeplot的方法是使用不同大小是点来显示落在同一坐标上的点的数目。从效果来看，点多的时候效果也并非理想。

而上一次提到过的蜂群图似乎是解决这一问题的最佳方案。

我们得出结论，在点数不同的情况下，使用plotrix包及sunflowerplot是不错的。但点数较多的情况下，还是使用jitter和beeswarm较为稳妥。

我们也可以使用ggplot2包中的geom来绘制点柱图

> require(beeswarm)

> data(breast)

> library(ggplot2)

> p<-ggplot(breast, aes(event_survival,time_survival))

> print(p+geom_jitter(aes(color=ER))+scale_colour_manual(value = c("black", "red"))+scale_x_continuous(breaks = c(0:1),labels = c("Censored", "Metastasis")))

ggplot点柱图

还有一种图，名称为Engelmann-Hecker-Plot，由plotrix的ehplot来实现。

> data(iris);library(plotrix)

> ehplot(iris$Sepal.Length, iris$Species,

+ intervals=20, cex=1.8, pch=20, main="pch=20")

> ehplot(iris$Sepal.Width, iris$Species,

+ intervals=20, box=TRUE, median=FALSE, main="box=TRUE")

> ehplot(iris$Petal.Length, iris$Species,

+ pch=17, col="red", log=TRUE, main="pch=17")

> ehplot(iris$Petal.Length, iris$Species,

+ offset=0.06, pch=as.numeric(iris$Species), main="pch=as.numeric(iris$Species)")

> rnd <- sample(150)

> plen <- iris$Petal.Length[rnd]

> pwid <- abs(rnorm(150, 1.2))

> spec <- iris$Species[rnd]

> ehplot(plen, spec, pch=19, cex=pwid,

+ col=rainbow(3, alpha=0.6)[as.numeric(spec)], main="cex and col changes")

坐标中断(axis breaks)

R当中的坐标中断一般都使用plotrix库中的axis.break(), gap.plot(), gap.barplot(), gap.boxplot()等几个函数来实现，例：

gap plot

> library(plotrix)

> opar<-par(mfrow=c(3,2))

> plot(sample(5:7,20,replace=T),main="Axis break test",ylim=c(2,8))

> axis.break(axis=2,breakpos=2.5,style="gap")

> axis.break(axis=2,breakpos=3.5,style="slash")

> axis.break(axis=2,breakpos=4.5,style="zigzag")

> twogrp<-c(rnorm(5)+4,rnorm(5)+20,rnorm(5)+5,rnorm(5)+22)

> gap.plot(twogrp,gap=c(8,16,25,35),

+  xlab="X values",ylab="Y values",xlim=c(1,30),ylim=c(0,25),

+  main="Test two gap plot with the lot",xtics=seq(0,30,by=5),

+  ytics=c(4,6,18,20,22,38,40,42),

+  lty=c(rep(1,10),rep(2,10)),

+  pch=c(rep(2,10),rep(3,10)),

+  col=c(rep(2,10),rep(3,10)),

+  type="b")

> gap.plot(21:30,rnorm(10)+40,gap=c(8,16,25,35),add=TRUE,

+   lty=rep(3,10),col=rep(4,10),type="l")

> gap.barplot(twogrp,gap=c(8,16),xlab="Index",ytics=c(3,6,17,20),

+   ylab="Group values",main="Barplot with gap")

> gap.barplot(twogrp,gap=c(8,16),xlab="Index",ytics=c(3,6,17,20),

+   ylab="Group values",horiz=TRUE,main="Horizontal barplot with gap")

> twovec<-list(vec1=c(rnorm(30),-6),vec2=c(sample(1:10,40,TRUE),20))

>  gap.boxplot(twovec,gap=list(top=c(12,18),bottom=c(-5,-3)),

+  main="Show outliers separately")

> gap.boxplot(twovec,gap=list(top=c(12,18),bottom=c(-5,-3)),range=0,

+  main="Include outliers in whiskers")

> par(opar)

从图像效果上来看，这样的坐标中断只能说实现了坐标中断，但效果上是非常一般的。甚至远不如excel, openoffice当中出图效果好。为此，我们需要对plotrix库中的gap.plot做出修改，以达到满意的效果。

最简单的修改办法就是在使用了gap.plot, gap.barplot, gap.boxplot之后重新使用axis.break来修改中断类型，使得看上去美一点。

> axis.break(2,from,breakcol="snow",style="gap")

> axis.break(2,from*(1+0.02),breakcol="black",style="slash")

> axis.break(4,from*(1+0.02),breakcol="black",style="slash")

> axis(2,at=from)

使用上面的办法可以绘制出双反斜线中断，并可以视实际情况加油断点起止位置。

改进的gap.plot

> library(plotrix)

> x<-c(1:5,6.9,7)

> y<-2^x

> from<-33

> to<-110

> opar<-par(mfrow=c(2,2))

> plot(x,y,type="b",main="normal plot")

> gap.plot(x,y,gap=c(from,to),type="b",main="gap plot")

> axis.break(2,from,breakcol="snow",style="gap")

> axis.break(2,from*(1+0.02),breakcol="black",style="slash")

> axis.break(4,from*(1+0.02),breakcol="black",style="slash")

> axis(2,at=from)

> gap.barplot(y,gap=c(from,to),col=as.numeric(x),main="barplot with gap")

> axis.break(2,from,breakcol="snow",style="gap")

> axis.break(2,from*(1+0.02),breakcol="black",style="slash")

> axis.break(4,from*(1+0.02),breakcol="black",style="slash")

> axis(2,at=from)

> gap.barplot(y,gap=c(from,to),col=as.numeric(x),horiz=T,main="Horizontal barplot with gap")

> axis.break(1,from,breakcol="snow",style="gap")

> axis.break(1,from*(1+0.02),breakcol="black",style="slash")

> axis.break(3,from*(1+0.02),breakcol="black",style="slash")

> axis(1,at=from)

> par(opar)

来源：http://blog.qiubio.com:8080/rbioconductor-2

这个总结的也不错，推荐下： ggplot2绘制Excel所有图

参数	描述
side	坐标轴所在的位置，1:下,2:左,3:上,4:右
at	坐标轴具体位置，通常由自动给出。
labels	坐标字符串
pos	坐标轴线所在的行，默认值为重绘所在位置上的原坐标
lty	线型
col	颜色
las	坐标标记与坐标轴方位关系，＝0为平等，＝2为垂直
lwd.ticks	坐标刻度线宽度
col.ticks	坐标刻度线颜色
(…)	其它par()中可用的参数

参数	描述
main	主标题
sub	副标题
xlab	x轴标记
ylab	y轴标记
xlim	x轴上下限（范围）
ylim	y轴上下限
mgp	坐标轴标记，坐标字符，坐标刻度线距离坐标轴的行数，默认值为c(3,1,0)

参数	描述
x,y	图例所在位置，可以使用”bottom”,”bottomleft”,”left”,”topleft”,”top”,”topright”,”right”,”bottomleft”,”center”来指定。
inset	设置在主绘图边距
title	图例的标题
legend	图例的内容
…	其它par()可用的参数

参数	描述
location	图例所在位置
pos	所在的相对位置，1:下面,2:左边,3:上面,4:右边
side	所在边距的位置，1:下,2:左,3:上,4:右
…	其它par()可用的参数

码农公寓