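Today's post covers how to visualize a one-way ANOVA. The goal is to produce a figure like the following:
Figure: group means ± standard deviation
1. Data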
library(agricolae)
data(sweetpotato)
head(sweetpotato)
str(sweetpotato)
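Description of the experiment:
The data come from an experiment conducted in the province of Tacna in southern Peru, which studied the effect of two viruses (Spfmv and Spcsv). The treatments were: CC (Spcsv) = sweet potato chlorotic dwarf, FF (Spfmv) = feathery mottle, FC (Spfmv and Spcsv) = viral complex, and OO (control) = healthy plants. Each plot was planted with 50 sweet potato plants, and there were 12 plots in total. Each treatment was replicated 3 times, and the total weight (kg) was measured at the end of the experiment. The viruses were transmitted through the cuttings, which were then planted in the field.
2. Analysis of variance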
mod1 = aov(yield~virus,data=sweetpotato)
summary(mod1)
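The differences among the virus treatments are highly significant (F = 17.34 on 3 and 8 d.f., p < 0.001), so we can go on to multiple comparisons.
3. Multiple comparisons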
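To read the F statistic and p-value programmatically instead of from the printed table, something like the following sketch may help. It uses base R only; the data frame `sp` is a hypothetical stand-in rebuilt from the Min / Q50 / Max columns of the `$means` table in section 3 (with three replicates per treatment, those three columns are the three observations), so the row order is an assumption and differs from `sweetpotato`.

```r
# Hypothetical stand-in for sweetpotato, rebuilt from the Min/Q50/Max
# columns of the LSD.test() means table (3 replicates per treatment).
sp <- data.frame(
  virus = factor(rep(c("cc", "fc", "ff", "oo"), each = 3)),
  yield = c(21.7, 23.0, 28.5,   # cc
            10.6, 13.1, 14.9,   # fc
            28.0, 39.2, 41.8,   # ff
            32.1, 38.2, 40.4)   # oo
)

fit <- aov(yield ~ virus, data = sp)

# summary() on an aov object returns a list; its first element is the ANOVA table
tab <- summary(fit)[[1]]
f_value <- tab[["F value"]][1]   # F statistic for the virus term
p_value <- tab[["Pr(>F)"]][1]    # corresponding p-value

f_value
p_value
```

With the real data, the same two lines of extraction should work on `summary(mod1)[[1]]`.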
re = LSD.test(mod1,"virus")
re
Output:
$statistics
   MSerror Df   Mean      CV  t.value      LSD
  22.48917  8 27.625 17.1666 2.306004 8.928965
$parameters
        test p.ajusted name.t ntr alpha
  Fisher-LSD      none  virus   4  0.05
$means
      yield      std r       LCL      UCL  Min  Max   Q25  Q50   Q75
cc 24.40000 3.609709 3 18.086268 30.71373 21.7 28.5 22.35 23.0 25.75
fc 12.86667 2.159475 3  6.552935 19.18040 10.6 14.9 11.85 13.1 14.00
ff 36.33333 7.333030 3 30.019601 42.64707 28.0 41.8 33.60 39.2 40.50
oo 36.90000 4.300000 3 30.586268 43.21373 32.1 40.4 35.15 38.2 39.30
$comparison
NULL
$groups
      yield groups
oo 36.90000      a
ff 36.33333      a
cc 24.40000      b
fc 12.86667      c
attr(,"class")
[1] "group"
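To see where the LSD value and the letters come from, the numbers in `$statistics` can be reproduced by hand. The sketch below assumes the usual equal-replication formula, LSD = t(1 - alpha/2, error d.f.) × sqrt(2 × MSerror / r), and uses only values printed above; the variable names are illustrative.

```r
mse      <- 22.48917  # MSerror from $statistics
df_error <- 8         # Df from $statistics
r        <- 3         # replicates per treatment
alpha    <- 0.05

t_crit <- qt(1 - alpha / 2, df_error)  # two-sided critical t value
sed    <- sqrt(2 * mse / r)            # standard error of a difference of two means
lsd    <- t_crit * sed                 # least significant difference

# Treatment means from $groups, already sorted from largest to smallest
means <- c(oo = 36.90000, ff = 36.33333, cc = 24.40000, fc = 12.86667)

# Gap between each mean and the one ranked just above it
gaps <- abs(diff(means))

lsd
gaps > lsd  # TRUE where neighbouring means differ by more than the LSD
```

Only the oo/ff gap is smaller than the LSD, which is consistent with oo and ff sharing the letter a while cc and fc each get their own letter. Comparing neighbours is enough for this data set; in general every pair of means has to be compared.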
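4. Visualizing the multiple comparisons
library(dplyr)    # provides %>%, mutate() and inner_join()
library(ggplot2)  # provides ggplot() and the geoms
re1 = re$groups
re1
# Compute the standard deviation of yield for each virus treatment (sd(), not the standard error)
xx = aggregate(yield ~ virus, sweetpotato,sd)
names(xx) = c("virus","sd")
xx
# Turn the row names into a virus column, then join the group letters with the standard deviations
re2 = re1 %>% mutate(virus = rownames(re1)) %>% inner_join(.,xx,by="virus")
re2
# Plot
## Bar chart of the treatment means, with mean ± sd error bars and the LSD group letters on top
## (on ggplot2 >= 3.4, size in geom_errorbar() is deprecated in favor of linewidth)
re2 %>% ggplot(aes(virus,yield)) + geom_col(aes(fill = virus), width=.4) +
geom_errorbar(aes(ymax = yield + sd, ymin = yield - sd),width = .1,size=.5)+
geom_text(aes(label = groups,y = yield + sd +1.5)) + theme(panel.grid = element_blank(), panel.background = element_rect(color = "black",fill = "transparent"))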
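The error bars above show one standard deviation on each side of the mean. If standard-error bars are wanted instead, the standard deviation can be divided by the square root of the number of replicates. A minimal sketch, using the `std` and `r` columns printed by `LSD.test()` in section 3 (the names `sd_by_virus` and `se_by_virus` are illustrative):

```r
# Standard deviations per treatment, copied from the std column of $means
sd_by_virus <- c(cc = 3.609709, fc = 2.159475, ff = 7.333030, oo = 4.300000)
r <- 3  # replicates per treatment (column r of $means)

# Standard error of each treatment mean: sd / sqrt(n)
se_by_virus <- sd_by_virus / sqrt(r)
round(se_by_virus, 3)
```

In the plotting pipeline this would presumably amount to adding a column such as `mutate(se = sd / sqrt(3))` to `re2` and using `se` in place of `sd` inside `geom_errorbar()` and `geom_text()`. Standard-error bars are shorter than standard-deviation bars, so the figure caption should say which one is drawn.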
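5. Complete code
library(agricolae)
library(dplyr)    # provides %>%, mutate() and inner_join()
library(ggplot2)  # provides ggplot() and the geoms
data(sweetpotato)
mod1 = aov(yield~virus,data=sweetpotato)
summary(mod1)
re = LSD.test(mod1,"virus")
re
re1 = re$groups
re1
# Compute the standard deviation of yield for each virus treatment (sd(), not the standard error)
xx = aggregate(yield ~ virus, sweetpotato,sd)
names(xx) = c("virus","sd")
xx
# Turn the row names into a virus column, then join the group letters with the standard deviations
re2 = re1 %>% mutate(virus = rownames(re1)) %>% inner_join(.,xx,by="virus")
re2
# Plot
## Bar chart of the treatment means, with mean ± sd error bars and the LSD group letters on top
## (on ggplot2 >= 3.4, size in geom_errorbar() is deprecated in favor of linewidth)
re2 %>% ggplot(aes(virus,yield)) + geom_col(aes(fill = virus), width=.4) +
geom_errorbar(aes(ymax = yield + sd, ymin = yield - sd),width = .1,size=.5)+
geom_text(aes(label = groups,y = yield + sd +1.5)) + theme(panel.grid = element_blank(), panel.background = element_rect(color = "black",fill = "transparent"))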
6. Finding R too hard? Try Genstat
6.1 Import the data
6.2 Choose the ANOVA model
Output:
Analysis of variance
Variate: yield
Source of variation  d.f.     s.s.    m.s.   v.r.  F pr.
virus                   3  1170.21  390.07  17.34  <.001
Residual                8   179.91   22.49
Total                  11  1350.12
Message: the following units have large residuals.
*units* 9    -8.3    s.e. 3.9
Tables of means
Variate: yield
Grand mean  27.6
virus    cc    fc    ff    oo
       24.4  12.9  36.3  36.9
Standard errors of differences of means
Table   virus
rep.        3
d.f.        8
s.e.d.   3.87
Least significant differences of means (5% level)
Table   virus
rep.        3
d.f.        8
l.s.d.   8.93
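As a quick cross-check of the Genstat ANOVA table, the variance ratio (v.r.) is the treatment mean square divided by the residual mean square, and F pr. is the upper-tail probability of the F distribution with 3 and 8 degrees of freedom. A small sketch in base R, using the rounded values from the table (so the result agrees only to the printed precision):

```r
ms_virus <- 390.07  # m.s. for virus
ms_resid <- 22.49   # m.s. for Residual
df_virus <- 3
df_resid <- 8

# Variance ratio (the F statistic)
vr <- ms_virus / ms_resid

# Upper-tail probability under F(3, 8)
p <- pf(vr, df1 = df_virus, df2 = df_resid, lower.tail = FALSE)

round(vr, 2)
p < 0.001
```

This matches the v.r. of 17.34 and the F pr. of <.001 reported by Genstat, and agrees with the R analysis in section 2.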
6.3 Multiple comparisons
Fisher's protected least significant difference test
virus
      Mean
oo   36.90  a
ff   36.33  a
cc   24.40  b
fc   12.87  c
6.4 Visualizing the results
Output:
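You are welcome to follow my WeChat official account, 育种数据分析之放飞自我. It mainly shares material on R, Python, breeding data analysis, biostatistics, quantitative genetics, mixed linear models, GWAS, and GS.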