ANOSIM验证
ANOSIM用于检验组间差异性。假定有两个先验分组Group1和Group2,将样本间的距离矩阵进行秩转换。此时,类似方差分的零假设H0即是:两个组别之间(或多个组别间)不存在差异。Clarke等设计了一个统计数R来评估组间差异:
这其中分别为组间(between)和组内(within)亚矩阵秩的平均值.
当所有的最小秩均在组内亚矩阵(within-group submatrix)中时,R=1;当高秩或低秩在组间和组内亚矩阵之间均匀分布时,R=0。统计数R不大可能小于零,如果统计数小于零,说明组内差异大于组间差异,即先验分组存在问题(在排除数据问题后)。统计数R也可以处理多个分组案例(大于等于3)。ANOSIM和Mantel test都是假定组内距离与组间距离是可比较的,并且组内距离小于组间距离,如果H0是错误的。这可以在运行ANOSIM之前进行检验。组间强异方差性(heteroscedasticity),如密集分组和扩散分组的出现,可能实际上违背这种情况,并增加检验的Ⅰ类错误。
data preparation
## ANOSIM test验证
## 20200425
## LY
library(vegan)
data(dune)
data(dune.env)
dune.dist <- vegdist(dune)
EPS <- sqrt(.Machine$double.eps)
anosim in vagan
# function "anosim" in package vagan
dune.ano <- with(dune.env, anosim(dune.dist, Management))
summary(dune.ano)
Procedure of permutation and the distribution of statistics under H0
距离矩阵进行秩转换
x.rank <- rank(dune.dist)
N <- attr(dune.dist, "Size")
div <- length(dune.dist)/2
分别获得组间和组内亚矩阵的秩均值
# between-group submatrix and within-group submatrix
matched <- function(irow, icol, grouping) {
grouping[irow] == grouping[icol]
}
irow <- as.vector(as.dist(row(matrix(nrow = N, ncol = N))))
icol <- as.vector(as.dist(col(matrix(nrow = N, ncol = N))))
within <- matched(irow, icol, dune.env$Management)
aver <- tapply(x.rank, within, mean)
计算R统计数
# R statistic
statistic <- -diff(aver)/div
进行置换检验,获得p-value
# Permutation test
cl.vec <- rep("Between", length(dune.dist)) # creating 190 between-group element
take <- as.numeric(irow[within])
cl.vec[within] <- levels(dune.env$Management)[dune.env$Management[take]]
cl.vec <- factor(cl.vec, levels = c("Between", levels(dune.env$Management)))
ptest <- function(take, ...) {
cl.perm <- dune.env$Management[take]
tmp.within <- matched(irow, icol, cl.perm)
tmp.ave <- tapply(x.rank, tmp.within, mean)
-diff(tmp.ave)/div
}
getPermuteMatrix = function (perm, N, strata = NULL)
{
if (length(perm) == 1) {
perm <- how(nperm = perm)
}
if (!missing(strata) && !is.null(strata)) {
if (inherits(perm, "how") && is.null(getBlocks(perm)))
setBlocks(perm) <- strata
}
if (inherits(perm, "how"))
perm <- shuffleSet(N, control = perm)
else {
if (!is.integer(perm) && !all(perm == round(perm)))
stop("permutation matrix must be strictly integers: use round()")
}
if (is.null(attr(perm, "control")))
attr(perm, "control") <- structure(list(within = list(type = "supplied matrix"),
nperm = nrow(perm)), class = "how")
perm
}
permat <- getPermuteMatrix(999, N, strata = NULL)
permutations <- nrow(permat)
perm <- sapply(1:permutations, function(i) ptest(permat[i,]))
p.val <- (1 + sum(perm >= statistic - EPS))/(1 + permutations)
p.val