13.11 Replications
Little recap: There is a trade-off between type I & type II error
- Type I error reject \(H_0\) when \(H_0\) is TRUE
- Type I error Rate is the prob of type I error
- Type II error no reject \(H_0\) when \(H_0\) is FALSE
- POWER of hypothesis is the prob of not making type II error
- R: sum of the number of pvalues which are below the threshold and will be rejected (n. rejections)
- m: total hypothesis testing
- m0: negatives
- m1: m-m0 positives
Now what we do is replicating the sample in a lab environment to obtain fake data, in order to do that we generate a matrix from a replication of normal distribution of the same size of our data.
<- 24
n <- 8793
m <- 2 # this is the sample split in two
delta <- 500 # m1
positives # negatives
<- m-positives # 8293
m0
<- matrix(rnorm(n*m),m,n)
mat 1:positives,1:(n/2)] <- mat[1:positives,1:(n/2)] + delta mat[
Then we do 1000 replications to see in what is the false discovery rate for this lab.
<-1000
Bset.seed(1173)
<- replicate(B,{
results_global <- matrix(rnorm(n*m),m,n)
mat 1:positives,1:(n/2)] <- mat[1:positives,1:(n/2)] + delta
mat[= genefilter::rowttests(mat,null_hypothesis)$p.val
pvals ##Bonferroni
<- sum(pvals[-(1:positives)]<=0.05/m)
FP1 <- sum(pvals[1:positives]>0.05/m)
FN1 # p.adjust
<- p.adjust(pvals,method = "fdr")
qvals1 <-sum(qvals1[-(1:positives)]<=0.05)
FP2<- sum(qvals1[1:positives]>0.05)
FN2 # qvalue
<- qvalue::qvalue(pvals)$qvalues
qvals2 <-sum(qvals2[-(1:positives)]<=0.05)
FP3<- sum(qvals2[1:positives]>0.05)
FN3 c(FP1,FN1,FP2,FN2,FP3,FN3)
})
class(results_global)
## [1] "matrix" "array"
In this table are summarised the: false positives (FP) and the false negatives (FN) for three methods:
- Bonferroni
- FDR with p.adjust()
- FDR with qvalue()
The counts for FP and FN for the three cases:
## V1 V2 V3 V4 V5 V6
## FP1 0 0 0 0 0 0
## FN1 383 382 372 391 383 369
## FP2-padjust 22 22 20 32 22 20
## FN2-padjust 42 29 34 28 38 30
## FP3-qvalue 24 22 20 35 24 24
## FN3-qvalue 38 28 33 27 34 29
The mean values or the proportions for the same methods:
## id Bonferroni FDR...padjust FDR...qvalue
## 1 FP 4.461594e-06 0.002773906 0.002960328
## 2 FN 7.645120e-01 0.081822000 0.077982000
Benjamini-Hochberg
\[p(i)≤\frac{i}{m}\alpha\]
<- 0.05
alpha = seq(along=pvals)
i
mypar(1,2)
plot(i,sort(pvals))
abline(0,i/m*alpha)
##close-up
plot(i[1:15],sort(pvals)[1:15],main="Close-up")
abline(0,i/m*alpha)
<- max( which( sort(pvals) < i/m*alpha) )
k <- sort(pvals)[k]
cutoff cat("k =",k,"p-value cutoff=",cutoff)
## k = 13 p-value cutoff= 3.83879e-05
<- p.adjust(pvals, method="fdr")
fdr mypar(1,1)
plot(pvals,fdr,log="xy")
abline(h=alpha,v=cutoff)