1 Nov 2006 10:15

### Re: F-test vs.T-test-on-differences

```Hello Benjamin!

I think there is some misunderstanding here. The t-test is a test for
the difference between the means of two distributions. If you center
your data as you propose, the difference is 0, so the t-statistic will
always behave very much as it does under the null hypothesis (not exactly,
as the distributions might differ in variances and other properties, but
the t-test is NOT meant to detect those).
The F-test, on the other hand, specifically tests for a difference in
variances, so it is clearly the more appropriate test in your case (and
if you are worried about non-normality you might determine p-values by
a resampling method like the bootstrap).
I think what might have confused you is that there are TWO F-tests:
a) the one for testing differences between variances (let's call that F1)
b) the F-test that is used in Analysis of Variance (ANOVA) (let's
call it F2)
Despite its name, ANOVA is a method to compare MEANS, not VARIANCES. With
two groups you have the trivial case of a one-way ANOVA, and if you
calculate the F-statistic F2 for that, it is just a transformation of the
usual t-statistic, i.e. the test will yield the same p-values.
So F1 and F2 are very different statistics for very different things,
but both have an F-distribution under normality assumptions, so their
names are the same (there are plenty of chi-square tests out there as well!)

Hope this helps

Claus

Benjamin Otto wrote:
> Dear community,
```
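The distinction between the two F-tests can be checked numerically. The following is a minimal sketch in R on simulated data (group sizes, means and standard deviations are arbitrary assumptions, not values from the thread). It uses `var.test()` for F1 and a two-group one-way ANOVA for F2, and compares F2 with the pooled-variance t-test.

```r
# Simulated data; group sizes, means and SDs are arbitrary assumptions.
set.seed(1)
g1 <- rnorm(20, mean = 0, sd = 1)
g2 <- rnorm(25, mean = 0.5, sd = 2)

# F1: ratio of the two sample variances (tests equality of variances).
f1 <- var.test(g1, g2)

# F2: one-way ANOVA with two groups (tests equality of means).
values <- c(g1, g2)
group <- factor(rep(c("g1", "g2"), times = c(length(g1), length(g2))))
f2 <- anova(lm(values ~ group))
f2_stat <- f2[["F value"]][1]
f2_p <- f2[["Pr(>F)"]][1]

# Pooled-variance (equal-variance) t-test on the same data.
tt <- t.test(g1, g2, var.equal = TRUE)

# F2 is the squared t-statistic, and both tests give the same p-value.
all.equal(f2_stat, unname(tt$statistic)^2)
all.equal(f2_p, tt$p.value)
```

Note that the equivalence holds for the equal-variance t-test; the Welch version that `t.test()` uses by default will generally give a slightly different p-value.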

1 Nov 2006 12:29

### hyperGTest and ath1121501-continued

```Dear Nianhua & list,

to continue the discussion on the hyperGTest and ath1121501
(https://stat.ethz.ch/pipermail/bioconductor/2006-October/014715.html).
As far as I understood, this function needs the ENTREZID variable of the
studied environment, which in this case doesn't exist (because we work
with the AGI-IDs and not the entrezIDs):

> hgOver <- hyperGTest(params)
Error in get(x, envir, mode, inherits) : variable "ath1121501ENTREZID"

Did I completely miss it and is there a way around this problem or
should I look for another function, like Thomas Girke's GOHyperGAll?

Tine

_______________________________________________
Bioconductor mailing list
Bioconductor@...
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

```
1 Nov 2006 13:09

### Re: Nci-60 gene expression correlation coefficients

```On Tuesday 31 October 2006 13:20, John Morrow wrote:

> But here it gets tricky since working with this data does not tie back
> easily with the genes.  I hope that maybe a bioconductor package can
> streamline this.

I think the usual way to do this in R is to make a new data structure (in this
case, a matrix) rather than to print out the results, which aren't that
useful for further computations.

So, to get your correlations if the matrix 'a' contains your gene expression
measurements with genes as rows:

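# Spearman correlations between all pairs of genes (rows of 'a')
my.correlations <- cor(t(a), t(a), method='spearman')

# Matching p-values, filled in for the upper triangle (j >= i)
ngenes <- nrow(a)
x <- matrix(NA_real_, nrow=ngenes, ncol=ngenes)

for(i in seq_len(ngenes)) {
  for(j in i:ngenes) {
    x[i,j] <- cor.test(a[i,], a[j,], method='spearman')$p.value
  }
}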

I didn't test these, but I hope you get the idea.

Sean

```
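To tie the results back to the genes, one option is to keep gene names as `dimnames` and reshape the upper triangle into one row per gene pair. This is only a sketch on a toy matrix: the gene and sample names are hypothetical, and `gene.pairs` is a name introduced here for illustration.

```r
# Toy expression matrix: 4 hypothetical genes (rows) x 10 samples (columns).
set.seed(1)
a <- matrix(rnorm(40), nrow = 4,
            dimnames = list(paste0("gene", 1:4), paste0("cell", 1:10)))

# cor() carries the row names of 'a' over to the correlation matrix.
rho <- cor(t(a), method = "spearman")

# p-values for each distinct pair (i < j), labelled the same way.
ngenes <- nrow(a)
pval <- matrix(NA_real_, nrow = ngenes, ncol = ngenes,
               dimnames = list(rownames(a), rownames(a)))
for (i in seq_len(ngenes - 1)) {
  for (j in (i + 1):ngenes) {
    pval[i, j] <- cor.test(a[i, ], a[j, ], method = "spearman")$p.value
  }
}

# One row per gene pair, so results can be sorted or filtered by name.
idx <- which(upper.tri(rho), arr.ind = TRUE)
gene.pairs <- data.frame(gene1 = rownames(rho)[idx[, "row"]],
                         gene2 = colnames(rho)[idx[, "col"]],
                         rho = rho[idx],
                         p.value = pval[idx])
gene.pairs <- gene.pairs[order(gene.pairs$p.value), ]
```

For a real array the double loop runs over roughly ngenes^2 / 2 pairs, so it may be worth restricting the gene set first.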

1 Nov 2006 15:11

### Re: F-test vs.T-test-on-differences

```Actually, since Benjamin took abs(x-xbar), the means are not the
same.  abs(x-xbar) should be centered roughly on SD(x).

--Naomi

At 04:15 AM 11/1/2006, Claus Mayer wrote:
>Hello Benjamin!
>
>I think there is some misunderstanding here. The t-test is a test for
>the differences between the means of two distributions. If you center
>your data like you propose the difference is 0 so the t-statistic will
>always behave very much like under the null hypothesis (not exactly as
>the distributions might differ in variances and other properties, but
>the t-test is NOT meant to detect those).
>The F-test on the other hand specifically tests for difference in
>variances, so it is clearly the more appropriate test in your case (and
>if you are worried about non-normality you might determine p-values by
>a resampling method like bootstrap).
>I think what might have confused you is that there are TWO F-tests:
>a) the one for testing differences between variances (let's call that F1)
>b) the F-test that is being used in Analysis of Variance (ANOVA) (let's
>call it F2)
>Despite its name ANOVA is a method to compare MEANS not VARIANCES. With
>two groups you have the trivial case of a one-way ANOVA and if you
>calculate the F-statistic F2 for that it is just a transformation of the
>usual t-statistic, i.e. the test will yield the same p-values.
>So F1 and F2 are very different statistics for very different things,
>but both have an F-distribution under normality assumptions so their
>names are the same (there are plenty of chi-square tests out there as well!)
>
```
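Naomi's point can be illustrated with a short simulation. This is a sketch on simulated normal data (all sample sizes and standard deviations are assumptions): for normal data the mean absolute deviation is sd * sqrt(2/pi), i.e. about 0.8 * SD, so groups with different spread have different means of abs(x - xbar), and a t-test on those values responds to a difference in spread. A test built this way is closely related to Levene's test.

```r
set.seed(1)

# Mean absolute deviation of normal data is about sd * sqrt(2 / pi).
x <- rnorm(1e5, mean = 3, sd = 2)
mad.mean <- mean(abs(x - mean(x)))
expected <- 2 * sqrt(2 / pi)

# Two groups with equal means but different spread.
g1 <- rnorm(200, mean = 0, sd = 1)
g2 <- rnorm(200, mean = 0, sd = 3)

# Absolute deviations from each group's own mean.
d1 <- abs(g1 - mean(g1))
d2 <- abs(g2 - mean(g2))

# t-test on the absolute deviations versus the F-test on the variances.
p.t.on.dev <- t.test(d1, d2)$p.value
p.f.test <- var.test(g1, g2)$p.value
```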

1 Nov 2006 01:25

### question_gene list from venn diagram limma function for two color array data

```Hello,
I am working with two-color array data. I want to know which genes are significantly Up or Down in both the "HFEvsWT"
and "SlavsWT" groups according to the Venn diagram results. I also want to know which genes are significantly
Up or Down in only one group. When I used the gene list from the topTable function, I got a different number of
genes than in the Venn diagram results. So I want to know which genes are behind the Venn diagram counts.
Could you help me?
Thank you so much.

Sincerely, Seungmin Lee

library(limma)
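f<-function(x) as.numeric(x$Flags>-99)
files<-targets[,c("FileName")]
RG<-read.maimages(files,columns=list(R="F635 Mean",G="F532 Mean",Rb="B635 Median",Gb="F532 Median"),annotation=c("Block","Row","Column","ID","Accession","Symbol"))
plotMA(RG)
RG$printer<-getLayout(RG$genes)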
MA.p<-normalizeWithinArrays(RG,method="loess")
MA.pAq<-normalizeBetweenArrays(MA.p,method="Aquantile")
design<-modelMatrix(targets,ref="WT.C57.chow")
design
contrast.matrix<-cbind("HFEvsWT"=c(1,0),"SlavsWT"=c(0,1))
rownames(contrast.matrix)<-colnames(design)
contrast.matrix
fit<-lmFit(MA.pAq,design)
fit2<-contrasts.fit(fit,contrast.matrix)
```
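In limma, the counts in a Venn diagram usually come from the matrix returned by `decideTests()` (entries -1, 0, 1 per gene and contrast), so the gene lists can be read off that same matrix. The sketch below uses a small hand-made matrix as a hypothetical stand-in for that result, so it runs without limma; the gene names are invented. One possible reason for a mismatch with `topTable` is a different p-value adjustment or cutoff between the two calls, but that is an assumption, not something the post confirms.

```r
# Hypothetical stand-in for the result of limma's decideTests():
# one row per gene, one column per contrast, -1 = down, 0 = n.s., 1 = up.
results <- matrix(c( 1,  1,
                     1,  0,
                    -1, -1,
                     0,  1,
                     0,  0),
                  ncol = 2, byrow = TRUE,
                  dimnames = list(paste0("gene", 1:5),
                                  c("HFEvsWT", "SlavsWT")))

# TRUE where a gene is significant (up or down) in a contrast.
sig <- results != 0

# Genes in the intersection and in each "only" region of the Venn diagram.
both     <- rownames(results)[sig[, "HFEvsWT"] & sig[, "SlavsWT"]]
only.hfe <- rownames(results)[sig[, "HFEvsWT"] & !sig[, "SlavsWT"]]
only.sla <- rownames(results)[!sig[, "HFEvsWT"] & sig[, "SlavsWT"]]

# Restrict to one direction if needed, e.g. up in both contrasts.
up.both <- rownames(results)[results[, "HFEvsWT"] > 0 & results[, "SlavsWT"] > 0]
```

With real data, the row names (or the matching rows of the fit's gene annotation) identify the genes in each region.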

1 Nov 2006 18:05

### Re: hyperGTest and ath1121501-continued

```Dear Tine,

Could you please run traceback() and sessionInfo() after the error and send me
the results? Many thanks. Those will be helpful in debugging. It will be even
better if you happen to have a small example script.

thanks

nianhua

```
1 Nov 2006 18:57

### Re: question_gene list from venn diagram limma function for two color array data

```Hi Seungmin,

lee wrote:
> Hello,
> I am using my two color array data. I want to know the genes that are
> significantly Up or Down in both "HFEvsWT" and "SlavsWT" groups from
> Venn diagram results. I also want to know the genes that are only
> significantly Up or Down in one group. When I tried using the gene list
> from topTable function, I got different number of genes compared to Venn
> Diagram results. Thus, I want to know what are the genes after
> Venndiagram analysis.
> Could you help me? Thank you so much.
>
>   Sincerely, Seungmin Lee
>
>
>
>
> library(limma)
> f<-function(x) as.numeric(x$Flags>-99)
> files<-targets[,c("FileName")]
> RG<-read.maimages(files,columns=list(R="F635 Mean",G="F532 Mean",Rb="B635 Median",Gb="F532 Median"),annotation=c("Block","Row","Column","ID","Accession","Symbol"))
> plotMA(RG)
> RG$printer<-getLayout(RG$genes)
> MA.p<-normalizeWithinArrays(RG,method="loess")
> MA.pAq<-normalizeBetweenArrays(MA.p,method="Aquantile")
> design<-modelMatrix(targets,ref="WT.C57.chow")
> design
```

1 Nov 2006 19:09

### Re: F-test vs.T-test-on-differences

```Which only shows that one should read these things properly before one
replies.
I haven't come across that approach as a test for differences in
variances yet, but I can see the idea now.  As the F-test has optimality
properties for normal distributions I still would prefer it (possibly
performed as a resampling test to make it more robust against deviations
from normality).

Sorry again for misreading and misinterpreting the question

Claus

Naomi Altman wrote:
> Actually, since Benjamin took abs(x-xbar), the means are not the same.
> abs(x-xbar) should be centered roughly on SD(x).
>
> --Naomi
>
> At 04:15 AM 11/1/2006, Claus Mayer wrote:
>> Hello Benjamin!
>>
>> I think there is some misunderstanding here. The t-test is a test for
>> the differences between the means of two distributions. If you center
>> your data like you propose the difference is 0 so the t-statistic will
>> always behave very much like under the null hypothesis (not exactly as
>> the distributions might differ in variances and other properties, but
>> the t-test is NOT meant to detect those).
>> The F-test on the other hand specifically tests for difference in
>> variances, so it is clearly the more appropriate test in your case (and
>> if you are worried about non-normality you might determine p-values by
```
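A resampling version of the variance comparison might look like the sketch below. It uses a permutation test rather than the bootstrap mentioned in the thread, on simulated data, and every name and parameter (`log.ratio`, `n.perm`, the group sizes and SDs) is an assumption for illustration. Each group is centered on its own mean first, so that shuffling the labels compares spread and not location.

```r
set.seed(1)

# Simulated groups with different spread (arbitrary assumptions).
g1 <- rnorm(30, mean = 0, sd = 1)
g2 <- rnorm(30, mean = 0, sd = 2.5)

# Center each group so that a difference in means does not drive the test.
c1 <- g1 - mean(g1)
c2 <- g2 - mean(g2)

# Two-sided statistic: absolute log of the variance ratio.
log.ratio <- function(u, v) abs(log(var(u) / var(v)))
observed <- log.ratio(c1, c2)

# Permutation distribution: shuffle group labels of the centered values.
pooled <- c(c1, c2)
n1 <- length(c1)
n.perm <- 2000
perm <- replicate(n.perm, {
  s <- sample(pooled)
  log.ratio(s[1:n1], s[-(1:n1)])
})

# Permutation p-value, counting the observed statistic itself.
p.perm <- (1 + sum(perm >= observed)) / (n.perm + 1)
```

The smallest p-value this can return is 1 / (n.perm + 1), so `n.perm` limits the achievable resolution.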

1 Nov 2006 19:30

### Re: hyperGTest and ath1121501-continued

```Hi Nianhua,

after updating several packages (ath1121501, GO, ...) it worked fine.
I am still wondering about the following: the variable
ath1121501ENTREZID is not defined in the ath1121501 environment, right?
Is in this case (for this arabidopsis-specific environment only)
ath1121501ENTREZID identical to ath1121501ACCNUM?

I have built my own cdf and annotation package for this array, is the
following correct:

> Myath1annotENTREZID <- Myath1annotACCNUM

in order to be able to use hyperGTest?
thanks!

Nianhua Li wrote:

>Dear Tine,
>
>Could you please run traceback() and sessionInfo() after the error and send me
>the results? Many thanks. Those will be helpful in debugging. It will be even
>better if you happen to have a small example script.
>
>thanks
>
>nianhua
```

1 Nov 2006 19:44

### Re: hyperGTest and ath1121501-continued

```Dear Tine,

> I am still wondering about the following: the variable
> ath1121501ENTREZID is not defined in the ath1121501 environment, right?

ath1121501ENTREZID is defined in the ath1121501 environment, but it is a
reference to ath1121501ACCNUM.

> Is in this case (for this arabidopsis-specific environment only)
> ath1121501ENTREZID identical to ath1121501ACCNUM?

Yes.

> I have built my own cdf and annotation package for this array, is the
> following correct:
>
> > Myath1annotENTREZID <- Myath1annotACCNUM
>
> in order to be able to use hyperGTest?

Yes, if your annotation package was built by athPkgBuilder.

best

nianhua

```
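The aliasing described in this reply relies on how R treats environments. The sketch below assumes the annotation objects are plain R environments mapping probe set IDs to identifiers, as in the environment-based annotation packages of that time. The object names, probe IDs and mappings are hypothetical.

```r
# Hypothetical mini annotation environment: probe set IDs -> AGI locus IDs.
MyannotACCNUM <- new.env(hash = TRUE)
assign("probe_1_at", "AT1G01010", envir = MyannotACCNUM)
assign("probe_2_at", "AT1G01020", envir = MyannotACCNUM)

# Assigning an environment does not copy it: both names now refer to
# the same object, so the new name is an alias, not a second table.
MyannotENTREZID <- MyannotACCNUM
same.object <- identical(MyannotENTREZID, MyannotACCNUM)

# Lookups through the alias return the same values.
id1 <- get("probe_1_at", envir = MyannotENTREZID)

# Later changes made through one name are visible through the other.
assign("probe_3_at", "AT1G01030", envir = MyannotACCNUM)
seen.via.alias <- exists("probe_3_at", envir = MyannotENTREZID, inherits = FALSE)
```

Because both names share one object, the "Entrez" lookups return AGI identifiers, which is what the reply describes for the Arabidopsis package.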
