Claus Mayer | 1 Nov 2006 10:15

Re: F-test vs.T-test-on-differences

Hello Benjamin!

I think there is some misunderstanding here. The t-test is a test for
the difference between the means of two distributions. If you center
your data as you propose, that difference is 0, so the t-statistic will
always behave very much as it does under the null hypothesis (not exactly,
as the distributions might differ in variances and other properties, but
the t-test is NOT meant to detect those).
The F-test, on the other hand, specifically tests for a difference in
variances, so it is clearly the more appropriate test in your case (and
if you are worried about non-normality you might determine p-values by
a resampling method such as the bootstrap).
I think what might have confused you is that there are TWO F-tests:
a) the one for testing differences between variances (let's call that F1)
b) the F-test that is used in Analysis of Variance (ANOVA) (let's
call it F2)
Despite its name, ANOVA is a method to compare MEANS, not VARIANCES. With
two groups you have the trivial case of a one-way ANOVA, and if you
calculate the F-statistic F2 for that, it is just a transformation of the
usual t-statistic, i.e. the test will yield the same p-values.
So F1 and F2 are very different statistics for very different things,
but both have an F-distribution under normality assumptions, so their
names are the same (there are plenty of chi-square tests out there as well!)
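
To make the F1/F2 distinction concrete, here is a minimal sketch on simulated data. The data, sample sizes and variable names are assumptions made up for illustration; var.test(), t.test(), lm() and anova() are the standard functions from the stats package.

```r
set.seed(1)
# Simulated data (assumption): equal means, different spread
x <- rnorm(20, mean = 0, sd = 1)
y <- rnorm(20, mean = 0, sd = 3)

# F1: ratio of the two sample variances, tests equality of VARIANCES
f1 <- var.test(x, y)

# F2: one-way ANOVA F-statistic with two groups, tests equality of MEANS
values <- c(x, y)
group  <- factor(rep(c("x", "y"), each = 20))
f2 <- anova(lm(values ~ group))

# Pooled-variance two-sample t-test on the same data
tt <- t.test(x, y, var.equal = TRUE)

# With two groups, F2 equals the squared t-statistic and the p-values agree
all.equal(unname(tt$statistic)^2, f2[["F value"]][1])
all.equal(tt$p.value, f2[["Pr(>F)"]][1])

# F1 answers a different question and has its own p-value
f1$p.value
```

Note that the equivalence holds for the pooled-variance t-test (var.equal = TRUE), not for the Welch test that t.test() runs by default.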

Hope this helps

Claus

Benjamin Otto wrote:
> Dear community,

Tine Casneuf | 1 Nov 2006 12:29

hyperGTest and ath1121501-continued

Dear Nianhua & list,

To continue the discussion on hyperGTest and ath1121501
(https://stat.ethz.ch/pipermail/bioconductor/2006-October/014715.html):
as far as I understand, this function needs the ENTREZID variable of the
annotation environment being used, which in this case doesn't exist
(because we work with AGI IDs and not Entrez IDs):

 > hgOver <- hyperGTest(params)
Error in get(x, envir, mode, inherits) : variable "ath1121501ENTREZID" 
of mode "environment" was not found

Did I miss something, and is there a way around this problem, or
should I look for another function, like Thomas Girke's GOHyperGAll?

Many thanks in advance,

Tine

_______________________________________________
Bioconductor mailing list
Bioconductor@...
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

Sean Davis | 1 Nov 2006 13:09

Re: Nci-60 gene expression correlation coefficients

On Tuesday 31 October 2006 13:20, John Morrow wrote:

> But here it gets tricky since working with this data does not tie back
> easily with the genes.  I hope that maybe a bioconductor package can
> streamline this.

I think the usual way to do this in R is to make a new data structure (in this 
case, a matrix) rather than to print out the results, which aren't that 
useful for further computations.

So, to get your correlations if the matrix 'a' contains your gene expression 
measurements with genes as rows:

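# Spearman correlations between all pairs of genes (rows of 'a')
my.correlations <- cor(t(a),method='spearman')

# Matching p-values; only the upper triangle (j >= i) is filled, the rest stays NA
ngenes <- nrow(a)
x <- matrix(NA,nrow=ngenes,ncol=ngenes,dimnames=list(rownames(a),rownames(a)))
for(i in 1:ngenes) {
  for(j in i:ngenes) {
    x[i,j] <- cor.test(a[i,],a[j,],method='spearman')$p.value
  }
}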

I didn't test these, but I hope you get the idea.
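
On tying the results back to the genes: a small sketch, assuming the rows of 'a' carry gene identifiers as row names. The matrix below is random and the gene and sample names are hypothetical.

```r
set.seed(3)
# Hypothetical expression matrix: 5 genes (rows) by 8 samples (columns)
a <- matrix(rnorm(40), nrow = 5,
            dimnames = list(paste0("gene", 1:5), paste0("sample", 1:8)))

# cor() carries the row names of 'a' over to both dimensions of the result
my.correlations <- cor(t(a), method = "spearman")

# List each gene pair once (upper triangle) together with its correlation
idx <- which(upper.tri(my.correlations), arr.ind = TRUE)
pairs <- data.frame(gene1 = rownames(my.correlations)[idx[, "row"]],
                    gene2 = colnames(my.correlations)[idx[, "col"]],
                    rho   = my.correlations[idx])

# Strongest correlations (positive or negative) first
pairs <- pairs[order(-abs(pairs$rho)), ]
head(pairs)
```

Because the matrix keeps the gene names, something like my.correlations["gene1", "gene3"] also looks up a single pair directly.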

Sean


Naomi Altman | 1 Nov 2006 15:11

Re: F-test vs.T-test-on-differences

Actually, since Benjamin took abs(x-xbar), the means are not the
same.  abs(x-xbar) should be centered roughly on SD(x).
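
A quick sketch of that point on simulated data (the data and variable names are assumptions for illustration). For normal data the mean of abs(x-xbar) is about 0.8 times SD(x), so groups with different spread have different mean absolute deviations, and a t-test on them can pick that up. Levene-type tests for equal variances are built on the same idea.

```r
set.seed(2)
# Simulated data (assumption): same mean, different standard deviation
x <- rnorm(30, mean = 5, sd = 1)
y <- rnorm(30, mean = 5, sd = 2)

# Absolute deviations from each group's own mean
dx <- abs(x - mean(x))
dy <- abs(y - mean(y))

# Mean absolute deviation next to the SD of each group
c(mean.abs.dev = mean(dx), sd = sd(x))
c(mean.abs.dev = mean(dy), sd = sd(y))

# t-test on the absolute deviations: compares spread, not location
res <- t.test(dx, dy)
res$p.value
```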

--Naomi

At 04:15 AM 11/1/2006, Claus Mayer wrote:
>Hello Benjamin!
>
>I think there is some misunderstanding here. The t-test is a test for
>the differences between the means of two distributions. If you center
>your data like you propose the difference is 0 so the t-statistic will
>always behave very much like under the null hypothesis (not exactly as
>the distributions might differ in variances and other properties, but
>the t-test is NOT meant to detect those).
>The F-test on the other hand specifically tests for difference in
>variances, so it is clearly the more appropriate test in your case (and
>if you are worried about non-normality you might determine p-values by
>a resampling method like bootstrap).
>I think what might have confused you is that there are TWO F-tests:
>a) the one for testing differences between variances (let's call that F1)
>b) the F-test that is being used in Analysis of Variance (ANOVA) (let's
>call it F2)
>Despite its name ANOVA is a method to compare MEANS not VARIANCES. With
>two groups you have the trivial case of a one-way ANOVA and if you
>calculate the F-statistic F2 for that it is just a transformation of the
>usual t-statistic, i.e. the test will yield the same p-values.
>So F1 and F2 are very different statistics for very different things,
>but both have an F-distribution under normality assumptions so their
>names are the same (there are plenty of chi-square tests out there as well!)
>

lee | 1 Nov 2006 01:25

question_gene list from venn diagram limma function for two color array data

Hello,
  I am working with two-color array data. From the Venn diagram results, I want to know which genes are
significantly Up or Down in both the "HFEvsWT" and "SlavsWT" groups, and which genes are significantly
Up or Down in only one group. When I built the gene lists with the topTable function, I got a different number of
genes than the Venn diagram reports. So I would like to know which genes are behind the Venn diagram counts.
  Could you help me?
  Thank you so much.

  Sincerely, Seungmin Lee

   

   
  library(limma)
targets<-readTargets("Target4wkEnteroHFESla.txt")
f<-function(x) as.numeric(x$Flags>-99)
files<-targets[,c("FileName")]
RG<-read.maimages(files,columns=list(R="F635 Mean",G="F532 Mean",Rb="B635 Median",Gb="F532 Median"),annotation=c("Block","Row","Column","ID","Accession","Symbol"))
  plotMA(RG)
RG$genes<-readGAL("meebo.gal")
RG$printer<-getLayout(RG$genes)
  MA.p<-normalizeWithinArrays(RG,method="loess")
MA.pAq<-normalizeBetweenArrays(MA.p,method="Aquantile")
design<-modelMatrix(targets,ref="WT.C57.chow")
design
contrast.matrix<-cbind("HFEvsWT"=c(1,0),"SlavsWT"=c(0,1))
rownames(contrast.matrix)<-colnames(design)
contrast.matrix
fit<-lmFit(MA.pAq,design)
fit2<-contrasts.fit(fit,contrast.matrix)

Nianhua Li | 1 Nov 2006 18:05

Re: hyperGTest and ath1121501-continued

Dear Tine,

Could you please run traceback() and sessionInfo() after the error and send me
the results? Many thanks. Those will be helpful in debugging. It will be even
better if you happen to have a small example script.
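
For reference, a minimal sketch of collecting that information. The failing function here is a hypothetical stand-in that only mimics the kind of lookup error reported; it is not the hyperGTest code.

```r
# Hypothetical stand-in for the failing call: look up an environment by name
lookup <- function(name) get(name, mode = "environment")

# Capture the error so its message can be quoted in the report
err <- tryCatch(lookup("noSuchAnnotationEnv"), error = function(e) e)
conditionMessage(err)

# In an interactive session, run traceback() right after the uncaught error
# to print the chain of calls that led to it.

# R version, platform, and versions of the attached packages
si <- sessionInfo()
print(si)
```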

thanks

nianhua


James W. MacDonald | 1 Nov 2006 18:57

Re: question_gene list from venn diagram limma function for two color array data

Hi Seungmin,

lee wrote:
> Hello,
> I am using my two color array data. I want to know the genes that are
> significantly Up or Down in both "HFEvsWT" and "SlavsWT" groups from
> Venn diagram results. I also want to know the genes that are only
> significantly Up or Down in one group. When I tried using the gene list
> from topTable function, I got different number of genes compared to Venn
> Diagram results. Thus, I want to know what are the genes after
> Venndiagram analysis.
> Could you help me? Thank you so much.
>    
>   Sincerely, Seungmin Lee
>    
>    
>    
>    
>   library(limma)
> targets<-readTargets("Target4wkEnteroHFESla.txt")
> f<-function(x) as.numeric(x$Flags>-99)
> files<-targets[,c("FileName")]
> RG<-read.maimages(files,columns=list(R="F635 Mean",G="F532 Mean",Rb="B635 Median",Gb="F532 Median"),annotation=c("Block","Row","Column","ID","Accession","Symbol"))
>   plotMA(RG)
> RG$genes<-readGAL("meebo.gal")
> RG$printer<-getLayout(RG$genes)
>   MA.p<-normalizeWithinArrays(RG,method="loess")
> MA.pAq<-normalizeBetweenArrays(MA.p,method="Aquantile")
> design<-modelMatrix(targets,ref="WT.C57.chow")
> design
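
One possible way to get at the genes behind the Venn counts, sketched under an assumption: that the diagram was drawn from the matrix returned by decideTests(fit2), which has one row per gene, one column per contrast, and entries -1 (down), 0 (not significant) or 1 (up). The matrix below is a hand-made stand-in with hypothetical gene names, so the example runs without limma or the original data.

```r
# Stand-in for: results <- decideTests(fit2)
results <- matrix(c( 1,  1,
                    -1, -1,
                     1,  0,
                     0, -1,
                     0,  0,
                     1, -1),
                  ncol = 2, byrow = TRUE,
                  dimnames = list(paste0("gene", 1:6),
                                  c("HFEvsWT", "SlavsWT")))

# TRUE where a gene is significant (up or down) in a contrast
sig <- results != 0

# Genes in the overlap and in each single-contrast region of the diagram
both     <- rownames(results)[sig[, "HFEvsWT"] & sig[, "SlavsWT"]]
only.HFE <- rownames(results)[sig[, "HFEvsWT"] & !sig[, "SlavsWT"]]
only.Sla <- rownames(results)[!sig[, "HFEvsWT"] & sig[, "SlavsWT"]]

# Direction-specific subset, e.g. Up in both contrasts
up.both <- rownames(results)[results[, "HFEvsWT"] == 1 &
                             results[, "SlavsWT"] == 1]
```

Selecting the gene lists from the same matrix that feeds the diagram keeps the counts and the lists consistent by construction.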

Claus Mayer | 1 Nov 2006 19:09

Re: F-test vs.T-test-on-differences

Which only shows that one should read these things properly before one
replies. Very sorry about that!
I haven't come across that approach as a test for differences in
variances before, but I can see the idea now.  As the F-test has optimality
properties for normal distributions, I would still prefer it (possibly
performed as a resampling test to make it more robust against deviations
from normality).
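
A rough sketch of what such a resampling version could look like. This is a permutation test on the variance ratio, which is one choice among several and not necessarily the scheme meant above. The data are simulated, and centering each group before pooling is an assumption made so that only the spread can differ between groups.

```r
set.seed(4)
# Simulated data (assumption)
x <- rnorm(25, sd = 1)
y <- rnorm(25, sd = 2)

# Centre each group so that a difference in means cannot drive the result
xc <- x - mean(x)
yc <- y - mean(y)

# Observed variance ratio (the F1 statistic)
obs <- var(xc) / var(yc)

# Permutation distribution: reshuffle group labels on the pooled values
pooled <- c(xc, yc)
n.x <- length(xc)
B <- 2000
perm <- replicate(B, {
  idx <- sample(length(pooled), n.x)
  var(pooled[idx]) / var(pooled[-idx])
})

# Two-sided p-value on the log scale, so ratios r and 1/r count as equally extreme
p.perm <- (sum(abs(log(perm)) >= abs(log(obs))) + 1) / (B + 1)
p.perm

# Normal-theory F-test for comparison
var.test(x, y)$p.value
```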

Sorry again for misreading and misinterpreting the question

Claus

Naomi Altman wrote:
> Actually, since Benjamin took  abs(x-xbar) the means are not the same.  
> abs(x-xbar) should be centered roughly on SD(x).
> 
> --Naomi
> 
> At 04:15 AM 11/1/2006, Claus Mayer wrote:
>> Hello Benjamin!
>>
>> I think there is some misunderstanding here. The t-test is a test for
>> the differences between the means of two distributions. If you center
>> your data like you propose the difference is 0 so the t-statistic will
>> always behave very much like under the null hypothesis (not exactly as
>> the distributions might differ in variances and other properties, but
>> the t-test is NOT meant to detect those).
>> The F-test on the other hand specifically tests for difference in
>> variances, so it is clearly the more appropriate test in your case (and
>> if you are worried about non-normality you might determine p-values by

Tine Casneuf | 1 Nov 2006 19:30

Re: hyperGTest and ath1121501-continued

Hi Nianhua,

After updating several packages (ath1121501, GO, ...) it worked fine.
I am still wondering about the following: the variable 
ath1121501ENTREZID is not defined in the ath1121501 environment, right? 
Is in this case (for this arabidopsis-specific environment only) 
ath1121501ENTREZID identical to ath1121501ACCNUM?

I have built my own cdf and annotation package for this array. Is the
following correct:

 > Myath1annotENTREZID <- Myath1annotACCNUM

in order to be able to use hyperGTest?
thanks!

Nianhua Li wrote:

>Dear Tine,
>
>Could you please run traceback() and sessionInfo() after the error and send me
>the results? Many thanks. Those will be helpful in debugging. It will be even
>better if you happen to have a small example script.
>
>thanks
>
>nianhua
>

Nianhua Li | 1 Nov 2006 19:44

Re: hyperGTest and ath1121501-continued

Dear Tine,
> I am still wondering about the following: the variable
> ath1121501ENTREZID is not defined in the ath1121501 environment, right? 
ath1121501ENTREZID is defined in the ath1121501 environment, but it is a
reference to ath1121501ACCNUM.
> Is in this case (for this arabidopsis-specific environment only)
> ath1121501ENTREZID identical to ath1121501ACCNUM?
Yes.
>
> I have built my own cdf and annotation package for this array, is the
> following correct:
>
> > Myath1annotENTREZID <- Myath1annotACCNUM
>
>
> in order to be able to use hyperGTest?
Yes if your annotation package was built by athPkgBuilder.
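
Why the plain assignment works as an alias: R environments are reference objects, so assigning one to a second name does not copy it. A small sketch with a toy environment (the probe IDs and AGI values are hypothetical, not taken from a real annotation package):

```r
# Toy stand-in for the ACCNUM environment of a custom annotation package
Myath1annotACCNUM <- new.env()
assign("probe_1_at", "AT1G01010", envir = Myath1annotACCNUM)

# The assignment creates a second name for the SAME environment
Myath1annotENTREZID <- Myath1annotACCNUM

# Lookups through either name give the same result
get("probe_1_at", envir = Myath1annotENTREZID)
identical(Myath1annotENTREZID, Myath1annotACCNUM)

# Entries added later under one name are visible through the other
assign("probe_2_at", "AT1G01020", envir = Myath1annotACCNUM)
exists("probe_2_at", envir = Myath1annotENTREZID)
```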

best

nianhua
