Valerie Obenchain | 1 Feb 04:09

Re: [Rocky] - R code for PGSEA package to identify differentially expressed genes

Hi Rocky,

I'm pasting your message here so we can keep all comments on the mailing 
list.

## ---------------------------------------------------------------

Dear Valerie,

I would like to find down-regulated expression signatures for each 
sample from GSE11024 dataset containing 79 samples. How could I 
implement using PGSEA to get it. I would be glad and highly appreciate 
for your kindness.

http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE11024
GSM278765 - GSM278774 (10 samples) -> CC_KIDNEY
GSM278775 - GSM278780 (6 samples) -> CHR_KIDNEY
GSM278781 - GSM278792 (12 samples) -> NOR_KIDNEY
GSM278793 - GSM278799 (7 samples) -> ON_KIDNEY
GSM278800 - GSM278816 (17 samples) -> Pappilary_KIDNEY
GSM278817 - GSM278843 (27 samples) -> WM_KIDNEY

I also tried with affy and LIMMA package for normalizing the GSM CEL 
files using these code –

source("http://www.bioconductor.org/biocLite.R")
biocLite("limma")
biocLite("affy")
library(affy)
setwd("/home/haojamrocky/DATA/GSE11024")
(Continue reading)

Valerie Obenchain | 1 Feb 04:41

Re: Help With RNA-seq

Hi Tina,

I'm pasting in your message below so we can keep all communication on 
the mailing list.

The 'samplesize' argument looks wrong in your call to compute the priors.

     CDP.Poi<- getPriors.Pois(CD, samplesize = 2, takemean = TRUE, cl = cl)

Why have you set this to 2? It should be a much larger number. Try using 
the default (1e5),

     CDP.Poi<- getPriors.Pois(CD, cl = cl)

Does this speed it up?

Valerie

## -----------------------------------------------------------------------

Hi Valerie,

Thank you once again for all your help.

As you requested in the previous email for a clearer explanation to the problem I am encountering at present.

  head(D)
    R1L1Kidney R1L2Liver R1L3Kidney R1L4Liver R1L6Liver R1L7Kidney R1L8Liver
10          0         0          0         0         0          2         1
15          4        35          7        32        31          3        29
(Continue reading)

Baskaran [guest] | 1 Feb 09:01
Favicon

Can SAM be combined with the bootstrap with replacement method?


Hello,

I have a set of differentially expressed genes (DEG) mined from microarray data using the two-sample
t-test with bootstrapping performed with replacement. 

I wanted to test the SAM method (using either the samr or siggenes package) for the detection of these DEG.
However, the SAM method performs the selection of DEG with a permutation method, which I understand to be
bootstrapping without replacement. As this method is intrinsic to SAM, I cannot change it.

I would like to run SAM on the microarray data combined with bootstrap with replacement, so that I can
compare the results with my original t-test analysis a bit more clearly. 

Does anyone know of any package or method that can perform this step?

 -- output of sessionInfo(): 

R version 2.11.1 (2010-05-31) 
x86_64-pc-mingw32 

locale:
[1] LC_COLLATE=English_Singapore.1252  LC_CTYPE=English_Singapore.1252   
[3] LC_MONETARY=English_Singapore.1252 LC_NUMERIC=C                      
[5] LC_TIME=English_Singapore.1252    

attached base packages:
[1] splines   stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] samr_1.28       impute_1.22.0   siggenes_1.22.0 multtest_2.5.14 Biobase_2.8.0  
(Continue reading)

Lescai, Francesco | 1 Feb 13:25
Picon
Picon
Favicon

Re: VariantAnnotation updated - error in predictCoding

Much more straightforward, the workflow is much simpler this way.
thanks very much, it works great.
Francesco

On 31 Jan 2012, at 16:28, Valerie Obenchain wrote:

The man page was a little out of date - which was probably a source of your confusion. Sorry about that.  An
updated page is now in v 1.1.32.

Valerie

On 01/31/2012 08:15 AM, Valerie Obenchain wrote:
Hi Francesco,

I changed the requirements of the varAllele argument from character to
either DNAStringSet or missing. Allowing the varAllele to be a character
name of a metadatacolumn was problematic.

 >  showMethods("predictCoding")
Function: predictCoding (package VariantAnnotation)
query="GRanges", subject="TranscriptDb", seqSource="ANY",
varAllele="DNAStringSet"
query="Ranges", subject="TranscriptDb", seqSource="ANY",
varAllele="DNAStringSet"
query="VCF", subject="TranscriptDb", seqSource="ANY", varAllele="missing"

I've also added a method for the query to be a VCF object so you can
just pass the VCF which allows you bypass the "flattening" step. When
you pass a VCF as the query the varAllele argument will be missing; it
is taken from the VCF internally with values(alt(<VCF>))[["ALT"]]. An
(Continue reading)

James W. MacDonald | 1 Feb 14:59
Picon
Picon

Re: converting intensity values to log ratios?

Hi Mark,

On 1/31/2012 4:16 PM, Mark Baumeister wrote:
> Hi all,
>
> I am trying to convert RMA normalized intensity values in my matrix
> (generated from Affymetirx raw .CEL data composed of
> ~30 tumor samples and 7 normal samples) to log ratios in which the
> intensity values from each tumor sample are compared with a reference
> sample (in this case I chose one of my normals).  I was told that I could
> use a function like
>
>> log(eset/esetRef,2)
> to generate the new matrix containing log ratios.

Wow. That was some horrible advice. Please stop relying on whoever told 
you that.

First, please note that RMA normalized intensity values are already on 
the log_2 scale. So taking logs again isn't necessary, nor advisable.

Second, why are you converting to log ratios? You are losing information 
for no apparent gain. Your best bet is to simply make comparisons 
between groups using something like limma or siggenes or multtest and 
proceed from there. Also note that the numerator of your statistic 
(which limma returns by default with the topTable() function) is the log 
ratio that you are seeking, so you get that for free.

Best,

(Continue reading)

Valerie Obenchain | 1 Feb 17:50

Re: Help With RNA-seq

Thinking about this a bit more ...

Have you tried modifying the 'maxit' argument to getPriors.Pois() ? It's 
possible this threshold could be reduced; it would also confirm that the 
algorithm is converging (if you reduced it to a point where you saw an 
error).

If I understand correctly, you are not seeing any error messages but 
getPirors.Pois() and getPrior.NB() are taking a long time to run. (fyi 
per your comment below, CDP.Poi <at> priors is just accessing the data in the 
slot of the object; it is not a function). Without having your data to 
test on it is difficult to know what is going on. It would be useful to 
know what kind of machine you are working on, how may cpus, how much 
memory etc.

I'm cc'ing Thomas (package author) in case he has ideas.

Valerie

On 01/31/2012 07:41 PM, Valerie Obenchain wrote:
> Hi Tina,
>
> I'm pasting in your message below so we can keep all communication on 
> the mailing list.
>
> The 'samplesize' argument looks wrong in your call to compute the priors.
>
>     CDP.Poi<- getPriors.Pois(CD, samplesize = 2, takemean = TRUE, cl = 
> cl)
>
(Continue reading)

Picon
Picon
Favicon

Re: Help With RNA-seq

Well, this is embarrassing...

A bug in the code for getPriors.Pois meant that it tripped into an 
infinite loop. I've posted a fix to the release track of Bioconductor; 
version 1.8.2 should allow getPriors.Pois to work. I suppose that nobody 
finding this before means that most people are using the negative 
binomial approach - as they should be!

Tom

On 01/02/12 16:50, Valerie Obenchain wrote:
> Thinking about this a bit more ...
>
> Have you tried modifying the 'maxit' argument to getPriors.Pois() ? 
> It's possible this threshold could be reduced; it would also confirm 
> that the algorithm is converging (if you reduced it to a point where 
> you saw an error).
>
> If I understand correctly, you are not seeing any error messages but 
> getPirors.Pois() and getPrior.NB() are taking a long time to run. (fyi 
> per your comment below, CDP.Poi <at> priors is just accessing the data in 
> the slot of the object; it is not a function). Without having your 
> data to test on it is difficult to know what is going on. It would be 
> useful to know what kind of machine you are working on, how may cpus, 
> how much memory etc.
>
> I'm cc'ing Thomas (package author) in case he has ideas.
>
> Valerie
>
(Continue reading)

Paul Geeleher | 1 Feb 19:07
Picon

Loading quantiles normalized root data in XPS

Hi,

I've used xps to quantiles normalize (at probe level) some Affy Exon
Array data. I now have a "root" file called "exon_quantiles.root", but
if I try to load it the same was I'd load my raw data (using the
scheme file I created for Affy exon arrays) I get the error below? I
can load my raw data just fine though. Any ideas? Do I perhaps need a
different "root scheme" file for this normalized data? Unfortunately,
I haven't been able to find an answer.

> scheme.huex10stv2 <- root.scheme("huex10stv2.root")
> data_qn <- root.data(scheme.huex10stv2, "exon_quantiles.root")
Error in if (chipname != treetitle) { : argument is of length zero

Hope someone can help,

Paul.

> sessionInfo()
R version 2.11.0 (2010-04-22)
x86_64-redhat-linux-gnu

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

(Continue reading)

Juliet Hannah | 1 Feb 21:05
Picon

subsetting transcriptDb

All,

I am studying "Using the GenomicFeatures package" slides by Marc Carlson.

In the slides, a subset of the following corresponding to chr 9 is used.

library(GenomicFeatures)
mm9KG <- makeTranscriptDbFromUCSC(genome = "mm9", tablename = "knownGene")

How can I subset this transcriptDb object so that it contains only chr 9.

My goal is to use the statement

txExons <- exonsBy(mm9KGChr9)

where I believe mm9KGChr9 is what I am seeking above.

Apologies if I have missed something obvious.

Thanks,

Juliet

> sessionInfo()
R version 2.14.1 (2011-12-22)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
(Continue reading)

cstrato | 1 Feb 22:01
Picon

Re: Loading quantiles normalized root data in XPS

Dear Paul,

Please have a look at the help ?root.expr.

If I understand you correctly, you did only do quantile normalization?

To see the tree names in your file you should do:
 > getTreeNames("exon_quantiles.root")

You will probably see trees with extension "int", see help ?validTreetype.

To load these trees you need to do:
 > data_qn <- root.expr(scheme.huex10stv2, "exon_quantiles.root", "int")

Please let me know if this did solve your problem.

Best regards
Christian
_._._._._._._._._._._._._._._._._._
C.h.r.i.s.t.i.a.n   S.t.r.a.t.o.w.a
V.i.e.n.n.a           A.u.s.t.r.i.a
e.m.a.i.l:        cstrato at aon.at
_._._._._._._._._._._._._._._._._._


On 2/1/12 7:07 PM, Paul Geeleher wrote:
> Hi,
>
> I've used xps to quantiles normalize (at probe level) some Affy Exon
> Array data. I now have a "root" file called "exon_quantiles.root", but
(Continue reading)


Gmane