Henrik Bengtsson | 1 Apr 2006 12:45

Re: R crashes on winXP with 2G memory

Hi.

R should never crash, that's true.  What's your version of R?  Are you
running R v.2.2.1?  Then try to download the latest "patched" version.
 Are you running R v2.3.0 devel?  Then try to download a newer
version, because bugs get introduced once in a while in the devel
version which are fixed a few days later.

Troubleshooting:
When does it crash?  In the read.table() call?  Try to narrow it down.
 If it is a memory problem, you can always try to throw in a gc() at
the end of your for loop.  Also, if you're only interested in a few of
the columns, you can tell read.table() to ignore all the others.  See the
help and argument 'colClasses'.  That will speed up your reading and
decrease the memory usage. R should still not crash, but you might
avoid the bug this way.
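
A minimal sketch of the two suggestions above, assuming a hypothetical
tab-delimited file with three columns (the file, its column names and the
loop are made up for illustration; they are not from the original script):

```r
# Hypothetical three-column file, written to a temporary location.
tmp <- tempfile(fileext = ".txt")
write.table(data.frame(id    = c("a", "b", "c"),
                       junk  = c(1.5, 2.5, 3.5),
                       count = c(10L, 20L, 30L)),
            file = tmp, sep = "\t", quote = FALSE, row.names = FALSE)

# The string "NULL" in colClasses tells read.table() to skip that column.
x <- read.table(tmp, header = TRUE, sep = "\t",
                colClasses = c("character", "NULL", "integer"))

# In a loop over many files, drop large objects and collect garbage each pass.
for (f in c(tmp, tmp)) {
  y <- read.table(f, header = TRUE, sep = "\t",
                  colClasses = c("character", "NULL", "integer"))
  # ... work with y here ...
  rm(y)
  invisible(gc())
}
unlink(tmp)
```

Declaring the classes of the kept columns also spares read.table() from
guessing them, which is where much of the speed-up comes from.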

In your script, 'Count' and 'count' are two different objects!  Also,
your 'assign(print(i,quote=FALSE),x)' statement is very misleading
(although correct).  It's better to split it into two lines:

  print(i,quote=FALSE);
  assign(i,x);

/Henrik

On 3/31/06, Hao Liu <liuha@...> wrote:
> Dear All:
>
> I encountered this several times, could someone point out where the problem

HFL | 1 Apr 2006 18:24

Re: UseMart not working

Dear Amy,

I had the same problem with the biomaRt package before.
But now it works fine with the option, *mysql = TRUE*, as Jim suggested.

=================================
mart=useMart(biomart="ensembl", dataset="hsapiens_gene_ensembl")
# Error in curlPerform(curl = curl, .opts = opts) :
#         Failed writing body

mart=useMart(biomart="ensembl", dataset="hsapiens_gene_ensembl", mysql=T)
# connected to:  ensembl
# Reading database configuration of: hsapiens_gene_ensembl
# Checking main tables ... ok
# Checking attributes and filters ... ok

mart <- useMart("ensembl_mart_37")
# Error in curlPerform(curl = curl, .opts = opts) :
#         Failed writing body

mart <- useMart("ensembl_mart_37", mysql=T)
# connected to:  ensembl_mart_37
================================

Regards,

Ho

>Amy Mikhail wrote:
>

Chelsea Ellis | 1 Apr 2006 22:50

Classification and the Golub data set

Hi,

I'm just learning Bioconductor, and I'm trying to do KNN classification 
using the Golub test and training sets with ALL and AML as the classifier.

When I use the function

knn(golubTrain, golubTest, cl=golubTrain$ALL.AML, k=3),

it says that the lengths of the training set and the classifier don't 
match.  The documentation on KNN says you need to have the test and training 
sets in matrix form, but I'm not sure how to change an expression set into a 
matrix.  I tried "unclass" and "as.matrix" with no luck.  This is probably 
an easy question, but I'm stuck.  Thanks for any help you can give.

Chelsea

_______________________________________________
Bioconductor mailing list
Bioconductor@...
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

Christos Hatzis | 2 Apr 2006 00:05

Re: Classification and the Golub data set

I have not used this data, but if golubTest is an expression set,

exprs(golubTest)

will return the expression values as a matrix.  If this does not work,
try

as.matrix(exprs(golubTest))

-Christos

-----Original Message-----
From: bioconductor-bounces@...
[mailto:bioconductor-bounces@...] On Behalf Of
Chelsea Ellis
Sent: Saturday, April 01, 2006 3:51 PM
To: bioconductor@...
Subject: [BioC] Classification and the Golub data set

Hi,

I'm just learning Bioconductor, and I'm trying to do KNN classification
using the Golub test and training sets with ALL and AML as the classifier.

When I use the function

knn(golubTrain, golubTest, cl=golubTrain$ALL.AML, k=3),

it says that the lengths of the training set and the classifier don't
match.  The documentation on KNN says you need to have the test and training

Gordon Smyth | 2 Apr 2006 01:07

[BioC] using limma with no replicates

Dear Pedro,

The strategy you are proposing is to ignore experimental factors 
which you think will have relatively small effects, so as to generate 
some degrees of freedom for error. This is an ok strategy, long used 
in statistics, as long as you understand clearly what you are testing 
for. If you do this, limma will try to find genes which have 
differential expression which stands out relative to the effects you 
have ignored.

Power is not the issue here. This approach is actually conservative, 
in that the residual variability will be larger than if you had true 
replicate arrays, hence you will find fewer DE genes than you might otherwise.
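
To make the degrees-of-freedom argument concrete, here is a small sketch
in base R. The 2 x 2 layout and the factor names 'treatment' and 'batch'
are hypothetical, not Pedro's actual design; the reduced design matrix is
the kind of object one would pass to lmFit() in limma.

```r
# Hypothetical 2 x 2 layout with one array per condition (no replicates).
targets <- data.frame(treatment = factor(c("ctl", "ctl", "trt", "trt")),
                      batch     = factor(c("b1", "b2", "b1", "b2")))

# Residual degrees of freedom = number of arrays - rank of the design matrix.
resid_df <- function(design) nrow(design) - qr(design)$rank

full    <- model.matrix(~ treatment * batch, data = targets)  # every effect fitted
reduced <- model.matrix(~ treatment, data = targets)          # batch ignored

resid_df(full)     # 0: nothing is left over to estimate the error variance
resid_df(reduced)  # 2: the ignored batch effects now feed the residual
```

The residual variance in the reduced model absorbs whatever the ignored
factor contributes, which is why the resulting tests are conservative.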

Best wishes
Gordon

>Date: Fri, 31 Mar 2006 12:48:20 +0200
>From: Pedro López Romero <plopez@...>
>Subject: [BioC] using limma with no replicates
>To: <bioconductor@...>
>
>Dear list,
>
>I have been given some data to analyze. Unfortunately they only gave 1
>replicate per experimental condition, so I do not expect to draw meaningful
>information from here. Anyway, I would like to use limma, since I expect
>that this could be more powerful than the mere inspection of the log2 fold
>change.
>

Gordon Smyth | 2 Apr 2006 01:22

LIMMA: choice of offset value for background correction method "normexp"

Dear Pie,

The offset means:

    R = normexp(Rf - Rb) + offset

where normexp() is the transformation used by the 'normexp' 
background correction method. The purpose of the offset is to reduce 
the variability of the log-ratios for low intensity spots, because 
the log-ratios

   M = log2( R / G)

are damped towards 0 by larger offsets. The optimal choice is the one 
which makes the variability of the log-ratios as constant as possible 
across the range of intensity levels. (This is the same general 
purpose as the vsn package, but by a different method.)

You can judge a good value for the offset by inspection of the 
MA-plots. If you really want a quantitative way to judge this, look 
at the component fit$df.prior after you use the eBayes() function in 
limma. The better you stabilise the variances, the larger will be 
df.prior and the greater will be the power to detect DE genes. Hence 
the offset which maximises df.prior is, in a sense, optimal.
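
The damping effect of the offset can be seen with a few made-up numbers.
This is only an illustration in base R: the intensities are hypothetical
and are taken as already background corrected, so the normexp
transformation itself is not reproduced here.

```r
# Hypothetical corrected intensities for a low, a medium and a high spot.
# Every spot has the same true 2-fold ratio between the channels.
R <- c(20, 400, 8000)
G <- c(10, 200, 4000)

# Log-ratio after adding the same offset to both channels.
log_ratio <- function(R, G, offset = 0) log2((R + offset) / (G + offset))

M0  <- log_ratio(R, G)               # no offset: all three log-ratios equal 1
M50 <- log_ratio(R, G, offset = 50)  # offset of 50 added to both channels
round(M50, 3)
```

With the offset, the low-intensity spot is pulled towards 0 much more
strongly than the high-intensity spot, which is how the offset trades a
little bias for reduced variability at the low end.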

Best wishes
Gordon

>Date: Fri, 31 Mar 2006 16:46:15 +0100
>From: Pie Muller <pie.muller@...>

James W. MacDonald | 2 Apr 2006 01:54

Re: Classification and the Golub data set

Christos Hatzis wrote:
> I have not used this data, but if golubTest is an expression set,
> 
> exprs(golubTest)
> 
> will return the expression values as a matrix.  If this does not work,
> try
> 
> as.matrix(exprs(golubTest))

Another thing to consider is that most 'classical' statistical methods 
expect that the data are in the 'usual' format, with rows as samples and 
columns as variables. With microarray data, the opposite convention 
holds, with columns as samples and rows as variables (genes). Hence you need 
to do:

knn(t(exprs(golubTrain)), t(exprs(golubTest)), cl=golubTrain$ALL.AML, k=3)
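
The same call can be tried without the Golub data. In the sketch below,
'train_expr', 'test_expr' and 'labels' are simulated stand-ins for
exprs(golubTrain), exprs(golubTest) and golubTrain$ALL.AML, assumed here
only to have genes in rows and samples in columns:

```r
library(class)  # provides knn(); a recommended package shipped with R

set.seed(1)
# Simulated expression matrices: 50 genes (rows), 10 and 4 samples (columns).
train_expr <- matrix(rnorm(50 * 10), nrow = 50, ncol = 10)
test_expr  <- matrix(rnorm(50 * 4),  nrow = 50, ncol = 4)
labels     <- factor(rep(c("ALL", "AML"), each = 5))

# Untransposed, knn() would treat the 50 genes as samples and stop,
# because there are only 10 class labels.
# Transposed, each row is one sample, matching length(labels).
pred <- knn(t(train_expr), t(test_expr), cl = labels, k = 3)
length(pred)  # one prediction per test sample
```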

Best,

Jim

> 
> -Christos
> 
> -----Original Message-----
> From: bioconductor-bounces@...
> [mailto:bioconductor-bounces@...] On Behalf Of Chelsea Ellis
> Sent: Saturday, April 01, 2006 3:51 PM

Gordon Smyth | 2 Apr 2006 02:32

Limma: background correction. Use or ignore?

Dear Jose,

For some brief but relevant comments see Section 6.1: Background 
Correction in the Limma User's Guide, and Section 3 of 
http://www.statsci.org/smyth/pubs/mareview.pdf

Whether background subtraction is a good idea depends entirely on the 
background estimation used. You do not mention what image analysis 
program you used or which background estimation method was chosen, 
but everything depends on this.

Firstly, can you get away with ignoring the background entirely? I 
agree with Jim and Naomi's general remarks, and I agree with Jim that 
not background correcting can lead to cleaner results for some data 
sets, especially for good quality arrays with low background. The 
UCSF microarray center has made the same argument for their own 
arrays. But in my lab, we always background correct. There are a lot 
of reasons for this. For one thing, foreground-background plots 
almost always show that background correcting does remove some 
systematic bias. The most critical reason though is to achieve 
comparability between experimental conditions. Not background 
correcting is a lot like adding an offset to your data (see the 
backgroundCorrect function in limma), and the size of the offset 
depends on the level of the background. In my lab we see data from 
lots of different labs, platforms, image analysis programs, species 
etc, and the background levels can vary wildly. For example, I 
analysed one important experiment when the scanner changed from Axon 
to Agilent halfway through, and the overall background levels 
increased 10-fold. I prefer to background correct and to add the 
offset explicitly, rather than to allow it to vary with the data in 

Adaikalavan Ramasamy | 2 Apr 2006 04:31

merge SMD data from different print batch

Dear all,

I need some advice on a potentially ad hoc measure that I am using
to deal with Stanford Microarray Database (SMD) data.

Problem : 
A single study in SMD often utilises different print batches - sometimes
with twice as many unique features in one batch as in other batches. This
is compounded by the fact that there are multiple copies of some genes.

Current solution : 
I am aware that marray has read.SMD(), limma has read.maimages( ...,
source="smd") and merge.MAList(), and someone else proposed read.SMD2() on
the mailing list. I could be wrong here, but I believe these assume that the
arrays contain the same probe set, possibly in a different order.

Potential solution : 
My steps are as follows. Scripts to do this are given at the end.

1) First we preprocess each array with LOESS followed by scale
normalisation before combining with other arrays. This is done in the
first half of the function my.SMD.expr().

2) Next, we average the log ratios over the LUID or SUID (for old SMD
datasets) and remove redundant gene annotations. This is done in
get.SMD.expr(). This is potentially a contentious issue.

3) Finally we merge the different arrays by using the LUID and the
average gene expression for that LUID that was calculated in step 2. 
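
Steps 2 and 3 can be sketched in base R as follows. This is not the
script referred to above: the two small data frames, the column names
'LUID' and 'M', and the helper avg_by_luid() are all hypothetical, and
the normalisation of step 1 is assumed to have been done already.

```r
# Hypothetical normalised log-ratios from two arrays in different print batches.
# Array A has duplicate spots for LUID 101; array B carries an extra feature (104).
arrayA <- data.frame(LUID = c(101, 101, 102, 103), M = c(0.8, 1.2, -0.5, 0.1))
arrayB <- data.frame(LUID = c(101, 102, 104),      M = c(0.9, -0.7, 2.0))

# Step 2: average the log-ratios over spots sharing an LUID.
avg_by_luid <- function(arr, name) {
  out <- aggregate(M ~ LUID, data = arr, FUN = mean)
  names(out)[2] <- name
  out
}

# Step 3: merge on LUID; all = TRUE keeps features that are missing
# from some batches and fills them with NA.
merged <- merge(avg_by_luid(arrayA, "A"), avg_by_luid(arrayB, "B"),
                by = "LUID", all = TRUE)
merged
```

Using all = TRUE gives the union of features across batches; all = FALSE
would instead restrict the result to features present on every array.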


Paquet, Agnes | 2 Apr 2006 08:41

Re: MEEBO-tool in arrayQuality


Hi Anja,

It looks like meeboQuality couldn't find the control identifiers. This function looks for MEEBO IDs
starting with mMC, mCT, mCP...
Does your image processing output file include them? If so, did you try specifying the name of the column
containing the MEEBO identifiers in the arguments of the function?
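
One quick way to check the first question in plain R, independent of
arrayQuality. The 'ids' vector is a made-up example of an ID column, and
only the three prefixes named above are tested; further prefixes may
apply.

```r
# Hypothetical ID column as it might appear in an image-analysis output file.
ids <- c("mMC_001", "mCT_014", "mSQ_100234", "mCP_007", "EMPTY")

# TRUE where the identifier starts with one of the control prefixes.
is_control <- grepl("^(mMC|mCT|mCP)", ids)
table(is_control)
```

If no identifier matches, the controls are probably stored in a
different column from the one being read.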

Regards,

Agnes

-----Original Message-----
From: bioconductor-bounces@... on behalf of Anja Schiel
Sent: Fri 3/31/2006 4:51 AM
To: bioconductor@...
Subject: [BioC] MEEBO-tool in arrayQuality

Hello,
I am running the newest version of arrayQuality (1.6.3) to analyse MEEBO
arrays. We have spotted the library on two slides and I would like to
use the very nice new meebo-tool from the package. I get as far as
making the diagnostic plot, but seem to be unable to use the mismatch
controls or get the plots with the spike in ratios (I have adapted the
Spike in file, I only used Ambion Spikes).
This is my session output, in R 2.2.0, Linux, package 1.6.3.
Can anybody give me a hint what seems to go wrong with the NA at the
end?

 test<-meeboQuality(SpikeTypeFile="AmbionA.txt",

