Marc Carlson | 1 Mar 2012 01:17

Re: problem creating own org.Ss.eg.db

Hi Guido,

I am not surprised by any of that.  Annotations change more than most 
people expect.  And the pig org package is made (special) here along 
with all the other major organism packages that we support.  It is not 
generated by the method you used because that method has to work for ALL 
organisms that are at NCBI.  That means that when you use that method, 
you don't get any of the extras that we add for you here.  The 
auto-generated version is just generated by using information from NCBI 
with the help of some external GO mappings.  And it is really not meant 
to be a way to get a newer package.  It is meant to allow people who are 
using non-model organisms to get annotations.  So it is expected that 
some things are definitely going to be missing unless you want to do the 
extra work of finding those things and adding them back in manually.

And it is really not possible to keep the "too new" GO terms when you 
generate the package either, because that would break a lot of software 
that depends on GO.db being in sync with the organism packages.  The 
only way around that would be if you also generated a new GO.db package 
from scratch.  I suppose that you could do that, and if you did (and 
installed it) the method would stop trying to drop those "too new" GO 
terms, but it would be a lot of work to generate that package from 
scratch.  And even if you used it you would lose some of the benefits of 
using versioned annotation packages.  Personally, I would never 
recommend that strategy; I only mention it so that you can understand 
what is happening here (and why).
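
To check whether an organism package and the installed GO.db are in sync, a rough sketch along these lines may help. It assumes org.Ss.eg.db and GO.db are both installed and uses the Bimap interface; the variable names are illustrative, and the "GOSOURCEDATE" row name is an assumption about what the metadata tables contain.

```r
library(org.Ss.eg.db)
library(GO.db)

# Each annotation package carries a metadata table (columns: name, value)
# recording where and when its source data came from.
org_meta <- org.Ss.eg_dbInfo()
go_meta  <- GO_dbInfo()
org_meta[org_meta$name == "GOSOURCEDATE", ]
go_meta[go_meta$name == "GOSOURCEDATE", ]

# GO IDs that have genes in the organism package but are unknown to GO.db.
org_go_ids <- mappedkeys(org.Ss.egGO2EG)
known_go   <- keys(GO.db)
too_new    <- setdiff(org_go_ids, known_go)
length(too_new)
```

For a released pair of packages one would expect the last line to report few or no unknown GO IDs; a large number suggests the two packages come from different snapshots.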

A new release of Bioconductor should drop in about a month and with it 
will be an update to org.Ss.eg.db.  If you are feeling impatient, there 
should be a new package in devel even sooner than that.

Marc Carlson | 1 Mar 2012 01:30

Re: how to get gene list for given GO terms?

Hi Jianhong,

The 1st example Martin showed will get you answers concerning the 
immediate GO associations.  When considering the possibility that your 
GO term may be an ancestor and performing the same kind of operation 
with consideration for that fact, please see the GO2ALLEGS mapping:

library(org.Hs.eg.db)
help("org.Hs.egGO2ALLEGS")

## then to get your entrez gene IDs you can just use this:
mget(c("GO:0042254"),org.Hs.egGO2ALLEGS)
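
As a rough illustration of the difference between the two mappings, the sketch below looks up the same GO ID in both. It assumes org.Hs.eg.db is installed; the variable names are made up for the example, and the counts will depend on the package version.

```r
library(org.Hs.eg.db)

go_id <- "GO:0042254"

# Genes annotated directly to the term.
direct <- mget(go_id, org.Hs.egGO2EG, ifnotfound = NA)[[1]]

# Genes annotated to the term or to any of its offspring terms.
with_offspring <- mget(go_id, org.Hs.egGO2ALLEGS, ifnotfound = NA)[[1]]

# The names on each vector are evidence codes, so a gene can appear more
# than once when several evidence codes support it. Count unique IDs.
length(unique(direct))
length(unique(with_offspring))
```

The second count should be at least as large as the first, because GO2ALLEGS includes the direct annotations as well as those inherited from offspring terms.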

I hope this helps,

   Marc

On 02/29/2012 02:33 PM, Martin Morgan wrote:
> On 02/29/2012 11:13 AM, Ou, Jianhong wrote:
>> Hi Martin,
>>
>> Thank you for your reply.
>>
>> The question may be divided into two parts. The first part is like 
>> what you replied. The second one is that maybe the given GO term is 
>> the ancestor of other GO terms which are not annotated in the 
>> org.Hs.egGO db.
>>
>> What I did for this was to first map all the gene entrez_id to GO 
>> terms and get all the ancestors of the GO terms. Then go back to 

Ou, Jianhong | 1 Mar 2012 02:38

Re: how to get gene list for given GO terms?

Hi Marc,

cool! It is exactly what I want. Thanks.

Yours sincerely,

Jianhong Ou

jianhong.ou@...

On Feb 29, 2012, at 7:30 PM, Marc Carlson wrote:

> Hi Jianhong,
> 
> The 1st example Martin showed will get you answers concerning the immediate GO associations.  When
> considering the possibility that your GO term may be an ancestor and performing the same kind of operation
> with consideration for that fact, please see the GO2ALLEGS mapping:
> 
> library(org.Hs.eg.db)
> help("org.Hs.egGO2ALLEGS")
> 
> ## then to get your entrez gene IDs you can just use this:
> mget(c("GO:0042254"),org.Hs.egGO2ALLEGS)
> 
> I hope this helps,
> 
> 
>  Marc
> 
> 

Ou, Jianhong | 1 Mar 2012 02:40

Re: how to get gene list for given GO terms?

Hi Martin,

I am sorry I could not explain it clearly. Thank you again for your time and answers.

Yours sincerely,

Jianhong Ou

jianhong.ou@...

On Feb 29, 2012, at 5:33 PM, Martin Morgan wrote:

> On 02/29/2012 11:13 AM, Ou, Jianhong wrote:
>> Hi Martin,
>> 
>> Thank you for your reply.
>> 
>> The question may be divided into two parts. The first part is like what you replied. The second one is that
>> maybe the given GO term is the ancestor of other GO terms which are not annotated in the org.Hs.egGO db.
>> 
>> What I did for this was to first map all the gene entrez_id to GO terms and get all the ancestors of the GO
>> terms. Then go back to extract all the genes involved in one GO term.
> 
> I don't really understand your question so probably shouldn't try to answer, but if I knew a GO term
> GO:0006281 I could find out about it and all its offspring (or immediate children, if I used GOBPCHILDREN)
> 
> > library(GO.db)
> > GOTERM[["GO:0006281"]]
> GOID: GO:0006281
> Term: DNA repair

Yuan Tian | 1 Mar 2012 05:09

how edgeR control outliers?

Dear all,

I'm currently using edgeR to detect differentially expressed genes from an RNA-seq dataset, and I'm
also using the gof() function to test for potential outliers. I have two questions regarding the outlier
detection, and would like to have your suggestions. 

1) How is an outlier defined? Is it a gene that has a deviance larger than a threshold? How is the deviance
contained in the glmfit data calculated?

2) The gof() function assumes the deviance follows a chi-squared distribution. What is the
statistical basis for this assumption? 

Thanks!

Yuan
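
For readers following the question, the kind of test being described can be sketched in base R. This is not edgeR's code: the deviances are simulated, and the residual degrees of freedom, the Holm adjustment and the 0.1 cutoff are assumptions chosen only for illustration. The chi-squared reference distribution is treated here as an approximation.

```r
set.seed(42)

n_genes     <- 1000
df_residual <- 4   # e.g. 6 libraries minus 2 model coefficients (assumed)

# Simulated per-gene deviances; gene 1 is made deliberately extreme.
deviance_stat    <- rchisq(n_genes, df = df_residual)
deviance_stat[1] <- 60

# Upper-tail chi-squared p-value for each gene's deviance.
p_value <- pchisq(deviance_stat, df = df_residual, lower.tail = FALSE)

# Adjust for testing many genes, then flag genes below the cutoff.
p_cutoff <- 0.1
outlier  <- p.adjust(p_value, method = "holm") < p_cutoff

which(outlier)
```

Under this reading, an "outlier" is a gene whose deviance is too large to be plausible under the reference distribution after multiple-testing adjustment, rather than one that exceeds a fixed deviance threshold.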

_______________________________________________
Bioconductor mailing list
Bioconductor@...
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

Ekta Jain | 1 Mar 2012 05:26

Filtering genes on highest expression before or after LIMMA?

Dear All,
I have analyzed a dataset for differential gene expression using LIMMA. The requirement was to select the
probesets with the highest expression values. I notice that the results change depending on whether I filter
the probesets before or after running LIMMA. The logFC and gene list remain the same; only the p-value and
B value change. This seems possible because the probesets are not averaged to the gene level but
retained on maximum expression, so the filtering has no effect on the fold change.

I can see that a similar question had been asked before https://stat.ethz.ch/pipermail/bioconductor/2011-June/039936.html

I would be grateful if someone could tell me whether it is best to filter before any statistical analysis or
after. I am leaning towards 'always filter before' but thought I would gather more views on this.

Many Thanks.

Regards,
Ekta
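
The behaviour described above can be reproduced on simulated data. The sketch below assumes limma is installed; the expression matrix, the two-group design and the median-based filter are all invented for illustration, and it makes no claim about which order is preferable.

```r
library(limma)

set.seed(1)
n_probes <- 1000

# Simulated log-expression: each probe has its own baseline, and
# low-expressed probes are made noisier than high-expressed ones.
baseline <- runif(n_probes, min = 4, max = 12)
expr <- matrix(rnorm(n_probes * 6, mean = baseline, sd = 0.2 + 3 / baseline),
               nrow = n_probes)
design <- cbind(Intercept = 1, Group = c(0, 0, 0, 1, 1, 1))

# Fit on all probes, then on the high-expression half only.
fit_all <- eBayes(lmFit(expr, design))
keep <- rowMeans(expr) > median(rowMeans(expr))
fit_filtered_first <- eBayes(lmFit(expr[keep, ], design))

# Fold changes are per-probe least-squares estimates, so they agree.
all.equal(fit_all$coefficients[keep, "Group"],
          fit_filtered_first$coefficients[, "Group"])

# The empirical Bayes prior is estimated from whichever probes are present.
c(all = fit_all$s2.prior, filtered = fit_filtered_first$s2.prior)

# Moderated p-values for the same probes therefore differ.
summary(fit_all$p.value[keep, "Group"] -
        fit_filtered_first$p.value[, "Group"])
```

The fold changes match because each probe is fitted on its own, while the moderated statistics borrow information across all probes in the fit, so changing the probe set changes the p-values and B values.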


Ekta Jain | 1 Mar 2012 05:40

Re: Use probesets with highest baseline expression for differential gene

Dear  Gordon,
That worked fantastic. 

Thank you very much.

Best Regards,
Ekta

-----Original Message-----
From: Gordon K Smyth [mailto:smyth@...] 
Sent: 24 February 2012 05:00
To: Ekta Jain
Cc: Bioconductor mailing list
Subject: Use probesets with highest baseline expression for differential gene

Dear Ekta,

Jim has already pointed out that you have some incorrect perceptions about 
what limma does by default.

If you need to keep one probe for each gene symbol after a limma lmFit, 
and you want to choose the probe with highest average expression, it is 
easy to do like this.  I will assume that your linear model fit object is 
called 'fit', and your annotation includes a column called "Symbol" 
containing the gene symbol.

    o <- order(fit$Amean, decreasing=TRUE)
    dup <- duplicated(fit$genes$Symbol[o])
    fit.unique <- fit[o,][!dup,]
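
The same order-then-drop-duplicates idiom can be tried on a small data frame without a fit object. The probe IDs, symbols and expression values below are made up for illustration.

```r
# Toy stand-in for the annotation and average expression in a fit object.
probes <- data.frame(
  ProbeID = c("p1", "p2", "p3", "p4", "p5"),
  Symbol  = c("TP53", "BRCA1", "TP53", "MYC", "BRCA1"),
  Amean   = c(7.2, 5.1, 9.4, 6.0, 8.3),
  stringsAsFactors = FALSE
)

# Sort probes from highest to lowest average expression.
o <- order(probes$Amean, decreasing = TRUE)

# In that order, mark every repeat of a symbol after its first appearance.
dup <- duplicated(probes$Symbol[o])

# Keep only the first (highest-expressed) probe for each symbol.
probes_unique <- probes[o, ][!dup, ]
probes_unique$ProbeID
# "p3" "p5" "p4"
```

Because duplicated() flags only the second and later occurrences, sorting first guarantees that the retained probe for each symbol is the one with the highest average expression.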


wuchunyan | 1 Mar 2012 08:52

Fw: Warning of function "ncbiTaxonomy"

Dear All,
   I have a list of NCBI taxon ids for which I would like to have both the full lineage and common name
information. So I installed the package called 'genomes' (genomes_2.0.0.zip), then used the function
'ncbiTaxonomy' as follows:

> ncbiTaxonomy (1000587, "lineage")
Premature end of data in tag TaxaSet line 1

I searched Google for the reason, but I could not find one.
Does anyone know?

Thanks very much!

wuchunyan | 1 Mar 2012 09:30

Warning of function "ncbiTaxonomy"

Hi,
   I have a list of NCBI taxon ids for which I would like to have both the full lineage and common name
information. So I installed the package called 'genomes' (genomes_2.0.0.zip), then used the function
'ncbiTaxonomy' as follows:

> ncbiTaxonomy (1000587, "lineage")
Premature end of data in tag TaxaSet line 1

I searched Google for the reason, but I could not find one.
Does anyone know?

Thanks very much!

Ovokeraye Achinike-Oduaran | 1 Mar 2012 09:38

Re: PostForm() with KEGG

Hi Duncan and Martin,

My bad, no bug whatsoever...it was me. Got my code sorted for the most
part. Thanks again for all the help. It's much appreciated.

-Avoks

On Wed, Feb 29, 2012 at 12:19 PM, Ovokeraye Achinike-Oduaran
<ovokeraye@...> wrote:
> Hi Morgan,
>
> Thanks. I think there's possibly a bug with the
> getHTMLFormDescription() but I do understand what you've explained.
>
> Thanks again.
>
>
> -Avoks
>
> On Tue, Feb 28, 2012 at 6:19 PM, Martin Morgan <mtmorgan@...> wrote:
>> On 02/28/2012 06:14 AM, Ovokeraye Achinike-Oduaran wrote:
>>>
>>> Hi Duncan,
>>>
>>> My understanding is that xpathSApply() combines both the getNodeSet()
>>> and the sapply(). I hope that this is a correct assumption. In
>>> attempting to retrieve nodes in general from the pathway, I used  both
>>>
>>> xpathSApply(doc, "//li/node()",  xmlGetAttr, "href")
>>> and

