Re: How to get from gi/ref/gb to genomic coordinates ?

This is a simple test from gene ID 3632373 (protein is 46100068) to
contig coordinates: 

perl -MLWP::Simple -e 'print "$_\n" for grep { m{<(Gene-source_src[^>]*)>.*</\1>} }
split /\n/, get(q{http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=gene&id=3632373&retmode=xml})'

You need to translate protein id to gene id though. 

If the genome is available in Map Viewer, try the following (the contig
name, NW_101115, comes from the previous step):
http://www.ncbi.nlm.nih.gov/mapview/maps.cgi?taxid=5270&gnl=NW_101115&MAPS=genes&cmd=txt

Wenwu Cui, PhD

-----Original Message-----
From: Rainer Machne [mailto:raim <at> tbi.univie.ac.at] 
Sent: Wednesday, January 31, 2007 4:10 PM
To: bioperl-l <at> lists.open-bio.org
Subject: [Bioperl-l] How to get from gi/ref/gb to genomic coordinates ?

Dear Bioperl list,

Hoping I am not on the wrong email list, I have a short question:

Is there a standard way, or are there nice (Bioperl) tools, to get from a
gene id (gi) or other ids (see below) to the genomic coordinates of the
[...]

Rainer Machne | 1 Feb 13:54

Re: How to get from gi/ref/gb to genomic coordinates ?

Barry and Jason,

thanks for your quick and very helpful replies.

I guess we should have done (or should repeat) our BLAST search at
http://fungal.genome.duke.edu/
to get a better mapping from proteins to genomes?

As I retrieved all my proteins via whole genome blasts we should find 
(most of) them in the genbank files ... a good opportunity for me to 
learn some Bioperl and the other packages you mentioned in case we want 
to do more complex analysis later :-)

Thank you very much!

Rainer

Barry Moore wrote:
> Rainer,
> 
> We use a Perl library called CGL, written by Mark Yandell and colleagues
> (which in turn uses Chris Mungall's BioChaos and Unflattener.pm, referred
> to by Jason), for this type of task. The basic pipeline is to convert
> GenBank files to Chaos XML, then use CGL with those XML files to get
> nice object-oriented access to exons, transcripts, proteins,
> coordinates and more for those genes. I am currently using this
> with good success on most GenBank genomes (unfortunately I haven't been
> working with the fungal genomes, but it should work fine). The Ensembl
> API provides similar functionality for Ensembl genomes, but there are
> not very many fungi there.

Chris Fields | 1 Feb 18:55

Re: How to get from gi/ref/gb to genomic coordinates ?


On Feb 1, 2007, at 6:54 AM, Rainer Machne wrote:

> Barry and Jason,
>
> thanks for your quick and very helpful replies.
>
> I guess we should have done (or should repeat) our BLAST search at
> http://fungal.genome.duke.edu/
> to get a better mapping from proteins to genomes?
>
> As I retrieved all my proteins via whole genome blasts we should find
> (most of) them in the genbank files ... a good opportunity for me to
> learn some Bioperl and the other packages you mentioned in case we
> want to do more complex analysis later :-)
>
> Thank you very much!
>
> Rainer

If the data is available in GenBank you could run the BLAST searches  
at NCBI and limit the search with an Entrez query:

http://www.ncbi.nlm.nih.gov/BLAST/blastcgihelp.shtml#entrez_query

Most (all?) genome files are tagged as complete

I'm not sure but there might be a way of doing this via  
Bio::Tools::Run::RemoteBlast.  Jason, any ideas?

Chris Fields | 1 Feb 19:09

Re: How to get from gi/ref/gb to genomic coordinates ?

> If the data is available in GenBank you could run the BLAST searches
> at NCBI and limit the search with an Entrez query:
>
> http://www.ncbi.nlm.nih.gov/BLAST/blastcgihelp.shtml#entrez_query
>
> Most (all?) genome files are tagged as complete

sorry, didn't finish that...

"Most (all?) genome files are tagged as complete, wgs, in progress,  
etc. and can be limited by taxonomy using Fungi[ORGN] or similar."

chris
Jason Stajich | 1 Feb 19:36

Re: How to get from gi/ref/gb to genomic coordinates ?


On Feb 1, 2007, at 9:55 AM, Chris Fields wrote:

>
> On Feb 1, 2007, at 6:54 AM, Rainer Machne wrote:
>
>> Barry and Jason,
>>
>> thanks for your quick and very helpful replies.
>>
>> I guess we should have done (or should repeat) our BLAST search at
>> http://fungal.genome.duke.edu/
>> to get a better mapping from proteins to genomes?
>>

Well I'm not quite sure of your exact goals.  To find upstream  
regions of known genes, or look at upstream regions of orthologous  
genes?

You can first figure out orthologs based on protein similarities,
then go in and extract upstream regions for the orthologous genes (I
provide a link to a big all-vs-all FASTA result at the bottom of the
page if you want those results, as well as some pairwise orthology
assignments, although you may want more or less stringent parameters).

All the GFF and AA data is freely available for download on the site  
for each genome we've annotated or for annotation we've re-formatted  
so you can do things locally and/or modify it to your liking.

>> As I retrieved all my proteins via whole genome blasts we should find

Reena Yadav | 1 Feb 19:38

pdb parser

Hi, I need to extract PDB atomic coordinates (1ake) and do certain
calculations.
I am going stepwise. The steps involved are:
(1) reading the atomic coordinates
(2) writing the result to a file.

I need to understand how to write the whole xyz line to another file.
Could someone help?
R.
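
A minimal Perl sketch of the two steps above, assuming a standard
fixed-width PDB file is already on disk (the file names 1ake.pdb and
1ake_atoms.txt are placeholders). It copies every ATOM/HETATM line
unchanged to a second file. The helper names parse_atom_line and
copy_atom_lines are made up for this illustration and are not part of
BioPerl; parse_atom_line shows how x, y and z can be read from the fixed
columns for later calculations.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse one fixed-width PDB ATOM/HETATM record into a hash reference.
# Returns nothing (undef in scalar context) for any other record type.
sub parse_atom_line {
    my ($line) = @_;
    return unless defined $line && $line =~ /^(?:ATOM  |HETATM)/;
    chomp $line;
    $line = sprintf '%-54s', $line;    # pad short lines so substr is safe
    my %atom = (
        serial  => substr($line, 6,  5),
        name    => substr($line, 12, 4),
        resname => substr($line, 17, 3),
        chain   => substr($line, 21, 1),
        resseq  => substr($line, 22, 4),
        x       => substr($line, 30, 8),
        y       => substr($line, 38, 8),
        z       => substr($line, 46, 8),
    );
    s/^\s+|\s+$//g for values %atom;   # trim padding from every field
    return \%atom;
}

# Copy every coordinate line verbatim from $in_file to $out_file.
# Returns the number of lines written.
sub copy_atom_lines {
    my ($in_file, $out_file) = @_;
    open my $in,  '<', $in_file  or die "Cannot read $in_file: $!";
    open my $out, '>', $out_file or die "Cannot write $out_file: $!";
    my $count = 0;
    while (my $line = <$in>) {
        next unless $line =~ /^(?:ATOM  |HETATM)/;
        print {$out} $line;
        $count++;
    }
    close $in;
    close $out or die "Cannot close $out_file: $!";
    return $count;
}

# Usage: perl pdb_atoms.pl 1ake.pdb 1ake_atoms.txt
if (@ARGV == 2) {
    my $n = copy_atom_lines(@ARGV);
    print "Copied $n coordinate lines\n";
}
```

Reading the coordinates by column position rather than by splitting on
whitespace is deliberate: in PDB files adjacent fields can run together
without a separating space.
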
sandhya khatal | 1 Feb 14:06

Regarding Bioperl program

Respected Sir,
I want to write a program that produces a dendrogram, like the UPGMA
clustering method does, but I want the dendrogram built using the single
linkage or centroid method. Can you help me with this? You have given the
code for a tree, but I want a dendrogram as output using either of the
above methods.

Thanks in anticipation.

Regards,
Sandhya Khatal.
Jason Stajich | 2 Feb 01:55

Fwd: Regarding Bioperl program

re-forwarding Sandhya's email to the list so the email address is  
visible.

The approach that is coded in bioperl is for distance based data such  
as evolutionary distance of DNA or protein sequences - I assume you  
are talking about clustering expression data? You may want to focus  
on the available literature and toolkits that focus on expression  
data - something BioPerl doesn't deliberately focus on right now.

-jason
Begin forwarded message:

> From: "sandhya khatal" <sandhya.khatal <at> gmail.com>
> Date: February 1, 2007 5:06:42 AM PST
> To: jason <at> bioperl.org
> Subject: Regarding Bioperl program
>
> Respected Sir,
> I want to write a program that produces a dendrogram, like the UPGMA
> clustering method does, but I want the dendrogram built using the single
> linkage or centroid method. Can you help me with this? You have given the
> code for a tree, but I want a dendrogram as output using either of the
> above methods.
>
> Thanks in anticipation.
>
> Regards,
> Sandhya Khatal.
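
For the single-linkage case asked about above, here is a minimal
pure-Perl sketch. It is not BioPerl code and not the approach coded in
BioPerl. It assumes the input is a symmetric hash of hashes of pairwise
distances, and the function names are made up for this illustration. Like
UPGMA, it places each merge at half the merging distance so the result is
an ultrametric dendrogram, and it returns it as a Newick string.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(min);

# Single linkage: the distance between two clusters is the distance
# between their closest pair of members.
sub single_linkage_distance {
    my ($dist, $members_a, $members_b) = @_;
    my @d;
    for my $p (@$members_a) {
        for my $q (@$members_b) {
            push @d, $dist->{$p}{$q};
        }
    }
    return min @d;
}

# Naive agglomerative clustering. $dist must be symmetric:
# $dist->{A}{B} == $dist->{B}{A} for every pair of labels.
sub single_linkage_newick {
    my ($dist) = @_;
    my @clusters = map { +{ members => [$_], newick => $_, height => 0 } }
                   sort keys %$dist;
    while (@clusters > 1) {
        # Find the closest pair of clusters (first pair wins on ties).
        my ($best_i, $best_j, $best_d);
        for my $i (0 .. $#clusters - 1) {
            for my $j ($i + 1 .. $#clusters) {
                my $d = single_linkage_distance(
                    $dist, $clusters[$i]{members}, $clusters[$j]{members});
                ($best_i, $best_j, $best_d) = ($i, $j, $d)
                    if !defined $best_d || $d < $best_d;
            }
        }
        my ($left, $right) = @clusters[$best_i, $best_j];
        my $height = $best_d / 2;    # node height in the dendrogram
        my $merged = {
            members => [ @{ $left->{members} }, @{ $right->{members} } ],
            height  => $height,
            newick  => sprintf('(%s:%g,%s:%g)',
                $left->{newick},  $height - $left->{height},
                $right->{newick}, $height - $right->{height}),
        };
        splice @clusters, $best_j, 1;    # remove the higher index first
        splice @clusters, $best_i, 1;
        push @clusters, $merged;
    }
    return $clusters[0]{newick} . ';';
}
```

The Newick string can then be drawn by any tree viewer, or read back
with Bio::TreeIO using the newick format. Labels containing characters
that are special in Newick (parentheses, commas, colons) are not escaped
in this sketch.
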


zhihua li | 2 Feb 04:20

Bio::Index::Fasta - where's the indexed file?


_______________________________________________
Bioperl-l mailing list
Bioperl-l <at> lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l
zhihua li | 2 Feb 04:27

Bio::Index::Fasta - where's the indexed file?

Sorry guys, the former empty mail was sent out by mistake.

I'm using Bio::Index::Fasta to index a file containing lots of sequences in 
FASTA format. All is fine except one thing.

According to the BioPerl tutorial and the documentation, the following code 
will make an index file:

use Bio::Index::Fasta;

my $inx = Bio::Index::Fasta->new(-filename   => "test.fasta.idx",
                                 -write_flag => 1);
$inx->make_index("test.fasta");

And in another script I can access the index file by saying

use Bio::Index::Fasta;

$ENV{BIOPERL_INDEX} = ".";  # find index in current directory
my $inx = Bio::Index::Fasta->new(-filename => "test.fasta.idx");
my $seq = $inx->fetch("ent1001");  # fetch the sequence named ent1001

However, after running the first script, I cannot find a new file 
test.fasta.idx in my current directory. And not surprisingly, when I ran 
the second script, perl told me it couldn't find "test.fasta.idx".

What's going on here?

Thanks a lot!
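
One way to narrow this down is sketched below. It rests on an
assumption that is not confirmed in this thread: my understanding is
that Bio::Index prefers DB_File and otherwise falls back to SDBM_File,
and SDBM_File stores its data in files ending in .pag and .dir rather
than in a file with the exact name requested. The helper names are made
up for this illustration. The script reports which of the two DBM
modules can be loaded and lists every file whose name starts with the
index name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the names of the DBM modules that can be loaded on this system.
sub available_dbm_packages {
    return grep { eval "require $_; 1" } qw(DB_File SDBM_File);
}

# List every file in $dir whose name starts with $index_name, so that
# variants such as "test.fasta.idx.pag" and "test.fasta.idx.dir" show up.
sub find_index_files {
    my ($dir, $index_name) = @_;
    opendir my $dh, $dir or die "Cannot open directory $dir: $!";
    my @found = sort grep { index($_, $index_name) == 0 } readdir $dh;
    closedir $dh;
    return @found;
}

# Usage: perl find_index.pl . test.fasta.idx
if (@ARGV == 2) {
    my ($dir, $index_name) = @ARGV;
    print "Loadable DBM modules: ", join(', ', available_dbm_packages()), "\n";
    my @files = find_index_files($dir, $index_name);
    print @files ? "Found: @files\n"
                 : "No file starting with $index_name in $dir\n";
}
```

If nothing is found at all, it is also worth checking that the first
script was run from the directory you expect, since a relative file name
is resolved against the working directory of that run.
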
