Scott Markel | 1 Dec 02:05 2005

Re: clustalw.pm: could not open sequence file error

Olena,

Are you getting a BioPerl error or a ClustalW error?

What happens if you invoke ClustalW directly on your input
file, i.e., without using BioPerl?

Scott

Olena Morozova wrote:

> Hi all,
> 
> I am trying to use this script
> 
> use Bio::Tools::Run::Alignment::Clustalw;
> 
> $ENV{CLUSTALDIR} = 'C:/perl/clustalw1.8/';
> my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM',
>               'outfile' => 'al_mouse.txt');
> my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
> $inputfilename = 'c:/perl/mouse_unique.txt';
> my $aln = $factory->align($inputfilename);
> 
> to do a MSA, and it works for a test file with 2 or 3 sequences.
> However, when I try to run it on my actual file (has 97 sequences)
> which is in exactly the same format as the test file (fasta), I get a

Barry Moore | 1 Dec 14:14 2005

RE: clustalw.pm: could not open sequence file error

Olena,

Does the filename for the file in question have any spaces anywhere in
the path?  I know clustalx won't open files with a space in the path
even though Windows allows that.  Don't know for sure on clustalw, but
seems like it might behave the same way.
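A quick way to rule this out before calling ClustalW is to check the path from Perl first. This is only a sketch: input_path_problems is a made-up helper name, not part of BioPerl, and the path is the one from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl function): returns a list of reasons
# why an external program might fail to open the given input file.
sub input_path_problems {
    my ($path) = @_;
    my @problems;
    push @problems, "path contains whitespace" if $path =~ /\s/;
    push @problems, "file does not exist"      unless -e $path;
    push @problems, "file is not readable"     if -e $path && !-r $path;
    return @problems;
}

# Path taken from the original post; adjust to your own file.
my $inputfilename = 'c:/perl/mouse_unique.txt';
my @problems = input_path_problems($inputfilename);
print "$inputfilename: ",
      (@problems ? join('; ', @problems) : 'looks fine'), "\n";
```

If this reports nothing wrong, the path itself is probably not the cause and running ClustalW directly on the file is the next thing to try.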

Barry
-----Original Message-----
From: bioperl-l-bounces@portal.open-bio.org
[mailto:bioperl-l-bounces@portal.open-bio.org] On Behalf Of Olena
Morozova
Sent: Tuesday, November 29, 2005 3:34 PM
To: bioperl-ml List
Subject: [Bioperl-l] clustalw.pm: could not open sequence file error

Hi all,

I am trying to use this script

use Bio::Tools::Run::Alignment::Clustalw;

$ENV{CLUSTALDIR} = 'C:/perl/clustalw1.8/';
my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM',
              'outfile' => 'al_mouse.txt');
my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
$inputfilename = 'c:/perl/mouse_unique.txt';

Qunfeng | 1 Dec 21:04 2005

solved Re: Deep recursion on subroutine

Hello Jason and Jonathan,

Thanks for your help. Jason figured out the problem: "
This is just Perl complaining because the recursion is deep because there are
so many levels in your tree (449 deep).  It thinks it hit a snag because it
doesn't expect to usually have a recursive call go that many levels.

You can make the warnings go away by adding this
no warnings 'recursion';"
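
For anyone who finds this thread later, here is a small stand-alone sketch of the pragma in use. It does not use BioPerl; the nested-hash "tree" and the names build_chain and depth are made up for illustration, and the pragma is scoped to the one sub that recurses.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a very deep tree: a chain of nested hashes
# in which every node has exactly one child.
sub build_chain {
    my ($levels) = @_;
    my $node;
    $node = { child => $node } for 1 .. $levels;
    return $node;
}

# Counts the levels recursively. Perl normally warns
# "Deep recursion on subroutine" once a sub is nested about 100 calls
# deep; the pragma below turns that warning off inside this sub only.
sub depth {
    my ($node) = @_;
    no warnings 'recursion';
    return 0 unless $node;
    return 1 + depth($node->{child});
}

print depth(build_chain(449)), "\n";
```

Keeping the pragma inside the recursive sub leaves the warning active for the rest of the script, where deep recursion would more likely be a real bug.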

Qunfeng
At 04:39 PM 11/30/2005, Jonathan Arthur wrote:
>Hello Qunfeng,
>
>I have not seen this specifically with bioperl, but have had it occur once 
>or twice in my own code and have always traced the problem back to an 
>error in the tree where one node is its own ancestor, thereby causing an 
>infinite recursion when you attempt to find all descendants from that node.
>
>If each node has a unique identifier, and if the tree is not too large, 
>you could find the offending node with a small script to traverse the 
>tree, testing the unique identifier of each node against a list of all the 
>nodes seen before and dying when it sees the offending node again.
>
>Cheers,
>
>Jonathan
>
>Qunfeng wrote:
>
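
Jonathan's suggestion above (test each node's unique identifier against the nodes already seen) could be sketched roughly as follows. This is plain Perl, not BioPerl: the tree is assumed to be available as a hash mapping each node id to a list of child ids, and find_repeated_node is a made-up name.

```perl
use strict;
use warnings;

# Assumed representation (not a BioPerl structure): node id => [child ids].
# Walks down from $root and returns the id of the first node that is
# reached a second time, or undef if no node is seen twice.
sub find_repeated_node {
    my ($children, $root) = @_;
    my %seen;
    my @stack = ($root);
    while (@stack) {
        my $id = pop @stack;
        return $id if $seen{$id}++;
        push @stack, @{ $children->{$id} || [] };
    }
    return;
}

# Toy example in which node 'a' is its own ancestor (a -> c -> a).
my %children = (root => ['a', 'b'], a => ['c'], c => ['a']);
my $bad = find_repeated_node(\%children, 'root');
print defined $bad ? "node '$bad' is reached twice\n"
                   : "no repeated node\n";
```

The walk uses an explicit stack instead of recursion, so it terminates even when the tree does contain a loop.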

chen li | 2 Dec 06:00 2005

bioperl-db and MySQL

Very special thanks to Hilmar and Barry, who helped me
out with installing bioperl-db.

Here is my experience installing
bioperl-db:

1) Have a Linux operating system (in my case)
2) Use anonymous CVS to install bioperl and follow
the HOWTO
3) The CPAN method didn't work in my case

Li

Hubert Prielinger | 1 Dec 23:49 2005

remoteblast doesn't save the Output File

Hi,
I'm getting quite desperate: for two days I've been trying to save my 
remoteblast output, and it doesn't work.
here we go....

#!/usr/bin/perl -w

use strict;
use warnings;
use Bio::SeqIO;
use Bio::Tools::Run::RemoteBlast;
use Bio::Seq;
use IO::String;
use Bio::SearchIO;

my $prog = 'blastp';
my $db   = 'swissprot';
my $e_val= '20000';
my $matrix = 'PAM30';
#my $outfile = 'Output';

my @data;
my $line_dataArray;
my $rid;
my $count = 1;

my @params = ( '-prog'   => $prog,
               '-data'   => $db,
               '-expect' => $e_val,
               '-matrix' => $matrix);

Erik Sjölund | 2 Dec 17:47 2005

abi2xml a new parser for abi trace files

Bioperl contains a module to parse abi trace files, called

Bio::SeqIO::abi

http://doc.bioperl.org/releases/bioperl-1.4/Bio/SeqIO/abi.html

So you might be interested to know that a new command line utility has
been released

http://abi2xml.sourceforge.net

that converts abi trace files to xml files. This bioinformatics
utility is written in C++ and released under the GPL license. A Perl
programmer could first convert the abi files to xml files and then
access the information through a DOM interface or XPath. The
advantage of this over Bio::SeqIO::abi is access to more of the
information in the abi file, for instance the time when the
experiment was done. I don't think that is possible with
Bio::SeqIO::abi (correct me if I'm wrong).

cheers,
Erik Sjölund
Sam Al-Droubi | 4 Dec 21:17 2005

Bio:Seq $seq_obj->accession_number not returning accession number?

The fasta format for this sequence AF410462 from NCBI looks like this

 
>gi|17066572|gb|AF410462.1|AF410462 Mus musculus PEM homeobox (Pem) gene, promoter region and partial cds
ATGCGTGTGGGCATGCGCTCATGCCCACTTGCTTGAGCACATGTGTGCTCACATGGACGTTAGAGGCAAC
TTTCAGGAGTTATTTTTTTCCCTTCTAACTTGAGTTCCTGGACCTCAGACTTGTATAATAGGTACTTTCC
CAACTTAAGTCTTACTGGCTCCAGGGTATCTGGTATACTCTTCTAGCCTCCAAGGGCAGCCACTCATGCT
TCTTCAGGTGTGAAGAGGTGAGCCAGATACAACGGTGGGAGGCAGTGTGCCCTCAGTGTGTAGACTCTTT
ATGCCCTTGGGGATTAGCGCCTCTAGCTGCCAGTCGGGTCTCTGGGTCCCTCCTGCTAAGGCCACTCTCG
TCATGGTTCCTCTTGTCCTGGTGAGCCATTACGACCCTCTCACTTCCTTGTGTTCTCTTCCCTGTGTTCT
CTCTCTGCTGCTGTGGCCATTCTAGCTCCCTGCACAGTCCTTCAAGCTCACCTCCTGCCTTCCGTGGACA
AGAGGAAGCACAAAGAATCATCCAGTATGTATGCTCATGGCATAAGGGGATCCTGGGGAAGGGCTGAAGC
CTGAGCCGGGCTGGTCAACAGAATCTCCCTCTCCCTAACTCCATCTCCCTCTCCTTCCCTCTTCCTCTCT
CTATCCCTCCCCCCTCTCTCCCCCCACCACCGCATGTTTTGGGTCAGCTGACTGCTCTAGCCTTGATGAG
ATATCTTCCCAGGAAGAGTTGGTGCTGACTGTACAGATTGAGTTAGAGGGAGGGAAGAAAGCTCCTGTTT
GATCACTGGAGATCTTTATGCCTAGCTACATGTCTTACCAAAGCCAGGGGAGTCAGCTGAGCTGTAACTG
GGCACCCTAAGTTCTGCACACCCACATGCCCATGAACTGTGTCCATCTTGCAAGCACATCGTGCTCATTA
CATCCCCAAACTGCTATCACTTGTGTACCCCAAAGGCTCGGCCCACAGGAACGTCCTGTGAGCAAATCAC
AAAGACCAGCTTAGGGCTGGAAACATTGTAACCTGAAGTAGGCCAGAGGAGATCCCTGCCAGGTTGAGCA
TCACAGATCTCATTCTGTTCCCGGGGACACCAGGGGCCCAAGCTCAGAATCTGCCGAAGCATAACTTCAT
CATTGATCCTATTCAGGGTATGGAAGCTGAGGGTTCCAGCCGCAAGGTCACCAGGCTACTCCGCCTGGGA
GTCAAGGAAG

When I read this from a file as a sequence object using Bio::Seq, I get accession_number unknown. The
accession number is in the header of the fasta file. Does anyone know why this happens?

 My code looks like this:

 print "primary id is: ",$seq_obj->primary_id."\n";

Barry Moore | 4 Dec 22:23 2005

RE: Bio:Seq $seq_obj->accession_number not returning accession number?

Sam-

The fasta parser makes no attempt to parse the fasta header since there
is no standard format for what should be in a fasta header.  Parse the
accession out of the primary_id field with a regular expression in your
script or use GenBank or ENSEMBL format sequences to get all the goodies
parsed for you.  Google on "accession fasta parse site:bioperl.org" to
read other posts on this topic.
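
As a rough sketch of the regular-expression route, assuming an NCBI-style id of the form gi|number|db|accession.version|locus (other header styles will need a different pattern). accession_from_id is a made-up helper name, and the id string is the one from Sam's fasta header.

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl method): pulls the accession,
# without its version suffix, out of an NCBI-style id such as
# gi|17066572|gb|AF410462.1|AF410462. Returns undef if the id does
# not have that shape.
sub accession_from_id {
    my ($id) = @_;
    return $id =~ /^gi\|\d+\|[^|]+\|([^|.]+)(?:\.\d+)?\|/ ? $1 : undef;
}

print accession_from_id('gi|17066572|gb|AF410462.1|AF410462'), "\n";
```

In a real script the argument would be the id string read from the sequence object, and the result could then be stored with $seq_obj->accession_number(...).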

Barry

-----Original Message-----
From: bioperl-l-bounces@portal.open-bio.org
[mailto:bioperl-l-bounces@portal.open-bio.org] On Behalf Of Sam
Al-Droubi
Sent: Sunday, December 04, 2005 1:18 PM
To: BioPerl list BioPerl list
Subject: [Bioperl-l] Bio:Seq $seq_obj->accession_number not
returning accession number?

The fasta format for this sequence AF410462 from NCBI looks like this

 
>gi|17066572|gb|AF410462.1|AF410462 Mus musculus PEM homeobox (Pem) gene, promoter region and partial cds
ATGCGTGTGGGCATGCGCTCATGCCCACTTGCTTGAGCACATGTGTGCTCACATGGACGTTAGAGGCAAC
TTTCAGGAGTTATTTTTTTCCCTTCTAACTTGAGTTCCTGGACCTCAGACTTGTATAATAGGTACTTTCC
CAACTTAAGTCTTACTGGCTCCAGGGTATCTGGTATACTCTTCTAGCCTCCAAGGGCAGCCACTCATGCT
TCTTCAGGTGTGAAGAGGTGAGCCAGATACAACGGTGGGAGGCAGTGTGCCCTCAGTGTGTAGACTCTTT
ATGCCCTTGGGGATTAGCGCCTCTAGCTGCCAGTCGGGTCTCTGGGTCCCTCCTGCTAAGGCCACTCTCG

Jason Stajich | 4 Dec 22:49 2005

Re: Bio:Seq $seq_obj->accession_number not returning accession number?

Sam -
Yeah what Barry said.

It doesn't get set when reading fasta files - see Hilmar's link below
for more info - all the info is in the display id, available in
$seq->display_id

my ($gi,$acc,$locus);
(undef,$gi,undef,$acc,$locus) = split(/\|/,$seq->display_id);
$seq->accession_number($acc);

I thought there was a function already to do this for you, but I
guess not.  There is something in the Search::Hit objects to parse the
accession number, so maybe we can consolidate this if someone volunteers to do it.

See also Hilmar's response about this:
http://bioperl.org/pipermail/bioperl-l/2005-August/019579.html

I've added it as a Q&A to the new wiki FAQ which we'll roll out soon.

-jason

On Dec 4, 2005, at 4:23 PM, Barry Moore wrote:

> Sam-
>
> The fasta parser makes no attempt to parse the fasta header since  
> there
> is no standard format for what should be in a fasta header.  Parse the
> accession out of the primary_id field with a regular expression in  

Angshu Kar | 5 Dec 02:32 2005

parsing a BLAST output

Hi,

To begin with, I'm new to Bioperl.
Now, I've written the following simple piece of code to parse a WU-Blast
output which filters data *for a given e-value and >50% overlap*.

I'm writing the main algorithm here:

my $blast_report = $ARGV[1];
my $threshold_evalue = $ARGV[2];

my $in = new Bio::SearchIO(-format => 'blast', -file => $blast_report);

while (my $result = $in -> next_result)
   {
      while(my $hit = $result->next_hit)
         {
            if(($line{$hit->name} == $line{$result->query_accession}))
               {
                  next;
               }
            if($hit->hsp->evalue <= $threshold_evalue)
               {
                  if($hit->hsp->frac_identical>=0.5)
                     {
                        print $line{$result->query_accession} . "\t" .
                              $line{$hit->name} . "\t" . $hit->hsp->evalue . "\n";
                    }
              }
      }