Scott Markel | 1 Dec 2005 02:05

Re: clustalw.pm: could not open sequence file error

Olena,

Are you getting a BioPerl error or a ClustalW error?

What happens if you invoke ClustalW directly on your input
file, i.e., without using BioPerl?

Scott

Olena Morozova wrote:

> Hi all,
> 
> I am trying to use this script
> 
> use Bio::Tools::Run::Alignment::Clustalw;
> 
> $ENV{CLUSTALDIR} = 'C:/perl/clustalw1.8/';
> my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM',
>               'outfile' => 'al_mouse.txt');
> my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
> $inputfilename = 'c:/perl/mouse_unique.txt';
> my $aln = $factory->align($inputfilename);
> 
> to do a MSA, and it works for a test file with 2 or 3 sequences.
> However, when I try to run it on my actual file (has 97 sequences)
> which is in exactly the same format as the test file (fasta), I get a

Barry Moore | 1 Dec 2005 14:14

RE: clustalw.pm: could not open sequence file error

Olena,

Does the filename for the file in question have any spaces anywhere in
the path?  I know clustalx won't open files with a space in the path
even though Windows allows that.  Don't know for sure on clustalw, but
seems like it might behave the same way.

Barry
-----Original Message-----
From: bioperl-l-bounces@portal.open-bio.org
[mailto:bioperl-l-bounces@portal.open-bio.org] On Behalf Of Olena
Morozova
Sent: Tuesday, November 29, 2005 3:34 PM
To: bioperl-ml List
Subject: [Bioperl-l] clustalw.pm: could not open sequence file error

Hi all,

I am trying to use this script

use Bio::Tools::Run::Alignment::Clustalw;

$ENV{CLUSTALDIR} = 'C:/perl/clustalw1.8/';
my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM',
              'outfile' => 'al_mouse.txt');
my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
$inputfilename = 'c:/perl/mouse_unique.txt';
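
A minimal sketch of the check Barry suggests, using only core Perl: before handing a file to ClustalW, test whether its path contains whitespace and whether the file can be read at all. The helper name path_problems is hypothetical; it is not part of BioPerl or ClustalW.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): list obvious problems with an
# input path before it is passed to an external program such as ClustalW.
sub path_problems {
    my ($path) = @_;
    my @problems;
    push @problems, 'path contains whitespace' if $path =~ /\s/;
    if (!-e $path) {
        push @problems, 'file does not exist';
    }
    elsif (!-r $path) {
        push @problems, 'file is not readable';
    }
    return @problems;
}

# The path from the original post; substitute your own.
my $inputfilename = 'c:/perl/mouse_unique.txt';
my @problems = path_problems($inputfilename);
print "$inputfilename: ",
    (@problems ? join('; ', @problems) : 'no obvious path problems'), "\n";
```

If this reports nothing, running ClustalW by hand on the same file, as Scott suggests, may help separate a path problem from a problem with the file contents.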

Qunfeng | 1 Dec 2005 21:04

solved Re: Deep recursion on subroutine

Hello Jason and Jonathan,

Thanks for your help. Jason figured out the problem: "
This is just Perl complaining because the recursion is deep because there are
so many levels in your tree (449 deep).  It thinks it hit a snag because it
doesn't expect to usually have a recursive call go that many levels.

You can make the warnings go away by adding this
no warnings 'recursion';"

Qunfeng
At 04:39 PM 11/30/2005, Jonathan Arthur wrote:
>Hello Qunfeng,
>
>I have not seen this specifically with bioperl, but have had it occur once 
>or twice in my own code and have always traced the problem back to an 
>error in the tree where one node is its own ancestor, thereby causing an 
>infinite recursion when you attempt to find all descendants from that node.
>
>If each node has a unique identifier, and if the tree is not too large, 
>you could find the offending node with a small script to traverse the 
>tree, testing the unique identifier of each node against a list of all the 
>nodes seen before and dying when it sees the offending node again.
>
>Cheers,
>
>Jonathan
>
>Qunfeng wrote:
>
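
Both suggestions in this thread can be sketched together in core Perl. The traversal below silences the "Deep recursion" warning with a lexically scoped no warnings 'recursion', and it also follows Jonathan's idea of recording every node id seen so far and dying on a repeat. The toy adjacency-list trees and the name count_descendants are made up for illustration; this is not how Bio::Tree represents nodes.

```perl
use strict;
use warnings;

# Count all descendants of node $id in a tree stored as
# { node id => [ child ids ] }. Dies if any node is reached twice,
# which in a tree means a node is (directly or indirectly) its own ancestor.
sub count_descendants {
    my ($tree, $id, $seen) = @_;
    no warnings 'recursion';    # legitimately deep trees exceed Perl's warning depth
    die "node '$id' was visited twice\n" if $seen->{$id}++;
    my $count = 0;
    for my $child (@{ $tree->{$id} || [] }) {
        $count += 1 + count_descendants($tree, $child, $seen);
    }
    return $count;
}

# A chain 449 levels deep, matching the depth mentioned above.
my %deep = map { ("n$_" => [ 'n' . ($_ + 1) ]) } 1 .. 448;
print "descendants of n1: ", count_descendants(\%deep, 'n1', {}), "\n";

# A broken tree in which node "b" lists its ancestor "a" as a child.
my %broken = (a => ['b'], b => ['a']);
eval { count_descendants(\%broken, 'a', {}) };
print "error: $@" if $@;
```

Keeping the no warnings line inside the subroutine limits it to that one recursive call site, so unexpected deep recursion elsewhere in a script would still be reported.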

chen li | 2 Dec 2005 06:00

bioperl-db and MySQL

Very special thanks to Hilmar and Barry, who helped me
with installing bioperl-db.

Here is my experience installing
bioperl-db:

1) Have a Linux operating system (in my case)
2) Use anonymous CVS to install bioperl and follow
the HOWTO
3) The CPAN method doesn't work in my case

Li

		
Hubert Prielinger | 1 Dec 2005 23:49

remoteblast doesn't save the Output File

Hi,
I'm getting quite desperate: for two days I have been trying to save my 
remoteblast output, and it doesn't work.
Here we go...

#!/usr/bin/perl -w

use strict;
use warnings;
use Bio::SeqIO;
use Bio::Tools::Run::RemoteBlast;
use Bio::Seq;
use IO::String;
use Bio::SearchIO;

my $prog = 'blastp';
my $db   = 'swissprot';
my $e_val= '20000';
my $matrix = 'PAM30';
#my $outfile = 'Output';

my @data;
my $line_dataArray;
my $rid;
my $count = 1;

my @params = ( '-prog' => $prog,
                '-data' => $db,
                '-expect' => $e_val,
                '-matrix' => $matrix);

Erik Sjölund | 2 Dec 2005 17:47

abi2xml a new parser for abi trace files

Bioperl contains a module to parse abi trace files, called

Bio::SeqIO::abi

http://doc.bioperl.org/releases/bioperl-1.4/Bio/SeqIO/abi.html

So you might be interested to know that a new command line utility has
been released

http://abi2xml.sourceforge.net

that converts abi trace files to xml files. This bioinformatics
utility is written in C++ and released under the GPL license. A perl
programmer could first convert the abi files to xml files and then
access the information through a DOM interface or XPath. The
advantage of this over Bio::SeqIO::abi is access to more of the
information in the abi file, for instance the time when the
experiment was run. I don't think that is possible with
Bio::SeqIO::abi (correct me if I'm wrong).

cheers,
Erik Sjölund

Sam Al-Droubi | 4 Dec 2005 21:17

Bio:Seq $seq_obj->accession_number not returning accession number?

The fasta format for this sequence AF410462 from NCBI looks like this

 
>gi|17066572|gb|AF410462.1|AF410462 Mus musculus PEM homeobox (Pem) gene, promoter region and
partial cds
ATGCGTGTGGGCATGCGCTCATGCCCACTTGCTTGAGCACATGTGTGCTCACATGGACGTTAGAGGCAAC
TTTCAGGAGTTATTTTTTTCCCTTCTAACTTGAGTTCCTGGACCTCAGACTTGTATAATAGGTACTTTCC
CAACTTAAGTCTTACTGGCTCCAGGGTATCTGGTATACTCTTCTAGCCTCCAAGGGCAGCCACTCATGCT
TCTTCAGGTGTGAAGAGGTGAGCCAGATACAACGGTGGGAGGCAGTGTGCCCTCAGTGTGTAGACTCTTT
ATGCCCTTGGGGATTAGCGCCTCTAGCTGCCAGTCGGGTCTCTGGGTCCCTCCTGCTAAGGCCACTCTCG
TCATGGTTCCTCTTGTCCTGGTGAGCCATTACGACCCTCTCACTTCCTTGTGTTCTCTTCCCTGTGTTCT
CTCTCTGCTGCTGTGGCCATTCTAGCTCCCTGCACAGTCCTTCAAGCTCACCTCCTGCCTTCCGTGGACA
AGAGGAAGCACAAAGAATCATCCAGTATGTATGCTCATGGCATAAGGGGATCCTGGGGAAGGGCTGAAGC
CTGAGCCGGGCTGGTCAACAGAATCTCCCTCTCCCTAACTCCATCTCCCTCTCCTTCCCTCTTCCTCTCT
CTATCCCTCCCCCCTCTCTCCCCCCACCACCGCATGTTTTGGGTCAGCTGACTGCTCTAGCCTTGATGAG
ATATCTTCCCAGGAAGAGTTGGTGCTGACTGTACAGATTGAGTTAGAGGGAGGGAAGAAAGCTCCTGTTT
GATCACTGGAGATCTTTATGCCTAGCTACATGTCTTACCAAAGCCAGGGGAGTCAGCTGAGCTGTAACTG
GGCACCCTAAGTTCTGCACACCCACATGCCCATGAACTGTGTCCATCTTGCAAGCACATCGTGCTCATTA
CATCCCCAAACTGCTATCACTTGTGTACCCCAAAGGCTCGGCCCACAGGAACGTCCTGTGAGCAAATCAC
AAAGACCAGCTTAGGGCTGGAAACATTGTAACCTGAAGTAGGCCAGAGGAGATCCCTGCCAGGTTGAGCA
TCACAGATCTCATTCTGTTCCCGGGGACACCAGGGGCCCAAGCTCAGAATCTGCCGAAGCATAACTTCAT
CATTGATCCTATTCAGGGTATGGAAGCTGAGGGTTCCAGCCGCAAGGTCACCAGGCTACTCCGCCTGGGA
GTCAAGGAAG

 When I read this from a file as a sequence object using Bio::Seq, accession_number returns "unknown", even though
 the accession number is in the header of the fasta file.  Does anyone know why this happens?

 My code looks like this:

 print "primary id is: ",$seq_obj->primary_id."\n";

Barry Moore | 4 Dec 2005 22:23

RE: Bio:Seq $seq_obj->accession_number not returning accession number?

Sam-

The fasta parser makes no attempt to parse the fasta header since there
is no standard format for what should be in a fasta header.  Parse the
accession out of the primary_id field with a regular expression in your
script or use GenBank or ENSEMBL format sequences to get all the goodies
parsed for you.  Google on "accession fasta parse site:bioperl.org" to
read other posts on this topic.

Barry

-----Original Message-----
From: bioperl-l-bounces@portal.open-bio.org
[mailto:bioperl-l-bounces@portal.open-bio.org] On Behalf Of Sam
Al-Droubi
Sent: Sunday, December 04, 2005 1:18 PM
To: BioPerl list BioPerl list
Subject: [Bioperl-l] Bio:Seq $seq_obj->accession_number not
returning accession number?

The fasta format for this sequence AF410462 from NCBI looks like this

 
>gi|17066572|gb|AF410462.1|AF410462 Mus musculus PEM homeobox (Pem)
gene, promoter region and partial cds
ATGCGTGTGGGCATGCGCTCATGCCCACTTGCTTGAGCACATGTGTGCTCACATGGACGTTAGAGGCAAC
TTTCAGGAGTTATTTTTTTCCCTTCTAACTTGAGTTCCTGGACCTCAGACTTGTATAATAGGTACTTTCC
CAACTTAAGTCTTACTGGCTCCAGGGTATCTGGTATACTCTTCTAGCCTCCAAGGGCAGCCACTCATGCT
TCTTCAGGTGTGAAGAGGTGAGCCAGATACAACGGTGGGAGGCAGTGTGCCCTCAGTGTGTAGACTCTTT
ATGCCCTTGGGGATTAGCGCCTCTAGCTGCCAGTCGGGTCTCTGGGTCCCTCCTGCTAAGGCCACTCTCG
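
A minimal sketch of the regular-expression approach Barry describes, in core Perl. It assumes the id has the NCBI-style layout shown in the header above (gi|number|database|accession.version|locus); other fasta sources use other layouts, so the pattern would need adjusting for them. The helper name parse_ncbi_id is hypothetical and not a BioPerl function.

```perl
use strict;
use warnings;

# Hypothetical helper: split an NCBI-style id such as
# gi|17066572|gb|AF410462.1|AF410462 into its parts.
# Returns an empty list if the id does not match that layout.
sub parse_ncbi_id {
    my ($id) = @_;
    if ($id =~ /^gi\|(\d+)\|(\w+)\|([^|.]+)(?:\.(\d+))?\|/) {
        return (gi => $1, db => $2, accession => $3, version => $4);
    }
    return;
}

my %parts = parse_ncbi_id('gi|17066572|gb|AF410462.1|AF410462');
print "accession: $parts{accession}\n" if %parts;
```

In a BioPerl script, the string passed in would typically come from the sequence object's display id, and the extracted accession could then be stored back with $seq_obj->accession_number($parts{accession}).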

Jason Stajich | 4 Dec 2005 22:49

Re: Bio:Seq $seq_obj->accession_number not returning accession number?

Sam -
Yeah what Barry said.

The accession number doesn't get set when reading fasta files (see
Hilmar's link below for more info). All the info is in the display id,
available via $seq->display_id:

my ($gi,$acc,$locus);
(undef,$gi,undef,$acc,$locus) = split(/\|/,$seq->display_id);
$seq->accession_number($acc);

I thought there was a function already to do this for you, but I  
guess not.  There is something in the Search::Hit objects to parse the  
accession number, so maybe we can consolidate this if someone volunteers.

See also Hilmar's response about this:
http://bioperl.org/pipermail/bioperl-l/2005-August/019579.html

I've added it as a Q&A to the new wiki FAQ which we'll roll out soon.

-jason

On Dec 4, 2005, at 4:23 PM, Barry Moore wrote:

> Sam-
>
> The fasta parser makes no attempt to parse the fasta header since  
> there
> is no standard format for what should be in a fasta header.  Parse the
> accession out of the primary_id field with a regular expression in  

Angshu Kar | 5 Dec 2005 02:32

parsing a BLAST output

Hi,

To begin with, I'm new to Bioperl.
Now, I've written the following simple piece of code to parse a WU-Blast
output which filters data *for a given e-value and >50% overlap*.

I'm writing the main algorithm here:

my $blast_report = $ARGV[1];
my $threshold_evalue = $ARGV[2];

my $in = new Bio::SearchIO(-format => 'blast', -file => $blast_report);

while (my $result = $in -> next_result)
   {
      while(my $hit = $result->next_hit)
         {
            if(($line{$hit->name} == $line{$result->query_accession}))
               {
                  next;
               }
            if($hit->hsp->evalue <= $threshold_evalue)
               {
                  if($hit->hsp->frac_identical>=0.5)
                     {
                        print $line{$result->query_accession} . "\t" .
                              $line{$hit->name} . "\t" . $hit->hsp->evalue . "\n";
                    }
              }
      }

