1 Jul 2009 03:41
Re: Parsing a FASTA file (Was: Bioperl-l Digest, Vol 74, Issue 25)
Mark A. Jensen <maj <at> fortinbras.us>
2009-07-01 01:41:16 GMT
Hi Paola,
You want to try Bio::SearchIO, I think. It's not quite clear what you
want to do, but here's an example of what you can do:
Get all high-scoring pairs (the mini-alignments) involving
the database sequence called "2ojg:A":
use Bio::SearchIO;
my $io = Bio::SearchIO->new(-format=>'fasta', -file=>'yourfile.fasta');
my $result = $io->next_result;
my @desired_hsps;
while ( my $hit = $result->next_hit ) {
    push @desired_hsps, grep { $_->subject->seq_id =~ /2ojg:A/ } $hit->hsps;
}
# now all your desired hsps are in the array @desired_hsps;
# you can get Bio::SimpleAlign objects from them all, for example:
my @aligns = map { $_->get_aln } @desired_hsps;
#...and lots of other things...
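As one sketch of those "other things", something along these lines should print a tab-separated summary line for each matching HSP. It assumes 'yourfile.fasta' holds FASTA search output, as in the snippet above. The choice and order of columns are only an illustration, and unlike the snippet above it walks every result in the file, not just the first:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::SearchIO;

# 'yourfile.fasta' is the placeholder name from the example above;
# it should hold FASTA search output, not plain FASTA sequences.
my $io = Bio::SearchIO->new(-format => 'fasta', -file => 'yourfile.fasta');

while ( my $result = $io->next_result ) {
    while ( my $hit = $result->next_hit ) {
        # same filter as above: keep only HSPs against "2ojg:A"
        for my $hsp ( grep { $_->subject->seq_id =~ /2ojg:A/ } $hit->hsps ) {
            # column choice and order are arbitrary, for illustration only
            print join( "\t",
                $result->query_name,
                $hit->name,
                $hsp->start('hit'),
                $hsp->end('hit'),
                $hsp->length('total'),
                sprintf( '%.1f', $hsp->percent_identity ),
                $hsp->evalue,
            ), "\n";
        }
    }
}
```

If your report gives no E-value or identity count for some HSPs, expect "uninitialized value" warnings on those columns, so check the output against the raw report before trusting it.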
Look at http://www.bioperl.org/wiki/HOWTO:SearchIO#Using_SearchIO
and http://www.bioperl.org/wiki/HOWTO:SearchIO#Using_the_methods
for a nice introduction to the Bio::SearchIO system by its authors. They
use a blast output as an example, but everything applies to fasta output
as well.
You didn't waste your time writing regexps, by the way. For a Perl