Paola Bisignano | 1 Sep 14:20

help parsing msf file or clustalW file reports

Hi, 

I'm trying to parse FASTA files that contain a couple of alignments. I need to identify my residues in the
alignment. I have separate lists of residues derived from parsing LIGPLOT output, so I have to manipulate
strings, but I don't know how to start; it seems complicated.
I used Bio::AlignIO to parse the FASTA file, so I can write the alignment out in msf or ClustalW format.

Here is an example:
CLUSTAL W(1.81) multiple sequence alignment

Sequence/9-273         DKWEMERTDITMKHKLGGGQYGEVYEGVWKKYSLTVAVKTLKEDTMEVEEFLKEAAVMKE
2pl0:A/6-268           DEWEVPRETLKLVERLGAGQFGEVWMGYYNGHT-KVAVKSLKQGSMSPDAFLAEANLMKQ
                       *:**: *  :.: .:**.**:***: * :: :: .****:**:.:*. : ** ** :**:

Sequence/9-273         IKHPNLVQLLGVCTREPPFYIITEFMTYGNLLDYLRECNRQEVSAVVLLYMATQISSAME
2pl0:A/6-268           LQHQRLVRLYAVVTQEP-IYIITEYMENGSLVDFLKTPSGIKLTINKLLDMAAQIAEGMA
                       ::* .**:* .* *:** :*****:*  *.*:*:*:  .  :::   ** **:**:..*

Say I choose two residues, for example. How can I extract them, starting from their positions in the PDB file?
I need to walk along my sequence to find them.

I don't know if this is clear, because I can't explain the question well in English. Are there any Italians here?
Could anyone help me?

      
Scott Cain | 1 Sep 15:21

GMOD Chado perl modules moving to the Bio namespace

Hello all,

I just wanted to send out a general announcement about a change that  
is coming for perl modules that are distributed with the gmod/chado  
package.  There are some modules, notably Class::DBI classes that are  
automatically generated, that are currently in the Chado namespace.   
This move has been requested by the CPAN maintainers.  So any  
Chado::*  modules will become Bio::Chado::*, except for the Class::DBI  
classes, which will become Bio::Chado::CDBI::*.

This will probably affect relatively few users, though ModWare in its  
current incarnation will need to be updated.

Scott

-----------------------------------------------------------------------
Scott Cain, Ph. D. scott at scottcain dot net
GMOD Coordinator (http://gmod.org/) 216-392-3087
Ontario Institute for Cancer Research

Peter | 1 Sep 17:33

Re: Next-Gen and the next point release - updates

On Thu, Aug 27, 2009 at 12:55 PM, Peter wrote:
>> The two conversions to solexa are still failing.  I'm not sure but I think
>> it's something fairly simple, but I can't work on it until Friday (got too
>> many other things on my plate ATM).  If I get stumped I'll post a message.
>
> ...
>
> This should narrow it down - the bug is in mapping PHRED
> scores (from either Sanger or Illumina 1.3+ files) to the
> Solexa encoding.
>
> Peter

Hi Chris,

I've just noticed BioPerl is treating invalid characters in the quality
string as a warning condition (not an error):
http://lists.open-bio.org/pipermail/open-bio-l/2009-September/000568.html

It seems for fastq-sanger and fastq-illumina, these get given PHRED 0
(character "!" or "@" respectively) which is reasonable. For fastq-solexa
to fastq-solexa however, Solexa -5 (ASCII 59, character ";") does not get
used - a bug?
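
To make the encodings above concrete, here is a minimal Perl sketch (not BioPerl's actual code) of the usual PHRED to Solexa mapping, Q_solexa = 10*log10(10^(Q_phred/10) - 1). The helper names phred_to_solexa and quality_char are made up for illustration. PHRED 0 has no finite Solexa equivalent, so the sketch assumes clamping to the Solexa minimum of -5, which is the ";" character mentioned above.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# ASCII offsets: fastq-sanger stores PHRED + 33, fastq-illumina (1.3+)
# stores PHRED + 64, and fastq-solexa stores the Solexa score + 64.
my %OFFSET     = (sanger => 33, illumina => 64, solexa => 64);
my $SOLEXA_MIN = -5;

# Hypothetical helper: convert a PHRED score to the nearest Solexa score.
sub phred_to_solexa {
    my ($phred) = @_;
    return $SOLEXA_MIN if $phred <= 0;    # log10(0) is undefined, so clamp
    my $solexa  = 10 * log(10 ** ($phred / 10) - 1) / log(10);
    my $rounded = floor($solexa + 0.5);   # round to the nearest integer
    return $rounded < $SOLEXA_MIN ? $SOLEXA_MIN : $rounded;
}

# Hypothetical helper: quality character for a score in a given variant.
sub quality_char {
    my ($score, $variant) = @_;
    return chr($score + $OFFSET{$variant});
}

for my $phred (0, 1, 10, 40) {
    my $solexa = phred_to_solexa($phred);
    printf "PHRED %2d -> Solexa %2d (%s)\n",
        $phred, $solexa, quality_char($solexa, 'solexa');
}
```

With this mapping, low PHRED scores collapse onto Solexa -5, while scores of about 10 and above are numerically almost identical in the two scales.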

Also, in all these cases there is currently a spurious "data loss" warning:

$ ./bioperl_sanger2sanger.pl < error_qual_null.fastq

--------------------- WARNING ---------------------
MSG: Unknown symbol with ASCII value 0 outside of quality range,

Jason Stajich | 1 Sep 17:49

Re: help parsing msf file or clustalW file reports

I think you might want to use the column_from_residue_number method
that is part of Bio::SimpleAlign - it gives you the alignment column
for a given residue number of a sequence, doing some math along the way
to deal with gaps. That is the residue -> alignment direction.  If you
are starting from the alignment and want to get the residue's position,
use location_from_column on a particular sequence, like so:

     # select a sequence from the alignment somehow, e.g. the first one
     my $seq = $aln->get_seq_by_pos(1);
     # $loc is undef or a Bio::LocationI object
     my $loc = $seq->location_from_column(5);
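
The "math to deal with gaps" can be sketched in plain Perl without BioPerl. This is only an illustration of the idea, not the Bio::SimpleAlign implementation: the two helper names are hypothetical, the strings are the first block of the example alignment (a real alignment would use each full gapped sequence), and it assumes the start numbers in the names (6 for 2pl0:A/6-268, 9 for Sequence/9-273) match the PDB numbering with no missing residues or insertion codes.

```perl
use strict;
use warnings;

# Residue number -> 1-based alignment column, skipping gap characters.
# $start is the residue number of the first non-gap character.
sub column_from_residue {
    my ($aligned, $start, $resnum) = @_;
    my $current = $start - 1;
    my $col     = 0;
    for my $char (split //, $aligned) {
        $col++;
        next if $char eq '-' or $char eq '.';    # gaps carry no residue
        $current++;
        return $col if $current == $resnum;
    }
    return;    # residue is not in this aligned string
}

# Alignment column -> residue number, or undef if the column is a gap.
sub residue_from_column {
    my ($aligned, $start, $col) = @_;
    return if $col < 1 or $col > length $aligned;
    return if substr($aligned, $col - 1, 1) =~ /[-.]/;
    (my $ungapped = substr($aligned, 0, $col)) =~ s/[-.]//g;
    return $start + length($ungapped) - 1;
}

my $pdb_seq = 'DEWEVPRETLKLVERLGAGQFGEVWMGYYNGHT-KVAVKSLKQGSMSPDAFLAEANLMKQ';
my $query   = 'DKWEMERTDITMKHKLGGGQYGEVYEGVWKKYSLTVAVKTLKEDTMEVEEFLKEAAVMKE';

# Walk from residue 39 of 2pl0:A to its column, then to the other sequence.
my $col     = column_from_residue($pdb_seq, 6, 39);      # column 35
my $partner = residue_from_column($query, 9, $col);      # residue 43
print "2pl0:A residue 39 -> column $col -> Sequence residue $partner\n";
```

The gap at column 34 of 2pl0:A is why residue 39 lands in column 35 rather than 34.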

-jason

On Sep 1, 2009, at 5:20 AM, Paola Bisignano wrote:

> Hi,
>
> I'm trying to parse FASTA files that contain a couple of
> alignments. I need to identify my residues in the alignment. I
> have separate lists of residues derived from parsing LIGPLOT output, so I
> have to manipulate strings, but I don't know how to start; it seems
> complicated.
> I used Bio::AlignIO to parse the FASTA file, so I can write the
> alignment out in msf or ClustalW format.
>
> here an example:
> CLUSTAL W(1.81) multiple sequence alignment
>
>

Chris Fields | 1 Sep 18:05

Re: Next-Gen and the next point release - updates


On Sep 1, 2009, at 10:33 AM, Peter wrote:

> On Thu, Aug 27, 2009 at 12:55 PM, Peter wrote:
>>> The two conversions to solexa are still failing.  I'm not sure but I think
>>> it's something fairly simple, but I can't work on it until Friday (got too
>>> many other things on my plate ATM).  If I get stumped I'll post a message.
>>
>> ...
>>
>> This should narrow it down - the bug is in mapping PHRED
>> scores (from either Sanger or Illumina 1.3+ files) to the
>> Solexa encoding.
>>
>> Peter
>
> Hi Chris,
>
> I've just noticed BioPerl is treating invalid characters in the quality
> string as a warning condition (not an error):
> http://lists.open-bio.org/pipermail/open-bio-l/2009-September/000568.html
>
> It seems for fastq-sanger and fastq-illumina, these get given PHRED 0
> (character "!" or "@" respectively) which is reasonable. For fastq-solexa
> to fastq-solexa however, Solexa -5 (ASCII 59, character ";") does

Marcelo Iwata | 1 Sep 19:33

remove overlapped sequences from Blastn results

Hi

I ran blastn with these arguments:

../bin/blastall -p blastn -d DBBank -i myFasta.FASTA.txt  -e 0.00001 -o Out2Blast.txt -a 8

and I want a script that removes overlapping sequences from the results.
For example, if unigene A has hit->start and hit->end of 1 and 4, and
B is at 2 and 3, respectively, the script removes the second one.

I want to know whether this already exists and, if not, whether there is a
library that handles this kind of problem.

I know that Bio::DB::GFF has overlapping_features. But if something
exists that works directly with the BLAST format, that is better for me.

thanks in advance

Chris Fields | 1 Sep 20:10

Re: remove overlapped sequences from Blastn results

Marcelo,

Do you mean tiling?  See:

http://www.bioperl.org/wiki/HOWTO:Tiling
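
If the goal is only what the example in the question describes (drop B at 2-3 because it lies inside A at 1-4), a plain containment filter may be enough. The sketch below is not the tiling code from the HOWTO. It assumes all hits are on the same subject sequence, and the hash keys name, start and end are hypothetical; in practice the coordinates could come from Bio::SearchIO HSPs via $hsp->start('hit') and $hsp->end('hit'). Partially overlapping hits are kept.

```perl
use strict;
use warnings;

# Drop every hit whose range lies entirely inside another hit's range.
sub remove_contained {
    # Sort by start ascending; on ties put the longer hit first.
    my @hits = sort {
        $a->{start} <=> $b->{start} || $b->{end} <=> $a->{end}
    } @_;

    my @kept;
    my $max_end;    # largest end coordinate seen so far
    for my $hit (@hits) {
        # Every earlier hit starts at or before this one, so this hit is
        # contained in one of them exactly when its end is <= $max_end.
        next if defined $max_end and $hit->{end} <= $max_end;
        push @kept, $hit;
        $max_end = $hit->{end};
    }
    return @kept;
}

my @hits = (
    { name => 'A', start => 1,  end => 4 },
    { name => 'B', start => 2,  end => 3 },     # inside A, removed
    { name => 'C', start => 3,  end => 8 },     # overlaps A only partly, kept
    { name => 'D', start => 10, end => 12 },
);

print join(',', map { $_->{name} } remove_contained(@hits)), "\n";   # A,C,D
```

Sorting first keeps the filter to a single pass after the sort, instead of comparing every pair of hits.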

chris

On Sep 1, 2009, at 12:33 PM, Marcelo Iwata wrote:

> Hi
>
> I ran blastn with these arguments:
>
> ../bin/blastall -p blastn -d DBBank -i myFasta.FASTA.txt  -e 0.00001 -o Out2Blast.txt -a 8
>
> and I want a script that removes overlapping sequences from the
> results.
> For example, if unigene A has hit->start and hit->end of 1 and 4, and
> B is at 2 and 3, respectively, the script removes the second one.
>
> I want to know whether this already exists and, if not, whether there
> is a library that handles this kind of problem.
>
> I know that Bio::DB::GFF has overlapping_features. But if
> something

Scott Cain | 1 Sep 21:47

Re: GMOD Chado perl modules moving to the Bio namespace

Hi Don,

I just wanted to let you know that I also updated the code in  
GMODTools, but I don't have a simple way to test it; perhaps you  
should take a look at the cvs diff to make sure what I did makes sense.

Thanks,
Scott

On Sep 1, 2009, at 9:21 AM, Scott Cain wrote:

> Hello all,
>
> I just wanted to send out a general announcement about a change that  
> is coming for perl modules that are distributed with the gmod/chado  
> package.  There are some modules, notably Class::DBI classes that  
> are automatically generated, that are currently in the Chado  
> namespace.  This move has been requested by the CPAN maintainers.   
> So any Chado::*  modules will become Bio::Chado::*, except for the  
> Class::DBI classes, which will become Bio::Chado::CDBI::*.
>
> This will probably affect relatively few users, though ModWare in  
> its current incarnation will need to be updated.
>
> Scott
>
> -----------------------------------------------------------------------
> Scott Cain, Ph. D. scott at scottcain dot net
> GMOD Coordinator (http://gmod.org/) 216-392-3087
> Ontario Institute for Cancer Research

Mark A. Jensen | 2 Sep 06:19

bioperl invades emacs

Hi All, 

As part of the Documentation Project, I've written a full-fledged
minor mode for emacs, bioperl-mode. It allows
the user to access BP pod while coding, using keyboard
shortcuts or menus. Pod pops up in a new view buffer,
which is itself active for quick pod searching. You can
get the whole pod, pieces of pod, or even the pod headers
of individual methods.

The best feature (IMHO) is the completion facility. This
not only saves typing, but allows browsing and follow-your-nose
programming (exactly the technique I used to make bioperl-mode,
thanks to the Extensible Self-Documenting Editor).

It's very easy to install, requires only one additional line
in your .emacs file, and directly infects perl-mode
(if you so choose) so it's available whenever you
open .pl or .pm files.

For details, screenshots, download and install info,
and soporific design details, see
http://www.bioperl.org/wiki/Emacs_bioperl-mode

Send me the bugs!
cheers, 
MAJ

Robert Buels | 2 Sep 06:31

Re: bioperl invades emacs

Wow.  Bravo!

Rob
