Brad Chapman | 1 May 14:11 2009

Re: MUMmer

Marcin;

> I guess I should start with a nice 'hi' to everybody, now that I am
> sending my first message to this group. So: Hi, Everybody! 

Welcome. We are happy to have you.

> Now that we have the formality out of the way, I will get to the point.
> Recently, I have written some Python code for parsing and processing the
> output of the MUMmer tool (http://mummer.sourceforge.net/). More
> specifically, the code I have manages invocations and handles outputs of
> the nucmer pipeline (alignment of multiple closely related nucleotide
> sequences) and of mummer itself (short exact matches). Obviously, the
> results are ultimately rendered as pairs of biopython's Seq objects. 

This is great -- we don't have support for MUMmer alignments so this
is very welcome.

> I use this stuff only myself, in work on bacterial genomes, but I would
> be more than willing to contribute it to the project. It may be rough
> around the edges at the moment, but I think I could easily give it the
> necessary polish if there is interest in having it included. 

As Bartek mentioned, the first step is to organize the code you have
and start it as a branch on GitHub. Being able to see the code will
help us make specific suggestions. Generally, based on what you've
written it sounds like this will fit into the alignment interfaces.
Peter and Cymon have been working on organizing this. Support for
command lines and running programs lives in:

[...]

Brad Chapman | 1 May 14:28 2009

Re: XML parsing library for new modules

Eric;
Thanks for summarizing the issues. I know Peter is taking a few well
deserved days off but I suspect he will have some thoughts when he
returns. We'd love to hear the experience of others who have used
different Python XML parsers.

My lean is towards ElementTree for reasons of code clarity. SAX
parsers require a lot of boilerplate style code. They also can be
tricky with nested elements; I always find myself using a lot of "if
in_tag; else if in_tag" style code. ElementTree eliminates a lot of
these issues which should result in easier to maintain code.
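To make the comparison concrete, here is a minimal sketch that extracts the same values with both libraries. The toy document and the names `NameHandler`, `names_with_sax` and `names_with_elementtree` are made up for illustration and are not taken from any Biopython module.

```python
import xml.sax
import xml.etree.ElementTree as ET

# Toy document, invented for this sketch.
XML = b"<phylogeny><clade><name>A</name></clade><clade><name>B</name></clade></phylogeny>"


class NameHandler(xml.sax.ContentHandler):
    """SAX style: track parser state by hand with an in_name flag."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.in_name = False
        self.names = []
        self._buffer = []

    def startElement(self, tag, attrs):
        if tag == "name":
            self.in_name = True
            self._buffer = []

    def characters(self, content):
        # Text can arrive in several chunks, so collect it while inside <name>.
        if self.in_name:
            self._buffer.append(content)

    def endElement(self, tag):
        if tag == "name":
            self.names.append("".join(self._buffer))
            self.in_name = False


def names_with_sax(data):
    handler = NameHandler()
    xml.sax.parseString(data, handler)
    return handler.names


def names_with_elementtree(data):
    # ElementTree style: build the tree, then ask for the elements directly.
    root = ET.fromstring(data)
    return [elem.text for elem in root.iter("name")]
```

Both functions return the same list for the toy input; the difference is the amount of state tracking the SAX handler needs to get there.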

Brad

> I'm writing a parser for the PhyloXML format for Google Summer of Code this
> year, and as the name would imply, it requires parsing some large XML files.
> The existing modules in Biopython for parsing XML formats seem to use
> xml.sax in the standard library. In Python 2.5, a faster and more Pythonic
> parser was added to the standard lib: ElementTree (xml.etree), in
> pure-Python and C-enhanced flavors. How do you feel about each of these
> libraries as the basis for a new Biopython module?
> 
> Here are some interesting benchmarks:
> http://effbot.org/zone/celementtree.htm#benchmarks
> 
> The ElementTree library is also available as a standalone package,
> compatible back to Python 2.1, and the lxml package also offers an
> independent implementation. So maintaining compatibility with Python 2.4
> would require the availability of one of these third-party packages, and my
> code would try each of these imports in order:
[...]

Marcin Swiatek | 1 May 20:17 2009

Re: MUMmer

Bartek, Brad,

Thank you for the suggestions. I will set myself up as proposed and see
what I can do to align my code with local customs and traditions. If
questions arise, I will post again. 

As for the use of an alignment object, I have actually chosen to represent
'candidate' matches by my own simplistic class. Nucmer, the way I use
it, generates lots of spurious matches, which I always need to somehow
filter. Thus, it seemed perfectly reasonable at the time to create the
proper representation of alignment later on, in a separate function
call. Following your suggestion, I will probably change it to return an
alignment object, rather than a pair of sequences. But details are best
discussed once the code is available, so I think we will return to this
matter later. 

Regards,

Marcin

-----Original Message-----
From: barwil <at> gmail.com [mailto:barwil <at> gmail.com] On Behalf Of Bartek
Wilczynski
Sent: Thursday, April 30, 2009 12:51 PM
To: Marcin Swiatek
Cc: biopython-dev <at> biopython.org
Subject: Re: [Biopython-dev] MUMmer

Hi Marcin,

[...]

bugzilla-daemon | 1 May 20:16 2009

[Bug 2820] Convert test_PDB.py to unittest

http://bugzilla.open-bio.org/show_bug.cgi?id=2820

------- Comment #8 from eric.talevich <at> gmail.com  2009-05-01 14:16 EST -------
(In reply to comment #7)
> (In reply to comment #2)
> > Python 2.6 includes a context manager that makes all these problems
> > *completely* go away, by catching all of the warnings raised within a
> > context and optionally storing them as a list of warning objects that
> > can be inspected.
> 
> That sounds much better :)
> 
> > Would you be interested in having a unit test that does a more thorough
> > check of the warnings system, but only runs on Py2.6? I'm guessing no,
> > but hey, worth a shot.
> 
> Yes - other than using the old print-and-compare test, this seems worth doing
> in order to actually test the warnings we expect are being issued.  It could be
> a whole new file, test_PDB_warnings.py, which requires Python 2.6+, but as it's
> just one or two tests, maybe just use conditional method(s) within the
> test_PDB_unit.py file.
> 
> Peter
> 

I have something that works on both Py2.5 and Py2.6 now:
http://github.com/etal/biopython/tree/pdbtidy

I added a new file called _PDB_extra.py which test_PDB_unit.py imports if an
attribute called 'catch_warnings' is available in the current warnings module.
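For readers unfamiliar with the Python 2.6 context manager, a minimal sketch of the idea follows. It is not the code from the pdbtidy branch; `collect_warnings` and `noisy_parse` are hypothetical names used only for illustration.

```python
import warnings

# The availability check described above: catch_warnings exists from Python 2.6.
HAVE_CATCH_WARNINGS = hasattr(warnings, "catch_warnings")


def collect_warnings(func, *args, **kwargs):
    """Call func and return (result, list of warning records raised inside it)."""
    with warnings.catch_warnings(record=True) as caught:
        # Make sure no warning is hidden by an existing filter.
        warnings.simplefilter("always")
        result = func(*args, **kwargs)
    return result, caught


def noisy_parse(line):
    """Hypothetical parser step that warns about blank input."""
    if not line.strip():
        warnings.warn("blank line ignored", UserWarning)
        return None
    return line.strip()
```

A test can then inspect the recorded objects, for example by checking `caught[0].category` and `str(caught[0].message)`, instead of comparing printed output.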
[...]

bugzilla-daemon | 4 May 12:57 2009

[Bug 2822] Bio.Application.AbstractCommandline - properties and kwargs

http://bugzilla.open-bio.org/show_bug.cgi?id=2822

biopython-bugzilla <at> maubp.freeserve.co.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
Attachment #1288 is|0                           |1
           obsolete|                            |

------- Comment #4 from biopython-bugzilla <at> maubp.freeserve.co.uk  2009-05-04 06:57 EST -------
Created an attachment (id=1289)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=1289&action=view)
Patch to add keyword arguments and properties to command line wrappers

Brad likes the idea, and as the Bio.Application module owner that's good :)
http://lists.open-bio.org/pipermail/biopython-dev/2009-April/005963.html

This patch makes a very slight difference to reduce the changes needed to old
code (i.e. in the __init__ method use self.parameters = [...] as before) with
the bonus that the base class and subclasses have the same __init__ signature
(argument list).

This patch also now covers Bio.Align.Applications, Bio.Motif.Applications and
Bio.AlignAce.Applications as well as Bio.Emboss.Applications (i.e. all affected
files).
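As a rough illustration of the idea (keyword arguments in `__init__` plus one property per option), here is a minimal sketch. It is not the attached patch and not the Bio.Application API; `ToyCommandline`, `_option`, the program name and the switch names are all hypothetical.

```python
def _option(switch):
    """Build a property that stores the value for one command line switch."""

    def getter(self):
        return self._values.get(switch)

    def setter(self, value):
        self._values[switch] = value

    return property(getter, setter)


class ToyCommandline(object):
    """Hypothetical wrapper; the options below are invented for this sketch."""

    outfile = _option("-out")
    gapopen = _option("-gapopen")

    def __init__(self, cmd, **kwargs):
        self.cmd = cmd
        self._values = {}
        for name, value in kwargs.items():
            # Only accept keywords that correspond to a declared property.
            if not isinstance(getattr(type(self), name, None), property):
                raise ValueError("Option name %s was not found." % name)
            setattr(self, name, value)

    def __str__(self):
        # Sort the switches so the generated command line is reproducible.
        parts = [self.cmd]
        for switch in sorted(self._values):
            parts.extend([switch, str(self._values[switch])])
        return " ".join(parts)
```

With this sketch, `ToyCommandline("toyalign", outfile="a.aln")` followed by `cmd.gapopen = 10` builds the same state as a series of explicit set-parameter calls would.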

Peter

-- 
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email

Peter Cock | 4 May 14:02 2009

Re: MUMmer

On Thu, Apr 30, 2009 at 4:23 PM, Marcin Swiatek
<marcin.swiatek <at> mail.mcgill.ca> wrote:
> Hello,
>
> I guess I should start with a nice 'hi' to everybody, now that I am
> sending my first message to this group. So: Hi, Everybody!

Hi!

> Now that we have the formality out of the way, I will get to the point.
> Recently, I have written some Python code for parsing and processing the
> output of the MUMmer tool (http://mummer.sourceforge.net/). More
> specifically, the code I have manages invocations and handles outputs of
> the nucmer pipeline (alignment of multiple closely related nucleotide
> sequences) and of mummer itself (short exact matches). Obviously, the
> results are ultimately rendered as pairs of biopython's Seq objects.
>
> I use this stuff only myself, in work on bacterial genomes, but I would
> be more than willing to contribute it to the project. It may be rough
> around the edges at the moment, but I think I could easily give it the
> necessary polish if there is interest in having it included.

Great!  I assume you're OK with our licence, and there are no problems
from your employer/University with a contribution like this?

> Should that be the case, could one of the project leads point me in the
> right direction, please? How should I go about the submission?

In terms of showing us the code, how do you feel about trying out
github (see Bartek's email)?  Alternatively file an enhancement bug
[...]

Peter Cock | 4 May 14:15 2009

Re: XML parsing library for new modules

On Fri, May 1, 2009 at 1:28 PM, Brad Chapman <chapmanb <at> 50mail.com> wrote:
> Eric;
> Thanks for summarizing the issues. I know Peter is taking a few well
> deserved days off but I suspect he will have some thoughts when he
> returns. We'd love to hear the experience of others who have used
> different Python XML parsers.

I would be interested to hear Michiel's views on this, as he knows
more about the specifics of the existing XML parsers in Biopython
(e.g. Bio.Entrez).

> My lean is towards ElementTree for reasons of code clarity. SAX
> parsers require a lot of boilerplate style code. They also can be
> tricky with nested elements; I always find myself using a lot of "if
> in_tag; else if in_tag" style code. ElementTree eliminates a lot of
> these issues which should result in easier to maintain code.

We have been trying to avoid external library dependencies where
possible (moving away from Martel for parsing has really helped here).
Given ElementTree and cElementTree are included with Python 2.5+,
this is only an issue for Biopython running on Python 2.4.  Both
ElementTree and cElementTree are available as separate downloads
(with Windows installers).  I think under their licence we could even
bundle it with Biopython if need be.

So, while it is a shame ElementTree isn't part of Python 2.4, if it is
the best technical solution, that shouldn't stop us from using it.  Note
we should ONLY use those core features which are included with
Python 2.5+ itself.
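One way to handle this is a chain of fallback imports, sketched below. The order shown is an assumption for illustration, not necessarily the order Eric proposed; `cElementTree` and `elementtree` are the module names of the standalone packages.

```python
try:
    # C implementation bundled with the standard library
    # (Python 2.5 onwards; removed again in Python 3.9).
    from xml.etree import cElementTree as ElementTree
except ImportError:
    try:
        # Pure Python implementation bundled with Python 2.5+.
        from xml.etree import ElementTree
    except ImportError:
        try:
            # Standalone C package, e.g. for Python 2.4.
            import cElementTree as ElementTree
        except ImportError:
            # Standalone pure Python package; let ImportError propagate
            # if nothing at all is installed.
            from elementtree import ElementTree

# Whichever import succeeded, the same core API is used from here on.
root = ElementTree.fromstring("<a><b/></a>")
```

Sticking to calls such as `fromstring`, `parse` and `find` keeps the code independent of which implementation was picked up.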

[...]

bugzilla-daemon | 4 May 15:47 2009

[Bug 2822] Bio.Application.AbstractCommandline - properties and kwargs

http://bugzilla.open-bio.org/show_bug.cgi?id=2822

biopython-bugzilla <at> maubp.freeserve.co.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
Attachment #1289 is|0                           |1
           obsolete|                            |

------- Comment #5 from biopython-bugzilla <at> maubp.freeserve.co.uk  2009-05-04 09:47 EST -------
Created an attachment (id=1290)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=1290&action=view)
Patch to add keyword arguments, properties and __repr__ to command line
wrappers

Extended to include __repr__ support (using the new keyword arguments support).

Note that the Muscle wrapper will need an alternative valid Python identifier
for the -in argument, e.g. "input", because "in" is a reserved word in Python
and cannot be used as a property or keyword argument name.
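The restriction can be checked with the standard `keyword` module. The sketch below is only an illustration; the `ALIASES` table and the `python_name` helper are hypothetical and not part of the patch.

```python
import keyword

# Hypothetical alias table: reserved switch name -> Python-safe name.
ALIASES = {"in": "input"}


def python_name(switch):
    """Map a command line switch such as "-in" to a usable Python identifier."""
    name = switch.lstrip("-")
    if keyword.iskeyword(name):
        # "cmd.in = ..." or "Cmd(in=...)" would be a syntax error,
        # so a reserved word has to be replaced by an alias.
        return ALIASES[name]
    return name
```

Here `python_name("-in")` gives `"input"`, while an ordinary switch such as `"-out"` keeps its own name.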

-- 
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

bugzilla-daemon | 4 May 16:07 2009

[Bug 2822] Bio.Application.AbstractCommandline - properties and kwargs

http://bugzilla.open-bio.org/show_bug.cgi?id=2822

biopython-bugzilla <at> maubp.freeserve.co.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
Attachment #1290 is|0                           |1
           obsolete|                            |

------- Comment #6 from biopython-bugzilla <at> maubp.freeserve.co.uk  2009-05-04 10:07 EST -------
Created an attachment (id=1291)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=1291&action=view)
Patch to add keyword arguments, properties and __repr__ to command line
wrappers

As in previous patch but with support for clearing parameters by "deleting" the
property, and some basic doctests in Bio.Application.
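A minimal sketch of what "deleting" a property to clear a parameter can look like, together with a small doctest. This is an illustration under assumed names (`ToyOption`, `gapopen`), not the code in the attachment.

```python
import doctest


class ToyOption(object):
    """Hypothetical wrapper holding one optional parameter.

    >>> opt = ToyOption()
    >>> opt.gapopen = 10
    >>> opt.gapopen
    10
    >>> del opt.gapopen
    >>> opt.gapopen is None
    True
    """

    def __init__(self):
        self._gapopen = None

    @property
    def gapopen(self):
        return self._gapopen

    @gapopen.setter
    def gapopen(self, value):
        self._gapopen = value

    @gapopen.deleter
    def gapopen(self):
        # "Deleting" the property resets the parameter to unset
        # rather than removing the attribute from the object.
        self._gapopen = None


if __name__ == "__main__":
    # Run the docstring example above as a test.
    doctest.testmod()
```

The decorator form of `setter` and `deleter` needs Python 2.6 or later; on older versions the same effect is available by passing the three functions to `property()` directly.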

Still need to co-ordinate with Cymon to give the Muscle wrapper a valid Python
identifier as an alias for the -in argument, e.g. "input", because "in" is a
reserved word in Python and cannot be used as a property or keyword argument name.

Peter | 4 May 16:48 2009

Re: Properties in Bio.Application interface?

On Thu, Apr 30, 2009 at 1:05 PM, Brad Chapman <chapmanb <at> 50mail.com> wrote:
> I love what you are doing here. The keywords and properties make
> it much more Pythonic; the old way reeks of Java-style get/sets. My
> vote is to put them both in.

Cool - I was hoping people would agree it is more Pythonic.

I have some follow-up thoughts, or points for discussion ...

Peter
