Lucas van Dijk | 26 Feb 23:30 2015

[Biopython] GSoC 2015: Interactive visualisations for GenomeDiagram

Hi all!

I don't know if you guys already have a contender, but I'm really interested in taking part in the Google Summer of Code this year, especially in the interactive visualisations for GenomeDiagram project!

A little bit about myself:
- I'm 24 years old and from the Netherlands, currently studying at Delft University of Technology.
- Electrical (Computer) Engineering background
- Started the Bioinformatics master's programme this year
- Very experienced in Python, and I have also done a lot of webdev work, so JavaScript is no problem.
- Currently have a small side job (1 day/week) as a Python software engineer, where I create tools and web services to visualise GIS-related data.

Why this project:
- I'm really excited about the field of bioinformatics, and I want to get more experience with the existing toolkits
- I like making visualisations
- This looks like something I can handle :)

About the project:
You mention the library Bokeh, which looks very awesome, but it may depend on a lot of other libraries which are not necessary for the rest of Biopython. Some possibilities:
- Make Bokeh and all its dependencies optional: display a nice error message if the dependencies aren't installed when trying to create a visualisation
- Put it the other way around: create a separate Python package, which depends on Biopython, for creating the visualisations.
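A minimal sketch of the first option, assuming the import is deferred until a visualisation is actually requested. The names `require_optional` and `draw_interactive` are hypothetical and not part of Biopython:

```python
import importlib


def require_optional(module_name, feature):
    """Import an optional dependency, or fail with a readable message."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{feature} needs the optional package '{module_name}'. "
            f"Install it (for example with 'pip install {module_name}') "
            "and try again."
        ) from exc


def draw_interactive(diagram):
    # Hypothetical entry point: Bokeh is imported only here, so the rest of
    # the package keeps working when Bokeh is not installed.
    bokeh = require_optional("bokeh", "Interactive GenomeDiagram output")
    return bokeh  # placeholder; the real drawing code would go here
```

With this layout, importing the package never touches Bokeh; only a call to `draw_interactive` does, and a missing install produces the friendly message instead of a bare traceback.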

By the way, any particular reason for the choice of Bokeh, besides looking awesome? Any thoughts on Vispy or other libraries?

Hope to hear from you guys!

With kind regards,
Lucas van Dijk
Biopython mailing list  -  Biopython <at>
Horea Chrristian | 25 Feb 17:03 2015

[Biopython] Read sequence from file

Hi guys, how can I read a sequence from a .txt file which contains only a string of letters (nucleotides)? I tried `"my/file","...")`, but if my second value is fasta or genbank it complains about missing handles, and nothing like "plain", "string", or "str" worked. What can I do? It would be nice if I could do this with a one-liner rather than reading the file explicitly with Python and then parsing it explicitly.
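For reference, a plain file with no header line can be read with a few lines of standard Python. This is only a sketch, assuming the file holds nothing but sequence letters and whitespace; `read_plain_sequence` is a hypothetical helper name:

```python
def read_plain_sequence(path):
    """Return the letters of a plain-text sequence file as one string."""
    with open(path) as handle:
        # split() with no arguments drops spaces, tabs, and line breaks,
        # so a sequence wrapped over several lines is joined back together.
        return "".join(handle.read().split())
```

If a Biopython object is needed, the returned string can be wrapped with `Seq(read_plain_sequence("my/file"))` after `from Bio.Seq import Seq`.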

Biopython mailing list  -  Biopython <at>
PC | 24 Feb 23:45 2015

[Biopython] adding multiple models to one PDB


I have the following PDB's


I can align B, C, and D with A, but I want to write ALL four structures to one file to open in a viewer like PyMOL.

	io = Bio.PDB.PDBIO()

This writes one file; on the following iterations I want to append the other PDBs to moved.pdb.
How can I do that in Biopython?
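One possible approach, sketched here without Bio.PDB, is to wrap the coordinate records of each already-aligned file in its own MODEL/ENDMDL block. It assumes each input file was written out after superimposition and that the viewer treats models as separate states; `merge_as_models` and the file names in the usage note are hypothetical:

```python
def merge_as_models(pdb_paths, out_path):
    """Write each input PDB file as one MODEL/ENDMDL block of out_path."""
    coordinate_records = ("ATOM  ", "HETATM", "TER", "ANISOU")
    with open(out_path, "w") as out:
        for serial, path in enumerate(pdb_paths, start=1):
            # MODEL record: the serial number is right-justified in
            # columns 11-14.
            out.write(f"MODEL     {serial:>4}\n")
            with open(path) as handle:
                for line in handle:
                    # Keep only coordinate records; headers and END lines
                    # from the inputs are dropped.
                    if line.startswith(coordinate_records):
                        out.write(line if line.endswith("\n") else line + "\n")
            out.write("ENDMDL\n")
        out.write("END\n")
```

For example, `merge_as_models(["A.pdb", "B_moved.pdb", "C_moved.pdb", "D_moved.pdb"], "all.pdb")` would produce one file with four models. If the structures should instead appear side by side as one object, giving each a distinct chain ID may be the better route.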

Thank you,


Biopython mailing list  -  Biopython <at>

Alan | 24 Feb 15:36 2015

Hi there, is nice but not practical for me anymore.

I am wondering if there are python tools/scripts that could help me to achieve similar results.

To start, I basically have two sets of taxa, one at phylum level and another at species level, and I just want a kind of histogram of species per phylum as seen in e.g. Then I will extend it to include other data.
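As a starting point, the counting step needs nothing beyond the standard library. The sketch below assumes the two sets can be joined into a species-to-phylum mapping; the function names and the toy data in the usage note are illustrative only:

```python
from collections import Counter


def species_per_phylum(species_to_phylum):
    """Count how many species map to each phylum."""
    return Counter(species_to_phylum.values())


def text_histogram(counts, width=40):
    """Render counts as rows of '#', scaled so the largest bar is `width` wide."""
    if not counts:
        return ""
    largest = max(counts.values())
    rows = []
    for phylum, n in counts.most_common():
        bar = "#" * max(1, round(n * width / largest))
        rows.append(f"{phylum:<16}{bar} {n}")
    return "\n".join(rows)
```

Calling `text_histogram(species_per_phylum({"Homo sapiens": "Chordata", "Mus musculus": "Chordata", "Drosophila melanogaster": "Arthropoda"}))` gives one bar per phylum; the same `Counter` can be handed to a plotting library later.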

Many thanks in advance,


Alan Wilter SOUSA da SILVA, DSc
Bioinformatician, UniProt
European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Trust Genome Campus
Cambridge CB10 1SD
United Kingdom
Tel: +44 (0)1223 494588
Biopython mailing list  -  Biopython <at>
Raoul J P Bonnal | 23 Feb 09:51 2015

[Biopython] OBF GSoC Registration

Please all possible mentors,

register yourself

and try to connect to the organization Open Bioinformatics Foundation
id: obf

If you are aware of a mentor who cannot be reached by this
message, feel free to forward it.

Ra & Fra

On 2/20/15 8:51 PM, Raoul Bonnal wrote:
> Dear All,
> I have created a special section
> I think that would be very useful to add all the references where GSoC has been used for doing science.
> please go ahead and add your project/paper/or whatever you think that google should know about our work.
> --
> Ra

Biopython mailing list  -  Biopython <at>

Juan BC | 21 Feb 21:56 2015

[Biopython] [ANN] Scipy Latin América 2015 - Call for Proposals

SciPy Latin América 2015, the third annual Scientific Computing with Python Conference, will be held this May 20-22 in Posadas, Misiones, Argentina.

SciPy is a community dedicated to the advancement of scientific computing through open source Python software for mathematics, science, and engineering. The annual SciPy Conferences allow participants from academic, commercial, and governmental organizations to showcase their latest projects, learn from skilled users and developers, and collaborate on code development.

Proposals are now being accepted for SciPy Latin América 2015.

Presentation content can be at a novice, intermediate or advanced level. Talks will run 30-40 min and hands-on tutorials will run 100-120 min. We also accept proposals for posters. For more information about the different types of proposal, see the "Different types of Communication" section below.

How to Submit?

  1. Register for an account on
  2. Submit your proposal at

Important Dates

  • April 6th: Talk, poster, and tutorial submission deadline.
  • April 20th: Notification of accepted talks, posters, and tutorials.
  • May 20th-22nd: SciPy Latin América 2015.

Different types of Communication

Talks: These are the traditional talk sessions given during the main conference days. They're mostly 30 minutes long with 5 min for questions. If you think you have a topic but aren't sure how to propose it, contact our program committee and we'll work with you. We'd love to help you come up with a great proposal.

Tutorials: We are looking for tutorials that can grow this community at any level. We aim for tutorials that will advance Scientific Python, advance this community, and shape the future. They are 100-120 minutes long, but if you think you need more than one slot, you can split the content and submit two self-contained proposals.

Posters: The poster session provides a more interactive, attendee-driven presentation than the speaker-driven conference talks. Poster presentations have fostered extensive discussions on the topics, with many that have gone on much longer than the actual "session" called for. The idea is to present your topic on poster board and as attendees mingle through the rows, they find your topic, read through what you've written, then strike up a discussion on it. It's as simple as that. You could be doing Q&A in the first minute of the session with a group of 10 people.

Lightning Talks: Want to give a talk, but do not have enough material for a full talk? These talks are at most 5 minutes long and are given in quick succession in the main hall. No need to fill the whole slot, though!

Juan B Cabral
Biopython mailing list  -  Biopython <at>
Tiago Antao | 20 Feb 15:28 2015

[Biopython] Bio.PDB and Bio.SeqIO.PdbIO


I am trying to get my head around the relationship between Bio.PDB and
Bio.SeqIO.PdbIO; let's see if I got this right:

1. Bio.PDB is quite limited in terms of what records it can parse. For
example I cannot get to SEQRES or anything related to secondary
structure (SHEET, HELIX)?

2. PdbIO allows extracting the protein sequences from SEQRES

Or, to put it another way, a generic PDB file is not completely
accessible from inside Biopython in the sense that we cannot access
some of the records (SHEET and HELIX being the examples here). Or is
there a way to get to them via Bio.PDB?
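Independently of what Bio.PDB exposes, such records can be read by slicing the fixed columns of the PDB format. The following is only a sketch for HELIX records, using the column positions of the wwPDB format description; `parse_helix_records` is a hypothetical helper, not a Biopython function:

```python
def parse_helix_records(lines):
    """Yield one dict per HELIX record, sliced by fixed PDB columns."""
    for line in lines:
        if not line.startswith("HELIX "):
            continue
        yield {
            "helix_id": line[11:14].strip(),  # columns 12-14
            "chain": line[19],                # column 20, initial chain ID
            "start": int(line[21:25]),        # columns 22-25, initial residue
            "end": int(line[33:37]),          # columns 34-37, final residue
        }
```

Usage would be along the lines of `list(parse_helix_records(open("structure.pdb")))`, where the file name is a placeholder. SHEET records could be handled the same way with their own column positions.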

Many thanks for your help,
Biopython mailing list  -  Biopython <at>

Brad Chapman | 17 Feb 11:09 2015

Re: [Biopython] biopython installation on Windows

Thanks for the e-mail and sorry about the problems installing
Biopython on Windows. Unfortunately I have no experience working with
Windows so don't have good tips for you. I'm cc'ing the Biopython
mailing list where folks might have more experience.

My general thoughts would be:

- Are you using one of the Windows installers for Biopython or trying to
  build from source? Installing with a pre-built installer should be the
  easiest way:

- In general, could you provide more details about what you tried and
  where you got stuck? The specific error messages you're seeing would
  be helpful to provide more debugging info.

Thanks for your patience getting it installed and hope this helps some,

>> Brad:
>> My grandson will finish his degree in chemistry this semester at Northern
>> Arizona University,  cum laude. He also will have a degree in psychology, cum
>> laude. In his biochem capstone course he wants to use biopython to present data
>> for his research paper.  He has applied to several universities for his PhD.
>> He likes research.
>> However,  he  and I have been unable to get biopython installed.  He has been
>> using python for some time accessing databases on the internet.
>> I tried to install biopython on windows 7 and 8 using both 2.7.5 and 3.3
>> versions.  And I get stopped at the same point every time.  It appears that
>> this interpreter was built for the unix os and made to run on windows.
>> The install script appears to be looking for an entry that does not exist.  I
>> can run python just fine.  Bio - not so fine.
>> So, How do I get biopython installed??  By the way,  I have read the 10 page
>> pdf and myriad other references to no avail.
>> Thanks in advance for whatever assistance you can render.
>> Ken Ingham
Biopython mailing list  -  Biopython <at>

Raoul J P Bonnal | 16 Feb 17:17 2015

[Biopython] Google Summer of Code 2015, call for project idea and mentors.

Hi All

We have LESS than a week to complete and submit the application for the Google Summer of Code 2015.

- 20 February, 19:00 UTC: Mentoring organization application deadline.
- 23 - 27 February: Google program administrators review organization applications.
- 2 March, 19:00 UTC: List of accepted mentoring organizations published on the Google Summer of Code 2015 site.

OBF is going to apply to be a mentoring organization for Google Summer of
Code 2015. To make the ideas list more digestible for Google's reviewers,
we consolidated all of the Bio* projects' ideas into a single page on the
OBF wiki:

Francesco Strozzi and I (Raoul J.P. Bonnal) are the OrgAdmins, thanks to the OBF Board.

We encourage each mentor of an affiliated sub-project to fill in or add projects
to the above page. Please report your availability as a mentor for this year
directly to me (bonnal <at>). Students from past years can mentor and propose an
idea, if supported by their community.

Any other communication related to OBF and GSoC must use  gsoc <at>
Subscribe here:

Last year we introduced Cross Projects, i.e. those that involve two or more
programming languages or Bio* project communities and/or can be useful to
many languages (web APIs reusable from any language). The first 2015 cross
project is and you can find the proposal here:

This page is the one we listed in our application. It is separate from the
OBF wiki page for general GSoC information:

If OBF is accepted for GSoC 2015, it would make sense to point each Bio*
project's GSoC wiki page to this one, instead of duplicating the content.

As another way to interact with potential students, we've created a Google
Plus page for OBF:

And a G+ community for OBF's GSoC activities:

Feel free to forward this message to your colleagues or other possible orgs that want to join us.

Thanks to Eric Talevich, the main GSoC 2014 OrgAdmin, who did great work and provided a lot of docs and useful hints.

Best regards,
Raoul & Francesco
OBF GSoC 2015 Org Admins
Biopython mailing list  -  Biopython <at>
PC | 11 Feb 03:31 2015

Re: [Biopython] Extracting a PDB list

Hi David,

Thank you, yes I am in the process of trying to do this.

Thank you for the suggestion of the spacing too, something I didn't think of.


-----Original Message-----
From: davidsshin <at>
Sent: Tue, 10 Feb 2015 17:46:35 -0800
Subject: Re: [Biopython] Extracting a PDB list

Hi Patrick,

You should be able to write a script to do this (shell script with some python or awk).

Off the top of my head, for each file you would:

for each file:
   extract the lines with ^ATOM into a new file to make things easier
   read each line into some list
   subtract the residue number from each line from the next line in the list
      if that value is > 1  
          print something ( the file name, or some flag)
      else there are no breaks... can do something else if you want

The only tough part is using spaces to separate items. If, say, a protein had 1000 residues, the 1000 would run into the chain ID. So that's something to consider. Using specific column numbers would be the better way.
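The outline above, with fixed columns instead of whitespace splitting, could be sketched in Python roughly as follows. It assumes standard ATOM records (chain ID in column 22, residue number in columns 23-26) and treats any jump in numbering greater than 1 as a possible break; `find_chain_breaks` is a hypothetical name:

```python
def find_chain_breaks(lines):
    """Return (chain, previous_resseq, next_resseq) for every numbering gap."""
    breaks = []
    last_seen = {}  # chain ID -> last residue number read for that chain
    for line in lines:
        if not line.startswith("ATOM  "):
            continue
        chain = line[21]           # column 22
        resseq = int(line[22:26])  # columns 23-26, safe for 4-digit numbers
        previous = last_seen.get(chain)
        if previous is not None and resseq - previous > 1:
            breaks.append((chain, previous, resseq))
        last_seen[chain] = resseq
    return breaks
```

A file would be kept when `find_chain_breaks(open(path))` returns an empty list. Note that a gap in numbering is only a hint: insertion codes and unusual numbering schemes can hide or fake a break, so borderline cases may still need a look.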

That and I'm not sure about the uniformity of PDB files that are really old.

Let me know if that helps, if not, I can maybe help out further.


On Tue, Feb 10, 2015 at 2:24 PM, João Rodrigues <j.p.g.l.m.rodrigues <at>> wrote:

Without manually checking every single one, there is no such list, at least that I know of. Your best bet could be to set your resolution cutoff as low as possible; those structures are usually of very good quality.



2015-02-10 22:35 GMT+01:00 PC <patrick.cossins <at>>:

I do know about PISCES lists but I want a list of PDB's without any chain breaks.

Is there such a list or a way to obtain such a list?

Thank you.


David Shin, Ph.D
Lawrence Berkeley National Labs
1 Cyclotron Road
MS 83-R0101
Berkeley, CA 94720

Biopython mailing list  -  Biopython <at>
PC | 10 Feb 22:35 2015

[Biopython] Extracting a PDB list


I do know about PISCES lists but I want a list of PDB's without any chain breaks.

Is there such a list or a way to obtain such a list?

Thank you.


Biopython mailing list  -  Biopython <at>