Jose Colbes | 19 May 20:00 2015

Possible Issue with Calc.angle() and Calc.torsion()


As a part of my work, I use these methods a lot. But I get NaN when the vectors are parallel. 

I know that it is a common issue (rounding errors and acos()) and that there is information on the web about how to get around the problem, but I wanted you to know anyway.
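
A common workaround for the acos() problem described above is to clamp the cosine to [-1, 1] before taking the arc cosine, so that rounding error in the dot product cannot push the argument out of range. The sketch below is plain Java on double[] vectors, not BioJava's Calc code; the class and method names are hypothetical, and returning degrees is an arbitrary choice made here.

```java
// Hypothetical helper, not part of BioJava: NaN-safe angle between two 3D vectors.
final class SafeAngle {

    /** Dot product of two 3D vectors. */
    static double dot(double[] u, double[] v) {
        return u[0] * v[0] + u[1] * v[1] + u[2] * v[2];
    }

    /** Angle in degrees between a and b, with the cosine clamped to [-1, 1]. */
    static double angleDegrees(double[] a, double[] b) {
        double norms = Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b));
        if (norms == 0.0) {
            throw new IllegalArgumentException("angle is undefined for a zero-length vector");
        }
        double cos = dot(a, b) / norms;
        // Rounding can yield values such as 1.0000000000000002, for which acos() returns NaN.
        double clamped = Math.max(-1.0, Math.min(1.0, cos));
        return Math.toDegrees(Math.acos(clamped));
    }

    public static void main(String[] args) {
        double[] a = {0.1, 0.2, 0.3};
        double[] b = {0.3, 0.6, 0.9}; // parallel to a
        System.out.println(angleDegrees(a, b));
    }
}
```

The same clamp can be applied to any other acos-based formula, for example the final step of a torsion calculation.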

Best regards,

PS: Thank you for BioJava.
Biojava-l mailing list  -  Biojava-l <at>
Andreas Prlic | 15 May 00:46 2015

Fwd: Obtaining second structure with biojava

Hi Mohammad,

Please don't send BioJava related questions to me directly, but to the mailing list.

The secondary structure assignment code in BioJava is still in beta. If you want to get the author's assignment of secondary structure (most of the time the same as DSSP assignments), you can take a look at the tutorial for how to access it. Check the section "Working with groups" for an example.

Hope that helps,


---------- Forwarded message ----------
From: Mohammad Taheri <mo.taheri.ledari <at>>
Date: Thu, May 14, 2015 at 2:01 AM
Subject: Obtaining second structure with biojava
To: andreas.prlic <at>

Hello Mr Andreas Prlic.

I am using BioJava to load and analyze protein structures, but I have a problem obtaining the secondary structure of a protein. I use the code here to get the secondary structure, but it does not give me the right locations of the alpha helices. For example, it treats some 3/10 helices as alpha helices.
I even tried using the raw PDB file to obtain the alpha helices and beta sheets via the toPDB() method of the structure object, but this method does not give me the original PDB file with the HELIX and SHEET sections; it gives me only the atoms section.
Could you tell me how I can obtain the correct, exact secondary structure of a protein chain using BioJava?

Thank you in advance for your help.

Rose, Peter | 13 May 21:19 2015

[Job] Java Web Developers at RCSB Protein Data Bank

Java Web Developer

The RCSB PDB is seeking exceptional Developers, and we know we’re not alone in our search. So why choose to work with us? Our team values open discussion and contribution. Starting from your first day, you will shape software and services used by thousands of people around the world. Our organization can trace its lineage back to the 1970s, but we still operate like a startup. Have a great idea? Let’s hear it. Want to try a new technology? Let’s learn it. Want to write code at scale? Let’s do it. Everyone at our organization is passionate about what we do, and that is why we are leaders in our field. We want to hear from skilled Developers, people passionate about their craft and what they can bring to the field.

We are looking for two experienced Developers to join our team of agile software Developers at the University of California, San Diego. By joining our team, a successful applicant would be able to contribute to a variety of projects ranging from:

  • Front end development using HTML, CSS, Javascript, JSP, NodeJS
    • Our core business is our website & web services
  • Middleware development that leverages Memcached, Hibernate and RabbitMQ
    • How we scale to meet tens of thousands of unique users every day
  • Back end development using Java, MySQL/MariaDB and NoSQL solutions
    • How we incorporate and add value to the scientific community
  • Special projects
    • Search using Apache Solr
    • Scalable solutions built on top of OpenStack, Hadoop, and Spark

The RCSB Protein Data Bank is one of the leading biological databases worldwide, with more than 300,000 unique users per month from over 160 countries. It enables access to the singular global archive of the three-dimensional structures of proteins and nucleic acids. It is a key resource for the design of new medicines, biofuels, and nanomaterials, and it enables fundamental discoveries in biology and medicine.


  • BS degree in Computer Science or related field
  • A minimum of 3 years of experience developing dynamic, highly scalable, database-driven web applications using HTML, CSS, JavaScript and Java/JSP
  • Demonstrable experience with database design and systems
    • Experience with NoSQL database systems, object-relational mapping using Hibernate and distributed parallel computing is a plus
  • Citable experience using agile software development and test-driven design

For more requirements or to apply, please view the UCSD job page.

Jonas Dehairs | 28 Mar 17:13 2015

Introducing a mutation in a DNA sequence

I want to introduce a mutation to a DNA sequence at a particular location.
I can't seem to find a suitable method for this in the 4.0 API. What would make the most sense to me is a setCompoundAt(int position, C compound) method in the AbstractSequence class, similar to the getCompoundAt(int position) method, but this doesn't seem to exist. And the mutator class seems to be for proteins only. How can I do this?
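
One way around the missing setter is to treat the sequence as immutable and build a mutated copy from its string form. The sketch below is plain Java; `PointMutation` and `mutate` are hypothetical names, and the 1-based position is an assumption chosen to mirror biological indexing. With BioJava on the classpath, the input could presumably come from the sequence's string form and the result be wrapped in a new DNASequence, but that wiring is not shown or tested here.

```java
// Hypothetical helper, not part of BioJava: returns a copy of a DNA string
// with a single base replaced.
final class PointMutation {

    /**
     * @param sequence DNA sequence as a string
     * @param position 1-based position of the base to replace (an assumption of this sketch)
     * @param newBase  replacement base, one of A, C, G, T or N (case-insensitive)
     */
    static String mutate(String sequence, int position, char newBase) {
        if (position < 1 || position > sequence.length()) {
            throw new IndexOutOfBoundsException(
                    "position " + position + " is outside 1.." + sequence.length());
        }
        if ("ACGTN".indexOf(Character.toUpperCase(newBase)) < 0) {
            throw new IllegalArgumentException("not a DNA base: " + newBase);
        }
        StringBuilder mutated = new StringBuilder(sequence);
        mutated.setCharAt(position - 1, newBase); // convert 1-based to 0-based index
        return mutated.toString();
    }

    public static void main(String[] args) {
        System.out.println(mutate("ATGCCGTA", 4, 'T')); // prints ATGTCGTA
    }
}
```

Building a new string keeps the original sequence untouched, which makes it easy to hold wild-type and mutant side by side.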

Load NCBI Taxonomy data using BioJava

Hi all,

First of all, apologies if this question has already been answered or seems silly. I'm new to BioJava and I've been trying to do this for a couple of days, but I haven't found any answer or solution.

I just want to load the NCBI Taxonomy content (names.dmp and nodes.dmp, as I saw in the documentation) using BioJava in order to query the taxonomy data.

In the wiki I found an example of how to "start" the loading process, but I realized that these classes don't exist in the current JARs. I downloaded all the JARs and imported all of them (because I didn't know whether I need only core or some extra one), but the class is missing. I also see that the Javadoc of version 4.0 doesn't have the classes mentioned in the example either.

After searching, I found that these classes seem to belong to an older BioJava version (1.5, 1.7, ...).

I don't know whether I'm doing something wrong or whether this part of the wiki simply has not been updated. Any clue?

By the way, if any of you have an example of code working with the NCBI taxonomy that you can provide, it would be very helpful.
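
As a stopgap that does not depend on any BioJava class, the two dump files can be parsed with plain Java. The sketch below assumes the usual NCBI taxdump layout: fields separated by tab, pipe, tab; each row ending with tab, pipe; tax ID and parent tax ID as the first two columns of nodes.dmp; tax ID, name and name class as columns 1, 2 and 4 of names.dmp. All class and method names are hypothetical, and the sample rows are shortened, with simplified parent links, for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of BioJava: parses NCBI taxdump rows with plain Java.
final class TaxDumpSketch {

    /** Splits one taxdump row into trimmed fields. */
    static String[] splitRow(String line) {
        // Rows end with "\t|"; fields are separated by "\t|\t".
        String body = line.endsWith("\t|") ? line.substring(0, line.length() - 2) : line;
        String[] fields = body.split("\t\\|\t", -1);
        for (int i = 0; i < fields.length; i++) {
            fields[i] = fields[i].trim();
        }
        return fields;
    }

    /** Maps each tax ID to its parent tax ID (rows of nodes.dmp). */
    static Map<Integer, Integer> parseParents(List<String> nodesRows) {
        Map<Integer, Integer> parents = new HashMap<>();
        for (String row : nodesRows) {
            String[] f = splitRow(row);
            parents.put(Integer.valueOf(f[0]), Integer.valueOf(f[1]));
        }
        return parents;
    }

    /** Maps each tax ID to its scientific name (rows of names.dmp). */
    static Map<Integer, String> parseScientificNames(List<String> namesRows) {
        Map<Integer, String> names = new HashMap<>();
        for (String row : namesRows) {
            String[] f = splitRow(row);
            if (f.length >= 4 && "scientific name".equals(f[3])) {
                names.put(Integer.valueOf(f[0]), f[1]);
            }
        }
        return names;
    }

    /** Walks from a tax ID up to the root and collects the names on the way. */
    static List<String> lineage(int taxId, Map<Integer, Integer> parents,
                                Map<Integer, String> names) {
        List<String> path = new ArrayList<>();
        Integer current = taxId;
        while (current != null) {
            path.add(names.getOrDefault(current, "taxid:" + current));
            Integer parent = parents.get(current);
            if (parent == null || parent.equals(current)) {
                break; // the root node lists itself as its own parent
            }
            current = parent;
        }
        return path;
    }

    public static void main(String[] args) {
        // Shortened sample rows; real files have more columns and intermediate ranks.
        List<String> nodes = Arrays.asList(
                "1\t|\t1\t|\tno rank\t|",
                "9605\t|\t1\t|\tgenus\t|",
                "9606\t|\t9605\t|\tspecies\t|");
        List<String> names = Arrays.asList(
                "1\t|\troot\t|\t\t|\tscientific name\t|",
                "9605\t|\tHomo\t|\t\t|\tscientific name\t|",
                "9606\t|\tHomo sapiens\t|\t\t|\tscientific name\t|");
        System.out.println(lineage(9606, parseParents(nodes), parseScientificNames(names)));
    }
}
```

For the real files, the rows could be read with java.nio.file.Files.readAllLines or, given the size of names.dmp, streamed line by line with a BufferedReader.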


Dr. Alejandro Rodríguez González - PhD

Bioinformatics at Centre for Plant Biotechnology and Genomics UPM-INIA
Polytechnic University of Madrid
Phone: +34 914524900, Ext: 25550

Once the game is over, the king and the pawn go back in the same box. - Italian proverb
Peter Cock | 3 Mar 16:50 2015

Sadly OBF not accepted for GSoC 2015

This announcement was posted to the OBF blog here:

Dear all,

Last year’s Google Summer of Code 2014 was very productive for the OBF
with six students working on Bio* and related bioinformatics projects
[1]. We applied to be part of GSoC 2015, but unfortunately this year
we were not accepted.

Google’s program is enormously popular, and over-subscribed, meaning
Google has had to rotate organisation membership. The OBF is grateful
to have been accepted in 2010, 2011, 2012 and 2014. This year any
participation will be down to individual projects to find a willing
umbrella group from the organisations accepted for GSoC 2015 [2]. For
example, a Biopython project was included under NESCent for GSoC 2013.

Other organizations with bioinformatics as a keyword are Ruby Science
Foundation, Department of Biomedical Informatics, Stony Brook
University, OncoBlocks, University of Nebraska – Helikar Lab. Other
organizations related to science are ASCEND, BRL-CAD, Debian Project,
HPCC Systems®, International Neuroinformatics Coordinating
Facility, lmonade: scientific software distribution, OSGeo – Open
Source Geospatial Foundation, The Concord Consortium, The
Visualization Toolkit. Languages: Python, Scala, Apache Foundation.
Last but not least: Global Alliance for Genomics & Health. [3]

On behalf of the OBF, we would like to thank our volunteer GSoC
Administrators, Raoul Bonnal and Francesco Strozzi, for organising our
application - and all our potential mentors across the Bio* projects
who put forward potential project suggestions.

OBF Secretary

[3] See links on

Raoul J P Bonnal | 23 Feb 09:51 2015

OBF GSoC Registration

All possible mentors, please

register yourself

and try to connect to the organization Open Bioinformatics Foundation
id: obf

If you know of a mentor who cannot be reached by this
message, feel free to forward it.

Ra & Fra

On 2/20/15 8:51 PM, Raoul Bonnal wrote:
> Dear All,
> I have created a special section
> I think it would be very useful to add all the references where GSoC has been used for doing science.
> Please go ahead and add your project, paper, or whatever else you think Google should know about our work.
> --
> Ra


Raoul J P Bonnal | 16 Feb 17:17 2015

Google Summer of Code 2015, call for project idea and mentors.

Hi All

We have LESS than a week to complete and submit the application for Google Summer of Code 2015.

20 February, 19:00 UTC: Mentoring organization application deadline.

23 - 27 February: Google program administrators review organization applications.

2 March, 19:00 UTC: List of accepted mentoring organizations published on the Google Summer of Code 2015 site.

OBF is going to apply to be a mentoring organization for Google Summer of
Code 2015. To make the ideas list more digestible for Google's reviewers,
we consolidated all of the Bio* projects' ideas into a single page on the
OBF wiki:

Francesco Strozzi and I (Raoul J.P. Bonnal) are the Org Admins, thanks to the OBF Board.

We encourage each mentor of an affiliated sub-project to add project ideas
to the above page. Please report your availability as a mentor for this year
directly to me (bonnal <at>). Students from past years can mentor and propose
an idea, if supported by their community.

Any other communication related to OBF and GSoC must use  gsoc <at>
Subscribe here:

Last year we introduced Cross Projects, i.e. those that involve two or
more programming languages or Bio* project communities and/or can be
useful to many languages (web APIs reusable from any language). The
first 2015 cross project
is and you can find the proposal here:

This page is the one we listed in our application. It is separate from the
OBF wiki page for general GSoC information:

If OBF is accepted for GSoC 2015, it would make sense to point each Bio*
project's GSoC wiki page to this one, instead of duplicating the content.

As another way to interact with potential students, we've created a Google
Plus page for OBF:

And a G+ community for OBF's GSoC activities:

Feel free to forward this message to your colleagues or other possible orgs that want to join us.

Thanks to Eric Talevich, the main GSoC 2014 Org Admin, who did great work and provided a lot of docs and useful hints.

Best regards,
Raoul & Francesco
OBF GSoC 2015 Org Admins
Jose Manuel Duarte | 11 Feb 17:54 2015

Cookbook update

FYI I've updated many of the cookbook pages in to adapt 
them to Biojava 4. I'm sure there are still a lot of pointers to 3 and 
some other inconsistencies, but it's a first step towards an up-to-date 
Biojava 4 cookbook.


Steve Darnell | 3 Feb 18:19 2015

[Job Posting] Software developer position at DNASTAR (Madison, WI USA)


My company is hiring software developers for our structural biology group in Madison, WI USA. If
interested, please submit applications to resume <at>

Best regards,
Steve Darnell

DNASTAR is a leading developer of desktop computer software for molecular biologists. Established in
1984, our products are used by pharmaceutical, biotech and academic researchers in more than 65

Due to company growth, we seek an experienced software developer to join our development team. Five or more
years of experience using Java and C++ in a commercial development environment are required. Bachelor's
or advanced degree in computer science and/or life science domain and experience developing software
for life scientists are preferred. Position is available in our structural biology group, which focuses
on 3D protein structure prediction, protein-protein docking, and computational protein engineering. 

We have a team oriented work environment, along with a competitive health, dental and 401k benefits

To apply, send resume and salary requirements to resume <at> No calls please.

Steve Darnell, Ph.D.
Senior Scientist
3801 Regent Street
Madison, WI 53705 USA


Khalil El Mazouari | 31 Jan 14:40 2015

file i/o with ArrayList

Hi Stefan,

I recently had a similar problem. Object serialisation was OK. However, deserialising a huge sequence
list may consume a lot of memory and hurt performance.

I highly recommend taking a look at ChronicleMap (CM): a low-latency key-value store with consistency,
persistence and performance. Data can be stored on disk in a single file and reloaded from it. You can also use
CM as an off-heap store.

With CM we managed to persist, load and process 1,000,000 sequences with just 4 GB of RAM.
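
Independent of ChronicleMap, the simplest portable fallback for the original question is to store only identifiers and residues as FASTA text, accepting that any richer annotation is lost. The sketch below is plain Java with hypothetical names (`FastaRoundTrip`, `toFasta`, `fromFasta`); it works on (id, sequence) string pairs rather than BioJava Sequence objects and is not a substitute for a proper rich-sequence format.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of BioJava: round-trips (id, residues) pairs as FASTA text.
final class FastaRoundTrip {

    /** Renders each entry as a ">id" header line followed by one line of residues. */
    static String toFasta(Map<String, String> sequences) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> entry : sequences.entrySet()) {
            out.append('>').append(entry.getKey()).append('\n')
               .append(entry.getValue()).append('\n');
        }
        return out.toString();
    }

    /** Parses FASTA text back into an insertion-ordered map; wrapped lines are joined. */
    static Map<String, String> fromFasta(String fasta) {
        Map<String, String> sequences = new LinkedHashMap<>();
        String id = null;
        StringBuilder residues = new StringBuilder();
        for (String rawLine : fasta.split("\n")) {
            String line = rawLine.trim();
            if (line.isEmpty()) {
                continue;
            }
            if (line.startsWith(">")) {
                if (id != null) {
                    sequences.put(id, residues.toString()); // flush the previous record
                }
                id = line.substring(1);
                residues.setLength(0);
            } else {
                residues.append(line);
            }
        }
        if (id != null) {
            sequences.put(id, residues.toString()); // flush the last record
        }
        return sequences;
    }

    public static void main(String[] args) {
        Map<String, String> seqs = new LinkedHashMap<>();
        seqs.put("seq1", "ACGTACGT");
        seqs.put("seq2", "GGCCTTAA");
        System.out.print(toFasta(seqs));
    }
}
```

The resulting text can be written with java.nio.file.Files.write; for very large collections a streaming writer would avoid holding the whole text in memory.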

Best Regards,


On 30 Jan 2015, at 13:00, biojava-l-request <at> wrote:

> Today's Topics:
>   1. file i/o with ArrayList (stefan harjes)
> ----------------------------------------------------------------------
> Message: 1
> Date: Fri, 30 Jan 2015 11:01:33 +0000 (UTC)
> From: stefan harjes <stefanharjes <at>>
> To: "biojava-l <at>" <biojava-l <at>>
> Subject: [Biojava-l] file i/o with ArrayList
> Hi biojava-l
> I have a huge number of small sequences in a list (ArrayList<Sequence<?>>) which I would like to store on
> disk across server stops and starts. Unfortunately Sequence is not serializable, so I searched and found that
> GenbankWriterHelper.writeSequences(OutputStream os, Collection<Sequence<?>> seqs) should be
> able to do the job.
> However, when looking at GenbankReaderHelper, there are no methods which correspond to the above writer
> method. Am I on the wrong track completely?
> When looking at the writer/reader helpers, I think I remember reading that they are rudimentary and save
> only the sequence (FASTA)? I would expect that in such an advanced version of BioJava (4.0 is being prepared?)
> there must be a standard way to serialize rich sequences or arrays of them in order to send them around on
> streams, JSON, etc.?
> Any help would be appreciated
> Cheers
> Stefan

