Re: Possible Contribution: UCSC Blat and Ensembl SSAHA Sequence Locator
Jeffrey Chang <jchang <at> smi.stanford.edu>
2003-02-07 20:42:29 GMT
On Fri, Feb 07, 2003 at 10:43:29AM +0200, Anthony Metzidis wrote:
[I've reordered some paragraphs...]
> We've developed a Python API for the UCSC
> BLAT (http://genome.ucsc.edu/cgi-bin/hgBlat?command=start) and Ensembl
> SSAHA (http://www.ensembl.org/Homo_sapiens/ssahaview) genome search tools.
> We would like to contribute this to BioPython, if you think there would
> be an interest in it.
Yes, there would definitely be interest in it!
> If so, could you offer advice about other existing BioPython interfaces
> that we should model ours after? I would like the interface to be as
> consistent as possible with the rest of BioPython.
There are a few data types that should be supported. More below...
> Using our tool, you can input a series of DNA sequences in FASTA format
> and then get the results back as dictionaries, indexed by the FASTA
> title, of dictionaries indexed by the fields presented by the web
The DNA sequences should be passed as Bio.Seq objects, and the interface
should not require FASTA format. Also, the results should be returned as
defined and documented objects (for an example, see Bio.Blast.Record),
rather than as dictionaries.
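
To make the "objects rather than dictionaries" point concrete, here is a
rough sketch of what I have in mind. The names Hit, Record and
records_from_dicts, the field names, and the assumption that each title
maps to a list of per-hit field dictionaries are all made up for
illustration; they are not existing Biopython classes and not your tool's
actual output layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Hit:
    """One alignment reported by the search tool (hypothetical fields)."""
    chromosome: str
    start: int
    end: int
    score: int


@dataclass
class Record:
    """All hits found for a single query sequence."""
    query_title: str
    hits: List[Hit] = field(default_factory=list)

    def best_hit(self) -> Optional[Hit]:
        """Return the highest-scoring hit, or None if there are no hits."""
        return max(self.hits, key=lambda h: h.score, default=None)


def records_from_dicts(raw: Dict[str, List[Dict[str, str]]]) -> List[Record]:
    """Convert {title: [{field: value}, ...]} into a list of Record objects.

    Values scraped from a results page arrive as strings, so the numeric
    fields are converted to int here, once, instead of in every caller.
    """
    records = []
    for title, hit_dicts in raw.items():
        hits = [
            Hit(
                chromosome=d["chromosome"],
                start=int(d["start"]),
                end=int(d["end"]),
                score=int(d["score"]),
            )
            for d in hit_dicts
        ]
        records.append(Record(query_title=title, hits=hits))
    return records


# Assumed shape of the dictionary-based output described in the quoted text.
raw_results = {
    "seq1": [
        {"chromosome": "chr1", "start": "100", "end": "200", "score": "95"},
        {"chromosome": "chr7", "start": "5000", "end": "5090", "score": "40"},
    ],
    "seq2": [],
}

records = records_from_dicts(raw_results)
```

With this, a caller writes records[0].best_hit().chromosome instead of
remembering which string keys the web page happened to use, and the
attributes can be documented in one place.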
> The HTTP connection and parsing of the HTML results pages are handled by
> our tool.