Re: Possible Contribution: UCSC Blat and Ensembl SSAHA Sequence Locator
Jeffrey Chang <jchang <at> smi.stanford.edu>
2003-02-07 20:42:29 GMT
On Fri, Feb 07, 2003 at 10:43:29AM +0200, Anthony Metzidis wrote:
[I've reordered some paragraphs...]
> Hello,
> We've developed a Python API for the UCSC
> BLAT (http://genome.ucsc.edu/cgi-bin/hgBlat?command=start) and Ensembl
> SSAHA (http://www.ensembl.org/Homo_sapiens/ssahaview) genome search tools.
> We would like to contribute this to BioPython, if you think there would
> be an interest in it.
Yes, there would definitely be interest in it!
> If so, could you offer advice about other existing BioPython interfaces
> that we should model ours after? I would like the interface to be as
> consistent as possible with the rest of BioPython.
There are a few data types that should be supported. More below...
> Using our tool, you can input a series of DNA sequences in FASTA format
> and then get the results back as dictionaries, indexed by the FASTA
> title, of dictionaries indexed by the fields presented by the web
> interfaces.
The DNA sequences should be Bio.Seq objects, and not require FASTA
format. Also, the results should be in defined and documented objects
(for an example, see Bio.Blast.Record), rather than dictionaries.
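
To make the "documented objects rather than dictionaries" point concrete, here is a minimal sketch of what such result classes might look like. Everything in it is an assumption for illustration: the class names (Hit, SearchRecord), the field names, and the record_from_dict helper are hypothetical, and are not taken from the contributed code or from Bio.Blast.Record.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Hit:
    """One alignment reported for a query (hypothetical field names)."""
    chromosome: str
    start: int
    end: int
    strand: str
    score: int


@dataclass
class SearchRecord:
    """All hits returned for a single query sequence."""
    query_title: str
    hits: List[Hit] = field(default_factory=list)

    def best_hit(self) -> Optional[Hit]:
        """Return the highest-scoring hit, or None if there are no hits."""
        return max(self.hits, key=lambda h: h.score, default=None)


def record_from_dict(title, rows):
    """Convert dictionary-style output (one dict per hit) into a SearchRecord.

    The dictionary keys used here are assumed, not the real web-form fields.
    """
    hits = [
        Hit(
            chromosome=row["chromosome"],
            start=int(row["start"]),
            end=int(row["end"]),
            strand=row["strand"],
            score=int(row["score"]),
        )
        for row in rows
    ]
    return SearchRecord(query_title=title, hits=hits)
```

With classes like these, a caller would write rec.best_hit().score instead of indexing nested dictionaries by string keys, and the available fields are documented in one place.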
> The http connection and parsing of the HTML results pages are handled by
> our tool.