Katie Edmonds | 7 May 23:02 2008

[BioPython] PSI-BLAST using NCBIWWW

Hi,

I'm trying to use biopython to run psi-blast on the ncbi server.  It looks
like qblast cannot be used for psi-blast, and as far as I can tell blast()
doesn't work at all anymore, though it has a 'run_psiblast' parameter.  Has
anyone had success with running psi-blast with biopython who could offer
some advice?

Thanks,
Katie
_______________________________________________
BioPython mailing list  -  BioPython <at> lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/biopython

gbastian | 7 May 23:29 2008

[BioPython] NCBIXML error

Dear all,

I have been using a script to blast sequences for days without a
problem, then, after 2/3 hours it started giving me this error
and never worked again...did they change xml blast format?

this is the error:

File "ppinvestigator.py", line 918, in ?
    pdbs.find_homologous_seqs(int_list)
  File "ppinvestigator.py", line 122, in find_homologous_seqs
    data = search_seq(self.sequences[chain][0], interactor_list)
  File
"/home/giacomotion/Desktop/VU-PROJECT/PPI_PDBS/PPINVESTIGATOR/tools.py",
line 32, in search_seq
    blast_record = blast_records.next()
  File "/usr/lib/python2.4/site-packages/Bio/Blast/NCBIXML.py", line 576,
in parse
    expat_parser.Parse(text, False)
  File "/usr/lib/python2.4/site-packages/Bio/Blast/NCBIXML.py", line 98,
in endElement
    eval("self.%s()" % method)
  File "<string>", line 0, in ?
  File "/usr/lib/python2.4/site-packages/Bio/Blast/NCBIXML.py", line 216,
in _end_BlastOutput_version
    self._header.date = self._value.split()[2][1:-1]
IndexError: list index out of range

this is my script:

[...]

Peter | 8 May 10:57 2008

Re: [BioPython] NCBIXML error

On Wed, May 7, 2008 at 10:29 PM,  <gbastian <at> pasteur.fr> wrote:
> Dear all,
>
>  I have been using a script to blast sequences for days without a
>  problem, then, after 2/3 hours it started giving me this error
>  and never worked again...did they change xml blast format?

Looking at the XML snippet, the version is "BLASTP 2.2.18+" (with a
plus but no date), so it looks like they may well have updated
something.  It's possible that they'll make further tweaks in the next
couple of days, so it would be worth retesting.  The Biopython code
expects something like "BLASTP 2.2.12 [Aug-07-2005]", and it's the
missing date that is causing this error for you.
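
As a rough illustration of the failure (a sketch only, not the actual
Biopython code or the fix that went into CVS), the version string can be
split defensively so that a missing date yields an empty string rather
than an IndexError.  The function name parse_blast_version is made up
for this example.

```python
def parse_blast_version(value):
    """Split a BlastOutput_version string into (application, version, date).

    Hypothetical helper for illustration; it is not part of Biopython.
    """
    parts = value.split()
    application = parts[0] if len(parts) > 0 else ""
    version = parts[1] if len(parts) > 1 else ""
    # Older output carries a bracketed date as the third field; strip the
    # brackets when it is present and fall back to "" when it is missing.
    date = parts[2][1:-1] if len(parts) > 2 else ""
    return application, version, date
```

With this approach both "BLASTP 2.2.12 [Aug-07-2005]" and
"BLASTP 2.2.18+" parse without raising; the second simply has no date.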

On a different note, if you have really been running BLASTP for days
over the internet, it would probably be faster and more efficient to
install standalone blast and the nr database on your local machine.
You can still ask for XML output, so the parsing side of your script
shouldn't change.
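
A minimal sketch of what the local route could look like, assuming the
legacy blastall binary is on the PATH and an nr database has been
formatted locally.  The helper names and file names are invented for
this example, and the flags (-p, -d, -i, -m 7 for XML, -o) are the
legacy blastall ones, so check them against your installed version.

```python
import subprocess


def build_blastall_command(query_fasta, database="nr", program="blastp",
                           blastall_path="blastall"):
    """Return the argument list for a legacy blastall run with XML output.

    Hypothetical helper; the defaults are assumptions for illustration.
    """
    return [blastall_path,
            "-p", program,       # which BLAST program to run
            "-d", database,      # locally formatted database
            "-i", query_fasta,   # FASTA file holding the query sequence
            "-m", "7"]           # alignment view 7 is XML output


def run_local_blast(query_fasta, xml_path, **options):
    """Run blastall, writing the XML results to xml_path (hypothetical)."""
    command = build_blastall_command(query_fasta, **options) + ["-o", xml_path]
    # Raises CalledProcessError if blastall exits with a non-zero status.
    subprocess.check_call(command)
    return xml_path
```

The resulting XML file can then be opened and handed to the same
NCBIXML parsing code the script already uses for the online results.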

Peter

Christof Winter | 8 May 10:47 2008

Re: [BioPython] NCBIXML error

gbastian <at> pasteur.fr wrote:
> Dear all,
> 
> I have been using a script to blast sequences for days without a
> problem, then, after 2/3 hours it started giving me this error
> and never worked again...did they change xml blast format?
> 
> this is the error:
> 
> File "ppinvestigator.py", line 918, in ?
>     pdbs.find_homologous_seqs(int_list)
>   File "ppinvestigator.py", line 122, in find_homologous_seqs
>     data = search_seq(self.sequences[chain][0], interactor_list)
>   File
> "/home/giacomotion/Desktop/VU-PROJECT/PPI_PDBS/PPINVESTIGATOR/tools.py",
> line 32, in search_seq
>     blast_record = blast_records.next()
>   File "/usr/lib/python2.4/site-packages/Bio/Blast/NCBIXML.py", line 576,
> in parse
>     expat_parser.Parse(text, False)
>   File "/usr/lib/python2.4/site-packages/Bio/Blast/NCBIXML.py", line 98,
> in endElement
>     eval("self.%s()" % method)
>   File "<string>", line 0, in ?
>   File "/usr/lib/python2.4/site-packages/Bio/Blast/NCBIXML.py", line 216,
> in _end_BlastOutput_version
>     self._header.date = self._value.split()[2][1:-1]
> IndexError: list index out of range

[...]

Peter | 8 May 11:18 2008

Re: [BioPython] NCBIXML error

>  It seems they did change the format. When I run blast locally ...
>  whereas it chokes ... as "BLASTP 2.2.18+".split() lacks a third element.
>  Should be easy to fix, shouldn't it?
>
>  Christof

I came to the same conclusion, Christof, and it's now fixed in CVS (with
a new test case too).

Giacomo, if you want to try this you'll need to update your system.
Checking out the latest code from CVS and installing Biopython from
source would be one way.

However, you only need to update one file, replacing
/usr/lib/python2.4/site-packages/Bio/Blast/NCBIXML.py with CVS
revision 1.4.  If you want, you can just grab this from the web
interface here (once the website is automatically updated in a few
hours):

http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Bio/Blast/NCBIXML.py?cvsroot=biopython

Please let us know on the mailing list if that works for you (or if
there are still problems).

Peter

gbastian | 8 May 11:46 2008

[BioPython] NCBIXML error

Hello Peter and Christof,

I just modified the code in NCBIXML.py where it tries
to get the version date information.
Now it works.

line 216

self._header.date = 'Dec-21-2012'

thanks,

Giacomo

Peter | 8 May 13:37 2008

Re: [BioPython] PSI-BLAST using NCBIWWW

On Wed, May 7, 2008 at 10:02 PM, Katie Edmonds <betainverse <at> gmail.com> wrote:
> Hi,
>
>  I'm trying to use biopython to run psi-blast on the ncbi server.  It looks
>  like qblast cannot be used for psi-blast, and as far as I can tell blast()
>  doesn't work at all anymore, though it has a 'run_psiblast' parameter.  Has
>  anyone had success with running psi-blast with biopython who could offer
>  some advice?

I've done some investigation of using Bio.Blast.NCBIWWW.qblast() with
psi-blast, and it does seem to be missing the run_psiblast option.
However, I don't think that's the only issue here...

Have you been able to try running standalone psiblast, and parse its XML output?

Peter

Peter | 9 May 10:59 2008

Re: [BioPython] PSI-BLAST using NCBIWWW

On Thu, May 8, 2008 at 8:28 PM, Katie Edmonds wrote:
> From what I can get with the web interface, it seems like parsing the XML
> should be ok, though for however many iterations I try, it only seems to be
> giving me XML for iteration #1.

Could you try running standalone PSI-Blast and see how the XML output compares?

> I'm still trying to figure out how to run subsequent iterations of PSI-BLAST
> with your patch.

For anyone else interested, see
http://bugzilla.open-bio.org/show_bug.cgi?id=2496

> In the web form it seems to keep track of all past iterations with NEXT_I:
>
> <input name="NEXT_I" type="hidden" value="Run PSI-Blast iteration 2">
>
> <input name="NEXT_I" type="hidden" value="Run PSI-Blast iteration 3">
>
> <input name="NEXT_I" type="hidden" value="Run PSI-Blast iteration 4">
>
> I don't have any idea how similar the qblast interface is to the web
> interface, though.
>
> Thanks,
>
> Katie

You are talking about multiple iterations of results from a single
query?  Looking at the example output I got yesterday, there is indeed
[...]

Martin MOKREJŠ | 12 May 00:48 2008

[BioPython] blastall does not flush buffers due to biopython buffering?

Hi,
  when I try to use Bio/Blast/NCBIStandalone blast, sometimes the process hangs
and sometimes it works (tested from Unix shell and via Apache mod_python).
I see the blastall process in the list of system processes; attaching strace(1)
to it shows that it did print some of the result output, but somehow it
does not continue to write out the buffers (you know that at the end of blast
output is the summary stats ...;). I believe that is because the consuming
process has not yet read the output already written. Effectively, blastall
gets blocked due to biopython.
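
If the hang is the usual pipe deadlock (the child blocks on a full
stdout pipe while the parent waits on stderr), one way around it is to
let subprocess drain both pipes together.  This is only a sketch under
that assumption, not what NCBIStandalone does; run_and_collect and
DEMO_CHILD are names invented for the example.

```python
import subprocess
import sys


def run_and_collect(args):
    """Run a command and return (returncode, stdout_bytes, stderr_bytes).

    communicate() reads stdout and stderr concurrently, so the child
    cannot stall on one full pipe while the parent waits on the other.
    """
    process = subprocess.Popen(args,
                               stdout=subprocess.PIPE,
                               stderr=subprocess.PIPE)
    out, err = process.communicate()
    return process.returncode, out, err


# Stand-in for blastall: writes more to stdout than a typical pipe
# buffer holds, and only afterwards writes a short note to stderr.
DEMO_CHILD = [sys.executable, "-c",
              "import sys; sys.stdout.write('x' * 200000); "
              "sys.stderr.write('done')"]
```

The trade-off is that communicate() holds the whole output in memory,
which may matter for very large BLAST results; writing the results to a
file is the alternative in that case.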

I see in the stacktrace of a killed process:

    print ''.join(_error_info.readlines())
  File "/usr/lib/python2.5/site-packages/Bio/File.py", line 37, in readlines
    lines = self._saved + self._handle.readlines(*args,**keywds)
KeyboardInterrupt
$

Currently, there is in CVS:

def blastall(blastcmd, program, database, infile, align_view='7', **keywds):
    """blastall(blastcmd, program, database, infile, align_view='7', **keywds)
    -> read, error Undohandles
...
    w, r, e = os.popen3(" ".join([blastcmd] + params))
    w.close()
    return File.UndoHandle(r), File.UndoHandle(e)

I have not yet studied Bio/File.py, but let me say that running just the following
works fine for me:
[...]

Michiel de Hoon | 12 May 04:11 2008

Re: [BioPython] blastall does not flush buffers due to biopython buffering?

Can you show an example script that causes the UndoHandle to block? Just to understand better what is going on.
On a related note, the UndoHandle works by saving all lines that were read. Particularly for large Blast
files, that is not what one would like to do. So if there is no strong reason for returning an UndoHandle, I'd
be in favor of simply returning the handle directly.

--Michiel.

Martin MOKREJŠ <mmokrejs <at> ribosome.natur.cuni.cz> wrote: Hi,
  when I try to use Bio/Blast/NCBIStandalone blast, sometimes the process hangs
and sometimes it works (tested from Unix shell and via Apache mod_python).
I see the blastall process in the list of system processes; attaching strace(1)
to it shows that it did print some of the result output, but somehow it
does not continue to write out the buffers (you know that at the end of blast
output is the summary stats ...;). I believe that is because the consuming
process has not yet read the output already written. Effectively, blastall
gets blocked due to biopython.

I see in the stacktrace of a killed process:

    print ''.join(_error_info.readlines())
  File "/usr/lib/python2.5/site-packages/Bio/File.py", line 37, in readlines
    lines = self._saved + self._handle.readlines(*args,**keywds)
KeyboardInterrupt
$

Currently, there is in CVS:

def blastall(blastcmd, program, database, infile, align_view='7', **keywds):
    """blastall(blastcmd, program, database, infile, align_view='7', **keywds)
    -> read, error Undohandles
[...]

