Aaron Darling | 1 Dec 2006 05:23

Re: Run time for large query files?

Hi Stephen,
The mpiblast behavior you are observing is "normal" and results from 
mpiblast carefully calculating the exact effective search space size for 
every query in the large query set you are using.  Unfortunately, that 
part of mpiblast 1.4.0 is still a serial component of the algorithm 
which takes place prior to the database search.

Others have also complained about how long the initial search space 
calculation takes on AA queries, so I have introduced a bit of code to 
mpiblast.cpp that should provide a fast approximation to the true 
effective search space.  My logic behind doing so is that the error 
introduced by the approximation will in most cases be small enough that 
it shouldn't matter (ask me if you want a more detailed explanation).

The approximation works by taking the effective database size and query 
length adjustment from the first query and applying it to all subsequent 
queries.  This will be more likely to cause e-value inaccuracy if the 
first query is not representative of the remaining queries, i.e. the 
first query is substantially longer, shorter, or contains a 
substantially different amount of low-complexity sequence that would be 
removed by the dust filter.

I have committed the changes to mpiblast.cpp in our CVS repository.  
Follow these instructions to build from the CVS repository:
http://mpiblast.lanl.gov/Docs.FAQ.html#cvs

The changes may take up to a day to sync between the developer 
repository and the public repository.  Note that I have not yet had time 
to test the changes, but they are small, so hopefully they'll "just 
work."  I'll make an effort to do some testing over the weekend...
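
As a rough illustration of the approximation described above, here is a
minimal Python sketch. It is not the code committed to mpiblast.cpp: the
function names, the K and H constants, and the simplified length-adjustment
formula are all assumptions made for the example. The only point is that the
fast path computes the length adjustment and effective database size once,
for the first query, and reuses them for every later query.

```python
import math
from dataclasses import dataclass

# Assumed, illustrative Karlin-Altschul parameters. Real values depend on
# the scoring matrix and gap penalties in use.
K = 0.13
H = 0.40


@dataclass
class SearchSpace:
    length_adjustment: int
    effective_db_length: int


def exact_search_space(query_len, db_len, num_db_seqs):
    """Hypothetical stand-in for the expensive per-query calculation."""
    adj = int(math.log(K * query_len * db_len) / H)
    adj = max(0, min(adj, query_len - 1))  # keep the effective query length >= 1
    eff_db = max(db_len - num_db_seqs * adj, 1)
    return SearchSpace(adj, eff_db)


def effective_search_space(query_len, space):
    """Effective query length times effective database length."""
    return max(query_len - space.length_adjustment, 1) * space.effective_db_length


def search_spaces(query_lens, db_len, num_db_seqs, fast=False):
    """Return one effective search space per query.

    With fast=True, only the first query gets the full calculation and its
    length adjustment and effective database size are reused afterwards.
    """
    first = None
    spaces = []
    for qlen in query_lens:
        if fast and first is not None:
            space = first
        else:
            space = exact_search_space(qlen, db_len, num_db_seqs)
            if first is None:
                first = space
        spaces.append(effective_search_space(qlen, space))
    return spaces
```

Calling search_spaces(lengths, db_len, num_db_seqs, fast=True) gives the same
value as the exact path for the first query, and values that drift further
from the exact ones the more a later query's length differs from the first.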

hope this helps,
-Aaron

elizeu | 1 Dec 2006 19:55

Compiling problems


   Hi community,

   I got the following compilation error when trying to compile
mpiblast-1.4.0 on Debian Sarge Linux. I performed all the steps
as recommended (including the patch step).

/ncbi/lib//libncbi.a(ncbifile.o): In function
`Nlm_TmpNam':ncbifile.c:(.text+0x1a4): warning: the use of `tempnam' is
dangerous, better use `mkstemp'
/ncbi/lib//libncbitool.a(kappa.o): In function
`RedoAlignmentCore':kappa.c:(.text+0x82): undefined reference to
`Blast_FrequencyDataIsAvailable'
:kappa.c:(.text+0x105): undefined reference to `Nlm_Int4MatrixNew'
:kappa.c:(.text+0x236): undefined reference to `BlastCompo_HeapInitialize'
:kappa.c:(.text+0x2e7): undefined reference to `BlastCompo_HeapPop'
:kappa.c:(.text+0x336): undefined reference to `Blast_RedoAlignParamsFree'
:kappa.c:(.text+0x363): undefined reference to `BlastCompo_HeapRelease'
:kappa.c:(.text+0x3c7): undefined reference to `Nlm_Int4MatrixFree'
:kappa.c:(.text+0x3f4): undefined reference to
`Blast_CompositionWorkspaceFree'
:kappa.c:(.text+0x40b): undefined reference to `Blast_ForbiddenRangesRelease'
:kappa.c:(.text+0x448): undefined reference to `BlastCompo_EarlyTermination'
:kappa.c:(.text+0x554): undefined reference to
`Blast_RedoOneMatchSmithWaterman'
:kappa.c:(.text+0x628): undefined reference to `BlastCompo_AlignmentsFree'
:kappa.c:(.text+0x72a): undefined reference to `BlastCompo_HeapWouldInsert'
:kappa.c:(.text+0x873): undefined reference to `BlastCompo_HeapInsert'
:kappa.c:(.text+0x9c0): undefined reference to `Blast_RedoOneMatch'
:kappa.c:(.text+0x9f1): undefined reference to

Aaron Darling | 4 Dec 2006 01:46

Re: Run time for large query files?

An addendum:

The fast e-value approximation is disabled by default in the CVS code.  
To use it, add --fast-evalue-approximation to the mpiblast command line.

I tested it out over the weekend with a blastp between E. coli and yeast 
AA sequences.  In that dataset the e-values were correct within a factor 
of 2, so it may be desirable to adjust the e-value cutoff slightly when 
searching for very weak homology.  For example, a hit given an e-value 
of 8 by NCBI blastall may be given an approximate e-value of 11 by 
mpiblast (using --fast-evalue-approximation), which would be above the 
default e-value cutoff of 10 and thus discarded.
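
To make the cutoff adjustment concrete, the sketch below widens the requested
cutoff by the largest error factor observed and builds an mpiblast argument
list that uses it. The helper names and the file names are hypothetical, and
the factor of 2 comes only from the single E. coli/yeast test above, so it
may not hold for other datasets.

```python
def relaxed_cutoff(target_cutoff, max_error_factor=2.0):
    """Widen an e-value cutoff so that a hit whose approximate e-value is
    inflated by up to max_error_factor is still reported."""
    return target_cutoff * max_error_factor


def mpiblast_args(program, database, query, output, evalue_cutoff, fast=True):
    """Build an mpiblast argument list (to be placed after the mpirun part)."""
    args = [
        "mpiblast",
        "-p", program,
        "-d", database,
        "-i", query,
        "-o", output,
        "-e", f"{evalue_cutoff:g}",
    ]
    if fast:
        args.append("--fast-evalue-approximation")
    return args


# Example with hypothetical file names: keep hits that blastall would have
# reported at the default cutoff of 10.
cutoff = relaxed_cutoff(10)
args = mpiblast_args("blastp", "yeast.fasta", "ecoli.fasta", "results.out", cutoff)
```

With these values the hit from the example above (approximate e-value 11) is
kept, because 11 is below the widened cutoff of 20. The wider cutoff also
lets through some hits whose true e-value is above 10, so downstream
filtering may still be needed.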

-Aaron

Stephen Ficklin wrote:
> Hi, I have a query file with 359,001 ESTs (218MB). I'm running a blastx using mpiBLAST against the
> Swiss-Prot Uniprot database containing 241,242 protein sequences (119MB) in size.   The job on the master
> node has been doing all the work.  The worker nodes just sit idle and after more than 12 hours I still have no
> results and the workers nodes are still doing nothing.
>
> Here's my command-line entry:
> /usr/local/mpich/bin/mpirun -np 12 -machinefile /home/userx/test/mpiblast/machines
> /usr/local/mpiblast/bin/mpiblast -p blastx -i /local/scratch/rosaceae2006_6_14.trim.lib -d
> uniprot_sprot.fasta -e 1e-5 -F F -o /home/userx/test/mpiblast/results.out --debug=/test/mpiblast/debug.out
>
> I'm running this job on a 60 node cluster with each node having dual 64bit AMD Opteron 2.2Ghz processors and
> 2GB of RAM.  The database files, mpiblast binaries and output files all reside on NFS mounted filesystems.
>
> Is mpiblast really just in the "preparation" phase before the workers get busy? Can anyone tell me if I am

elizeu | 4 Dec 2006 15:23

Sorry, mpiBLAST must be run on 3 or more nodes (unexpected)


   Hi all,

    I've managed to install the binary distribution of mpiblast and NCBI
Toolbox.

     However, when I try to execute the following line:

 /usr/bin/mpirun -np 16 /usr/local/bin/mpiblast -d
/local/elizeu/human_genomic.00.nsq -i blast_query.fas -p blastn -o
blast_results.txt

     I got the following error message.

Sorry, mpiBLAST must be run on 3 or more nodes
aborting job:
application called MPI_Abort(MPI_COMM_WORLD, 0) - process 0

     The strange thing is that no matter how many processors I pass
with the -np parameter, I get this message.

      I've tried the same command line with the -t option to see which
procgroup file was being generated, and it is fine.

      Have you experienced a similar situation?

      Thanks in advance,
      Eli


Daniel Xavier de Sousa | 5 Dec 2006 13:59

One doubt...

Hi,

Can anyone tell me whether pioBLAST (from "Efficient Data Access for Parallel BLAST") changes the source code of BLAST?

Thanks
Daniel

_______________________________________________
Mpiblast-users mailing list
Mpiblast-users@...
https://lists.sourceforge.net/lists/listinfo/mpiblast-users

Heshan Lin | 5 Dec 2006 17:38

Re: One doubt...

Hi Daniel,

pioBLAST does not change the core of the NCBI BLAST algorithm. It uses the same code for sequence comparison as mpiBLAST does. With regard to performance, pioBLAST optimizes the input of sequence data with parallel reading and explicit memory caching, and enables concurrent output with parallel I/O through the MPI-IO library. You can refer to the original pioBLAST paper for more information:

http://mpiblast.lanl.gov/downloads/pubs/IPDPS05-pioBLAST.pdf
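
The concurrent-output idea can be illustrated without MPI. The sketch below
is not pioBLAST code and does not use MPI-IO; it is a plain Python stand-in
(the function names are made up for the example) showing the underlying
pattern: each writer learns its byte offset from the sizes of the blocks
before it, then writes its block at that offset independently of the others.

```python
import os


def exclusive_offsets(sizes):
    """Starting byte offset of each block: the sum of all earlier sizes."""
    offsets, total = [], 0
    for size in sizes:
        offsets.append(total)
        total += size
    return offsets


def write_at_offsets(path, chunks):
    """Write every chunk at its own offset in a shared file.

    In a parallel run each (offset, chunk) pair would belong to a different
    process. The pairs are written in reverse order here to show that the
    result does not depend on who writes first. os.pwrite is Unix-only.
    """
    offsets = exclusive_offsets([len(chunk) for chunk in chunks])
    fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)
    try:
        for offset, chunk in reversed(list(zip(offsets, chunks))):
            os.pwrite(fd, chunk, offset)
    finally:
        os.close(fd)
```

Because the offsets never overlap, no writer has to wait for another one to
finish, which is the property that makes concurrent output possible.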

Thanks,

Heshan

Daniel Xavier de Sousa | 5 Dec 2006 19:14

Re: One doubt...

Hi Heshan,

Thanks for your help.

Daniel
