Asad Ali | 1 Jul 01:45 2010

[OMPI users] Parallel Tempering with MPI

Hi all,

I am working on a parallel tempering MCMC code using Open MPI. I am a bit confused about proposing swaps between chains running on different cores.
I know how to propose swaps, but I am not sure where to do it (i.e., how to designate an independent node or core for it). If somebody is or has been working on parallel tempering MCMC with MPI, please help me. An example code would be really helpful.
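
To make the question concrete, here is a rough sketch of the kind of swap step I mean (illustrative only: the temperature ladder, the energies, and the even/odd pairing of neighbouring ranks are placeholders, and the lower rank of each pair draws and shares the random number so both sides reach the same decision):

#include <mpi.h>
#include <cmath>
#include <cstdio>
#include <cstdlib>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double beta   = 1.0 / (1.0 + rank);   // placeholder temperature ladder
    double energy = 10.0 * rank;          // placeholder: current energy of this rank's chain
    srand48(42 + rank);

    for (int round = 0; round < 4; ++round) {
        // even rounds pair (0,1),(2,3),...; odd rounds pair (1,2),(3,4),...
        int partner = (rank % 2 == round % 2) ? rank + 1 : rank - 1;
        if (partner < 0 || partner >= size) continue;

        double theirEnergy, theirBeta;
        MPI_Sendrecv(&energy, 1, MPI_DOUBLE, partner, 0,
                     &theirEnergy, 1, MPI_DOUBLE, partner, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Sendrecv(&beta, 1, MPI_DOUBLE, partner, 1,
                     &theirBeta, 1, MPI_DOUBLE, partner, 1,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        // the lower rank of the pair draws the random number and shares it,
        // so both sides make the same accept/reject decision
        double u;
        if (rank < partner) {
            u = drand48();
            MPI_Send(&u, 1, MPI_DOUBLE, partner, 2, MPI_COMM_WORLD);
        } else {
            MPI_Recv(&u, 1, MPI_DOUBLE, partner, 2, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        }

        double logA = (beta - theirBeta) * (energy - theirEnergy);  // replica-exchange ratio
        if (u < (logA >= 0.0 ? 1.0 : std::exp(logA)))
            beta = theirBeta;   // accepted: the two chains trade temperatures
    }

    printf("rank %d finished with beta = %.3f\n", rank, beta);
    MPI_Finalize();
    return 0;
}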

Cheers,

Asad

On Thu, Jul 1, 2010 at 9:28 AM, Riccardo Murri <riccardo.murri <at> gmail.com> wrote:
Hello,

The FAQ states: "Support for MPI_THREAD_MULTIPLE [...] has been
designed into Open MPI from its first planning meetings.  Support for
MPI_THREAD_MULTIPLE is included in the first version of Open MPI, but
it is only lightly tested and likely still has some bugs."

The man page of "mpirun" from v1.4.3a1r23323 in addition says "Open
MPI is, currently, neither thread-safe nor async-signal-safe" (section
"Process Termination / Signal Handling").

Are these statements up-to-date?  What is the status of
MPI_THREAD_MULTIPLE in OMPI 1.4?
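
For what it is worth, this is the small runtime probe I would use to see which thread level a given build actually grants (nothing Open MPI specific, just the standard MPI_Init_thread handshake):

#include <mpi.h>
#include <cstdio>

int main(int argc, char **argv)
{
    int provided = 0;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE)
        printf("MPI_THREAD_MULTIPLE not granted; provided level = %d\n", provided);
    else
        printf("MPI_THREAD_MULTIPLE granted\n");
    MPI_Finalize();
    return 0;
}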

Thanks in advance for any info!

Cheers,
Riccardo



--
"Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write." - H.G. Wells
Jack Bryan | 1 Jul 06:13 2010

Re: [OMPI users] Open MPI, Segmentation fault

Thanks.

I am not familiar with Open MPI.

Would you please help me with how to ask Open MPI to show where the fault occurs?

The GNU debugger?

Any help is appreciated. 

thanks!!!

Jack 

June 30  2010

Date: Wed, 30 Jun 2010 16:13:09 -0400
From: amjad11 <at> gmail.com
To: users <at> open-mpi.org
Subject: Re: [OMPI users] Open MPI, Segmentation fault

Based on my experience, I would FULLY endorse (100% agree with) David Zhang.
It is usually a coding or typo mistake.

First, ensure that the array sizes and dimensions are correct.

In my experience, if Open MPI is compiled with the GNU compilers (not with Intel), it also points out exactly which subroutine the fault occurs in. Give it a try.

best,
AA

 

On Wed, Jun 30, 2010 at 12:43 PM, David Zhang <solarbikedz <at> gmail.com> wrote:
When I have gotten segmentation faults, they have always been my own coding mistakes.  Perhaps your code is not robust when the number of processes is not divisible by 2?

On Wed, Jun 30, 2010 at 8:47 AM, Jack Bryan <dtustudy68 <at> hotmail.com> wrote:
Dear All,

I am using Open MPI, I got the error: 

n337:37664] *** Process received signal ***
[n337:37664] Signal: Segmentation fault (11)
[n337:37664] Signal code: Address not mapped (1)
[n337:37664] Failing at address: 0x7fffcfe90000
[n337:37664] [ 0] /lib64/libpthread.so.0 [0x3c50e0e4c0]
[n337:37664] [ 1] /lustre/home/rhascheduler/RhaScheduler-0.4.1.1/mytest/nmn2 [0x414ed7]
[n337:37664] [ 2] /lib64/libc.so.6(__libc_start_main+0xf4) [0x3c5021d974]
[n337:37664] [ 3] /lustre/home/rhascheduler/RhaScheduler-0.4.1.1/mytest/nmn2(__gxx_personality_v0+0x1f1) [0x412139]
[n337:37664] *** End of error message ***

After searching for answers, it seems that some functions fail.

My program runs well with 1, 2, or 10 processors, but fails when the number of tasks cannot be divided evenly by the number of processes.

Any help is appreciated. 

thanks

Jack

June 30  2010






--
David Zhang
University of California, San Diego



Asad Ali | 1 Jul 09:17 2010

Re: [OMPI users] Open MPI, Segmentation fault

Hi Jack,

Debugging Open MPI with traditional debuggers is a pain.
From your error message it sounds like you have a memory allocation problem. Do you use dynamic memory allocation (allocate and then free)?

I use a display (printf()) statement together with the MPI rank (from MPI_Comm_rank). It tells me which process is giving the segmentation fault.
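
A minimal sketch of what I mean (illustrative only; scatter such calls near the suspect code):

#include <mpi.h>
#include <cstdio>

// print which rank reached a given point; flush so the line survives a crash right after
void checkpoint(const char *label)
{
    int rank = -1;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    printf("[rank %d] reached %s\n", rank, label);
    fflush(stdout);
}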

Cheers,

Asad




--
"Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write." - H.G. Wells
Jeff Squyres | 1 Jul 13:03 2010

Re: [OMPI users] Open MPI, Segmentation fault

Also see http://www.open-mpi.org/faq/?category=debugging.


-- 
Jeff Squyres
jsquyres <at> cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
amjad ali | 1 Jul 15:51 2010

Re: [OMPI users] Open MPI, Segmentation fault

From the start of your program, after a certain activity (say, after 10 lines), use a print statement followed by STOP/EXIT, also printing the processor rank.

If you get output from all the processors, then that part is fine. Move the print a little further ahead and run again. Repeat this process until you reach the place of the fault. You still need to guess/observe what the error in the code is; the print statements only get you into the vicinity/segment of the error. What the error actually is, you need to observe yourself.
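
A rough illustration of the idea (the helper name is made up; every rank must reach the call, and you keep moving it forward until the crash happens before it):

#include <mpi.h>
#include <cstdio>
#include <cstdlib>

// print the rank, then stop cleanly; move this call forward through the code to bisect
void stop_here(const char *label)
{
    int rank = -1;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    printf("[rank %d] got past %s\n", rank, label);
    fflush(stdout);
    MPI_Barrier(MPI_COMM_WORLD);   // wait until every rank has reported
    MPI_Finalize();
    exit(0);
}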

Gus Correa | 1 Jul 17:44 2010

Re: [OMPI users] Open MPI, Segmentation fault

Hello Jack, list

As others mentioned, this may be a problem with dynamic
memory allocation.
It could also be a violation of statically allocated memory,
I guess.

You say:

> My program can run well for 1,2,10 processors, but fail when the
> number of tasks cannot
> be divided evenly by number of processes.

Oftentimes, when the number of "tasks" (or the global problem size)
does not divide evenly by the number of "processors", one processor
gets a lighter/heavier workload than the others; it also allocates
less/more memory than the others, and it accesses smaller/larger
arrays than the others.

In general, integer division and remainder/modulo calculations are
used to control the memory allocation, the array sizes, etc., on the
different processors.  These formulas tend to use the MPI communicator
size (i.e., effectively the number of processors if you are using
MPI_COMM_WORLD) to split the workload across the processors.

I would search for the lines of code where those calculations are done,
and where the arrays are allocated and accessed, to make sure the
algorithm works both when the arrays are of the same size
(even workload across the processors)
and when they are of different sizes
(uneven workload across the processors).
You may be violating memory access by only a few bytes, due to a small
mistake in one of those integer division / remainder/modulo formulas,
perhaps where an array index upper or lower bound is calculated.
It happened to me before, probably to others too.

This type of code inspection can be done without a debugger,
or before you get to the debugger phase.
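
Just to illustrate the kind of integer division / remainder formulas
I mean (the function and variable names below are made up, not taken
from your code), a minimal sketch:

// returns [*start, *start + *count): the block of global task indices owned by 'rank'
void my_block(int ntasks, int rank, int size, int *start, int *count)
{
    int base = ntasks / size;            // every rank gets at least this many
    int rem  = ntasks % size;            // the first 'rem' ranks get one extra
    *count = base + (rank < rem ? 1 : 0);
    *start = rank * base + (rank < rem ? rank : rem);
    // local arrays must be allocated with *count, not with base:
    // using base everywhere overruns the arrays on the ranks that carry an extra task
}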

I hope this helps,
Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------

Jack Bryan | 1 Jul 18:09 2010

Re: [OMPI users] Open MPI, Segmentation fault

Thanks for all your replies.

I want to do master-worker asynchronous communication.

The master needs to distribute tasks to workers and then collect results from them.

master :

world.irecv(resultSourceRank, upStreamTaskTag, myResultTaskPackage[iRank][taskCounterT3]);

I got the error "MPI_ERR_TRUNCATE", because I had declared "TaskPackage myResultTaskPackage".

It seems that the 2-dimensional array cannot be used to receive my user-defined
class package from a worker, which sends a TaskPackage to the master.

So I changed it to a 2-D int array to receive the result, and it works well.

But I still want to find out how to store the results in a data structure of type TaskPackage,
because int data can only carry integers, which is too limited.

What I want to do is:

The master can store the results from each worker and then combine them
to form the final result after collecting all the results from the workers.

But if the master has a number of tasks that cannot be divided evenly by the number of workers,
each worker may get a different number of tasks.

Suppose we have 11 tasks and 3 workers.

aveTaskNumPerNode = (11 - 11%3) / 3 = 3
leftTaskNum = 11%3 = 2 = Z

The master distributes the leftover tasks, one each, to workers 1 through Z (Z < totalNumWorkers).

For example: worker 1: 4 tasks, worker 2: 4 tasks, worker 3: 3 tasks.

The master tries to distribute the tasks evenly so that the difference between
the workers' workloads is minimized.

I am going to use a vector of vectors for the dynamic data storage.

It is a 2-dimensional data structure that can store the results from the workers.

Each row of the data structure can have a different number of columns.

It can be indexed by iterators so that I can find a specified worker's task result
by searching the data structure.

For example,
               column 1            column 2
 row 1   (worker1.task1)     (worker1.task4)
 row 2   (worker2.task2)     (worker2.task5)
 row 3   (worker3.task3)

The data structure should remember the worker ID and the task ID,
so that the master knows which task came from which worker.
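
Something like the following minimal sketch is what I have in mind (TaskResult here is just a stand-in for whatever TaskPackage carries; workers are assumed to be ranks 1..N):

#include <vector>

struct TaskResult {
    int workerId;
    int taskId;
    // ... the actual TaskPackage payload would go here ...
};

// one row per worker; rows may have different lengths (uneven task split)
std::vector< std::vector<TaskResult> > resultsByWorker;

void store_result(int workerRank, const TaskResult &r)
{
    if (resultsByWorker.size() < (size_t)workerRank)
        resultsByWorker.resize(workerRank);         // workers are ranks 1..N
    resultsByWorker[workerRank - 1].push_back(r);   // row = worker, column = arrival order
}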

Any help or comments are appreciated.

thanks

Jack

June 30   2010



David Zhang | 1 Jul 18:54 2010

Re: [OMPI users] Open MPI, Segmentation fault


You can do that with MPI; all you have to do is declare your own MPI type.  The command is MPI_Type_struct (MPI_Type_create_struct in MPI-2).  I haven't done this personally, but if you go to this link, it'll show you how.  It is also a very good MPI reference; I visit it often to look up MPI commands and syntax.
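
Roughly, the registration looks like this sketch (the struct and its fields are made up for illustration; a real TaskPackage would list its own members and build the type the same way):

#include <mpi.h>

struct TaskPackage {
    int    taskId;
    double value[4];
};

MPI_Datatype make_taskpackage_type()
{
    TaskPackage sample;
    int          blocklens[2] = { 1, 4 };
    MPI_Datatype types[2]     = { MPI_INT, MPI_DOUBLE };
    MPI_Aint     base, disps[2];

    MPI_Get_address(&sample,          &base);
    MPI_Get_address(&sample.taskId,   &disps[0]);
    MPI_Get_address(&sample.value[0], &disps[1]);
    disps[0] -= base;                 // displacements relative to the start of the struct
    disps[1] -= base;

    MPI_Datatype newtype;
    MPI_Type_create_struct(2, blocklens, disps, types, &newtype);
    MPI_Type_commit(&newtype);
    return newtype;                   // usable in MPI_Send/MPI_Recv for TaskPackage buffers
}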





--
David Zhang
University of California, San Diego
Eloi Gaudry | 2 Jul 11:06 2010

[OMPI users] [openib] segfault when using openib btl

Hi,

I'm observing a random segmentation fault during an internode parallel computation involving the openib btl and OpenMPI-1.4.2 (the same issue can be observed with OpenMPI-1.3.3):
  mpirun (Open MPI) 1.4.2
  Report bugs to http://www.open-mpi.org/community/help/
  [pbn08:02624] *** Process received signal ***
  [pbn08:02624] Signal: Segmentation fault (11)
  [pbn08:02624] Signal code: Address not mapped (1)
  [pbn08:02624] Failing at address: (nil)
  [pbn08:02624] [ 0] /lib64/libpthread.so.0 [0x349540e4c0]
  [pbn08:02624] *** End of error message ***
  sh: line 1:  2624 Segmentation fault      \/share\/hpc3\/actran_suite\/Actran_11\.0\.rc2\.41872\/RedHatEL\-5\/x86_64\/bin\/actranpy_mp '--apl=/share/hpc3/actran_suite/Actran_11.0.rc2.41872/RedHatEL-5/x86_64/Actran_11.0.rc2.41872' '--inputfile=/work/st25652/LSF_130073_0_47696_0/Case1_3Dreal_m4_n2.dat' '--scratch=/scratch/st25652/LSF_130073_0_47696_0/scratch' '--mem=3200' '--threads=1' '--errorlevel=FATAL' '--t_max=0.1' '--parallel=domain'

If I choose not to use the openib btl (by using --mca btl self,sm,tcp on the command line, for instance), I don't encounter any problem and the parallel computation runs flawlessly.

I would like to get some help to be able:
- to diagnose the issue I'm facing with the openib btl
- to understand why this issue is observed only when using the openib btl and not when using self,sm,tcp

Any help would be very much appreciated.

The output of ompi_info and the Open MPI configure logs are attached to this email, as well as some information on the InfiniBand drivers.

Here is the command line used when launching a parallel computation using infiniband:
  path_to_openmpi/bin/mpirun -np $NPROCESS --hostfile host.list --mca btl openib,sm,self,tcp  --display-map --verbose --version --mca mpi_warn_on_fork 0 --mca btl_openib_want_fork_support 0 [...]
and the command line used if not using infiniband:
  path_to_openmpi/bin/mpirun -np $NPROCESS --hostfile host.list --mca btl self,sm,tcp  --display-map --verbose --version --mca mpi_warn_on_fork 0 --mca btl_openib_want_fork_support 0 [...]

Thanks,
Eloi






--
Eloi Gaudry

Free Field Technologies
Axis Park Louvain-la-Neuve
Rue Emile Francqui, 1
B-1435 Mont-Saint Guibert
BELGIUM

Company Phone: +32 10 487 959
Company Fax: +32 10 454 626

                 MCA btl: parameter "btl_base_verbose" (current value: "0", data source: default value)
                          Verbosity level of the BTL framework
                 MCA btl: parameter "btl" (current value: <none>, data source: default value)
                          Default selection set of components for the btl framework (<none> means use all components that can be found)
                 MCA btl: parameter "btl_openib_verbose" (current value: "0", data source: default value)
                          Output some verbose OpenIB BTL information (0 = no output, nonzero = output)
                 MCA btl: parameter "btl_openib_warn_no_device_params_found" (current value: "1", data source:
default value, synonyms: btl_openib_warn_no_hca_params_found)
                          Warn when no device-specific parameters are found in the INI file specified by the
btl_openib_device_param_files MCA parameter (0 = do not warn; any other
                          value = warn)
                 MCA btl: parameter "btl_openib_warn_no_hca_params_found" (current value: "1", data source: default
value, deprecated, synonym of:
                          btl_openib_warn_no_device_params_found)
                          Warn when no device-specific parameters are found in the INI file specified by the
btl_openib_device_param_files MCA parameter (0 = do not warn; any other
                          value = warn)
                 MCA btl: parameter "btl_openib_warn_default_gid_prefix" (current value: "1", data source: default value)
                          Warn when there is more than one active ports and at least one of them connected to the network with only
default GID prefix configured (0 = do not warn;
                          any other value = warn)
                 MCA btl: parameter "btl_openib_warn_nonexistent_if" (current value: "1", data source: default value)
                          Warn if non-existent devices and/or ports are specified in the btl_openib_if_[in|ex]clude MCA
parameters (0 = do not warn; any other value = warn)
                 MCA btl: parameter "btl_openib_want_fork_support" (current value: "-1", data source: default value)
                          Whether fork support is desired or not (negative = try to enable fork support, but continue even if it is not
available, 0 = do not enable fork support,
                          positive = try to enable fork support and fail if it is not available)
                 MCA btl: parameter "btl_openib_device_param_files" (current value:
"/softs/appli/openmpi/openmpi-1.4.2/share/openmpi/mca-btl-openib-device-params.ini", data source:
                          default value, synonyms: btl_openib_hca_param_files)
                          Colon-delimited list of INI-style files that contain device vendor/part-specific parameters
                 MCA btl: parameter "btl_openib_hca_param_files" (current value:
"/softs/appli/openmpi/openmpi-1.4.2/share/openmpi/mca-btl-openib-device-params.ini", data source:
                          default value, deprecated, synonym of: btl_openib_device_param_files)
                          Colon-delimited list of INI-style files that contain device vendor/part-specific parameters
                 MCA btl: parameter "btl_openib_device_type" (current value: "all", data source: default value)
                          Specify to only use IB or iWARP network adapters (infiniband = only use InfiniBand HCAs; iwarp = only use
iWARP NICs; all = use any available adapters)
                 MCA btl: parameter "btl_openib_max_btls" (current value: "-1", data source: default value)
                          Maximum number of device ports to use (-1 = use all available, otherwise must be >= 1)
                 MCA btl: parameter "btl_openib_free_list_num" (current value: "8", data source: default value)
                          Intial size of free lists (must be >= 1)
                 MCA btl: parameter "btl_openib_free_list_max" (current value: "-1", data source: default value)
                          Maximum size of free lists (-1 = infinite, otherwise must be >= 0)
                 MCA btl: parameter "btl_openib_free_list_inc" (current value: "32", data source: default value)
                          Increment size of free lists (must be >= 1)
                 MCA btl: parameter "btl_openib_mpool" (current value: "rdma", data source: default value)
                          Name of the memory pool to be used (it is unlikely that you will ever want to change this
                 MCA btl: parameter "btl_openib_reg_mru_len" (current value: "16", data source: default value)
                          Length of the registration cache most recently used list (must be >= 1)
                 MCA btl: parameter "btl_openib_cq_size" (current value: "1000", data source: default value, synonyms: btl_openib_ib_cq_size)
                          Size of the OpenFabrics completion queue (will automatically be set to a minimum of (2 * number_of_peers * btl_openib_rd_num))
                 MCA btl: parameter "btl_openib_ib_cq_size" (current value: "1000", data source: default value,
deprecated, synonym of: btl_openib_cq_size)
                          Size of the OpenFabrics completion queue (will automatically be set to a minimum of (2 * number_of_peers * btl_openib_rd_num))
                 MCA btl: parameter "btl_openib_max_inline_data" (current value: "-1", data source: default value,
synonyms: btl_openib_ib_max_inline_data)
                          Maximum size of inline data segment (-1 = run-time probe to discover max value, otherwise must be >= 0). If
not explicitly set, use max_inline_data from the
                          INI file containing device-specific parameters
                 MCA btl: parameter "btl_openib_ib_max_inline_data" (current value: "-1", data source: default value,
deprecated, synonym of: btl_openib_max_inline_data)
                          Maximum size of inline data segment (-1 = run-time probe to discover max value, otherwise must be >= 0). If
not explicitly set, use max_inline_data from the
                          INI file containing device-specific parameters
                 MCA btl: parameter "btl_openib_pkey" (current value: "0", data source: default value, synonyms: btl_openib_ib_pkey_val)
                          OpenFabrics partition key (pkey) value. Unsigned integer decimal or hex values are allowed (e.g., "3" or
"0x3f") and will be masked against the maximum
                          allowable IB paritition key value (0x7fff)
                 MCA btl: parameter "btl_openib_ib_pkey_val" (current value: "0", data source: default value,
deprecated, synonym of: btl_openib_pkey)
                          OpenFabrics partition key (pkey) value. Unsigned integer decimal or hex values are allowed (e.g., "3" or
"0x3f") and will be masked against the maximum
                          allowable IB paritition key value (0x7fff)
                 MCA btl: parameter "btl_openib_psn" (current value: "0", data source: default value, synonyms: btl_openib_ib_psn)
                          OpenFabrics packet sequence starting number (must be >= 0)
                 MCA btl: parameter "btl_openib_ib_psn" (current value: "0", data source: default value, deprecated,
synonym of: btl_openib_psn)
                          OpenFabrics packet sequence starting number (must be >= 0)
                 MCA btl: parameter "btl_openib_ib_qp_ous_rd_atom" (current value: "4", data source: default value)
                          InfiniBand outstanding atomic reads (must be >= 0)
                 MCA btl: parameter "btl_openib_mtu" (current value: "3", data source: default value, synonyms: btl_openib_ib_mtu)
                          OpenFabrics MTU, in bytes (if not specified in INI files).  Valid values are: 1=256 bytes, 2=512 bytes,
3=1024 bytes, 4=2048 bytes, 5=4096 bytes
                 MCA btl: parameter "btl_openib_ib_mtu" (current value: "3", data source: default value, deprecated,
synonym of: btl_openib_mtu)
                          OpenFabrics MTU, in bytes (if not specified in INI files).  Valid values are: 1=256 bytes, 2=512 bytes,
3=1024 bytes, 4=2048 bytes, 5=4096 bytes
                 MCA btl: parameter "btl_openib_ib_min_rnr_timer" (current value: "25", data source: default value)
                          InfiniBand minimum "receiver not ready" timer, in seconds (must be >= 0 and <= 31)
                 MCA btl: parameter "btl_openib_ib_timeout" (current value: "20", data source: default value)
                          InfiniBand transmit timeout, plugged into formula: 4.096 microseconds *
(2^btl_openib_ib_timeout)(must be >= 0 and <= 31)
                 MCA btl: parameter "btl_openib_ib_retry_count" (current value: "7", data source: default value)
                          InfiniBand transmit retry count (must be >= 0 and <= 7)
                 MCA btl: parameter "btl_openib_ib_rnr_retry" (current value: "7", data source: default value)
                          InfiniBand "receiver not ready" retry count; applies *only* to SRQ/XRC queues.  PP queues use RNR retry
values of 0 because Open MPI performs software flow
                          control to guarantee that RNRs never occur (must be >= 0 and <= 7; 7 = "infinite")
                 MCA btl: parameter "btl_openib_ib_max_rdma_dst_ops" (current value: "4", data source: default value)
                          InfiniBand maximum pending RDMA destination operations (must be >= 0)
                 MCA btl: parameter "btl_openib_ib_service_level" (current value: "0", data source: default value)
                          InfiniBand service level (must be >= 0 and <= 15)
                 MCA btl: parameter "btl_openib_use_eager_rdma" (current value: "-1", data source: default value)
                          Use RDMA for eager messages (-1 = use device default, 0 = do not use eager RDMA, 1 = use eager RDMA)
                 MCA btl: parameter "btl_openib_eager_rdma_threshold" (current value: "16", data source: default value)
                          Use RDMA for short messages after this number of messages are received from a given peer (must be >= 1)
                 MCA btl: parameter "btl_openib_max_eager_rdma" (current value: "16", data source: default value)
                          Maximum number of peers allowed to use RDMA for short messages (RDMA is used for all long messages, except if
explicitly disabled, such as with the "dr"
                          pml) (must be >= 0)
                 MCA btl: parameter "btl_openib_eager_rdma_num" (current value: "16", data source: default value)
                          Number of RDMA buffers to allocate for small messages(must be >= 1)
                 MCA btl: parameter "btl_openib_btls_per_lid" (current value: "1", data source: default value)
                          Number of BTLs to create for each InfiniBand LID (must be >= 1)
                 MCA btl: parameter "btl_openib_max_lmc" (current value: "0", data source: default value)
                          Maximum number of LIDs to use for each device port (must be >= 0, where 0 = use all available)
                 MCA btl: parameter "btl_openib_enable_apm_over_lmc" (current value: "0", data source: default value)
                          Maximum number of alterative paths for each device port (must be >= -1, where 0 = disable apm, -1 = all
availible alternative paths )
                 MCA btl: parameter "btl_openib_enable_apm_over_ports" (current value: "0", data source: default value)
                          Enable alterative path migration (APM) over different ports of the same device (must be >= 0, where 0 =
disable APM over ports , 1 = enable APM over ports
                          of the same device)
                 MCA btl: parameter "btl_openib_use_async_event_thread" (current value: "1", data source: default value)
                          If nonzero, use the thread that will handle InfiniBand asyncihronous events
                 MCA btl: parameter "btl_openib_buffer_alignment" (current value: "64", data source: default value)
                          Prefered communication buffer alignment, in bytes (must be > 0 and power of two)
                 MCA btl: parameter "btl_openib_use_message_coalescing" (current value: "1", data source: default value)
                          Use message coalescing
                 MCA btl: parameter "btl_openib_cq_poll_ratio" (current value: "100", data source: default value)
                          how often poll high priority CQ versus low priority CQ
                 MCA btl: parameter "btl_openib_eager_rdma_poll_ratio" (current value: "100", data source: default value)
                          how often poll eager RDMA channel versus CQ
                 MCA btl: parameter "btl_openib_hp_cq_poll_per_progress" (current value: "10", data source: default value)
                          max number of completion events to process for each call of BTL progress engine
                 MCA btl: information "btl_openib_have_fork_support" (value: "1", data source: default value)
                          Whether the OpenFabrics stack supports applications that invoke the "fork()" system call or not (0 = no, 1 =
yes).  Note that this value does NOT indicate
                          whether the system being run on supports "fork()" with OpenFabrics applications or not.
                 MCA btl: parameter "btl_openib_exclusivity" (current value: "1024", data source: default value)
                          BTL exclusivity (must be >= 0)
                 MCA btl: parameter "btl_openib_flags" (current value: "310", data source: default value)
                          BTL bit flags (general flags: SEND=1, PUT=2, GET=4, SEND_INPLACE=8, RDMA_MATCHED=64,
HETEROGENEOUS_RDMA=256; flags only used by the "dr" PML (ignored by
                          others): ACK=16, CHECKSUM=32, RDMA_COMPLETION=128)
                 MCA btl: parameter "btl_openib_rndv_eager_limit" (current value: "12288", data source: default value)
                          Size (in bytes) of "phase 1" fragment sent for all large messages (must be >= 0 and <= eager_limit)
                 MCA btl: parameter "btl_openib_eager_limit" (current value: "12288", data source: default value)
                          Maximum size (in bytes) of "short" messages (must be >= 1).
                 MCA btl: parameter "btl_openib_max_send_size" (current value: "65536", data source: default value)
                          Maximum size (in bytes) of a single "phase 2" fragment of a long message when using the pipeline protocol
(must be >= 1)
                 MCA btl: parameter "btl_openib_rdma_pipeline_send_length" (current value: "1048576", data source:
default value)
                          Length of the "phase 2" portion of a large message (in bytes) when using the pipeline protocol.  This part of
the message will be split into fragments of
                          size max_send_size and sent using send/receive semantics (must be >= 0; only relevant when the PUT flag is set)
                 MCA btl: parameter "btl_openib_rdma_pipeline_frag_size" (current value: "1048576", data source:
default value)
                          Maximum size (in bytes) of a single "phase 3" fragment from a long message when using the pipeline protocol. 
These fragments will be sent using RDMA
                          semantics (must be >= 1; only relevant when the PUT flag is set)
                 MCA btl: parameter "btl_openib_min_rdma_pipeline_size" (current value: "262144", data source:
default value)
                          Messages smaller than this size (in bytes) will not use the RDMA pipeline protocol.  Instead, they will be
split into fragments of max_send_size and sent
                          using send/receive semantics (must be >=0, and is automatically adjusted up to at least
(eager_limit+btl_rdma_pipeline_send_length); only relevant when the
                          PUT flag is set)
                 MCA btl: parameter "btl_openib_bandwidth" (current value: "800", data source: default value)
                          Approximate maximum bandwidth of interconnect(must be >= 1)
                 MCA btl: parameter "btl_openib_latency" (current value: "10", data source: default value)
                          Approximate latency of interconnect (must be >= 0)
                 MCA btl: parameter "btl_openib_receive_queues" (current value:
"P,128,256,192,128:S,2048,256,128,32:S,12288,256,128,32:S,65536,256,128,32", data source: default
                          value)
                          Colon-delimited, comma delimited list of receive queues: P,4096,8,6,4:P,32768,8,6,4
                 MCA btl: parameter "btl_openib_if_include" (current value: <none>, data source: default value)
                          Comma-delimited list of devices/ports to be used (e.g. "mthca0,mthca1:2"; empty value means to use all
ports found).  Mutually exclusive with
                          btl_openib_if_exclude.
                 MCA btl: parameter "btl_openib_if_exclude" (current value: <none>, data source: default value)
                          Comma-delimited list of device/ports to be excluded (empty value means to not exclude any ports). 
Mutually exclusive with btl_openib_if_include.
                 MCA btl: parameter "btl_openib_ipaddr_include" (current value: <none>, data source: default value)
                          Comma-delimited list of IP Addresses to be used (e.g. "192.168.1.0/24").  Mutually exclusive with btl_openib_ipaddr_exclude.
                 MCA btl: parameter "btl_openib_ipaddr_exclude" (current value: <none>, data source: default value)
                          Comma-delimited list of IP Addresses to be excluded (e.g. "192.168.1.0/24").  Mutually exclusive with btl_openib_ipaddr_include.
                 MCA btl: parameter "btl_openib_cpc_include" (current value: <none>, data source: default value)
                          Method used to select OpenFabrics connections (valid values: oob,xoob,rdmacm)
                 MCA btl: parameter "btl_openib_cpc_exclude" (current value: <none>, data source: default value)
                          Method used to exclude OpenFabrics connections (valid values: oob,xoob,rdmacm)
                 MCA btl: parameter "btl_openib_connect_oob_priority" (current value: "50", data source: default value)
                          The selection method priority for oob
                 MCA btl: parameter "btl_openib_connect_xoob_priority" (current value: "60", data source: default value)
                          The selection method priority for xoob
                 MCA btl: parameter "btl_openib_connect_rdmacm_priority" (current value: "30", data source: default value)
                          The selection method priority for rdma_cm
                 MCA btl: parameter "btl_openib_connect_rdmacm_port" (current value: "0", data source: default value)
                          The selection method port for rdma_cm
                 MCA btl: parameter "btl_openib_connect_rdmacm_resolve_timeout" (current value: "1000", data
source: default value)
                          The timeout (in miliseconds) for address and route resolution
                 MCA btl: parameter "btl_openib_connect_rdmacm_retry_count" (current value: "20", data source:
default value)
                          Maximum number of times rdmacm will retry route resolution
                 MCA btl: parameter "btl_openib_connect_rdmacm_reject_causes_connect_error" (current value: "0",
data source: default value)
                          The drivers for some devices are buggy such that an RDMA REJECT action may result in a CONNECT_ERROR event
instead of a REJECTED event.  Setting this MCA
                          parameter to true tells Open MPI to treat CONNECT_ERROR events on connections where a REJECT is expected as
a REJECT (default: false)
                 MCA btl: parameter "btl_openib_priority" (current value: "0", data source: default value)
                 MCA btl: parameter "btl_base_warn_component_unused" (current value: "1", data source: default value)
                          This parameter is used to turn on warning messages when certain NICs are not used
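
For reference, any of the btl_openib parameters listed above can be overridden at run time without rebuilding Open MPI. A minimal sketch of such an invocation follows; the BTL selection, HCA port, values and application name are only illustrative and should be adapted to the local fabric:

    # restrict to the openib, sm and self BTLs, pin openib to one HCA port,
    # and raise the eager limit (example values only)
    mpirun --mca btl openib,sm,self \
           --mca btl_openib_if_include mthca0:1 \
           --mca btl_openib_eager_limit 32768 \
           -np 16 ./my_mpi_app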

                 Package: Open MPI opnsrc <at> cgidev Distribution
                Open MPI: 1.4.2
   Open MPI SVN revision: r23093
   Open MPI release date: May 04, 2010
                Open RTE: 1.4.2
   Open RTE SVN revision: r23093
   Open RTE release date: May 04, 2010
                    OPAL: 1.4.2
       OPAL SVN revision: r23093
       OPAL release date: May 04, 2010
            Ident string: 1.4.2
                  Prefix: /home/OPNSRC/OPENMPI-1.4.2
 Configured architecture: x86_64-unknown-linux-gnu
          Configure host: cgidev
           Configured by: opnsrc
           Configured on: Wed May 19 10:59:51 CEST 2010
          Configure host: cgidev
                Built by: opnsrc
                Built on: Wed May 19 13:38:05 CEST 2010
              Built host: cgidev
              C bindings: yes
            C++ bindings: yes
      Fortran77 bindings: yes (all)
      Fortran90 bindings: yes
 Fortran90 bindings size: small
              C compiler: /usr/bin/gcc
     C compiler absolute: 
            C++ compiler: g++
   C++ compiler absolute: /usr/bin/g++
      Fortran77 compiler: /usr/bin/gfortran
  Fortran77 compiler abs: 
      Fortran90 compiler: /usr/bin/gfortran
  Fortran90 compiler abs: 
             C profiling: yes
           C++ profiling: yes
     Fortran77 profiling: yes
     Fortran90 profiling: yes
          C++ exceptions: yes
          Thread support: posix (mpi: no, progress: no)
           Sparse Groups: no
  Internal debug support: no
     MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
         libltdl support: yes
   Heterogeneous support: no
 mpirun default --prefix: no
         MPI I/O support: yes
       MPI_WTIME support: gettimeofday
Symbol visibility support: yes
   FT Checkpoint support: no  (checkpoint thread: no)
           MCA backtrace: execinfo (MCA v2.0, API v2.0, Component v1.4.2)
              MCA memory: ptmalloc2 (MCA v2.0, API v2.0, Component v1.4.2)
           MCA paffinity: linux (MCA v2.0, API v2.0, Component v1.4.2)
               MCA carto: auto_detect (MCA v2.0, API v2.0, Component v1.4.2)
               MCA carto: file (MCA v2.0, API v2.0, Component v1.4.2)
           MCA maffinity: first_use (MCA v2.0, API v2.0, Component v1.4.2)
           MCA maffinity: libnuma (MCA v2.0, API v2.0, Component v1.4.2)
               MCA timer: linux (MCA v2.0, API v2.0, Component v1.4.2)
         MCA installdirs: env (MCA v2.0, API v2.0, Component v1.4.2)
         MCA installdirs: config (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA dpm: orte (MCA v2.0, API v2.0, Component v1.4.2)
              MCA pubsub: orte (MCA v2.0, API v2.0, Component v1.4.2)
           MCA allocator: basic (MCA v2.0, API v2.0, Component v1.4.2)
           MCA allocator: bucket (MCA v2.0, API v2.0, Component v1.4.2)
                MCA coll: basic (MCA v2.0, API v2.0, Component v1.4.2)
                MCA coll: hierarch (MCA v2.0, API v2.0, Component v1.4.2)
                MCA coll: inter (MCA v2.0, API v2.0, Component v1.4.2)
                MCA coll: self (MCA v2.0, API v2.0, Component v1.4.2)
                MCA coll: sm (MCA v2.0, API v2.0, Component v1.4.2)
                MCA coll: sync (MCA v2.0, API v2.0, Component v1.4.2)
                MCA coll: tuned (MCA v2.0, API v2.0, Component v1.4.2)
                  MCA io: romio (MCA v2.0, API v2.0, Component v1.4.2)
               MCA mpool: fake (MCA v2.0, API v2.0, Component v1.4.2)
               MCA mpool: rdma (MCA v2.0, API v2.0, Component v1.4.2)
               MCA mpool: sm (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA pml: cm (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA pml: csum (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA pml: ob1 (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA pml: v (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA bml: r2 (MCA v2.0, API v2.0, Component v1.4.2)
              MCA rcache: vma (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA btl: ofud (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA btl: openib (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA btl: self (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA btl: sm (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA btl: tcp (MCA v2.0, API v2.0, Component v1.4.2)
                MCA topo: unity (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA osc: pt2pt (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA osc: rdma (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA iof: hnp (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA iof: orted (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA iof: tool (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA oob: tcp (MCA v2.0, API v2.0, Component v1.4.2)
                MCA odls: default (MCA v2.0, API v2.0, Component v1.4.2)
                MCA rmaps: load_balance (MCA v2.0, API v2.0, Component v1.4.2)
               MCA rmaps: rank_file (MCA v2.0, API v2.0, Component v1.4.2)
               MCA rmaps: round_robin (MCA v2.0, API v2.0, Component v1.4.2)
               MCA rmaps: seq (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA rml: oob (MCA v2.0, API v2.0, Component v1.4.2)
              MCA routed: binomial (MCA v2.0, API v2.0, Component v1.4.2)
              MCA routed: direct (MCA v2.0, API v2.0, Component v1.4.2)
              MCA routed: linear (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA plm: rsh (MCA v2.0, API v2.0, Component v1.4.2)
               MCA filem: rsh (MCA v2.0, API v2.0, Component v1.4.2)
              MCA errmgr: default (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA ess: env (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA ess: hnp (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA ess: singleton (MCA v2.0, API v2.0, Component v1.4.2)
                 MCA ess: tool (MCA v2.0, API v2.0, Component v1.4.2)
             MCA grpcomm: bad (MCA v2.0, API v2.0, Component v1.4.2)
             MCA grpcomm: basic (MCA v2.0, API v2.0, Component v1.4.2)

./configure --prefix=/softs/appli/openmpi/openmpi-1.4.2 --enable-cxx-exceptions --with-pic --with-threads --with-openib=/usr --without-slurm
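
As a side note, build and parameter information of this kind can usually be regenerated on the installed copy with ompi_info (assuming the installation's bin directory is on the PATH):

    # full dump of build information and all MCA parameters
    ompi_info --all
    # only the openib BTL parameters
    ompi_info --param btl openib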

[root <at> psto ~]# lsmod | grep ib
zlib_deflate           52825  1 deflate
ib_ucm                 50312  0
ib_sdp                141788  0
rdma_cm                68756  3 rdma_ucm,rds,ib_sdp
ib_addr                41992  1 rdma_cm
ib_ipoib              113880  0
ipoib_helper           35728  2 ib_ipoib
ib_cm                  73000  3 ib_ucm,rdma_cm,ib_ipoib
ib_sa                  75016  3 rdma_cm,ib_ipoib,ib_cm
ipv6                  424609  66 ipcomp6,ah6,esp6,xfrm6_esp,xfrm6_mode_transport,xfrm6_tunnel,tunnel6,ib_ipoib
ib_uverbs              75824  2 rdma_ucm,ib_ucm
ib_umad                50472  0
mlx4_ib                99260  0
ib_mthca              157988  0
ib_mad                 70948  5 ib_cm,ib_sa,ib_umad,mlx4_ib,ib_mthca
ib_core               108544  14 rdma_ucm,rds,ib_ucm,ib_sdp,rdma_cm,iw_cm,ib_ipoib,ib_cm,ib_sa,ib_uverbs,ib_umad,mlx4_ib,ib_mthca,ib_mad
mlx4_core             130532  1 mlx4_ib
libata                208721  1 ata_piix
scsi_mod              196569  6 scsi_dh,sr_mod,sg,libata,cciss,sd_mod

[root <at> psto ~]# modinfo ib_core
filename:       /lib/modules/2.6.18-128.el5/updates/kernel/drivers/infiniband/core/ib_core.ko
license:        Dual BSD/GPL
description:    core kernel InfiniBand API
author:         Roland Dreier
srcversion:     71D6D691ABE8851E298B5A9
depends:       
vermagic:       2.6.18-128.el5 SMP mod_unload gcc-4.1
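
In case it is useful, the user-space verbs stack can be checked as well, beyond verifying that the kernel modules are loaded; two commonly available tools (if libibverbs-utils and infiniband-diags are installed) are:

    ibv_devinfo     # HCAs and port state as seen by libibverbs
    ibstat          # port state, LID and link rate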

_______________________________________________
users mailing list
users <at> open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
Simone Pellegrini | 2 Jul 14:21 2010

[OMPI users] Open MPI runtime parameter tuning on a custom cluster

Dear Open MPI community,
I would like to know from experienced system administrators whether they know of any 
"standardized" way of tuning Open MPI runtime parameters.

I need to tune performance on a custom cluster, so I would appreciate some hints 
on how to proceed in the right direction.

thanks in advance for your help,
Simone P.
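
For background, the runtime parameters in question are MCA parameters, and Open MPI accepts them through several equivalent mechanisms; a minimal sketch follows (the parameter name, value and program below are placeholders, not tuning recommendations):

    # 1. on the mpirun command line
    mpirun --mca btl_sm_eager_limit 8192 -np 8 ./app

    # 2. as an environment variable, picked up by mpirun/orterun
    export OMPI_MCA_btl_sm_eager_limit=8192

    # 3. in a per-user file, $HOME/.openmpi/mca-params.conf
    btl_sm_eager_limit = 8192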
