Jeff Nucciarone | 4 Jun 16:45 2002

[Myrinet] application hangs when using 'call system'

I am in the process of debugging an application that appears to hang 
whenever a Fortran call to 'system' is made. In the following example code:

---[cut here]---     
       include 'mpif.h'                                                   

      integer :: time_array0(8), time_array1(8), taskid                  
      character*50 scom                                                  

      call MPI_INIT (mpierr)                                             
      call MPI_COMM_SIZE (MPI_COMM_WORLD,numtasks,mpierr)                
      call MPI_COMM_RANK (MPI_COMM_WORLD,taskid,mpierr)                  

c                                                                        
      write (scom,600) taskid                                            
      call system (scom)                                                 
 600  format ('echo This is task ',i3,' on `/bin/hostname`')             
c                                                                        
      call date_and_time(values=time_array0)                             
      write (6,601) taskid,time_array0(2),time_array0(3),time_array0(1), 
     1 time_array0(5),time_array0(6),time_array0(7)                      
 601  format (' starting taskid=',i3, ' date & time is  ',i2,'/',i2,     
     1        '/',i4,2x,i2,':',i2,':',i2)                                
c                                                                        
        write(*,*) 'Starting MPI_BARRIER: ', taskid                      
        call MPI_BARRIER (MPI_COMM_WORLD,mpierr)                         
        write(*,*) 'End of MPI_BARRIER: ', taskid                        

        call MPI_FINALIZE (mpierr)                                       


William Gropp | 4 Jun 16:54 2002

Re: [Myrinet] application hangs when using 'call system'

At 10:45 AM 6/4/2002 -0400, Jeff Nucciarone wrote:
>I am in the process of debugging an application that appears to hang 
>whenever a Fortran call to 'system' is made. In the following example code:
>...
>I suspect the 'system' call is somehow upsetting GM. Any ideas (other than 
>eliminating any calls to system)?

I have sometimes had trouble with such calls taking signals away from the 
application and failing to restore them on exit.  You might try writing a 
short C routine that does fork/exec/waitpid instead of the system call.

Bill
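
[Editor's sketch] The short C routine suggested above might look roughly like the following. The name run_command is hypothetical, and this is a minimal sketch rather than code posted to the list. Unlike system(), it leaves the caller's signal dispositions alone. Note that later messages in this thread report that fork() itself is not supported by GM-1.5.1, so this only addresses the signal-handling concern.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from the thread): run `cmd` through /bin/sh
 * with an explicit fork/exec/waitpid, without changing any signal
 * dispositions in the parent. Returns the raw wait status, or -1 if
 * fork() or waitpid() fails. */
int run_command(const char *cmd)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;                      /* fork failed */

    if (pid == 0) {
        /* Child: replace the process image with the shell. */
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);                     /* reached only if exec failed */
    }

    /* Parent: wait for this child, retrying if a signal interrupts us. */
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return status;
}
```

The returned value can be decoded with WIFEXITED and WEXITSTATUS from <sys/wait.h>, as with system().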

Audet, Martin | 4 Jun 17:30 2002

RE: [Myrinet] application hangs when using 'call system'

Hi,

I had exactly the same problem with calling system() in Fortran. It was
producing very strange errors in later MPI calls. 

However, the GM FAQ on the Myricom site states that using fork() and
system() is a source of trouble on Linux. 

http://www.myri.com/scs/GM_FAQ.html#mpich14a

They suggest using vfork(), but that also caused trouble.

Martin

Martin Audet
Tel:	450-641-5034 		Industrial Material Institute
Fax:	450-641-5106 		National Research Council, Canada
E-mail:martin.audet <at> nrc.ca	75, de Mortagne, Boucherville, QC, J4B 6Y4.


Patrick Geoffray | 4 Jun 18:15 2002

Re: [Myrinet] application hangs when using 'call system'

Folks,

Audet, Martin wrote:
> I had exactly the same problem with calling system() in Fortran. It was
> producing very strange errors in later MPI calls. 

I have tried Jeff's code and the next GM message sent is effectively 
garbage, so the barrier hangs. It works fine in C, though.

BTW, full fork support on Linux is provided by gm-1.5.2, which should 
be released today or tomorrow (we are at release candidate 4). I 
have tried the same Fortran code with this GM release and it works fine.

I will see with our Linux guru why vfork in Fortran is not working 
properly, actually it works but it has a bad side effect on GM.

Patrick

----------------------------------------------------------
|   Patrick Geoffray, Ph.D.      patrick <at> myri.com
|   Myricom, Inc.                http://www.myri.com
|   Cell:  865-389-8852          685 Emory Valley Rd (B)
|   Phone: 865-425-0978          Oak Ridge, TN 37830
----------------------------------------------------------

Patrick Geoffray | 4 Jun 19:54 2002

Re: [Myrinet] [Myricom help #9872] application hangs when using 'call system'

Patrick Geoffray wrote:

> I will see with our Linux guru why vfork in Fortran is not working 
> properly, actually it works but it has a bad side effect on GM.

OK, we know what is happening.
In GM-1.5.1, fork() is not supported (again, gm-1.5.2 solves that), but 
vfork() is safe. popen() traditionally uses vfork() but system() uses 
fork(). So system() is defined in GM to use vfork() instead of fork().
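
[Editor's sketch] A system()-style call built on vfork() could look roughly like this. It is an illustration only, not GM's actual implementation, and the name vfork_system is hypothetical. After vfork() the child may only call an exec function or _exit(). As this message goes on to explain, the approach is only safe where vfork() has not been re-implemented on top of fork().

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* lets glibc declare vfork() under strict -std= modes */
#endif
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical illustration (not GM source): run `cmd` via /bin/sh using
 * vfork() instead of fork(). Returns the raw wait status, or -1 if
 * vfork() or waitpid() fails. */
int vfork_system(const char *cmd)
{
    pid_t pid = vfork();
    if (pid < 0)
        return -1;                      /* vfork failed */

    if (pid == 0) {
        /* Child: only exec or _exit() are allowed after vfork(). */
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);                     /* reached only if exec failed */
    }

    /* Parent resumes once the child has exec'd or exited. */
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return status;
}
```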

Now, there are 2 distinct problems:
* When compiling C code, the system() used will be the one provided by GM. 
With Fortran code, the runtime has a system_.o object that provides a 
stub to call system() in C. The compiler will take this reference first 
and miss the one in GM, and at link time it will only find it in 
libc, not in the GM lib. So you need to use the flag "-u system" at 
compile time to tell the compiler to look into the libs for a definition 
of system.

* However, this is not enough. The pthread lib on Linux re-implements 
vfork() to use fork(), making it unsafe. So even if the 
system-on-top-of-vfork provided by GM is used, it would produce nasty 
side effects on memory registration.
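
[Editor's sketch] To make the first point concrete, a Fortran-callable stub of the kind described might look roughly like the one below. The name runcmd_ and the calling convention (trailing underscore, character argument passed as a pointer plus a hidden by-value length) are assumptions that hold for many Unix Fortran compilers but not all. This is not the actual system_.o source. The stub just forwards to whichever C system() the linker resolves, which is why the link order matters.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stub, callable from Fortran as: call runcmd(scom)
 * Assumed convention: symbol has a trailing underscore, and the length
 * of the character argument arrives as a hidden by-value integer. */
int runcmd_(const char *fstr, int flen)
{
    /* Fortran strings are blank padded and not NUL terminated,
     * so trim the trailing blanks first. */
    while (flen > 0 && fstr[flen - 1] == ' ')
        flen--;

    /* Build a NUL-terminated copy for the C library. */
    char *cmd = malloc((size_t)flen + 1);
    if (cmd == NULL)
        return -1;
    memcpy(cmd, fstr, (size_t)flen);
    cmd[flen] = '\0';

    /* Resolved at link time: libc's system() or a replacement that
     * appears earlier on the link line. */
    int status = system(cmd);
    free(cmd);
    return status;
}
```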

So, there are 2 solutions:
* Use gm-1.5.2 when available. Everything works out of the box, with C 
or Fortran, with or without pthread.
* For compatibility with gm-1.5.1 and earlier: remove the (unused by 
default) pthread lib from the MPICH-GM configure line and add "-u 
system" to the Fortran flags in the same configure line. That will be 

Rasit Eskicioglu | 5 Jun 16:40 2002

[Myrinet] New GM...

Hi there,

When will GM 1.5.2 be released? Will it integrate VIA and Sockets versions 
as well?  What about mpich 1.2.4 port?

Thanks,

Rasit

-- 
M. Rasit Eskicioglu                  E-mail: rasit <at> cs.umanitoba.ca
Department of Computer Science       Phone : (204) 474-8835 (Office)
528 Machray Hall                             (204) 474-8313 (Messages)
University of Manitoba               Fax   : (204) 474-7609
Winnipeg, Manitoba R3T 2N2 Canada    URL   : http://www.cs.umanitoba.ca/~rasit

Patrick Geoffray | 5 Jun 17:44 2002

Re: [Myrinet] New GM...

Rasit Eskicioglu wrote:
> Hi there,
> 
> When will GM 1.5.2 be released? Will it integrate VIA and Sockets versions 
> as well?  What about mpich 1.2.4 port?
> 
Oops, I replied to Rasit directly but did not CC the list.

In short:
GM-1.5.2_Linux should be released today. It passed all of the tests.

VI-GM 1.0 is on the web but Intel has found 2 bugs. One is fixed and 
the second one is in progress. Once the second problem is fixed, there 
will be the VI-GM 1.1 release.

I don't know about Sockets-GM (Markus ?)

MPICH-1.2.4..8 is in testing, specifically the MPD support. The merge of 
ch_gm into MPICH's CVS in Argonne is done. The use of P4 machines file 
(without GM ports) and procgroup file (for MPMD) is also done. When 
basic testing is finished it will be on the FTP site and announced on the 
list. Only when extensive testing is done will it be on the web.

Patrick

----------------------------------------------------------
|   Patrick Geoffray, Ph.D.      patrick <at> myri.com
|   Myricom, Inc.                http://www.myri.com
|   Cell:  865-389-8852          685 Emory Valley Rd (B)
|   Phone: 865-425-0978          Oak Ridge, TN 37830

Susan Blackford | 6 Jun 04:39 2002

[Myrinet] Mute (Diagnostic Monitoring Tool for Myrinet-2000 Networks) webpage is now available!

Hi, Everyone,

We recently introduced a Mute webpage on the Myricom website.
  http://www.myri.com/scs/mute/

Mute is a graphical diagnostic monitoring tool for Myrinet-2000 networks
(switch(es), cables, and hosts).  It exercises the monitoring capabilities
of the Myrinet-2000 switches (Myrinet-2000 Switch Tools, m3-dist.tar.gz), as 
well as the Mapper Tools (located in the mt subdirectory of the GM 
distribution). Mute builds an image/picture of the Myrinet-2000 Network, and 
can be used to non-intrusively monitor the Myrinet-2000 network in real time, 
analyzing the network traffic, and diagnosing/validating the integrity of the 
hardware components.

The Mute webpage provides step-by-step installation instructions,
an overview of Mute functionality, and typical usage scenarios.  Additional 
usage scenarios will be added over time.

We encourage customers to try out this very useful tool.  :-) If you have
any questions, feel free to contact us at help <at> myri.com.

P.S.  Also check out the new and improved Myrinet FAQ.  Many new 
       entries have been added over the last few months!
       http://www.myri.com/scs/GM_FAQ.html

Thanks,
Ruth and Susan (help <at> myri.com)


chuck | 6 Jun 06:16 2002

[Myrinet] MUG-2002 presentations

The slides used in most of the presentations at the Myrinet Users Group
(MUG-2002) conference, 13-14 May, Vienna, are now indexed and available
for download at http://www.myri.com/news/02512/.  Enjoy!

On behalf of the Myricom technical staff who organized the technical
program of the MUG-2002 conference, our thanks to the sponsors, speakers,
and attendees.

--Chuck Seitz

Ruth Sivilotti | 6 Jun 07:32 2002

[Myrinet] New GM release

We are happy to announce that gm-1.5.2 for Linux is now available,
and can be found at:
  http://www.myri.com/scs/linux.html

The enhancements and bug fixes for this release are detailed here:
  http://www.myri.com/scs/CHANGES

*** Please note that this release is required for PCI64C interfaces ***
*** running on machines with 33MHz PCI buses.                       ***

If you have any questions about this release, or other
technical Myrinet questions, please don't hesitate to
send mail to help <at> myri.com.

--Ruth

