Edward Ned Harvey (mono | 22 Jun 13:58 2016

Bug preserving stack trace

I know you don't usually want to catch & throw, because you're not adding new information. If you want to add new information, generally you should throw a new exception and include an InnerException. Here are some examples of situations you would want to catch and rethrow:

https://msdn.microsoft.com/en-us/library/0yd65esw.aspx

 

I also know you destroy the stack trace if you explicitly name the exception you're rethrowing. In order to preserve the stack, you should throw without any arguments.

https://msdn.microsoft.com/en-us/library/ms182363.aspx

 

I know the difference between a debug build and release build, particularly with regards to optimizations and inlining.

 

In a debug build, the following code behaves as expected (prints out the full stack trace) on .NET, but doesn't print out the full stack on mono 4.2.3. I'm pretty sure it's not an expected behavior.

 

using System;

namespace FunWithRethrow
{
     
class MainClass
      {
           
public static void Main (string[] args)
            {
                 
try {
                        Third ();
                  }
                 
catch (Exception e) {
                       
Console.WriteLine (e);
                  }
            }
           
static void Third()
            {
                  Second ();
            }
           
static void Second()
            {
                 
try {
                        First();
                  }
                 
catch {
                       
throw;
                  }
            }
           
static void First()
            {
                 
throw new Exception ("Something said... Not good...");
            }
      }
}

_______________________________________________
Mono-devel-list mailing list
Mono-devel-list <at> lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list

Re: FW: Random hangs while running mono app

Hi Rodrigo,


I finally can semi-reliably (most of the time) reproduce these hung processes.  The strace, gdb backtrace, and CUE cards for a single hung mono instance is shown below.

  1. I am now running on mono-4.4.0.  The hanging issue still exists.
  2. The code is compiled in VS 2015 and run under mono.
  3. I changed the code in mono-threads.c (line 72) to use 10 seconds instead of the 200ms just to test if there was a timing issue in mono_threads_wait_pending_operations.  You can see below that this timer expires as well.  My guess is something is deadlocked waiting on a condition to occur that isn’t going to occur.
Finally, can you give me an overall picture of what is occuring so I might be able to navigate the source a bit better.  There are three threads in the backtrace.  My best guess was that one is the GC, one is the Main Thread, and one is possibly a Finalizer, but I’m not sure.  I’m digging through mono/utils/mono-threads.c and related code.  Also, the CUE CARD state is 105, which according to the CUE CARD itself says : 
0x?05   - async suspend requested (BAD)

Under what cirumstances would an async suspend be requested in this context?

Thanks again for your help with this issue.

Cheers.



strace of hung process/threads:

-----------------------------------------------------------------------------------------------

ggeorge  64515 64382 64515 31    3 17:06 ?        00:08:37 mono --runtime=v4.0.30319 mycode.exe

ggeorge  64515 64382 64574  0    3 17:06 ?        00:00:00 mono --runtime=v4.0.30319 mycode.exe

ggeorge  64515 64382 64596  0    3 17:06 ?        00:00:06 mono --runtime=v4.0.30319 mycode.exe

ggeorge  67340 67160 67340  0    1 17:33 pts/0    00:00:00 grep mono


ggeorge <at> r8i2n17:~> strace -p 64515

Process 64515 attached - interrupt to quit

futex(0x7fffedb05968, FUTEX_WAIT_PRIVATE, 2, NULL^C <unfinished ...>

Process 64515 detached

ggeorge <at> r8i2n17:~> strace -p 64382

Process 64382 attached - interrupt to quit

wait4(-1, ^C <unfinished ...>

Process 64382 detached

ggeorge <at> r8i2n17:~> strace -p 64574

Process 64574 attached - interrupt to quit

futex(0x97f604, FUTEX_WAIT_PRIVATE, 523, NULL^C <unfinished ...>

Process 64574 detached

ggeorge <at> r8i2n17:~> strace -p 64596

Process 64596 attached - interrupt to quit

rt_sigsuspend(~[PWR RTMIN RT_1 RT_5]^C <unfinished ...>

Process 64596 detached

ggeorge <at> r8i2n17:~> 


gdb backtrace of hung process/threads:

-----------------------------------------------------------------------------------------------

Thread 3 (Thread 0x7fffebfff700 (LWP 64574)):

#0  0x00007fffecd3d66c in pthread_cond_wait <at> <at> GLIBC_2.3.2 () from /lib64/libpthread.so.0

#1  0x000000000060db93 in mono_os_cond_wait (mutex=0x97f640 <lock>, cond=0x97f600 <work_cond>) at ../../mono/utils/mono-os-mutex.h:105

#2  thread_func (thread_data=0x0) at sgen-thread-pool.c:118

#3  0x00007fffecd39806 in start_thread () from /lib64/libpthread.so.0

#4  0x00007fffec87d9bd in clone () from /lib64/libc.so.6

#5  0x0000000000000000 in ?? ()


Thread 2 (Thread 0x7fffd8093700 (LWP 64596)):

#0  0x00007fffec7d1c8b in sigsuspend () from /lib64/libc.so.6

#1  0x000000000063e3e6 in suspend_signal_handler (_dummy=<optimized out>, info=<optimized out>, context=0x7fffd808ff80) at mono-threads-posix-signals.c:209

#2  <signal handler called>

#3  0x00007fffed8faf97 in open64 () from /lib64/ld-linux-x86-64.so.2

#4  0x00007fffed8ea82d in open_verify () from /lib64/ld-linux-x86-64.so.2

#5  0x00007fffed8ecca1 in _dl_map_object () from /lib64/ld-linux-x86-64.so.2

#6  0x00007fffed8f7400 in dl_open_worker () from /lib64/ld-linux-x86-64.so.2

#7  0x00007fffed8f2e86 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2

#8  0x00007fffed8f6e3b in _dl_open () from /lib64/ld-linux-x86-64.so.2

#9  0x00007fffecf4ff9b in dlopen_doit () from /lib64/libdl.so.2

#10 0x00007fffed8f2e86 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2

#11 0x00007fffecf5033c in _dlerror_run () from /lib64/libdl.so.2

#12 0x00007fffecf4ff01 in dlopen <at> <at> GLIBC_2.2.5 () from /lib64/libdl.so.2

#13 0x00000000006327f5 in mono_dl_open_file (file=<optimized out>, flags=<optimized out>) at mono-dl-posix.c:67

#14 0x0000000000632029 in mono_dl_open (name=name <at> entry=0x1c604d0 "/p/home/apps/unsupported/NAVAIR/build/mono-4.4.0/lib/mono/gac/System.Data/4.0.0.0__b77a5c561934e089/libSystem.Data.dll", flags=flags <at> entry=1, 

    error_msg=error_msg <at> entry=0x7fffd8090e80) at mono-dl.c:150

#15 0x000000000054cf90 in cached_module_load (name=name <at> entry=0x1c604d0 "/p/home/apps/unsupported/NAVAIR/build/mono-4.4.0/lib/mono/gac/System.Data/4.0.0.0__b77a5c561934e089/libSystem.Data.dll", err=err <at> entry=0x7fffd8090e80, 

    flags=1) at loader.c:1402

Python Exception <type 'exceptions.ValueError'> zero length field name in format: 

#16 0x000000000054e218 in mono_lookup_pinvoke_call (method=method <at> entry=, exc_class=exc_class <at> entry=0x7fffd8091f00, exc_arg=exc_arg <at> entry=0x7fffd8091f08) at loader.c:1645

Python Exception <type 'exceptions.ValueError'> zero length field name in format: 

#17 0x00000000005641f6 in mono_marshal_get_native_wrapper (method=method <at> entry=, check_exceptions=check_exceptions <at> entry=1, aot=0) at marshal.c:7396

Python Exception <type 'exceptions.ValueError'> zero length field name in format: 

#18 0x00000000004602cf in mono_method_to_ir (cfg=cfg <at> entry=0x1c5f920, method=method <at> entry=, start_bblock=<optimized out>, start_bblock <at> entry=0x0, end_bblock=<optimized out>, end_bblock <at> entry=0x0, return_var=return_var <at> entry=0x0, 

    inline_args=inline_args <at> entry=0x0, inline_offset=0, is_virtual_call=0) at method-to-ir.c:9280

Python Exception <type 'exceptions.ValueError'> zero length field name in format: 

#19 0x000000000050acf9 in mini_method_compile (method=method <at> entry=, opts=opts <at> entry=370239999, domain=domain <at> entry=0x9c91f0, flags=flags <at> entry=JIT_FLAG_RUN_CCTORS, parts=parts <at> entry=0, aot_method_index=aot_method_index <at> entry=-1)

    at mini.c:3608

Python Exception <type 'exceptions.ValueError'> zero length field name in format: 

#20 0x000000000050c4d5 in mono_jit_compile_method_inner (method=method <at> entry=, target_domain=target_domain <at> entry=0x9c91f0, opt=opt <at> entry=370239999, jit_ex=jit_ex <at> entry=0x7fffd8092678) at mini.c:4263

Python Exception <type 'exceptions.ValueError'> zero length field name in format: 

#21 0x00000000004287f8 in mono_jit_compile_method_with_opt (method=method <at> entry=, opt=370239999, ex=ex <at> entry=0x7fffd8092678) at mini-runtime.c:1952

Python Exception <type 'exceptions.ValueError'> zero length field name in format: 

#22 0x0000000000428fbb in mono_jit_compile_method (method=) at mini-runtime.c:2008

Python Exception <type 'exceptions.ValueError'> zero length field name in format: 

#23 0x00000000004aeb43 in common_call_trampoline_inner (regs=regs <at> entry=0x7fffd8092890, code=code <at> entry=0x401c3ee4 "\270\001", m=m <at> entry=, vt=vt <at> entry=0x0, vtable_slot=<optimized out>, vtable_slot <at> entry=0x0)

    at mini-trampolines.c:694

Python Exception <type 'exceptions.ValueError'> zero length field name in format: 

#24 0x00000000004af2a0 in common_call_trampoline (regs=0x7fffd8092890, code=0x401c3ee4 "\270\001", m=, vt=0x0, vtable_slot=0x0) at mini-trampolines.c:808

#25 0x0000000040000289 in ?? ()

#26 0x0000000000a25145 in ?? ()

#27 0x00000000401c3ee4 in ?? ()

#28 0x0000000000a25145 in ?? ()

#29 0x00007fffd80929d0 in ?? ()

#30 0x00007fffd8092890 in ?? ()

#31 0x00007fffd8093698 in ?? ()

#32 0x00007fffd80d6188 in ?? ()

#33 0x00007fffd80d61a0 in ?? ()

#34 0x00007fffd8092af0 in ?? ()

#35 0x0000000000000003 in ?? ()

#36 0x00007fffd80929d0 in ?? ()

#37 0x00007fffd8092a60 in ?? ()

#38 0x0000000000000001 in ?? ()

#39 0x00007fffd80d6188 in ?? ()

#40 0x0000000000000000 in ?? ()


Thread 1 (Thread 0x7fffedae7780 (LWP 64515)):

#0  0x00007fffecd40324 in __lll_lock_wait () from /lib64/libpthread.so.0

#1  0x00007fffecd3b684 in _L_lock_1091 () from /lib64/libpthread.so.0

#2  0x00007fffecd3b4f6 in pthread_mutex_lock () from /lib64/libpthread.so.0

#3  0x00007fffed8f6dcc in _dl_open () from /lib64/ld-linux-x86-64.so.2

#4  0x00007fffec8b5530 in do_dlopen () from /lib64/libc.so.6

#5  0x00007fffed8f2e86 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2

#6  0x00007fffec8b55e5 in dlerror_run () from /lib64/libc.so.6

#7  0x00007fffec8b56d7 in __libc_dlopen_mode () from /lib64/libc.so.6

#8  0x00007fffec8902e5 in init () from /lib64/libc.so.6

#9  0x00007fffecd3ed03 in pthread_once () from /lib64/libpthread.so.0

#10 0x00007fffec89043c in backtrace () from /lib64/libc.so.6

#11 0x00000000004ad425 in mono_handle_native_sigsegv (signal=<optimized out>, ctx=<optimized out>, info=<optimized out>) at mini-exceptions.c:2309

#12 <signal handler called>

#13 0x00007fffec7d1875 in raise () from /lib64/libc.so.6

#14 0x00007fffec7d2e51 in abort () from /lib64/libc.so.6

#15 0x00000000006468be in monoeg_log_default_handler (log_domain=0x0, log_level=G_LOG_LEVEL_ERROR, message=0x1c70ac0 "suspend_thread suspend took 10000 ms, which is more than the allowed 10000 ms", unused_data=0x0) at goutput.c:233

#16 0x00000000006466ab in monoeg_g_logv (log_domain=0x0, log_level=G_LOG_LEVEL_ERROR, format=0x703240 "suspend_thread suspend took %d ms, which is more than the allowed %d ms", args=0x7fffffffc588) at goutput.c:113

#17 0x0000000000646761 in monoeg_g_log (log_domain=0x0, log_level=G_LOG_LEVEL_ERROR, format=0x703240 "suspend_thread suspend took %d ms, which is more than the allowed %d ms") at goutput.c:123

#18 0x000000000063b77f in mono_threads_wait_pending_operations () at mono-threads.c:236

#19 0x000000000063bf0d in suspend_sync (interrupt_kernel=1, tid=140736817870592) at mono-threads.c:875

#20 suspend_sync_nolock (interrupt_kernel=1, id=140736817870592) at mono-threads.c:890

#21 mono_thread_info_safe_suspend_and_run (id=140736817870592, interrupt_kernel=interrupt_kernel <at> entry=1, callback=callback <at> entry=0x58ead0 <abort_thread_critical>, user_data=user_data <at> entry=0x7fffffffcb10) at mono-threads.c:933

#22 0x0000000000592f96 in abort_thread_internal (thread=thread <at> entry=0x7fffd813c230, install_async_abort=install_async_abort <at> entry=1, can_raise_exception=1) at threads.c:4728

#23 0x0000000000593039 in mono_thread_internal_stop (thread=0x7fffd813c230) at threads.c:2385

#24 0x00000000005b266e in mono_gc_cleanup () at gc.c:842

#25 0x00000000005abfbe in mono_runtime_cleanup (domain=domain <at> entry=0x9c91f0) at appdomain.c:356

#26 0x000000000042702b in mini_cleanup (domain=0x9c91f0) at mini-runtime.c:4017

#27 0x0000000000480dc6 in mono_main (argc=11, argv=<optimized out>) at driver.c:2119

#28 0x0000000000424cdf in mono_main_with_options (argv=0x7fffffffce78, argc=11) at main.c:28

#29 main (argc=11, argv=0x7fffffffce78) at main.c:177

(gdb) 



STATE CUE CARD of above process:

-----------------------------------------------------------------------------------------------

STATE CUE CARD: (? means a positive number, usually 1 or 2, * means any number)

        0x0     - starting (GOOD, unless the thread is running managed code)

        0x1     - running (BAD, unless it's the gc thread)

        0x2     - detached (GOOD, unless the thread is running managed code)

        0x?03   - async suspended (GOOD)

        0x?04   - self suspended (GOOD)

        0x?05   - async suspend requested (BAD)

        0x?06   - self suspend requested (BAD)

        0x*07   - blocking (GOOD)

        0x?08   - blocking with pending suspend (GOOD)

--thread 0xa27f30 id 0x7fffd8093700 [(nil)] state 105

--thread 0x9c7a30 id 0x7fffedae7780 [(nil)] state 1  GC INITIATOR

WAITING for 1 threads, got 0 suspended

suspend_thread suspend took 10000 ms, which is more than the allowed 10000 ms

Shutting down finalizer thread timed out.

* Assertion at gc.c:867, condition `finalizer_thread_exited' not met



From: Rodrigo Kumpera <kumpera <at> gmail.com>
Date: Tuesday, June 7, 2016 at 2:07 PM
To: Glover George <glover.e.george <at> erdc.dren.mil>
Subject: Re: [Mono-dev] FW: Random hangs while running mono app

There should be only one CUE card dump per crash. You posted a bunch of those.

I need to know which CUR Card goes with with crash dump.

--
Rodrigo

On Tue, Jun 7, 2016 at 9:09 AM, George, Glover E ERDC-RDE-ITL-MS CIV <Glover.E.George <at> erdc.dren.mil> wrote:
What do you mean by proper logs?  Was the email I sent with the CUE cards and the stack trace not correct?
— — — 
Glover E. George
Computer Scientist
Information Technology Laboratory
US Army Engineer Research and Development Center
Vicksburg, MS 39180


From: Rodrigo Kumpera <kumpera <at> gmail.com>
Date: Tuesday, June 7, 2016 at 11:07 AM
To: Glover George <glover.e.george <at> erdc.dren.mil>

Subject: Re: [Mono-dev] FW: Random hangs while running mono app

Not really, without a test case and proper logs, nothing much we can do.


On Tue, Jun 7, 2016 at 9:02 AM, George, Glover E ERDC-RDE-ITL-MS CIV <Glover.E.George <at> erdc.dren.mil> wrote:
Hi Rodrigo, 

To avoid spamming the list, I’m emailing you off list.  Is there anything else I could be doing to help narrow down the problem?

Cheers.
— — — 
Glover E. George
Computer Scientist
Information Technology Laboratory
US Army Engineer Research and Development Center
Vicksburg, MS 39180


From: Rodrigo Kumpera <kumpera <at> gmail.com>
Date: Friday, June 3, 2016 at 3:02 PM
To: Glover George <glover.e.george <at> erdc.dren.mil>
Cc: "mono-devel-list <at> lists.ximian.com" <mono-devel-list <at> lists.ximian.com>
Subject: Re: [Mono-dev] FW: Random hangs while running mono app

Hi,

The cue card output on itself is not useful.
I need the cue card output with the matched thread dump as that allows me to see what's going on on the problematic thread.

--
Rodrigo


On Fri, Jun 3, 2016 at 12:42 PM, George, Glover E ERDC-RDE-ITL-MS CIV <Glover.E.George <at> erdc.dren.mil> wrote:
Hi Rodrigo,

I didn¹t realize our previous emails were off thread.  I was able to get
the STAT CUE CARD output.  See below:

STATE CUE CARD: (? means a positive number, usually 1 or 2, * means any
number)
        0x0     - starting (GOOD, unless the thread is running managed
code)
        0x1     - running (BAD, unless it's the gc thread)
        0x2     - detached (GOOD, unless the thread is running managed
code)
        0x?03   - async suspended (GOOD)
        0x?04   - self suspended (GOOD)
        0x?05   - async suspend requested (BAD)
        0x?06   - self suspend requested (BAD)
        0x*07   - blocking (GOOD)
        0x?08   - blocking with pending suspend (GOOD)
--thread 0xa38b30 id 0x7fffec637700 [(nil)] state 105
--thread 0x9d8640 id 0x7fffedae7780 [(nil)] state 1  GC INITIATOR
WAITING for 1 threads, got 0 suspended
suspend_thread suspend took 200 ms, which is more than the allowed 200 ms
STATE CUE CARD: (? means a positive number, usually 1 or 2, * means any
number)
        0x0     - starting (GOOD, unless the thread is running managed
code)
        0x1     - running (BAD, unless it's the gc thread)
        0x2     - detached (GOOD, unless the thread is running managed
code)
        0x?03   - async suspended (GOOD)
        0x?04   - self suspended (GOOD)
        0x?05   - async suspend requested (BAD)
        0x?06   - self suspend requested (BAD)
        0x*07   - blocking (GOOD)
        0x?08   - blocking with pending suspend (GOOD)
--thread 0xa38b30 id 0x7fffec637700 [(nil)] state 105
--thread 0x9d8640 id 0x7fffedae7780 [(nil)] state 1  GC INITIATOR
WAITING for 1 threads, got 0 suspended
suspend_thread suspend took 200 ms, which is more than the allowed 200 ms
STATE CUE CARD: (? means a positive number, usually 1 or 2, * means any
number)
        0x0     - starting (GOOD, unless the thread is running managed
code)
        0x1     - running (BAD, unless it's the gc thread)
        0x2     - detached (GOOD, unless the thread is running managed
code)
        0x?03   - async suspended (GOOD)
        0x?04   - self suspended (GOOD)
        0x?05   - async suspend requested (BAD)
        0x?06   - self suspend requested (BAD)
        0x*07   - blocking (GOOD)
        0x?08   - blocking with pending suspend (GOOD)
--thread 0xa38b30 id 0x7fffec637700 [(nil)] state 105
--thread 0x9d8640 id 0x7fffedae7780 [(nil)] state 1  GC INITIATOR
WAITING for 1 threads, got 0 suspended
suspend_thread suspend took 200 ms, which is more than the allowed 200 ms
STATE CUE CARD: (? means a positive number, usually 1 or 2, * means any
number)
        0x0     - starting (GOOD, unless the thread is running managed
code)
        0x1     - running (BAD, unless it's the gc thread)
        0x2     - detached (GOOD, unless the thread is running managed
code)
        0x?03   - async suspended (GOOD)
        0x?04   - self suspended (GOOD)
        0x?05   - async suspend requested (BAD)
        0x?06   - self suspend requested (BAD)
        0x*07   - blocking (GOOD)
        0x?08   - blocking with pending suspend (GOOD)
--thread 0xa38b30 id 0x7fffec637700 [(nil)] state 105
--thread 0x9d8640 id 0x7fffedae7780 [(nil)] state 1  GC INITIATOR
WAITING for 1 threads, got 0 suspended
suspend_thread suspend took 200 ms, which is more than the allowed 200 ms
STATE CUE CARD: (? means a positive number, usually 1 or 2, * means any
number)
        0x0     - starting (GOOD, unless the thread is running managed
code)
        0x1     - running (BAD, unless it's the gc thread)
        0x2     - detached (GOOD, unless the thread is running managed
code)
        0x?03   - async suspended (GOOD)
        0x?04   - self suspended (GOOD)
        0x?05   - async suspend requested (BAD)
        0x?06   - self suspend requested (BAD)
        0x*07   - blocking (GOOD)
        0x?08   - blocking with pending suspend (GOOD)
--thread 0xa38b30 id 0x7fffec637700 [(nil)] state 105
--thread 0x9d8640 id 0x7fffedae7780 [(nil)] state 1  GC INITIATOR
WAITING for 1 threads, got 0 suspended
suspend_thread suspend took 200 ms, which is more than the allowed 200 ms
STATE CUE CARD: (? means a positive number, usually 1 or 2, * means any
number)
        0x0     - starting (GOOD, unless the thread is running managed
code)
        0x1     - running (BAD, unless it's the gc thread)
        0x2     - detached (GOOD, unless the thread is running managed
code)
        0x?03   - async suspended (GOOD)
        0x?04   - self suspended (GOOD)
        0x?05   - async suspend requested (BAD)
        0x?06   - self suspend requested (BAD)
        0x*07   - blocking (GOOD)
        0x?08   - blocking with pending suspend (GOOD)
--thread 0xa38b30 id 0x7fffec637700 [(nil)] state 105
--thread 0x9d8640 id 0x7fffedae7780 [(nil)] state 1  GC INITIATOR
WAITING for 1 threads, got 0 suspended
suspend_thread suspend took 200 ms, which is more than the allowed 200 ms
STATE CUE CARD: (? means a positive number, usually 1 or 2, * means any
number)
        0x0     - starting (GOOD, unless the thread is running managed
code)
        0x1     - running (BAD, unless it's the gc thread)
        0x2     - detached (GOOD, unless the thread is running managed
code)
        0x?03   - async suspended (GOOD)
        0x?04   - self suspended (GOOD)
        0x?05   - async suspend requested (BAD)
        0x?06   - self suspend requested (BAD)
        0x*07   - blocking (GOOD)
        0x?08   - blocking with pending suspend (GOOD)
--thread 0xa38b30 id 0x7fffec637700 [(nil)] state 105
--thread 0x9d8640 id 0x7fffedae7780 [(nil)] state 1  GC INITIATOR


‹ ‹ ‹
Glover E. George
Computer Scientist
Information Technology Laboratory
US Army Engineer Research and Development Center
Vicksburg, MS 39180
601-634-4730





From:  Rodrigo Kumpera <kumpera <at> gmail.com>
Date:  Friday, April 29, 2016 at 9:12 AM
To:  Glover George <glover.e.george <at> erdc.dren.mil>
Cc:  "mono-devel-list <at> lists.ximian.com" <mono-devel-list <at> lists.ximian.com>
Subject:  Re: [Mono-dev] FW: Random hangs while running mono app


This looks like a shutdown bug in mono.

Do you have a reliable way to reproduce it?
How loaded are the machines running your workload?


On Thu, Apr 28, 2016 at 6:41 PM, George, Glover E ERDC-RDE-ITL-MS CIV
<Glover.E.George <at> erdc.dren.mil> wrote:

One more important point I forgot to mention, when this occurs, it is
generally only a few mono instances out of the 500+ simultaneous mono
instances.  Most of them terminate.  And more importantly, the mono
instances that do hang actually complete their
 work!  It seems to be when mono is trying to close down.

This has occurred with mono 4.2.x, 4.3.x, 4.4.x, and current git.

Happened today with this configuration:
Mono JIT compiler version 4.3.2 (Stable 4.3.2.467/ba2e5e4 Tue Mar 15
11:44:06 CDT 2016)
Copyright (C) 2002-2014 Novell, Inc, Xamarin Inc and Contributors.
BlockedBlockedBlockedBlockedwww.mono-project.comBlockedBlockedBlockedBlocked
<BlockedBlockedBlockedBlockedhttp://www.mono-project.comBlockedBlockedBlockedBlocked>

TLS:           __thread
SIGSEGV:       altstack
Notifications: epoll
Architecture:  amd64
Disabled:      none
Misc:          bigarrays softdebug
LLVM:          supported, not enabled.
GC:            sgen




From: <mono-devel-list-bounces <at> lists.ximian.com> on behalf of Glover
George <glover.e.george <at> erdc.dren.mil>
Date: Thursday, April 28, 2016 at 5:25 PM
To: "mono-devel-list <at> lists.ximian.com" <mono-devel-list <at> lists.ximian.com>
Subject: [Mono-dev] Random hangs while running mono app


Hi all,

Quick background: We¹re running mono on an HPC platform (SGI ICE-X / SUSE
Enterprise Linux 11) with a Lustre filesystem.  For several months, I¹ve
noticed random hangs with our application.  In a given batch job, we run
500+ instances of mono in parallel
 (no interprocess communication).   I haven¹t been able to reliably
reproduce the issue which leads me to believe it¹s a timing issue, and
more specifically, I have reason to believe it may be an issue with the
interaction of mono and the filesystem.  Being
 a clustered file-system, the load on the backing stores can vary due to
other users.

When I notice that my job hasn¹t finished in a reasonable amount of time,
I log into the compute node and do a  ³ ps -efL | grep mono ³ which
reveals:

george  16728 16575 16728  6    3 16:13 ?        00:03:58 mono
--runtime=v4.0.30319 /app/MyConsole.exe
ggeorge  16728 16575 16768  0    3 16:13 ?        00:00:02 mono
--runtime=v4.0.30319 /app/MyConsole.exe
ggeorge  16728 16575 16815  0    3 16:13 ?        00:00:04 mono
--runtime=v4.0.30319 /app/MyConsole.exe


16728 is the top proc which spawns 16768 and 16815 (main, gc and maybe
finalizer thread?).

Attaching to each of these 3 pid¹s with gdb and doing a backtrace reveals
the traces below.

1.  Is it possible this is a GC hang?

2.  Is it possible this is a race condition that has anything to do with
the filesystem?

3.  Any idea how this sequence could occur?  If there¹s a way to gather
more information, please let me know.

The reason for the file system suspicion is my app writes several small
files, times the number of mono instances (500+).  It only happens
randomly, but seems to be more frequent the more mono instances I run.

Any help is HIGHLY appreciated.

PID 16728
‹‹‹
(gdb) bt
#0  0x00007fffecccd324 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007fffeccc8684 in _L_lock_1091 () from /lib64/libpthread.so.0
#2  0x00007fffeccc84f6 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007fffed8f6dcc in _dl_open () from /lib64/ld-linux-x86-64.so.2
#4  0x00007fffec842530 in do_dlopen () from /lib64/libc.so.6
#5  0x00007fffed8f2e86 in _dl_catch_error () from
/lib64/ld-linux-x86-64.so.2
#6  0x00007fffec8425e5 in dlerror_run () from /lib64/libc.so.6
#7  0x00007fffec8426d7 in __libc_dlopen_mode () from /lib64/libc.so.6
#8  0x00007fffec81d2e5 in init () from /lib64/libc.so.6
#9  0x00007fffecccbd03 in pthread_once () from /lib64/libpthread.so.0
#10 0x00007fffec81d43c in backtrace () from /lib64/libc.so.6
#11 0x00000000004ac025 in mono_handle_native_sigsegv (signal=<optimized
out>, ctx=<optimized out>, info=<optimized out>)
    at mini-exceptions.c:2309
#12 <signal handler called>
#13 0x00007fffec75e875 in raise () from /lib64/libc.so.6
#14 0x00007fffec75fe51 in abort () from /lib64/libc.so.6
#15 0x000000000064528a in monoeg_log_default_handler (log_domain=0x0,
log_level=G_LOG_LEVEL_ERROR, message=
    0x1ac7660 "suspend_thread suspend took 200 ms, which is more than the
allowed 200 ms", unused_data=0x0) at goutput.c:233
#16 0x0000000000645077 in monoeg_g_logv (log_domain=0x0,
log_level=G_LOG_LEVEL_ERROR, format=
    0x7015d8 "suspend_thread suspend took %d ms, which is more than the
allowed %d ms", args=0x7fffffffce58) at goutput.c:113
#17 0x000000000064512d in monoeg_g_log (log_domain=0x0,
log_level=G_LOG_LEVEL_ERROR, format=
    0x7015d8 "suspend_thread suspend took %d ms, which is more than the
allowed %d ms") at goutput.c:123
#18 0x000000000063a13f in mono_threads_wait_pending_operations () at
mono-threads.c:238
#19 0x000000000063a8cd in suspend_sync (interrupt_kernel=1,
tid=140737159329536) at mono-threads.c:877
#20 suspend_sync_nolock (interrupt_kernel=1, id=140737159329536) at
mono-threads.c:892
#21 mono_thread_info_safe_suspend_and_run (id=140737159329536,
interrupt_kernel=interrupt_kernel <at> entry=1, callback=callback <at> entry=
    0x58d5c0 <abort_thread_critical>,
user_data=user_data <at> entry=0x7fffffffd3e0) at mono-threads.c:935
#22 0x0000000000591a86 in abort_thread_internal
(thread=thread <at> entry=0x7fffec6e0230,
install_async_abort=install_async_abort <at> entry=1,
    can_raise_exception=1) at threads.c:4728
#23 0x0000000000591b29 in mono_thread_internal_stop
(thread=0x7fffec6e0230) at threads.c:2385
---Type <return> to continue, or q <return> to quit---
#24 0x00000000005b123e in mono_gc_cleanup () at gc.c:842
#25 0x00000000005aab8e in mono_runtime_cleanup
(domain=domain <at> entry=0x9d9e00) at appdomain.c:356
#26 0x0000000000426c8b in mini_cleanup (domain=0x9d9e00) at
mini-runtime.c:4017
#27 0x000000000047fac6 in mono_main (argc=11, argv=<optimized out>) at
driver.c:2115
#28 0x0000000000424c68 in mono_main_with_options (argv=0x7fffffffd698,
argc=11) at main.c:20
#29 main (argc=<optimized out>, argv=<optimized out>) at main.c:53

Thread 16768
‹‹‹
(gdb) bt
#0  0x00007fffeccca66c in pthread_cond_wait <at> <at> GLIBC_2.3.2 () from
/lib64/libpthread.so.0
#1  0x000000000060c873 in mono_os_cond_wait (mutex=0x97e640 <lock>,
cond=0x97e600 <work_cond>) at ../../mono/utils/mono-os-mutex.h:105
#2  thread_func (thread_data=0x0) at sgen-thread-pool.c:118
#3  0x00007fffeccc6806 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fffec80a9bd in clone () from /lib64/libc.so.6
#5  0x0000000000000000 in ?? ()
(gdb)


Thread 16815
‹‹‹‹
Thread #0  0x00007fffec75ec8b in sigsuspend () from /lib64/libc.so.6
#1  0x000000000063cda6 in suspend_signal_handler (_dummy=<optimized out>,
info=<optimized out>, context=0x7fffec633dc0)
    at mono-threads-posix-signals.c:209
#2  <signal handler called>
#3  0x00007fffed8faf97 in open64 () from /lib64/ld-linux-x86-64.so.2
#4  0x00007fffed8ea82d in open_verify () from /lib64/ld-linux-x86-64.so.2
#5  0x00007fffed8eade0 in open_path () from /lib64/ld-linux-x86-64.so.2
#6  0x00007fffed8ece7c in _dl_map_object () from
/lib64/ld-linux-x86-64.so.2
#7  0x00007fffed8f7400 in dl_open_worker () from
/lib64/ld-linux-x86-64.so.2
#8  0x00007fffed8f2e86 in _dl_catch_error () from
/lib64/ld-linux-x86-64.so.2
#9  0x00007fffed8f6e3b in _dl_open () from /lib64/ld-linux-x86-64.so.2
#10 0x00007fffecedcf9b in dlopen_doit () from /lib64/libdl.so.2
#11 0x00007fffed8f2e86 in _dl_catch_error () from
/lib64/ld-linux-x86-64.so.2
#12 0x00007fffecedd33c in _dlerror_run () from /lib64/libdl.so.2
#13 0x00007fffecedcf01 in dlopen <at> <at> GLIBC_2.2.5 () from /lib64/libdl.so.2
#14 0x0000000000630b79 in mono_dl_open (name=name <at> entry=0x1c02790
"libSystem.Data", flags=flags <at> entry=1, error_msg=error_msg <at> entry=
    0x7fffec634e80) at mono-dl.c:150
#15 0x000000000054b9f0 in cached_module_load (name=name <at> entry=0x1c02790
"libSystem.Data", err=err <at> entry=0x7fffec634e80, flags=1)
    at loader.c:1398








_______________________________________________
Mono-devel-list mailing list
Mono-devel-list <at> lists.ximian.com
Blockedhttp://lists.ximian.com/mailman/listinfo/mono-devel-listBlocked









_______________________________________________
Mono-devel-list mailing list
Mono-devel-list <at> lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list
Neale Ferguson | 16 Jun 21:42 2016
Picon

DateTime.Now

When DateTime.Now looks to adjust its value based on timezone (and
daylight saving requirements), it fetches a value UtcNow. This value is
obtained by a call to unmanaged code GetSystemTimeAsFileTime() which the
JIT maps to mono_100ns_datetime(). This routine uses gettimeofday() to
fetch the current time. However, this is the local time and it does not
adjust it to be UTC. So the comparison it makes in the test to see if the
datetime has transitioned to or from daylight saving or not is done
against local time and not UTC.

I have instrumented the code to show how the test against the rule appears
to be being made against local time and not UTC at
http://pastebin.ca/3629144.

Neale
Mikhail Filippov | 16 Jun 14:04 2016
Gravatar

Race condition in named mutex

I found problem in Mono in named mutex I have repro program:
using System;
using System.Threading;

namespace Crasher
{
	internal class Program
	{
		public static void Main (string[] args)
		{
			var a = "";
			for (var i = 0; i < 100; i++)
			{				
				new Thread(Crasher).Start();
			}
			Console.WriteLine(a);
			Console.ReadLine();
		}

		private static void Crasher(){
			var rnd = new Random();
			while (true)
			{
				Thread.Sleep(rnd.Next(100, 10000));
				using (var mutex = new Mutex(false, "Global\\TEST"))
				{
					var owner = false;
					try
					{
						owner = mutex.WaitOne(TimeSpan.FromMinutes(1));
					}
					finally
					{
						if (owner)
						{
							mutex.ReleaseMutex();
						}
					}
					Console.WriteLine ("PING");
				}
				Thread.Sleep(rnd.Next(100, 10000));
			}
		}
	}
}

It program crached mono after few seconds.
I see namedmutex_create and namedmutex_release use different locks.
First time I found it problem when run Nuget client on mono.
I verify repro on mono 4.2, 4.4, and master.
Anybody can help me resolve this issue?
On 4.4.0 native crash:
namedmutex_create: error creating mutex handle
PING
PING
PING
PING
PING
PING
PING
PING
PING
PING
PING
PING
PING
PING
_wapi_handle_unref_full: Attempting to unref unused handle 0x4e0
PING
PING
namedmutex_create: error creating mutex handle
PING
PING
PING
PING
PING
PING
_wapi_handle_ref: Attempting to ref unused handle 0x4e3
* Assertion at ../../mono/utils/mono-os-mutex.h:135, condition `res != EINVAL' not met

Stacktrace:

  at <unknown> <0xffffffff>
  at (wrapper managed-to-native) System.Threading.WaitHandle.WaitOne_internal
(System.Threading.WaitHandle,intptr,int,bool) <0x00073>
  at System.Threading.WaitHandle.WaitOne (System.TimeSpan,bool) <0x0009b>
  at System.Threading.WaitHandle.WaitOne (System.TimeSpan) <0x0001d>
  at Crasher.Program.Crasher () <0x000f0>
  at System.Threading.ThreadHelper.ThreadStart_Context (object) <0x0009a>
  at System.Threading.ExecutionContext.RunInternal
(System.Threading.ExecutionContext,System.Threading.ContextCallback,object,bool) <0x001c6>
  at System.Threading.ExecutionContext.Run
(System.Threading.ExecutionContext,System.Threading.ContextCallback,object,bool) <0x00020>
  at System.Threading.ExecutionContext.Run
(System.Threading.ExecutionContext,System.Threading.ContextCallback,object) <0x00059>
  at System.Threading.ThreadHelper.ThreadStart () <0x0002e>
  at (wrapper runtime-invoke) object.runtime_invoke_void__this__ (object,intptr,intptr,intptr) <0x000e0>

Native stacktrace:

	0   mono                                0x0000000103f9b0ca mono_handle_native_sigsegv + 271
	1   libsystem_platform.dylib            0x00007fff854c252a _sigtramp + 26
	2   mono                                0x00000001042036a4 tmp_dir + 5316
	3   libsystem_c.dylib                   0x00007fff935536e7 abort + 129
	4   mono                                0x000000010410d1f0 monoeg_g_log + 0
	5   mono                                0x000000010410d175 monoeg_g_logv + 83
	6   mono                                0x000000010410d31a monoeg_assertion_message + 143
	7   mono                                0x00000001040e453d _wapi_handle_timedwait_signal_handle + 1153
	8   mono                                0x00000001040f4aec wapi_WaitForSingleObjectEx + 606
	9   mono                                0x000000010406c01c mono_wait_uninterrupted + 130
	10  mono                                0x000000010406c1ff ves_icall_System_Threading_WaitHandle_WaitOne_internal + 73
	11  ???                                 0x0000000108130b54 0x0 + 4430433108
	12  mscorlib.dll.dylib                  0x00000001062ed7ae System_Threading_WaitHandle_WaitOne_System_TimeSpan
+ 30
	13  mscorlib.dll.dylib                  0x0000000106140e5b
System_Threading_ThreadHelper_ThreadStart_Context_object + 155
	14  mscorlib.dll.dylib                  0x000000010613f331
System_Threading_ExecutionContext_Run_System_Threading_ExecutionContext_System_Threading_ContextCallback_object_bool
+ 33
	15  mono                                0x0000000103f04876 mono_jit_runtime_invoke + 1578
	16  mono                                0x0000000104090c23 mono_runtime_invoke + 130
	17  mono                                0x0000000104070409 start_wrapper + 424
	18  mono                                0x0000000104106cb1 inner_start_thread + 305
	19  libsystem_pthread.dylib             0x00007fff989c399d _pthread_body + 131
	20  libsystem_pthread.dylib             0x00007fff989c391a _pthread_body + 0
	21  libsystem_pthread.dylib             0x00007fff989c1351 thread_start + 13

On master UnhandlerException:
namedmutex_create: error creating mutex handle
PING
PING
PING
PING
PING
_wapi_handle_unref_full: Attempting to unref unused handle 0x4eb
PING
PING
PING
_wapi_handle_unref_full: Attempting to unref unused handle 0x4ec
PING
PING
PING
PING
PING
PING
PING
PING
_wapi_handle_unref_full: Attempting to unref unused handle 0x4ee
PING
PING
PING
PING
PING
PING
PING
PING
PING
PING
PING
PING
PING
_wapi_handle_unref_full: Attempting to unref unused handle 0x4f1
PING
PING
PING
PING
PING
PING
PING

Unhandled Exception:
System.ApplicationException: Mutex is not owned
  at System.Threading.Mutex.ReleaseMutex () <0x10e769ef0 + 0x00043> in <filename unknown>:0 
  at (wrapper remoting-invoke-with-check) System.Threading.Mutex:ReleaseMutex ()
  at Crasher.Program.Crasher () <0x10cfdfc80 + 0x00129> in <filename unknown>:0 
  at System.Threading.ThreadHelper.ThreadStart_Context (System.Object state) <0x10e5fb710 +
0x00096> in <filename unknown>:0 
  at System.Threading.ExecutionContext.RunInternal (System.Threading.ExecutionContext
executionContext, System.Threading.ContextCallback callback, System.Object state,
System.Boolean preserveSyncCtx) <0x10e5f93b0 + 0x001c6> in <filename unknown>:0 
  at System.Threading.ExecutionContext.Run (System.Threading.ExecutionContext executionContext,
System.Threading.ContextCallback callback, System.Object state, System.Boolean
preserveSyncCtx) <0x10e5f9380 + 0x00020> in <filename unknown>:0 
  at System.Threading.ExecutionContext.Run (System.Threading.ExecutionContext executionContext,
System.Threading.ContextCallback callback, System.Object state) <0x10e5f92d0 + 0x00059> in
<filename unknown>:0 
  at System.Threading.ThreadHelper.ThreadStart () <0x10e5fb8c0 + 0x0002e> in <filename unknown>:0 
_wapi_handle_unref_full: Attempting to unref unused handle 0x4f1
[ERROR] FATAL UNHANDLED EXCEPTION: Nested exception trying to figure out what went wrong
Seif Attar | 13 Jun 23:38 2016
Picon
Gravatar

release 4.4? 4.2.4?

Hello,

I noticed http://www.mono-project.com/docs/about-mono/releases/4.4.0/ a
couple of days ago, and had been eagerly waiting for the apt packages
for that. I just updated now, and got some updates, but not 4.4.0, mono
is now at 4.2.4. What is going on here? Is 4.4 coming out soon? what is
new in 4.2.4?

Thanks,
Seif
_______________________________________________
Mono-devel-list mailing list
Mono-devel-list <at> lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list
Chris Swiedler | 13 Jun 18:54 2016

Shutdown hangs

I'm getting semi-reliable shutdown hangs in a mono command-line app using mono 4.2.3 on Centos 7 64-bit.
The app does different things at different times, and the shutdowns only seem to happen when it connects to
our SQL server database (using the Mono database bindings, not ODBC). We've also seen an occasional
segfault, also seemingly only when doing database work, though those may have gone away since upgrading
to 4.2.3.

Does anyone have any thoughts? Pstack output below.

Thanks,
chris

Thread 5 (Thread 0x7f4c777ff700 (LWP 11658)):
#0  pthread_cond_wait <at>  <at> GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000005f786c in thread_func (thread_data=0x0) at sgen-thread-pool.c:118
#2  0x00007f4c77ffedf5 in start_thread (arg=0x7f4c777ff700) at pthread_create.c:308
#3  0x00007f4c77d2c1ad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 4 (Thread 0x7f4c75560700 (LWP 11659)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
#1  0x000000000061aa37 in mono_sem_wait (sem=sem <at> entry=0x953c40 <finalizer_sem>,
alertable=alertable <at> entry=1) at mono-semaphore.c:107
#2  0x00000000005a1736 in finalizer_thread (unused=<optimized out>) at gc.c:1096
#3  0x00000000005841e9 in start_wrapper_internal (data=<optimized out>) at threads.c:725
#4  start_wrapper (data=<optimized out>) at threads.c:772
#5  0x0000000000621026 in inner_start_thread (arg=0x7fffda0d42e0) at mono-threads-posix.c:97
#6  0x00007f4c77ffedf5 in start_thread (arg=0x7f4c75560700) at pthread_create.c:308
#7  0x00007f4c77d2c1ad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Thread 3 (Thread 0x7f4c74fd3700 (LWP 11660)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
#1  0x000000000061aa37 in mono_sem_wait (sem=sem <at> entry=0x7f4c68000940,
alertable=alertable <at> entry=0) at mono-semaphore.c:107
#2  0x000000000061f5ba in mono_thread_info_wait_for_resume (info=<optimized out>) at mono-threads.c:110
#3  mono_thread_info_end_self_suspend () at mono-threads.c:692
#4  0x0000000000583d2c in self_suspend_internal (thread=0x7f4c78a74330) at threads.c:4546
#5  mono_thread_execute_interruption (thread=thread <at> entry=0x7f4c78a74330) at threads.c:4050
#6  0x0000000000583f50 in mono_thread_interruption_checkpoint_request
(bypass_abort_protection=bypass_abort_protection <at> entry=1) at threads.c:4184
#7  0x0000000000584dae in mono_thread_force_interruption_checkpoint () at threads.c:4221

Thread 2 (Thread 0x7f4c746ff700 (LWP 11661)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
#1  0x000000000061aa37 in mono_sem_wait (sem=sem <at> entry=0x7f4c6c000940,
alertable=alertable <at> entry=0) at mono-semaphore.c:107
#2  0x000000000061f5ba in mono_thread_info_wait_for_resume (info=<optimized out>) at mono-threads.c:110
#3  mono_thread_info_end_self_suspend () at mono-threads.c:692
#4  0x0000000000583d2c in self_suspend_internal (thread=0x7f4c78a74430) at threads.c:4546
#5  mono_thread_execute_interruption (thread=thread <at> entry=0x7f4c78a74430) at threads.c:4050
#6  0x00000000005851f9 in ves_icall_System_Threading_Thread_Sleep_internal (ms=1000) at threads.c:1206
#17 0x00007f4c756fa2e7 in System_Threading_Thread_Sleep_int (millisecondsTimeout=1000) at /home/cswiedler/mono-main/Extern/mono/source/external/referencesource/mscorlib/system/threading/thread.cs:715

Thread 1 (Thread 0x7f4c78b34780 (LWP 11657)):
#0  0x00007f4c7800599d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1  0x0000000000632a5a in monoeg_g_usleep (microseconds=1935890) at gdate-unix.c:53
#2  0x000000000061f2bb in suspend_sync_nolock (id=id <at> entry=139966347736832,
interrupt_kernel=interrupt_kernel <at> entry=1) at mono-threads.c:870
#3  0x000000000061f732 in mono_thread_info_safe_suspend_and_run (id=139966347736832,
interrupt_kernel=interrupt_kernel <at> entry=1, callback=callback <at> entry=0x5813c0
<suspend_thread_critical>, user_data=user_data <at> entry=0x7fffda0d3a10) at mono-threads.c:891
#4  0x0000000000581d29 in suspend_thread_internal (thread=0x7f4c78a74430, interrupt=1) at threads.c:4531
#5  0x00000000005834ff in mono_thread_suspend_all_other_threads () at threads.c:3154
#6  0x0000000000538176 in ves_icall_System_Environment_Exit (result=0) at icall.c:6006
#7  0x0000000040731832 in ?? ()
MarLOne | 10 Jun 13:20 2016
Picon

Configuration.HasFile returning wrong value in Mono

Hi,

I am using C# in Mono in Mint17 - Mono JIT compiler version 4.2.3 (Stable
4.2.3.4/832de4b Wed Mar 16 13:19:08 UTC 2016)
 and come across this problem picked up by my NUnit code when I am moving
code from Windows/CLR to Mono.

I obtain the *Configuration* object by using
*ConfigurationManager.OpenExeConfiguration( string exePath )*. When I pass
to it a path like this: /home/testapp/MyApp.exe.config, the method returns a
non-null object, same as in CLR.

However the Config.FilePath becomes /home/testapp/MyApp.exe.config.config
which clearly is not a conventional config file name for an executable.
File.Exists() with this path returning false, as expected.

However, Config.HasFile returning true in Mono while it is false in CLR.
Clearly this value should be false in view of File.Exists() returning false. 

Can someone explains why this anomaly between CLR and Mono?

Here is my Unit test code:

L5 fails in Mono while it does not fail in Windows/CLR. Why?

MarLOne

--
View this message in context: http://mono.1490590.n4.nabble.com/Configuration-HasFile-returning-wrong-value-in-Mono-tp4667959.html
Sent from the Mono - Dev mailing list archive at Nabble.com.
nicob | 9 Jun 16:10 2016
Picon

Initialize mono on non main thread

Hi, I'm initializing Mono using Java JNI. The JNI code doesn't run on the
main thread, therefore I get segmentation faults. I've tried
mono_thread_attach (mono_domain_get_root_domain and mono_domain_get) but
that's the place where the SIGSEGV is generated. 

Can Mono be initialized on a non main thread? As I'm using JNI I've
implemented the signal chaining to avoid segmentation faults. 

My code runs perfectly if you have only one thread, otherwise it fails with
a SIGSEGV. 

Thanks!

--
View this message in context: http://mono.1490590.n4.nabble.com/Initialize-mono-on-non-main-thread-tp4667956.html
Sent from the Mono - Dev mailing list archive at Nabble.com.
nicob | 8 Jun 21:26 2016
Picon

mono_add_internal_call to mscorlib method

Hi, I'm trying to use the Skype Sdk in mono embedded. I'm able to load the
assemblies but when I try to create a instance of a class that internally
calls System.Environment.GetCommandLineArgs it throws an exception. After
digging a bit I found out the System.Environment.GetCommandLineArgs
internally invokes:

private static extern string[] GetCommandLineArgsNative();

this method is returning null therefore it fails. I'm trying to add an
internal call to the extern method using:

mono_add_internal_call ("System.Environment::GetCommandLineArgsNative",
(void*)GetCommandLine);

I also tried:

mono_add_internal_call
("mscorlib.System.Environment::GetCommandLineArgsNative",
(void*)GetCommandLine);

but neither of them seem to work as my C++ function is never invoked.

C++ function:
MonoArray* GetCommandLine()
{
    MonoDomain* domain = mono_domain_get();
    MonoArray* array = mono_array_new(domain, mono_get_string_class(), 1);

    MonoString* ret = mono_string_new(domain, "yeah!");
    mono_array_setref(array, 0, ret);

    return array;
}

Am I missing something?

Thanks in advanced.

--
View this message in context: http://mono.1490590.n4.nabble.com/mono-add-internal-call-to-mscorlib-method-tp4667951.html
Sent from the Mono - Dev mailing list archive at Nabble.com.
Neale Ferguson | 7 Jun 22:20 2016
Picon

Setting and Getting Thread Priority

After adding support for setting and getting thread priorities I’ve found
there’s one piece missing. Before a thread is started you can set its
priority such that when it’s started it will start with this priority
instead of the priority of the thread creating it.

At the moment the code lets this happen but does nothing with it. I have
started adding code to fix this missing piece. The trouble I am having is
that the wthreads.c implementation of GetThreadPriority() will simply
return THREAD_PRIORITY_NORMAL if the thread hasn’t been started yet: even
if a call to SetPriority has already been made. I was storing the priority
value in the MonoThread structure, however, wthreads.c does not have
visibility of this structure. I thought of putting it in MonoThreadInfo
but I’m not sure at which point this structure comes into existence. The
other place is within MonoThreadInternal which wthreads.c does see but
this requires a change in Threads.cs which has a one-to-one field
correspondence with the structure.

The other changes work fine such that a thread (running with the right
policy) will have its priority set at creation time. All that is missing
is GetPriority reporting the correct value before start time.

Neale

_______________________________________________
Mono-devel-list mailing list
Mono-devel-list <at> lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list

Re: FW: Random hangs while running mono app

Hi Rodrigo, 

I didn¹t realize our previous emails were off thread.  I was able to get
the STAT CUE CARD output.  See below:

STATE CUE CARD: (? means a positive number, usually 1 or 2, * means any
number)
        0x0     - starting (GOOD, unless the thread is running managed
code)
        0x1     - running (BAD, unless it's the gc thread)
        0x2     - detached (GOOD, unless the thread is running managed
code)
        0x?03   - async suspended (GOOD)
        0x?04   - self suspended (GOOD)
        0x?05   - async suspend requested (BAD)
        0x?06   - self suspend requested (BAD)
        0x*07   - blocking (GOOD)
        0x?08   - blocking with pending suspend (GOOD)
--thread 0xa38b30 id 0x7fffec637700 [(nil)] state 105
--thread 0x9d8640 id 0x7fffedae7780 [(nil)] state 1  GC INITIATOR
WAITING for 1 threads, got 0 suspended
suspend_thread suspend took 200 ms, which is more than the allowed 200 ms
STATE CUE CARD: (? means a positive number, usually 1 or 2, * means any
number)
        0x0     - starting (GOOD, unless the thread is running managed
code)
        0x1     - running (BAD, unless it's the gc thread)
        0x2     - detached (GOOD, unless the thread is running managed
code)
        0x?03   - async suspended (GOOD)
        0x?04   - self suspended (GOOD)
        0x?05   - async suspend requested (BAD)
        0x?06   - self suspend requested (BAD)
        0x*07   - blocking (GOOD)
        0x?08   - blocking with pending suspend (GOOD)
--thread 0xa38b30 id 0x7fffec637700 [(nil)] state 105
--thread 0x9d8640 id 0x7fffedae7780 [(nil)] state 1  GC INITIATOR
WAITING for 1 threads, got 0 suspended
suspend_thread suspend took 200 ms, which is more than the allowed 200 ms
STATE CUE CARD: (? means a positive number, usually 1 or 2, * means any
number)
        0x0     - starting (GOOD, unless the thread is running managed
code)
        0x1     - running (BAD, unless it's the gc thread)
        0x2     - detached (GOOD, unless the thread is running managed
code)
        0x?03   - async suspended (GOOD)
        0x?04   - self suspended (GOOD)
        0x?05   - async suspend requested (BAD)
        0x?06   - self suspend requested (BAD)
        0x*07   - blocking (GOOD)
        0x?08   - blocking with pending suspend (GOOD)
--thread 0xa38b30 id 0x7fffec637700 [(nil)] state 105
--thread 0x9d8640 id 0x7fffedae7780 [(nil)] state 1  GC INITIATOR
WAITING for 1 threads, got 0 suspended
suspend_thread suspend took 200 ms, which is more than the allowed 200 ms
STATE CUE CARD: (? means a positive number, usually 1 or 2, * means any
number)
        0x0     - starting (GOOD, unless the thread is running managed
code)
        0x1     - running (BAD, unless it's the gc thread)
        0x2     - detached (GOOD, unless the thread is running managed
code)
        0x?03   - async suspended (GOOD)
        0x?04   - self suspended (GOOD)
        0x?05   - async suspend requested (BAD)
        0x?06   - self suspend requested (BAD)
        0x*07   - blocking (GOOD)
        0x?08   - blocking with pending suspend (GOOD)
--thread 0xa38b30 id 0x7fffec637700 [(nil)] state 105
--thread 0x9d8640 id 0x7fffedae7780 [(nil)] state 1  GC INITIATOR
WAITING for 1 threads, got 0 suspended
suspend_thread suspend took 200 ms, which is more than the allowed 200 ms
STATE CUE CARD: (? means a positive number, usually 1 or 2, * means any
number)
        0x0     - starting (GOOD, unless the thread is running managed
code)
        0x1     - running (BAD, unless it's the gc thread)
        0x2     - detached (GOOD, unless the thread is running managed
code)
        0x?03   - async suspended (GOOD)
        0x?04   - self suspended (GOOD)
        0x?05   - async suspend requested (BAD)
        0x?06   - self suspend requested (BAD)
        0x*07   - blocking (GOOD)
        0x?08   - blocking with pending suspend (GOOD)
--thread 0xa38b30 id 0x7fffec637700 [(nil)] state 105
--thread 0x9d8640 id 0x7fffedae7780 [(nil)] state 1  GC INITIATOR
WAITING for 1 threads, got 0 suspended
suspend_thread suspend took 200 ms, which is more than the allowed 200 ms
STATE CUE CARD: (? means a positive number, usually 1 or 2, * means any
number)
        0x0     - starting (GOOD, unless the thread is running managed
code)
        0x1     - running (BAD, unless it's the gc thread)
        0x2     - detached (GOOD, unless the thread is running managed
code)
        0x?03   - async suspended (GOOD)
        0x?04   - self suspended (GOOD)
        0x?05   - async suspend requested (BAD)
        0x?06   - self suspend requested (BAD)
        0x*07   - blocking (GOOD)
        0x?08   - blocking with pending suspend (GOOD)
--thread 0xa38b30 id 0x7fffec637700 [(nil)] state 105
--thread 0x9d8640 id 0x7fffedae7780 [(nil)] state 1  GC INITIATOR
WAITING for 1 threads, got 0 suspended
suspend_thread suspend took 200 ms, which is more than the allowed 200 ms
STATE CUE CARD: (? means a positive number, usually 1 or 2, * means any
number)
        0x0     - starting (GOOD, unless the thread is running managed
code)
        0x1     - running (BAD, unless it's the gc thread)
        0x2     - detached (GOOD, unless the thread is running managed
code)
        0x?03   - async suspended (GOOD)
        0x?04   - self suspended (GOOD)
        0x?05   - async suspend requested (BAD)
        0x?06   - self suspend requested (BAD)
        0x*07   - blocking (GOOD)
        0x?08   - blocking with pending suspend (GOOD)
--thread 0xa38b30 id 0x7fffec637700 [(nil)] state 105
--thread 0x9d8640 id 0x7fffedae7780 [(nil)] state 1  GC INITIATOR

‹ ‹ ‹ 
Glover E. George
Computer Scientist
Information Technology Laboratory
US Army Engineer Research and Development Center
Vicksburg, MS 39180
601-634-4730

From:  Rodrigo Kumpera <kumpera <at> gmail.com>
Date:  Friday, April 29, 2016 at 9:12 AM
To:  Glover George <glover.e.george <at> erdc.dren.mil>
Cc:  "mono-devel-list <at> lists.ximian.com" <mono-devel-list <at> lists.ximian.com>
Subject:  Re: [Mono-dev] FW: Random hangs while running mono app

This looks like a shutdown bug in mono.

Do you have a reliable way to reproduce it?
How loaded are the machines running your workload?

On Thu, Apr 28, 2016 at 6:41 PM, George, Glover E ERDC-RDE-ITL-MS CIV
<Glover.E.George <at> erdc.dren.mil> wrote:

One more important point I forgot to mention, when this occurs, it is
generally only a few mono instances out of the 500+ simultaneous mono
instances.  Most of them terminate.  And more importantly, the mono
instances that do hang actually complete their
 work!  It seems to be when mono is trying to close down.

This has occurred with mono 4.2.x, 4.3.x, 4.4.x, and current git.

Happened today with this configuration:
Mono JIT compiler version 4.3.2 (Stable 4.3.2.467/ba2e5e4 Tue Mar 15
11:44:06 CDT 2016)
Copyright (C) 2002-2014 Novell, Inc, Xamarin Inc and Contributors.
Blockedwww.mono-project.comBlocked
<Blockedhttp://www.mono-project.comBlocked>
TLS:           __thread
SIGSEGV:       altstack
Notifications: epoll
Architecture:  amd64
Disabled:      none
Misc:          bigarrays softdebug
LLVM:          supported, not enabled.
GC:            sgen

From: <mono-devel-list-bounces <at> lists.ximian.com> on behalf of Glover
George <glover.e.george <at> erdc.dren.mil>
Date: Thursday, April 28, 2016 at 5:25 PM
To: "mono-devel-list <at> lists.ximian.com" <mono-devel-list <at> lists.ximian.com>
Subject: [Mono-dev] Random hangs while running mono app

Hi all,

Quick background: We¹re running mono on an HPC platform (SGI ICE-X / SUSE
Enterprise Linux 11) with a Lustre filesystem.  For several months, I¹ve
noticed random hangs with our application.  In a given batch job, we run
500+ instances of mono in parallel
 (no interprocess communication).   I haven¹t been able to reliably
reproduce the issue which leads me to believe it¹s a timing issue, and
more specifically, I have reason to believe it may be an issue with the
interaction of mono and the filesystem.  Being
 a clustered file-system, the load on the backing stores can vary due to
other users.

When I notice that my job hasn¹t finished in a reasonable amount of time,
I log into the compute node and do a  ³ ps -efL | grep mono ³ which
reveals:

george  16728 16575 16728  6    3 16:13 ?        00:03:58 mono
--runtime=v4.0.30319 /app/MyConsole.exe
ggeorge  16728 16575 16768  0    3 16:13 ?        00:00:02 mono
--runtime=v4.0.30319 /app/MyConsole.exe
ggeorge  16728 16575 16815  0    3 16:13 ?        00:00:04 mono
--runtime=v4.0.30319 /app/MyConsole.exe

16728 is the top proc which spawns 16768 and 16815 (main, gc and maybe
finalizer thread?).

Attaching to each of these 3 pid¹s with gdb and doing a backtrace reveals
the traces below.  

1.  Is it possible this is a GC hang?

2.  Is it possible this is a race condition that has anything to do with
the filesystem?

3.  Any idea how this sequence could occur?  If there¹s a way to gather
more information, please let me know.

The reason for the file system suspicion is my app writes several small
files, times the number of mono instances (500+).  It only happens
randomly, but seems to be more frequent the more mono instances I run.

Any help is HIGHLY appreciated.

PID 16728
‹‹‹
(gdb) bt
#0  0x00007fffecccd324 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007fffeccc8684 in _L_lock_1091 () from /lib64/libpthread.so.0
#2  0x00007fffeccc84f6 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007fffed8f6dcc in _dl_open () from /lib64/ld-linux-x86-64.so.2
#4  0x00007fffec842530 in do_dlopen () from /lib64/libc.so.6
#5  0x00007fffed8f2e86 in _dl_catch_error () from
/lib64/ld-linux-x86-64.so.2
#6  0x00007fffec8425e5 in dlerror_run () from /lib64/libc.so.6
#7  0x00007fffec8426d7 in __libc_dlopen_mode () from /lib64/libc.so.6
#8  0x00007fffec81d2e5 in init () from /lib64/libc.so.6
#9  0x00007fffecccbd03 in pthread_once () from /lib64/libpthread.so.0
#10 0x00007fffec81d43c in backtrace () from /lib64/libc.so.6
#11 0x00000000004ac025 in mono_handle_native_sigsegv (signal=<optimized
out>, ctx=<optimized out>, info=<optimized out>)
    at mini-exceptions.c:2309
#12 <signal handler called>
#13 0x00007fffec75e875 in raise () from /lib64/libc.so.6
#14 0x00007fffec75fe51 in abort () from /lib64/libc.so.6
#15 0x000000000064528a in monoeg_log_default_handler (log_domain=0x0,
log_level=G_LOG_LEVEL_ERROR, message=
    0x1ac7660 "suspend_thread suspend took 200 ms, which is more than the
allowed 200 ms", unused_data=0x0) at goutput.c:233
#16 0x0000000000645077 in monoeg_g_logv (log_domain=0x0,
log_level=G_LOG_LEVEL_ERROR, format=
    0x7015d8 "suspend_thread suspend took %d ms, which is more than the
allowed %d ms", args=0x7fffffffce58) at goutput.c:113
#17 0x000000000064512d in monoeg_g_log (log_domain=0x0,
log_level=G_LOG_LEVEL_ERROR, format=
    0x7015d8 "suspend_thread suspend took %d ms, which is more than the
allowed %d ms") at goutput.c:123
#18 0x000000000063a13f in mono_threads_wait_pending_operations () at
mono-threads.c:238
#19 0x000000000063a8cd in suspend_sync (interrupt_kernel=1,
tid=140737159329536) at mono-threads.c:877
#20 suspend_sync_nolock (interrupt_kernel=1, id=140737159329536) at
mono-threads.c:892
#21 mono_thread_info_safe_suspend_and_run (id=140737159329536,
interrupt_kernel=interrupt_kernel <at> entry=1, callback=callback <at> entry=
    0x58d5c0 <abort_thread_critical>,
user_data=user_data <at> entry=0x7fffffffd3e0) at mono-threads.c:935
#22 0x0000000000591a86 in abort_thread_internal
(thread=thread <at> entry=0x7fffec6e0230,
install_async_abort=install_async_abort <at> entry=1,
    can_raise_exception=1) at threads.c:4728
#23 0x0000000000591b29 in mono_thread_internal_stop
(thread=0x7fffec6e0230) at threads.c:2385
---Type <return> to continue, or q <return> to quit---
#24 0x00000000005b123e in mono_gc_cleanup () at gc.c:842
#25 0x00000000005aab8e in mono_runtime_cleanup
(domain=domain <at> entry=0x9d9e00) at appdomain.c:356
#26 0x0000000000426c8b in mini_cleanup (domain=0x9d9e00) at
mini-runtime.c:4017
#27 0x000000000047fac6 in mono_main (argc=11, argv=<optimized out>) at
driver.c:2115
#28 0x0000000000424c68 in mono_main_with_options (argv=0x7fffffffd698,
argc=11) at main.c:20
#29 main (argc=<optimized out>, argv=<optimized out>) at main.c:53

Thread 16768
‹‹‹
(gdb) bt
#0  0x00007fffeccca66c in pthread_cond_wait <at>  <at> GLIBC_2.3.2 () from
/lib64/libpthread.so.0
#1  0x000000000060c873 in mono_os_cond_wait (mutex=0x97e640 <lock>,
cond=0x97e600 <work_cond>) at ../../mono/utils/mono-os-mutex.h:105
#2  thread_func (thread_data=0x0) at sgen-thread-pool.c:118
#3  0x00007fffeccc6806 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fffec80a9bd in clone () from /lib64/libc.so.6
#5  0x0000000000000000 in ?? ()
(gdb) 

Thread 16815
‹‹‹‹
Thread #0  0x00007fffec75ec8b in sigsuspend () from /lib64/libc.so.6
#1  0x000000000063cda6 in suspend_signal_handler (_dummy=<optimized out>,
info=<optimized out>, context=0x7fffec633dc0)
    at mono-threads-posix-signals.c:209
#2  <signal handler called>
#3  0x00007fffed8faf97 in open64 () from /lib64/ld-linux-x86-64.so.2
#4  0x00007fffed8ea82d in open_verify () from /lib64/ld-linux-x86-64.so.2
#5  0x00007fffed8eade0 in open_path () from /lib64/ld-linux-x86-64.so.2
#6  0x00007fffed8ece7c in _dl_map_object () from
/lib64/ld-linux-x86-64.so.2
#7  0x00007fffed8f7400 in dl_open_worker () from
/lib64/ld-linux-x86-64.so.2
#8  0x00007fffed8f2e86 in _dl_catch_error () from
/lib64/ld-linux-x86-64.so.2
#9  0x00007fffed8f6e3b in _dl_open () from /lib64/ld-linux-x86-64.so.2
#10 0x00007fffecedcf9b in dlopen_doit () from /lib64/libdl.so.2
#11 0x00007fffed8f2e86 in _dl_catch_error () from
/lib64/ld-linux-x86-64.so.2
#12 0x00007fffecedd33c in _dlerror_run () from /lib64/libdl.so.2
#13 0x00007fffecedcf01 in dlopen <at>  <at> GLIBC_2.2.5 () from /lib64/libdl.so.2
#14 0x0000000000630b79 in mono_dl_open (name=name <at> entry=0x1c02790
"libSystem.Data", flags=flags <at> entry=1, error_msg=error_msg <at> entry=
    0x7fffec634e80) at mono-dl.c:150
#15 0x000000000054b9f0 in cached_module_load (name=name <at> entry=0x1c02790
"libSystem.Data", err=err <at> entry=0x7fffec634e80, flags=1)
    at loader.c:1398

_______________________________________________
Mono-devel-list mailing list
Mono-devel-list <at> lists.ximian.com
Blockedhttp://lists.ximian.com/mailman/listinfo/mono-devel-listBlocked

Gmane