Holger Weiß | 28 Oct 13:46 2014

gen_tcp:send/2 gets stuck despite send_timeout

Hi there,

I'm an ejabberd contributor, and we're currently facing the issue that
gen_tcp:send/2 occasionally blocks forever even though a 'send_timeout'
(and 'send_timeout_close') has been specified.¹  This seems to happen
only under rare circumstances, but when it happens, it can crash the VM,
as the process that's stuck in the gen_tcp:send/2 call stops processing
its message queue and therefore eats the available memory, eventually.

This *only* seems to happen when epoll(7) is used, i.e. when "+K true"
is specified on Linux.  "+K false" makes the issue go away.

Also, it only happens when the TCP socket is no longer usable.  In the
past, it could occur that an ejabberd process called gen_tcp:send/2 even
though an earlier call returned a failure already.  Since we changed the
code to fix that, the issue is triggered less frequently; and in those
cases where it still *is* triggered, it's obvious from looking at the
details that the socket got closed more or less at the same time.

The problem is that I'm not able to reproduce this myself.  So far,
we've only been made aware of this issue on two servers, both of them
running in production, and it's only easily reproducible on one of them.
That one is running Erlang 17.1 on a Xen instance (I guess I could ask
the admin to update to 17.3).

Without code to reproduce the issue, this is probably non-trivial to
debug :-(  At least there's one live system where the issue is usually
triggered multiple times per day.  Any suggestions on how to proceed?
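For readers unfamiliar with the options discussed above, here is a minimal sketch of how 'send_timeout' and 'send_timeout_close' are normally passed and how a timed-out send is expected to surface. The module name, the loopback listener and the 5000 ms value are illustrative assumptions, not taken from ejabberd, and the sketch does not reproduce the hang.

```erlang
-module(send_timeout_opts).
-export([demo/0, send/2]).

%% Wrap gen_tcp:send/2 so that a timeout is reported to the caller
%% instead of being silently retried on a socket that may be closed.
send(Socket, Data) ->
    case gen_tcp:send(Socket, Data) of
        ok               -> ok;
        {error, timeout} -> {error, send_timed_out};
        {error, Reason}  -> {error, Reason}
    end.

%% Open a loopback connection with both options set and send once.
%% (Assumption: a local listener on an OS-assigned port stands in for
%% the real peer.)
demo() ->
    {ok, L} = gen_tcp:listen(0, [binary, {active, false},
                                 {ip, {127,0,0,1}}]),
    {ok, Port} = inet:port(L),
    {ok, S} = gen_tcp:connect({127,0,0,1}, Port,
                              [binary, {active, false},
                               {send_timeout, 5000},
                               {send_timeout_close, true}]),
    Result = send(S, <<"ping">>),
    ok = gen_tcp:close(S),
    ok = gen_tcp:close(L),
    Result.
```

With these options a send that cannot complete within the timeout should return {error, timeout} rather than block, which is the behaviour the report says is not observed under "+K true".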

Thanks, Holger

Vicent Ferrer Guasch | 26 Oct 19:05 2014

ASN.1 PER compile


I am trying to compile the latest S1AP (3GPP 36.413) ASN.1 specifications
using asn1ct:compile/2, but the generated source file is incorrect. I
think compile/2 has a problem with table constraints when using PER
encoding, i.e.

InitiatingMessage ::= SEQUENCE {



({S1AP-ELEMENTARY-PROCEDURES}{@procedureCode}),

({S1AP-ELEMENTARY-PROCEDURES}{@procedureCode})


The generated code contains @, for example:

<<V1@V0:2/unsigned-unit:1,V1@Buf1/bitstring>> = Bytes1,

{V1@V0,V1@Buf1}


Dániel Szoboszlay | 1 Oct 14:14 2014

gen_server timeout disturbed by system messages


When a gen_server (or gen_fsm, which is very similar in this respect) is waiting for a message with a finite timeout but receives a system message, it returns to the message loop with the original timeout value. I think this is incorrect behaviour.

Let’s say I set the timeout to 60 seconds, and 20 seconds later a system message arrives. After handling it, the gen_server loop will wait a full 60 seconds before I receive the timeout: a total of 80 seconds instead of 60. Even worse, if a monitoring tool (like observer) keeps polling the server for its state, I may never get a timeout at all.

The docs say:

If an integer timeout value is provided, a timeout will occur unless a request or a message is received within Timeout milliseconds.

I’m not sure whether a "message" in this text is meant to include a "system message" too, but I think system messages are not very well known, and to me the docs read as if my gen_server code will get control back via a handle_call, handle_cast or handle_info callback within the timeout specified. Losing control forever is definitely not something I would be prepared for when setting a timeout.

So I believe the gen_server code should record the time when the wait started and, after processing a system message, deduct the elapsed time from the original timeout. This way the timeout would occur when it should (unless, of course, the process receives system messages faster than it can handle them and can never clear its message queue).

Let me know whether you agree with me on the expected behaviour - if you do, I can write a patch and submit a PR, but I don’t want to waste my time working on a non-issue.
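The proposal above can be sketched outside of gen_server as a plain receive loop that keeps a fixed deadline. This is only an illustration under stated assumptions: the module and function names are hypothetical, the system message is merely skipped rather than passed to sys:handle_system_msg/6, and erlang:monotonic_time(millisecond) requires an OTP release newer than the 17.x versions discussed in this thread.

```erlang
-module(deadline_loop).
-export([remaining/2, wait/1]).

%% Milliseconds left until Deadline, never negative.
remaining(Deadline, Now) when Deadline > Now -> Deadline - Now;
remaining(_Deadline, _Now) -> 0.

%% Wait up to Timeout ms for an ordinary message. System messages do
%% not restart the clock, because the deadline is computed only once.
wait(Timeout) ->
    Deadline = erlang:monotonic_time(millisecond) + Timeout,
    wait_until(Deadline).

wait_until(Deadline) ->
    Left = remaining(Deadline, erlang:monotonic_time(millisecond)),
    receive
        {system, _From, _Req} ->
            %% Placeholder: a real behaviour would handle the request
            %% here before going back to waiting.
            wait_until(Deadline);
        Msg ->
            {message, Msg}
    after Left ->
        timeout
    end.
```

In the example from the post, a 60-second timeout interrupted after 20 seconds resumes with remaining(60000, 20000), i.e. 40000 ms, instead of a fresh 60000 ms.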

Thanks & Regards,
erlang-bugs mailing list
erlang-bugs@erlang.org
刘小飞 | 29 Sep 12:22 2014

Erlang vm beam.smp crash

I used http://www.erlang.org/download/otp_src_17.0.tar.gz to build Erlang.

BIF_RETTYPE port_get_data_1(BIF_ALIST_1)
{
    /*
     * This is not a signal. See comment above.
     */
    Eterm res;
    erts_aint_t data;
    Port* prt;

    prt = data_lookup_port(BIF_P, BIF_ARG_1);
    if (!prt)

    data = erts_smp_atomic_read_ddrb(&prt->data);
    if (!data)
      //I add the two lines to correct it.

    if ((data & 0x3) != 0) {
        res = (Eterm) (UWord) data;
    }
    else {
        ErtsPortDataHeap *pdhp = (ErtsPortDataHeap *) data;
        Eterm *hp = HAlloc(BIF_P, pdhp->hsize);
        res = copy_struct(pdhp->data, pdhp->hsize, &hp, &MSO(BIF_P));
    }
}


(gdb) bt full
#0  0x0000000000514524 in port_get_data_1 (A__p=0x7f4bc0d66488, BIF__ARGS=<value optimized out>) at beam/erl_bif_port.c:591
        pdhp = 0x0
        hp = <value optimized out>
        data = 0
#1  0x000000000054d517 in process_main () at beam/beam_emu.c:2787
        bf = 0x514490 <port_get_data_1>
        result = 1688368833101607
        init_done = 1
        c_p = 0x7f4bc0d66488
        reds_used = 178536832
        x0 = 1688368833101607
        reg = 0x7f4c0aa44180
        HTOP = 0x7f4bc036d350
        E = 0x7f4bc0370b18
        I = 0x7f4bfb5c7af8
        FCALLS = 1984
        tmp_arg1 = 139963324058344
        tmp_arg2 = 15
        tmp_big = {139964436718400, 5662828}
        freg = 0x7f4c0aa461c0
        neg_o_reds = 0
        arith_func = 0
        opcodes = {0x54c14a, 0x54b78e, 0x54c06a, 0x54c0eb, 0x54c2f8, 0x54cee5, 0x54cb67, 0x54e173, 0x54ec5c, 0x54ca4d, 0x54ca43, 0x54ca23, 0x54908b, 0x54c5fe, 0x54d5c8, 0x54d605, 0x54d5f6, 0x54957d, 0x549451, 0x54d366, 0x54d26d,
          0x54d29b, 0x54d063, 0x54d092, 0x54d245, 0x54d223, 0x54d176, 0x54d4bb, 0x54ca52, 0x54caa1, 0x54ba31, 0x54ba07, 0x54ba26, 0x54bf45, 0x54bf66, 0x54ccfb, 0x54c949, 0x546667, 0x54ca9c, 0x54674e, 0x546771, 0x54667a, 0x5466ab,
          0x5466dc, 0x546715, 0x5464cd, 0x54eaa8, 0x54e14e, 0x54ea52, 0x54eb16, 0x546795, 0x5467b6, 0x5467d8, 0x5467f5, 0x546823, 0x546852, 0x546870, 0x54689f, 0x5468cf, 0x5468fc, 0x54692a, 0x546957, 0x54699e, 0x5469e6, 0x546a14,
          0x546a5c, 0x546aa5, 0x546ad3, 0x546b02, 0x546b30, 0x546b78, 0x546bc1, 0x546bf0, 0x546c39, 0x54d3bf, 0x54b2ce, 0x54b39e, 0x54b3c9, 0x54e468, 0x54b3bf, 0x54b5f8, 0x54e1d9, 0x54b046, 0x54b0ac, 0x54b65f, 0x54ce89, 0x54b47a,
          0x54d44b, 0x54e23b, 0x54b4bd, 0x5493b7, 0x54947d, 0x54ddf7, 0x54df22, 0x54e011, 0x54d8dd, 0x54d60f, 0x54d694, 0x54d965, 0x54d9db, 0x54da59, 0x54db67, 0x54d718, 0x54d348, 0x54d2c2, 0x54d340, 0x54b103, 0x54d34d, 0x54d1a0,
          0x54d8d6, 0x54d857, 0x54d8b7, 0x54d6ad, 0x54d57d, 0x54d5ba, 0x54d810, 0x54d849, 0x549430, 0x54bbbe, 0x54e2f0, 0x54d38c, 0x54e331, 0x54e345, 0x54bab9, 0x54e357, 0x549185, 0x54e3c9, 0x54e3e9, 0x5494f1, 0x549306, 0x549515,
          0x54d6a5, 0x54bbc9, 0x54953a, 0x54d0ee, 0x54d0b2, 0x54df04, 0x54ddd1, 0x54bc6c, 0x54dede, 0x54dd60, 0x54dc7d, 0x54dcef, 0x54af53, 0x54afcc, 0x54e7ca, 0x54e7ff, 0x54e294, 0x54e2be, 0x54b6ca, 0x54d355, 0x54d211, 0x54cf47,
          0x54cf9a, 0x54d007, 0x54d12e, 0x54ec20, 0x54b559, 0x54b507, 0x546539, 0x54656a, 0x5465a3, 0x5465e0, 0x54e0ea, 0x54e400, 0x546504, 0x5464e2, 0x54ea08, 0x54be99, 0x54c53b, 0x54c5d7, 0x54e774, 0x54bee0, 0x54bf28, 0x54bf36,
          0x5464cd, 0x54b222, 0x546d1f, 0x546d3e, 0x546d81, 0x546da0, 0x546dd2, 0x546e29, 0x546e49, 0x546e7c, 0x546d04, 0x546d5e, 0x546e05, 0x546ca2, 0x546cbd, 0x546ce0, 0x546c83, 0x54e83c, 0x54b1cc, 0x54b278, 0x54eb5b, 0x54ba46,
          0x54c683, 0x54c726, 0x54c7b4...}
        temp_bits = 139964436760704
        pt_arity = 139963334550664
        start_time = 0
        start_time_i = 0x0
        EBS = 0x7f4c0288c898
#2  0x00000000004a081b in sched_thread_func (vesdp=0x7f4c0288c880) at beam/erl_process.c:7665
        callbacks = {arg = 0x7f4c02882380, wakeup = 0x4a21b0 <thr_prgr_wakeup>, prepare_wait = 0x49e370 <thr_prgr_prep_wait>, wait = 0x49f6f0 <thr_prgr_wait>, finalize_wait = 0x49e350 <thr_prgr_fin_wait>}
        esdp = 0x7f4c0288c880
        no = 2
#3  0x00000000005df676 in thr_wrapper (vtwd=<value optimized out>) at pthread/ethread.c:110
        result = <value optimized out>
        res = 0x7fffd82d4b90
        twd = <value optimized out>
        thr_func = 0x4a0700 <sched_thread_func>
        arg = 0x7f4c0288c880
        tsep = 0x7f4c0a2800a0
#4  0x00000037d10079d1 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#5  0x00000037d0ce8b6d in ?? ()
No symbol table info available.
#6  0x0000000000000000 in ?? ()
No symbol table info available.

Pavel Baturko | 29 Sep 11:29 2014

bind_to is ignored in snmpm_net_if


I'm looking into the Erlang snmp manager code because my manager stopped working when I updated Erlang from 17.1 to 17.3.

The problem is that in otp/lib/snmp/src/manager/snmpm_net_if.erl (GitHub erlang/otp, branch maint), in the function socket_params, the parameter BindTo is ignored in the case of Family==inet and init

socket_params(Domain, {IpAddr, IpPort}, BindTo, CommonSocketOpts) ->
    case Family of
    inet ->
        case init:get_argument(snmp_fd) of
        {ok, [[FdStr]]} ->
        error ->
            {IpPort, [{ip, IpAddr} | SocketOpts]} <<<<<<<< *
    _ ->

*: here the ip option is added regardless of the BindTo argument.

In the other case branches this option is used.

When the address option is not specified in manager.conf, the snmp manager uses as default and the socket is bound to After that, every gen_udp:send fails with einval.

Lenrel Lenrel | 26 Sep 16:26 2014

beam.smp hangs and use 100% of cpu

I have compiled Erlang 17.3 and RabbitMQ 3.3.5 from source.

Then I generate a load of approx. 200 msg/s to a RabbitMQ exchange.

After a couple of minutes, the beam.smp process hangs and uses 100% of the CPU.

RabbitMQ stops accepting connections, and I can't do anything with rabbitmqctl.

The same thing happens when I compile older versions of Erlang and RabbitMQ (R16B03 / 3.1.5).

This is the output of strace -f -p:

Process 3431 attached with 39 threads
[pid  3509] futex(0x7fa289380990, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid  3510] futex(0x7fa2893809d0, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid  3508] futex(0x7fa289380950, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid  3506] futex(0x7fa2893808d0, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid  3507] wait4(-1,  <unfinished ...>
[pid  3505] futex(0x7fa289380890, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid  3504] futex(0x7fa289380850, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid  3503] futex(0x7fa289380810, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid  3502] futex(0x7fa2893807d0, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid  3501] futex(0x7fa289380790, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid  3500] futex(0x7fa289380750, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid  3499] futex(0x7fa289380710, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid  3497] futex(0x7fa289380690, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid  3496] futex(0x7fa289380650, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid  3498] futex(0x7fa2893806d0, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid  3495] futex(0x7fa289380610, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid  3494] futex(0x7fa2893805d0, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid  3493] futex(0x7fa289380590, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid  3492] futex(0x7fa289380550, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid  3491] futex(0x7fa289380510, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid  3490] futex(0x7fa2893804d0, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid  3489] futex(0x7fa289380490, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid  3488] futex(0x7fa289380450, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid  3487] futex(0x7fa289380410, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid  3486] futex(0x7fa2893803d0, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid  3485] futex(0x7fa289380390, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid  3484] futex(0x7fa289380350, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid  3483] futex(0x7fa289380310, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid  3482] futex(0x7fa2893802d0, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid  3481] futex(0x7fa289380290, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid  3480] futex(0x7fa289380250, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid  3479] futex(0x7fa289380210, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid  3478] futex(0x7fa2893801d0, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid  3477] futex(0x7fa289380190, FUTEX_WAIT_PRIVATE, 4294967295, NULL <unfinished ...>
[pid  3476] futex(0x8dbd44, FUTEX_WAIT_PRIVATE, 3, NULL <unfinished ...>
[pid  3475] read(6,  <unfinished ...>
[pid  3431] select(0, NULL, NULL, NULL, NULL

any help?
Wiesław Bieniek | 26 Sep 15:03 2014

Why ets called ac_tab is public ?


I was wondering why the ETS table holding application envs is public: ets:new(ac_tab, [set, public, named_table]) (from application_controller)?

Envs are read with plain ets reads, and they are set through a call to a gen_server.
So it seems protected access should be enough.
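To illustrate the point, here is a small sketch of what protected access permits: any process may read the table, but only the owner may write to it. The table and module names are made up for the example; this is not the application_controller code.

```erlang
-module(protected_tab).
-export([demo/0]).

demo() ->
    %% The calling process owns the table and may write to it.
    Tab = ets:new(demo_tab, [set, protected, named_table]),
    true = ets:insert(Tab, {key, value}),
    Parent = self(),
    spawn(fun() ->
        %% Reading from another process is allowed ...
        Read = ets:lookup(demo_tab, key),
        %% ... but writing from a non-owner fails with badarg.
        Write = try ets:insert(demo_tab, {key, other})
                catch error:badarg -> badarg
                end,
        Parent ! {result, Read, Write}
    end),
    Result = receive
                 {result, R, W} -> {R, W}
             after 1000 ->
                 timeout
             end,
    true = ets:delete(Tab),
    Result.
```

If all writes already go through the owning gen_server, this access mode would cover the read and write pattern described above.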

My question is:

Why is it public?

Thanks in advance
Wiesław Bieniek
Designer Telco BSS R&D

tel. +48 12 646 12 66
website: www.comarch.pl
Sam Chapin | 26 Sep 07:34 2014

Funs with duplicate names

I'm new to the list, apologies if this is known. Discovered while working in OTP 17.1, and it's still present after updating to 17.3.

Create a module containing the following:

f1() -> fun F() -> "f1" end().                                                 
f2() -> fun F() -> "f2" end().

Load it and execute f2(). On my system, its return value is "f1".
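For anyone who wants to try this, the two functions can be placed in a complete module as follows. The module name is an assumption; the function bodies are exactly those given above.

```erlang
-module(dup_fun_names).
-export([f1/0, f2/0]).

%% Two named funs that share the name F but have different bodies.
f1() -> fun F() -> "f1" end().
f2() -> fun F() -> "f2" end().
```

The expected result of dup_fun_names:f2() is "f2"; the report above is that 17.1 and 17.3 return "f1" instead.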

Michael Truog | 26 Sep 04:19 2014

erts_debug:flat_size/1 wrong?


I have been attempting to compare the output of erts_debug:flat_size/1 
to the memory info at 
http://www.erlang.org/doc/efficiency_guide/advanced.html#id68923 and the 
results show that each term's size is off-by-one (at least for pids 
local/remote, refs local/remote, floats, integers, bignums, binaries and 
atoms).  I know the function is experimental, but this is a bug, right?  
The problem affects top-level terms and nested terms, so it is likely to 
understate the memory with large terms.  I wanted to make sure the 
memory info (in the efficiency guide) was accurate (it seems like it 
is).  I was testing with R16B03 on a 64-bit system.

For example:
1> erts_debug:flat_size(576460752303423488).
2> erts_debug:flat_size(576460752303423487).
3> erts_debug:flat_size(undefined).
4> erts_debug:flat_size([]).
5> erts_debug:flat_size([undefined]).
% 1 word for each element in the list * 2 elements including []
6> erts_debug:flat_size(erlang:make_ref()).
7> erts_debug:flat_size(erlang:self()).
8> erts_debug:flat_size(1.0).


Ali Sabil | 25 Sep 17:07 2014

Erlang 17.3: can't load crypto library on OSX Lion

Hi all,

It looks like the crypto application in Erlang/OTP 17.3 fails on OSX Lion as described here:

Any input on why this would happen?

Tobias Schlager | 23 Sep 14:49 2014

mnesia subscription


I just found out the hard way that mnesia:subscribe/1 might lie to you when you trap exits and are linked to
other processes. Although Mnesia is running, it sometimes returns
'{error,{node_not_running,nonode@nohost}}' when you try to subscribe. Even worse, subsequent calls
to subscribe from the same process tell you that the subscription actually took place (returning
'{error,{already_exists,Category}}'). This is due to a receive statement in
mnesia_subscr:call/1 that tries to unify the function's return value. Unfortunately, it doesn't match
on a specific process id, so any 'EXIT' message in the caller's message queue will be consumed and
trigger the false return value.

For a minimal test case fire up an erlang shell and do the following:
1> application:start(mnesia).
2> process_flag(trap_exit, true).
3> spawn_link(fun() -> ok end).  
4> mnesia:subscribe(system).     
{error,{node_not_running,nonode@nohost}}
5> mnesia:subscribe(system).
6> mnesia:unsubscribe(system).
{ok,nonode@nohost}
7> mnesia:subscribe(system).
{ok,nonode@nohost}
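The difference between matching any 'EXIT' message and matching one from a specific process can be sketched in isolation. This is not the mnesia_subscr code: the module and function names are hypothetical, and the 'EXIT' message is placed in the mailbox by hand so that the example is deterministic.

```erlang
-module(exit_match).
-export([demo/0]).

%% Unselective: the first 'EXIT' from any process is taken as the answer.
wait_any() ->
    receive
        {'EXIT', _AnyPid, Reason} -> {error, Reason}
    after 0 ->
        no_exit
    end.

%% Selective: Pid is already bound, so only an 'EXIT' from that
%% process matches; unrelated 'EXIT' messages stay in the mailbox.
wait_from(Pid) ->
    receive
        {'EXIT', Pid, Reason} -> {error, Reason}
    after 0 ->
        no_exit
    end.

demo() ->
    Unrelated = spawn(fun() -> ok end),
    %% Stand-in for the process whose exit we actually care about.
    Server = self(),
    %% Simulate a trapped exit from an unrelated linked process.
    self() ! {'EXIT', Unrelated, normal},
    Selective = wait_from(Server),
    Unselective = wait_any(),
    {Selective, Unselective}.
```

The selective version ignores the unrelated message, while the unselective one consumes it and reports an error, which matches the symptom in the shell session above.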
