José Valim | 6 Feb 17:15 2016

Internal consistency check failed (with binary matching and orelse)

Hello,

We have run into a failed internal consistency check. Here I am reporting the bug as asked. :D


The code is a bit convoluted, but that was done in order to provide a minimal test case.

Thank you!


José Valim
Skype: jv.ptec
Founder and Director of R&D
_______________________________________________
erlang-bugs mailing list
erlang-bugs <at> erlang.org
http://erlang.org/mailman/listinfo/erlang-bugs
Ingars | 21 Jan 13:38 2016

ASN.1 generation: erlc -bber works, erlc -bper fails

Hi,

I have found an ASN.1 fragment that compiles fine with the erlc -bber flag but fails with erlc -bper.

> iri@ubuntu:~/asn$ erlc -bber TEST.asn
  -> works well

> iri@ubuntu:~/asn$ erlc -bper TEST.asn
  -> raises an error
      ------------------------------------------------
      {{badmatch,1799999989},
       [{asn1ct_imm,per_enc_constrained,4,[{file,"asn1ct_imm.erl"},{line,1139}]},
        {asn1ct_imm,per_enc_integer_1,3,[{file,"asn1ct_imm.erl"},{line,1094}]},
        {asn1ct_imm,'-per_enc_integer/4-lc$^0/1-0-',4,
                    [{file,"asn1ct_imm.erl"},{line,248}]},
        {asn1ct_imm,per_enc_integer,4,[{file,"asn1ct_imm.erl"},{line,248}]},
        {asn1ct_gen_per,gen_encode_prim,3,[{file,"asn1ct_gen_per.erl"},{line,121}]},
        {asn1ct_gen_per,gen_encode_user,2,[{file,"asn1ct_gen_per.erl"},{line,98}]},
        {asn1ct_gen,pgen_types,5,[{file,"asn1ct_gen.erl"},{line,123}]},
        {asn1ct_gen,pgen_typeorval,4,[{file,"asn1ct_gen.erl"},{line,105}]}]}
      ------------------------------------------------

With the asn1c ASN.1-to-C compiler:
> asn1c -gen-PER TEST.asn
  -> also works well


File TEST.asn:
------------------------
    TEST DEFINITIONS IMPLICIT TAGS ::=
    BEGIN
      Longitude ::= INTEGER
      {
        oneMicrodegreeEast(10),
        oneMicrodegreeWest(-10),
        unavailable(1800000001)
      } (-1799999999..1800000001)
    END
------------------------
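
For reference, the arithmetic behind the constraint can be checked with a small sketch. This is not taken from asn1ct; the module name per_range and both function names are made up for illustration. PER encodes a constrained INTEGER as its offset from the lower bound, so the sketch only counts how many values the range (-1799999999..1800000001) holds and how many bits are needed to number them from zero.

```erlang
-module(per_range).
-export([range_size/2, bits_needed/2]).

%% Number of distinct values in the closed range Lb..Ub.
range_size(Lb, Ub) when is_integer(Lb), is_integer(Ub), Lb =< Ub ->
    Ub - Lb + 1.

%% Minimum number of bits needed to number every value in Lb..Ub
%% starting from 0. A single-value range needs 0 bits.
bits_needed(Lb, Ub) ->
    count_bits(range_size(Lb, Ub) - 1, 0).

%% Count the significant bits of a non-negative integer.
count_bits(0, Acc) -> Acc;
count_bits(N, Acc) when N > 0 -> count_bits(N bsr 1, Acc + 1).
```

In the shell, per_range:range_size(-1799999999, 1800000001) gives 3600000001 and per_range:bits_needed(-1799999999, 1800000001) gives 32, so the largest offset from the lower bound does not fit in 31 bits. The badmatch value 1799999989 also happens to equal -10 - (-1799999999), the offset of oneMicrodegreeWest from the lower bound; whether that is what trips per_enc_constrained is only a guess.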


Thanks,

Ingars
/////

_______________________________________________
Paul Davis | 21 Jan 06:26 2016

NIF segfault when using dirty schedulers

Hey all,

I've recently run into a segfault while working with dirty schedulers.
I managed to make a fairly concise reproducing test case at [1]. I
included a stack trace at [2] from when the segfault occurs. This is
definitely a racy segfault as well. I sometimes have to run `rebar
eunit` a handful of times to trigger it.

I'm not hugely familiar with all of the VM internals so I'm at a bit
of a loss on where to start looking further. I did try and get rid of
the requirement for eunit but I couldn't reproduce without it.

This reproduces on both 17.5.6.4 where I found it and 18.2.2. I
haven't tried master or anything of that nature.

Let me know if there's anything else I can do to help debug this.

Thanks,
Paul

[1] https://gist.github.com/davisp/1e71ec7f2f7a70d1b79c
[2] https://gist.github.com/davisp/1e71ec7f2f7a70d1b79c#file-gdb_backtrace-txt
_______________________________________________

Filip Andres | 20 Jan 18:42 2016

Trouble with ethr_dw_atomic_cmpxchg implementation on Intel Atom?

Hello,
I have found that on Intel Atom processors, calling erlang:processes() crashes the VM.
To me it looks like a glitch in the implementation of ethr_dw_atomic_cmpxchg on platforms lacking native support for double-word compare-and-exchange (although I cannot claim I really understand all of erl_threads.h and atomic.h).

GDB stacktrace:
(gdb) bt
#0  0x56798d70 in ethr_dw_atomic_cmpxchg () at ../include/internal/i386/atomic.h:177
#1  0x566103ce in ethr_dw_atomic_cmpxchg_nob (xchg=0xf4e0609c, new=0xf4e060a4, var=0x568688f0 <erts_proc+48>) at beam/erl_threads.h:1456
#2  erts_atomic64_inc_read_nob (var=0x568688f0 <erts_proc+48>) at beam/erl_threads.h:1646
#3  step_interval_nob (icp=0x568688f0 <erts_proc+48>) at beam/utils.c:4954
#4  erts_smp_step_interval_nob (icp=icp@entry=0x568688f0 <erts_proc+48>) at beam/utils.c:5004
#5  0x5671572b in ptab_list_bif_engine (c_p=c_p@entry=0xf6dc0218, res_accp=res_accp@entry=0xf4e06178, mbp=mbp@entry=0xf1f80a88) at beam/erl_ptab.c:927
#6  0x56716a5d in erts_ptab_list (c_p=c_p@entry=0xf6dc0218, ptab=0x568688c0 <erts_proc>) at beam/erl_ptab.c:766
#7  0x5661be76 in processes_0 (A__p=0xf6dc0218, BIF__ARGS=0xf7483100) at beam/bif.c:3841
#8  0x5659978b in process_main () at beam/beam_emu.c:3690
#9  0x56638784 in sched_thread_func (vesdp=0xf6087dc0) at beam/erl_process.c:8021
#10 0x567a19cc in thr_wrapper (vtwd=0xffffd1b4) at pthread/ethread.c:114
#11 0xf7f164be in start_thread (arg=0xf4e06b40) at pthread_create.c:333
#12 0xf7e2a3fe in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:114

Some more information in: https://bugzilla.redhat.com/show_bug.cgi?id=1221824
_______________________________________________
Ulf Wiger | 23 Dec 18:38 2015

SSL handshake crash

Hmm… I sent this to erlang-bugs, but it didn’t seem to get through.

When connecting some Android software to an Erlang node using TLS, we sometimes (about 1 in 3 or 4 times) get the following errors:

2015-12-22 15:31:00.772 [error] <0.210.0> gen_fsm <0.210.0> in state hello terminated with reason: no function clause matching ssl_handshake:update_handshake_history(undefined, <<1,0,0,175,3,1,86,121,221,42,209,19,198,53,3,42,92,9,16,158,197,5,169,29,247,96,14,32,123,176,...>>) line 450

15:31:00.783<dlink_tls_conn/327>dlink_tls_conn:terminate(): Reason: {{function_clause,[{ssl_handshake,update_handshake_history,[undefined,<<1,0,0,175,3,1,86,121,221,42,209,19,198,53,3,42,92,9,16,158,197,5,169,29,247,96,14,32,123,176,109,210,170,150,204,23,32,228,0,0,70,0,4,0,5,0,47,0,53,192,2,192,4,192,5,192,12,192,14,192,15,192,7,192,9,192,10,192,17,192,19,192,20,0,51,0,57,0,50,0,56,0,10,192,3,192,13,192,8,192,18,0,22,0,19,0,9,0,21,0,18,0,3,0,8,0,20,0,17,0,255,1,0,0,64,0,11,0,4,3,0,1,2,0,10,0,52,0,50,0,14,0,13,0,25,0,11,0,12,0,24,0,9,0,10,0,22,0,23,0,8,0,6,0,7,0,20,0,21,0,4,0,5,0,18,0,19,0,1,0,2,0,3,0,15,0,16,0,17>>],[{file,"ssl_handshake.erl"},{line,450}]},{tls_connection,'-next_state/4-fun-0-',3,[{file,"tls_connection.erl"},{line,458}]},{tls_connection,next_state,4,[{file,"tls_connection.erl"},{line,467}]},{gen_fsm,handle_msg,7,[{file,"gen_fsm.erl"},{line,518}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,240}]}]},{gen_fsm,sync_send_all_state_event,[<0.210.0>,{start,infinity},infinity]}} 

2015-12-22 15:31:00.784 [error] <0.210.0> CRASH REPORT Process <0.210.0> with 0 neighbours exited with reason: no function clause matching ssl_handshake:update_handshake_history(undefined, <<1,0,0,175,3,1,86,121,221,42,209,19,198,53,3,42,92,9,16,158,197,5,169,29,247,96,14,32,123,176,...>>) line 450 in gen_fsm:terminate/7 line 626

2015-12-22 15:31:00.785 [error] <0.209.0> gen_server <0.209.0> terminated with reason: {{function_clause,[{ssl_handshake,update_handshake_history,[undefined,<<1,0,0,175,3,1,86,121,221,42,209,19,198,53,3,42,92,9,16,158,197,5,169,29,247,96,14,32,123,176,109,210,170,150,204,23,32,228,0,0,70,0,4,0,5,0,47,0,53,192,2,192,4,192,5,192,12,192,14,192,15,192,7,192,9,192,10,192,17,192,19,192,20,0,51,0,57,0,50,0,56,0,10,192,3,192,13,192,8,192,18,0,22,0,19,0,9,0,21,0,18,0,3,0,8,0,20,0,17,0,255,1,0,0,64,0,11,0,4,3,0,1,2,0,10,0,52,0,50,0,14,0,13,0,25,0,11,0,12,0,24,0,9,0,10,0,22,0,23,0,8,0,...>>],...},...]},...} in gen_fsm:sync_send_all_state_event/3 line 257

2015-12-22 15:31:00.786 [error] <0.209.0> CRASH REPORT Process <0.209.0> with 0 neighbours exited with reason: {{function_clause,[{ssl_handshake,update_handshake_history,[undefined,<<1,0,0,175,3,1,86,121,221,42,209,19,198,53,3,42,92,9,16,158,197,5,169,29,247,96,14,32,123,176,109,210,170,150,204,23,32,228,0,0,70,0,4,0,5,0,47,0,53,192,2,192,4,192,5,192,12,192,14,192,15,192,7,192,9,192,10,192,17,192,19,192,20,0,51,0,57,0,50,0,56,0,10,192,3,192,13,192,8,192,18,0,22,0,19,0,9,0,21,0,18,0,3,0,8,0,20,0,17,0,255,1,0,0,64,0,11,0,4,3,0,1,2,0,10,0,52,0,50,0,14,0,13,0,25,0,11,0,12,0,24,0,9,0,10,0,22,0,23,0,8,0,...>>],...},...]},...} in gen_server:terminate/7 line 826

2015-12-22 15:31:00.787 [error] <0.109.0> Supervisor tls_connection_sup had child undefined started with {tls_connection,start_link,undefined} at <0.210.0> exit with reason no function clause matching ssl_handshake:update_handshake_history(undefined, <<1,0,0,175,3,1,86,121,221,42,209,19,198,53,3,42,92,9,16,158,197,5,169,29,247,96,14,32,123,176,...>>) line 450 in context child_terminated


We run Erlang/OTP 18 [erts-7.2] with ssl-7.2, and the Erlang side has the following options:

[{verify,verify_peer},
{certfile,"/home/.../device_cert.crt"},
{keyfile,"/home/.../device_key.pem"},
{cacertfile,"/home/.../root_cert.crt"},
{verify_fun,{#Fun<dlink_tls_conn.65.24728257>,{'RSAPublicKey',...}}},
{partial_chain,#Fun<dlink_tls_conn.64.24728257>}]

Basically, the verify_fun validates a self-signed cert
https://github.com/PDXostc/rvi_core/blob/develop/components/dlink_tls/src/dlink_tls_conn.erl#L393

and the partial_chain fun most likely does much less than it should
https://github.com/PDXostc/rvi_core/blob/develop/components/dlink_tls/src/dlink_tls_conn.erl#L421

On the Android side, we’re using Android 4.4.2 (API 19).

It feels like a timing-related problem on the Erlang side.

Let me know if you need more information.

BR,
Ulf W
_______________________________________________
David Buckley | 20 Dec 20:48 2015

NIF resources are not checked on module unload

While playing with implementing a NIF, I found some segfaults, and I
eventually got it down to the test case here:

https://gist.github.com/bucko909/a841c716ede6d3903a13

It looks like it's down to my not re-registering the resource on upgrade
(presumably the handle goes stale, is garbage collected, and eventually
it corrupts memory causing segfaults in unrelated emulator code).

I fell into this trap by using code from
https://github.com/davisp/nif-examples -- which I've sent a pull request
to fix.

I fixed my problem by adding enif_open_resource_type to the upgrade function
once I'd clocked my error, so under normal and correct use, I think the
emulator is doing OK.

However, it looks like if I /don't/ reopen it, it's not properly
deleted, and the documentation seems to leave open the possibility of
doing just this ("Existing resource objects, of a module that is
upgraded, must either be deleted or taken over by the new NIF library").
References to resources with the old handle remain uncleaned. Even if I
completely destroy the old module, so that unload is called, these stale
resources persist until a garbage collection. They actually survive
/many/ purge/load cycles in my example code before being garbage
collected and segfaulting the emulator.

Ideas, based on my interpretation of the bug:

If there are lingering resources, which are not TAKEOVER-ed in the
upgrade function, and have a dtor, this should cause an immediate
emulator panic. I can't think of any other behaviour which is safe here.
If they don't have a dtor, it seems safe to keep them around, but their
resource handle needs to be kept alive until they are all destroyed. It
ought to be impossible to create new resources using the old handle, at
least when there is a dtor defined (can a 'dead' flag be set?).

Knowing this behaviour, an application author writing an upgrade
function for this NIF library might at least attempt to destroy all of
his objects when making such an upgrade, in order to have the emulator
survive!

Another approach is to require an /explicit/ delete of old resources,
perhaps simply a call to "enif_delete_unused_resources" or an iteration
of "enif_delete_resource" over "enif_list_resources", and have this call
fail where the old resources are still allocated. Perhaps the library
author could force a purge or panic the emulator themselves at this
point. The emulator should panic if a resource is neither deleted nor
reopened with TAKEOVER.

-- 
David Buckley
_______________________________________________

David Buckley | 18 Dec 15:19 2015

NIF .so reload issues

Hi! I was playing with writing a NIF, and found I couldn't reload.

I'm doing the sort-of accepted thing of loading the NIF in an on_load
function, though if I just execute the function just after load, I get
the same behaviour, so I don't think that's the issue.

Basically, what seems to be the case is that while Erlang will
re-initialise my NIF code (with 'upgrade'), it won't load a /new/
version of the NIF code unless I completely purge the (Erlang) code from
the runtime, forcing Erlang to recheck the module. I'm guessing Erlang
is caching the NIF. Changing the compiled (.so) filename each time fixes
the problem.

Example code here:

https://gist.github.com/bucko909/a3b5099c74bf267e65db

test_reload_post_purge and test_reload_post_reload_complete_purge work
fine (erts-7.1), but the other three don't reload the .so file as I
would expect.

Is this fixable, or must I manually add a purge() in my init() function
before load_nif? (And why does that work? Because at that point there's
no evidence that the new module will have a load_nif, so the old dlopen
can be discarded?)

Seems like in general if the .so file has changed and a module is
reloaded, the user probably wants the new .so file, too! It's at least
worth adding a note to the docs (or a new return value?) if it's an evil
dlopen restriction.

-- 
David Buckley
_______________________________________________

Kenji Rikitake | 18 Dec 04:57 2015

Re: OTP 18.2 HiPE fix on FreeBSD 10.2

Kawano-san: very much appreciated.

I've tested with --enable-hipe --enable-fp-exceptions --enable-native-libs
and so far the BEAM with HiPE seems to be working.

I have to check out the following issues:
* Is the sigaction() handling really OK on FreeBSD?
* Is the dlsym() handling really OK on FreeBSD?
Maybe I need more input from FreeBSD people.

For those who want to test a tentative Port, check here:
though I'm sure Jimmy Olgeni, the maintainer of FreeBSD Erlang Ports,
will override mine in a short period. 

Regards,
Kenji Rikitake


On Thu, Dec 17, 2015 at 10:52 PM, Tatsuya Kawano <tatsuya <at> hibaridb.org> wrote:
Hi Kenji,

On Thu, Dec 17, 2015, at 08:12 PM CST, Kenji Rikitake wrote:
>> The following includes a quick workaround and I need FreeBSD people
>> to further test the HiPE functionalities. (Any good test cases?)
...
> I mixed up patches for 18.2 and master branches. Here's the fixed one for 18.2:
>
> https://github.com/erlang/otp/pull/926

Thank you for the patch. It worked like a charm; I was able to build OTP
18.2 on FreeBSD 10.2 with HiPE enabled.

So far, I have only tested it against boundary bear
<https://github.com/boundary/bear>, which has HiPE enabled by default.
It passed all eunit cases.

--------------------------------------------------
/home/tatsuya% freebsd-version
10.2-RELEASE-p8

/home/tatsuya% cat .kerlrc
KERL_CONFIGURE_OPTIONS="--enable-hipe --enable-smp-support --enable-threads --enable-kernel-poll"

/home/tatsuya% kerl build git https://github.com/jj1bdx/otp.git \
    jj1bdx-18.2-freebsd-hipe-fix-2 18.2_hipe_pr926
Checking Erlang/OTP git repository from https://github.com/jj1bdx/otp.git...
Building Erlang/OTP 18.2_hipe_pr926 from git, please wait...
Erlang/OTP 18.2_hipe_pr926 from git has been successfully built

/home/tatsuya% kerl install 18.2_hipe_pr926 ~/erlang/18.2_hipe_pr926
Installing Erlang/OTP git (18.2_hipe_pr926) in /home/tatsuya/erlang/18.2_hipe_pr926...
You can activate this installation running the following command:
. /home/tatsuya/erlang/18.2_hipe_pr926/activate
Later on, you can leave the installation typing:
kerl_deactivate

/home/tatsuya% . /home/tatsuya/erlang/18.2_hipe_pr926/activate
/home/tatsuya% erl
Erlang/OTP 18 [erts-7.2] [source-e616e04] [64-bit] [smp:8:8] [async-threads:10] [hipe] [kernel-poll:false]

Eshell V7.2  (abort with ^G)
1>
User switch command
 --> q

/home/tatsuya% cd workhub/dev/hibari/hibari/lib/bear/
/home/tatsuya/workhub/dev/hibari/hibari/lib/bear% grep native src/*
src/bear.erl:-compile([native]).

/home/tatsuya/workhub/dev/hibari/hibari/lib/bear% ./rebar clean compile eunit
==> bear (clean)
==> bear (compile)
Compiled src/bear.erl
==> bear (eunit)
Compiled test/bear_test.erl
Compiled src/bear.erl
  All 47 tests passed.
Cover analysis: /usr/home/tatsuya/workhub/dev/hibari/hibari/lib/bear/.eunit/index.html
--------------------------------------------------

Thanks,
Tatsuya Kawano

_______________________________________________
Kenji Rikitake | 17 Dec 04:10 2015

OTP 18.2 HiPE fix on FreeBSD 10.2

https://github.com/erlang/otp/pull/925

OTP 18.2 on FreeBSD 10.2-STABLE does not compile with HiPE enabled.
18.1.5 worked OK, so I guess the recent change for musl libc affected it.
The following includes a quick workaround and I need FreeBSD people
to further test the HiPE functionalities. (Any good test cases?)

Regards,
Kenji Rikitake

_______________________________________________

Mattias Jansson | 25 Nov 14:25 2015

HiPE compiler FP inlining crashing

Hello.

I am having a problem with a crashing HiPE compiler.

It has been tested on Erlang/OTP 18.1 (64-bit) on OS X 10.11.1, and also on Fedora Linux 22 with OTP versions 18.0 and 17.5 (64-bit).
The compiler crashes on Linux as well as OS X under 18.*, but it does not crash under OTP 17.5.

The following code has been shrunk as far as possible while remaining syntactically correct and still triggering the error when compiled (you’d be surprised by how much of this code is actually necessary to reproduce this).

<code>
-module(rat_calc).

-export([eat/0]).

eat() ->
    eat_what(1.0, #{}).

eat_what(Factor, #{rat_type := LT} = Rat) ->
    #{ cheese := Cheese } = Rat,
    UnitCheese = Cheese / 2,
    RetA = case eat() of
               {full, RetA1} ->
                   CheeseB2 = min(RetA1, UnitCheese) * Factor,
                   case eat() of
                       full ->
                           {win, RetA1};
                       hungry ->
                           {partial, RetA1 - CheeseB2}
                   end;
               AOther ->
                   AOther
           end,
    RetB = case eat() of
               {full, RetB1} ->
                   CheeseA2 = min(RetB1, UnitCheese) * Factor,
                   rat:init(single, LT, CheeseA2),
                   case eat() of
                       full ->
                           {full, RetB1};
                       hungry ->
                           {hungry, RetB1 - CheeseA2}
                   end
           end,
    {RetA, RetB}.
</code>

The code compiles without problems when the compiler is run without the +native flag. With the flag, the compiler crashes.
When compiling with the no_inline_fp flag enabled, the compiler does not crash.

Thanks.
// Mattias Jansson
_______________________________________________
Albin Stigö | 26 Oct 11:14 2015

erl -make returns 0 on failure.

Hello,

I'm pretty new to Erlang and I guess this is an old discussion. But why does

erl -make

return 0 even on error? It makes it harder to do:

erl -make && erl -pa ebin/ test/ -noshell -s r1_tests test -s init stop

I have found some previous posts about it through Google, but they
normally boil down to "write a wrapper".
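
One shape such a wrapper could take, as a minimal sketch: make:all/0 returns up_to_date or error, so that result can be mapped onto an exit status and passed to halt/1. The module name make_wrapper and its two functions are hypothetical, not part of OTP.

```erlang
-module(make_wrapper).
-export([main/0, exit_code/1]).

%% Map the return value of make:all/0 to a process exit status.
exit_code(up_to_date) -> 0;
exit_code(error) -> 1.

%% Run make:all/0 against the Emakefile in the current directory and
%% halt the VM with the mapped status.
main() ->
    halt(exit_code(make:all())).
```

Once compiled (for example with erlc make_wrapper.erl), `erl -noshell -s make_wrapper main` exits non-zero when a compilation fails, so it can sit on the left of && in place of erl -make.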

--Albin
_______________________________________________

