Evgeniy Khramtsov | 1 Jun 01:16 2011

Re: Semi-frequent lockup / "crash" in random nodes in ejabberd cluster

01.06.2011 01:18, Armando Di Cianno wrote:
> I'm having an odd case of freezing / "crashing" on seemingly random
> nodes in my 10 machine ejabberd cluster.
>
> Symptoms:
>
>   * Seemingly with no periodicity, one of the nodes in the cluster will
> freeze (the erlang process inside the beam.smp, not the VM)
>   * It doesn't crash (ergo the earlier "crash" scare-quotes), so
> there's no good erl crash dump file to look at
>   * The OS beam.smp process is still running, so there are some crash
> logs coming from our monitoring agent as it tries to restart ejabberd
> and *that* process crashes, since a node is already using that name
>   * The few times I've been right at my workstation, and able to log in
> and manually check what's going on, `ejabberdctl status` fails to run
> manually / connect to the ejabberd process
>
> Notes
>   * 10 machines?! Yeah ... this is running on a managed VM service,
> where we control everything about the guest VMs, but nothing about the
> host machines. Suffice to say, our web services don't seem to exhibit
> related issues, and I believe I have nearly exhausted all routes to
> put blame on the fact that we're using VMs (although, frankly, I'm
> still suspicious).
>   * The machines seem to be over-provisioned for RAM, running ~4GiB
> each -- our stats aggregator shows that ejabberd rarely takes up more
> than 1.8GiB of RAM per node
>   * Average user count: ~4k

Armando Di Cianno | 1 Jun 01:35 2011

Re: Semi-frequent lockup / "crash" in random nodes in ejabberd cluster

> 1) I think your cluster is oversized :) 2 nodes are plenty for your
> load.

I don't doubt this at all. I was considering lowering the amount to
between 4 and 6, but not as low as 2 -- the VMs perform not so
admirably compared to real hardware. I can definitely lower this, and
see what happens, however.

Out of curiosity, what makes you suspect the cluster size?

> 2) Please explain in more detail what the "freeze" looks like: is there
> CPU/RAM/disk consumption? If there is CPU consumption, how many CPU cores
> (threads) are overloaded?

The VMs are 4-core (they used to be 8, but I lowered that while trying
to resolve this very issue). With kernel-polling and smp-on, this turns
out to be 9 Linux LWPs. RAM, CPU, and disk are not pegged or full -- the
node has seemingly simply stopped responding.

When I was able to catch the issue in action on one node, and strace
and gdb-connect on each LWP / thread, I saw: 1 thread waiting on
read() from fd 6, 1 thread in a select(), and the rest of the threads
were in "futex_wait" (or some such name IIRC).
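
For anyone who wants to capture the same per-thread picture without attaching gdb to each LWP, here is a minimal Python sketch. It assumes a Linux host where /proc/<pid>/task/<tid>/status and /proc/<pid>/task/<tid>/wchan are available; the function names thread_states and summarize are made up for illustration, and wchan may be unreadable or uninformative without sufficient privileges.

```python
import os
from collections import Counter


def thread_states(pid, proc_root="/proc"):
    """Return {tid: (state, wchan)} for every thread (LWP) of a process.

    Assumption: Linux procfs layout. proc_root is a parameter only so the
    function can be pointed at a different directory tree.
    """
    task_dir = os.path.join(proc_root, str(pid), "task")
    result = {}
    for tid in sorted(os.listdir(task_dir), key=int):
        state = "?"
        wchan = "-"
        try:
            # The "State:" line looks like "State:\tS (sleeping)".
            with open(os.path.join(task_dir, tid, "status")) as f:
                for line in f:
                    if line.startswith("State:"):
                        state = line.split(":", 1)[1].strip()
                        break
            # wchan names the kernel function the thread is blocked in, if any.
            with open(os.path.join(task_dir, tid, "wchan")) as f:
                wchan = f.read().strip() or "-"
        except OSError:
            # The thread may have exited, or the file may not be readable.
            pass
        result[int(tid)] = (state, wchan)
    return result


def summarize(states):
    """Count how many threads are blocked in each wait channel."""
    return Counter(wchan for _state, wchan in states.values())
```

Calling summarize(thread_states(pid)) with the PID of beam.smp during a freeze would show how many threads sit in each kernel wait, which is easier to log from a monitoring agent than an interactive gdb session.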

Since we run `ejabberdctl reopen_log` periodically, I've been suspicious
of that. We also run `ejabberdctl status` very frequently, which I also
suspect of somehow triggering the issue.

From what I can tell, file descriptors are not running out, and erlang
should have more than enough max processes.

kael | 1 Jun 12:43 2011

Re: pep & caps

On 05/29/2011 01:21 AM, Fabio Forno wrote:
> Hi,

Hello,

> since I upgraded to 2.1.6 (before I was running 2.0.x) I've started
> receiving all PEP notifications each time I connect, even for nodes
> not advertised with +notify. Since this makes ejabberd pretty unusable,
> especially for avatars, is there a way to block this behavior?

This was a bug which has been fixed in the git repo and will be 
available in the next release.

-- 
kael

Fabio Forno | 1 Jun 14:43 2011

Re: pep & caps

Thanks, checking it out... ;)

On Wed, Jun 1, 2011 at 12:43 PM, kael <ka-el <at> laposte.net> wrote:
> On 05/29/2011 01:21 AM, Fabio Forno wrote:
>>
>> Hi,
>
> Hello,
>
>> since I upgraded to 2.1.6 (before I was running 2.0.x) I've started
>> receiving all PEP notifications each time I connect, even for nodes
>> not advertised with +notify. Since this makes ejabberd pretty unusable,
>> especially for avatars, is there a way to block this behavior?
>
> This was a bug which has been fixed in the git repo and will be available in
> the next release.
>
> --
> kael
>
> _______________________________________________
> ejabberd mailing list
> ejabberd <at> jabber.ru
> http://lists.jabber.ru/mailman/listinfo/ejabberd
>

-- 
Fabio Forno,
jabber id: ff <at> jabber.bluendo.com

Jérôme Sautret | 1 Jun 18:45 2011

[ANN] New releases: ejabberd 2.1.7, 3.0.0-alpha-3 and exmpp 0.9.7

Hello,

We are pleased to announce the security and bugfix releases
ejabberd 2.1.7, ejabberd 3.0.0-alpha-3 and exmpp 0.9.7.

If you have ejabberd running on a public server, please update it immediately:
these releases contain a security fix that disables entity expansion completely
to prevent the billion laughs DoS attack (CVE-2011-1753).
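
To illustrate why entity expansion is worth disabling, the arithmetic behind a billion laughs document can be sketched in Python. This is not ejabberd's fix or parser code; the helper names, the entity names (lol0 ... lolN) and the 9-level, 10-way nesting are the commonly quoted shape of the attack, assumed here for illustration.

```python
def build_payload(levels, fanout, base="lol"):
    """Build a nested-entity XML document as text (it is never parsed here)."""
    lines = [
        '<?xml version="1.0"?>',
        "<!DOCTYPE lolz [",
        f'  <!ENTITY lol0 "{base}">',
    ]
    for i in range(1, levels + 1):
        # Each entity refers to the previous one `fanout` times.
        refs = f"&lol{i - 1};" * fanout
        lines.append(f'  <!ENTITY lol{i} "{refs}">')
    lines += ["]>", f"<lolz>&lol{levels};</lolz>"]
    return "\n".join(lines)


def expanded_length(levels, fanout, base="lol"):
    """Characters produced if a parser fully expands the top-level entity."""
    length = len(base)
    for _ in range(levels):
        length *= fanout
    return length


if __name__ == "__main__":
    payload = build_payload(levels=9, fanout=10)
    print("payload size in characters:", len(payload))
    print("expanded size in characters:", expanded_length(levels=9, fanout=10))
```

The document on the wire stays tiny while the expanded text grows by a factor of `fanout` per nesting level, which is why refusing to expand entities at all removes the attack surface.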

=== ejabberd 2.1.7 ===

This release contains many bugfixes, improvements and a few new features.

A short list of changes:
- BOSH: Keep the order of stanzas when BOSH sends several (EJAB-1374)
- CAPTCHA in MUC: New whitelist option
- CAPTCHA: New captcha_limit option
- Core: Disable all entity expansions (EJAB-1451)
- Core: Do not accept XML with undefined prefixes (EJAB-680)
- ejabberdctl: New DIST_USE_INTERFACE restricts the IP address Erlang listens on (EJAB-1404)
- ejabberdctl: New ERL_EPMD_ADDRESS that works since Erlang/OTP R14B03
- extauth: If script crashes, ejabberd should restart it (EJAB-1428)
- If a module start fails during server start, stop erlang (EJAB-1446)
- mod_blocking: New XEP-0191 Simple Communications Blocking (EJAB-695)
- mod_pres_counter: Prevent subscription flood (EJAB-1388)
- mod_register: Access now also controls account unregistrations
- mod_shared_roster: Fix support for anonymous accounts in @all@ (EJAB-1264)
- mod_shared_roster: New @online@ directive (EJAB-1391)
- New Indonesian translation (EJAB-1407)
- Pubsub: Apply filtered notification to PEP last items (EJAB-1456)

Badlop | 1 Jun 23:01 2011

Re: [ANN] New releases: ejabberd 2.1.7, 3.0.0-alpha-3 and exmpp 0.9.7

Hi, I bring bad news:

ejabberd 2.1.7 contains a new bug that makes it impossible to publish
items in PubSub.
Administrators and packagers: you can find the patch and a fixed
mod_pubsub.beam in:
https://support.process-one.net/browse/EJAB-1457

ejabberd 2.1.8 will probably be released in the next few days, and will
include this bugfix and a few other minor changes.

---
Badlop
ProcessOne

Jérôme Sautret | 3 Jun 11:38 2011

[ANN] New release: ejabberd 2.1.8

Hello,

As ejabberd 2.1.7 has an issue on PubSub preventing publication
(EJAB-1457), we release ejabberd 2.1.8 to fix the problem.

http://www.process-one.net/en/ejabberd/downloads

Only the source code is available right now; the installers should be
available on Monday.

If you upgrade from ejabberd 2.0.5 or older, carefully read the
release notes of ejabberd 2.1.0 too, because there were several
changes in the installation path and the configuration options.

zhong ming wu | 9 Jun 12:38 2011

md5 sum link on process one website not working

This link

http://www.process-one.net/downloads/ejabberd/2.1.8/ejabberd-2.1.8.tar.gz.md5

Badlop | 9 Jun 16:31 2011

Re: md5 sum link on process one website not working

2011/6/9 zhong ming wu <mr.z.m.wu <at> gmail.com>:
> This link
>
> http://www.process-one.net/downloads/ejabberd/2.1.8/ejabberd-2.1.8.tar.gz.md5

Thanks for reporting. File uploaded.

---
Badlop
ProcessOne

Daniel Dormont | 10 Jun 16:48 2011

malformed XML in error response: is CDATA not really CDATA?

Hi,

Running ejabberd 2.1.6 I have found at least one instance in which ejabberd sends invalid XML in a reply to a
user. Specifically when an admin in a MUC tries to Kick a Participant whose Nick includes the "&"
character, and that nick does not exist in the room, the reply from ejabberd does not encode the "&"
character in its error message.

Example XML sent:

<iq id="dAu3j-4" to="muc_30802@conference.danslaptop" type="set">
  <query xmlns="http://jabber.org/protocol/muc#admin">
    <item nick="Daniel &amp; Dormont#1004" role="none">
      <reason>Kicked by admin</reason>
    </item>
  </query>
</iq>

Example XML received back:

<iq from='muc_30802@conference.danslaptop' to='admin@danslaptop' id='dAu3j-4' type='error'>
  <query xmlns='http://jabber.org/protocol/muc#admin'>
    <item nick='Daniel &amp; Dormont#1004' role='none'>
      <reason>Kicked by admin</reason>
    </item>
  </query>
  <error code='406' type='modify'>
    <not-acceptable xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/>
    <text xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'>Nickname Daniel & Dormont#1004 does not exist in the room</text>
  </error>
</iq>

As you can see, in the reply when the nickname is mentioned as an <item> attribute it is encoded properly, but
in the <text> message it is not, and my client chokes on the malformed XML.

I looked into the code and that error uses the macro ?ERRT_NOT_ACCEPTABLE which in turn calls
STANZA_ERRORT which in turn wraps the text in an {xmlcdata} term. But when actually transmitted to the
client, the CDATA wrapper is not present nor is the text encoded. Is there anything I can do to fix this? I
don't mind changing the code, but I'm not quite sure where to look.
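
A small Python sketch of the failure, independent of ejabberd's Erlang code: character data inside <text> needs "&" encoded as "&amp;", otherwise a conforming XML parser rejects the stanza. The helper names error_text and is_well_formed are hypothetical; only xml.sax.saxutils.escape and xml.etree.ElementTree from the standard library are used.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

NICK = "Daniel & Dormont#1004"


def error_text(nick, encode=True):
    """Build the <text/> child of the error, with or without encoding."""
    message = f"Nickname {nick} does not exist in the room"
    # escape() replaces &, < and > with their predefined entity references.
    body = escape(message) if encode else message
    return f"<text xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'>{body}</text>"


def is_well_formed(xml):
    """Return True if the fragment parses as XML, False otherwise."""
    try:
        ET.fromstring(xml)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print(is_well_formed(error_text(NICK, encode=False)))  # raw "&" in text
    print(is_well_formed(error_text(NICK, encode=True)))   # "&amp;" in text
```

The same check applies to whatever serializes the error text on the server side: either the text is entity-encoded, or it is emitted inside a literal CDATA section, but a bare "&" in character data is not well-formed.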

thanks,