Scott Haneda | 8 May 2009 01:21

Core dumps

Hello, I deployed a DLZ system for a client on RHEL.  This is the  
first time I have used RHEL, mostly sticking to other OS's.  I kept  
pretty good notes as I went along.

If I look in /var/named I am seeing a good number of large core dump
files.  Previously I saw these, but traced them to malformed mysql
inputs to DLZ.  I do not believe that to be an issue anymore.

What can a core dump tell me to help trace this issue down and solve  
it?  Named is going deaf/dead for some reason, perhaps related, I need  
it to keep up.

Here is what I did to the RHEL to get where I am now:

uname -r
2.6.18-128.el5

Remove outdated stuff
"yum remove bind bind-chroot bind-libs bind-utils"

Unable to build from source due to openSSL issues I could not reconcile.

I went this route:
rpm -i http://people.redhat.com/atkac/bind/bind-9.6.0-2.P1.fc11.src.rpm
yum install libtool
yum install libcap-devel
yum install openldap-devel
yum install postgresql-devel
yum install rpmbuild

Scott Haneda | 8 May 2009 03:10

Re: Core dumps

On May 7, 2009, at 4:21 PM, Scott Haneda wrote:

> What can a core dump tell me to help trace this issue down and solve
> it?  Named is going deaf/dead for some reason, perhaps related, I need
> it to keep up.

I did a little searching and found how to look into the core dumps,  
here is what is happening.  How can I solve this?

root@host [core_dumps:] $ gdb /usr/sbin/named-sdb core.9810
GNU gdb Fedora (6.8-27.el5)
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...
Loaded symbols for /usr/sbin/named-sdb
Reading symbols from /usr/lib64/liblwres.so.50...done.
Loaded symbols for /usr/lib64/liblwres.so.50
Reading symbols from /usr/lib64/libdns.so.50...done.
Loaded symbols for /usr/lib64/libdns.so.50
Reading symbols from /usr/lib64/libbind9.so.50...done.
Loaded symbols for /usr/lib64/libbind9.so.50
Reading symbols from /usr/lib64/libisccfg.so.50...done.
Loaded symbols for /usr/lib64/libisccfg.so.50
Reading symbols from /usr/lib64/libgssapi_krb5.so.2...done.
Loaded symbols for /usr/lib64/libgssapi_krb5.so.2

Scott Haneda | 8 May 2009 07:17

Re: Core dumps

On May 7, 2009, at 4:21 PM, Scott Haneda wrote:

> What can a core dump tell me to help trace this issue down and solve
> it?  Named is going deaf/dead for some reason, perhaps related, I need
> it to keep up.

I just had two more happen.  This is not even a production server as  
of yet, it is being readied for that though.  There should be very  
little hitting named-sdb at this point...

(gdb) backtrace
#0  0x00002af5a089fdfb in ?? () from /usr/lib64/mysql/libmysqlclient.so.15
#1  0x00002af5a08a0179 in my_net_read () from /usr/lib64/mysql/libmysqlclient.so.15
#2  0x00002af5a0899922 in cli_safe_read () from /usr/lib64/mysql/libmysqlclient.so.15
#3  0x00002af5a089a9f9 in ?? () from /usr/lib64/mysql/libmysqlclient.so.15
#4  0x00002af5a0898f9e in mysql_real_query ()
   from /usr/lib64/mysql/libmysqlclient.so.15
#5  0x00002af59f09c67a in mysql_get_resultset (zone=0x4542f960 "ns1.*****.com",
    record=<value optimized out>, client=0x0, query=4, dbdata=0x2af59f3391e0,
    rs=0x4542f918) at ../../contrib/dlz/drivers/dlz_mysql_driver.c:324
#6  0x00002af59f09c80b in mysql_findzone (driverarg=<value optimized out>,
    dbdata=0x2af59f3391e0, name=0x4542f960 "ns1.******.com")
    at ../../contrib/dlz/drivers/dlz_mysql_driver.c:515
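
A backtrace like the one above can also be pulled from every core file in
one pass instead of one interactive gdb session per file.  The following is
only a minimal sketch, assuming Python 3 and gdb on the PATH; the helper
names (gdb_backtrace_cmd, collect_backtraces) are made up for illustration.
It asks gdb for the stacks of all threads, since a plain "backtrace" only
shows the thread that happened to be selected.

```python
import subprocess
from pathlib import Path


def gdb_backtrace_cmd(binary, core):
    """Build a non-interactive gdb command that prints every thread's stack."""
    # -batch exits after running the commands; -ex runs one gdb command.
    return ["gdb", "-batch", "-ex", "thread apply all bt", str(binary), str(core)]


def collect_backtraces(binary, core_dir):
    """Run gdb once per core.* file and return {core file name: gdb output}."""
    results = {}
    for core in sorted(Path(core_dir).glob("core.*")):
        proc = subprocess.run(
            gdb_backtrace_cmd(binary, core),
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
        results[core.name] = proc.stdout
    return results
```

Calling collect_backtraces("/usr/sbin/named-sdb", "/var/named") would cover
the binary and directory mentioned in this thread; if every dump ends in
libmysqlclient on a different thread, that points toward the threading
question rather than toward one bad query.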

Scott Haneda | 8 May 2009 07:22

Re: Core dumps

Here is a new and interesting thing I just noticed.  I turned on mysql
query logging to be able to see if any one query in particular was
causing an issue.

I kept seeing entries similar to this:
090508  3:09:34      24 Query       SELECT zone FROM resource_records WHERE zone = 'www.a.com'
                     24 Query       SELECT zone FROM resource_records WHERE zone = 'a.com'
                     24 Query       SELECT zone FROM resource_records WHERE zone = 'com'

That is the result of one dig:
dig www.a.com @ns1.example.com SOA

Why is it splitting it up and trying all three?

I can make some pretty cool looking logs with this, which cannot be
nice on the DB, let alone the fact that people could use this to beat
the heck out of any DLZ db out there...

dig a.b.c.d.e.f.g.h.i.j.k.l.m.n.o.p.q.r.s.t.u.v.w.x.y.z.com @ns1.example.com SOA

SELECT zone FROM resource_records
	WHERE zone = 'a.b.c.d.e.f.g.h.i.j.k.l.m.n.o.p.q.r.s.t.u.v.w.x.y.z.com'
SELECT zone FROM resource_records
	WHERE zone = 'b.c.d.e.f.g.h.i.j.k.l.m.n.o.p.q.r.s.t.u.v.w.x.y.z.com'
SELECT zone FROM resource_records
	WHERE zone = 'c.d.e.f.g.h.i.j.k.l.m.n.o.p.q.r.s.t.u.v.w.x.y.z.com'

Reyner Herrpinark Lugo | 8 May 2009 07:42

Re: Core dumps

That is how the driver works, my friend. Suppose your DNS server is authoritative for the zone example.com
and holds an A record with the data 10.0.0.1:

dig a www.example.com --> portions of the fully qualified domain name are cut off one at a time until a
zone supported by your DNS server's database is reached.

SELECT zone FROM resource_records WHERE zone = 'www.example.com' --> You do not have a zone with this
name, so "www" is removed and the query is repeated;
SELECT zone FROM resource_records WHERE zone = 'example.com' --> You do have a zone with this name,
so this zone is substituted for the %zone% token in the remaining queries, and www is
substituted for the %record% token in those same queries.
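
The label stripping described above can be sketched in a few lines. This is only an illustration in
Python 3 of the lookup order seen in the query log, not the driver's actual C code; candidate_zones
and find_zone are hypothetical names, and the set of zone names stands in for the resource_records table.

```python
def candidate_zones(qname):
    """Return the names tried, in order, as leading labels are stripped."""
    labels = qname.rstrip(".").split(".")
    return [".".join(labels[i:]) for i in range(len(labels))]


def find_zone(qname, zones):
    """Simulate the zone search against a set of zone names.

    Returns (zone, record, lookups). zone and record are None when no
    candidate matches; record is "" when qname is itself the zone.
    """
    labels = qname.rstrip(".").split(".")
    for i, candidate in enumerate(candidate_zones(qname)):
        if candidate in zones:
            # The stripped labels are what ends up as the record name.
            return candidate, ".".join(labels[:i]), i + 1
    return None, None, len(labels)
```

For www.a.com the candidates are www.a.com, a.com and com, which matches the three SELECTs in the
log. The number of lookups grows with the number of labels in the name being queried, so a name
with many labels costs one query per label when no zone matches.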

Regards,

 Y0.

-----Original Message-----
From: Scott Haneda [mailto:talklists <at> newgeo.com]
Sent: Friday, 8 May 2009 7:22
To: bind-dlz-testers <at> lists.sourceforge.net
Subject: Re: [Bind-dlz-testers] Core dumps

Here is a new and interesting thing I just noticed.  I turned on mysql
query logging to be able to see if any one query in particular was
causing an issue.

I kept seeing entries similar to this:
090508  3:09:34      24 Query       SELECT zone FROM resource_records WHERE zone = 'www.a.com'
                     24 Query       SELECT zone FROM resource_records WHERE zone =

Scott Haneda | 8 May 2009 09:50

Re: Core dumps

Thanks.  Does that mean that this is normal and that even a non-DLZ
driven system is doing the same things in the background?  I have a
few non-DLZ systems in place, with query logging on, and I have never
seen it do so.

Why would it try so hard to locate a record?  If www.example.com has
no zone, why not return 0 and move on?

Thanks

P.S. I used this url to translate: http://is.gd/xGHE
I hope it got most of it correct; it seems to have done an
amazing job.

While I can live with this multiple query issue, and learn about it
later, the main issue is the core dumps and DLZ/SDB going totally deaf
over time.

On May 7, 2009, at 10:42 PM, Reyner Herrpinark Lugo wrote:

> That is how the driver works, my friend. Suppose your DNS server is
> authoritative for the zone example.com and holds an A record with
> the data 10.0.0.1:
>
> dig a www.example.com --> portions of the fully qualified domain
> name are cut off one at a time until a zone supported by your DNS
> server's database is reached.
>
> SELECT zone FROM resource_records WHERE zone = 'www.example.com' -->
> You do not have a zone with this name, so "www" is removed and

Reyner Herrpinark Lugo | 8 May 2009 13:51

Re: Core dumps

Forgive me, Scott, for two things:

1.- I had not read the whole thread and answered hastily, without realizing that you were not referring to
DLZ specifically. My answer is based on my experience with DLZ. At least, that is how queries made with
"dig" are handled. Luckily for us, I do not think this matters much, since in the end DLZ
uses the SDB interfaces of bind9, and the handling appears to be the same in both cases,
which I consider logical.

>Why would it try so hard to locate a record?  if www.example.com has
>no zone, why not return 0 and move on?

 At least DLZ (which seems to behave similarly to the bind-sdb RPM) has no way of telling you a priori
whether this zone (example.com) is supported or not among all the records in your database.
DLZ simply tries to find a zone for which your DNS server is authoritative and which matches
part of the substring in the first query. One drawback I see in this arises when delegating
a subdomain; at least I have not managed to do it. If you do, please write to me in whatever language you like.

2.- Sorry for not answering in English, but I do not know which is less professional: answering with
grammatical errors from using a tool, as you did, or answering in my native language. Likewise,
I make an effort to read your questions, so I consider us even. :) Regards, and I hope
you are not offended.

-----Original Message-----
From: Scott Haneda [mailto:talklists <at> newgeo.com]
Sent: Fri 08/05/2009 3:50
To: bind-dlz-testers <at> lists.sourceforge.net
Subject: Re: [Bind-dlz-testers] Core dumps

Thanks.  Does that mean that this is normal and that even a non-DLZ
driven system is doing the same things in the background?  I have a

Todd Lyons | 8 May 2009 21:10

Re: Core dumps

On Fri, May 8, 2009 at 12:50 AM, Scott Haneda <talklists <at> newgeo.com> wrote:
> Thanks.  Does that mean that this is normal and that even a non-DLZ
> driven system is doing the same things in the background?  I have a
> few non-DLZ systems in place, with query logging on, and I have never
> seen it do so.

Core dumps are not normal.  Chances are that threads are biting you.
I'll bet you don't have this set:

CentOS52[root@ivdns51 ~]# egrep '^OPTIONS' /etc/sysconfig/named
OPTIONS="-n 1"
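
The same check can be scripted.  This is a minimal sketch, assuming Python 3
and a sysconfig file written as plain shell assignments like the one shown
above; named_options and worker_threads are hypothetical names, and the
parsing is deliberately simple (one assignment per line, no "export").

```python
import shlex


def named_options(sysconfig_text):
    """Return the words of the last OPTIONS= assignment, or [] if there is none."""
    found = []
    for line in sysconfig_text.splitlines():
        try:
            # comments=True drops everything after '#'; quotes are removed.
            words = shlex.split(line, comments=True)
        except ValueError:
            continue  # unbalanced quotes: not a line this sketch understands
        if len(words) == 1 and words[0].startswith("OPTIONS="):
            found = words[0][len("OPTIONS="):].split()
    return found


def worker_threads(options):
    """Return the number passed to -n, or None when -n is not present."""
    for i, word in enumerate(options):
        if word == "-n" and i + 1 < len(options) and options[i + 1].isdigit():
            return int(options[i + 1])
        if word.startswith("-n") and word[2:].isdigit():
            return int(word[2:])
    return None
```

A result of 1 from worker_threads means the single-thread flag is set in the
file; None means named is started without -n as far as this file is
concerned.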

Explanation from: http://bind-dlz.sourceforge.net/mysql_driver.html

 IMPORTANT NOTICE!!! READ THIS!!! IMPORTANT INFORMATION BELOW!!!

The MySQL driver has one additional limitation. MySQL uses thread
local storage in its C api. Thus MySQL requires that each thread of an
application execute a MySQL "thread initialization" to setup the
thread local storage. This is impossible to do safely while staying
within the DLZ driver API. This is a limitation caused by MySQL, and
not the DLZ API. Because of this BIND MUST only run with a single
thread when using the MySQL driver. To ensure BIND runs with a single
thread pass "-n 1" on the command line when starting BIND (named).
This should not be a limitation on most UN*X systems as BIND is
normally compiled single threaded (there are some exceptions). Even if
BIND is compiled to support threads passing "-n 1" on the command line
will cause it to use a single thread. Also, if the MySQL driver is
compiled into BIND but NOT USED then "-n 1" is not required. The "-n
1" command line parameters are only required when the MySQL driver is

Scott Haneda | 9 May 2009 00:32

Re: Core dumps

Hi Todd, thanks for the reply, some questions below...

On May 8, 2009, at 12:10 PM, Todd Lyons wrote:

> On Fri, May 8, 2009 at 12:50 AM, Scott Haneda <talklists <at> newgeo.com> wrote:
>> Thanks.  Does that mean that this is normal and that even a non-DLZ
>> driven system is doing the same things in the background?  I have a
>> few non-DLZ systems in place, with query logging on, and I have never
>> seen it do so.
>
> Core dumps are not normal.  Chances are that threads are biting you.
> I'll bet you don't have this set:
>
> CentOS52[root@ivdns51 ~]# egrep '^OPTIONS' /etc/sysconfig/named
> OPTIONS="-n 1"

I do not have an /etc/sysconfig; can you tell me where I should be
looking to confirm this?
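
One place to look that does not depend on any distribution-specific config
file is the command line of the running process itself.  A minimal sketch,
assuming Linux and Python 3: /proc/<pid>/cmdline holds the arguments
separated by NUL bytes.  The function names are hypothetical, and the pid
has to come from elsewhere (for example from "pidof named-sdb").

```python
def argv_from_proc_cmdline(raw):
    """Split the NUL-separated bytes of /proc/<pid>/cmdline into arguments."""
    # Empty pieces (such as the one after the trailing NUL) are dropped.
    return [part.decode(errors="replace") for part in raw.split(b"\0") if part]


def read_argv(pid):
    """Return the argument list of a running process on Linux."""
    with open("/proc/%d/cmdline" % pid, "rb") as handle:
        return argv_from_proc_cmdline(handle.read())
```

If "-n" followed by "1" shows up in the list returned by read_argv, the
running named-sdb was started single threaded, whatever file the flag came
from.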

> Explanation from: http://bind-dlz.sourceforge.net/mysql_driver.html

I remember reading that before I did any installs.  I thought I
conformed to it.  There was a chart on the site; leaving only the
entries that are listed as built with threads:

Operating System	BIND built threaded
dec-osf			true
solaris			true
ibm-aix			true

Scott Haneda | 9 May 2009 00:41

Re: Core dumps

On May 8, 2009, at 12:10 PM, Todd Lyons wrote:

> On Fri, May 8, 2009 at 12:50 AM, Scott Haneda <talklists <at> newgeo.com> wrote:
>> Thanks.  Does that mean that this is normal and that even a non-DLZ
>> driven system is doing the same things in the background?  I have a
>> few non-DLZ systems in place, with query logging on, and I have never
>> seen it do so.
>
> Core dumps are not normal.  Chances are that threads are biting you.
> I'll bet you don't have this set:
>
> CentOS52[root@ivdns51 ~]# egrep '^OPTIONS' /etc/sysconfig/named
> OPTIONS="-n 1"

I am a little green at this Linux stuff, mostly using launchd on OS X.
I have an /etc/init.d/named file:

http://pastebin.com/f36ba6a2f

Looking at lines 30 - 32, would I just change that to append the flags?

if [ -x /usr/sbin/named-sdb ]; then
         named='named-sdb -n 1'
fi

From /var/log/messages I have this when named-sdb starts up:
http://pastebin.com/f19d7798e

I am wondering if redoing this *without* postgres, ldap, sqlite, dir  