Matthew Bloch | 1 Apr 12:30 2003

Freezes under network load, multiple versions

Hello all,

One of our users reported a crash under 2.4.20-3 (also under 2.4.19-50 and 
2.4.19-45).  I've managed to catch it in a gdb session with this backtrace:

#0  0xa00020a1 in munmap ()
#1  0xa0106e3d in os_unmap_memory (addr=0xa4800000, len=-1468006400)
    at process.c:128
#2  0xa010382d in flush_kernel_vm_range (start=2759852032, end=2826960896)
    at tlb.c:86
#3  0xa01038bb in flush_tlb_kernel_vm_skas () at tlb.c:117
#4  0xa00fe717 in segv (address=2760822784, ip=0, is_write=0, is_user=0, 
    sc=0xa0270274) at trap_kern.c:96
#5  0xa00feafa in segv_handler (sig=11, regs=0xa0270274) at trap_user.c:69
#6  0xa01039c0 in sig_handler_common_skas (sig=11, sc_ptr=0x58)
    at trap_user.c:33
#7  0xa00febe9 in sig_handler (sig=-1533906944, sc=
      {gs = 0, __gsh = 0, fs = 0, __fsh = 0, es = 43, __esh = 0, ds = 43, 
__dsh = 0, edi = 2760036352, esi = 255, ebp = 2686924444, esp = 2686924356, 
ebx = 2686990384, edx = 2760036352, ecx = 2686924516, eax = 196607, trapno = 
14, err = 4, eip = 2685611314, cs = 35, __csh = 0, eflags = 66054, 
esp_at_signal = 2686924356, ss = 43, __ssh = 0, fpstate = 0x0, oldmask = 
436215808, cr2 = 2760822784})
    at trap_user.c:105
#8  <signal handler called>
#9  0xa0132d32 in cowify_req (req=0xa03c4420, dev=0xa02736e4) at 
ubd_kern.c:833
#10 0xa0132e36 in prepare_request (req=0xa03ba6c0, io_req=0xa02736e4)
    at ubd_kern.c:877
#11 0xa0132ef1 in do_ubd_request (q=0xa02736e4) at ubd_kern.c:898

Matthew Bloch | 1 Apr 12:36 2003

Re: Freezes under network load, multiple versions

On Tuesday 01 April 2003 11:30, Matthew Bloch wrote:
> Hello all,
>
> One of our users reported a crash under 2.4.20-3 (also under 2.4.19-50 and
> 2.4.19-45).  I've managed to catch it in a gdb session with this backtrace:

Sorry, "crash" is misleading: the kernel freezes, and the backtrace I posted 
is what I get after interrupting the frozen kernel.
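
For anyone comparing values in the backtrace: gdb prints some arguments as
signed decimals and others in hex. The following is only a sketch in Python
(not part of UML or gdb) that reinterprets numbers copied from the backtrace
as unsigned 32-bit hex. The helper name u32_hex and the dictionary labels
are made up for illustration.

```python
def u32_hex(value):
    """Format an integer as an unsigned 32-bit hex string.

    Negative inputs are treated as two's-complement 32-bit values,
    which is how gdb printed the signed 'len' argument.
    """
    return f"0x{value & 0xFFFFFFFF:08x}"


# Values copied from the backtrace; the labels are informal.
backtrace_values = {
    "flush_kernel_vm_range start": 2759852032,
    "flush_kernel_vm_range end": 2826960896,
    "os_unmap_memory len": -1468006400,
    "segv address (cr2)": 2760822784,
    "sigcontext eip": 2685611314,
}

for label, value in backtrace_values.items():
    print(f"{label:30s} {u32_hex(value)}")
```

Read this way, the faulting address (cr2) lies between start and end of the
range being flushed, eip matches the address of the cowify_req frame, and
the len passed to os_unmap_memory has the same bit pattern as end. Whether
that last point is significant is not something this thread establishes.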

-- 
Matthew Bloch                             Bytemark Hosting
                                  tel. +44 (0) 8707 455026
                        http://www.bytemark-hosting.co.uk/
          Dedicated Linux hosts from 15ukp ($26) per month

Matthew Bloch | 1 Apr 12:49 2003

Re: Freezes under network load, multiple versions

On Tuesday 01 April 2003 11:36, you wrote:
> On Tuesday 01 April 2003 11:30, Matthew Bloch wrote:
> > Hello all,
> >
> > One of our users reported a crash under 2.4.20-3 (also under 2.4.19-50
> > and 2.4.19-45).  I've managed to catch it in a gdb session with this
> > backtrace:
>
> Sorry, "crash" is misleading, the kernel freezes, and the backtrace I
> posted is what I get after interrupting the frozen kernel.

And I have a short memory for these things; it looks like I reported the exact 
same bug back on 12th December regarding 2.4.19-36.  In answer to your 
question back then, Jeff:

> The fault address is 2760798208 (== 0xa48e7000).  If you still have that 
> running, or can make it happen again, can you see what /proc/<pid>/maps says 
> about that page.
> 
> For some reason, UML seems to be faulting on that page without fixing it
> up, leading to an infinite loop of segfaults, and a hang.

Take a look at:

   http://www.bytemark-hosting.co.uk/proc-maps.txt

Let me know if there are any more diagnostics I can provide; as I said, 
I've kept the session running this time.
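
For readers who want to repeat the lookup Jeff asked for, here is a minimal
sketch in Python, assuming the usual "start-end perms offset dev inode path"
layout of /proc/<pid>/maps. The sample text and the function name
find_mapping are hypothetical; the sample is NOT the contents of the
proc-maps.txt file linked above.

```python
# Hypothetical excerpt in /proc/<pid>/maps format, for illustration only.
SAMPLE_MAPS = """\
a0000000-a0270000 r-xp 00000000 03:01 1001 /usr/bin/linux-example
a4800000-a8800000 rw-s 00000000 03:01 1002 /tmp/vm_file-example
"""


def find_mapping(maps_text, addr):
    """Return the maps line whose address range contains addr, or None."""
    for line in maps_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The first field is "start-end" in hex without a 0x prefix.
        start_s, _, end_s = fields[0].partition("-")
        start, end = int(start_s, 16), int(end_s, 16)
        # The end address is exclusive.
        if start <= addr < end:
            return line
    return None


FAULT_ADDR = 2760798208  # the fault address quoted above
print(hex(FAULT_ADDR))
print(find_mapping(SAMPLE_MAPS, FAULT_ADDR))
```

Against a real process, the text would come from reading /proc/<pid>/maps
for the relevant pid instead of the sample string.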


Lynn Kerby | 1 Apr 20:19 2003

Re: Freezes under network load, multiple versions


On 2003.04.01 02:30 Matthew Bloch wrote:
>Hello all,
>
>One of our users reported a crash under 2.4.20-3 (also under 2.4.19-50 and 
>2.4.19-45).  I've managed to catch it in a gdb session with this backtrace:
>
>#0  0xa00020a1 in munmap ()
>#1  0xa0106e3d in os_unmap_memory (addr=0xa4800000, len=-1468006400)
>    at process.c:128
>#2  0xa010382d in flush_kernel_vm_range (start=2759852032, end=2826960896)
>    at tlb.c:86
>#3  0xa01038bb in flush_tlb_kernel_vm_skas () at tlb.c:117
>#4  0xa00fe717 in segv (address=2760822784, ip=0, is_write=0, is_user=0, 
>    sc=0xa0270274) at trap_kern.c:96
>#5  0xa00feafa in segv_handler (sig=11, regs=0xa0270274) at trap_user.c:69
>#6  0xa01039c0 in sig_handler_common_skas (sig=11, sc_ptr=0x58)
>    at trap_user.c:33
>#7  0xa00febe9 in sig_handler (sig=-1533906944, sc=
>      {gs = 0, __gsh = 0, fs = 0, __fsh = 0, es = 43, __esh = 0, ds = 43, 
>__dsh = 0, edi = 2760036352, esi = 255, ebp = 2686924444, esp = 2686924356, 
>ebx = 2686990384, edx = 2760036352, ecx = 2686924516, eax = 196607, trapno = 
>14, err = 4, eip = 2685611314, cs = 35, __csh = 0, eflags = 66054, 
>esp_at_signal = 2686924356, ss = 43, __ssh = 0, fpstate = 0x0, oldmask = 
>436215808, cr2 = 2760822784})
>    at trap_user.c:105
>#8  <signal handler called>
>#9  0xa0132d32 in cowify_req (req=0xa03c4420, dev=0xa02736e4) at 
>ubd_kern.c:833
>#10 0xa0132e36 in prepare_request (req=0xa03ba6c0, io_req=0xa02736e4)

Jon Smirl | 1 Apr 22:43 2003

VPCI patches

Here's a copy of the VPCI patches for everyone who
has been asking. 

=====
Jon Smirl
jonsmirl <at> yahoo.com

Attachment (vpci.tar.bz2): application/x-bzip2, 46 KiB

Matthew Bloch | 2 Apr 00:09 2003

Re: Freezes under network load, multiple versions

On Tuesday 01 April 2003 19:19, Lynn Kerby wrote:
> On 2003.04.01 02:30 Matthew Bloch wrote:
> >Hello all,
> >
> >One of our users reported a crash under 2.4.20-3 (also under 2.4.19-50 and
> >2.4.19-45).  I've managed to catch it in a gdb session with this
> > backtrace:
> >
> >#0  0xa00020a1 in munmap ()
> >#1  0xa0106e3d in os_unmap_memory (addr=0xa4800000, len=-1468006400)
> >    at process.c:128
> >#2  0xa010382d in flush_kernel_vm_range (start=2759852032, end=2826960896)
> >    at tlb.c:86
> >#3  0xa01038bb in flush_tlb_kernel_vm_skas () at tlb.c:117
> >#4  0xa00fe717 in segv (address=2760822784, ip=0, is_write=0, is_user=0,
> >    sc=0xa0270274) at trap_kern.c:96
> >#5  0xa00feafa in segv_handler (sig=11, regs=0xa0270274) at trap_user.c:69
> >#6  0xa01039c0 in sig_handler_common_skas (sig=11, sc_ptr=0x58)
> >    at trap_user.c:33
> >#7  0xa00febe9 in sig_handler (sig=-1533906944, sc=
> >      {gs = 0, __gsh = 0, fs = 0, __fsh = 0, es = 43, __esh = 0, ds = 43,
> >__dsh = 0, edi = 2760036352, esi = 255, ebp = 2686924444, esp =
> > 2686924356, ebx = 2686990384, edx = 2760036352, ecx = 2686924516, eax =
> > 196607, trapno = 14, err = 4, eip = 2685611314, cs = 35, __csh = 0,
> > eflags = 66054, esp_at_signal = 2686924356, ss = 43, __ssh = 0, fpstate =
> > 0x0, oldmask = 436215808, cr2 = 2760822784})
> >    at trap_user.c:105
> >#8  <signal handler called>
> >#9  0xa0132d32 in cowify_req (req=0xa03c4420, dev=0xa02736e4) at
> >ubd_kern.c:833

Lynn Kerby | 2 Apr 09:55 2003

Re: Freezes under network load, multiple versions


On 2003.04.01 14:09 Matthew Bloch wrote:
>
>Do either of these look out of the ordinary?  I'm not best placed to judge :)
>
>(gdb) print *req
>$2 = {op = 2688347840, fds = {-1607052328, 0}, offsets = {1, 25120}, offset = 
>1, length = 14746, buffer = 0x5fffe0 <Address 0x5fffe0 out of bounds>, 
>  sectorsize = 32, sector_mask = 6291424, cow_offset = 32, bitmap_words = {16, 
>1}, error = 8}
>
>(gdb) print *dev
>$3 = {file = 0x1 <Address 0x1 out of bounds>, count = 16, fd = 15, size = 
>3397490929827840, boot_openflags = {r = 0, w = 0, s = 0, c = 0, t = 0, a = 0, 
>e = 0, 
>    cl = 0}, openflags = {r = 0, w = 0, s = 0, c = 0, t = 0, a = 0, e = 0, cl 
>= 0}, devfs = 0x1000, cow = {
>    file = 0xa14f9000
>"{\rmv_\034\034\036\234ý\025E\217W\agGûïÞE¯\216O£Ýèd÷ôì`ïýáîitòþôäøÝ>ÚaÓT\001%\020\t\001Ë", 
>fd = 512, bitmap = 0xff, bitmap_len = 196607, 
>    bitmap_offset = 255, data_offset = 0}}

They both look quite out of the ordinary to me.  Until I built my kernel with '-g'
I had all kinds of problems getting gdb to print valid data.  Maybe you can get
the unformatted data simply by dumping the addresses?  The structures are only
60 bytes long each but the example below dumps 80 (20 words).  The addresses are
taken from your stack traceback.

x/20xw 0xa03c4420

Matthew Bloch | 2 Apr 10:16 2003

Re: Freezes under network load, multiple versions

On Wednesday 02 April 2003 08:55, Lynn Kerby wrote:
> On 2003.04.01 14:09 Matthew Bloch wrote:
> >Do either of these look out of the ordinary?  I'm not best placed to judge
> > :)
> >
[snip]
>
> They both look quite out of the ordinary to me.  Until I built my kernel
> with '-g' I had all kinds of problems getting gdb to print valid data. 

Thanks, I'll try that on our next kernel build.  I can get this crash to 
happen on demand so if it makes diagnosis easier I can get another stack 
trace from a kernel built with -g.

> Maybe you can get the unformatted data simply by dumping the addresses? 
> The structures are only 60 bytes long each but the example below dumps 80
> (20 words).  The addresses are taken from your stack traceback.
>
> x/20xw 0xa03c4420
> x/20xw 0xa02736e4

(gdb) x/20xw 0xa03c4420
0xa03c4420:     0xa03ceec0      0xa03653d8      0x00000000      0x00000001
0xa03c4430:     0x00006220      0x00000001      0x00000000      0x0000399a
0xa03c4440:     0x005fffe0      0x00000020      0x005fffe0      0x00000020
0xa03c4450:     0x00000010      0x00000001      0x00000008      0x00000008
0xa03c4460:     0x00000000      0xa14f9000      0x00000000      0xa1038260
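
As a cross-check on the dump above, here is a small Python sketch (the names
parse_words and s32 are made up for illustration) that parses the x/20xw
output into 32-bit words. It confirms the 20 words = 80 bytes arithmetic and
shows that several raw words match the field values gdb printed for *req
earlier in the thread.

```python
# The x/20xw 0xa03c4420 output, copied from the message.
DUMP = """\
0xa03c4420:     0xa03ceec0      0xa03653d8      0x00000000      0x00000001
0xa03c4430:     0x00006220      0x00000001      0x00000000      0x0000399a
0xa03c4440:     0x005fffe0      0x00000020      0x005fffe0      0x00000020
0xa03c4450:     0x00000010      0x00000001      0x00000008      0x00000008
0xa03c4460:     0x00000000      0xa14f9000      0x00000000      0xa1038260
"""


def parse_words(dump):
    """Return the 32-bit words from gdb 'x/Nxw' output, in address order."""
    words = []
    for line in dump.splitlines():
        # Drop the "address:" prefix, keep the hex words after it.
        _, _, rest = line.partition(":")
        words.extend(int(token, 16) for token in rest.split())
    return words


def s32(word):
    """Reinterpret an unsigned 32-bit word as a signed integer."""
    return word - (1 << 32) if word & 0x80000000 else word


words = parse_words(DUMP)
print(len(words), "words,", len(words) * 4, "bytes")
print("word 0 :", words[0])       # compare with op in 'print *req'
print("word 1 :", s32(words[1]))  # compare with fds[0]
print("word 4 :", words[4])       # compare with offsets[1]
print("word 7 :", words[7])       # compare with length
```

So the raw memory agrees with what gdb printed for those fields; the sketch
says nothing about whether the structure layout gdb assumed is the right one.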

(gdb) x/20xw 0xa02736e4
0xa0273734 <init_task_union+14132>:     0x00000000      0xa0273754      

Lynn Kerby | 2 Apr 21:01 2003

Re: Freezes under network load, multiple versions


On 2003.04.02 00:16 Matthew Bloch wrote:
>On Wednesday 02 April 2003 08:55, Lynn Kerby wrote:
>> On 2003.04.01 14:09 Matthew Bloch wrote:
>> >Do either of these look out of the ordinary?  I'm not best placed to judge
>> > :)
>> >
>[snip]
>>
>> They both look quite out of the ordinary to me.  Until I built my kernel
>> with '-g' I had all kinds of problems getting gdb to print valid data. 
>
>Thanks, I'll try that on our next kernel build.  I can get this crash to 
>happen on demand so if it makes diagnosis easier I can get another stack 
>trace from a kernel built with -g.
>
>> Maybe you can get the unformatted data simply by dumping the addresses? 
>> The structures are only 60 bytes long each but the example below dumps 80
>> (20 words).  The addresses are taken from your stack traceback.
>>
>> x/20xw 0xa03c4420
>> x/20xw 0xa02736e4
>
>(gdb) x/20xw 0xa03c4420
>0xa03c4420:     0xa03ceec0      0xa03653d8      0x00000000      0x00000001
>0xa03c4430:     0x00006220      0x00000001      0x00000000      0x0000399a
>0xa03c4440:     0x005fffe0      0x00000020      0x005fffe0      0x00000020
>0xa03c4450:     0x00000010      0x00000001      0x00000008      0x00000008
>0xa03c4460:     0x00000000      0xa14f9000      0x00000000      0xa1038260
>

M A Young | 3 Apr 19:45 2003

NPTL and TLS in UML?

How well integrated are the Native POSIX Thread Library and thread-local
storage in UML? I have been looking at getting a RedHat 9 kernel working,
which has these features added on to 2.4, but judging by RedHat's patch to
the i386 code (assuming I understand it correctly) there are bits missing
even in the 2.5 UML code.

	Michael Young

