Evren Yurtesen | 1 Oct 02:01 2004

Re: panic: sorele

Well, I only have squid on my machine, and although I have one CPU, I am
using hyperthreading. So the problem is probably associated with SMP, as
Vlad mentioned. I didn't try compiling a kernel without SMP, but I will
try it the next time my proxy crashes. I also do not want any more
crashes on my production server.

I have commented out many parts of the GENERIC kernel config file; my
additions to GENERIC are below.

Funny coincidence: I also fiddled with the process size settings on my
box, as Vlad did. Might it be something to do with those?

My squid process is about 1500 MB now, so I needed to increase the
maximum process size, and I needed to adjust the shared memory settings
because of the requirements of squid's diskd.

By the way, what is the difference between
options         MAXDSIZ="(2048UL*1024*1024)"
and
options         MAXDSIZ="(850*1024*1024)"
I mean the UL part :)
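
A rough sketch of the arithmetic behind the UL question, assuming the
option string ends up as a C constant expression and that int and
unsigned long are both 32 bits wide (as on i386; that width is an
assumption, not something stated in this thread). Without a suffix the
literals are plain int, so the product has to fit in a signed 32-bit
value; with UL the product is computed as unsigned long. The helper
names below are made up for illustration.

```python
# Assumed type widths: 32-bit signed int, 32-bit unsigned long (i386-like).
INT_MIN = -2**31
INT_MAX = 2**31 - 1
ULONG_MAX = 2**32 - 1


def fits_signed_int(value):
    """True if value is representable as a 32-bit signed int."""
    return INT_MIN <= value <= INT_MAX


def fits_unsigned_long(value):
    """True if value is representable as a 32-bit unsigned long."""
    return 0 <= value <= ULONG_MAX


small = 850 * 1024 * 1024    # the expression written without a suffix
large = 2048 * 1024 * 1024   # the expression written with UL

print("850 MB  fits in int:", fits_signed_int(small))
print("2048 MB fits in int:", fits_signed_int(large))
print("2048 MB fits in unsigned long:", fits_unsigned_long(large))
```

Under those assumptions 850 MB fits in an int, so the suffix is not
needed there, while 2048 MB is exactly 2^31, one more than the largest
signed 32-bit value. Signed overflow is undefined in C, so the UL suffix
is what keeps the larger expression well defined.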

Evren

#My Additions

# ACPI support
device          acpi

# To include support for VESA video modes

Tristan | 1 Oct 02:05 2004

ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=xxx


FreeBSD 6.0-CURRENT with GENERIC kernel built 27th Sep.
On a SunBlade 100 I see these messages regularly when
DMA is enabled. The messages go away when I either use
atacontrol to set the mode to PIO4 or set hw.ata.ata_dma to 0.
I do get data corruption on the disk if left in DMA mode.

dmesg:

GDB: no debug ports present
KDB: debugger backends: ddb
KDB: current backend: ddb
Copyright (c) 1992-2004 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 6.0-CURRENT #0: Mon Sep 27 14:47:16 CST 2004
    xxx <at> xxx.xxx:/usr/obj/usr/src/sys/GENERIC
WARNING: WITNESS option enabled, expect reduced performance.
Timecounter "tick" frequency 502000000 Hz quality 0
real memory  = 671088640 (640 MB)
avail memory = 641736704 (612 MB)
cpu0: Sun Microsystems UltraSparc-IIe Processor (502.00 MHz CPU)
nexus0: <Open Firmware Nexus device>
pcib0: <U2P UPA-PCI bridge> on nexus0
pcib0: Sabre (US-IIe) compatible, impl 0, version 0, ign 0x7c0, bus A
pcib0: [FAST]
pcib0: [GIANT-LOCKED]
pcib0: [FAST]
pcib0: [GIANT-LOCKED]
pcib0 dvma: DVMA map: 0xc0000000 to 0xc3ffffff

Ken Smith | 1 Oct 02:52 2004

Re: ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=xxx

On Fri, Oct 01, 2004 at 09:35:15AM +0930, Tristan wrote:

> FreeBSD 6.0-CURRENT with GENERIC kernel built 27th Sep.
> On a SunBlade 100 I see these messages regularly when
> dma is enabled. The messages go away when I either use
> atacontrol to set the mode to PIO4 or set hw.ata.ata_dma to 0
> I do get data corruption on the disk if left in DMA mode.

Just FYI, my primary test machine is a SunBlade 100, and it seems to
be doing OK with a kernel built from this morning's source.  I've
been doing most of my builds from an NFS server, though, so I'll do
a check with a full buildworld, which will use the local drive more.

> ad0: 14594MB <ST315310A/3.28> [29651/16/63] at ata2-master UDMA66
> acd0: CDRW <LTN486S/YSU1> at ata2-slave PIO4
> ata3-master: DMA limited to UDMA33, non-ATA66 cable or device
> ad1: 39266MB <IBM-DTLA-305040/TW4OA60A> [79780/16/63] at ata3-master UDMA33
> Mounted root from ufs:/dev/ad0a.

Is the data corruption spread across both drives, or just ad1?  That
message about the cable or device being limited could be a clue.

-- 
						Ken Smith
- From there to here, from here to      |       kensmith <at> cse.buffalo.edu
  there, funny things are everywhere.   |
                      - Theodore Geisel |
_______________________________________________
freebsd-current <at> freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current

Tristan | 1 Oct 02:58 2004

Re: ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=xxx

On Thu, 30 Sep 2004 20:52:27 -0400
Ken Smith <kensmith <at> cse.Buffalo.EDU> wrote:
> On Fri, Oct 01, 2004 at 09:35:15AM +0930, Tristan wrote:
> 
> > FreeBSD 6.0-CURRENT with GENERIC kernel built 27th Sep.
> > On a SunBlade 100 I see these messages regularly when
> > dma is enabled. The messages go away when I either use
> > atacontrol to set the mode to PIO4 or set hw.ata.ata_dma to 0
> > I do get data corruption on the disk if left in DMA mode.
> 
> Just FYI my primary test machine is a SunBlade 100, it seems to
> be doing OK with a kernel built from this morning's source.  I've
> been doing most of my builds from an NFS server though, I'll do
> a check with a full buildworld which will use the local drive more.
> 
> > ad0: 14594MB <ST315310A/3.28> [29651/16/63] at ata2-master UDMA66
> > acd0: CDRW <LTN486S/YSU1> at ata2-slave PIO4
> > ata3-master: DMA limited to UDMA33, non-ATA66 cable or device
> > ad1: 39266MB <IBM-DTLA-305040/TW4OA60A> [79780/16/63] at ata3-master UDMA33
> > Mounted root from ufs:/dev/ad0a.
> 
> Is the data corruption spread across both drives, or just ad1?  That
> message about the cable or device being limited could be a clue.

Nope, it's only on ad0. ad1 seems to be fine; however, it is hardly used.


Scott Long | 1 Oct 03:14 2004

Re: ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=xxx

Tristan wrote:
> On Thu, 30 Sep 2004 20:52:27 -0400
> Ken Smith <kensmith <at> cse.Buffalo.EDU> wrote:
> 
>>On Fri, Oct 01, 2004 at 09:35:15AM +0930, Tristan wrote:
>>
>>
>>>FreeBSD 6.0-CURRENT with GENERIC kernel built 27th Sep.
>>>On a SunBlade 100 I see these messages regularly when
>>>dma is enabled. The messages go away when I either use
>>>atacontrol to set the mode to PIO4 or set hw.ata.ata_dma to 0
>>>I do get data corruption on the disk if left in DMA mode.
>>
>>Just FYI my primary test machine is a SunBlade 100, it seems to
>>be doing OK with a kernel built from this morning's source.  I've
>>been doing most of my builds from an NFS server though, I'll do
>>a check with a full buildworld which will use the local drive more.
>>
>>
>>>ad0: 14594MB <ST315310A/3.28> [29651/16/63] at ata2-master UDMA66
>>>acd0: CDRW <LTN486S/YSU1> at ata2-slave PIO4
>>>ata3-master: DMA limited to UDMA33, non-ATA66 cable or device
>>>ad1: 39266MB <IBM-DTLA-305040/TW4OA60A> [79780/16/63] at ata3-master UDMA33
>>>Mounted root from ufs:/dev/ad0a.
>>
>>Is the data corruption spread across both drives, or just ad1?  That
>>message about the cable or device being limited could be a clue.
> 
> 
> nope, its only on ad0. ad1 seems to be fine, however it is hardly used.

Marius Strobl | 1 Oct 03:28 2004

Re: ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=xxx

On Thu, Sep 30, 2004 at 08:52:27PM -0400, Ken Smith wrote:
> On Fri, Oct 01, 2004 at 09:35:15AM +0930, Tristan wrote:
> 
> > FreeBSD 6.0-CURRENT with GENERIC kernel built 27th Sep.
> > On a SunBlade 100 I see these messages regularly when
> > dma is enabled. The messages go away when I either use
> > atacontrol to set the mode to PIO4 or set hw.ata.ata_dma to 0
> > I do get data corruption on the disk if left in DMA mode.
> 
> Just FYI my primary test machine is a SunBlade 100, it seems to
> be doing OK with a kernel built from this morning's source.  I've
> been doing most of my builds from an NFS server though, I'll do
> a check with a full buildworld which will use the local drive more.
> 
> > ad0: 14594MB <ST315310A/3.28> [29651/16/63] at ata2-master UDMA66
> > acd0: CDRW <LTN486S/YSU1> at ata2-slave PIO4
> > ata3-master: DMA limited to UDMA33, non-ATA66 cable or device
> > ad1: 39266MB <IBM-DTLA-305040/TW4OA60A> [79780/16/63] at ata3-master UDMA33
> > Mounted root from ufs:/dev/ad0a.
> 
> Is the data corruption spread across both drives, or just ad1?  That
> message about the cable or device being limited could be a clue.
> 

Did you replace the cable of the primary channel? AFAIK on Blade 100
there's a hardware bug that causes data corruption when using UDMA66
and Sun ships them with a 40-pin cable as sort of a work-around. So
these non-ATA66 cable messages should be rather normal on Blade 100.
Not all revisions might be affected though.


Tristan | 1 Oct 03:51 2004

Re: ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=xxx

On Fri, 1 Oct 2004 03:28:38 +0200
Marius Strobl <marius <at> alchemy.franken.de> wrote:

> On Thu, Sep 30, 2004 at 08:52:27PM -0400, Ken Smith wrote:
> > On Fri, Oct 01, 2004 at 09:35:15AM +0930, Tristan wrote:
> > 
> > > FreeBSD 6.0-CURRENT with GENERIC kernel built 27th Sep.
> > > On a SunBlade 100 I see these messages regularly when
> > > dma is enabled. The messages go away when I either use
> > > atacontrol to set the mode to PIO4 or set hw.ata.ata_dma to 0
> > > I do get data corruption on the disk if left in DMA mode.
> > 
> > Just FYI my primary test machine is a SunBlade 100, it seems to
> > be doing OK with a kernel built from this morning's source.  I've
> > been doing most of my builds from an NFS server though, I'll do
> > a check with a full buildworld which will use the local drive more.
> > 
> > > ad0: 14594MB <ST315310A/3.28> [29651/16/63] at ata2-master UDMA66
> > > acd0: CDRW <LTN486S/YSU1> at ata2-slave PIO4
> > > ata3-master: DMA limited to UDMA33, non-ATA66 cable or device
> > > ad1: 39266MB <IBM-DTLA-305040/TW4OA60A> [79780/16/63] at ata3-master UDMA33
> > > Mounted root from ufs:/dev/ad0a.
> > 
> > Is the data corruption spread across both drives, or just ad1?  That
> > message about the cable or device being limited could be a clue.
> > 
> 
> Did you replace the cable of the primary channel? AFAIK on Blade 100
> there's a hardware bug that causes data corruption when using UDMA66
> and Sun ships them with a 40-pin cable as sort of a work-around. So

Scott Long | 1 Oct 03:54 2004

Re: ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=xxx

Tristan wrote:
> On Fri, 1 Oct 2004 03:28:38 +0200
> Marius Strobl <marius <at> alchemy.franken.de> wrote:
> 
> 
>>On Thu, Sep 30, 2004 at 08:52:27PM -0400, Ken Smith wrote:
>>
>>>On Fri, Oct 01, 2004 at 09:35:15AM +0930, Tristan wrote:
>>>
>>>
>>>>FreeBSD 6.0-CURRENT with GENERIC kernel built 27th Sep.
>>>>On a SunBlade 100 I see these messages regularly when
>>>>dma is enabled. The messages go away when I either use
>>>>atacontrol to set the mode to PIO4 or set hw.ata.ata_dma to 0
>>>>I do get data corruption on the disk if left in DMA mode.
>>>
>>>Just FYI my primary test machine is a SunBlade 100, it seems to
>>>be doing OK with a kernel built from this morning's source.  I've
>>>been doing most of my builds from an NFS server though, I'll do
>>>a check with a full buildworld which will use the local drive more.
>>>
>>>
>>>>ad0: 14594MB <ST315310A/3.28> [29651/16/63] at ata2-master UDMA66
>>>>acd0: CDRW <LTN486S/YSU1> at ata2-slave PIO4
>>>>ata3-master: DMA limited to UDMA33, non-ATA66 cable or device
>>>>ad1: 39266MB <IBM-DTLA-305040/TW4OA60A> [79780/16/63] at ata3-master UDMA33
>>>>Mounted root from ufs:/dev/ad0a.
>>>
>>>Is the data corruption spread across both drives, or just ad1?  That
>>>message about the cable or device being limited could be a clue.

Marc G. Fournier | 1 Oct 04:47 2004

Re: fsck_ffs patch testers wanted


Would you be willing to post a 4.x version of this, or send me one, so
that I can test it? I'm the one who seems lucky enough to get the
"glacially slow" fscks :(

Also, some sort of "what I should be watching for" would be nice, if
anything ... I've got my remote techs "trained" so that they can get me
into single user mode so that I can watch fsck using Ctrl-T, so I can
install this as a separate fsck and manually test it as required ...

On Thu, 30 Sep 2004, Don Lewis wrote:

> I posted an earlier version of the patch below to current <at>  for review a
> few weeks ago.  This patch does not (or at least should not) change the
> functional behaviour of fsck_ffs, and it is functionally equivalent to
> the previous version of the patch.
>
> The current implementation of fsck_ffs puts inodes with an initial link
> count of zero on a linked list so that the inodes can be cleared later
> if their link counts are not adjusted upwards.  This can cause fsck pass
> 4 to become glacially slow if this list becomes large because there is a
> sequential search of the list as each inode is processed in pass 4 to
> see if each inode is on the list.
>
> This patch fixes the performance problem by eliminating the list and
> encoding whether or not the initial link count was zero in the inode
> state.
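
The change described in the quoted paragraphs above can be sketched
roughly as follows. This is not fsck_ffs source code: the function
names, the state values, and the dictionary of link counts are all made
up for illustration, and only the list-versus-state idea is taken from
the description.

```python
# Hypothetical per-inode state values; the real fsck_ffs states differ.
STATE_NORMAL = "normal"
STATE_ZERO_LINK = "zero-link"


def zero_link_inodes_with_list(initial_links):
    """Old scheme: keep zero-link inodes on a list and search it per inode."""
    zero_link_list = [ino for ino, count in initial_links.items() if count == 0]
    found = []
    for ino in initial_links:
        # Sequential search of the list for every inode processed:
        # cost grows with (number of inodes) x (length of the list).
        if ino in zero_link_list:
            found.append(ino)
    return found


def zero_link_inodes_with_state(initial_links):
    """New scheme: record 'initial link count was zero' in the inode state."""
    state = {
        ino: STATE_ZERO_LINK if count == 0 else STATE_NORMAL
        for ino, count in initial_links.items()
    }
    # One constant-time state check per inode, no list to search.
    return [ino for ino in initial_links if state[ino] == STATE_ZERO_LINK]


# Inode number -> initial link count (made-up sample data).
sample = {2: 3, 5: 0, 7: 1, 9: 0}
print(zero_link_inodes_with_list(sample))
print(zero_link_inodes_with_state(sample))
```

Both functions pick out the same inodes; the difference is only in how
much work each per-inode check costs when the zero-link set is large.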
>
> This patch has been reviewed, and I'm running it on my -CURRENT machine
> (where fsck_ffs doesn't normally get much exercise), but due to the

Ken Smith | 1 Oct 04:58 2004

Re: ad0: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=xxx

On Thu, Sep 30, 2004 at 07:54:31PM -0600, Scott Long wrote:
> Tristan wrote:
> >On Fri, 1 Oct 2004 03:28:38 +0200
> >Marius Strobl <marius <at> alchemy.franken.de> wrote:
> >>On Thu, Sep 30, 2004 at 08:52:27PM -0400, Ken Smith wrote:
> >>>On Fri, Oct 01, 2004 at 09:35:15AM +0930, Tristan wrote:
> >>>>FreeBSD 6.0-CURRENT with GENERIC kernel built 27th Sep.
> >>>>On a SunBlade 100 I see these messages regularly when
> >>>>dma is enabled. The messages go away when I either use
> >>>>atacontrol to set the mode to PIO4 or set hw.ata.ata_dma to 0
> >>>>I do get data corruption on the disk if left in DMA mode.
> >>>
> >>>Just FYI my primary test machine is a SunBlade 100, it seems to
> >>>be doing OK with a kernel built from this morning's source.  I've
> >>>been doing most of my builds from an NFS server though, I'll do
> >>>a check with a full buildworld which will use the local drive more.
> >>>
> >>>>ad0: 14594MB <ST315310A/3.28> [29651/16/63] at ata2-master UDMA66
> >>>>acd0: CDRW <LTN486S/YSU1> at ata2-slave PIO4
> >>>>ata3-master: DMA limited to UDMA33, non-ATA66 cable or device
> >>>>ad1: 39266MB <IBM-DTLA-305040/TW4OA60A> [79780/16/63] at ata3-master UDMA33
> >>>>Mounted root from ufs:/dev/ad0a.
> 
> Detaching the CDROM would be an easy first test.  The CDROM is likely
> wanting to use WDMA mode which itself can be problematic (and is why
> FreeBSD turns it off by default).

Just for reference, here are pieces from a machine I've got here
that doesn't *seem* to be having problems: