Christian Pernegger | 1 Jul 2006 08:42

Problems with ICH7R (AHCI) and WD5000YS

Hi list!

I have the following configuration:

4x WD5000YS (Caviar RE2 500 GB) on the SATA ports of an Intel
SE7230NH1-E. The controller is an ICH7R in AHCI mode. Kernel is
2.6.17-1-686 from Debian testing. No jumpers on the drives, which
should mean PM2 (whatever that is) off and 3.0 Gb/s (300 MB/s) mode. The RE series
has special firmware with reduced error correction timeouts --
supposedly the RAID controller (libata and/or Linux Software-RAID in
my case) should take care of errors anyway.

All drives pass the advanced test of WD's diagnostic tool
individually; however, when I tried to create an md array I got
something like this:

[...]
ata2: port reset, p_is 8000000 is 2 pis 0 cmd 44017 tf d0 ss 123 se 0
ata2: status=0x50 { DriveReady SeekComplete }
sdc: Current: sense key: No Sense
   Additional sense: No additional sense information
ata2: handling error/timeout
ata2: port reset, p_is 0 is 0 pis 0 cmd 44017 tf 150 ss 123 se 0
ata2: status=0x50 { DriveReady SeekComplete }
ata2: error=0x01 { AddrMarkNotFound }
sdc: Current: sense key: No Sense
   Additional sense: No additional sense information
[repeat]
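
As a reading aid for the log above: assuming the "ss" field is the port's SStatus register printed in hex (the messages do not say so explicitly), its three low nibbles can be decoded as in this minimal sketch. The struct and function names are made up for illustration and are not taken from the ahci driver.

```c
#include <stdint.h>

/* Fields of the SATA SStatus register (SCR0). */
struct sstatus {
    unsigned det; /* bits 3:0  - device detection */
    unsigned spd; /* bits 7:4  - negotiated link speed */
    unsigned ipm; /* bits 11:8 - interface power management state */
};

/* Split a raw SStatus value into its three 4-bit fields. */
static struct sstatus decode_sstatus(uint32_t raw)
{
    struct sstatus s;

    s.det = raw & 0xf;
    s.spd = (raw >> 4) & 0xf;
    s.ipm = (raw >> 8) & 0xf;
    return s;
}

/* Human-readable name for the SPD field. */
static const char *sstatus_speed(unsigned spd)
{
    switch (spd) {
    case 0:  return "no speed negotiated";
    case 1:  return "1.5 Gbps";
    case 2:  return "3.0 Gbps";
    default: return "other";
    }
}
```

Under that assumption, decode_sstatus(0x123) yields det=3 (device present, link established), spd=2 (3.0 Gbps) and ipm=1 (active), so the link itself appears to have come up at the expected speed.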

Those messages loop until a timeout of some sort (a few minutes) is
[...]

Michael Hanselmann | 1 Jul 2006 12:01

Re: Parking hard disk head from drivers

On Thu, Jun 22, 2006 at 04:05:24PM -0400, Jeff Garzik wrote:
> If I had to guess, I would say use a notifier...

Thanks, that way I finally came up with something.

Below you find the HD parking part of my driver. I'd be really thankful
for any comments, suggestions and (constructive) criticism. I don't
fully understand what the IDE layer does, but I looked at the
power management code and implemented something like it.

It does what it should on my PowerBook, but maybe it's totally wrong.

---
diff -Nrup --exclude-from linux-exclude-from linux-2.6.17.orig/block/hdpark.c linux-2.6.17/block/hdpark.c
--- linux-2.6.17.orig/block/hdpark.c	1970-01-01 01:00:00.000000000 +0100
+++ linux-2.6.17/block/hdpark.c	2006-07-01 01:23:39.000000000 +0200
@@ -0,0 +1,60 @@
+/*
+ * Generic code for hard disk head parking
+ *
+ * Copyright (C) 2006 Michael Hanselmann (linux-kernel@hansmi.ch)
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/module.h>
+#include <linux/types.h>
[...]

Jochen Heuer | 2 Jul 2006 02:07

Re: Bug or HW? ata1: command timeout, stat 0x50 host_stat 0x4

On Wed, Jun 28, 2006 at 10:09:12AM +0200, Jochen Heuer wrote:
[...]
> Patch is installed and system is running:
> 
> libata version 1.30 loaded.
> sata_via 0000:00:0f.0: version 1.2
> ACPI: PCI Interrupt 0000:00:0f.0[B] -> GSI 20 (level, low) -> IRQ 18
> sata_via 0000:00:0f.0: routed to hard irq line 10
> ata1: SATA max UDMA/133 cmd 0xC000 ctl 0xB802 bmdma 0xA800 irq 18
> ata2: SATA max UDMA/133 cmd 0xB400 ctl 0xB002 bmdma 0xA808 irq 18
> scsi0 : sata_via
> ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
> ata1.00: configured for UDMA/133
> scsi1 : sata_via
> ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
> ata2.00: configured for UDMA/133
>   Vendor: ATA       Model: SAMSUNG SP2504C   Rev: VT10
>   Type:   Direct-Access                      ANSI SCSI revision: 05
>   Vendor: ATA       Model: ST3200822AS       Rev: 3.01
>   Type:   Direct-Access                      ANSI SCSI revision: 05
> 
> I will monitor this for some time. And thanks for the quick response!

Hi Tejun,

the system hung twice more, even with the new libata driver, but I am not
sure whether this really has anything to do with it. I am starting to
believe that the problem is IRQ related. Today the network card (skge
driver) stopped responding (NETDEV WATCHDOG: eth0: transmit timed out) and
its interrupt count stopped increasing (even though the card still had a
[...]

matthieu castet | 2 Jul 2006 10:36

Re: + via-pata-controller-xfer-fixes.patch added to -mm tree

Hi Albert,

Albert Lee wrote:
> castet.matthieu@free.fr wrote:
> 
>>
>>>Could you please test the current libata-upstream tree and
>>>turn on ATA_DEBUG and ATA_VERBOSE_DEBUG in include/linux/libata.h.
>>>
>>
>>Is there an easy way to get libata-upstream tree ?
>>Do I need to install git for that or are there some snapshots somewhere ?
>>
>>
> 
> 
> Hi Matthieu,
> 
> Tejun has a patch against 2.6.17:
> http://home-tj.org/files/libata-tj-stable/libata-tj-2.6.17-20060625-1.tar.bz2
> 
I don't know if I did something wrong, but it didn't apply cleanly.
So I enabled the trace on the latest -mm kernel and disabled the via quirk.

But the printk in the interrupt handler takes some time and hides the 
altstatus delay.

I will try to send you a trace where I move the printk to the end of 
the interrupt handler.


matthieu castet | 2 Jul 2006 12:32

Re: + via-pata-controller-xfer-fixes.patch added to -mm tree

matthieu castet wrote:
> Hi Albert,
> 
> Albert Lee wrote:
> 
>> castet.matthieu@free.fr wrote:
>>
>>>
>>>> Could you please test the current libata-upstream tree and
>>>> turn on ATA_DEBUG and ATA_VERBOSE_DEBUG in include/linux/libata.h.
>>>>
>>>
>>> Is there an easy way to get libata-upstream tree ?
>>> Do I need to install git for that or are there some snapshots 
>>> somewhere ?
>>>
>>>
>>
>>
>> Hi Matthieu,
>>
>> Tejun has a patch against 2.6.17:
>> http://home-tj.org/files/libata-tj-stable/libata-tj-2.6.17-20060625-1.tar.bz2 
>>
>>
> I don't know if I did something wrong, but it didn't apply cleanly.
> So I enabled the trace on the latest -mm kernel and disabled the via quirk.
> 
> But the printk in the interrupt handler takes some time and hides the 
> altstatus delay.
[...]

Albert Lee | 2 Jul 2006 14:46

Re: + via-pata-controller-xfer-fixes.patch added to -mm tree

matthieu castet wrote:
> matthieu castet wrote:
> 
>> Hi Albert,
>>
>> Albert Lee wrote:
>>
>>> castet.matthieu@free.fr wrote:
>>>
>>>>
>>>>> Could you please test the current libata-upstream tree and
>>>>> turn on ATA_DEBUG and ATA_VERBOSE_DEBUG in include/linux/libata.h.
>>>>>
>>>>
>>>> Is there an easy way to get libata-upstream tree ?
>>>> Do I need to install git for that or are there some snapshots
>>>> somewhere ?
>>>>
>>>>
>>>
>>>
>>> Hi Matthieu,
>>>
>>> Tejun has a patch against 2.6.17:
>>> http://home-tj.org/files/libata-tj-stable/libata-tj-2.6.17-20060625-1.tar.bz2
>>>
>>>
>> I don't know if I did something wrong, but it didn't apply cleanly.
>> So I enabled the trace on the latest -mm kernel and disabled the via quirk.
>>
[...]

matthieu castet | 2 Jul 2006 15:06

Re: + via-pata-controller-xfer-fixes.patch added to -mm tree

Hi,

Albert Lee wrote:
> Hi Matthieu,
> 
> Thanks for the log. But could you please keep the 
> VPRINTK() in the entrance of ata_host_intr()
If I do that, everything works correctly: the printk presumably takes more 
than 3 us, so the altstatus is no longer busy when we read it.
Here is the log without moving the printk:
http://castet.matthieu.free.fr/tmp/ata_log.orig

The only thing I could do is move the printk between the altstatus and 
status checks and add one in idle_irq.

Would that be useful for you?

Matthieu.
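
The timing effect described in this message can be pictured with a toy model. This is only a sketch under stated assumptions, not libata code: the struct, the function names and the idea that the handler simply ignores an interrupt while BSY is still set are all hypothetical simplifications; only the roughly 3 us figure comes from the message above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model: the device raises its interrupt at t = 0 but only
 * clears BSY busy_clear_ns later (an "early" interrupt). */
struct sim_device {
    uint64_t busy_clear_ns;
};

/* What a handler would see in altstatus if it samples the register
 * sample_delay_ns after the interrupt fired. */
static bool altstatus_busy(const struct sim_device *dev,
                           uint64_t sample_delay_ns)
{
    return sample_delay_ns < dev->busy_clear_ns;
}

/* Assumed handler policy: "still busy" is treated as "not my
 * interrupt", so the completion is dropped and the command later
 * times out. Any delay before the read (such as a debug printk)
 * changes the outcome. */
static bool irq_handled(const struct sim_device *dev,
                        uint64_t debug_delay_ns)
{
    return !altstatus_busy(dev, debug_delay_ns);
}
```

With busy_clear_ns set to 3000, irq_handled() returns false for a delay of 0 and true for a delay of 10000 (an arbitrary stand-in for the cost of a printk), which is why tracing at the handler entry makes the symptom disappear.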

-
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Albert Lee | 2 Jul 2006 15:59

Re: + via-pata-controller-xfer-fixes.patch added to -mm tree

Alan Cox wrote:
> Ar Gwe, 2006-06-30 am 15:09 +0800, ysgrifennodd Albert Lee:
> 
>>If it is the problem of the specific ATAPI device, all controllers
>>should be affected, not only VIA. So, strange not seeing the problem on
>>Promise.
> 
> 
> That may be because of the way the chips handle buffering of interrupt
> delivery and readahead/writebehind. I have two traces on the ALi
> chipsets that look like the delayed response problem.
> 
> 
> 

Understood. Thanks for the explanation. I checked Matthieu's log, and yes,
it does look like an early interrupt. Matthieu's Sil680 has no such problem.
The problem is also not reproducible with the same CD-RW drive on my
Promise 20275 chip. So, the explanation makes sense.

BTW, even on VIA, the early irq problem occurs on 'set features - xfer mode',
while IDENTIFY works ok. Just curious, does the ALi chip show the same
symptom? That is, besides the 'set features' command, are any other
commands affected by the early irq problem? Say, any other PIO non-data
commands?

--
albert

(The relevant part from Matthieu's log.)
[...]

Albert Lee | 2 Jul 2006 16:17

Re: + via-pata-controller-xfer-fixes.patch added to -mm tree

matthieu castet wrote:
> Hi,
> 
> Albert Lee wrote:
> 
>> Hi Matthieu,
>>
>> Thanks for the log. But could you please keep the VPRINTK() in the
>> entrance of ata_host_intr()
> 
> If I do that, everything works correctly : the printk should take more
> than 3 us, and the altstatus is not busy when we read it.
> Here is the log without moving the printk :
> http://castet.matthieu.free.fr/tmp/ata_log.orig

Hmm, the Uncertainty principle also applies to kernel debugging. :)

> 
> The only thing I could do is to move the printk between altstatus and
> status check and add one in idle_irq.
> 
> Will it be useful for you ?
> 
> 
> Matthieu.
> 
> 

From your previous log, the timed-out transaction is clearly logged
and it does look like an early irq. Can compare/see both timeout and normal
[...]

Tejun Heo | 2 Jul 2006 19:54

[PATCH] libata: fix ehc->i.action setting in ata_eh_autopsy()

ata_eh_autopsy() used to assign the determined action mask directly to
ehc->i.action, thus overriding actions set by some of the nested analyze
functions.  This patch makes ata_eh_autopsy() OR the action mask in, just
as is done in other places.

Signed-off-by: Tejun Heo <htejun@gmail.com>

---
Jeff, this doesn't cause too much trouble but it would be nice to have
this fixed in #linus too.  Thanks.

 drivers/scsi/libata-eh.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

fda815b4e76189eda0fb6e9fab60fe191c9e019a
diff --git a/drivers/scsi/libata-eh.c b/drivers/scsi/libata-eh.c
index f2f29a8..4a670db 100644
--- a/drivers/scsi/libata-eh.c
+++ b/drivers/scsi/libata-eh.c
@@ -1346,7 +1346,7 @@ static void ata_eh_autopsy(struct ata_po

 	/* record autopsy result */
 	ehc->i.dev = failed_dev;
-	ehc->i.action = action;
+	ehc->i.action |= action;

 	DPRINTK("EXIT\n");
 }
--

-- 
1.3.2
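
To illustrate the difference the one-line change in the patch above makes, here is a minimal standalone sketch. The flag names, the struct and the helper functions are hypothetical stand-ins, not the real libata definitions; the only point is how `=` and `|=` treat bits that were already set.

```c
/* Hypothetical action flags, not the real libata ATA_EH_* values. */
enum {
    ACT_REVALIDATE = 1 << 0,
    ACT_SOFTRESET  = 1 << 1,
    ACT_HARDRESET  = 1 << 2,
};

struct eh_info {
    unsigned int action;
};

/* Stand-in for a nested analyze function that requests an action
 * by setting a bit in the shared mask. */
static void nested_analyze(struct eh_info *info)
{
    info->action |= ACT_SOFTRESET;
}

/* Old behaviour: the final assignment discards the bit that
 * nested_analyze() set. */
static unsigned int autopsy_assign(void)
{
    struct eh_info info = { 0 };
    unsigned int action = ACT_REVALIDATE;

    nested_analyze(&info);
    info.action = action;
    return info.action;
}

/* Patched behaviour: the locally determined mask is OR-ed in, so
 * earlier requests survive. */
static unsigned int autopsy_or(void)
{
    struct eh_info info = { 0 };
    unsigned int action = ACT_REVALIDATE;

    nested_analyze(&info);
    info.action |= action;
    return info.action;
}
```

In this sketch autopsy_assign() returns only ACT_REVALIDATE, while autopsy_or() returns ACT_REVALIDATE | ACT_SOFTRESET.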