Bruce Allen | 1 Sep 2006 01:11

Re: Seagate ST3400633A has unknown attributes

I don't know what this attribute is.  Some vendors use it to record
the maximum temperature that the drive has reached.  The usual formula is
value = 100 - T, where T is the temperature in Celsius.  If that holds here,
the failure threshold is 55 Celsius, your current drive temp is 63C, and your
highest ever drive temp was 75C.  Since this exceeded the design failure
threshold temp, the 'usage' attribute is failing.
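
The arithmetic behind that guess can be checked with a few lines of Python. This is only a sketch: it assumes attribute 190 really stores 100 - T, which the thread treats as an educated guess, and the function name is made up for illustration. The numbers are the VALUE, WORST, THRESH and RAW_VALUE columns from the smartctl output quoted in this thread.

```python
# Assumption: attribute 190 stores 100 - T (T in Celsius), as guessed above.
def temp_from_normalized(value: int) -> int:
    """Convert a normalized attribute-190 value back to degrees Celsius."""
    return 100 - value

# Columns from the quoted smartctl row: VALUE=037, WORST=025, THRESH=045.
current = temp_from_normalized(37)   # current drive temperature
hottest = temp_from_normalized(25)   # WORST is the lowest value, so the highest temp
limit = temp_from_normalized(45)     # temperature at which the attribute fails

# Observation only, not a documented layout: the low byte of RAW_VALUE
# matches the current temperature computed from the formula.
raw_low_byte = 31581955620927 & 0xFF

print(f"current {current}C, hottest ever {hottest}C, design limit {limit}C")
print(f"low byte of raw value: {raw_low_byte}")
```

Because the normalized value falls as the temperature rises, a value below the threshold means the drive is hotter than the design limit.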

Note: all the above is an educated guess. Is this drive running hot?

Bruce

On Wed, 30 Aug 2006, Jens Seidel wrote:

> Hi,
>
> I have a Seagate Barracuda 7200.9 ST3400633A which is (according to
> knowndrives.cpp, revision 1.141) not yet in the database.
>
> It contains a few unknown IDs and one of these Unknown_Attribute tests
> failed.
>
> root@dm7025:~# smartctl -H /dev/ide/host0/bus0/target0/lun0/disc
> smartctl version 5.33 [mipsel-unknown-linux-gnu] Copyright (C) 2002-4 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
>
> === START OF READ SMART DATA SECTION ===
> SMART overall-health self-assessment test result: PASSED
> Please note the following marginal Attributes:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
> 190 Unknown_Attribute       0x0022   037   025   045    Old_age   Always   FAILING_NOW 31581955620927

Bruce Allen | 1 Sep 2006 01:14

Re: Failing drives?

Both drives have dozens of reallocated sectors.  I would either replace
them or try to persuade WD to replace them under warranty.  Technically
they are 'not failing', but something is wrong.

(Also try running the WD Data LifeGuard utility from an MS-DOS disk.)

On Wed, 30 Aug 2006, Guillaume Filion wrote:

> Hi all,
>
> I have two drives set in RAID1 and I'm concerned that both drives are
> failing. They are both 80 GB Western Digital Caviar SE (WDC WD800JB-00JJA0):
>
> hdc is an old drive (2 years old), and shows "Fatal or unknown error
> 90%" for all tests, but also says "SMART overall-health self-assessment
> test result: PASSED".
>
> hdg is a new drive (2 months old), but smartd is sending me emails with:
> Device: /dev/hdg, ATA error count increased from 0 to 3
> Device: /dev/hdg, 1 Currently unreadable (pending) sectors
> But smartctl also says: "SMART overall-health self-assessment test
> result: PASSED".
>
> I'm not sure what to do with this, both drives are still on warranty but
> have not failed yet. I've attached the output of smartctl --all  for
> both drives. Any suggestion on what I should do?
>
> Thanks a lot,
> GFK's
>

Jens Seidel | 1 Sep 2006 15:42

Re: Seagate ST3400633A has unknown attributes

Hi Bruce,

thanks for CC:ing me.

On Thu, Aug 31, 2006 at 06:11:48PM -0500, Bruce Allen wrote:
> I don't know what this attribute is.  Some vendors use it to record
> the maximum temperature that the drive has reached.  The usual formula is
> value = 100 - T, where T is the temperature in Celsius.  If that holds here,
> the failure threshold is 55 Celsius, your current drive temp is 63C, and your
> highest ever drive temp was 75C.  Since this exceeded the design failure
> threshold temp, the 'usage' attribute is failing.
> 
> Note: all the above is an educated guess.

You guessed correctly. Every time I compare attribute 190 with the
temperature, the two sum to 100. Can you add this ID to the database?
(Or maybe just skip it in the output, since it duplicates the temperature and
also violates the "value should never be smaller than the threshold" rule.)

> Is this drive running hot?

Yep, this is well known for the Dreambox 7025 satellite receiver. The
manufacturer says it's tolerable ...

> On Wed, 30 Aug 2006, Jens Seidel wrote:
> >I have a Seagate Barracuda 7200.9 ST3400633A which is (according to
> >knowndrives.cpp, revision 1.141) not yet in the database.
> >
> >It contains a few unknown IDs and one of these Unknown_Attribute tests
> >failed.

Bruce Allen | 2 Sep 2006 11:25

Re: Seagate ST3400633A has unknown attributes

> thanks for CC:ing me.

No problem.

>> I don't know what this attribute is.  Some vendors use it to record
>> the maximum temperature that the drive has reached.  The usual formula is
>> value = 100 - T, where T is the temperature in Celsius.  If that holds here,
>> the failure threshold is 55 Celsius, your current drive temp is 63C, and your
>> highest ever drive temp was 75C.  Since this exceeded the design failure
>> threshold temp, the 'usage' attribute is failing.
>>
>> Note: all the above is an educated guess.
>
> You guessed correctly. Every time I compare attribute 190 with the
> temperature, the two sum to 100.

OK.

> Can you add this ID to the database?

I did that some time ago.  It's in CVS but has not made its way into a
tarball yet.

> (Or maybe just skip it in the output, since it duplicates the temperature and
> also violates the "value should never be smaller than the threshold" rule.)

That is too much of a hassle to put into the code ... and the attribute does
not violate the rule: the value should not be smaller than the threshold.
If it is, you have exceeded the design temp of the drive.
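
The value-versus-threshold rule can be sketched against the attribute row quoted earlier in this thread. This is an illustration, not smartctl's actual code: the helper names are hypothetical, and it assumes the usual convention that an attribute counts as failed when its normalized value is at or below the threshold, ignoring edge cases such as a threshold of zero.

```python
from typing import NamedTuple

class Attr(NamedTuple):
    ident: int
    name: str
    value: int
    worst: int
    thresh: int

def parse_row(row: str) -> Attr:
    """Parse one whitespace-separated row of the smartctl attribute table."""
    f = row.split()
    # Columns: ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    return Attr(int(f[0]), f[1], int(f[3]), int(f[4]), int(f[5]))

def when_failed(a: Attr) -> str:
    """Assumed convention: failed when the value is at or below the threshold."""
    if a.value <= a.thresh:
        return "FAILING_NOW"
    if a.worst <= a.thresh:
        return "In_the_past"
    return "-"

# The row reported in the original message.
row = ("190 Unknown_Attribute       0x0022   037   025   045    "
       "Old_age   Always   FAILING_NOW 31581955620927")
attr = parse_row(row)
print(attr.name, when_failed(attr))
```

For the quoted row the value (37) is below the threshold (45), so the sketch agrees with the WHEN_FAILED column that smartctl printed.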


Serguei Miridonov | 2 Sep 2006 18:45

Predicting future unreadable sectors

Hello,

Please see the subject. Is that possible?

I'm asking because by the time you try to access some data
on the disk, the kernel reports read errors, and smartmontools
reports uncorrectable sectors, the data typically cannot
be recovered any more. Too late...

Now, disk hardware has ECC bytes to recover from correctable
read errors. smartmontools reports the total number of such
recoveries in the Hardware_ECC_Recovered attribute, but without
any details such as the sector number, how often recoveries
occur in that sector, or the number of wrong and corrected
bytes/bits in the sector. In theory, statistics on such minor
failures could indicate which sectors are still readable but
may fail soon. This would greatly improve the chance to
duplicate/copy/protect data, or one could force the disk
firmware to reallocate the data in such a sector, preventing
its total loss.

What do you think about this? Is it possible?

For example, look at the readcd tool. It has an option to check
CDs for C2 read errors, which are not fatal because the CD drive
hardware corrects them. However, the number of these errors
may indicate that it is time to copy that CD to a new one and
save the data for the future.

Similar capability exists for DVD drives. However, I don't know 

Bruce Allen | 2 Sep 2006 21:51

Re: Predicting future unreadable sectors

I don't know of any Linux tools that support this.  The problem is that
the detailed information about which sectors are in trouble, and the kind 
of trouble, is vendor-specific. And as far as I know, none of the vendors 
document this, although they make use of it in vendor-specific tools.

Bruce

On Sat, 2 Sep 2006, Serguei Miridonov wrote:

> Hello,
>
> Please see the subject. Is that possible?
>
> I'm asking because by the time you try to access some data
> on the disk, the kernel reports read errors, and smartmontools
> reports uncorrectable sectors, the data typically cannot
> be recovered any more. Too late...
>
> Now, disk hardware has ECC bytes to recover from correctable
> read errors. smartmontools reports the total number of such
> recoveries in the Hardware_ECC_Recovered attribute, but without
> any details such as the sector number, how often recoveries
> occur in that sector, or the number of wrong and corrected
> bytes/bits in the sector. In theory, statistics on such minor
> failures could indicate which sectors are still readable but
> may fail soon. This would greatly improve the chance to
> duplicate/copy/protect data, or one could force the disk
> firmware to reallocate the data in such a sector, preventing
> its total loss.
>

Serguei Miridonov | 2 Sep 2006 22:56

Re: Predicting future unreadable sectors

In fact, it means that some data loss is unavoidable before
the user starts to worry about disk replacement. BTW, I don't
know of any vendor-specific tool, for Hitachi disks for example,
that would show minor errors and suggest that the user reallocate
those sectors. They typically scan the disk and reallocate
sectors only when an error becomes uncorrectable. Or am I wrong?

What is sad is that disk manufacturers rarely consider a disk
defective while sectors can still be replaced during the warranty
period. They all suggest running some kind of Fitness Test or
Disk Repair tool, and only issue an RMA when there are no
replacement sectors available.

My Hitachi DK23DA-30 started to show trouble during its first
year of use in a new notebook. At that time the disk was almost
empty and I had almost no problem reallocating about 35
sectors. For 2.5 years after that, not a single error. In May
2005 another sector was in trouble and got reallocated. One week
ago, at the end of August 2006, it happened again. Now I'm
starting to think about a new disk...

What are the criteria for new-disk behavior? Is it normal to
detect errors during the first year of operation? Should the
user insist on an RMA in such a case? What are some general
considerations? Also, regarding the purchase of a new disk for a
notebook, is there something like a database with hard disk
reliability data? Which brand is best?

Thank you and best regards,
Serguei.

Bruno Wolff III | 2 Sep 2006 23:39

Re: Predicting future unreadable sectors

On Sat, Sep 02, 2006 at 13:56:34 -0700,
  Serguei Miridonov <mirsev <at> cicese.mx> wrote:
> In fact, it means that some data loss is unavoidable before
> the user starts to worry about disk replacement. BTW, I don't
> know of any vendor-specific tool, for Hitachi disks for example,
> that would show minor errors and suggest that the user reallocate
> those sectors. They typically scan the disk and reallocate
> sectors only when an error becomes uncorrectable. Or am I wrong?

That is one reason you want to run RAID.

-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
Serguei Miridonov | 2 Sep 2006 23:41

Re: Predicting future unreadable sectors

On Saturday 02 September 2006 14:39, Bruno Wolff III wrote:
> On Sat, Sep 02, 2006 at 13:56:34 -0700,
>
>   Serguei Miridonov <mirsev <at> cicese.mx> wrote:
> > In fact, it means that some data loss is unavoidable
> > before the user starts to worry about disk replacement.
> > BTW, I don't know of any vendor-specific tool, for Hitachi
> > disks for example, that would show minor errors and
> > suggest that the user reallocate those sectors. They
> > typically scan the disk and reallocate sectors only when
> > an error becomes uncorrectable. Or am I wrong?
>
> That is one reason you want to run RAID.

On a notebook?

Bruno Wolff III | 2 Sep 2006 23:55

Re: Predicting future unreadable sectors

On Sat, Sep 02, 2006 at 14:41:28 -0700,
  Serguei Miridonov <mirsev <at> cicese.mx> wrote:
> On Saturday 02 September 2006 14:39, Bruno Wolff III wrote:
> > On Sat, Sep 02, 2006 at 13:56:34 -0700,
> >
> >   Serguei Miridonov <mirsev <at> cicese.mx> wrote:
> > > In fact, it means that some data loss is unavoidable
> > > before the user starts to worry about disk replacement.
> > > BTW, I don't know of any vendor-specific tool, for Hitachi
> > > disks for example, that would show minor errors and
> > > suggest that the user reallocate those sectors. They
> > > typically scan the disk and reallocate sectors only when
> > > an error becomes uncorrectable. Or am I wrong?
> >
> > That is one reason you want to run RAID.
> 
> On a notebook?

Notebooks have other ways to lose data that are probably more of a problem
than losing a couple of sectors' worth of data without advance notice.
You should be backing up data from your notebook to a safer location regularly.
