Bruce Allen | 1 Mar 2004 19:37

Re: Drives to add do database

Hi Jan,

> I'm running an old Pentium based machine with three hard drives
> installed. Two of them are not in smartmontools' drive database, so I
> think you might like to add them!

Please resend your mail to smartmontools-database.

> The output created by smartctl -a for these drives is attached to this
> email. As you can see, the Power_On_Hours value acts a bit strangely for
> the 120 GByte drive (hdb): it seems that the drive doesn't report
> minutes, half-minutes or seconds, and my calculator tells me that this
> counter isn't reset every 1092 hours as suggested in the FAQ on the
> smartmontools web site.

You are having the same issue reported under 'Maxtor drives' in the FAQ.

Your 120 GB drive is 6840 hours old.
6840*64 = 437760 'maxtorminutes'

Attribute 9 keeps this in a two-byte RAW value that rolls over every 65536
'maxtorminutes'.  Hence the counter has rolled over 6 times, and the
current value of Attribute 9 is 437760-6*65536 = 44544 'maxtorminutes'.
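
A minimal sketch of the rollover arithmetic above, in Python. The figures
come straight from this message; the variable names are illustrative only
and are not part of smartmontools.

```python
# Figures taken from the message above; names are hypothetical.
ROLLOVER = 2 ** 16              # a two-byte raw value wraps at 65536

total_units = 6840 * 64         # 437760 'maxtorminutes', as computed above

# divmod gives (number of complete wrap-arounds, value left in the counter)
rollovers, raw_value = divmod(total_units, ROLLOVER)

print(total_units)   # 437760
print(rollovers)     # 6
print(raw_value)     # 44544
```

Only the remainder is stored in the attribute, so the raw value on its own
does not say how many times the counter has wrapped.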

> For both drives, the LifeTime printed in the self-test log seems to be
> OK - short self tests are run every six hours by a cronjob.

If you watch over the long term you'll see that they advance about 7% too
slowly, since a 'maxtor minute' is 64 seconds, not 60.
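
As a rough check of that figure, here is a small Python sketch. It assumes
the counter ticks once every 64 seconds while being read as if each tick
were a 60-second minute; the names are illustrative only.

```python
# Assumption: one counter tick per 64 s, interpreted as a 60 s minute.
SECONDS_PER_TICK = 64
SECONDS_PER_MINUTE = 60

# Fraction of real elapsed time that the counter reports.
reported_fraction = SECONDS_PER_MINUTE / SECONDS_PER_TICK

# How far the reported time falls short of real time, in percent.
shortfall_percent = (1 - reported_fraction) * 100

# How much must be added to the reported time to recover real time, in percent.
correction_percent = (SECONDS_PER_TICK / SECONDS_PER_MINUTE - 1) * 100

print(reported_fraction)             # 0.9375
print(shortfall_percent)             # 6.25
print(round(correction_percent, 1))  # 6.7
```

Depending on which way the ratio is taken, this gives 6.25% or about 6.7%,
in line with the rounded figure quoted above.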


Bruce Allen | 1 Mar 2004 22:18

RE: Re: long selftest and SAMSUNG SV1604N not working

Hi Michael,

I want to post a transcript in the smartmontools FAQ, which shows how to
identify the file stored at a particular failing LBA.  I have some
questions about what you wrote in your earlier email.

I have been waiting to have a disk on the cluster that fails its
self-tests -- I just got one this morning.  Here is the failed self-test
on /dev/hda:

Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%       212         0x016561e9
# 2  Extended offline    Completed: read failure       90%       181         0x016561e9
# 3  Extended offline    Completed without error       00%        14         -
# 4  Extended offline    Completed without error       00%         4         -

so the error is at LBA = 0x016561e9 = 23421417

fdisk -lu /dev/hda shows:

[root <at> medusa-slave031 root]# fdisk -lu /dev/hda

Disk /dev/hda: 123.5 GB, 123522416640 bytes
255 heads, 63 sectors/track, 15017 cylinders, total 241254720 sectors
Units = sectors of 1 * 512 = 512 bytes

Bruce Allen | 1 Mar 2004 22:19

Re: Drives to add do database

> > You are having the same issue reported under 'Maxtor drives' in the FAQ.
> > 
> > Your 120 GB drive is 6840 hours old.
> > 6840*64 = 437760 'maxtorminutes'
> > 
> > Attribute 9 keeps this in a two-byte RAW value that rolls over every 65536
> > 'maxtorminutes'.  Hence the counter has rolled over 6 times, and the
> > current value of Attribute 9 is 437760-6*65536 = 44544 'maxtorminutes'.
> I think it is impossible - due to the wrap-around every 65536
> "maxtorminutes" - to add a -v option to handle this behaviour, since you
> can't know how often this attribute has wrapped around in the past?

Correct.

I can not add a -v option to work around this firmware bug.  The only way
I can calculate the lifetime is from the self-test log entry -- which
gives the answer directly.

Cheers,
	Bruce

Thomas Subotitsch | 1 Mar 2004 17:57

SAMSUNG SP1614N - Warning

Hi developers!
I tried out smartmontools 5.29 and discovered that none of my 3 disks
returned valid info about its temperature. :(
I sent the data for the "unknown" disks to the database ML. But what does
the following mean?

Greetings,

-Thomas

smartctl version 5.29 Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     SAMSUNG SP1614N
Serial Number:    0642J1FW606994
Firmware Version: TM100-23
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   7
ATA Standard is:  ATA/ATAPI-7 T13 1532D revision 0
Local Time is:    Mon Mar  1 17:05:09 2004 CET

==> WARNING: Contact developers at smartmontools-support <at> lists.sourceforge.net; may need -F samsung2 disabled

SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

Bruce Allen | 1 Mar 2004 23:07

Re: SAMSUNG SP1614N - Warning

> I tried out smartmontools 5.29 and discovered that none of my 3 disks
> returned valid info about its temperature. :(

> 194 Temperature_Celsius     0x0022   139   133   000    Old_age   Always       -       33

Your disk is 33 Celsius.  Please read the documentation.

Bruce

Bruce Allen | 2 Mar 2004 01:18

Re: S.M.A.R.T.


On Sat, 28 Feb 2004 pzmunka <at> freestart.hu wrote:

> Dear Bruce!
> 
> Sorry to disturb you again! I have one more question: where are you
> getting this attribute information? From the vendors? I have my own
> program (hardware surface test / S.M.A.R.T. query) and I would like to
> improve it with this attribute information. But my program is
> commercial, and I see that your source code is under the GNU licence,
> and I wouldn't like to violate that! So if you are getting the attribute
> information (and only that) from the vendors, or it's not under the GNU
> licence, then I could use it in my commercial program... (this is a
> "morality play")
> 
> Thanks for the answer, and bye

I'm afraid that the news is not very good.  I got the Attribute names from
Smartsuite, by looking around at other GPL'd programs, and by talking with
vendors.  I don't think you can use this information in a non-GPL'd
program.

Cheers,
	Bruce

> On 2004.02.28 at 11:44, Bruce Allen <ballen <at> gravity.phys.uwm.edu> wrote:
> 
> > On Fri, 27 Feb 2004 pzmunka <at> freestart.hu wrote:
> > > On 2004.02.27 at 0:22, Bruce Allen <ballen <at> gravity.phys.uwm.edu> wrote:
> > > 

Sergey Vlasov | 2 Mar 2004 13:26

Re: long selftest and SAMSUNG SV1604N not working

On Mon, 1 Mar 2004 15:18:21 -0600 (CST) Bruce Allen wrote:

[skip]
> so the error is at LBA = 0x016561e9 = 23421417
> 
> fdisk -lu /dev/hda shows:
> 
> [root <at> medusa-slave031 root]# fdisk -lu /dev/hda
> 
> Disk /dev/hda: 123.5 GB, 123522416640 bytes
> 255 heads, 63 sectors/track, 15017 cylinders, total 241254720 sectors
> Units = sectors of 1 * 512 = 512 bytes
> 
>    Device Boot    Start       End    Blocks   Id  System
> /dev/hda1   *        63   4209029   2104483+  83  Linux
> /dev/hda2       4209030   5269319    530145   82  Linux swap
> /dev/hda3       5269320 238227884 116479282+  83  Linux
> /dev/hda4     238227885 241248104   1510110   83  Linux
> 
> So this is an error in the file system on /dev/hda3 since
> 5269320 < 23421417 < 238227884.
> 
> Here is the debugfs transcript:
> debugfs 1.32 (09-Nov-2002)
> 
> debugfs:  open /dev/hda3
> 
> debugfs:  icheck 0x016561e9
> Block	Inode number
> 23421417	<block not found>

Bruce Allen | 2 Mar 2004 17:58

Re: Re: long selftest and SAMSUNG SV1604N not working

Sergey,

This is VERY helpful.  Thank you!  I have one question, below.  (I am
going to write up a how-to, once I have the details sorted out).

> > so the error is at LBA = 0x016561e9 = 23421417
> > 
> > fdisk -lu /dev/hda shows:
> > /dev/hda3       5269320 238227884 116479282+  83  Linux

> First you need to calculate the sector number relative to the start of
> partition:
> 
> 23421417 - 5269320 = 18152097

Good, I agree.

> Then you need to convert the sector number to the filesystem block
> number. To do this, you need to know the filesystem block size -
> tune2fs -l /dev/hda1 will print this. Usually the block size is 4096
> bytes, which is 8 512-byte sectors.
> 
> 18152097 / 8 = 2269012.125 (the integer part gives the block number)
> So you need to do "icheck 2269012" in debugfs.

I am concerned about this.  Isn't it the case that the file system blocks
are counted from 1, not from zero?

debugfs:  icheck 0
icheck: Invalid block number 0

Sergey Vlasov | 2 Mar 2004 18:33

Re: Re: long selftest and SAMSUNG SV1604N not working

On Tue, Mar 02, 2004 at 10:58:18AM -0600, Bruce Allen wrote:
[skip]
> > Then you need to convert the sector number to the filesystem block
> > number. To do this, you need to know the filesystem block size -
> > tune2fs -l /dev/hda1 will print this. Usually the block size is 4096
> > bytes, which is 8 512-byte sectors.
> > 
> > 18152097 / 8 = 2269012.125 (the integer part gives the block number)
> > So you need to do "icheck 2269012" in debugfs.
> 
> I am concerned about this.  Isn't it the case that the file system blocks
> are counted from 1, not from zero?
> 
> debugfs:  icheck 0
> icheck: Invalid block number 0
> debugfs:  icheck 1
> Block	Inode number
> 1	<block not found>
> 
> In this case, isn't the correct block given by 2269013 not 2269012?

In the ext2 filesystem, block number 0 is explicitly invalid.  The first
1 KB is unused by ext2 (often a bootloader is located there), and the
primary superblock is located at the 1 KB offset.  If the block size
is large, the remaining space in block 0 is unused.
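
The steps discussed in this thread can be put together as a short Python
sketch. The partition table is copied from the fdisk -lu output quoted
earlier; the 4096-byte block size is the "usual" value mentioned above and
is an assumption here (tune2fs -l on the affected partition reports the
real one). The function and variable names are illustrative only. The
integer division follows the quoted "icheck 2269012" step.

```python
# Start and end sectors (512-byte units) from the fdisk -lu output above.
PARTITIONS = {
    "/dev/hda1": (63, 4209029),
    "/dev/hda2": (4209030, 5269319),
    "/dev/hda3": (5269320, 238227884),
    "/dev/hda4": (238227885, 241248104),
}
SECTOR_SIZE = 512
BLOCK_SIZE = 4096   # assumption: check with tune2fs -l


def locate(lba):
    """Return (partition, sector within partition, filesystem block)."""
    for device, (start, end) in PARTITIONS.items():
        if start <= lba <= end:
            sector_in_partition = lba - start
            # The integer part of the division gives the block number.
            block = sector_in_partition * SECTOR_SIZE // BLOCK_SIZE
            return device, sector_in_partition, block
    return None   # the LBA lies outside every listed partition


lba = int("0x016561e9", 16)   # LBA_of_first_error from the self-test log
result = locate(lba)

print(lba)      # 23421417
print(result)   # ('/dev/hda3', 18152097, 2269012)
```

The block number printed last is the value to pass to icheck in debugfs,
under the block-size assumption stated above.
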
Bruce Allen | 2 Mar 2004 20:14

Re: Re: long selftest and SAMSUNG SV1604N not working

Sergey,

Thanks for the explanation.

I just did what you suggested, and it worked!  (Note: after the dd
if=/dev/zero... , I did 'sync' to force this onto the disk).

Before:

root]# smartctl -A /dev/hda
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       1

After:

root]# smartctl -A /dev/hda
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       1
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       1

