Bruce Allen | 1 Oct 2003 14:23

Re: SAMSUNG SP1203N

Hi Leon,

Thanks for the report.

> this disk has its power on time in minutes instead of hours.

I thought it had its power-on time in half-minutes.  Are you sure it's
minutes?  Could you run a short self-test then compare the self-test log
timestamp with Attribute 9's raw value?

> Also, it does NOT seem to need -F samsung.

> "SMART Error Log Version: 1
> Warning: ATA error count 1024 inconsistent with error log pointer 5"

The firmware has at least one obvious error -- it is byte-swapping the ATA
error count.  On your disk this appears to be 1024 = 00000100 00000000 in
binary.  The actual value should be 00000000 00000100 = 4 (base 10).

And indeed, you'll notice only four ATA errors reported below, not "the
last five".
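
As an illustration of the byte-swap described above, here is a minimal shell sketch. It is not the smartmontools code, and the function name swap16 is made up for this example; it only exchanges the two bytes of a 16-bit value, which is the correction the reported count appears to need.

```shell
#!/bin/bash
# Hypothetical helper, for illustration only: swap the two bytes of a
# 16-bit unsigned value, as the firmware appears to do with the count.
swap16() {
    local v=$(( $1 & 0xffff ))
    echo $(( ((v & 0xff) << 8) | (v >> 8) ))
}

swap16 1024    # prints 4
swap16 4       # prints 1024
```

Applying the swap twice returns the original value, so the same helper converts in either direction.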

I can add a flag (-F samsung2) to compensate for this firmware bug, and
put it into the drive database, in just a few minutes.  Unfortunately the
current smartmontools CVS archive is not in a buildable state because of
conversion to use autotools.  When that work is completed, I'll send you
a patched version to test.

Cheers,
	Bruce

Leon Woestenberg | 1 Oct 2003 19:31

Re: SAMSUNG SP1203N

Hello Bruce and mailing list(eners),

> > this disk has its power on time in minutes instead of hours.
>
> I thought it had its power-on time in half-minutes.  Are you sure it's
> minutes?  Could you run a short self-test then compare the self-test log
> timestamp with Attribute 9's raw value?
>
You are right, drive reports half minutes. Confirmed by both your method,
and comparing system clock with drive 'power on timer' over a period of
time. A very big RTFM for me.
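
For a drive that counts power-on time in half-minutes, the raw value of Attribute 9 can be converted by hand. The sketch below assumes exactly that unit; the helper name halfmin_to_hm is hypothetical and not part of smartmontools.

```shell
#!/bin/bash
# Hypothetical helper: convert an Attribute 9 raw value counted in
# half-minutes into whole hours and leftover minutes.
halfmin_to_hm() {
    local raw=$1
    local minutes=$(( raw / 2 ))
    echo "$(( minutes / 60 ))h $(( minutes % 60 ))m"
}

halfmin_to_hm 240    # prints "2h 0m"
```

The same arithmetic is what makes the cross-check against the self-test log timestamp work: the two numbers should agree once both are expressed in hours.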

While Reading The Fine Manual, I found this little typo:

In the last sentence describing the -F option, there is an error in the word
'error' as
it reads: "(2) very large numbers of  ATA errors  reported in the ATA erorr
log;"

Then a small change request, a very minor issue:

smartctl -o on -c /dev/hda

I think there is ambiguity about what this command would report; will it
(a) first enable offline tests and _then_ return the capabilities and
settings, or
(b) return current capabilities and settings, and _then_ enable offline
testing.
(c) use the order of command options
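
Whichever order smartctl applies internally, the ambiguity can be sidestepped by issuing two separate commands. The wrapper below is only a sketch: the function name and the SMARTCTL override variable are assumptions made for this example, while -o on and -c are the options discussed above.

```shell
#!/bin/bash
# Hypothetical wrapper: enable automatic offline testing first, and only
# then print the capabilities, so the report reflects the new setting.
# SMARTCTL can be overridden, e.g. for a dry run with SMARTCTL=echo.
enable_then_report() {
    local dev=$1
    local smartctl=${SMARTCTL:-smartctl}
    "$smartctl" -o on "$dev" && "$smartctl" -c "$dev"
}
```

Calling enable_then_report /dev/hda runs the two steps in a fixed order, and the second step is skipped if the first one fails.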


Bruce Allen | 1 Oct 2003 22:34

Re: SAMSUNG SP1203N

> While Reading The Fine Manual, I found this little typo:
> 
> In the last sentence describing the -F option, there is an error in the word
> 'error' as
> it reads: "(2) very large numbers of  ATA errors  reported in the ATA erorr
> log;"

Fixed!

> Then a small change request, a very minor issue:
> 
> smartctl -o on -c /dev/hda
> 
> I think there is ambiguity about what this command would report; will it
> (a) first enable offline tests and _then_ return the capabilities and
> settings, or
> (b) return current capabilities and settings, and _then_ enable offline
> testing.
> (c) use the order of command options
> 
> I found that smartctl does (b), where (c) or (a) might be expected. I tested
> as follows:
> 
> # smartctl -o off /dev/hda
> # smartctl -c /dev/hda
> Auto Off-line Data Collection: Disabled.
> 
> smartctl -o on -c /dev/hda
> Auto Off-line Data Collection: Disabled.    <<<< This might be unexpected.
> smartctl -o on -c /dev/hda

Tom Maddox | 2 Oct 2003 01:55

Segmentation fault

Hi, just thought I'd give a little information about a problem I
encountered running smartd on my system.

Here are the specifics:
RedHat 9 with 2.4.20 kernel
Only hard drive is a Seagate ST340017A (not in smartctl/smartd database)
Nforce IDE chipset
smartmontools 5.1-18

When running smartd with the following smartd.conf, I get a segmentation
fault:

DEVICESCAN -H -l error -m tmaddox <at> thereinc.com -P ignore

The same configuration worked fine on a RedHat 7.3 system with a 2.4.18
kernel with SCSI drives but not in a system with IDE.  The culprit
appears to be "-l error", since the segfaults stopped when I removed
that directive.

Any thoughts?

Thanks,

Tom

-------------------------------------------------------
This sf.net email is sponsored by:ThinkGeek
Welcome to geek heaven.
http://thinkgeek.com/sf

Michael Bendzick | 3 Oct 2003 19:32

SMART & Bad Block Interaction

In the message at
http://lists.debian.org/debian-isp/2003/debian-isp-200304/msg00191.html, the
writer indicates that SMART is able to keep track of how many bad sectors
have been remapped to spare locations on the disk...

"You can use SMART to determine how many re-mapping events have occurred.
Expect to be able to remap at least 1000 blocks before running out."

Does that sound like a correct parameter that is available?  (I'm not
worried about how many blocks/sectors are supposed to be available.) Is it
available in smartmontools as an attribute name that I'm not interpreting as
"remap count"?

Thank you,

-Michael Bendzick
Systems and Software Engineering Intern
Logic Product Development
michael.b <at> logicpd.com
www.logicpd.com 

Bruce Allen | 3 Oct 2003 22:52

Re: SMART & Bad Block Interaction

Hi Michael,

On Fri, 3 Oct 2003, Michael Bendzick wrote:

> In the message at
> http://lists.debian.org/debian-isp/2003/debian-isp-200304/msg00191.html, the
> writer indicates that SMART is able to keep track of how many bad sectors
> have been remapped to spare locations on the disk...
> 
> "You can use SMART to determine how many re-mapping events have occurred.
> Expect to be able to remap at least 1000 blocks before running out."
> 
> Does that sound like a correct parameter that is available?  (I'm not
> worried about how many blocks/sectors are supposed to be available.) Is it
> available in smartmontools as an attribute name that I'm not interpreting as
> "remap count"?

Yes, this parameter is available on most modern disks.  For an example
see:
  http://smartmontools.sourceforge.net/examples/MAXTOR-1.txt
which is one of the examples here:
  http://smartmontools.sourceforge.net/#sampleoutput

The Attribute that you want is the raw value of Attribute #5: Reallocated
Sector Count.  On this example it's 499 sectors (they are 512 bytes each
so this is just under 256 KB total).
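
For scripted monitoring, the raw value can be pulled out of the attribute table with a small filter. This is a sketch under stated assumptions: the helper name raw_value_of is hypothetical, and it assumes each attribute line starts with the numeric ID and ends with the raw value, which does not hold for attributes whose raw field contains extra text.

```shell
#!/bin/bash
# Hypothetical helper: read "smartctl -A" style attribute lines on stdin
# and print the last field (the raw value) of the attribute whose ID is $1.
# Assumes the ID is the first field and the raw value is the last one.
raw_value_of() {
    awk -v id="$1" '$1 == id { print $NF }'
}

# 499 reallocated sectors of 512 bytes each:
echo $(( 499 * 512 ))    # prints 255488
```

A typical use would be smartctl -A /dev/hda | raw_value_of 5, which prints the Reallocated Sector Count if the output has the assumed layout.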

Cheers,
	Bruce


Bruce Allen | 3 Oct 2003 23:29

smartmontools 5.19 release

I have just issued a highly experimental release of smartmontools.  It is
the first release based on the autoconf/automake installation tools.  
Please report problems to the mailing list, especially missing or
extraneous files, etc.  See INSTALL for installation instructions.

The major installation changes are:
 [0] ./configure && make && make install 
     gives the same installation paths as previous releases
 [1] installation scripts based on autoconf/automake
 [2] ./configure [options] lets you set arbitrary paths
 [3] supports FHS with ./configure --prefix=/usr/local
 [4] correct paths are inserted into all man pages, binaries, etc.
 [5] tarballs and RPMs are now GPG-signed

Note that starting with this release the releases will be numbered 5.19,
5.20, etc.  So the release numbering since the initial smartmontools
release reads: 5.0-1, ... , 5.0-49, 5.1-1, ... 5.1-18, 5.19, 5.20, ...

Changes since 5.1-18 release.

  [BA] smartctl: added '-T verypermissive' option which is
       equivalent to giving '-T permissive' many times.

  [BA] Try harder to identify from IDENTIFY DEVICE structure
       if SMART supported/enabled.  smartd now does a more
       thorough job of trying to assess this before sending
       a SMART status command to find out for sure.

  [BA] smartctl: it's now possible to override the program's
       guess of the device type (ATA or SCSI) with -d option.

Jeffrey B. Layton | 4 Oct 2003 13:45

Using Smartmontools in a cluster

Good morning!

   I've been looking at using smartmontools on a few clusters
especially since I read Bruce's comments that he uses it in
his clusters. I've been trying smartmontools on a few
different hard drives (some old, some new) to get a feel for
what is being reported. Now, I'm ready to start diving into
parsing smartmontools output for useful information to
watch for drive problems and drive failures. However, let
me ask, does anyone have any scripts for parsing the output
of smartmontools for a cluster setting? A related question I
have is, what parts of the output are important to watch for
predicting or discovering drive failure?

TIA!

Jeff

Bruce Allen | 4 Oct 2003 14:39

Re: Using Smartmontools in a cluster

Hi Jeffrey,

>    I've been looking at using smartmontools on a few clusters
> especially since I read Bruce's comments that he uses it in his
> clusters. I've been trying smartmontools on a few different hard
> drives (some old, some new) to get a feel for what is being reported.
> Now, I'm ready to start diving into parsing smartmontools output for
> useful information to watch for drive problems and drive failures.

What we do on our cluster is:

(1) Run smartd, with this config file:
# First and second ATA/IDE hard disk.  Monitor all attributes
/dev/hda -S on -o on -a -I 194 -m me <at> my.address
/dev/hdc -S on -o on -a -I 194 -m me <at> my.address

(2) Run self-tests once per week from a cron script:

#! /bin/bash

# Once per week, run extended self-tests on the disks; see man smartctl
# for further details of how this works.

if [ -e /proc/ide/hda ] ; then
    /usr/sbin/smartctl -t long /dev/hda > /dev/null 2> /dev/null && \
	/usr/bin/logger -t rundiskselftests "Starting long self-test on /dev/hda" || \
	/usr/bin/logger -t rundiskselftests "FAILED starting long self-test on /dev/hda"
fi

Bruce Allen | 2 Oct 2003 11:09

Re: Segmentation fault

Hi Tom,

> Hi, just thought I'd give a little information about a problem I
> encountered running smartd on my system.
> 
> Here are the specifics:
> RedHat 9 with 2.4.20 kernel
> Only hard drive is a Seagate ST340017A (not in smartctl/smartd database)
> Nforce IDE chipset
> smartmontools 5.1-18
> 
> When running smartd with the following smartd.conf, I get a segmentation
> fault:
> 
> DEVICESCAN -H -l error -m tmaddox <at> thereinc.com -P ignore

Well, you have the dubious honor of having found the first "real" bug in
5.1-18.  Looks like it's time for 5.19 (we are changing the release
numbering system).

I've just fixed this bug in the code base.  A good workaround is to add
"-o on" to the line above. This will enable auto-offline testing on any
drives that support it, probably a "good thing".

My apologies -- this was a programming oversight on my part.

Bruce
