Piete Brooks | 1 Dec 10:14 2005

Ideas as to how to find Pending_Sector's ?

I've read through http://smartmontools.sourceforge.net/BadBlockHowTo.txt but 
can't spot anything of use.
[ BTW: I have all FSs on RAID1 arrays, so it's easier for me -- I just break 
the mirror, "badblocks -w" the partition, and re-sync the mirror :-) ]

I have a problem with carp which reports:

ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b 084   084   060    Pre-fail Always  -           14287017
  2 Throughput_Performance  0x0005 100   100   050    Pre-fail Offline -           950
  3 Spin_Up_Time            0x0007 129   129   024    Pre-fail Always  -           469 (Average 539)
  4 Start_Stop_Count        0x0012 100   100   000    Old_age  Always  -           265
  5 Reallocated_Sector_Ct   0x0033 100   100   005    Pre-fail Always  -           0
  7 Seek_Error_Rate         0x000b 100   100   067    Pre-fail Always  -           0
  8 Seek_Time_Performance   0x0005 095   095   020    Pre-fail Offline -           63
  9 Power_On_Hours          0x0012 097   097   000    Old_age  Always  -           26578
 10 Spin_Retry_Count        0x0013 100   100   060    Pre-fail Always  -           0
 12 Power_Cycle_Count       0x0032 100   100   000    Old_age  Always  -           28
196 Reallocated_Event_Count 0x0032 100   100   000    Old_age  Always  -           0
197 Current_Pending_Sector  0x0022 100   100   000    Old_age  Always  -           34
198 Offline_Uncorrectable   0x0008 100   100   000    Old_age  Offline -           0
199 UDMA_CRC_Error_Count    0x000a 200   200   000    Old_age  Always  -           0

and

Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error

Jeremy James | 2 Dec 12:05 2005

Re: newbie needs smartmon support

Michael wrote:
> The system did not reboot ... it hung in the AMI BIOS saying something like:
> 
>  S.M.A.R.T. has detected a problem on IDE master 3 ... backup and replace
> drive

This is the disk itself telling the BIOS that it has failed a SMART 
test. This is usually only going to happen if it is very close to 
failing (or probably already failed) - it seems to be a fairly 
conservative status, so if it says failed, start looking at a new disk.

> I powered down the system and restarted, but did not receive this message.
> 
> I was not aware (or had forgotten) that this BIOS had any SMART detection.

The BIOS performs only this simple check of the general SMART status - 
this is the FAILED/PASSED message shown at the top of smartctl's output. 
The BIOS doesn't check any other SMART parameters.
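
As a rough illustration of the overall status check described above, and not something from the original message, the Python sketch below pulls the PASSED/FAILED word out of smartctl's text output. The helper name overall_health and the sample text are made up for illustration; the line format is assumed to match the "SMART overall-health self-assessment test result" line that smartctl prints.

```python
import re

# Assumed line format, as printed in the READ SMART DATA section of smartctl output.
_HEALTH_RE = re.compile(
    r"^SMART overall-health self-assessment test result:\s*(\S+)",
    re.MULTILINE,
)


def overall_health(smartctl_output):
    """Return the status word (for example 'PASSED'), or None if the line is absent."""
    match = _HEALTH_RE.search(smartctl_output)
    return match.group(1) if match else None


# Hypothetical sample text standing in for real smartctl output.
sample = """=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
"""

print(overall_health(sample))
```

Anything other than PASSED here is the same condition the BIOS is reporting, so it deserves the same caution.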

> Q: In general, can I trust the BIOS when it says that a drive needs to be
> replaced?

See above. I certainly wouldn't trust the disk by that point.

> smartctl says:
> 
> [root@mykiss mth]# smartctl -a /dev/sda
> smartctl version 5.33 [x86_64-redhat-linux-gnu] Copyright (C) 2002-4 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
> 

Michael | 2 Dec 15:08 2005

Re: newbie needs smartmon support

Jeremy wrote a lot of good stuff:

>> S.M.A.R.T. has detected a problem on IDE master 3 ... backup
>> and replace drive
>
> This is the disk itself telling the BIOS that it has failed a SMART
> test. This is usually only going to happen if it is very close to
> failing (or probably already failed) - it seems to be a fairly
> conservative status, so if it says failed, start looking at a new disk.

>> Q: In general, can I trust the BIOS when it says that a
>> drive needs to be replaced?
>
> See above. I certainly wouldn't trust the disk by that point.

I have pulled the drive, labeled it, and put it on a shelf.

> As far as I understand, libata support has just been brought in for the
> 2.6.15 kernel (now in release candidate stage). If you want to try it
> out, build a custom kernel, grab and build the CVS version of
> smartmontools, use the '-d ata' option (note: different from the docs),
> and let us know how it goes.

I have recompiled kernels once they have been packaged into .src.rpm
updates for Fedora ... but I have never pulled one down from kernel.org
and built it myself.

So, this might be more than I can handle ... but I will give it a shot.

>> Finally, if I could get some SMART information from this drive ...

Don O'Neil | 2 Dec 21:31 2005

RE: 3Ware Inappropriate ioctl for device

Anyone have any ideas why this fails, even after building the latest CVS?

I've tried both twe0 and twed0... My log is saying:

Dec  2 11:21:17 bigbird_new /kernel: twe0: AEN: <twe0: port 0: SMART threshold exceeded>

Which is why I need to check it out. Here are the commands I was using:

./smartctl -a -d 3ware,0 /dev/twed0
smartctl version 5.34 [i386-unknown-freebsd4.11] Copyright (C) 2002-5 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Smartctl open device: /dev/twed0 failed: Inappropriate ioctl for device

./smartctl -a -d 3ware,0 /dev/twe0
smartctl version 5.34 [i386-unknown-freebsd4.11] Copyright (C) 2002-5 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Smartctl open device: /dev/twe0 failed: Inappropriate ioctl for device    

Same result as with the older version (see below)... Any ideas?

-----Original Message-----
From: smartmontools-support-admin <at> lists.sourceforge.net
[mailto:smartmontools-support-admin <at> lists.sourceforge.net] On Behalf Of
Bruce Allen
Sent: Monday, November 28, 2005 9:15 PM

Eduard Martinescu | 3 Dec 00:07 2005

Re: 3Ware Inappropriate ioctl for device

I'm working on it. Should have something by the end of the weekend.

Don O'Neil wrote:
> Anyone have any ideas why this fails, even after building the latest CVS?
> 
> I've tried both twe0 and twed0... My log is saying:
> 
> Dec  2 11:21:17 bigbird_new /kernel: twe0: AEN: <twe0: port 0: SMART threshold exceeded>
> 
> Which is why I need to check it out. Here are the commands I was using:
> 
> ./smartctl -a -d 3ware,0 /dev/twed0
> smartctl version 5.34 [i386-unknown-freebsd4.11] Copyright (C) 2002-5 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
> 
> Smartctl open device: /dev/twed0 failed: Inappropriate ioctl for device
> 
> ./smartctl -a -d 3ware,0 /dev/twe0
> smartctl version 5.34 [i386-unknown-freebsd4.11] Copyright (C) 2002-5 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
> 
> Smartctl open device: /dev/twe0 failed: Inappropriate ioctl for device
> 
> 
> Same result as with the older version (see below)... Any ideas?
> 
> -----Original Message-----

Bruce Allen | 5 Dec 17:48 2005

Re: Ideas as to how to find Pending_Sector's ?

Looks like you have a drive that doesn't reset Current_Pending_Sector to 
zero, even when it is no longer pending.

Bruce
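
To make the situation concrete, here is a hedged Python sketch (an illustration, not something posted in the thread) that reads attribute rows like those quoted below and reports a non-zero Current_Pending_Sector next to zero Reallocated_Sector_Ct and Offline_Uncorrectable. parse_attributes is a hypothetical helper; it assumes the ten-column layout shown in the posting, with any wrapped lines already rejoined.

```python
def parse_attributes(table_text):
    """Map attribute ID -> (name, raw value string) for each data row."""
    attrs = {}
    for line in table_text.splitlines():
        fields = line.split()
        # Data rows start with a numeric ID and carry at least ten columns:
        # ID, name, flag, value, worst, thresh, type, updated, when_failed, raw.
        if len(fields) >= 10 and fields[0].isdigit():
            attrs[int(fields[0])] = (fields[1], " ".join(fields[9:]))
    return attrs


# Three rows copied from the posting, plus the header to show it is skipped.
sample = """\
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct  0x0033 100   100   005    Pre-fail Always  - 0
197 Current_Pending_Sector 0x0022 100   100   000    Old_age  Always  - 34
198 Offline_Uncorrectable  0x0008 100   100   000    Old_age  Offline - 0
"""

attrs = parse_attributes(sample)
pending = int(attrs[197][1].split()[0])
reallocated = int(attrs[5][1].split()[0])
uncorrectable = int(attrs[198][1].split()[0])

print(f"pending={pending} reallocated={reallocated} uncorrectable={uncorrectable}")
```

A pending count that stays above zero while the other two counters remain at zero is the pattern seen in this report.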

On Thu, 1 Dec 2005, Piete Brooks wrote:

> I've read through http://smartmontools.sourceforge.net/BadBlockHowTo.txt but
> can't spot anything of use.
> [ BTW: I have all FSs on RAID1 arrays, so it's easier for me -- I just break
> the mirror, "badblocks -w" the partition, and re-sync the mirror :-) ]
>
> I have a problem with carp which reports:
>
> ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
>   1 Raw_Read_Error_Rate     0x000b 084   084   060    Pre-fail Always  -           14287017
>   2 Throughput_Performance  0x0005 100   100   050    Pre-fail Offline -           950
>   3 Spin_Up_Time            0x0007 129   129   024    Pre-fail Always  -           469 (Average 539)
>   4 Start_Stop_Count        0x0012 100   100   000    Old_age  Always  -           265
>   5 Reallocated_Sector_Ct   0x0033 100   100   005    Pre-fail Always  -           0
>   7 Seek_Error_Rate         0x000b 100   100   067    Pre-fail Always  -           0
>   8 Seek_Time_Performance   0x0005 095   095   020    Pre-fail Offline -           63
>   9 Power_On_Hours          0x0012 097   097   000    Old_age  Always  -           26578
>  10 Spin_Retry_Count        0x0013 100   100   060    Pre-fail Always  -           0
>  12 Power_Cycle_Count       0x0032 100   100   000    Old_age  Always  -           28
> 196 Reallocated_Event_Count 0x0032 100   100   000    Old_age  Always  -           0
> 197 Current_Pending_Sector  0x0022 100   100   000    Old_age  Always  -           34

Pete | 6 Dec 00:45 2005

Power_on values

Hi,

I have 2 disks in a software RAID1 config. running on Linux.  How can the
power_on values be different since the drives power on at the same
time?

TIA,
Peter

Smart Version 3.2
2 disks: Maxtor 6Y160P0 (160 GB), RAID 1

smartctl -a /dev/hdg |grep -i "power_on"
9 Power_On_Minutes        0x0032   209   209   000    Old_age   Always       - 219h+49m

smartctl -a /dev/hde |grep -i "power_on"
9 Power_On_Minutes        0x0032   216   216   000    Old_age   Always       - 916h+06m
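
To compare the two raw values numerically, a small Python sketch may help. It is an illustration only: power_on_minutes is a hypothetical helper, and it assumes the raw value always has exactly the "NNNh+NNm" form shown above, which may not hold for other drives or smartctl versions.

```python
import re


def power_on_minutes(raw):
    """Convert a raw value such as '219h+49m' into a total number of minutes."""
    match = re.fullmatch(r"(\d+)h\+(\d+)m", raw.strip())
    if match is None:
        raise ValueError(f"unexpected Power_On_Minutes format: {raw!r}")
    hours, minutes = (int(part) for part in match.groups())
    return hours * 60 + minutes


# Raw values copied from the two smartctl outputs above.
hdg = power_on_minutes("219h+49m")
hde = power_on_minutes("916h+06m")

print(f"hdg: {hdg} min, hde: {hde} min, difference: {hde - hdg} min")
```

This only quantifies the gap between the two drives; it says nothing about why the counters differ.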

-------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc. Do you grep through log files
for problems?  Stop!  Download the new AJAX search engine that makes
searching your log files as easy as surfing the  web.  DOWNLOAD SPLUNK!
http://ads.osdn.com/?ad_id=7637&alloc_id=16865&op=click

Tobias Wendorff | 3 Dec 17:42 2005

smartctl version 5.34 - bug information

Hi there,

"smartctl version 5.34 [i686-cygwin-xp-sp2]" told me this:

Internal error: unable to compile regular expression 
^(Hitachi )?HDS724040KL(AT|SA)80)$parentheses not balanced
Please inform smartmontools developers at 
smartmontools-support <at> lists.sourceforge.net
=== START OF INFORMATION SECTION ===
Internal error: unable to compile regular expression 
^(Hitachi )?HDS724040KL(AT|SA)80)$parentheses not balanced
Please inform smartmontools developers at 
smartmontools-support <at> lists.sourceforge.net
Device Model:     Maxtor 7L250S0
Serial Number:    L504NTHH
Firmware Version: BANC1E00
User Capacity:    251,000,193,024 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   7
ATA Standard is:  ATA/ATAPI-7 T13 1532D revision 0

Best regards,
Tobias 


Ken David | 4 Dec 17:10 2005

(no subject)

Hi Smartmontools:

I've spent the last few days (as a newbie) downloading and installing smartmontools on my G4PB OS X (Darwin), and then scouring the Net for anything and everything about how to set it up and how to interpret the attributes in the reports it generates. I discovered the fantastic article written by Bruce Allen some time ago in Linux Magazine - next to the description written by SpeedFan, it's the best and clearest I've come across.

I have one question. When I run:

/usr/local/sbin/smartctl -l error /dev/disk0

I get the following (abstracted), and the report and log indicate that this error occurs again and again. (I'm motivated to do this because my performance with Tiger 10.4.2 is treacly slow, and it seems to be associated with disk accesses.) In particular, EVERY error has the same signature, i.e. 40 51 01 7f f8 04 e0  Error: UNC 1 sectors at LBA = 0x0004f87f = 325759. Please, any ideas about what this means? At the bottom of this note are my overall disk scores from smartmontools.

Thanks for a fantastic tool and really cool and clear explanations in your articles and on sourceforge!

Cheers (and hope to hear from you),

Ken David

Error 1102 occurred at disk power-on lifetime: 3368 hours (140 days + 8 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 01 7f f8 04 e0  Error: UNC 1 sectors at LBA = 0x0004f87f = 325759

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 01 7f f8 04 e0 00      00:14:46.300  READ DMA EXT
  25 00 01 7e f8 04 e0 00      00:14:46.200  READ DMA EXT
  25 00 01 7d f8 04 e0 00      00:14:46.100  READ DMA EXT
  25 00 01 7c f8 04 e0 00      00:14:46.100  READ DMA EXT
  25 00 01 7b f8 04 e0 00      00:14:46.000  READ DMA EXT
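
For readers wondering how the LBA in the error line relates to the register dump, the Python sketch below recombines SN, CL, CH and the low four bits of DH into a 28-bit LBA. This is an illustration, not smartctl's own code: the 28-bit layout is an assumption that happens to reproduce the value smartctl printed, and lba28_from_registers is a hypothetical helper name.

```python
def lba28_from_registers(sn, cl, ch, dh):
    """Combine ATA task-file registers into a 28-bit LBA.

    SN holds bits 0-7, CL bits 8-15, CH bits 16-23,
    and the low nibble of DH holds bits 24-27.
    """
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# Register bytes copied from the error entry above: ER ST SC SN CL CH DH.
er, st, sc, sn, cl, ch, dh = bytes.fromhex("40 51 01 7f f8 04 e0")

lba = lba28_from_registers(sn, cl, ch, dh)
unc = bool(er & 0x40)  # bit 6 of the error register is the UNC flag

print(f"LBA = 0x{lba:08x} = {lba}, UNC = {unc}")
```

Every entry in the log decoding to the same LBA would mean the reads keep failing on one sector rather than on many.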


OVERALL SCORES

=== START OF INFORMATION SECTION ===
Device Model:     HTS548080M9AT00
Serial Number:    MRL421L4GB1NHB
Firmware Version: MG4OA53A
User Capacity:    80,026,361,856 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   6
ATA Standard is:  ATA/ATAPI-6 T13 1410D revision 3a
Local Time is:    Sun Dec  4 14:03:13 2005 GMT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME              FLAG     VAL   WRS   THR    TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate         0x000b   090   090   062    Pre-fail  Always       -       2359323
  2 Throughput_Performance      0x0005   100   100   040    Pre-fail  Offline      -       0
  3 Spin_Up_Time                0x0007   154   154   033    Pre-fail  Always       -       2
  4 Start_Stop_Count            0x0012   100   100   000    Old_age   Always       -       1513
  5 Reallocated_Sector_Ct       0x0033   054   054   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate             0x000b   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance       0x0005   100   100   040    Pre-fail  Offline      -       0
  9 Power_On_Hours              0x0012   093   093   000    Old_age   Always       -       3407
 10 Spin_Retry_Count            0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count           0x0032   100   100   000    Old_age   Always       -       1233
191 G-Sense_Error_Rate          0x000a   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count     0x0032   100   100   000    Old_age   Always       -       165
193 Load_Cycle_Count            0x0012   090   090   000    Old_age   Always       -       108543
194 Temperature_Celsius         0x0002   152   152   000    Old_age   Always       -       36 (Lifetime Min/Max 18/55)
196 Reallocated_Event_Count     0x0032   067   067   000    Old_age   Always       -       1498
197 Current_Pending_Sector      0x0022   100   100   000    Old_age   Always       -       2
198 Offline_Uncorrectable       0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count        0x000a   200   200   000    Old_age   Always       -       0

Jeremy James | 6 Dec 01:13 2005

Re: smartctl version 5.34 - bug information

Tobias Wendorff wrote:
> "smartctl version 5.34 [i686-cygwin-xp-sp2]" told me this:
> 
> Internal error: unable to compile regular expression ^(Hitachi 
> )?HDS724040KL(AT|SA)80)$parentheses not balanced

This appears to have been fixed in CVS just recently - on the 27th of 
November. I would suggest building a new version from the latest copy of 
the tree.

Jeremy
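
For anyone curious why the pattern fails to compile, here is a small Python sketch; it is illustrative only. Python's re module stands in for the regex library smartctl actually uses, and the "candidate" pattern is a guess at a balanced version made by dropping the stray closing parenthesis, not necessarily the change that went into CVS.

```python
import re

# Pattern exactly as it appears in the error message: one ")" too many.
broken = r"^(Hitachi )?HDS724040KL(AT|SA)80)$"

# Hypothetical balanced version; the real CVS fix may differ.
candidate = r"^(Hitachi )?HDS724040KL(AT|SA)80$"


def compiles(pattern):
    """Return True if the pattern compiles, False if the regex engine rejects it."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


print("broken compiles:   ", compiles(broken))
print("candidate compiles:", compiles(candidate))
print("matches model:     ", bool(re.match(candidate, "Hitachi HDS724040KLSA80")))
```

The broken pattern opens two groups but closes three, which is what the "parentheses not balanced" message is complaining about.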

