Joachim Hoessler | 1 Jan 2006 14:12

read failure at unknown block

Hi all,

I've got a fairly new Samsung drive on which the smartmontools self-tests 
report an error but won't tell me which block is actually bad. Also, no 
error is logged. Could someone help me interpret these results? 
Should I get this drive replaced?

Thanks & a Happy New Year!

Joachim

smartctl version 5.33 [i586-pc-linux-gnu] Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     SAMSUNG SP2014N
Serial Number:    S088J10Y518023
Firmware Version: VC100-33
User Capacity:    200,049,647,616 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   7
ATA Standard is:  ATA/ATAPI-7 T13 1532D revision 4a
Local Time is:    Sun Jan  1 14:03:46 2006 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:

Hayashi Naoyuki | 2 Jan 2006 22:58

Porting smartmontools to Tru64 UNIX

Hi,

This patch has been tested on Tru64 5.1B
and supports only SCSI disks.

To build with Compaq C:
./autogen.sh
CC="cc -nodtk" \
        ./configure

To build with gcc (tested with v3.4.4):
./autogen.sh
./configure

Known bugs:
"mode sense 19h" does not work correctly.
When smartctl reports "Transport protocol", it either produces the following message:
scsiModePageOffset: response length too short, resp_len=1 offset=4 bd_len=0
or reports the wrong protocol, such as "Fibre channel (FCP-2)".

Example output:
  Seagate ST336706LW
  MAXTOR ATLAS10K3

-- 
Hayashi Naoyuki <titan <at> culzean.org>
Key fingerprint = 60A0 D5D3 F58B 3633 2E52  0147 D17F 5578 3FDF F5B6
Attachment (smartmontools-5.33-osf1.patch.gz): application/gzip, 5684 bytes

jp.pozzi@izzop.net | 4 Jan 2006 20:24

Problem with smartd

Hello,

smartd is running on my system and has been warning me for three days:

============================================================================
Jan  3 20:11:34 k2000 smartd[6816]: Device: /dev/sdb, SMART Failure:
LOGICAL UNIT FAILURE PREDICTION THRESHOLD EXCEEDED
Jan  3 20:41:34 k2000 smartd[6816]: Device: /dev/sdb, SMART Failure:
LOGICAL UNIT FAILURE PREDICTION THRESHOLD EXCEEDED
============================================================================

The disk is new (less than 6 months old), and smartctl -a /dev/sdb gives:

===============================================================
Device: MAXTOR   ATLAS15K2_36WLS  Version: JNZH
Serial number: E20CEE2K
Device type: disk
Transport protocol: Parallel SCSI (SPI-4)
Local Time is: Wed Jan  4 20:20:57 2006 CET
Device supports SMART and is Enabled
Temperature Warning Enabled
SMART Health Status: LOGICAL UNIT FAILURE PREDICTION THRESHOLD EXCEEDED
[asc=5d, ascq=2]

Current Drive Temperature:     37 C
Manufactured in week 00 of year
Current start stop count:      1074003968 times
Recommended maximum start stop count:  1124401151 times
Elements in grown defect list: 3


Bruce Allen | 4 Jan 2006 20:35

Re: Problem with smartd

Could you please try the latest CVS version of smartmontools and see if 
you still get these warnings?

Bruce

On Wed, 4 Jan 2006, jp.pozzi <at> izzop.net wrote:

> Hello,
>
> Smartd works on my system and is warning me since three days :
>
> ============================================================================
> Jan  3 20:11:34 k2000 smartd[6816]: Device: /dev/sdb, SMART Failure:
> LOGICAL UNIT FAILURE PREDICTION THRESHOLD EXCEEDED
> Jan  3 20:41:34 k2000 smartd[6816]: Device: /dev/sdb, SMART Failure:
> LOGICAL UNIT FAILURE PREDICTION THRESHOLD EXCEEDED
> ============================================================================
>
> The disk is new (less than 6 months) and a smartctl -a /dev/sdb gives :
>
> ===============================================================
> Device: MAXTOR   ATLAS15K2_36WLS  Version: JNZH
> Serial number: E20CEE2K
> Device type: disk
> Transport protocol: Parallel SCSI (SPI-4)
> Local Time is: Wed Jan  4 20:20:57 2006 CET
> Device supports SMART and is Enabled
> Temperature Warning Enabled
> SMART Health Status: LOGICAL UNIT FAILURE PREDICTION THRESHOLD EXCEEDED
> [asc=5d, ascq=2]

Jeremy Friesner | 5 Jan 2006 18:53

(no subject)

Hi all,

I'm confused about "captive mode" in smartctl. Specifically, how does the command

   smartctl -C -t long /dev/sda

differ from the command

   smartctl -t long /dev/sda

?  The docs say "don't run smartctl in captive mode while partitions are mounted on the disk", but they don't
say why that is a bad idea or what the consequences of doing so would be.  Would doing that corrupt the disk? 
Skew the results?  Cause problems for other programs?  etc.

Thanks,
Jeremy

-------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc. Do you grep through log files
for problems?  Stop!  Download the new AJAX search engine that makes
searching your log files as easy as surfing the  web.  DOWNLOAD SPLUNK!
http://ads.osdn.com/?ad_id=7637&alloc_id=16865&op=click

Erik Inge Bolsø | 6 Jan 2006 15:20

Re: captive mode

On Thu, 5 Jan 2006, Jeremy Friesner wrote:
> Hi all,
> 
> I'm confused about "captive mode" in smartctl.   i.e. How does the command
> 
>    smartctl -C -t long /dev/sda
> 
> differ from the command
> 
>    smartctl -t long /dev/sda
> 
> ?  The docs say "don't run smartctl in captive mode while partitions are 
> mounted on the disk", but they don't say why that is a bad idea or what 
> the consequences of doing so would be.  Would doing that corrupt the 
> disk? Skew the results?  Cause problems for other programs?  etc.

It keeps the disk occupied and unresponsive to other commands until the 
test is finished. You cannot access anything on it. So for a typical 
long test, it will be totally unresponsive for, say, two hours.

-- 
Erik I. Bolsø

Leandro Santi | 6 Jan 2006 19:49

Patch: fix typo in scsiprint.c.

Hi,

This one-liner makes smartctl detect SCSI IE errors properly on
some of my SCSI disks, using smartctl's exit(2) status bits:

Index: scsiprint.c
===================================================================
RCS file: /cvsroot/smartmontools/sm5/scsiprint.c,v
retrieving revision 1.100
diff -u -r1.100 scsiprint.c
--- scsiprint.c	7 Dec 2005 02:53:52 -0000	1.100
+++ scsiprint.c	6 Jan 2006 18:33:57 -0000
@@ -1079,7 +1079,7 @@
             else
                 pout("TapeAlert Not Supported\n");
         } else { /* disk, cd/dvd, enclosure, etc */
-            if (res == scsiGetSmartData(fd, con->smartvendorattrib)) {
+            if ((res = scsiGetSmartData(fd, con->smartvendorattrib))) {
                 if (-2 == res)
                     returnval |= FAILSTATUS;
                 else

I also backported the whole IE and request sense-related fix 
to smartmontools-5.33 (patch attached).

Leandro.
smartmontools-5.33-check_ie.patch


jp.pozzi@izzop.net | 6 Jan 2006 20:26

Re: Problem with smartd

On Wednesday, 4 January 2006 at 13:35 -0600, Bruce Allen wrote:
> Could you please try the latest CVS version of smartmontools and see if 
> you still get these warnings?
> 
> Bruce
> 
> On Wed, 4 Jan 2006, jp.pozzi <at> izzop.net wrote:
> 
> > Hello,
> >
> > Smartd works on my system and is warning me since three days :
> >
> > ============================================================================
> > Jan  3 20:11:34 k2000 smartd[6816]: Device: /dev/sdb, SMART Failure:
> > LOGICAL UNIT FAILURE PREDICTION THRESHOLD EXCEEDED
> > Jan  3 20:41:34 k2000 smartd[6816]: Device: /dev/sdb, SMART Failure:
> > LOGICAL UNIT FAILURE PREDICTION THRESHOLD EXCEEDED

Hello,

I got the CVS version, but the results are similar. I enclose the output
of "./smartctl -a /dev/sdx":
for /dev/sda --> "sda.txt", for /dev/sdb --> "sbd.txt".

The two disks, sda and sdb, are the same model, were bought in 2005/07, and
are used in the same machine.

Regards

JP Pozzi

Chad Simmons | 7 Jan 2006 00:19

SATA Drives not in DB WDC WD740GD-00FLA2 + Maxtor 6L120M0

smartctl -a --device=ata /dev/sda
smartctl version 5.33 [x86_64-pc-linux-gnu] Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD740GD-00FLA2
Serial Number:    WD-WMAKE1989657
Firmware Version: 31.08F31
User Capacity:    74,355,769,344 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   6
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Fri Jan  6 18:13:48 2006 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84) Offline data collection activity
                                        was suspended by an interrupting command from host.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (1719) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        No General Purpose Logging support.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  30) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0007   131   118   021    Pre-fail  Always       -       3983
  4 Start_Stop_Count        0x0032   100   100   040    Old_age   Always       -       77
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   253   051    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   094   094   000    Old_age   Always       -       4953
 10 Spin_Retry_Count        0x0013   100   253   051    Pre-fail  Always       -       0
 11 Calibration_Retry_Count 0x0013   100   253   051    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       72
194 Temperature_Celsius     0x0022   109   099   000    Old_age   Always       -       41
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0012   200   200   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x000a   200   253   000    Old_age   Always       -       1
200 Multi_Zone_Error_Rate   0x0009   200   179   051    Pre-fail  Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%       569         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

smartctl -a --device=ata /dev/sdb
smartctl version 5.33 [x86_64-pc-linux-gnu] Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     Maxtor 6L120M0
Serial Number:    L3D0KY2H
Firmware Version: BANC1E00
User Capacity:    122,942,324,736 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   7
ATA Standard is:  ATA/ATAPI-7 T13 1532D revision 0
Local Time is:    Fri Jan  6 18:15:05 2006 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

Warning! SMART Attribute Thresholds Structure error: invalid SMART checksum.
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x80) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (1202) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  54) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  3 Spin_Up_Time            0x0027   216   215   063    Pre-fail  Always       -       3839
  4 Start_Stop_Count        0x0032   253   253   000    Old_age   Always       -       25
  5 Reallocated_Sector_Ct   0x0033   253   253   063    Pre-fail  Always       -       0
  6 Read_Channel_Margin     0x0001   253   253   100    Pre-fail  Offline      -       0
  7 Seek_Error_Rate         0x000a   253   252   000    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0027   247   240   187    Pre-fail  Always       -       62352
  9 Power_On_Hours          0x0032   241   241   000    Old_age   Always       -       994
 10 Spin_Retry_Count        0x002b   253   252   157    Pre-fail  Always       -       0
 11 Calibration_Retry_Count 0x002b   253   252   223    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   253   253   000    Old_age   Always       -       38
192 Power-Off_Retract_Count 0x0032   253   253   000    Old_age   Always       -       0
193 Load_Cycle_Count        0x0032   253   253   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0032   041   253   000    Old_age   Always       -       39
195 Hardware_ECC_Recovered  0x000a   253   252   000    Old_age   Always       -       6524
196 Reallocated_Event_Count 0x0008   253   253   000    Old_age   Offline      -       0
197 Current_Pending_Sector  0x0008   253   253   000    Old_age   Offline      -       0
198 Offline_Uncorrectable   0x0008   253   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0008   199   199   000    Old_age   Offline      -       0
200 Multi_Zone_Error_Rate   0x000a   253   252   000    Old_age   Always       -       0
201 Soft_Read_Error_Rate    0x000a   253   252   000    Old_age   Always       -       31
202 TA_Increase_Count       0x000a   253   252   000    Old_age   Always       -       0
203 Run_Out_Cancel          0x000b   253   252   180    Pre-fail  Always       -       15
204 Shock_Count_Write_Opern 0x000a   253   252   000    Old_age   Always       -       0
205 Shock_Rate_Write_Opern  0x000a   253   252   000    Old_age   Always       -       0
207 Spin_High_Current       0x002a   253   252   000    Old_age   Always       -       0
208 Spin_Buzz               0x002a   253   252   000    Old_age   Always       -       0
209 Offline_Seek_Performnce 0x0024   253   253   000    Old_age   Offline      -       0
210 Unknown_Attribute       0x0032   253   252   000    Old_age   Always       -       0
211 Unknown_Attribute       0x0032   253   252   000    Old_age   Always       -       0
212 Unknown_Attribute       0x0032   253   252   000    Old_age   Always       -       0

Warning! SMART ATA Error Log Structure error: invalid SMART checksum.
SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      4369         -
# 2  Offline             Aborted by host               70%         0         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

Gabriele Pohl | 8 Jan 2006 03:00

captive mode - was: Re: (no subject)

Hi Jeremy,

"Jeremy Friesner" wrote on 06.01.06 07:31:16:
> ? The docs say "don't run smartctl in captive mode while
> partitions are mounted on the disk", but they don't say
> why that is a bad idea or what the consequences of doing
> so would be.
> Would doing that corrupt the disk? Skew the results?
> Cause problems for other programs? etc.

I haven't tried it, but I suppose you will get an error
message if you start the test in captive mode on a
mounted device.

kind regards,

Gabriele
