John Theodore | 20 Aug 03:35 2014

smartd.conf advice, how often to run scans, what options?

I finally got smartd working, and the puppet-smartd module works now too. I've realized how useful DEVICESCAN is.

How often do you all schedule short and long self-tests?

Currently my smartd.conf looks like this:

DEVICESCAN -S on -o on -a -I 194 -s (S/../.././02|L/../../6/03)

Do I need additional options to make it log the errors it finds to syslog, or does DEVICESCAN do that automatically? All these disks are behind a MegaRAID controller; will the queries work properly through DEVICESCAN?
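
For reference, the -s directive in the line above is a regular expression that smartd matches against a string of the form T/MM/DD/d/HH (test type, month, day of month, day of week with 1 = Monday, hour). The sketch below is only an illustration of how that expression reads; it uses Python's re module rather than smartd's own POSIX extended regex matching, and the function name due_tests is made up for this example.

```python
import re
from datetime import datetime

# The -s expression from the smartd.conf line above.
SCHEDULE = r"(S/../.././02|L/../../6/03)"


def due_tests(when: datetime, schedule: str = SCHEDULE) -> list[str]:
    """Return the test types whose T/MM/DD/d/HH string matches the schedule.

    Test types: L = long, S = short, C = conveyance, O = offline immediate.
    """
    due = []
    for test_type in "LSCO":
        # Build the 12-character string, e.g. "S/08/20/3/02".
        stamp = f"{test_type}/{when:%m}/{when:%d}/{when.isoweekday()}/{when:%H}"
        if re.fullmatch(schedule, stamp):
            due.append(test_type)
    return due


if __name__ == "__main__":
    print(due_tests(datetime(2014, 8, 20, 2)))  # a Wednesday, 02:00
    print(due_tests(datetime(2014, 8, 23, 3)))  # a Saturday, 03:00
```

Read this way, the expression asks for a short test every day in the 02:00 hour and a long test on Saturdays in the 03:00 hour.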

------------------------------------------------------------------------------
_______________________________________________
Smartmontools-support mailing list
Smartmontools-support <at> lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/smartmontools-support
Christian Franke | 18 Aug 21:05 2014

Re: smartctl output for Crucial 512Gb SSD

Mike wrote:
> Great, thanks. I've applied the patch and run the same command. It
> initially took around 30 seconds to return any data at all, then
> hung for about 1 minute after printing the GPL table.

This is because, with the patch, these large logs are completely read 
into memory. Smartctl should only read sectors needed but this requires 
some rework.

> ...
> Self-test execution status:      (  80) The previous self-test completed having
> the electrical element of the test failed.

This suggests a serious problem with the drive.
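
As background on the "(  80)" value quoted above: the self-test execution status is a single byte whose high nibble is a status code and whose low nibble is the remaining percentage in steps of ten. The following is a minimal sketch of that decoding, not code taken from smartctl; the function name and the shortened descriptions are illustrative only.

```python
# Status codes carried in the high nibble of the self-test execution status byte.
SELF_TEST_STATUS = {
    0x0: "completed without error, or no self-test has ever been run",
    0x1: "aborted by the host",
    0x2: "interrupted by the host with a hard or soft reset",
    0x3: "fatal error or unknown test error; test could not complete",
    0x4: "completed with a failed test element (element unknown)",
    0x5: "completed with the electrical element failed",
    0x6: "completed with the servo/seek element failed",
    0x7: "completed with the read element failed",
    0x8: "completed with a failed element suspected of handling damage",
    0xF: "self-test in progress",
}


def decode_self_test_status(value: int) -> tuple[int, int, str]:
    """Split the status byte into (code, percent remaining, description)."""
    code = (value >> 4) & 0xF
    remaining_percent = (value & 0xF) * 10
    return code, remaining_percent, SELF_TEST_STATUS.get(code, "reserved")


if __name__ == "__main__":
    # 80 decimal is 0x50: status code 5, nothing remaining.
    print(decode_self_test_status(80))
```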

> ...
>
> SMART Extended Comprehensive Error Log Version: 1 (16383 sectors)
> No Errors Logged
>
> SMART Extended Self-test Log Version: 1 (3449 sectors)

smartctl should print one entry here but doesn't. There is either a bug 
in smartctl or the drive does not fill the self-test log as specified by 
the standard.

Please provide output of "smartctl -l gplog,0x07,0-1 /dev/sdb" as 
attachment(!).

Connecting to another SATA controller with better pass-through support 
may help to read the old Self-test Log (don't use -x, use -a or -l 
selftest).

Thanks,
Christian

------------------------------------------------------------------------------
Mike | 16 Aug 23:47 2014

Fwd: smartctl output for Crucial 512Gb SSD

Great, thanks. I've applied the patch and run the same command. It
initially took around 30 seconds to return any data at all, then
hung for about 1 minute after printing the GPL table.  Here is the
output:

user <at> w530:~/Downloads/smartmontools-6.3$ sudo ./smartctl -x /dev/sdb
smartctl 6.3 2014-07-26 r3976 [x86_64-linux-3.13.0-34-generic] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Crucial/Micron RealSSD m4/C400/P400
Device Model:     M4-CT512M4SSD1
Serial Number:    000000001249091FCC21
LU WWN Device Id: 5 00a075 1091fcc21
Firmware Version: 070H
User Capacity:    512,110,190,592 bytes [512 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 6
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 1.5 Gb/s)
Local Time is:    Sat Aug 16 21:43:33 2014 BST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM level is:     254 (maximum performance)
Rd look-ahead is: Enabled
Write cache is:   Enabled
ATA Security is:  Disabled, NOT FROZEN [SEC1]
Write SCT (Get) Feature Control Command failed: scsi error medium or hardware error (serious)
Wt Cache Reorder: Unknown (SCT Feature Control command failed)

=== START OF READ SMART DATA SECTION ===
SMART Status command failed: scsi error medium or hardware error (serious)
SMART overall-health self-assessment test result: PASSED
Warning: This result is based on an Attribute check.

General SMART Values:
Offline data collection status:  (0x80) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 ( 2380) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  39) minutes.
Conveyance self-test routine
recommended polling time:        (   3) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   100   100   050    -    0
  5 Reallocated_Sector_Ct   PO--CK   100   100   010    -    0
  9 Power_On_Hours          -O--CK   100   100   001    -    4439
 12 Power_Cycle_Count       -O--CK   100   100   001    -    515
170 Grown_Failing_Block_Ct  PO--CK   100   100   010    -    0
171 Program_Fail_Count      -O--CK   100   100   001    -    0
172 Erase_Fail_Count        -O--CK   100   100   001    -    0
173 Wear_Leveling_Count     PO--CK   100   100   010    -    0
174 Unexpect_Power_Loss_Ct  -O--CK   100   100   001    -    398
181 Non4k_Aligned_Access    -O---K   100   100   001    -    144 45 99
183 SATA_Iface_Downshift    -O--CK   100   100   001    -    0
184 End-to-End_Error        PO--CK   100   100   050    -    0
187 Reported_Uncorrect      -O--CK   100   100   001    -    0
188 Command_Timeout         -O--CK   100   100   001    -    0
189 Factory_Bad_Block_Ct    -OSR--   100   100   001    -    215
194 Temperature_Celsius     -O---K   100   100   000    -    0
195 Hardware_ECC_Recovered  -O-RCK   100   100   001    -    0
196 Reallocated_Event_Count -O--CK   100   100   001    -    0
197 Current_Pending_Sector  -O--CK   100   100   001    -    0
198 Offline_Uncorrectable   ----CK   100   100   001    -    0
199 UDMA_CRC_Error_Count    -O--CK   100   100   001    -    0
202 Perc_Rated_Life_Used    ---RC-   100   100   001    -    0
206 Write_Error_Rate        -OSR--   100   100   001    -    0
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

Read SMART Log Directory failed: scsi error medium or hardware error (serious)

General Purpose Log Directory Version 1
Address    Access  R/W   Size  Description
0x00       GPL     R/O      1  Log Directory
0x03       GPL     R/O  16383  Ext. Comprehensive SMART error log
0x04       GPL     R/O    255  Device Statistics log
0x07       GPL     R/O   3449  Extended self-test log
0x10       GPL     R/O      1  NCQ Command Error log
0x11       GPL     R/O      1  SATA Phy Event Counters
0x80-0x9f  GPL     R/W     16  Host vendor specific log
0xa0       GPL     VS    2000  Device vendor specific log
0xa1-0xbf  GPL     VS       1  Device vendor specific log
0xc0       GPL     VS      80  Device vendor specific log
0xc1-0xdf  GPL     VS       1  Device vendor specific log
0xe0       GPL     R/W      1  SCT Command/Status
0xe1       GPL     R/W      1  SCT Data Transfer

SMART Extended Comprehensive Error Log Version: 1 (16383 sectors)
No Errors Logged

SMART Extended Self-test Log Version: 1 (3449 sectors)
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

Read SMART Selective Self-test Log failed: scsi error medium or hardware error (serious)

SCT Status Version:                  3
SCT Version (vendor specific):       1 (0x0001)
SCT Support Level:                   0
Device State:                        Active (0)
Current Temperature:                  0 Celsius
Power Cycle Max Temperature:          0 Celsius
Lifetime    Max Temperature:          0 Celsius

SCT Temperature History Version:     2
Temperature Sampling Period:         10 minutes
Temperature Logging Interval:        10 minutes
Min/Max recommended Temperature:      0/70 Celsius
Min/Max Temperature Limit:           -5/75 Celsius
Temperature History Size (Index):    478 (37)

Index    Estimated Time   Temperature Celsius
  38    2014-08-13 14:10     ?  -
 ...    ..(476 skipped).    ..  -
  37    2014-08-16 21:40     ?  -

Write SCT (Get) Error Recovery Control Command failed: scsi error medium or hardware error (serious)
SCT (Get) Error Recovery Control command failed

Device Statistics (GP Log 0x04)
Page Offset Size         Value  Description
  1  =====  =                =  == General Statistics (rev 2) ==
  1  0x008  4              515  Lifetime Power-On Resets
  1  0x010  4             4439  Power-on Hours
  1  0x018  6        933634265  Logical Sectors Written
  1  0x020  6          2708986  Number of Write Commands
  1  0x028  6        852431906  Logical Sectors Read
  1  0x030  6          6049196  Number of Read Commands
  4  =====  =                =  == General Errors Statistics (rev 1) ==
  4  0x008  4                0  Number of Reported Uncorrectable Errors
  4  0x010  4                0  Resets Between Cmd Acceptance and Completion
  5  =====  =                =  == Temperature Statistics (rev 1) ==
  5  0x008  1                0  Current Temperature
  5  0x010  1                0  Average Short Term Temperature
  5  0x018  1                0  Average Long Term Temperature
  5  0x020  1                0  Highest Temperature
  5  0x028  1                0  Lowest Temperature
  5  0x030  1                0  Highest Average Short Term Temperature
  5  0x038  1                0  Lowest Average Short Term Temperature
  5  0x040  1                0  Highest Average Long Term Temperature
  5  0x048  1                0  Lowest Average Long Term Temperature
  5  0x050  4                -  Time in Over-Temperature
  5  0x058  1               70  Specified Maximum Operating Temperature
  5  0x060  4                -  Time in Under-Temperature
  5  0x068  1                0  Specified Minimum Operating Temperature
  6  =====  =                =  == Transport Statistics (rev 1) ==
  6  0x008  4                0  Number of Hardware Resets
  6  0x010  4                0  Number of ASR Events
  6  0x018  4                0  Number of Interface CRC Errors
  7  =====  =                =  == Solid State Device Statistics (rev 1) ==
  7  0x008  1               21~ Percentage Used Endurance Indicator
                              |_ ~ normalized value

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0001  4            0  Command failed due to ICRC error
0x000a  4            0  Device-to-host register FISes sent due to a COMRESET

I then ran a -t short and a --test=long, and then issued the same
command as above:

user <at> w530:~/Downloads/smartmontools-6.3$ sudo ./smartctl -x /dev/sdb
[sudo] password for user:
smartctl 6.3 2014-07-26 r3976 [x86_64-linux-3.13.0-34-generic] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Crucial/Micron RealSSD m4/C400/P400
Device Model:     M4-CT512M4SSD1
Serial Number:    000000001249091FCC21
LU WWN Device Id: 5 00a075 1091fcc21
Firmware Version: 070H
User Capacity:    512,110,190,592 bytes [512 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 6
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 1.5 Gb/s)
Local Time is:    Sat Aug 16 22:44:54 2014 BST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM level is:     254 (maximum performance)
Rd look-ahead is: Enabled
Write cache is:   Enabled
ATA Security is:  Disabled, NOT FROZEN [SEC1]
Write SCT (Get) Feature Control Command failed: scsi error medium or hardware error (serious)
Wt Cache Reorder: Unknown (SCT Feature Control command failed)

=== START OF READ SMART DATA SECTION ===
SMART Status command failed: scsi error medium or hardware error (serious)
SMART overall-health self-assessment test result: PASSED
Warning: This result is based on an Attribute check.

General SMART Values:
Offline data collection status:  (0x80) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (  80) The previous self-test completed having
                                        the electrical element of the test
                                        failed.
Total time to complete Offline
data collection:                 ( 2380) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  39) minutes.
Conveyance self-test routine
recommended polling time:        (   3) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   100   100   050    -    0
  5 Reallocated_Sector_Ct   PO--CK   100   100   010    -    0
  9 Power_On_Hours          -O--CK   100   100   001    -    4440
 12 Power_Cycle_Count       -O--CK   100   100   001    -    515
170 Grown_Failing_Block_Ct  PO--CK   100   100   010    -    0
171 Program_Fail_Count      -O--CK   100   100   001    -    0
172 Erase_Fail_Count        -O--CK   100   100   001    -    0
173 Wear_Leveling_Count     PO--CK   100   100   010    -    0
174 Unexpect_Power_Loss_Ct  -O--CK   100   100   001    -    398
181 Non4k_Aligned_Access    -O---K   100   100   001    -    144 45 99
183 SATA_Iface_Downshift    -O--CK   100   100   001    -    0
184 End-to-End_Error        PO--CK   100   100   050    -    0
187 Reported_Uncorrect      -O--CK   100   100   001    -    0
188 Command_Timeout         -O--CK   100   100   001    -    0
189 Factory_Bad_Block_Ct    -OSR--   100   100   001    -    215
194 Temperature_Celsius     -O---K   100   100   000    -    0
195 Hardware_ECC_Recovered  -O-RCK   100   100   001    -    0
196 Reallocated_Event_Count -O--CK   100   100   001    -    0
197 Current_Pending_Sector  -O--CK   100   100   001    -    0
198 Offline_Uncorrectable   ----CK   100   100   001    -    0
199 UDMA_CRC_Error_Count    -O--CK   100   100   001    -    0
202 Perc_Rated_Life_Used    ---RC-   100   100   001    -    0
206 Write_Error_Rate        -OSR--   100   100   001    -    0
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

Read SMART Log Directory failed: scsi error medium or hardware error (serious)

General Purpose Log Directory Version 1
Address    Access  R/W   Size  Description
0x00       GPL     R/O      1  Log Directory
0x03       GPL     R/O  16383  Ext. Comprehensive SMART error log
0x04       GPL     R/O    255  Device Statistics log
0x07       GPL     R/O   3449  Extended self-test log
0x10       GPL     R/O      1  NCQ Command Error log
0x11       GPL     R/O      1  SATA Phy Event Counters
0x80-0x9f  GPL     R/W     16  Host vendor specific log
0xa0       GPL     VS    2000  Device vendor specific log
0xa1-0xbf  GPL     VS       1  Device vendor specific log
0xc0       GPL     VS      80  Device vendor specific log
0xc1-0xdf  GPL     VS       1  Device vendor specific log
0xe0       GPL     R/W      1  SCT Command/Status
0xe1       GPL     R/W      1  SCT Data Transfer

SMART Extended Comprehensive Error Log Version: 1 (16383 sectors)
No Errors Logged

SMART Extended Self-test Log Version: 1 (3449 sectors)

Read SMART Selective Self-test Log failed: scsi error medium or hardware error (serious)

SCT Status Version:                  3
SCT Version (vendor specific):       1 (0x0001)
SCT Support Level:                   0
Device State:                        Stand-by (1)
Current Temperature:                  0 Celsius
Power Cycle Max Temperature:          0 Celsius
Lifetime    Max Temperature:          0 Celsius

SCT Temperature History Version:     2
Temperature Sampling Period:         10 minutes
Temperature Logging Interval:        10 minutes
Min/Max recommended Temperature:      0/70 Celsius
Min/Max Temperature Limit:           -5/75 Celsius
Temperature History Size (Index):    478 (40)

Index    Estimated Time   Temperature Celsius
  41    2014-08-13 15:10     ?  -
 ...    ..(473 skipped).    ..  -
  37    2014-08-16 22:10     ?  -
  38    2014-08-16 22:20     0  -
  39    2014-08-16 22:30     0  -
  40    2014-08-16 22:40     0  -

Write SCT (Get) Error Recovery Control Command failed: scsi error medium or hardware error (serious)
SCT (Get) Error Recovery Control command failed

Device Statistics (GP Log 0x04)
Page Offset Size         Value  Description
  1  =====  =                =  == General Statistics (rev 2) ==
  1  0x008  4              515  Lifetime Power-On Resets
  1  0x010  4             4440  Power-on Hours
  1  0x018  6        933634265  Logical Sectors Written
  1  0x020  6          2708986  Number of Write Commands
  1  0x028  6        852431906  Logical Sectors Read
  1  0x030  6          6049196  Number of Read Commands
  4  =====  =                =  == General Errors Statistics (rev 1) ==
  4  0x008  4                0  Number of Reported Uncorrectable Errors
  4  0x010  4                0  Resets Between Cmd Acceptance and Completion
  5  =====  =                =  == Temperature Statistics (rev 1) ==
  5  0x008  1                0  Current Temperature
  5  0x010  1                0  Average Short Term Temperature
  5  0x018  1                0  Average Long Term Temperature
  5  0x020  1                0  Highest Temperature
  5  0x028  1                0  Lowest Temperature
  5  0x030  1                0  Highest Average Short Term Temperature
  5  0x038  1                0  Lowest Average Short Term Temperature
  5  0x040  1                0  Highest Average Long Term Temperature
  5  0x048  1                0  Lowest Average Long Term Temperature
  5  0x050  4                -  Time in Over-Temperature
  5  0x058  1               70  Specified Maximum Operating Temperature
  5  0x060  4                -  Time in Under-Temperature
  5  0x068  1                0  Specified Minimum Operating Temperature
  6  =====  =                =  == Transport Statistics (rev 1) ==
  6  0x008  4                0  Number of Hardware Resets
  6  0x010  4                0  Number of ASR Events
  6  0x018  4                0  Number of Interface CRC Errors
  7  =====  =                =  == Solid State Device Statistics (rev 1) ==
  7  0x008  1               21~ Percentage Used Endurance Indicator
                              |_ ~ normalized value

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0001  4            0  Command failed due to ICRC error
0x000a  4            0  Device-to-host register FISes sent due to a COMRESET

On 16 August 2014 20:06, Christian Franke <Christian.Franke <at> t-online.de> wrote:
> Mike wrote:
>>
>> Greetings
>>
>> Following on from my previous message, where running smartctl with the
>> disk connected via SATA reported a 600 PB disk (it is actually a 512 GB
>> Crucial M4), I followed advice from the Crucial message boards and
>> left the disk idle overnight in order to trigger automatic garbage
>> collection.  There was no change the next morning; below is some smartctl
>> output.  I've tried ddrescue to no avail; reads just fail completely.
>> ...
>>
>> smartctl 6.3 2014-07-26 r3976 [x86_64-linux-3.13.0-34-generic] (local
>> build)
>> Copyright (C) 2002-14, Bruce Allen, Christian Franke,
>> www.smartmontools.org
>>
>> === START OF INFORMATION SECTION ===
>> Model Family:     Crucial/Micron RealSSD m4/C400/P400
>> Device Model:     M4-CT512M4SSD1
>> ...
>>
>> Read SMART Log Directory failed: scsi error medium or hardware error
>> (serious)
>>
>
> This error message is probably due to a limitation/bug in the driver/firmware
> of the controller used (which one?). The device itself should support SMART
> Log Directory.
>
>
>> General Purpose Log Directory Version 1
>> Address    Access  R/W   Size  Description
>> 0x00       GPL     R/O      1  Log Directory
>> 0x03       GPL     R/O  16383  Ext. Comprehensive SMART error log
>> 0x04       GPL     R/O    255  Device Statistics log
>> 0x07       GPL     R/O   3449  Extended self-test log
>> ..
>>
>> SMART Extended Comprehensive Error Log size 16383 not supported
>>
>> Read SMART Error Log failed: scsi error medium or hardware error (serious)
>>
>> SMART Extended Self-test Log size 3449 not supported
>>
>> Read SMART Self-test Log failed: scsi error medium or hardware error
>> (serious)
>
>
> The extended error/self-test logs may contain useful info. Unfortunately
> smartctl does not print the extended logs due to a historic size limitation.
> In the early days of these logs, sizes were <= 8.
>
> If possible, apply the attached patch and check whether the logs can be read
> then.
>
> Thanks,
> Christian
>

------------------------------------------------------------------------------
Anthony Nemmer | 16 Aug 08:51 2014

invalid SMART checksum

Greetings!

When I run smartctl -a, I get the error message "invalid SMART checksum". I googled it, but I can't find anything that explains why I am getting this error.

Help?

Thank You,
Anthony Nemmer

--
I always have coffee when I watch radar!
------------------------------------------------------------------------------
George R Goffe | 16 Aug 07:59 2014

2TB Seagate gets msg "SMART Status command failed"

Hi,

I just installed a brand new Seagate 2TB HDD:

Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST2000DM001-1CH164

When I execute this command (/usr/sbin/smartctl -a -d sat /dev/sdc), I get the following:

=== START OF READ SMART DATA SECTION ===
SMART Status command failed
Please get assistance from http://smartmontools.sourceforge.net/
Register values returned from SMART Status command are:
 ERR=0x00, SC=0x00, LL=0xb7, LM=0x2d, LH=0x04, DEV=0x40, STS=0x50
SMART overall-health self-assessment test result: PASSED
Warning: This result is based on an Attribute check.

Am I doing something wrong? I don't know how to handle this situation. The HDD is behind a SIIG USB docking
station. The OS is Fedora 19 x86_64, and smartmontools is at this level: smartmontools-6.2-5.fc19.x86_64
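
For anyone reading the register dump above: the ATA SMART RETURN STATUS command signals its result through the LBA mid/high registers, with LM=0x4f, LH=0xc2 meaning no threshold was exceeded and LM=0xf4, LH=0x2c meaning a threshold was exceeded. The sketch below only illustrates that check (the function names are invented for this example, and it is not smartctl's implementation); the values returned through the dock match neither pattern.

```python
import re


def parse_registers(line: str) -> dict[str, int]:
    """Parse a 'NAME=0xNN, ...' register dump into a dict of integers."""
    return {name: int(value, 16)
            for name, value in re.findall(r"(\w+)=0x([0-9a-fA-F]+)", line)}


def classify_smart_status(lm: int, lh: int) -> str:
    """Interpret the LBA mid/high registers of a SMART RETURN STATUS reply."""
    if (lm, lh) == (0x4F, 0xC2):
        return "passed"
    if (lm, lh) == (0xF4, 0x2C):
        return "threshold exceeded"
    return "invalid"  # neither signature: the reply cannot be interpreted


if __name__ == "__main__":
    regs = parse_registers(
        "ERR=0x00, SC=0x00, LL=0xb7, LM=0x2d, LH=0x04, DEV=0x40, STS=0x50")
    print(classify_smart_status(regs["LM"], regs["LH"]))
```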

Any/all hints/tips/suggestions would be appreciated.

Regards,

George...

------------------------------------------------------------------------------
Martijn | 13 Aug 21:28 2014

[bug] Wrong data from smartctl if disk is 'readonly' / in use

Ladies, gents,

I unfortunately was unable to post this to Trac with my 
Sourceforge-account as suggested on smartmontools.org, so that is why 
I'm posting it here:

I've found that smartmontools 6.3 outputs wrong data about a disk, if 
the disk is in use by another process. This includes wrong data about 
the serial number of the drive, which surprised me (after it scared me 
;-)). I've tested this on a Windows host.

Attached are two files with the output of smartctl.exe --xall of the 
very same disk, once when it is in use by a VirtualBox virtual machine, 
and once without.

There are three important things to notice:
1. "Serial Number: 20202020202020203431"
I'm not sure what causes smartmontools to think this particular string 
is the serial number, but this is not the actual serial number ;-)
It scared me at first, because I thought this serial number was caused 
by some sort of problem with the firmware.

2. "Warning: Limited functionality due to missing admin rights" and 
"Function requires admin rights"

This is not accurate: I did have admin rights for that particular shell. 
Both attached files were created in the same shell, with the same 
rights. I would highly recommend a better description of possible 
causes, or separate error messages for separate cases.

3. "Device Statistics (GP Log 0x04) not supported" (for example)
Similar problem as #2.

Regarding #2 and #3: I can understand it's difficult to tell the 
difference between a disk that's in use, not having enough 
permissions, and a disk that doesn't have certain features. I would 
recommend changing some of the messages to reflect that more could be 
going on than simply "no admin rights".

#1 seems like a fun bug that's worth debugging ;-)
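
A possible starting point for #1, offered as a guess rather than a confirmed diagnosis: the odd serial number looks like hex-encoded ASCII. Decoding it and swapping each pair of bytes (ATA IDENTIFY strings store two characters per 16-bit word, swapped) gives eight spaces followed by "14", which is how the real serial number 14100C07B9DF would begin if padded to 20 characters. The helper name below is made up for this sketch.

```python
def decode_hex_serial(hex_serial: str) -> str:
    """Decode a hex string to ASCII, swapping the two bytes of each pair.

    A trailing unpaired byte, if any, is ignored.
    """
    raw = bytes.fromhex(hex_serial)
    swapped = bytearray()
    for i in range(0, len(raw) - 1, 2):
        swapped += bytes((raw[i + 1], raw[i]))
    return swapped.decode("ascii")


if __name__ == "__main__":
    decoded = decode_hex_serial("20202020202020203431")
    print(repr(decoded))
    # Compare against the serial reported when the disk was not in use.
    print("14100C07B9DF".startswith(decoded.strip()))
```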

Hope this helps improve smartmontools.

Regards,
- Martijn
smartctl 6.3 2014-07-26 r3976 [x86_64-w64-mingw32-win7-sp1] (sf-6.3-1)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     Crucial_CT1024M550SSD1
Serial Number:    14100C07B9DF
LU WWN Device Id: 5 00a075 10c07b9df
Firmware Version: MU01
User Capacity:    1.024.209.543.168 bytes [1,02 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 6
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Wed Aug 13 21:00:44 2014 WEDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM level is:     254 (maximum performance)
Rd look-ahead is: Enabled
Write cache is:   Enabled
ATA Security is:  Disabled, frozen [SEC2]
Wt Cache Reorder: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x80)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		( 4765) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 (  12) minutes.
Conveyance self-test routine
recommended polling time: 	 (   3) minutes.
SCT capabilities: 	       (0x0035)	SCT Status supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   100   100   000    -    0
  5 Reallocated_Sector_Ct   PO--CK   100   100   000    -    0
  9 Power_On_Hours          -O--CK   100   100   000    -    96
 12 Power_Cycle_Count       -O--CK   100   100   000    -    2
171 Unknown_Attribute       -O--CK   100   100   000    -    0
172 Unknown_Attribute       -O--CK   100   100   000    -    0
173 Unknown_Attribute       -O--CK   100   100   000    -    0
174 Unknown_Attribute       -O--CK   100   100   000    -    1
180 Unused_Rsvd_Blk_Cnt_Tot PO--CK   000   000   000    -    8892
183 Runtime_Bad_Block       -O--CK   100   100   000    -    0
184 End-to-End_Error        -O--CK   100   100   000    -    0
187 Reported_Uncorrect      -O--CK   100   100   000    -    0
194 Temperature_Celsius     -O---K   068   056   000    -    32 (Min/Max 24/44)
196 Reallocated_Event_Count -O--CK   100   100   000    -    16
197 Current_Pending_Sector  -O--CK   100   100   000    -    0
198 Offline_Uncorrectable   ----CK   100   100   000    -    0
199 UDMA_CRC_Error_Count    -O--CK   100   100   000    -    0
202 Unknown_SSD_Attribute   P---CK   100   100   000    -    0
206 Unknown_SSD_Attribute   -OSR--   100   100   000    -    0
210 Unknown_Attribute       -O--CK   100   100   000    -    0
246 Unknown_Attribute       -O--CK   100   100   000    -    550141350
247 Unknown_Attribute       -O--CK   100   100   000    -    17192522
248 Unknown_Attribute       -O--CK   100   100   000    -    534235
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01           SL  R/O      1  Summary SMART error log
0x02           SL  R/O     51  Comprehensive SMART error log
0x03       GPL     R/O  16383  Ext. Comprehensive SMART error log
0x04       GPL,SL  R/O    255  Device Statistics log
0x06           SL  R/O      1  SMART self-test log
0x07       GPL     R/O      1  Extended self-test log
0x09           SL  R/W      1  Selective self-test log
0x10       GPL     R/O      1  NCQ Command Error log
0x11       GPL     R/O      1  SATA Phy Event Counters
0x13       GPL     R/O      1  SATA NCQ Send and Receive log
0x24       GPL     R/O    429  Current Device Internal Status Data log
0x25       GPL     R/O    145  Saved Device Internal Status Data log
0x30       GPL,SL  R/O      9  IDENTIFY DEVICE data log
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log
0xa0       GPL     VS    2000  Device vendor specific log
0xa0       SL      VS     208  Device vendor specific log
0xa1-0xbf  GPL,SL  VS       1  Device vendor specific log
0xc0       GPL,SL  VS      80  Device vendor specific log
0xc1-0xdf  GPL,SL  VS       1  Device vendor specific log
0xe0       GPL,SL  R/W      1  SCT Command/Status
0xe1       GPL,SL  R/W      1  SCT Data Transfer

SMART Extended Comprehensive Error Log size 16383 not supported

SMART Error Log Version: 1
No Errors Logged

SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%        90         -
# 2  Vendor (0xff)       Completed without error       00%        90         -
# 3  Short offline       Completed without error       00%        17         -
# 4  Offline             Completed without error       00%         1         -
# 5  Short offline       Completed without error       00%         0         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Status Version:                  3
SCT Version (vendor specific):       1 (0x0001)
SCT Support Level:                   0
Device State:                        Active (0)
Current Temperature:                    32 Celsius
Power Cycle Min/Max Temperature:     26/34 Celsius
Lifetime    Min/Max Temperature:     24/44 Celsius
Under/Over Temperature Limit Count:   0/0

SCT Temperature History Version:     2
Temperature Sampling Period:         1 minute
Temperature Logging Interval:        1 minute
Min/Max recommended Temperature:      0/70 Celsius
Min/Max Temperature Limit:           -5/75 Celsius
Temperature History Size (Index):    478 (47)

Index    Estimated Time   Temperature Celsius
  48    2014-08-13 13:03    30  ***********
 ...    ..( 74 skipped).    ..  ***********
 123    2014-08-13 14:18    30  ***********
 124    2014-08-13 14:19    31  ************
 125    2014-08-13 14:20    30  ***********
 126    2014-08-13 14:21    32  *************
 127    2014-08-13 14:22    33  **************
 128    2014-08-13 14:23    34  ***************
 129    2014-08-13 14:24    35  ****************
 130    2014-08-13 14:25    36  *****************
 131    2014-08-13 14:26    37  ******************
 132    2014-08-13 14:27    37  ******************
 133    2014-08-13 14:28    35  ****************
 134    2014-08-13 14:29    34  ***************
 135    2014-08-13 14:30    34  ***************
 136    2014-08-13 14:31    34  ***************
 137    2014-08-13 14:32    33  **************
 ...    ..(  2 skipped).    ..  **************
 140    2014-08-13 14:35    33  **************
 141    2014-08-13 14:36    32  *************
 ...    ..( 12 skipped).    ..  *************
 154    2014-08-13 14:49    32  *************
 155    2014-08-13 14:50    31  ************
 ...    ..( 67 skipped).    ..  ************
 223    2014-08-13 15:58    31  ************
 224    2014-08-13 15:59    30  ***********
 ...    ..(238 skipped).    ..  ***********
 463    2014-08-13 19:58    30  ***********
 464    2014-08-13 19:59     ?  -
 465    2014-08-13 20:00    26  *******
 466    2014-08-13 20:01    27  ********
 467    2014-08-13 20:02    30  ***********
 ...    ..(  2 skipped).    ..  ***********
 470    2014-08-13 20:05    30  ***********
 471    2014-08-13 20:06    32  *************
 472    2014-08-13 20:07    33  **************
 473    2014-08-13 20:08    34  ***************
 474    2014-08-13 20:09    34  ***************
 475    2014-08-13 20:10    33  **************
 476    2014-08-13 20:11    32  *************
 477    2014-08-13 20:12    32  *************
   0    2014-08-13 20:13    30  ***********
 ...    ..(  2 skipped).    ..  ***********
   3    2014-08-13 20:16    30  ***********
   4    2014-08-13 20:17    31  ************
 ...    ..( 31 skipped).    ..  ************
  36    2014-08-13 20:49    31  ************
  37    2014-08-13 20:50    30  ***********
 ...    ..(  9 skipped).    ..  ***********
  47    2014-08-13 21:00    30  ***********

SCT Error Recovery Control command not supported

Device Statistics (GP Log 0x04)
Page Offset Size         Value  Description
  1  =====  =                =  == General Statistics (rev 2) ==
  1  0x008  4                2  Lifetime Power-On Resets
  1  0x010  4               96  Power-on Hours
  1  0x018  6        550141350  Logical Sectors Written
  1  0x020  6          4577342  Number of Write Commands
  1  0x028  6          5669747  Logical Sectors Read
  1  0x030  6           607146  Number of Read Commands
  4  =====  =                =  == General Errors Statistics (rev 1) ==
  4  0x008  4                0  Number of Reported Uncorrectable Errors
  4  0x010  4                3  Resets Between Cmd Acceptance and Completion
  5  =====  =                =  == Temperature Statistics (rev 1) ==
  5  0x008  1               32  Current Temperature
  5  0x010  1               30  Average Short Term Temperature
  5  0x018  1                -  Average Long Term Temperature
  5  0x020  1               44  Highest Temperature
  5  0x028  1               24  Lowest Temperature
  5  0x030  1               34  Highest Average Short Term Temperature
  5  0x038  1               30  Lowest Average Short Term Temperature
  5  0x040  1                -  Highest Average Long Term Temperature
  5  0x048  1                -  Lowest Average Long Term Temperature
  5  0x050  4                -  Time in Over-Temperature
  5  0x058  1               70  Specified Maximum Operating Temperature
  5  0x060  4                -  Time in Under-Temperature
  5  0x068  1                0  Specified Minimum Operating Temperature
  6  =====  =                =  == Transport Statistics (rev 1) ==
  6  0x008  4                0  Number of Hardware Resets
  6  0x010  4                0  Number of ASR Events
  6  0x018  4                0  Number of Interface CRC Errors
  7  =====  =                =  == Solid State Device Statistics (rev 1) ==
  7  0x008  1                3~ Percentage Used Endurance Indicator
                              |_ ~ normalized value

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0001  4            0  Command failed due to ICRC error
0x000a  4            2  Device-to-host register FISes sent due to a COMRESET

smartctl 6.3 2014-07-26 r3976 [x86_64-w64-mingw32-win7-sp1] (sf-6.3-1)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

Warning: Limited functionality due to missing admin rights
=== START OF INFORMATION SECTION ===
Device Model:     Crucial_CT1024M550SSD1
Serial Number:    20202020202020203431
Firmware Version: MU01
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   [No Information Found]
Local Time is:    Wed Aug 13 21:01:42 2014 WEDT
SMART support is: Available - device has SMART capability.
                  Enabled status cached by OS, trying SMART RETURN STATUS cmd.
SMART support is: Enabled
AAM feature is:   Unavailable
APM feature is:   Unavailable
Rd look-ahead is: Unavailable
Write cache is:   Unavailable
ATA Security is:  Unavailable
Wt Cache Reorder: Unavailable

Read SMART Thresholds failed: Function not implemented

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x80)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		( 4765) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					No General Purpose Logging support.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 (  12) minutes.
Conveyance self-test routine
recommended polling time: 	 (   3) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   100   100   ---    -    0
  5 Reallocated_Sector_Ct   PO--CK   100   100   ---    -    0
  9 Power_On_Hours          -O--CK   100   100   ---    -    96
 12 Power_Cycle_Count       -O--CK   100   100   ---    -    2
171 Unknown_Attribute       -O--CK   100   100   ---    -    0
172 Unknown_Attribute       -O--CK   100   100   ---    -    0
173 Unknown_Attribute       -O--CK   100   100   ---    -    0
174 Unknown_Attribute       -O--CK   100   100   ---    -    1
180 Unused_Rsvd_Blk_Cnt_Tot PO--CK   000   000   ---    -    8892
183 Runtime_Bad_Block       -O--CK   100   100   ---    -    0
184 End-to-End_Error        -O--CK   100   100   ---    -    0
187 Reported_Uncorrect      -O--CK   100   100   ---    -    0
194 Temperature_Celsius     -O---K   068   056   ---    -    32 (Min/Max 24/44)
196 Reallocated_Event_Count -O--CK   100   100   ---    -    16
197 Current_Pending_Sector  -O--CK   100   100   ---    -    0
198 Offline_Uncorrectable   ----CK   100   100   ---    -    0
199 UDMA_CRC_Error_Count    -O--CK   100   100   ---    -    0
202 Data_Address_Mark_Errs  P---CK   100   100   ---    -    0
206 Flying_Height           -OSR--   100   100   ---    -    0
210 Unknown_Attribute       -O--CK   100   100   ---    -    0
246 Unknown_Attribute       -O--CK   100   100   ---    -    550141350
247 Unknown_Attribute       -O--CK   100   100   ---    -    17192522
248 Unknown_Attribute       -O--CK   100   100   ---    -    534555
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

Read SMART Log Directory failed: Function requires admin rights

ATA_READ_LOG_EXT (addr=0x00:0x00, page=0, n=1) failed: Function requires admin rights
Read GP Log Directory failed

SMART Extended Comprehensive Error Log (GP Log 0x03) not supported

Read SMART Error Log failed: Function requires admin rights

SMART Extended Self-test Log (GP Log 0x07) not supported

Read SMART Self-test Log failed: Function requires admin rights

Read SMART Selective Self-test Log failed: Function requires admin rights

SCT Commands not supported

Device Statistics (GP Log 0x04) not supported

SATA Phy Event Counters (GP Log 0x11) not supported

------------------------------------------------------------------------------
Richard Flint | 13 Aug 20:18 2014

odd selftest log on an OCZ VERTEX PLUS SSD

Hi,

I have set up smartd and have scheduled a daily short self test and weekly long self test of an OCZ VERTEX PLUS SSD. I have an issue where the "lifetime(hours)" column in the selftest report appears incorrect (see below). Interestingly, querying the power on hours attribute shows the correct figure of 5555 hours.

Any advice on why the hours value is incorrect in the self test logs would be appreciated.

Regards,
Richard

(1:500)$ sudo smartctl -l selftest -dsat,12 /dev/rdsk/c7t4d0

smartctl 6.3 2014-07-26 r3976 [i386-pc-solaris2.10] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%       163         -
# 2  Short offline       Completed without error       00%       139         -
# 3  Extended offline    Completed without error       00%       137         -
# 4  Short offline       Completed without error       00%       115         -
# 5  Short offline       Completed without error       00%        91         -
# 6  Short offline       Completed without error       00%        67         -
# 7  Short offline       Completed without error       00%        43         -
# 8  Short offline       Completed without error       00%        19         -
# 9  Short offline       Completed without error       00%       251         -
#10  Short offline       Completed without error       00%       248         -
#11  Short offline       Completed without error       00%       248         -
#12  Short offline       Completed without error       00%       227         -
#13  Extended offline    Completed without error       00%       225         -
#14  Short offline       Completed without error       00%       203         -
#15  Short offline       Completed without error       00%       179         -
#16  Short offline       Completed without error       00%       155         -
#17  Short offline       Completed without error       00%       131         -
#18  Short offline       Completed without error       00%       107         -
#19  Short offline       Completed without error       00%        83         -
#20  Short offline       Completed without error       00%        59         -
#21  Extended offline    Completed without error       00%        57         -
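
A quick arithmetic check of the numbers above, offered as a guess rather than a diagnosis: the logged values would be consistent with a counter that keeps only its low 8 bits and wraps at 256. The wrap value of 256 is an assumption made for this sketch; nothing in the output confirms how the drive fills this field.

```python
# LifeTime(hours) column from the self-test log above, newest entry first.
LOGGED_HOURS = [163, 139, 137, 115, 91, 67, 43, 19, 251, 248, 248,
                227, 225, 203, 179, 155, 131, 107, 83, 59, 57]

POWER_ON_HOURS = 5555  # Power_On_Hours figure quoted in the message
WRAP = 256             # ASSUMPTION: the drive stores only the low 8 bits


def gaps_between_tests(hours_newest_first, wrap):
    """Hours elapsed between consecutive tests if the counter wraps at `wrap`."""
    oldest_first = hours_newest_first[::-1]
    return [(later - earlier) % wrap
            for earlier, later in zip(oldest_first, oldest_first[1:])]


gaps = gaps_between_tests(LOGGED_HOURS, WRAP)
print("gaps between tests (hours):", gaps)
print("largest gap:", max(gaps))
print("power-on hours modulo", WRAP, "=", POWER_ON_HOURS % WRAP)
```

Under that assumption no gap exceeds 24 hours, which would fit a daily short test plus a weekly long test, and the jump from 251 to 19 becomes an ordinary 24-hour step.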

------------------------------------------------------------------------------
Mike | 12 Aug 17:36 2014

smartctl output for Crucial 512Gb SSD

Hi

My m4 512Gb CT512M4SSD1 suddenly failed to read. Below is the smartctl output, which eventually returns after at least a couple of minutes. I've tried 2 different USB caddies and also a SATA connection to a laptop running a live CD. The drive has exclusively been mounted read-only on a Linux system with an ext4 filesystem, and has routinely suffered unexpected power loss throughout its life. The values in the attribute table look OK to me, but the serious errors at the bottom look bad. Any ideas greatly appreciated:

root@w530:/dev# smartctl --all /dev/sdb
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.8.0-41-generic] (local build)
Copyright (C) 2002-11 by Bruce Allen,http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model:     M4-CT512M4SSD1
*removed personal information*
LU WWN Device Id: 5 00a075 1091fcc21
Firmware Version: 070H
User Capacity:    512,110,190,592 bytes [512 GB]
Sector Size:      512 bytes logical/physical
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 6
Local Time is:    Tue Aug 12 12:02:50 2014 BST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
Warning: This result is based on an Attribute check.

General SMART Values:
Offline data collection status:  (0x80)    Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)    The previous self-test routine completed
                    without error or no self-test has ever
                    been run.
Total time to complete Offline
data collection:         ( 2380) seconds.
Offline data collection
capabilities:              (0x7b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003)    Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01)    Error logging supported.
                    General Purpose Logging supported.
Short self-test routine
recommended polling time:      (   2) minutes.
Extended self-test routine
recommended polling time:      (  39) minutes.
Conveyance self-test routine
recommended polling time:      (   3) minutes.
SCT capabilities:            (0x003d)    SCT Status supported.
                    SCT Error Recovery Control supported.
                    SCT Feature Control supported.
                    SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   100   050    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   100   100   001    Old_age   Always       -       4439
 12 Power_Cycle_Count       0x0032   100   100   001    Old_age   Always       -       515
170 Unknown_Attribute       0x0033   100   100   010    Pre-fail  Always       -       0
171 Unknown_Attribute       0x0032   100   100   001    Old_age   Always       -       0
172 Unknown_Attribute       0x0032   100   100   001    Old_age   Always       -       0
173 Unknown_Attribute       0x0033   100   100   010    Pre-fail  Always       -       0
174 Unknown_Attribute       0x0032   100   100   001    Old_age   Always       -       398
181 Program_Fail_Cnt_Total  0x0022   100   100   001    Old_age   Always       -       618478239843
183 Runtime_Bad_Block       0x0032   100   100   001    Old_age   Always       -       0
184 End-to-End_Error        0x0033   100   100   050    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   100   100   001    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   001    Old_age   Always       -       0
189 High_Fly_Writes         0x000e   100   100   001    Old_age   Always       -       215
194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       0
195 Hardware_ECC_Recovered  0x003a   100   100   001    Old_age   Always       -       0
196 Reallocated_Event_Count 0x0032   100   100   001    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   100   100   001    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   100   001    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   100   100   001    Old_age   Always       -       0
202 Data_Address_Mark_Errs  0x0018   100   100   001    Old_age   Offline      -       0
206 Flying_Height           0x000e   100   100   001    Old_age   Always       -       0

Read SMART Log Directory failed.

Error SMART Error Log Read failed: scsi error medium or hardware error (serious)
Smartctl: SMART Error Log Read Failed
Error SMART Error Self-Test Log Read failed: scsi error medium or hardware error (serious)
Smartctl: SMART Self Test Log Read Failed
Error SMART Read Selective Self-Test Log failed: scsi error medium or hardware error (serious)
Smartctl: SMART Selective Self Test Log Read Failed
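
For reading the FLAG column in the attribute table above: the hex value is a bit mask, and smartctl can also show the same flags as six letters (P, O, S, R, C, K). The sketch below converts one form to the other. The bit assignments are my understanding of the usual ATA attribute flag layout, and the function name is made up for this example.

```python
# Bit masks in the order of the six-letter flag column.
FLAG_BITS = [
    (0x01, "P"),  # prefailure warning
    (0x02, "O"),  # updated online
    (0x04, "S"),  # speed/performance
    (0x08, "R"),  # error rate
    (0x10, "C"),  # event count
    (0x20, "K"),  # auto-keep
]


def flags_to_letters(flag):
    """Render a numeric attribute flag as a six-character letter string."""
    return "".join(letter if flag & bit else "-" for bit, letter in FLAG_BITS)


# A few FLAG values taken from the table above.
for name, flag in [("Raw_Read_Error_Rate", 0x002F),
                   ("Reallocated_Sector_Ct", 0x0033),
                   ("Power_On_Hours", 0x0032),
                   ("Offline_Uncorrectable", 0x0030)]:
    print(f"{name:<24}0x{flag:04x}  {flags_to_letters(flag)}")
```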

------------------------------------------------------------------------------
Roy Sigurd Karlsbakk | 31 Jul 16:03 2014

Monitoring disks on an LSI 9207 controller

Hi all

I have a Supermicro box with an LSI SAS 9207-4i4e controller and I'm trying to get good SMART data from the
drives, but…

Drive connected to local SATA port: http://paste.debian.net/112936/
Drive connected to local LSI controller via SAS expander: http://paste.debian.net/112937/

Initially tested with Debian Wheezy's smartmontools 5.41, then compiled 6.3 - same results. Output above
is from 6.3.

Any idea how I can get good data from this thing? I've had Supermicros earlier (2+ years back) with LSI
controllers and it worked back then.

Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 98013356
roy <at> karlsbakk.net
http://blogg.karlsbakk.net/
GPG Public key: http://karlsbakk.net/roysigurdkarlsbakk.pubkey.txt
--
In all pedagogy it is essential that the curriculum be presented intelligibly. It is an elementary imperative
for all pedagogues to avoid excessive use of idioms of xenotypic etymology. In most cases
adequate and relevant synonyms exist in Norwegian.

------------------------------------------------------------------------------
Martin Husemann | 24 Jul 10:21 2014

firmware upgrade warning from smartd, but no update available

(I tried to create a Trac account and create a formal ticket, but the
verification email for new accounts does not work, maybe there is some
issue with the Trac configuration?)

I have a minor issue, apparently an entry in the database is not specific
enough. At startup smartd logs:

Jul 22 18:29:13 emmas smartd[492]: smartd 6.2 2013-07-26 r3841 [i486--netbsdelf] (local build)
Jul 22 18:29:13 emmas smartd[492]: Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
Jul 22 18:29:13 emmas smartd[492]: Opened configuration file /usr/pkg/etc/smartd.conf
Jul 22 18:29:13 emmas smartd[492]: Drive: DEVICESCAN, implied '-a' Directive on line 23 of file /usr/pkg/etc/smartd.conf
Jul 22 18:29:13 emmas smartd[492]: Configuration file /usr/pkg/etc/smartd.conf was parsed, found DEVICESCAN, scanning devices
Jul 22 18:29:13 emmas smartd[492]: Device: /dev/rwd0d, opened
Jul 22 18:29:13 emmas smartd[492]: Device: /dev/rwd0d, ST2000DM001-9YN164, S/N:W1E0K58R, WWN:5-000c50-052516ef4, FW:CC4H, 2.00 TB
Jul 22 18:29:13 emmas smartd[492]: Device: /dev/rwd0d, found in smartd database: Seagate Barracuda 7200.14 (AF)
Jul 22 18:29:13 emmas smartd[492]: Device: /dev/rwd0d, is SMART capable. Adding to "monitor" list.
Jul 22 18:29:13 emmas smartd[492]: Device: /dev/rwd1d, opened
Jul 22 18:29:13 emmas smartd[492]: Device: /dev/rwd1d, ST2000DM001-1CH164, S/N:Z340YYLV, WWN:5-000c50-0668b25a1, FW:CC29, 2.00 TB
Jul 22 18:29:13 emmas smartd[492]: Device: /dev/rwd1d, found in smartd database: Seagate Barracuda 7200.14 (AF)
Jul 22 18:29:13 emmas smartd[492]: Device: /dev/rwd1d, WARNING: A firmware update for this drive may be available,
Jul 22 18:29:13 emmas smartd[492]: see the following Seagate web pages:
Jul 22 18:29:13 emmas smartd[492]: http://knowledge.seagate.com/articles/en_US/FAQ/207931en
Jul 22 18:29:13 emmas smartd[492]: http://knowledge.seagate.com/articles/en_US/FAQ/223651en
Jul 22 18:29:13 emmas smartd[492]: Device: /dev/rwd1d, is SMART capable. Adding to "monitor" list.
Jul 22 18:29:13 emmas smartd[492]: Monitoring 2 ATA and 0 SCSI devices

Side note: this is a very useful warning, thanks for adding it!
The first drive listed above has been updated after I saw the warning (so it
no longer triggers it).

However, the second drive, while very similar, does not have newer firmware
available. The firmware update utility rejected the drive, and the download
finder on the Seagate web page does not find any newer firmware either:

Serial Number: 	Z340YYLV			Part Number: 	1CH164-306		
Model Number: 	ST2000DM001			Family: 	BARRACUDA 7200.14 FAMILY		

No Newer Firmware Available 

I notice the warning was worded defensively ("may be available"), and I have
no idea how to tell which model is which - but thought I'd let you know
in case it is easily fixable.
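
One thing visible in the log above is that the two drives report different model suffixes (-9YN164 and -1CH164). As a rough sketch only: a rule keyed on the full model string could separate them. Both patterns below are hypothetical, written for illustration, and are not the actual entries in the smartmontools drive database.

```python
import re

# Model strings as reported by smartd in the log above.
MODELS = {
    "/dev/rwd0d": "ST2000DM001-9YN164",
    "/dev/rwd1d": "ST2000DM001-1CH164",
}

# HYPOTHETICAL patterns, for illustration only.
BROAD = re.compile(r"ST2000DM001-.*")       # matches the whole family
NARROW = re.compile(r"ST2000DM001-9YN164")  # matches one variant only

for device, model in MODELS.items():
    broad_hit = bool(BROAD.fullmatch(model))
    narrow_hit = bool(NARROW.fullmatch(model))
    print(f"{device}  {model}  broad={broad_hit}  narrow={narrow_hit}")
```

Whether the suffix is really the right thing to key on is a question for whoever maintains the database entry; the sketch only shows that the strings differ enough to be told apart.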

Thanks a lot for these great tools!

Martin

------------------------------------------------------------------------------
meino.cramer | 27 Jul 14:58 2014

Weird result while smart testing a Western Digital drive

Hi,

On my Gentoo Linux system I ran an offline SMART test on my Western Digital
drive (SATA):

smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.14.13-RT] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar Green (AF)
Device Model:     WDC WD10EARS-00Y5B1
Serial Number:    WD-WMAV51276611
LU WWN Device Id: 5 0014ee 001f5fb47
Firmware Version: 80.00A80
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 2.6, 3.0 Gb/s
Local Time is:    Sun Jul 27 14:39:58 2014 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

and got (among other things) this report:

197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       1

#14  Extended offline    Completed: read failure       90%     14484         4288352511

(the #14 is there due to my fruitless experiments - see below)

The offline test immediately stops when hitting the bad sector. 90% of
the disk was never tested.

By the way: I started smartctl as 'smartctl -d sat' as recommended by
'smartctl -d test'.

My partition layout is:

Disk /dev/sda: 931.5 GiB, 1000204886016 bytes, 1953525168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x07ec16a2

Device     Boot      Start        End    Blocks  Id System
/dev/sda1  *          2048     104447     51200  83 Linux
/dev/sda2           104448   12687359   6291456  82 Linux swap / Solaris
/dev/sda3         12687360  222402559 104857600  83 Linux
/dev/sda4        222402560 1953525167 865561304   5 Extended
/dev/sda5        222404608  232890367   5242880  83 Linux
/dev/sda6        232892416  442607615 104857600  83 Linux
/dev/sda7        442609664  652324863 104857600  83 Linux
/dev/sda8        652326912  862042111 104857600  83 Linux
/dev/sda9        862044160 1071759359 104857600  83 Linux
/dev/sda10      1071761408 1281476607 104857600  83 Linux
/dev/sda11      1281478656 1491193855 104857600  83 Linux
/dev/sda12      1491195904 1953525167 231164632  83 Linux

With

4288352511 / 512 = 8375688

I found that the swap partition (/dev/sda2) is affected by the bad sector.
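
The lookup described above can be sketched as follows. The start and end sectors are copied from the fdisk listing; the function name and the decision to skip the extended container (sda4) are choices made for this example. The sketch also compares the reported value with the disk's total sector count, because LBA_of_first_error is a block address rather than a byte offset.

```python
# Start/end sectors from the fdisk listing above (extended container sda4 omitted).
PARTITIONS = [
    ("/dev/sda1", 2048, 104447),
    ("/dev/sda2", 104448, 12687359),
    ("/dev/sda3", 12687360, 222402559),
    ("/dev/sda5", 222404608, 232890367),
    ("/dev/sda6", 232892416, 442607615),
    ("/dev/sda7", 442609664, 652324863),
    ("/dev/sda8", 652326912, 862042111),
    ("/dev/sda9", 862044160, 1071759359),
    ("/dev/sda10", 1071761408, 1281476607),
    ("/dev/sda11", 1281478656, 1491193855),
    ("/dev/sda12", 1491195904, 1953525167),
]
TOTAL_SECTORS = 1953525168   # from the fdisk header
REPORTED_LBA = 4288352511    # LBA_of_first_error from the self-test log


def locate(sector):
    """Return the partition containing `sector`, or None if there is none."""
    for device, start, end in PARTITIONS:
        if start <= sector <= end:
            return device
    return None


print("reported LBA within disk:", REPORTED_LBA < TOTAL_SECTORS)
print("reported LBA // 512     :", REPORTED_LBA // 512, "->", locate(REPORTED_LBA // 512))
print("reported LBA as a sector:", locate(REPORTED_LBA))
```

Taken as a sector number, the reported value lies beyond the 1953525168 sectors fdisk shows, so it does not map onto any partition directly; only the divided value lands in /dev/sda2.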

I ran swapoff on the swap partition (...) and did a

dd if=/dev/zero of=/dev/sda2 bs=512 conv=notrunc

and it failed with an I/O error at a certain point.

Another offline test showed me that the bad sector was still there (no
remapping).

To scan the rest of the disk I started a selective self-test.

(one example of my fruitless experiments: scanning the very end of
the disk)
smartctl -t selective,1953525100-1953525167 /dev/sda
smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.14.13-RT] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Selective self-test routine immediately in off-line mode".
SPAN         STARTING_LBA           ENDING_LBA
   0           1953525100           1953525167
Drive command "Execute SMART Selective self-test routine immediately in off-line mode" successful.
Testing has begun.

BUT: regardless of what I entered as the scan span, it ALWAYS breaks off
with the error that the already-found sector cannot be
written/accessed:

SMART Selective self-test log data structure revision number 1
 SPAN     MIN_LBA     MAX_LBA  CURRENT_TEST_STATUS
    1  1953525100  1953525167  Not_testing

SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Selective offline   Completed: read failure       90%     14500         4288352511

I am completely confused...

If a self-test stops at the first sector that cannot be read or
written, and the bad sector is right at the beginning of the scan
area, and every selective test first checks that bad sector (as it
seems) and then fails as well, how can I monitor my hard disk in a
meaningful manner?

Any help is very much appreciated!

Best regards,
mcc

PS: uname -a:
Linux solfire 3.14.13-RT #1 SMP PREEMPT Fri Jul 18 15:53:15 CEST 2014 x86_64 AMD Phenom(tm) II X6 1090T Processor AuthenticAMD GNU/Linux
