AGRAWAL, Vishnu (Vishnu | 15 Oct 19:28 2014

Output different for the same smartctl command and version

Hi,

 

On cross-checking the disk health status of two systems, one in production and one in staging, the same command produces two different outputs. The command used is "/usr/sbin/smartctl -H /dev/sda". For the ESMF server in production it shows 'OK', and for the ESM server in staging it shows 'PASSED'.

1. Why is the output different for the same command on the two servers?
2. Do the two results have different significance?

 

PRODUCTION:
===========

[root@TP-ESMF001 monitoring]# /usr/sbin/smartctl -H /dev/sda
smartctl version 5.38 [x86_64-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

SMART Health Status: OK

 

STAGING:
========

[root@JW-ESM01 monitoring]# /usr/sbin/smartctl -H /dev/sda
smartctl version 5.38 [x86_64-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
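
For a monitoring script that has to cope with both outputs: as far as I know, smartctl prints "SMART Health Status: OK" when it addresses the device as SCSI and "SMART overall-health self-assessment test result: PASSED" when it addresses it as ATA, so both lines report a healthy disk. Below is a minimal Python sketch that assumes only these two line formats; the function name parse_health is made up for illustration and is not part of smartmontools.

```python
import re

# The two health line formats seen in the outputs quoted above.
_HEALTH_PATTERNS = [
    re.compile(r"^SMART Health Status:\s*(?P<value>.+?)\s*$", re.MULTILINE),
    re.compile(
        r"^SMART overall-health self-assessment test result:\s*(?P<value>.+?)\s*$",
        re.MULTILINE,
    ),
]

# Assumption: these are the only two values that mean "healthy".
_HEALTHY_VALUES = {"OK", "PASSED"}


def parse_health(smartctl_output):
    """Return True (healthy), False (not healthy) or None (no health line found)."""
    for pattern in _HEALTH_PATTERNS:
        match = pattern.search(smartctl_output)
        if match:
            return match.group("value") in _HEALTHY_VALUES
    return None
```

Returning None for unrecognised output keeps "could not read the status" separate from "disk reports a failure", which a monitoring check probably wants to alert on differently.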

 

 

Kind Regards,

Vishnu Agrawal

Mobile: +65 97714970

 

------------------------------------------------------------------------------
_______________________________________________
Smartmontools-support mailing list
Smartmontools-support <at> lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/smartmontools-support
Martin Schröder | 12 Oct 20:39 2014

Selective self-tests not logged to selective self-test log data structure on WD Green 6TB

Hi,
I have a Western Digital WD Green 6TB, SATA 6Gb/s (WD60EZRX).

But selective self tests are not logged to the selective self-test log
data structure.

Here's the output of -a:

smartctl 6.3 2014-07-26 r3976 [x86_64-linux-3.11.10-21-desktop] (SUSE RPM)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD60EZRX-00MVLB1
Serial Number:    WD-WX41D3404227
LU WWN Device Id: 5 0014ee 2b54a4503
Firmware Version: 80.00A80
User Capacity:    6.001.175.126.016 bytes [6,00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5700 rpm
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 1.5 Gb/s)
Local Time is:    Sun Oct 12 20:25:19 2014 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%        46         -
# 2  Selective offline   Completed without error       00%        45         -
# 3  Short offline       Completed without error       00%        45         -
# 4  Short offline       Completed without error       00%        45         -
# 5  Selective offline   Completed without error       00%        44         -
# 6  Extended offline    Completed without error       00%        44         -
# 7  Selective offline   Aborted by host               10%         4         -
# 8  Short offline       Completed without error       00%         2         -
# 9  Conveyance offline  Completed without error       00%         0         -
#10  Short offline       Completed without error       00%         0         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing

Best
   Martin

PS: I would have opened a ticket with trac, but that does not work
    with sourceforge accounts (and why would I want to use sf accounts
    on smartmontools.org?). :-(

------------------------------------------------------------------------------
Peter van Hoof | 1 Oct 15:50 2014

amendment for "Bad block HOWTO" page

Hi,

I found that the information given on the "Bad block HOWTO" page for 
forcing a bad sector to be replaced doesn't work on a disk with a 4 kiB 
physical sector size. At least that is the case on my Seagate Barracuda 
7200.14 3 TB drive. This is what I found:

Running a short self-test on this drive revealed a bad sector:

# 1  Short offline       Completed: read failure       70%     10174         4130265680

I double-checked this with dd:

# dd if=/dev/sde of=/dev/null bs=512 skip=4130265680 count=1
dd: error reading ‘/dev/sde’: Input/output error
0+0 records in
0+0 records out
0 bytes (0 B) copied, 6.80211 s, 0.0 kB/s

So indeed, this sector is bad and I want to replace it. This is what I 
did, following the description on your (otherwise excellent) HOWTO page:

# dd if=/dev/zero of=/dev/sde bs=512 seek=4130265680 count=1
dd: error writing ‘/dev/sde’: Input/output error
1+0 records in
0+0 records out
0 bytes (0 B) copied, 22.6075 s, 0.0 kB/s

The operation failed! Doing another read operation as shown above 
confirmed that the bad sector was indeed not replaced. So on a hunch I 
decided to try using a 4 kiB block size. This means that the value for 
the seek parameter needs to be adjusted. There are eight logical 
512-byte sectors in one 4096-byte physical sector, so the new seek value 
should be 4130265680 / 8 = 516283210. Using this, the new command becomes:

# dd if=/dev/zero of=/dev/sde bs=4096 seek=516283210  count=1
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.000359402 s, 11.4 MB/s

And now the operation succeeded! Double-checking confirmed that the bad 
sector was indeed replaced now:

# dd if=/dev/sde of=/dev/null bs=512 skip=4130265680 count=1
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.000310025 s, 1.7 MB/s

So it looks like replacing a bad sector only works if you overwrite the 
entire physical sector in one single write operation. Overwriting only 
part of the physical sector has no effect.
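
The conversion from the failing logical sector to the dd parameters for a whole-physical-sector write can be sketched in Python as follows. This is only an illustration under two assumptions: physical sectors are aligned to logical sector 0, and the helper name physical_write_params is made up. /dev/sdX is a placeholder, and the script only prints the command rather than running it.

```python
def physical_write_params(lba, logical_size=512, physical_size=4096):
    """Return (bs, seek) for a dd write covering the physical sector that holds `lba`.

    Assumes physical sectors start at logical sector 0 (no alignment offset).
    """
    if physical_size % logical_size != 0:
        raise ValueError("physical sector size must be a multiple of the logical size")
    sectors_per_physical = physical_size // logical_size
    # Integer division also handles an LBA that is not the first logical
    # sector inside its physical sector.
    seek = lba // sectors_per_physical
    return physical_size, seek


bs, seek = physical_write_params(4130265680)
# Print only; the write is destructive, so run it by hand after checking the device.
print(f"dd if=/dev/zero of=/dev/sdX bs={bs} seek={seek} count=1")
```

For the sector reported by the self-test above this gives bs=4096 and seek=516283210, matching the command that succeeded.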

Peter van Hoof.

------------------------------------------------------------------------------
Alexey Kopytko | 25 Sep 03:03 2014

Secondary drive in WD My Book Studio Edition II

Hello

How do I see SMART data for the second drive in a multi-drive enclosure
that is accessible over the SAT layer?

Here is what I have been able to get so far:

# smartctl -i /dev/sdd

smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.14-2-amd64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar Green
Device Model:     WDC WD10EADS-00L5B1
Serial Number:    WD-WCAU4C000000
LU WWN Device Id: 5 0014ee 1ac7ee3f6
Firmware Version: 01.01A01
User Capacity:    1 000 204 886 016 bytes [1,00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 2.5, 3.0 Gb/s
Local Time is:    Thu Sep 25 09:57:12 2014 JST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

# dmesg

usb 2-1.6: new high-speed USB device number 8 using ehci-pci
usb 2-1.6: New USB device found, idVendor=1058, idProduct=1105
usb 2-1.6: New USB device strings: Mfr=1, Product=2, SerialNumber=3
usb 2-1.6: Product: My Book
usb 2-1.6: Manufacturer: Western Digital
usb 2-1.6: SerialNumber: 0000000000000000000
usb-storage 2-1.6:1.0: USB Mass Storage device detected
scsi27 : usb-storage 2-1.6:1.0
input: Western Digital My Book as /devices/pci0000:00/0000:00:1d.0/usb2/2-1/2-1.6/2-1.6:1.1/0003:1058:1105.0006/input/input17
hid-generic 0003:1058:1105.0006: input,hidraw3: USB HID v1.11 Device [Western Digital My Book] on usb-0000:00:1d.0-1.6/input1
scsi 27:0:0:0: Direct-Access     WD       My Book          1017 PQ: 0 ANSI: 4
scsi 27:0:0:1: Enclosure         WD       My Book Device   1017 PQ: 0 ANSI: 4
sd 27:0:0:0: Attached scsi generic sg4 type 0
ses 27:0:0:1: Attached Enclosure device
ses 27:0:0:1: Attached scsi generic sg5 type 13
sd 27:0:0:0: [sdd] Spinning up disk...

# lsscsi -g | grep 27:0

[27:0:0:0]   disk    WD       My Book          1017  /dev/sdd   /dev/sg4
[27:0:0:1]   enclosu WD       My Book Device   1017  -          /dev/sg5

# smartctl -i /dev/sg4

smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.14-2-amd64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               WD
Product:              My Book
Revision:             1017
User Capacity:        1 000 196 497 408 bytes [1,00 TB]
Logical block size:   512 bytes
Logical Unit id:      error: designator length
Serial number:        WU2Q10000000
Device type:          disk
Local Time is:        Thu Sep 25 09:59:19 2014 JST
SMART support is:     Unavailable - device lacks SMART capability.

# smartctl -i /dev/sg5

smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.14-2-amd64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               WD
Product:              My Book Device
Revision:             1017
Logical Unit id:      error: designator length
Serial number:        WU2Q10000000
Device type:          enclosure
Local Time is:        Thu Sep 25 09:59:58 2014 JST
SMART support is:     Unavailable - device lacks SMART capability.

-Alexey

------------------------------------------------------------------------------
Kiril L | 20 Sep 16:38 2014

Marvell 88SE9485/9445 controllers

After searching around, I didn't find any information on Marvell
88SE9485 compatibility with smartmontools.

At the moment I use an LSI 9211-8i flashed Dell PERC H310 and cannot get
SMART info under Windows. HDD Guardian can, but I am looking for something
that works more generally with other SMART tools.

Because of that I am considering a new motherboard, such as the Asus
P9A-I/C2750/SAS/4L or Asus P9A-I/C2550/SAS/4L, which have Marvell
88SE9485 controllers. I was wondering how smartmontools would work
with these motherboards.

The only thing I found about Marvell and smartmontools is:
> marvell - [Linux only] interact with SATA disks behind Marvell chip-set controllers (using the Marvell rather than libata driver).

It looks like these motherboards are not a proper solution for me.
Please correct me if I am wrong.

If anyone is using the same controller with smartmontools, please
share your experience!

------------------------------------------------------------------------------
John Theodore | 18 Sep 04:55 2014

smartd.conf not working

The command below works and tests one disk:
  sudo smartctl -l ssd -d megaraid,1 -t short /dev/sdb

I want to set up my /etc/smartd.conf so that it runs a short test every night and a long test once a week. The one line in my /etc/smartd.conf is below:

DEVICESCAN -S on -o on -a -I 194 -s (S/../.././02|L/../../6/03)

I don't believe this works. What is wrong with this smartd.conf setup, and what should the line be instead?
------------------------------------------------------------------------------
srv4adm | 10 Sep 15:05 2014

SMART Self-test: Reserved (0x80)

Hello,
Could you explain what this means:

SMART Self-test log structure revision number 0
Warning: ATA Specification requires self-test log structure revision number = 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Reserved (0x80)     Aborted by host               60%     42798         -
# 2  Short offline       Completed without error       00%     42787         -
# 3  Short offline       Aborted by host               00%     42763         -

I'm interested in test #1. What does "Reserved (0x80)" mean?
The system was under heavy load, so I had to abort the test.
------------------------------------------------------------------------------
Olivier Calzi | 5 Sep 09:21 2014

Return Value 4 Perc/LSI

Hello Girls/Guys,

I'm having trouble understanding how the return value of a smartctl command works.

In a Munin script that graphs the state of the disk, I get a return value of 4.
The same happens when I run the command by hand: "echo $?" also gives 4.

Could you tell me what I am misunderstanding?
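
For reference, the smartctl man page describes the exit status as a bit mask rather than a single error number. A small Python sketch of decoding it is below; the bit descriptions are paraphrased from the man page as I remember it, so please check the page for your version, and the function name decode_exit_status is made up for illustration.

```python
# Bit meanings paraphrased from the smartctl man page (verify against your version).
EXIT_STATUS_BITS = {
    0: "Command line did not parse",
    1: "Device open failed, no IDENTIFY DEVICE data returned, or device in low-power mode",
    2: "A SMART or other ATA command failed, or a SMART data structure had a checksum error",
    3: "SMART status check returned 'DISK FAILING'",
    4: "Prefail attributes found at or below threshold",
    5: "SMART status is 'DISK OK' but some attributes were at or below threshold in the past",
    6: "The device error log contains records of errors",
    7: "The device self-test log contains records of errors",
}


def decode_exit_status(status):
    """Return the descriptions of all bits set in a smartctl exit status."""
    return [text for bit, text in EXIT_STATUS_BITS.items() if status & (1 << bit)]


for line in decode_exit_status(4):
    print(line)
```

A value of 4 has only bit 2 set, which under this reading means some command to the disk failed or a checksum was wrong, not that the disk itself is reported as failing.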

Thanks

------------------------------------------------
 Olivier Calzi

 Dir: +32 2 588 31 49 | Mobile: +32 498 817 703
 email|xmpp: ocalzi <at> voxbone.com
------------------------------------------------
------------------------------------------------------------------------------
Kyle Sebion | 26 Aug 16:10 2014

drivedb.h(978): Error in regular expression: Empty '|' subexpression

Machines running Windows 7 Professional 64-bit, Windows Server 2012, and Windows Server 2012 R2 are getting the following error with a drivedb.h update from around 8/23/2014:
  drivedb.h(978): Error in regular expression: Empty '|' subexpression

The error seems to come from the change at: http://www.smartmontools.org/changeset/3988
The change was done during: http://www.smartmontools.org/ticket/329

As a workaround, I changed the string on line 979 in drivedb.h from "SanDisk SD6SB[12]M[0-9]*G(1022I|)|" to "SanDisk SD6SB[12]M[0-9]*G(1022I)?|" on the computers I mentioned above.
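
To check that the workaround does not change which model strings match, the two forms can be compared in Python. This is only a sketch: Python's re module is a stand-in for the regex library smartmontools uses (Python happens to accept the empty alternative), the trailing "|" that continues the drivedb.h entry is left out, and the sample model strings are made up for illustration.

```python
import re

OLD = r"SanDisk SD6SB[12]M[0-9]*G(1022I|)"  # empty alternative after '|'
NEW = r"SanDisk SD6SB[12]M[0-9]*G(1022I)?"  # optional group instead

# Hypothetical model strings, not taken from real drives.
SAMPLES = [
    "SanDisk SD6SB1M128G",
    "SanDisk SD6SB1M128G1022I",
    "SanDisk SD6SB2M256G1022",  # incomplete suffix, should not match
    "SanDisk SD7SB1M128G",      # different model, should not match
]


def same_matches(old, new, samples):
    """Return True if both patterns accept exactly the same sample strings."""
    return all(
        bool(re.fullmatch(old, s)) == bool(re.fullmatch(new, s)) for s in samples
    )


print(same_matches(OLD, NEW, SAMPLES))
```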

--
Kyle Sebion
Developer
Digital Forces Corp

Contact Information
Support 24/7/364: 630-978-2000 X 1. Quickest response. Rings multiple people.
Direct: 630-299-4971. Follows me.
Office 7 am-5 pm: 630-978-2000 x 804. One business day.
Email: help <at> digitalforces.com. Fastest email response one business day.
Texting: 630-447-0804.  Texting is not dependable.
------------------------------------------------------------------------------
Terry Kennedy | 26 Aug 04:05 2014

Smartmontools 6.3 defect list query increments error count on Pliant SSD

  Smartmontools 6.3 (installed from ports) on FreeBSD 8-STABLE amd64. The
system is a Dell PowerEdge R710 with a Dell PERC H700 RAID controller.

  The regular disks on this system can be probed by smartmontools without
incident. However, every time the Pliant LB206M SSD is polled with smartctl,
the "non-medium error count" increments by one, and the following error is
logged to the console:

mfi0: 3580 (462244799s/0x0002/info) - Unexpected sense: PD 07(e0x20/s7) Path 5001e820026e8092, CDB: b7 0c 00 00 00 00 00 00 00 08 00 00, Sense: 1/1c/00

  B7 is "Read defect list" and 1/1C/00 is "Defect list not found". This
seems to correspond to smartctl reporting "defect list format 6 unknown".
The problem is that the reported "Non-medium error count:" increments by
one each time smartctl is run, and this is remembered by the drive.
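
For anyone who wants to decode similar console lines, here is a rough Python sketch that splits the CDB and sense values from the message above into fields. The field layout (request flags and defect list format in byte 1, allocation length in bytes 6 to 9) is my understanding of the 12-byte READ DEFECT DATA command and should be treated as an assumption; the function names are made up for illustration.

```python
def decode_read_defect_cdb(cdb_hex):
    """Decode a 12-byte READ DEFECT DATA CDB given as a hex string."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 12 or cdb[0] != 0xB7:
        raise ValueError("not a READ DEFECT DATA(12) CDB")
    return {
        "req_plist": bool(cdb[1] & 0x10),         # primary (factory) list requested
        "req_glist": bool(cdb[1] & 0x08),         # grown list requested
        "defect_list_format": cdb[1] & 0x07,      # requested address format
        "allocation_length": int.from_bytes(cdb[6:10], "big"),
    }


def decode_sense(sense):
    """Split a 'key/asc/ascq' string such as '1/1c/00' into three integers."""
    key, asc, ascq = (int(part, 16) for part in sense.split("/"))
    return key, asc, ascq


print(decode_read_defect_cdb("b7 0c 00 00 00 00 00 00 00 08 00 00"))
print(decode_sense("1/1c/00"))
```

Under that layout the logged command asks only for the grown defect list with an 8-byte allocation length, which fits a probe for the "Elements in grown defect list" figure.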

  Is it possible to define a quirk for this family of drives to prevent
the defect list inquiry? I am attaching the output of "smartctl -a" for
reference. Note that this is a Dell-labeled drive with Dell firmware - I
don't know if it happens with the OEM firmware.

smartctl 6.3 2014-07-26 r3976 [FreeBSD 8.4-STABLE amd64] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               Pliant
Product:              LB206M
Revision:             D323
Compliance:           SPC-4
User Capacity:        200,049,647,616 bytes [200 GB]
Logical block size:   512 bytes
LU is resource provisioned, LBPRZ=1
Rotation Rate:        Solid State Device
Form Factor:          2.5 inches
Logical Unit id:      0x5001e820026e8090
Serial number:        40796304
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Mon Aug 25 22:14:36 2014 EDT
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Disabled or Not Supported

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Percentage used endurance indicator: 0%
Current Drive Temperature:     39 C
Drive Trip Temperature:        60 C

Manufactured in week 36 of year 2013
Specified cycle count over device lifetime:  0
Accumulated start-stop cycles:  35
defect list format 6 unknown
Elements in grown defect list: 0

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         0          0        173.373           0
write:         0        0         0         0          0         52.992           0

Non-medium error count:        8

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background short  Completed                  48       3                 - [-   -    -]
# 2  Background long   Completed                  48       2                 - [-   -    -]
# 3  Background short  Completed                  48       1                 - [-   -    -]

Long (extended) Self Test duration: 4200 seconds [70.0 minutes]

  Please CC me directly on any replies, so I'm sure to see them. Also, if
anyone knows of a method to zero the non-medium error count, I'd appreciate
hearing about it. Thanks!

        Terry Kennedy             http://www.tmk.com
        terry <at> tmk.com             New York, NY USA

------------------------------------------------------------------------------
John A. Wallace | 25 Aug 15:33 2014

Windows oddities


> John A. Wallace wrote:
> > Windows installation oddities
> >
> > I have installed the most recent Windows version 6.3-1, which I got from
> > here: http://sourceforge.net/projects/smartmontools/files/latest/download.
> > All appears to be running well from what I have seen so far. Thanks
> > for that. I manually uninstalled the previous version.
> >
> > I am a bit puzzled by the versioning and such. Before I got this
> > version, I had been using one labeled as "smartmontools-win-6.2-2.zip".
> >
> We never provided such a file on the above download location.
>
> The file is possibly from the independent project
> http://www.netpower.fr/smartmontools. It provides an alternative
> installer for smartmontools.
>
Yes, you're probably right about that as I see the different numbering
version available from there.

Regarding the other question I had:

So I am curious whether a failing or faulty controller in an external USB
enclosure could result in a report indicating that a disk was bad or
failing when in fact the controller itself, not the disk, has the
problem. Thanks, Christian.

- John

------------------------------------------------------------------------------