Sebastian Neustein | 1 Jul 2011 15:01

Re: smartd error unreadable sectors - but seatools long test passes

Justin Piszcz <jpiszcz <at> lucidpixels.com> writes:

> 
> 
> On Thu, 30 Jun 2011, Sebastian Neustein wrote:
> 
> > Justin Piszcz <jpiszcz <at> lucidpixels.com> writes:
> >
> >> On Thu, 30 Jun 2011, Sebastian Neustein wrote:
> >>> On 30.06.2011 12:47, schrieb Justin Piszcz:
> >>>> On Thu, 30 Jun 2011, Sebastian Neustein wrote:
> >>>>> Hi everyone
> > [..snip..]
> >>>>> Why does smartd repeat the message? I guess it always means the same 4
> >>>>> sectors.
> >>>>> Does it save the status of the disk and just repeat the old message?
> >>>>> Can I tell smartd that the disk is checked?
> >> Have you looked at the ignore options?
> >>
> >>         -i ID
> >>         -I ID
> >>         -r ID[!]
> >
> > Okay, I could ignore the messages - but is this safe?
> >
> > As I understand the parameters above, they won't suppress my error message,
> > since they all need an Attribute ID but the error message has no such
> > ID.
> 
> Hi,

Fejes József | 1 Jul 2011 17:48

HD103SJ faulty firmware?

Hi,

My disk produces problems awfully similar to the well-known SMART 
firmware bug. Using version sf-win32-5.41-1.

Identification:

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Model Family:     SAMSUNG SpinPoint F3
Device Model:     SAMSUNG HD103SJ
Serial Number:    S246J9AB101937
LU WWN Device Id: 5 0024e9 2045f0ce8
Firmware Version: 1AJ10001
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

The two problematic attributes (others are fine, no reallocated 
sectors, etc.):

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
199 UDMA_CRC_Error_Count    0x0036   100   100   000    Old_age   Always       -       45
200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       26
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

Error log count: 45. Sample:

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
   ER ST SC SN CL CH DH
   -- -- -- -- -- -- --

Benjamin Hansen | 1 Jul 2011 01:17

GPL violation

Hi,

My name is Benjamin.  

First let me thank you for your HDD monitoring tools.

Unfortunately I may have some bad news.

I am attaching a UART serial log from a network media device (SMD-515) manufactured by Wegener Communications.

It appears they may be in violation of the GPL license by using your HDD smartmontools.

I informally contacted them via email last week about the issue but have not received a response.

As a result I am contacting you, the developers of smartmontools, to bring this to your attention.  

Thank you for your time.
=~=~=~=~=~=~=~=~=~=~=~= PuTTY log 2011.06.17 23:13:47 =~=~=~=~=~=~=~=~=~=~=~=
xosPda serial#a6032f686f0df4e3a4a881991fadce77 subid 0xc4

xenv cs2 ok

power supply: ok

dram0 ok (9)

dram1 ok (8)

zboot ok
**********************************
* FocusBoot start ...
* Build Date: Jan 18 2007
* Version: 2.0.0
* Started at 0x92800000.
* Configurations (chip revision: 4):
*    Use 8KB DRAM as stack.
*    Support XLoad format.
**********************************
Boot from flash (0x48000000) mapped to 0xac000000.
Found XENV block at 0xac000000.
CPU clock frequency: 301.50MHz.
System clock frequency: 201.00MHz.
DRAM0 dunit_cfg/delay0_ctrl (0xf34111ba/0x000a6665).
DRAM1 dunit_cfg/delay0_ctrl (0xe34111ba/0x000b4555).
Board ID.: "852-E2"
Chip Revision: 0x8634:0x83 .. Matched.
Setting up H/W from XENV block at 0xac000000.
  Setting <SYSCLK avclk_mux> to 0x00000000.
  Setting <SYSCLK hostclk_mux> to 0x00000100.
  Setting <IRQ rise edge trigger lo> to 0xff28ca00.
  Setting <IRQ fall edge trigger lo> to 0x0000c000.
  Setting <IRQ rise edge trigger hi> to 0x0000001f.
  Setting <IRQ fall edge trigger hi> to 0x00000000.
  Setting <IRQ GPIO map> to 0x0d090800.
  Setting <PB default timing> to 0x10101010.
  Setting <PB timing0> to 0x10101010.
  Setting <PB Use timing0> to 0x000003f6.
  Setting <PB timing1> to 0x00110101.
  Setting <PB Use timing1> to 0x000003f3.
  PB cs config: 0x00060060 (use 0x00060060)
  Enabled Devices: 0x000003fb
    ISA/IDE BM/IDE Ethernet IR FIP I2CM I2CS SDIO USB
  ISA IDE IRQ: -1
  WARNING: suspected IRQ edge trigger settings for ISA IDE.
  Setting up Clean Divider 2 to 96000000Hz.
  Setting up Clean Divider 4 to 33333333Hz.
  GPIO dir/data = 0x00000000/0x00000000
  UART0 GPIO mode/dir/data = 0x6e/0x00/0x00
  UART1 GPIO mode/dir/data = 0x6e/0x00/0x00
XENV block processing completed.
Found existing memcfg: DRAM0(0x08000000), DRAM1(0x04000000)
Serial Init Done
Flash Init Done


********* STAGE *1* BOOTLOADER **********
Read From: 00060000
Mac Address:00:07:8b:00:ed:64

____________STATE_DHCP_DISCOVER____________
clearing packet buffer0clearing packet bufferclearing dhcp state structuresetting bootp
datagpucData=928664b4, pucSend=928664b4, param=928664de, sizeof(gpucData)=640, end
gpucData=92866af4MEMSET DONEXid: 65ef038eXID foundentry: pucData=928664deDone:
pucMacAddr=92fffbf0setBootp DonedhcpOptions DonesetIp DonesetEth DoneethSend! Done
____________STATE_DHCP_OFFER____________
Src: 0:40:77:bb:55:10, Dst: 0:7:8b:0:ed:64, Protocol: 8
MyIp:192.168.1.120
ServerIp:192.168.1.1
NetMask:0.255.255.255
Filename: Got offer packet
____________STATE_DHCP_REQUEST____________
entry: pucData=928664deDone: pucMacAddr=92fffbf0
____________STATE_DHCP_DONE____________
Src: 0:40:77:bb:55:10, Dst: 0:7:8b:0:ed:64, Protocol: 8
Done!!!Write Common area to: 00060000, size 00000a24
..

_______________
OsImageLength:31981568
OsImageChecksum:
 dab78e69d46ab80c5c10e33ed7e8cbfb


bootAppMain(): Launch address is 0x00000000 : start SAP download
 [main.c:186]
Starting SAP Download
Downloading SAP records

Multicast String from Common Area:
Type: #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hWrong
Filename: expect SAP, got #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hFailed to
get SAP address from Dhcp
Downloading SAP records

Multicast String from Common Area:
Type: #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hWrong
Filename: expect SAP, got #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hFailed to
get SAP address from Dhcp
Downloading SAP records

Multicast String from Common Area:
Type: #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hWrong
Filename: expect SAP, got #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hFailed to
get SAP address from Dhcp
Downloading SAP records

Multicast String from Common Area:
Type: #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hWrong
Filename: expect SAP, got #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hFailed to
get SAP address from Dhcp
Downloading SAP records

Multicast String from Common Area:
Type: #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hWrong
Filename: expect SAP, got #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hFailed to
get SAP address from Dhcp
Downloading SAP records

Multicast String from Common Area:
Type: #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hWrong
Filename: expect SAP, got #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hFailed to
get SAP address from Dhcp
Downloading SAP records

Multicast String from Common Area:
Type: #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hWrong
Filename: expect SAP, got #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hFailed to
get SAP address from Dhcp
Downloading SAP records

Multicast String from Common Area:
Type: #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hWrong
Filename: expect SAP, got #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hFailed to
get SAP address from Dhcp
Downloading SAP records

Multicast String from Common Area:
Type: #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hWrong
Filename: expect SAP, got #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hFailed to
get SAP address from Dhcp
Downloading SAP records

Multicast String from Common Area:
Type: #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hWrong
Filename: expect SAP, got #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hFailed to
get SAP address from Dhcp
Downloading SAP records

Multicast String from Common Area:
Type: #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hWrong
Filename: expect SAP, got #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hFailed to
get SAP address from Dhcp
Downloading SAP records

Multicast String from Common Area:
Type: #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hWrong
Filename: expect SAP, got #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hFailed to
get SAP address from Dhcp
Downloading SAP records

Multicast String from Common Area:
Type: #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hWrong
Filename: expect SAP, got #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hFailed to
get SAP address from Dhcp
Downloading SAP records

Multicast String from Common Area:
Type: #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hWrong
Filename: expect SAP, got #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hFailed to
get SAP address from Dhcp
Downloading SAP records

Multicast String from Common Area:
Type: #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hWrong
Filename: expect SAP, got #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hFailed to
get SAP address from Dhcp
Downloading SAP records

Multicast String from Common Area:
Type: #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hWrong
Filename: expect SAP, got #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hFailed to
get SAP address from Dhcp
Downloading SAP records

Multicast String from Common Area:
Type: #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hWrong
Filename: expect SAP, got #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hFailed to
get SAP address from Dhcp
Downloading SAP records

Multicast String from Common Area:
Type: #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hWrong
Filename: expect SAP, got #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hFailed to
get SAP address from Dhcp
Downloading SAP records

Multicast String from Common Area:
Type: #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hWrong
Filename: expect SAP, got #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hFailed to
get SAP address from Dhcp
Downloading SAP records

Multicast String from Common Area:
Type: #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hWrong
Filename: expect SAP, got #Vts
q-k&._sK!<#1`hBEC
^Сޒeƃ̙?NXDҐ <at> M#1>hL>jy3;t,;)\)垈;_hFailed to
get SAP address from Dhcp
decryptImage(): Calling load_xload( image=0xac080000, addr=0xb1800000 );
 [main.c:450]
load_xload(): Calling doxrpc on c080000 [xload.c:27]
load_xload(): Doing memcpy from 0xac080000 to 0xb1800000, size 0x2fd14 [xload.c:33]
load_xload(): Setting dest_addr to load_addr [xload.c:37]
46 4e 49 42 00 00 00 00 
00 00 10 91 00 00 10 91 
8a c9 fb fb 00 00 00 02 
c0 f8 02 00 00 00 00 00 
25 08 00 00 25 10 00 00 
25 18 00 00 25 20 00 00 
25 28 00 00 25 30 00 00 
25 38 00 00 25 40 00 00 
decryptImage(): Copying 194752 bytes from 0xb3000400 to 0x91100000 [main.c:466]
decryptImage(): Verifying checksum [main.c:468]
decryptImage(): Checksum matches [main.c:473]
decryptImage(): Set Start Address to 0x91100000
 [main.c:475]
OS signature is valid
OS image is valid. Commencing hyper-jump [0x91100000]Write Common area to: 00060000, size 00000a24
..

_______________
OsImageLength:31981568
OsImageChecksum:
 dab78e69d46ab80c5c10e33ed7e8cbfb


Serial Init Done
Flash Init Done


************ STAGE 2 BOOTLOADER ************
* Build Date: Jan 18 2007
********************************************
OS image is valid. Commencing hyper-jump [0x90020000]SMP863x Enabled Devices under Linux/XENV
0x48000000 = 0x000003fb

  ISA/IDE BM/IDE Ethernet IR FIP I2CM I2CS SDIO USB

CPU revision is: 00019068

Primary instruction cache 16kB, physically tagged, 2-way, linesize 16 bytes.

Primary data cache 16kB, 2-way, linesize 16 bytes.

Linux version 2.4.30-tango2-2.7.147.0 (by_x@host_x) (gcc version 3.4.2) #4 Wed Mar 16 16:16:16 CET 2005

SMP863x Chip (Configured: REVISION=0x6, Detected: 0x8634/0x83)

Determined physical RAM map:

 memory: 10020000 @ 00000000 (reserved)

 memory: 03fe0000 @ 10020000 (usable)

On node 0 totalpages: 81920

zone(0): 81920 pages.

zone(1): 0 pages.

zone(2): 0 pages.

Kernel command line: root=/dev/mtdblock/1 ro

Using 150.750 MHz high precision timer.

spurious interrupt detected: 7

Calibrating delay loop... 297.98 BogoMIPS

Memory: 59148k/65408k available (2312k kernel code, 6260k reserved, 108k data, 80k init, 0k highmem)

Dentry cache hash table entries: 65536 (order: 7, 524288 bytes)

Inode cache hash table entries: 32768 (order: 6, 262144 bytes)

Mount cache hash table entries: 512 (order: 0, 4096 bytes)

Buffer cache hash table entries: 32768 (order: 5, 131072 bytes)

Page-cache hash table entries: 131072 (order: 7, 524288 bytes)

Checking for 'wait' instruction...  available.

POSIX conformance testing by UNIFIX

Linux NET4.0 for Linux 2.4

Based upon Swansea University Computer Society NET3.039

Initializing RT netlink socket

Starting kswapd

Journalled Block Device driver loaded

devfs: v1.12c (20020818) Richard Gooch (rgooch <at> atnf.csiro.au)

devfs: boot_options: 0x1

JFFS2 version 2.1. (C) 2001 Red Hat, Inc., designed by Axis Communications AB.

udf: registering filesystem

pty: 256 Unix98 ptys configured

Serial driver version 5.05c (2001-07-08) with no serial options enabled

ttyS00 at 0x0006c100 (irq = 9) is a ST16650

ttyS01 at 0x0006c200 (irq = 10) is a ST16650

em86xx_enet: ethernet driver for SMP863x internal mac

em86xx_enet: detected phy at address 0x01

em86xx_enet: mac address 00:07:8b:00:ed:64

Uniform Multi-Platform E-IDE driver Revision: 7.00beta4-2.4

ide: Assuming 50MHz system bus speed for PIO modes; override with idebus=xx

SMP863x Bus Mastering IDE activated as ide0.

physmap flash device: 1000000 at 44000000 (c007d000)

physmap flash device: 1000000 at 48000000 (c107e000)

 Amd/Fujitsu Extended Query Table v1.3 at 0x0040

number of CFI chips: 1

Using buffer write method

cfi_cmdset_0002: Disabling fast programming due to code brokenness.

Using physmap partition definition

 Amd/Fujitsu Extended Query Table v1.3 at 0x0040

number of CFI chips: 1

Using buffer write method

cfi_cmdset_0002: Disabling fast programming due to code brokenness.

Using physmap partition definition

Concatenating MTD devices:

(0): "Flash_CS2"

(1): "Flash_CS1"

into device "CS1+CS2"

Using physmap partition definition

Creating 1 MTD partitions on "CS1+CS2":

0x00400000-0x01f00000 : "Root FileSystem"

Creating 4 MTD partitions on "Flash_CS2":

0x00400000-0x01000000 : "Flash FileSystem"

0x00000000-0x00060000 : "Bootloader"

0x00060000-0x00080000 : "Common_Area"

0x00080000-0x00400000 : "Kernel"

Creating 2 MTD partitions on "Flash_CS1":

0x00000000-0x00f00000 : "Filesystem-pt2"

0x00f00000-0x01000000 : "Flash_NVM"

Finished adding mtd devices

usb.c: registered new driver usbdevfs

usb.c: registered new driver hub

NET4: Linux TCP/IP 1.0 for NET4.0

IP Protocols: ICMP, UDP, TCP, IGMP

IP: routing cache hash table of 4096 buckets, 32Kbytes

TCP: Hash tables configured (established 32768 bind 65536)

NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.

Creating TLB mapping for 0x20000000 to 0xc207f000, size 0x04000000.

cramfs: wrong magic

VFS: Mounted root (jffs2 filesystem) readonly.

Mounted devfs on /dev

Freeing unused kernel memory: 80k freed

Algorithmics/MIPS FPU Emulator v1.5

mount: Mounting /dev/ide/host0/bus0/target0/lun0/part1 on /hdisk/media failed: No such file or directory
Initializing random number generator... done.
Starting network...
run-parts: failed to open directory /etc/network/if-pre-up.d: No such file or directory
Using /lib_2.7.147.0/modules/2.4.30-tango2-2.7.147.0/llad.o
Warning: loading llad will taint the kernel: non-GPL license - LGPL
  See http://www.tux.org/lkml/#export-tainted for information about tainted modules
Using /lib_2.7.147.0/modules/2.4.30-tango2-2.7.147.0/em8xxx.o
Warning: loading em8xxx will taint the kernel: non-GPL license - Proprietary
  See http://www.tux.org/lkml/#export-tainted for information about tainted modules
mumk_register_tasklet: (0) tasklet 0xc62bd000 status @0xc62941e4

Using /lib_2.7.147.0/modules/2.4.30-tango2-2.7.147.0/kernel/drivers/char/irkernel.o
Warning: loading irkernel will taint the kir: driver loaded (wait_period = 100, buffer_size = 6)

ernel: non-GPL license - Proprietary, Copyright (c) 2004 Sigma Designs Inc. All rights reserved.
  See http://www.tux.org/lkml/#export-tainted for information about tainted modules
Using /lib/modules/2.4.30-tango2-2.7.147.0/kernel/drivers/scsi/scsi_mod.o
SCSI subsystem driver Revision: 1.00

Using /lib/modules/2.4.30-tango2-2.7.147.0/kernel/drivers/scsi/sg.o
Using /lib/modules/2.4.30-tango2-2.7.147.0/kernel/drivers/scsi/sd_mod.o
Using /lib/modules/2.4.30-tango2-2.7.147.0/kernel/drivers/usb/storage/usb-storage.o
Initializing USB Mass Storage driver...

usb.c: registered new driver usb-storage

USB Mass Storage support registered.

Using /lib/modules/2.4.30-tango2-2.7.147.0/kernel/drivers/usb/host/tango2-usb-ohci.o
Tango2 USB initializing...

tango2-usb-ohci.c: USB OHCI polling mode, at membase 0xa0021500 Status=0.

usb.c: new USB bus registered, assigned bus number 1

Product: USB OHCI Root Hub

SerialNumber: a0021500

hub.c: USB hub found

hub.c: 2 ports detected

Using /lib/modules/2.4.30-tango2-2.7.147.0/kernel/drivers/usb/host/tango2-ehci-hcd.o
Tango2 USB was initialized.

tango2-ehci-hcd: irq 48, mem base a0021400

usb.c: new USB bus registered, assigned bus number 2

tango2-ehci-hcd: park 0

TANGO2 EHCI driver enabled  

Manufacturer: Linux 2.4.30-tango2-2.7.147.0 tango2-ehci-hcd

Product: Tango2 Integrated USB 2.0

SerialNumber: tango2-ehci-bus

hub.c: USB hub found

hub.c: 2 ports detected

Using /lib/modules/2.4.30-tango2-2.7.147.0/kernel/fs/fat/fat.o
Using /lib/modules/2.4.30-tango2-2.7.147.0/kernel/fs/vfat/vfat.o
GPIO 17 is 1, setting to 0
NVM mounted with valid file system
Serial number already configured: 1052374
Stopping network with /sbin/ifconfig eth0 down
Starting network with /sbin/ifconfig eth0 up 172.16.123.74 netmask 255.255.0.0
eth0: link up, 100Mbps, full-duplex, lpa 0x45E1

smartctl version 5.32 Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/


Smartctl open device: /dev/ide/host0/bus0/target0/lun0/disc failed: No such file or directory
[117] Dec 31 19:00:13 Running in background
networkType = DHCP len= 4
InitializeFpLEDs. 

To setup STB, press UP and DOWN keys in 20 seconds.!
.
Sigma Designs SMP8634 development environment (based on the buildroot project)

uclibc login: ...................Ret=0
shell: RETVAL=0
(0) will goto HomePage:  

To config network by  /etc/kasenna.config
udhcpc (v0.9.9-pre) started
Sending discover...
Sending select for 192.168.1.120...
Lease of 192.168.1.120 obtained, lease time 86400
deleting routers
route: SIOC[ADD|DEL]RT: No such process
adding dns 192.168.1.1

restrict default noquery notrust nomodify
restrict 127.0.0.1
restrict 192.168.1.0 mask 255.255.255.0
restrict 10.10.8.132
server 10.10.8.132
driftfile /etc/ntp.drift
logfile /var/log/ntp.log
EST+5
Time Server = 10.10.8.132
rdate: timeout connecting to time server
killall: ntpd: no process killed
./xgalio --url http://10.10.8.132/lrp201d-ga-0307
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
browser = ANT Galio 19.06
build time = 2007-Feb-12  10:48
SIGMA toolchain = SMP8634_2.7.147.0
compile-time settings:
  * conditional access=none
  * memory=dual
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
[HDMI] ======================= creating pDH =======================
[HDMI] Found part, Vendor ID is 0x01005392
[HDMI]    ***   HotPlug changed, is now ON
[HDMI]    ***   Rx changed, is now ON
[HDMI]    ***   Clock changed, is now STABLE
[HDMI] DHUpdateVideoPixelClock(74175824)
[HDMI] DHSetHDMIMode(FALSE)
outports_options.c: Application did not specify HDCP SRM, providing empty SRM!
[HDMI] DHCancelHDCP()
[HDMI] DHUpdateVideoPixelClock(74175824)
[HDMI] DHSetHDMIMode(FALSE)
[HDMI] DHCancelHDCP()
[HDMI] DHUpdateVideoPixelClock(74175824)
[HDMI] DHSetHDMIMode(FALSE)
[HDMI] DHCancelHDCP()
TV-Format : dom_audio_pid_array_init
recording_iterator_register
dom_recording_iterator_init
recording_iterator_document_ext_register
[HDMI] ======================= creating pDH =======================
[HDMI] Found part, Vendor ID is 0x01005392
[HDMI]    ***   HotPlug changed, is now ON
[HDMI]    ***   Rx changed, is now ON
[HDMI]    ***   Clock changed, is now STABLE
[HDMI] DHSetHDMIMode(FALSE)
------------------------------------------------------------------------------
All of the data generated in your IT infrastructure is seriously valuable.
Why? It contains a definitive record of application performance, security 
threats, fraudulent activity, and more. Splunk takes this data and makes 
sense of it. IT sense. And common sense.
http://p.sf.net/sfu/splunk-d2d-c2
_______________________________________________
Smartmontools-support mailing list
Smartmontools-support <at> lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/smartmontools-support
Gabriele Pohl | 9 Jul 2011 15:33

Re: GPL violation

Hi list,

On 07/01/2011 01:17 AM, Benjamin Hansen wrote:
> It appears they may be in violation of the GPL license
> by using your HDD smartmontools.

we (Bruce Allen, Christian Franke and me)
got this mail from Benjamin a few days ago
and have already taken some actions.

Bruce wrote to the president of the company
to inform him about the GPL violation and
to give him the chance to resolve the issue.
We are also in contact with FSFE[1].

I was not aware that Benjamin had also
written to the public mailing list; the mail
arrived only now, due to list moderator action,
I suppose.

In such cases I advise following the recommendations in this
document from the Free Software Foundation Europe:
http://fsfe.org/projects/ftf/reporting-fixing-violations

Quote: "Be careful when reporting a violation. Accusations and
suspicions voiced on public mailing lists create uncertainty and do
little to solve violations."

I will therefore delete the mail from the public mail archive now
and ask all of you not to escalate the situation.
We will inform you when the problem is solved
or when we take legal action.

If you have information about further devices that use
smartmontools without informing customers
about this and about their rights when using *Free Software*
licensed under the GPL, please contact us by private mail.

Cheers,

Gabriele

[1] http://fsfe.org/projects/ftf/ftf.en.html

Skye Sweeney | 10 Jul 2011 02:48

Problem clearing SMART errors on WD 1T drive

Synopsis:

I am at my wits' end and could use some pointers on what I am doing wrong, or whether I just need to buy a new drive. I realize this may not be the perfect forum for this question, and I would be happy with just a pointer to the right place.

I have been getting SMART errors on a backup drive in my Fedora 12 file server. I have tried the instructions in "Bad block HOWTO for smartmontools" to no avail. I have visited many websites and have not found anything that sheds more light on my problem.

The drive holds a nightly rdiff-backup copy of my data drive. It is a full-size Western Digital 1 TB SATA drive. It is my /dev/sdc drive and has only one partition, /dev/sdc1.

The file system used to be ext4, but since the instructions for fixing bad blocks only cover ext2/3, I reformatted the drive as ext3 and followed the procedures below, again to no avail.

Details:

I get the following in my email each day:

 --------------------- Smartd Begin ------------------------


 Currently unreadable (pending) sectors detected:
       /dev/sdc [SAT] - 48 Time(s)
       44 unreadable sectors detected

 Offline uncorrectable sectors detected:
       /dev/sdc [SAT] - 48 Time(s)
       30 offline uncorrectable sectors detected

 ---------------------- Smartd End -------------------------

This ends up in /var/log/messages each day:

Jul  9 19:53:58 tux smartd[1658]: Device: /dev/sdc [SAT], 44 Currently unreadable (pending) sectors
Jul  9 19:53:58 tux smartd[1658]: Device: /dev/sdc [SAT], 30 Offline uncorrectable sectors (changed -165)
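
For anyone who wants to follow these counts over time rather than read the daily mail, here is a minimal Python sketch that pulls the device, the kind of message, the count, and the optional "changed" delta out of smartd lines shaped like the two above. The pattern is an assumption based only on these two sample lines; other smartd versions may word the messages differently, and the names `SMARTD_LINE` and `parse_smartd_line` are made up for this illustration.

```python
import re

# Pattern modelled on the two /var/log/messages lines quoted above.
# Assumption: other smartd releases may phrase these messages differently.
SMARTD_LINE = re.compile(
    r"smartd\[\d+\]: Device: (?P<device>\S+)(?: \[[^\]]+\])?, "
    r"(?P<count>\d+) "
    r"(?P<kind>Currently unreadable \(pending\)|Offline uncorrectable) sectors"
    r"(?: \(changed (?P<delta>[+-]?\d+)\))?"
)


def parse_smartd_line(line):
    """Return a dict for a smartd sector message, or None if the line does not match."""
    match = SMARTD_LINE.search(line)
    if match is None:
        return None
    delta = match.group("delta")
    return {
        "device": match.group("device"),
        "kind": match.group("kind"),
        "count": int(match.group("count")),
        # The "(changed N)" suffix is optional, so delta may be absent.
        "delta": int(delta) if delta is not None else None,
    }


SAMPLE_LINES = [
    "Jul  9 19:53:58 tux smartd[1658]: Device: /dev/sdc [SAT], "
    "44 Currently unreadable (pending) sectors",
    "Jul  9 19:53:58 tux smartd[1658]: Device: /dev/sdc [SAT], "
    "30 Offline uncorrectable sectors (changed -165)",
]

if __name__ == "__main__":
    for sample in SAMPLE_LINES:
        print(parse_smartd_line(sample))
```

Feeding the whole log through `parse_smartd_line` and keeping the non-None results would give a per-day series of the two counters, which makes it easier to see whether they are growing or shrinking.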

The steps I took to try to fix these problems:


1) Get SMART info

[root@tux ~]# smartctl -d ata -a /dev/sdc
smartctl 5.39.1 2010-01-28 r3054 [i386-redhat-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD10EARS-00Y5B1
Serial Number:    WD-WMAV51375649
Firmware Version: 80.00A80
User Capacity:    1,000,204,886,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Tue Jun 28 20:13:46 2011 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

smartctl 5.39.1 2010-01-28 r3054 [i386-redhat-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD10EARS-00Y5B1
Serial Number:    WD-WMAV51375649
Firmware Version: 80.00A80
User Capacity:    1,000,204,886,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Tue Jun 28 20:14:25 2011 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84)    Offline data collection activity
                    was suspended by an interrupting command from host.
                    Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)    The previous self-test routine completed
                    without error or no self-test has ever
                    been run.
Total time to complete Offline
data collection:          (21300) seconds.
Offline data collection
capabilities:              (0x7b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003)    Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01)    Error logging supported.
                    General Purpose Logging supported.
Short self-test routine
recommended polling time:      (   2) minutes.
Extended self-test routine
recommended polling time:      ( 245) minutes.
Conveyance self-test routine
recommended polling time:      (   5) minutes.
SCT capabilities:            (0x3031)    SCT Status supported.
                    SCT Feature Control supported.
                    SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   130   126   021    Pre-fail  Always       -       6475
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       619
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   088   088   000    Old_age   Always       -       9119
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       143
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       71
193 Load_Cycle_Count        0x0032   197   197   000    Old_age   Always       -       9117
194 Temperature_Celsius     0x0022   111   108   000    Old_age   Always       -       36
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   199   199   000    Old_age   Always       -       253
198 Offline_Uncorrectable   0x0030   199   199   000    Old_age   Offline      -       195
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   199   199   000    Old_age   Offline      -       291
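
As a reading aid for the table above, the following Python sketch splits attribute rows into fields and lists the sector-related counters (IDs 5, 197 and 198) whose raw value is non-zero. It assumes the ten-column layout shown in this output; the function names and the choice of watched IDs are illustrative, not part of smartmontools.

```python
# IDs watched in this sketch: reallocated, pending, and offline-uncorrectable sectors.
WATCHED_IDS = {5, 197, 198}


def parse_attribute_row(row):
    """Split one attribute row into named fields; return None for non-data rows."""
    fields = row.split()
    if len(fields) < 10 or not fields[0].isdigit():
        return None  # header line, blank line, or anything else
    return {
        "id": int(fields[0]),
        "name": fields[1],
        "flag": int(fields[2], 16),
        "value": int(fields[3]),
        "worst": int(fields[4]),
        "thresh": int(fields[5]),
        "type": fields[6],
        "updated": fields[7],
        "when_failed": fields[8],
        "raw": fields[9],  # kept as text; some drives print extra data here
    }


def nonzero_sector_counters(table_text):
    """Return (id, name, raw) for watched attributes with a raw value above zero."""
    findings = []
    for row in table_text.splitlines():
        attr = parse_attribute_row(row)
        if attr is None or attr["id"] not in WATCHED_IDS:
            continue
        try:
            raw = int(attr["raw"])
        except ValueError:
            continue  # raw value is not a plain integer; skip it in this sketch
        if raw > 0:
            findings.append((attr["id"], attr["name"], raw))
    return findings


SAMPLE_TABLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   199   199   000    Old_age   Always       -       253
198 Offline_Uncorrectable   0x0030   199   199   000    Old_age   Offline      -       195
"""

if __name__ == "__main__":
    for finding in nonzero_sector_counters(SAMPLE_TABLE):
        print(finding)
```

On the rows copied from the table, this reports attributes 197 and 198 and skips 5, matching the picture above: sectors are pending and uncorrectable, but none has been reallocated yet.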

SMART Error Log Version: 1
ATA Error Count: 805 (device log contains only the most recent five errors)
    CR = Command Register [HEX]
    FR = Features Register [HEX]
    SC = Sector Count Register [HEX]
    SN = Sector Number Register [HEX]
    CL = Cylinder Low Register [HEX]
    CH = Cylinder High Register [HEX]
    DH = Device/Head Register [HEX]
    DC = Device Command Register [HEX]
    ER = Error register [HEX]
    ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 805 occurred at disk power-on lifetime: 9119 hours (379 days + 23 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 08 67 6f 24 e1  Error: UNC 8 sectors at LBA = 0x01246f67 = 19165031

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 67 6f 24 e1 08      00:39:36.070  READ DMA
  ec 00 00 00 00 00 a0 08      00:39:36.061  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 08      00:39:36.058  SET FEATURES [Set transfer mode]
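
The LBA that smartctl prints for Error 805 can be cross-checked against the register dump. In 28-bit LBA addressing, the low four bits of DH hold LBA bits 27..24, CH holds bits 23..16, CL bits 15..8 and SN bits 7..0, with bit 6 of DH marking LBA mode. The Python sketch below applies that to the line "40 51 08 67 6f 24 e1"; the function names are made up for this illustration.

```python
def lba28_from_registers(sn, cl, ch, dh):
    """Combine ATA task-file registers into a 28-bit LBA."""
    if not dh & 0x40:
        # Bit 6 of the Device/Head register selects LBA addressing.
        raise ValueError("DH bit 6 is clear: registers hold CHS, not LBA")
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


def lba_from_register_line(line):
    """Parse an 'ER ST SC SN CL CH DH' hex line as printed in the error log."""
    er, st, sc, sn, cl, ch, dh = (int(tok, 16) for tok in line.split()[:7])
    return lba28_from_registers(sn, cl, ch, dh)


if __name__ == "__main__":
    lba = lba_from_register_line("40 51 08 67 6f 24 e1")
    print(f"{lba:#010x} = {lba}")
```

The result agrees with the "LBA = 0x01246f67 = 19165031" that smartctl reports, so each of the five logged errors refers to the same READ DMA of 8 sectors starting at that address.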

Error 804 occurred at disk power-on lifetime: 9119 hours (379 days + 23 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 08 67 6f 24 e1  Error: UNC 8 sectors at LBA = 0x01246f67 = 19165031

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 67 6f 24 e1 08      00:39:33.506  READ DMA
  b0 d5 01 09 4f c2 00 08      00:39:33.494  SMART READ LOG
  b0 d5 01 06 4f c2 00 08      00:39:33.490  SMART READ LOG
  b0 d5 01 01 4f c2 00 08      00:39:33.485  SMART READ LOG
  b0 d1 01 01 4f c2 00 08      00:39:33.477  SMART READ ATTRIBUTE THRESHOLDS [OBS-4]

Error 803 occurred at disk power-on lifetime: 9119 hours (379 days + 23 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 08 67 6f 24 e1  Error: UNC 8 sectors at LBA = 0x01246f67 = 19165031

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 67 6f 24 e1 08      00:39:30.754  READ DMA
  ec 00 00 00 00 00 a0 08      00:39:30.746  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 08      00:39:30.746  SET FEATURES [Set transfer mode]

Error 802 occurred at disk power-on lifetime: 9119 hours (379 days + 23 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 08 67 6f 24 e1  Error: UNC 8 sectors at LBA = 0x01246f67 = 19165031

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 67 6f 24 e1 08      00:39:28.178  READ DMA
  ec 00 00 00 00 00 a0 08      00:39:28.169  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 08      00:39:28.166  SET FEATURES [Set transfer mode]

Error 801 occurred at disk power-on lifetime: 9119 hours (379 days + 23 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 08 67 6f 24 e1  Error: UNC 8 sectors at LBA = 0x01246f67 = 19165031

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 67 6f 24 e1 08      00:39:25.615  READ DMA
  ec 00 00 00 00 00 a0 08      00:39:25.607  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 08      00:39:25.607  SET FEATURES [Set transfer mode]

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%      6491         2777760
# 2  Short offline       Completed: read failure       40%      6312         2773712

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
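
The LBA printed in each error-log entry above can be recovered by hand from the register dump. The following is only a sketch, assuming the 28-bit LBA layout that goes with READ DMA (the low nibble of DH holds bits 27-24, followed by CH, CL and SN); the helper name lba28 is made up for illustration and is not part of smartmontools.

```shell
# Decode a 28-bit LBA from the SN, CL, CH and DH register bytes.
# Arguments are hex bytes without a 0x prefix, in the order shown.
lba28() {
    sn=$1; cl=$2; ch=$3; dh=$4
    # Only the low nibble of DH belongs to the LBA (assumed LBA28 layout).
    echo $(( ((0x$dh & 0x0f) << 24) | (0x$ch << 16) | (0x$cl << 8) | 0x$sn ))
}

# Registers from the log entries above: SN=67 CL=6f CH=24 DH=e1
lba28 67 6f 24 e1    # prints 19165031
```

The result agrees with the "LBA = 0x01246f67 = 19165031" that smartctl reports for errors 801 to 805.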

[root <at> tux]# smartctl -l selftest /dev/sdc
smartctl 5.39.1 2010-01-28 r3054 [i386-redhat-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%      6491         2777760
# 2  Short offline       Completed: read failure       40%      6312         2773712




2) Get the block size
[root <at> tux]# dumpe2fs /dev/sdc | grep "Block size"
dumpe2fs 1.41.9 (22-Aug-2009)
Block size:               4096


3) LBA of bad chunk is 2773712

4) LBA of start of partition is (63)

[root <at> tux]# # LBA of start of dev/sdc is:
[root <at> tux]# fdisk -lu /dev/sdc

Disk /dev/sdc: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Disk identifier: 0x0e30349b

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1              63  1953520064   976760001   83  Linux


5) Compute offset

(2773712-63)*512/4096 = 346706.125


6) Use dd to nuke the single block at 346706

[root <at> tux]# dd if=/dev/zero of=/dev/sdc bs=4096 count=1 seek=346706
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 5.0141e-05 s, 81.7 MB/s

7) Nuke the block at the other error location (347212)

[root <at> tux]# dd if=/dev/zero of=/dev/sdc bs=4096 count=1 seek=347212
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 4.8141e-05 s, 85.1 MB/s

8) At this point I rebooted the system and I still get the errors on boot up and once a day.



--
-Skye Sweeney

------------------------------------------------------------------------------
All of the data generated in your IT infrastructure is seriously valuable.
Why? It contains a definitive record of application performance, security 
threats, fraudulent activity, and more. Splunk takes this data and makes 
sense of it. IT sense. And common sense.
http://p.sf.net/sfu/splunk-d2d-c2
_______________________________________________
Smartmontools-support mailing list
Smartmontools-support <at> lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/smartmontools-support
Alex Samorukov | 10 Jul 2011 11:35

Re: Problem clearing SMART errors on WD 1T drive

My recommendation is to put this drive in the trash can or RMA it ASAP. It 
makes sense to repair a drive that has 1-2 pending sectors, but in your 
case I think the drive will die soon. And you don't need dumpe2fs to find 
a bad block; you already have it in the short/long test report.

On 07/10/2011 02:48 AM, Skye Sweeney wrote:
>
> Jul  9 19:53:58 tux smartd[1658]: Device: /dev/sdc [SAT], 44 Currently 
> unreadable (pending) sectors
> Jul  9 19:53:58 tux smartd[1658]: Device: /dev/sdc [SAT], 30 Offline 
> uncorrectable sectors (changed -165)

Tim Small | 10 Jul 2011 15:30

Re: Problem clearing SMART errors on WD 1T drive

On 10/07/11 01:48, Skye Sweeney wrote:
> Synopsis:
>
> I am at wits end and could use some pointers as to what I am doing
> wrong or if I just need to buy a new drive. I realize this may not the
> perfect forum for this question, and would be happy with just a
> pointer to the right place.
>
> I have been getting SMART errors on a backup drive on my Fedora 12
> file server. I have tried the instructions in "Bad block HOWTO for
> smartmontools" without avail. I have visited many websites and have
> not found anything more illuminating to my problem.
>
> The drive is a backup drive of my data drive using rdiff-backup once a
> night. It is a Western Digital SATA 1T full size drive. It is my
> /dev/sdc drive and has only one partition /dev/sdc1
>
> The file system used to be ext4, but since the instructions for fixing
> blocks only called out ext2/3 I formatted the drive to ext3 and used
> the following procedures to no avail.
>
> Details:
>
> I get the following in my email each day:
>
>  --------------------- Smartd Begin ------------------------
>
>
>  Currently unreadable (pending) sectors detected:
>        /dev/sdc [SAT] - 48 Time(s)
>        44 unreadable sectors detected
>
>  Offline uncorrectable sectors detected:
>        /dev/sdc [SAT] - 48 Time(s)
>        30 offline uncorrectable sectors detected
>
>  ---------------------- Smartd End -------------------------
>
> This ends up in /var/log/messages each day:
>
> Jul  9 19:53:58 tux smartd[1658]: Device: /dev/sdc [SAT], 44 Currently
> unreadable (pending) sectors
> Jul  9 19:53:58 tux smartd[1658]: Device: /dev/sdc [SAT], 30 Offline
> uncorrectable sectors (changed -165)
>
> The steps I took to try to fix these problems:
>
>
> 1) Get SMART info
>
> [root <at> tux ~]# smartctl -d ata -a /dev/sdc
> smartctl 5.39.1 2010-01-28 r3054 [i386-redhat-linux-gnu] (local build)
> Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
>
> === START OF INFORMATION SECTION ===
> Device Model:     WDC WD10EARS-00Y5B1
> Serial Number:    WD-WMAV51375649
> Firmware Version: 80.00A80
> User Capacity:    1,000,204,886,016 bytes
> Device is:        Not in smartctl database [for details use: -P showall]
> ATA Version is:   8
> ATA Standard is:  Exact ATA specification draft version not indicated
> Local Time is:    Tue Jun 28 20:13:46 2011 EDT
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
>
> smartctl 5.39.1 2010-01-28 r3054 [i386-redhat-linux-gnu] (local build)
> Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
>
> === START OF INFORMATION SECTION ===
> Device Model:     WDC WD10EARS-00Y5B1
> Serial Number:    WD-WMAV51375649
> Firmware Version: 80.00A80
> User Capacity:    1,000,204,886,016 bytes
> Device is:        Not in smartctl database [for details use: -P showall]
> ATA Version is:   8
> ATA Standard is:  Exact ATA specification draft version not indicated
> Local Time is:    Tue Jun 28 20:14:25 2011 EDT
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
>
> === START OF READ SMART DATA SECTION ===
> SMART overall-health self-assessment test result: PASSED
>
> General SMART Values:
> Offline data collection status:  (0x84)    Offline data collection
> activity
>                     was suspended by an interrupting command from host.
>                     Auto Offline Data Collection: Enabled.
> Self-test execution status:      (   0)    The previous self-test
> routine completed
>                     without error or no self-test has ever
>                     been run.
> Total time to complete Offline
> data collection:          (21300) seconds.
> Offline data collection
> capabilities:              (0x7b) SMART execute Offline immediate.
>                     Auto Offline data collection on/off support.
>                     Suspend Offline collection upon new
>                     command.
>                     Offline surface scan supported.
>                     Self-test supported.
>                     Conveyance Self-test supported.
>                     Selective Self-test supported.
> SMART capabilities:            (0x0003)    Saves SMART data before
> entering
>                     power-saving mode.
>                     Supports SMART auto save timer.
> Error logging capability:        (0x01)    Error logging supported.
>                     General Purpose Logging supported.
> Short self-test routine
> recommended polling time:      (   2) minutes.
> Extended self-test routine
> recommended polling time:      ( 245) minutes.
> Conveyance self-test routine
> recommended polling time:      (   5) minutes.
> SCT capabilities:            (0x3031)    SCT Status supported.
>                     SCT Feature Control supported.
>                     SCT Data Table supported.
>
> SMART Attributes Data Structure revision number: 16
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE     
> UPDATED  WHEN_FAILED RAW_VALUE
>   1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail 
> Always       -       0
>   3 Spin_Up_Time            0x0027   130   126   021    Pre-fail 
> Always       -       6475
>   4 Start_Stop_Count        0x0032   100   100   000    Old_age  
> Always       -       619
>   5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail 
> Always       -       0
>   7 Seek_Error_Rate         0x002e   100   253   000    Old_age  
> Always       -       0
>   9 Power_On_Hours          0x0032   088   088   000    Old_age  
> Always       -       9119
>  10 Spin_Retry_Count        0x0032   100   100   000    Old_age  
> Always       -       0
>  11 Calibration_Retry_Count 0x0032   100   100   000    Old_age  
> Always       -       0
>  12 Power_Cycle_Count       0x0032   100   100   000    Old_age  
> Always       -       143
> 192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age  
> Always       -       71
> 193 Load_Cycle_Count        0x0032   197   197   000    Old_age  
> Always       -       9117
> 194 Temperature_Celsius     0x0022   111   108   000    Old_age  
> Always       -       36
> 196 Reallocated_Event_Count 0x0032   200   200   000    Old_age  
> Always       -       0
> 197 Current_Pending_Sector  0x0032   199   199   000    Old_age  
> Always       -       253
> 198 Offline_Uncorrectable   0x0030   199   199   000    Old_age  
> Offline      -       195
> 199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age  
> Always       -       0
> 200 Multi_Zone_Error_Rate   0x0008   199   199   000    Old_age  
> Offline      -       291
>
> SMART Error Log Version: 1
> ATA Error Count: 805 (device log contains only the most recent five
> errors)
>     CR = Command Register [HEX]
>     FR = Features Register [HEX]
>     SC = Sector Count Register [HEX]
>     SN = Sector Number Register [HEX]
>     CL = Cylinder Low Register [HEX]
>     CH = Cylinder High Register [HEX]
>     DH = Device/Head Register [HEX]
>     DC = Device Command Register [HEX]
>     ER = Error register [HEX]
>     ST = Status register [HEX]
> Powered_Up_Time is measured from power on, and printed as
> DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
> SS=sec, and sss=millisec. It "wraps" after 49.710 days.
>
> Error 805 occurred at disk power-on lifetime: 9119 hours (379 days +
> 23 hours)
>   When the command that caused the error occurred, the device was
> active or idle.
>
>   After command completion occurred, registers were:
>   ER ST SC SN CL CH DH
>   -- -- -- -- -- -- --
>   40 51 08 67 6f 24 e1  Error: UNC 8 sectors at LBA = 0x01246f67 =
> 19165031
>
>   Commands leading to the command that caused the error were:
>   CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
>   -- -- -- -- -- -- -- --  ----------------  --------------------
>   c8 00 08 67 6f 24 e1 08      00:39:36.070  READ DMA
>   ec 00 00 00 00 00 a0 08      00:39:36.061  IDENTIFY DEVICE
>   ef 03 46 00 00 00 a0 08      00:39:36.058  SET FEATURES [Set
> transfer mode]
>
> Error 804 occurred at disk power-on lifetime: 9119 hours (379 days +
> 23 hours)
>   When the command that caused the error occurred, the device was
> active or idle.
>
>   After command completion occurred, registers were:
>   ER ST SC SN CL CH DH
>   -- -- -- -- -- -- --
>   40 51 08 67 6f 24 e1  Error: UNC 8 sectors at LBA = 0x01246f67 =
> 19165031
>
>   Commands leading to the command that caused the error were:
>   CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
>   -- -- -- -- -- -- -- --  ----------------  --------------------
>   c8 00 08 67 6f 24 e1 08      00:39:33.506  READ DMA
>   b0 d5 01 09 4f c2 00 08      00:39:33.494  SMART READ LOG
>   b0 d5 01 06 4f c2 00 08      00:39:33.490  SMART READ LOG
>   b0 d5 01 01 4f c2 00 08      00:39:33.485  SMART READ LOG
>   b0 d1 01 01 4f c2 00 08      00:39:33.477  SMART READ ATTRIBUTE
> THRESHOLDS [OBS-4]
>
> Error 803 occurred at disk power-on lifetime: 9119 hours (379 days +
> 23 hours)
>   When the command that caused the error occurred, the device was
> active or idle.
>
>   After command completion occurred, registers were:
>   ER ST SC SN CL CH DH
>   -- -- -- -- -- -- --
>   40 51 08 67 6f 24 e1  Error: UNC 8 sectors at LBA = 0x01246f67 =
> 19165031
>
>   Commands leading to the command that caused the error were:
>   CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
>   -- -- -- -- -- -- -- --  ----------------  --------------------
>   c8 00 08 67 6f 24 e1 08      00:39:30.754  READ DMA
>   ec 00 00 00 00 00 a0 08      00:39:30.746  IDENTIFY DEVICE
>   ef 03 46 00 00 00 a0 08      00:39:30.746  SET FEATURES [Set
> transfer mode]
>
> Error 802 occurred at disk power-on lifetime: 9119 hours (379 days +
> 23 hours)
>   When the command that caused the error occurred, the device was
> active or idle.
>
>   After command completion occurred, registers were:
>   ER ST SC SN CL CH DH
>   -- -- -- -- -- -- --
>   40 51 08 67 6f 24 e1  Error: UNC 8 sectors at LBA = 0x01246f67 =
> 19165031
>
>   Commands leading to the command that caused the error were:
>   CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
>   -- -- -- -- -- -- -- --  ----------------  --------------------
>   c8 00 08 67 6f 24 e1 08      00:39:28.178  READ DMA
>   ec 00 00 00 00 00 a0 08      00:39:28.169  IDENTIFY DEVICE
>   ef 03 46 00 00 00 a0 08      00:39:28.166  SET FEATURES [Set
> transfer mode]
>
> Error 801 occurred at disk power-on lifetime: 9119 hours (379 days +
> 23 hours)
>   When the command that caused the error occurred, the device was
> active or idle.
>
>   After command completion occurred, registers were:
>   ER ST SC SN CL CH DH
>   -- -- -- -- -- -- --
>   40 51 08 67 6f 24 e1  Error: UNC 8 sectors at LBA = 0x01246f67 =
> 19165031
>
>   Commands leading to the command that caused the error were:
>   CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
>   -- -- -- -- -- -- -- --  ----------------  --------------------
>   c8 00 08 67 6f 24 e1 08      00:39:25.615  READ DMA
>   ec 00 00 00 00 00 a0 08      00:39:25.607  IDENTIFY DEVICE
>   ef 03 46 00 00 00 a0 08      00:39:25.607  SET FEATURES [Set
> transfer mode]
>
> SMART Self-test log structure revision number 1
> Num  Test_Description    Status                  Remaining 
> LifeTime(hours)  LBA_of_first_error
> # 1  Extended offline    Completed: read failure       90%     
> 6491         2777760
> # 2  Short offline       Completed: read failure       40%     
> 6312         2773712
>
> SMART Selective self-test log data structure revision number 1
>  SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
>     1        0        0  Not_testing
>     2        0        0  Not_testing
>     3        0        0  Not_testing
>     4        0        0  Not_testing
>     5        0        0  Not_testing
> Selective self-test flags (0x0):
>   After scanning selected spans, do NOT read-scan remainder of disk.
> If Selective self-test is pending on power-up, resume after 0 minute
> delay.
>
> [root <at> tux]# smartclt -l selftest /dev/sd
> smartctl 5.39.1 2010-01-28 r3054 [i386-redhat-linux-gnu] (local build)
> Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
>
> === START OF READ SMART DATA SECTION ===
> SMART Self-test log structure revision number 1
> Num  Test_Description    Status                  Remaining 
> LifeTime(hours)  LBA_of_first_error
> # 1  Extended offline    Completed: read failure       90%     
> 6491         2777760
> # 2  Short offline       Completed: read failure       40%     
> 6312         2773712
>
> [root <at> tux]# smartctl -l selftest /dev/sdc
> smartctl 5.39.1 2010-01-28 r3054 [i386-redhat-linux-gnu] (local build)
> Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
>
> === START OF READ SMART DATA SECTION ===
> SMART Self-test log structure revision number 1
> Num  Test_Description    Status                  Remaining 
> LifeTime(hours)  LBA_of_first_error
> # 1  Extended offline    Completed: read failure       90%     
> 6491         2777760
> # 2  Short offline       Completed: read failure       40%     
> 6312         2773712
>
>
>
>
> 2) Get the block size
> [root <at> tux]# dumpe2fs /dev/sdc | grep "Block size"
> dumpe2fs 1.41.9 (22-Aug-2009)
> Block size:               4096

As has already been said, you don't really need to touch the e2fs tools
in this case, because you have the LBA directly from the drive.

>
>
> 3) LBA of bad chunk is 2773712
>
> 4) LBA of start of partition is (63)
>
> [root <at> tux]# # LBA of start of dev/sdc is:
> [root <at> tux]# fdisk -lu /dev/sdc
>
> Disk /dev/sdc: 1000.2 GB, 1000204886016 bytes
> 255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
> Units = sectors of 1 * 512 = 512 bytes
> Disk identifier: 0x0e30349b
>
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sdc1              63  1953520064   976760001   83  Linux
>
>
> 5) Compute offset
>
> (2773712-63)*512/4096 = 346706.125

That is an offset into the first partition, though, because you
subtracted the start LBA of the first partition. Your dd command writes
to the whole-disk device, not to the partition device, so you are going
to miss the bad sector by writing to a point 63 sectors too early.
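
To make the mismatch concrete, here is a rough sketch of the arithmetic, assuming 512-byte logical sectors and the partition start of 63 from the fdisk output. The function names are invented for illustration. Because the .125 fraction is dropped when the block number is truncated, the writes actually land 64 sectors early rather than 63.

```shell
start=63                  # first sector of /dev/sdc1, from the fdisk output
spb=$(( 4096 / 512 ))     # 512-byte sectors per 4096-byte block

# 4096-byte block number, relative to the partition, that contains a disk LBA
part_block_of_lba() { echo $(( ($1 - start) / spb )); }

# First disk LBA touched by "dd of=/dev/sdc bs=4096 seek=BLOCK"
disk_lba_of_block() { echo $(( $1 * spb )); }

for lba in 2773712 2777760; do
    blk=$(part_block_of_lba "$lba")
    hit=$(disk_lba_of_block "$blk")
    echo "LBA $lba: block $blk on /dev/sdc starts at LBA $hit, $(( lba - hit )) sectors early"
done
```

The same block numbers would cover the reported LBAs if they were written to the partition device /dev/sdc1 instead, since the partition's own offset of 63 sectors is then added back.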

dd if=/dev/zero of=/dev/sdc bs=512 count=1 seek=2773712 would get that
first block, but if you're going to reformat the drive anyway, or already
have, perhaps you'd be better off just writing zeros to the whole disk?

If you haven't actually nuked the entire disk contents already, then
perhaps you'd better do a read check to verify that the sector in
question is in fact still bad before writing over it:

dd if=/dev/sdc of=/dev/null bs=512 count=1 skip=2773712

If that fails, then the sector is still bad / unreadable.
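
The same read check could be wrapped in a small helper so it can be repeated for each LBA in the self-test log. This is only a sketch: the name read_check and the default sector size of 512 bytes are assumptions, and it relies on nothing more than dd returning a non-zero exit status when the read fails.

```shell
# read_check DEVICE LBA [SECTOR_SIZE]
# Reads a single sector and reports whether the read succeeded.
read_check() {
    dev=$1; lba=$2; ss=${3:-512}
    if dd if="$dev" of=/dev/null bs="$ss" count=1 skip="$lba" 2>/dev/null; then
        echo "LBA $lba on $dev: readable"
    else
        echo "LBA $lba on $dev: read failed"
        return 1
    fi
}

# Example (as root):
#   read_check /dev/sdc 2773712
#   read_check /dev/sdc 2777760
```

With GNU dd it may be worth adding iflag=direct so that the read is not served from the page cache.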

Tim.

Tim Small | 10 Jul 2011 15:34

Re: Problem clearing SMART errors on WD 1T drive

On 10/07/11 10:35, Alex Samorukov wrote:
> My recommendation is to put this drive in trashcan/RMA ASAP. It does 
> make a sense to repair the drive if you have 1-2 pending sectors, but in 
> your case i think drive will die soon.

Probably, but not definitely.  I've had a drive develop a run of hundreds
of bad sectors in one location (maybe a bit of contamination scratched a
track or something), and then go on for years with no further problems.

I've also had chassis vibration cause bad writes (which then show up as
UNC sectors) with no physical problem at all; rewriting the drive let
those sectors be reused without being reallocated.

That having been said, the SMART output shows bad sectors in at least
two different locations.

Tim.

Geoff Keating | 10 Jul 2011 22:54

Re: Problem clearing SMART errors on WD 1T drive


On 09/07/2011, at 5:48 PM, Skye Sweeney wrote:

> Synopsis:
> 
> I am at wits end and could use some pointers as to what I am doing wrong or if I just need to buy a new drive. I
> realize this may not the perfect forum for this question, and would be happy with just a pointer to the right place.
> 
> I have been getting SMART errors on a backup drive on my Fedora 12 file server. I have tried the instructions
> in "Bad block HOWTO for smartmontools" without avail. I have visited many websites and have not found
> anything more illuminating to my problem. 

One thing that may not be obvious is that your drive doesn't have just one or two bad blocks. It had

198 Offline_Uncorrectable   0x0030   199   199   000    Old_age   Offline      -       195

nearly two hundred, and this has been reduced to

Jul  9 19:53:58 tux smartd[1658]: Device: /dev/sdc [SAT], 44 Currently unreadable (pending) sectors
Jul  9 19:53:58 tux smartd[1658]: Device: /dev/sdc [SAT], 30 Offline uncorrectable sectors (changed -165)

74 now. (This may not be as bad as it sounds; there might just be a run of bad blocks due to a scratch or a
defect on the disk.)  The commands you're using could have fixed at most 16 sectors.
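
As a quick check of those figures, a minimal sketch, assuming 512-byte sectors and treating the two smartd counters as simply additive, as the paragraph above does:

```shell
sector=512
written=$(( 2 * 4096 / sector ))   # two dd writes of one 4096-byte block each
pending=44                         # "Currently unreadable (pending) sectors"
uncorrectable=30                   # "Offline uncorrectable sectors"
remaining=$(( pending + uncorrectable ))

echo "sectors covered by the two dd writes: $written"
echo "sectors still reported by smartd:     $remaining"
```

Even if both writes had landed on the right LBAs, they would have touched only a fraction of the reported sectors.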

If there's nothing valuable on the disk, I would suggest just writing zeros to the whole disk, with

dd if=/dev/zero of=/dev/sdc bs=1M

or similar.  Obviously THIS WILL ERASE EVERYTHING ON THE DISK, so you might want to double-check there's
really nothing on it you want.

Skye Sweeney (FLL-Freak | 12 Jul 2011 03:34

Re: Problem clearing SMART errors on WD 1T drive

Geoff, and Tim,

I have spent the time since my posting reviewing your suggestions and 
implementing them. Since I have a robot that burns all new data to DVD disc 
every night, I was able to nuke this live backup disc without significant 
danger.

I took the suggestion to use dd to write all zeros to the whole (/dev/sdc) 
disk. I then repartitioned and formatted the drive and copied the prime 
drive to this backup disc. Finally I rebooted, and I now have fewer errors. I 
am left with an email from the machine:

Device: /dev/sdc [SAT], 30 Offline uncorrectable sectors

Is having 30 Offline uncorrectable sectors a "Bad Thing"? Should I be buying 
a replacement disc? Or does it mean that "30 sectors are bad and will not be 
used anymore, so relax!"?

Thanks for the help. I like the fact that you were able to point out errors 
that I had made. Nice to learn something.

-Skye

> Synopsis:
>
> I am at wits end and could use some pointers as to what I am doing wrong 
> or if I just need to buy a new drive. I realize this may not the perfect 
> forum for this question, and would be happy with just a pointer to the 
> right place.
>
> I have been getting SMART errors on a backup drive on my Fedora 12 file 
> server. I have tried the instructions in "Bad block HOWTO for 
> smartmontools" without avail. I have visited many websites and have not 
> found anything more illuminating to my problem.

[Trim]

