Carson Gaspar | 1 Oct 2009 01:40

Re: Help importing pool with "offline" disk

Carson Gaspar wrote:
> Victor Latushkin wrote:
>> Carson Gaspar wrote:

>> is zdb happy with your pool?
>>
>> Try e.g.
>>
>> zdb -eud <poolname>
> 
> I'm booted back into snv118 (booting with the damaged pool disks 
> disconnected so the host would come up without throwing up). After hot 
> plugging the disks, I get:
> 
> bash-3.2# /usr/sbin/zdb -eud media
> zdb: can't open media: File exists
> 
> "zpool status media" is hanging, and top shows that I'm spending ~50% of 
> CPU time in the kernel - I'll see what it says when it finally returns. 
> Let me know if there's anything else I can do to help you help me, 
> including giving you a login in the server.

OK, things are now different (possibly better?):

bash-3.2# /usr/sbin/zpool status media
   pool: media
  state: FAULTED
status: One or more devices could not be opened.  There are insufficient
         replicas for the pool to continue functioning.
action: Attach the missing device and online it using 'zpool online'.

Carson Gaspar | 1 Oct 2009 01:44

Re: Help importing pool with "offline" disk

Carson Gaspar wrote:
> Carson Gaspar wrote:
>> Victor Latushkin wrote:
>>> Carson Gaspar wrote:
> 
>>> is zdb happy with your pool?
>>>
>>> Try e.g.
>>>
>>> zdb -eud <poolname>
>>
>> I'm booted back into snv118 (booting with the damaged pool disks 
>> disconnected so the host would come up without throwing up). After hot 
>> plugging the disks, I get:
>>
>> bash-3.2# /usr/sbin/zdb -eud media
>> zdb: can't open media: File exists
>>
>> "zpool status media" is hanging, and top shows that I'm spending ~50% 
>> of CPU time in the kernel - I'll see what it says when it finally 
>> returns. Let me know if there's anything else I can do to help you 
>> help me, including giving you a login in the server.
> 
> OK, things are now different (possibly better?):
> 
> bash-3.2# /usr/sbin/zpool status media
>   pool: media
>  state: FAULTED
> status: One or more devices could not be opened.  There are insufficient
>         replicas for the pool to continue functioning.

Brandon High | 1 Oct 2009 02:03

Re: Comments on home OpenSolaris/ZFS server

On Mon, Sep 28, 2009 at 1:12 PM, Ware Adams <rwalists <at> washdcmail.com> wrote:
> SuperMicro 7046A-3 Workstation
> http://supermicro.com/products/system/4U/7046/SYS-7046A-3.cfm

I'm using a SuperChassis 743TQ-865B-SQ for my home NAS, which is what
that workstation uses. It's very LARGE and very quiet. Did I mention
it's HUGE? I bought two more 2800 rpm fans for it. The case is
designed for four fans but ships with only two to keep noise down; I
didn't notice an increase in sound after adding them. You can find the
fans (part # FAN-0104L4) online.

I think the dual socket board you chose is a bit overkill for just a
NAS box. I used an ASUS motherboard because I wanted to use AMD, and
went with a 4850e and 8GB ECC memory. It got me a board that supports
ECC and PCI-X slots (so I could use the AOC-SAT-MV8 board). I also
host some (mostly idle) VMs on the machine and they run fine.

Supermicro has a 3 x 5.25" bay rack that holds 5 x 3.5" drives. This
doesn't leave space for an optical drive, but I used a USB drive to
install the OS and don't need it anymore.

-B

-- 
Brandon High : bhigh <at> freaks.com
If it wasn't for pacifists, we could achieve peace.
Brandon High | 1 Oct 2009 02:06

"Hot Space" vs. hot spares

I might have this mentioned already on the list and can't find it now,
or I might have misread something and come up with this ...

Right now, using hot spares is a typical method to increase storage
pool resiliency, since it minimizes the time that an array is
degraded. The downside is that drives assigned as hot spares are
essentially wasted. They take up space & power but don't provide
usable storage.

Depending on the number of spares you've assigned, you could have 7%
of your purchased capacity idle, assuming 1 spare per 14-disk shelf.
This is on top of the RAID6 / raidz[1-3] overhead.
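
As a rough check of the 7% figure, here is a minimal Python sketch. It
assumes equal-sized disks and that the one spare is counted among the 14
disks in the shelf; the function name is only illustrative.

```python
def idle_spare_fraction(spares: int, disks_per_shelf: int) -> float:
    """Fraction of purchased disks that sit idle as hot spares.

    Assumes all disks are the same size and that the spares are
    counted among the disks in the shelf.
    """
    if disks_per_shelf <= 0 or not 0 <= spares <= disks_per_shelf:
        raise ValueError("need 0 <= spares <= disks_per_shelf and a non-empty shelf")
    return spares / disks_per_shelf


# One spare in a 14-disk shelf: 1/14 of the purchased disks are idle.
print(f"{idle_spare_fraction(1, 14):.1%}")  # prints 7.1%
```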

What about using the free space in the pool to cover for the failed drive?

With bp rewrite, would it be possible to rebuild the vdev from parity
and simultaneously rewrite those blocks to a healthy device? In other
words, when there is free space, remove the failed device from the
zpool, resizing (shrinking) it on the fly and restoring full parity
protection for your data. If online shrinking doesn't work, create a
phantom file that accounts for all the space lost by the removal of
the device until an export / import.

It's not something I'd want to do with less than raidz2 protection,
and I imagine that replacing the failed device and expanding the
stripe width back to the original would have some negative performance
implications that would not occur otherwise. I also imagine it would
take a lot longer to rebuild / resilver at both device failure and
device replacement. You wouldn't be able to share a spare among many
vdevs either, but you wouldn't always need to if you leave some space

Ian Collins | 1 Oct 2009 02:07

Re: receive restarting a resilver

> I have a raidz2 pool on an x4500 running Solaris 10 update 7.
> 
> One of the drives has been replaced with a spare (too many errors), but 
> the resilver restarts every time data is replicated
> to the pool with zfs receive.
> 
> I thought this problem was fixed long ago?

The bug was reported as 6705765, which was closed as a duplicate of 6655927.
Unfortunately, that bug only mentions, and provides a workaround for,
zpool status.

Is the problem with zfs receive down to the same root cause?  If so, is there
a workaround other than suspending replication to this pool?

I'd rather not do that, as this system is a fall-back backup server.

Thanks,

-- 
Ian.
-- 
This message posted from opensolaris.org
Tim Cook | 1 Oct 2009 02:53

Re: "Hot Space" vs. hot spares



On Wed, Sep 30, 2009 at 7:06 PM, Brandon High <bhigh <at> freaks.com> wrote:
> I might have this mentioned already on the list and can't find it now,
> or I might have misread something and come up with this ...
>
> Right now, using hot spares is a typical method to increase storage
> pool resiliency, since it minimizes the time that an array is
> degraded. The downside is that drives assigned as hot spares are
> essentially wasted. They take up space & power but don't provide
> usable storage.
>
> Depending on the number of spares you've assigned, you could have 7%
> of your purchased capacity idle, assuming 1 spare per 14-disk shelf.
> This is on top of the RAID6 / raidz[1-3] overhead.
>
> What about using the free space in the pool to cover for the failed drive?
>
> With bp rewrite, would it be possible to rebuild the vdev from parity
> and simultaneously rewrite those blocks to a healthy device? In other
> words, when there is free space, remove the failed device from the
> zpool, resizing (shrinking) it on the fly and restoring full parity
> protection for your data. If online shrinking doesn't work, create a
> phantom file that accounts for all the space lost by the removal of
> the device until an export / import.
>
> It's not something I'd want to do with less than raidz2 protection,
> and I imagine that replacing the failed device and expanding the
> stripe width back to the original would have some negative performance
> implications that would not occur otherwise. I also imagine it would
> take a lot longer to rebuild / resilver at both device failure and
> device replacement. You wouldn't be able to share a spare among many
> vdevs either, but you wouldn't always need to if you leave some space
> free on the zpool.
>
> Provided that bp rewrite is committed, and vdev & zpool shrinks are
> functional, could this work? It seems like a feature most applicable
> to SOHO users, but I'm sure some enterprise users could find an
> application for nearline storage where available space trumps
> performance.
>
> -B
>
> --
> Brandon High : bhigh <at> freaks.com
> Always try to do things in chronological order; it's less confusing that way.


What are you hoping to accomplish?  You're still going to need a drive's worth of free space, and if you're so performance strapped that one drive makes the difference, you've got some bigger problems on your hands.

To me it sounds like complexity for complexity's sake, and leaving yourself with a far less flexible option in the face of a drive failure.

BTW, you shouldn't need one spare disk per tray of 14 disks.  Unless you've got some known bad disks/environmental issues, one spare for every 2-3 trays should be fine.  Quite frankly, if you're doing raid-z3, I'd feel comfortable with one per thumper.

--Tim
_______________________________________________
zfs-discuss mailing list
zfs-discuss <at> opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Erik Trimble | 1 Oct 2009 02:56

Re: "Hot Space" vs. hot spares

Brandon High wrote:
> I might have this mentioned already on the list and can't find it now,
> or I might have misread something and come up with this ...
>
> Right now, using hot spares is a typical method to increase storage
> pool resiliency, since it minimizes the time that an array is
> degraded. The downside is that drives assigned as hot spares are
> essentially wasted. They take up space & power but don't provide
> usable storage.
>
> Depending on the number of spares you've assigned, you could have 7%
> of your purchased capacity idle, assuming 1 spare per 14-disk shelf.
> This is on top of the RAID6 / raidz[1-3] overhead.
>
> What about using the free space in the pool to cover for the failed drive?
>
> With bp rewrite, would it be possible to rebuild the vdev from parity
> and simultaneously rewrite those blocks to a healthy device? In other
> words, when there is free space, remove the failed device from the
> zpool, resizing (shrinking) it on the fly and restoring full parity
> protection for your data. If online shrinking doesn't work, create a
> phantom file that accounts for all the space lost by the removal of
> the device until an export / import.
>
> It's not something I'd want to do with less than raidz2 protection,
> and I imagine that replacing the failed device and expanding the
> stripe width back to the original would have some negative performance
> implications that would not occur otherwise. I also imagine it would
> take a lot longer to rebuild / resilver at both device failure and
> device replacement. You wouldn't be able to share a spare among many
> vdevs either, but you wouldn't always need to if you leave some space
> free on the zpool.
>
> Provided that bp rewrite is committed, and vdev & zpool shrinks are
> functional, could this work? It seems like a feature most applicable
> to SOHO users, but I'm sure some enterprise users could find an
> application for nearline storage where available space trumps
> performance.
>
> -B
>
>   
What you describe makes no sense for single-parity vdevs, since it 
actually increases the likelihood of data loss. In multi-parity vdevs, 
even with the loss of one drive, you still have full parity protection, 
so what would all that extra effort gain you?

From a global perspective, multi-disk parity (e.g. raidz2 or raidz3) is 
the way to go instead of hot spares. 

Hot spares are useful for adding protection to a number of vdevs, not a 
single vdev.

-- 
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
Matthew Ahrens | 1 Oct 2009 03:01

Re: "Hot Space" vs. hot spares

Brandon,

Yes, this is something that should be possible once we have bp rewrite (the 
ability to move blocks around).  One minor downside to "hot space" would be 
that it couldn't be shared among multiple pools the way that hot spares can.

Also, depending on the pool configuration, hot space may be impractical, for 
example if you are using wide RAIDZ[-N] stripes.  If you have say 4 top-level 
RAIDZ-2 vdevs each with 10 disks in it, you would have to keep your pool at 
most 3/4 full to be able to take advantage of hot space.  And if you wanted 
to tolerate any 2 disks failing, the pool could be at most 1/2 full. 
(Although one could imagine eventually recombining some of the remaining 18 
good disks to make another RAIDZ group.)
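
The arithmetic above can be sketched in a few lines of Python. This is only 
an illustration under stated assumptions: equal-sized top-level vdevs, data 
spread evenly across them, hot space implemented by removing a whole faulted 
top-level vdev, and the worst case where each failed disk lands in a 
different vdev. The function name is made up for the example.

```python
from fractions import Fraction


def max_fill_for_hot_space(top_level_vdevs: int, failures_to_tolerate: int) -> Fraction:
    """Largest fraction of the pool that may be in use if every faulted
    top-level vdev has to be evacuated onto the surviving ones.

    Worst case assumed: each failed disk is in a different top-level vdev,
    so tolerating N failures means being able to drop N whole vdevs.
    """
    if top_level_vdevs < 1:
        raise ValueError("need at least one top-level vdev")
    if not 0 <= failures_to_tolerate <= top_level_vdevs:
        raise ValueError("failures_to_tolerate must be between 0 and the vdev count")
    surviving = top_level_vdevs - failures_to_tolerate
    return Fraction(surviving, top_level_vdevs)


# 4 top-level RAIDZ-2 vdevs, as in the example above.
print(max_fill_for_hot_space(4, 1))  # prints 3/4
print(max_fill_for_hot_space(4, 2))  # prints 1/2
```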

So I imagine that with this implementation at least (remove faulted top-level 
vdev), Hot Space would only be practical when using mirroring.  That said, 
once we have (top-level) device removal implemented, you could implement a 
poor-man's hot space with some simple scripts -- just remove the degraded 
top-level vdev from the pool.
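
A minimal sketch of what such a script might look like, in Python. It is 
hypothetical: it assumes a ZFS release whose `zpool remove` accepts top-level 
vdevs (not the case as of this message), it relies on the usual two-space 
indentation of the `zpool status` config table, and the sample output, pool 
name, and device names are made up. It only builds the commands; it does not 
run them.

```python
from typing import List

# Made-up `zpool status` output for illustration only;
# the pool and device names are hypothetical.
SAMPLE_STATUS = """\
  pool: tank
 state: DEGRADED
config:

        NAME        STATE     READ WRITE CKSUM
        tank        DEGRADED     0     0     0
          mirror-0  DEGRADED     0     0     0
            c1t0d0  ONLINE       0     0     0
            c1t1d0  UNAVAIL      0     0     0  cannot open
          mirror-1  ONLINE       0     0     0
            c1t2d0  ONLINE       0     0     0
            c1t3d0  ONLINE       0     0     0

errors: No known data errors
"""


def degraded_top_level_vdevs(status_text: str, pool: str) -> List[str]:
    """Return names of DEGRADED vdevs listed directly under the pool row."""
    pool_indent = None
    found = []
    for line in status_text.expandtabs(8).splitlines():
        fields = line.split()
        indent = len(line) - len(line.lstrip())
        if pool_indent is None:
            # Look for the pool's own row in the config table.
            if len(fields) >= 2 and fields[0] == pool:
                pool_indent = indent
            continue
        if not fields or indent <= pool_indent:
            break  # end of this pool's vdev tree
        # Top-level vdevs are indented two spaces deeper than the pool row.
        if indent == pool_indent + 2 and len(fields) >= 2 and fields[1] == "DEGRADED":
            found.append(fields[0])
    return found


def removal_commands(status_text: str, pool: str) -> List[List[str]]:
    """Build (but do not run) one `zpool remove` command per degraded vdev."""
    return [["zpool", "remove", pool, vdev]
            for vdev in degraded_top_level_vdevs(status_text, pool)]


if __name__ == "__main__":
    for cmd in removal_commands(SAMPLE_STATUS, "tank"):
        print(" ".join(cmd))
```

A real script would feed in live `zpool status` output and check that the 
remaining vdevs have enough free space before removing anything.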

FYI, I am currently working on bprewrite for device removal.

--matt

Brandon High wrote:
> I might have this mentioned already on the list and can't find it now,
> or I might have misread something and come up with this ...
> 
> Right now, using hot spares is a typical method to increase storage
> pool resiliency, since it minimizes the time that an array is
> degraded. The downside is that drives assigned as hot spares are
> essentially wasted. They take up space & power but don't provide
> usable storage.
> 
> Depending on the number of spares you've assigned, you could have 7%
> of your purchased capacity idle, assuming 1 spare per 14-disk shelf.
> This is on top of the RAID6 / raidz[1-3] overhead.
> 
> What about using the free space in the pool to cover for the failed drive?
> 
> With bp rewrite, would it be possible to rebuild the vdev from parity
> and simultaneously rewrite those blocks to a healthy device? In other
> words, when there is free space, remove the failed device from the
> zpool, resizing (shrinking) it on the fly and restoring full parity
> protection for your data. If online shrinking doesn't work, create a
> phantom file that accounts for all the space lost by the removal of
> the device until an export / import.
> 
> It's not something I'd want to do with less than raidz2 protection,
> and I imagine that replacing the failed device and expanding the
> stripe width back to the original would have some negative performance
> implications that would not occur otherwise. I also imagine it would
> take a lot longer to rebuild / resilver at both device failure and
> device replacement. You wouldn't be able to share a spare among many
> vdevs either, but you wouldn't always need to if you leave some space
> free on the zpool.
> 
> Provided that bp rewrite is committed, and vdev & zpool shrinks are
> functional, could this work? It seems like a feature most applicable
> to SOHO users, but I'm sure some enterprise users could find an
> application for nearline storage where available space trumps
> performance.
> 
> -B
> 
Matthew Ahrens | 1 Oct 2009 03:03

Re: "Hot Space" vs. hot spares

Erik Trimble wrote:
>  From a global perspective, multi-disk parity (e.g. raidz2 or raidz3) is 
> the way to go instead of hot spares.
> Hot spares are useful for adding protection to a number of vdevs, not a 
> single vdev.

Even when using raidz2 or 3, it is useful to have hot spares so that 
reconstruction can begin immediately.  Otherwise it would have to wait for 
the operator to physically remove the failed disk and insert a new one.

--matt
Jorgen Lundman | 1 Oct 2009 03:22

Re: Comments on home OpenSolaris/ZFS server


I too went with a 5in3 case for HDDs, in a nice portable Mini-ITX case, with 
Intel Atom. More of a SOHO NAS for home use, rather than a beast. Still, I can 
get about 10TB in it.

http://lundman.net/wiki/index.php/ZFS_RAID

I can also recommend the embeddedSolaris project for making a small bootable 
Solaris. Very flexible and can put on the Admin GUIs, and so on.

https://sourceforge.net/projects/embeddedsolaris/

Lund

-- 
Jorgen Lundman       | <lundman <at> lundman.net>
Unix Administrator   | +81 (0)3 -5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo    | +81 (0)90-5578-8500          (cell)
Japan                | +81 (0)3 -3375-1767          (home)
