Re: "Hot Space" vs. hot spares
Matthew Ahrens <Matthew.Ahrens <at> sun.com>
2009-10-01 01:01:15 GMT
Yes, this is something that should be possible once we have bp rewrite (the
ability to move blocks around). One minor downside to "hot space" would be
that it couldn't be shared among multiple pools the way that hot spares can.
Also, depending on the pool configuration, hot space may be impractical; for
example, if you are using wide RAIDZ[-N] stripes. If you have, say, 4 top-level
RAIDZ-2 vdevs, each with 10 disks, you would have to keep your pool at
most 3/4 full to be able to take advantage of hot space. And if you wanted
to tolerate any 2 disks failing, the pool could be at most 1/2 full.
(Although one could imagine eventually recombining some of the remaining 18
good disks to make another RAIDZ group.)
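The capacity arithmetic above can be sketched as follows (a small illustrative calculation only; the vdev count is taken from the 4 x 10-disk RAIDZ-2 example above):

```shell
#!/bin/sh
# Capacity ceiling for "hot space": with v equal-size top-level vdevs,
# being able to evacuate f failed vdevs means the pool can be at most
# (v - f)/v full, since the survivors must absorb the evacuated data.
v=4   # 4 top-level RAIDZ-2 vdevs, as in the example above
for f in 1 2; do
    printf 'to evacuate %d vdev(s): pool at most %d/%d full\n' "$f" "$((v - f))" "$v"
done
```

which prints the 3/4 and 2/4 (i.e. 1/2) ceilings given above.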
So I imagine that with this implementation at least (removing the faulted
top-level vdev), hot space would only be practical when using mirroring. That said,
once we have (top-level) device removal implemented, you could implement a
poor-man's hot space with some simple scripts -- just remove the degraded
top-level vdev from the pool.
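A sketch of such a script might look like this. It is purely illustrative: the pool name "tank" is an assumption, and it presumes top-level device removal ("zpool remove" of a degraded vdev) is actually implemented, which as noted above may only be practical for mirrors:

```shell
#!/bin/sh
# Poor-man's hot space (sketch, not production code): find DEGRADED
# top-level vdevs in "zpool status" output and evacuate each one with
# "zpool remove".  Requires top-level device removal support.
POOL=tank   # assumed pool name, for illustration

# Print the names of top-level vdevs (raidz*/mirror*) whose state
# column reads DEGRADED.  Reads "zpool status" text on stdin.
degraded_vdevs() {
    awk '$1 ~ /^(raidz|mirror)/ && $2 == "DEGRADED" { print $1 }'
}

zpool status "$POOL" | degraded_vdevs | while read -r vdev; do
    echo "evacuating degraded vdev $vdev from $POOL"
    zpool remove "$POOL" "$vdev"
done
```

A real version would also want to check that enough free space remains before evacuating, per the capacity arithmetic earlier in this message.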
FYI, I am currently working on bp rewrite for device removal.
Brandon High wrote:
> I might have seen this mentioned already on the list and can't find it now,
> or I might have misread something and come up with this ...
> Right now, using hot spares is a typical method to increase storage
> pool resiliency, since it minimizes the time that an array is
> degraded. The downside is that drives assigned as hot spares are
> essentially wasted. They take up space & power but don't provide
> usable storage.
> Depending on the number of spares you've assigned, you could have 7%
> of your purchased capacity idle, assuming 1 spare per 14-disk shelf.
> This is on top of the RAID6 / raidz[1-3] overhead.
> What about using the free space in the pool to cover for the failed drive?
> With bp rewrite, would it be possible to rebuild the vdev from parity
> and simultaneously rewrite those blocks to a healthy device? In other
> words, when there is free space, remove the failed device from the
> zpool, resizing (shrinking) it on the fly and restoring full parity
> protection for your data. If online shrinking doesn't work, create a
> phantom file that accounts for all the space lost by the removal of
> the device until an export / import.
> It's not something I'd want to do with less than raidz2 protection,
> and I imagine that replacing the failed device and expanding the
> stripe width back to the original would have some negative performance
> implications that would not occur otherwise. I also imagine it would
> take a lot longer to rebuild / resilver at both device failure and
> device replacement. You wouldn't be able to share a spare among many
> vdevs either, but you wouldn't always need to if you leave some space
> free on the zpool.
> Provided that bp rewrite is committed, and vdev & zpool shrinks are
> functional, could this work? It seems like a feature most applicable
> to SOHO users, but I'm sure some enterprise users could find an
> application for nearline storage where available space trumps