Peter Holm | 2 May 21:39 2006

Stress testing the UFS2 filesystem

I had a chance to look some more at how the UFS2 filesystem code
handles a corrupt filesystem. I have made a web page describing the
tests and my findings:

http://people.freebsd.org/~pho/baddir.html

My daytime job will probably prevent me from looking further at
this any time soon, so if anyone finds this of interest I can make
the corrupted filesystems available.
-- 
Peter Holm
_______________________________________________
freebsd-fs <at> freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-fs
To unsubscribe, send any mail to "freebsd-fs-unsubscribe <at> freebsd.org"

Pavel Merdine | 2 May 22:32 2006

Re: Stress testing the UFS2 filesystem

Hello,

Thank you for raising this problem again. I already tried to do that
on this list, but the answer I received was that the kernel is intended
to behave this way. For example, suppose you have a faulty disk, and a
faulty sector happens to fall where a directory is stored. Each time
the kernel reads this sector it panics, so at first it is hard even to
understand what is happening. Each panic also leads to corruption and
lost files on the other file systems. Imagine you have 15 disks: you
lose many files just because of one small (and insignificant) fault.
That is nonsense.
Personally, I just replaced the bad_dir panic with an error return.
By the way, there was a bug in the kernel filesystem code that could
lead to a panic even on a clean filesystem (bad_dir, as far as I
remember). It is very rare and it was fixed in DragonFly. As far as I
remember, a fix for this was also committed to current recently.

I think that Linux is usually much smarter about this. By default it
remounts a file system read-only when it detects filesystem
corruption. I would be very happy if FreeBSD could do the same,
because fs panics really hurt when you have many systems with disks.
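
The three behaviours mentioned here (panic, error return, remount
read-only) can be compared in a small toy model. This is only a sketch
in Python, not FreeBSD kernel code: ToyFilesystem, OnError, KernelPanic
and the sanity check are hypothetical names invented for illustration,
and the check is far simpler than what a real directory lookup
verifies. On Linux the corresponding choice for ext2/ext3 is the
errors=continue|remount-ro|panic mount option.

```python
import enum
import errno


class OnError(enum.Enum):
    """Hypothetical policy names; they mirror the three options discussed."""
    PANIC = "panic"
    ERROR_RETURN = "error-return"
    REMOUNT_RO = "remount-ro"


class KernelPanic(Exception):
    """Stands in for a kernel panic: in reality the whole machine stops."""


class ToyFilesystem:
    def __init__(self, name, policy):
        self.name = name
        self.policy = policy
        self.read_only = False

    @staticmethod
    def _entry_is_sane(entry):
        # Simplified assumption: the record length must be positive
        # and large enough to hold the entry name.
        return entry["reclen"] > 0 and len(entry["name"]) <= entry["reclen"]

    def read_dir_entry(self, entry):
        """Return the entry name, or react to corruption per the policy."""
        if self._entry_is_sane(entry):
            return entry["name"]
        if self.policy is OnError.PANIC:
            raise KernelPanic(f"{self.name}: bad dir")
        if self.policy is OnError.REMOUNT_RO:
            # Stop further writes to the damaged filesystem only.
            self.read_only = True
        # Both non-panic policies report the failure to the caller.
        raise OSError(errno.EIO, f"{self.name}: corrupt directory entry")
```

With OnError.REMOUNT_RO the caller sees EIO and only the damaged
filesystem is switched to read-only; with OnError.PANIC every
filesystem on the machine is affected by the crash.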

Of course I think we could write patches to avoid the corrupting panics,
but the core FreeBSD team would not accept them, as they are happy
with the panics and the corruption they cause on other filesystems.

Tuesday, May 2, 2006, 11:39:00 PM, you wrote:

> I had a chance to look some more at how the UFS2 filesystem code
> handles a corrupt filesystem. I have made a web page describing the
> tests and my findings:

Kris Kennaway | 3 May 00:13 2006

Re: Stress testing the UFS2 filesystem

On Wed, May 03, 2006 at 12:32:29AM +0400, Pavel Merdine wrote:
> Hello ,
> 
> Thank you for raising this problem again. I already tried to do that
> in that list, but received an answer that kernel is intended to do
> that. For example, you have a faulty disk. And you have a faulty
> sector which happened to occur on the directory place. So each time
> kernel reads this sector it panics. So it's initially hard to even
> understand what happens. And also it leads to corruption and lost
> files on other file system (each time). Imagine if you have 15 disks.
> In this case you have many files lost just because of a small (and not
> significant) fault. It's just a nonsense.
> Personally, I just replaced bad_dir with error return.
> By the way, there was some bug in fs in kernel that could lead to
> panic even on clean filesystem (bad_dir as far as I remember). It is
> very rare and it was fixed on DragonFly. As far as I remember a fix
> for this was also committed to current recently.
> 
> I think that Linux is usually much smarter on this. By default it
> remounts a file system as read-only in case it detects a filesystem
> corruption. I would be very happy if FreeBSD could do the same,
> because fs panics really hurt when you have many systems with disks.
> 
> Of course I think we could do patches to overcome corrupting panics,
> but the core FreeBSD team would not accept this, as they are happy
> with panics and corruptions they make to other filesystems.

Of course not, don't make silly accusations :-)

The problem is much more difficult to solve than "making the panic an
error return".

Scott Long | 3 May 02:21 2006

Re: Stress testing the UFS2 filesystem

Pavel Merdine wrote:
> Hello ,
> 
> Thank you for raising this problem again. I already tried to do that
> in that list, but received an answer that kernel is intended to do
> that. For example, you have a faulty disk. And you have a faulty
> sector which happened to occur on the directory place. So each time
> kernel reads this sector it panics. So it's initially hard to even
> understand what happens. And also it leads to corruption and lost
> files on other file system (each time). Imagine if you have 15 disks.
> In this case you have many files lost just because of a small (and not
> significant) fault. It's just a nonsense.
> Personally, I just replaced bad_dir with error return.
> By the way, there was some bug in fs in kernel that could lead to
> panic even on clean filesystem (bad_dir as far as I remember). It is
> very rare and it was fixed on DragonFly. As far as I remember a fix
> for this was also committed to current recently.
> 
> I think that Linux is usually much smarter on this. By default it
> remounts a file system as read-only in case it detects a filesystem
> corruption. I would be very happy if FreeBSD could do the same,
> because fs panics really hurt when you have many systems with disks.
> 
> Of course I think we could do patches to overcome corrupting panics,
> but the core FreeBSD team would not accept this, as they are happy
> with panics and corruptions they make to other filesystems.

You were so close to having an interesting email, and then you decided
to expose yourself for the troll that you are.  Thanks for the input,
it will be ignored appropriately in the future.

Björn König | 3 May 07:48 2006

Re: Stress testing the UFS2 filesystem

Kris Kennaway wrote:
> On Wed, May 03, 2006 at 12:32:29AM +0400, Pavel Merdine wrote:
>>Of course I think we could do patches to overcome corrupting panics,
>>but the core FreeBSD team would not accept this, as they are happy
>>with panics and corruptions they make to other filesystems.
> 
> Of course not, don't make silly accusations :-)
> 
> The problem is much more difficult to solve than "making the panic an
> error return".

I'm interested in more information about this issue. Do you have a
reference to an old discussion about this topic, or would you like to
explain it a little further for me (and probably others)?

Thanks in advance
Björn :-)

Kris Kennaway | 3 May 09:20 2006

Re: Stress testing the UFS2 filesystem

On Wed, May 03, 2006 at 07:48:17AM +0200, Björn König wrote:
> Kris Kennaway wrote:
> >On Wed, May 03, 2006 at 12:32:29AM +0400, Pavel Merdine wrote:
> >>Of course I think we could do patches to overcome corrupting panics,
> >>but the core FreeBSD team would not accept this, as they are happy
> >>with panics and corruptions they make to other filesystems.
> >
> >Of course not, don't make silly accusations :-)
> >
> >The problem is much more difficult to solve than "making the panic an
> >error return".
> 
> I'm interested in more information about this issue. Do you have a 
> reference to an old discussion about this topic or do you like to 
> explain it a little bit further for me (and probably others)?

See the URL that Peter provided in his original post.

The issue that he is testing is how well the filesystem behaves when
you arbitrarily damage it and then run fsck (ideally, fsck should
detect all of the damage and repair it).  He seems to have found cases
where fsck does not detect and repair the damage, leading to panics at
runtime.
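
A minimal sketch of the "arbitrarily damage it" step, assuming the
filesystem lives in an image file and that only a copy is damaged. It
is not necessarily the method used for the tests on Peter's page;
corrupt_image and the file names are hypothetical.

```python
import random
import shutil


def corrupt_image(src, dst, n_bytes, seed=0):
    """Copy src to dst, then change n_bytes randomly chosen bytes in dst.

    Returns the sorted list of damaged offsets so a run can be repeated
    or the damage correlated with what fsck reports.
    """
    shutil.copyfile(src, dst)
    rng = random.Random(seed)  # seeded, so the same damage is reproducible
    with open(dst, "r+b") as img:
        size = img.seek(0, 2)  # seek to the end to learn the image size
        offsets = sorted(rng.sample(range(size), min(n_bytes, size)))
        for off in offsets:
            img.seek(off)
            old = img.read(1)[0]
            img.seek(off)
            # XOR with a non-zero value so the byte always changes.
            img.write(bytes([old ^ rng.randrange(1, 256)]))
    return offsets
```

On FreeBSD the damaged copy could then be attached with
"mdconfig -a -t vnode -f bad.img" and checked with fsck_ffs on the
resulting md device before mounting it.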

You can ignore Pavel's reply since he didn't have anything to add to
the discussion :-)

Kris

Peter Holm | 3 May 09:54 2006

Re: Stress testing the UFS2 filesystem


> On Wed, May 03, 2006 at 07:48:17AM +0200, Björn König wrote:
>> Kris Kennaway wrote:
>> >On Wed, May 03, 2006 at 12:32:29AM +0400, Pavel Merdine wrote:
>> >>Of course I think we could do patches to overcome corrupting panics,
>> >>but the core FreeBSD team would not accept this, as they are happy
>> >>with panics and corruptions they make to other filesystems.
>> >
>> >Of course not, don't make silly accusations :-)
>> >
>> >The problem is much more difficult to solve than "making the panic an
>> >error return".
>>
>> I'm interested in more information about this issue. Do you have a
>> reference to an old discussion about this topic or do you like to
>> explain it a little bit further for me (and probably others)?
>
> See the URL that Peter provided in his original post.
>
> The issue that he is testing is how well the filesystem behaves when
> you arbitrarily damage it and then run fsck (ideally, fsck should
> detect all of the damage and repair it).  He seems to have found cases
> where fsck does not detect and repair the damage, leading to panics at
> runtime.
>

Actually the filesystem mounts without any problems if fsck is run first.

The objective of this exercise was to show that background fsck may lead
to panics. This was a problem I saw a lot a year ago when I did some
[...]

Pavel Merdine | 3 May 12:03 2006

Re[2]: Stress testing the UFS2 filesystem

Hello,

Wednesday, May 3, 2006, 2:13:07 AM, you wrote:

> On Wed, May 03, 2006 at 12:32:29AM +0400, Pavel Merdine wrote:
>> Hello ,
>> 
>> Thank you for raising this problem again. I already tried to do that
>> in that list, but received an answer that kernel is intended to do
>> that. For example, you have a faulty disk. And you have a faulty
>> sector which happened to occur on the directory place. So each time
>> kernel reads this sector it panics. So it's initially hard to even
>> understand what happens. And also it leads to corruption and lost
>> files on other file system (each time). Imagine if you have 15 disks.
>> In this case you have many files lost just because of a small (and not
>> significant) fault. It's just a nonsense.
>> Personally, I just replaced bad_dir with error return.
>> By the way, there was some bug in fs in kernel that could lead to
>> panic even on clean filesystem (bad_dir as far as I remember). It is
>> very rare and it was fixed on DragonFly. As far as I remember a fix
>> for this was also committed to current recently.
>> 
>> I think that Linux is usually much smarter on this. By default it
>> remounts a file system as read-only in case it detects a filesystem
>> corruption. I would be very happy if FreeBSD could do the same,
>> because fs panics really hurt when you have many systems with disks.
>> 
>> Of course I think we could do patches to overcome corrupting panics,
>> but the core FreeBSD team would not accept this, as they are happy
>> with panics and corruptions they make to other filesystems.

Pavel Merdine | 3 May 12:19 2006

Re[2]: Stress testing the UFS2 filesystem

Hello,

> You were so close to having an interesting email, and then you decided
> to expose yourself for the troll that you are.  Thanks for the input,
> it will be ignored appropriately in the future.

So nothing has changed.

-- 
/ Pavel Merdine


Pavel Merdine | 3 May 12:57 2006

Re[2]: Stress testing the UFS2 filesystem

Hello,

Wednesday, May 3, 2006, 11:54:50 AM, you wrote:

>> On Wed, May 03, 2006 at 07:48:17AM +0200, Björn König wrote:
>>> Kris Kennaway wrote:
>>> >On Wed, May 03, 2006 at 12:32:29AM +0400, Pavel Merdine wrote:
>>> >>Of course I think we could do patches to overcome corrupting panics,
>>> >>but the core FreeBSD team would not accept this, as they are happy
>>> >>with panics and corruptions they make to other filesystems.
>>> >
>>> >Of course not, don't make silly accusations :-)
>>> >
>>> >The problem is much more difficult to solve than "making the panic an
>>> >error return".
>>>
>>> I'm interested in more information about this issue. Do you have a
>>> reference to an old discussion about this topic or do you like to
>>> explain it a little bit further for me (and probably others)?
>>
>> See the URL that Peter provided in his original post.
>>
>> The issue that he is testing is how well the filesystem behaves when
>> you arbitrarily damage it and then run fsck (ideally, fsck should
>> detect all of the damage and repair it).  He seems to have found cases
>> where fsck does not detect and repair the damage, leading to panics at
>> runtime.
>>

> Actually the filesystem mounts without any problems if fsck is run first.

