Christian Kujau | 3 Sep 2006 15:20

[OT] Re: Partitioning for ext3fs

[please reply on-list, so that other ppl can help too]

On Thu, 31 Aug 2006, david  cooke wrote:
> By hang, I mean the boot process will not go any further if I
> turn on the USB during boot. Whatever boot happens to be

Hm, too bad :(
But I'd suggest to discuss this issue on some FC forum, usb-list or even 
linux-kernel.

>> I don't know if I understand you correctly: you've upgraded to FC5 and
>> the external (USB? SATA?) drive still "does not work"?

> Typing fdisk /dev/sdc gives us
> Unable to open /dev/sdc

so, your OS (linux, FC) does not seem to be aware of your usb-disk 
(sdc) or the driver crashed. try to check dmesg/messages for related 
information and pass it on to one of the above mentioned lists.
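
A rough sketch of that log check (the function name and the pattern are only an illustration, and the exact kernel messages differ between kernel versions):

```shell
# filter_disk_messages: read kernel log text on stdin and keep only the lines
# that mention USB, SCSI or sdX devices (a deliberately loose pattern).
filter_disk_messages() {
    grep -i -E 'usb|scsi|sd[a-z]'
}

# Possible usage:
#   dmesg | filter_disk_messages | tail -n 40
#   filter_disk_messages < /var/log/messages | tail -n 40
```

Running it once before and once after plugging in the disk should show whether the kernel noticed the device at all.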

> There is a light on in the USB so I think it's on...Yes, it is on.

That's good ;)

> Typing fdisk /dev/sda results in
> The number of cylinders for this disk is set to 19457.
[...]

OK, so sda (sata disk?) is doing well. this is good ;)

Christian | 3 Sep 2006 20:25

Re: Stress testing for ext3?

On Thu, 31 Aug 2006, Kieft, Brian wrote:
> Does anyone know of a good method for exercising
> an ext3 file system?

I'm not aware of such a "torture" tool, but any long run of your 
real-world-application of choice, some benchmarks or heavy operation on 
a big source tree or so should do no harm to any in-kernel 
rw-filesystem.

> Perhaps something that involves power removal in between commits
> or in the midst of a write,

start any of the things mentioned above and pull the plug ;)
maybe "reboot -f" could simulate this:

   -f     Force halt or reboot, don't call shutdown(8).

but I've never tried that and don't know if it will KILL running 
processes before rebooting.

> and then checks for corrupt data. Do any utilities exist for this?

fsck.ext[23] will do that for the fs structure. you could use diff(1) 
against a known-to-be-good filesystem to verify that all data is in
place.
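
If keeping a second full copy for diff -r is not practical, a checksum manifest is one alternative. This is only a sketch: the function names, the paths and the device name below are made up, and it only checks files that were at rest before the crash, not the ones being written when the plug was pulled.

```shell
#!/bin/sh
# Sketch only: function names, paths and the device name are hypothetical.

# make_manifest DIR FILE: record a SHA-256 checksum for every regular file
# under DIR. Keep FILE outside DIR (ideally on another machine).
make_manifest() {
    ( cd "$1" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum ) > "$2"
}

# verify_manifest DIR FILE: re-check every recorded file; prints the files
# that are missing or differ and returns non-zero if there are any.
verify_manifest() {
    ( cd "$1" && sha256sum --check --quiet ) < "$2"
}

# Possible sequence around a crash test:
#   make_manifest /mnt/test /root/test.manifest
#   ... run the workload on other files, pull the plug, reboot ...
#   fsck.ext3 -f -n /dev/sdXN        # -f: force check, -n: change nothing
#   mount /dev/sdXN /mnt/test && verify_manifest /mnt/test /root/test.manifest
```

This relies on GNU find, sort, xargs and sha256sum; md5sum would work the same way.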

Christian.
-- 
BOFH excuse #325:

Christian | 3 Sep 2006 20:29

Re: Ext3 emergency recovery

On Tue, 29 Aug 2006, Adam Atlas wrote:
> I have a damaged Ext3 filesystem which fsck has not been able to recover.

maybe some information on *how* the fs got corrupted could help. posting a 
fsck log is also nice...

> Up to group 95. Some say "SEVERE DATA LOSS POSSIBLE."

are you using the latest e2fsprogs? latest kernel? i386 or something 
more exotic?

> filesystem and tried answering yes to all of them; it ended up just erasing 
> the whole thing.

is there nothing in lost+found?

-- 
BOFH excuse #325:

Your processor does not develop enough heat.

..:::BeOS Mr. X:::.. | 3 Sep 2006 22:39

Re: Stress testing for ext3?

I know of a method to continuously execute a command, maybe doing a full 
listing of the drive's contents will heat the drives up, but I am not 
sure about the error checking part. Here is what I would do:
while [ 1 -eq 1 ]; do ls  -shw9 -R; done

Hope this helps!

Mr. X

Kieft, Brian wrote:
> Does anyone know of a good method for exercising an ext3 file system? 
> Perhaps something that involves power removal in between commits or in 
> the midst of a write, and then checks for corrupt data. Do any utilities 
> exist for this?
> 
>  
> 
> Thanks!

Neil Brown | 4 Sep 2006 12:09

Re: debian unstable & ext3

On Thursday August 31, lm <at> bitmover.com wrote:
> I'm running 
> 
> Linux travis 2.6.15-1-686 #2 Mon Mar 6 15:27:08 UTC 2006 i686 GNU/Linux
> 
> on a laptop with ext3 on /
> 
> Some time ago things started getting weird in the following way: I do a
> fairly normal hack, ^Z, make, test loop when developing and it seems
> that vim is calling fsync or sync and that is then flushing everything
> to disk.  My tests create maybe 10 dozen files in ~30MB and for some
> reason this is taking 4 seconds to flush.
> 
> I'm not sure if ext3, the kernel, or vim is the problem.  I already
> googled and set
> 
> set swapsync=sync
> set nofsync
> 
> in my .exrc but that hasn't helped.
> 
> Has anyone else seen this and do they have a work around?  I'm about to
> switch to reiserfs and that's a lot of fuss for what should be a simple
> problem (I hope).

I've noticed this sort of problem, but it hasn't yet been enough to
make me explore very far....

One thing worth a try is to mount with data=writeback.

Christian | 5 Sep 2006 11:48

Re: debian unstable & ext3

[resent to ext3-users <at> redhat.com]

On Thu, 31 Aug 2006, Larry McVoy wrote:
> Some time ago things started getting weird in the following way: I do a
> fairly normal hack, ^Z, make, test loop when developing and it seems
----------------------^ this would STOP your editor (vi), but do you :w 
before you do this?

> that vim is calling fsync or sync

you can start vim via strace(1) to find out which one is called.
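
Something along these lines, as a sketch (the function name and the file names are made up; the strace options are standard ones):

```shell
# trace_sync_calls OUTFILE CMD [ARGS...]: run CMD under strace and log only
# the sync-family system calls to OUTFILE.
#   -f   follow child processes
#   -tt  prefix each line with a wall-clock timestamp
#   -T   append the time spent inside each call
trace_sync_calls() {
    out=$1
    shift
    strace -f -tt -T -e trace=fsync,fdatasync,sync -o "$out" "$@"
}

# Possible usage:
#   trace_sync_calls /tmp/vim.trace vim somefile.c
#   grep -c 'fsync(' /tmp/vim.trace
```

The -T timings should also show whether a single call accounts for the 4 seconds.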

> and that is then flushing everything to disk. My tests create maybe 10
> dozen files in ~30MB and for some reason this is taking 4 seconds to
> flush.

How full is the fs? Maybe fragmentation is bad, or the 4 sec are simply 
I/O-bound. What mount options are used?

It'd be interesting to reproduce this behaviour on a fresh filesystem.

> I'm about to switch to reiserfs and that's a lot of fuss for what should

Let us know if this solved the problem ;)

Christian.
-- 
BOFH excuse #277:

Your Flux Capacitor has gone bad.

Christian | 5 Sep 2006 12:50

Re: Stress testing for ext3?

On Sun, 3 Sep 2006, ..:::BeOS Mr. X:::.. wrote:
> I know of a method to continuously execute a command, maybe doing a full 
> listing of the drive's contents will heat the drives up, but I am not sure 
> about the error checking part. Here is what I would do:
> while [ 1 -eq 1 ]; do ls  -shw9 -R; done

The directory listing will be cached; after the first run the disk 
should not be touched any more (try it out...). Also, when you're not 
redirecting the output somewhere else (e.g. /dev/null), the terminal 
displaying the output will be the bottleneck and not the fs or the 
disk...
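
A minimal sketch of both points (the function name is invented for illustration, and the drop_caches knob only exists on kernels 2.6.16 and later):

```shell
# list_tree_quietly DIR: recursive listing with the output discarded, so the
# terminal is not the bottleneck.
list_tree_quietly() {
    ls -sR "$1" > /dev/null
}

# To make repeated runs hit the disk again, the page, dentry and inode caches
# can be dropped between runs (as root):
#   sync; echo 3 > /proc/sys/vm/drop_caches
#
# Timing example:
#   time list_tree_quietly /usr/src
```

Even then this is a read-only workload, so it says little about how the fs behaves under writes.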

-- 
BOFH excuse #34:

(l)user error

Herta Van den Eynde | 5 Sep 2006 15:09

Re: Stress testing for ext3?

Christian wrote:
> On Sun, 3 Sep 2006, ..:::BeOS Mr. X:::.. wrote:
> 
>> I know of a method to continuously execute a command, maybe doing a 
>> full listing of the drive's contents will heat the drives up, but I am 
>> not sure about the error checking part. Here is what I would do:
>> while [ 1 -eq 1 ]; do ls  -shw9 -R; done
> 
> 
> The directory listing will be cached, after the first run the disk should 
> not be touched any more (try it out...). Also, when you're not 
> redirecting the output to somewhere else (e.g. /dev/null), the terminal 
> displaying the output will be the bottleneck and not the fs or the disk...
> 

A colleague of mine reported he got ext3 to bail out while repeatedly 
recompiling the kernel.  He enabled all kernel modules, and then ran:

# while true; do make clean; make -j18; done

The filesystem ended up being mounted ro.  The fsck at reboot moved some 
files to lost+found, after which the filesystem could be used again.

Kind regards,

Herta

Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm

tweeks | 5 Sep 2006 22:53

IO lockups and ext3 read-only file corruption on RHEL4 (pre and post U4)

Has anyone been seeing IO lockup problems on EL4?  

I've tried multiple IO scheduler options (elevator=) in the boot... I'm seeing 
the same behavior regardless.  Independent of hardware.  Whitebox ATA, HA 
enclosure with dedicated SCSI, megaraid RAID hardware, Dell 2850s... same 
behavior:

A semi-busy system will suddenly go into some kind of IO la-la land where 
nothing can be written to disk for >1 hour.  Of course when this happens, the 
ext3 kernel module freaks out and remounts all the filesystems as read-only.  
Then when the system is rebooted, if the system is allowed to fsck, the 
journal is hosed and the filesystem eats itself.  Moving them off the RH 
kernel altogether seems to fix the problem, but I have not found a way to 
reproduce the problem yet (burn-in and stress testing don't seem to make it 
appear), so real re-testing is difficult at best.

It's become such a big problem that we're moving some customers that require 
rock solid systems either over to RHEL3, or off RH and over to SLES or another 
distro with a non-RH kernel.  

Just the ext3 problem (minus the IO lockup part) can be seen in other BZ 
tickets:
	https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=175877
(when the filesystem fills up)

Has anyone seen this type of IO lockup + ext3 corruption on RHEL4?  
Can you reproduce it?

Tweeks