Andriy Gapon | 1 Feb 11:00 2008

Re: kern/84983: [udf] [patch] udf filesystem: stat-ting files could randomly fail

on 22/12/2006 20:24 Pav Lucistnik said the following:
> Synopsis: [udf] [patch] udf filesystem: stat-ting files could randomly fail
> 
> State-Changed-From-To: open->closed
> State-Changed-By: pav
> State-Changed-When: Fri Dec 22 18:24:14 UTC 2006
> State-Changed-Why: 
> Fixed in 6.1 and up
> 
> http://www.freebsd.org/cgi/query-pr.cgi?pr=84983

I found a bug in the patch. I got a panic in a real situation when
testing a UDF fs containing a directory with a huge number of files
(~10^4), but the problem can easily be seen in the code too:

static int
udf_readatoffset(struct udf_node *node, int *size, off_t offset,
    struct buf **bp, uint8_t **data)
{
	...
	*size = min(*size, MAXBSIZE);

	if ((error = udf_readlblks(udfmp, sector, *size + (offset &
	    udfmp->bmask), bp))) {

If it so happens that *size gets the value MAXBSIZE and (offset &
udfmp->bmask) is not zero, then a value > MAXBSIZE is passed down
through udf_readlblks -> bread -> breadn -> getblk, and the latter
panics because it has an explicit assertion that size <= MAXBSIZE.


Andriy Gapon | 1 Feb 11:05 2008

Re: kern/84983: [udf] [patch] udf filesystem: stat-ting files could randomly fail


Sorry - I sent this reply to the wrong email.
The correct PR is kern/77234:
http://www.freebsd.org/cgi/query-pr.cgi?pr=77234

on 01/02/2008 12:00 Andriy Gapon said the following:
> on 22/12/2006 20:24 Pav Lucistnik said the following:
>> Synopsis: [udf] [patch] udf filesystem: stat-ting files could randomly fail
>>
>> State-Changed-From-To: open->closed
>> State-Changed-By: pav
>> State-Changed-When: Fri Dec 22 18:24:14 UTC 2006
>> State-Changed-Why: 
>> Fixed in 6.1 and up
>>
>> http://www.freebsd.org/cgi/query-pr.cgi?pr=84983
> 
> I found a bug in the patch. I got a panic in a real situation when
> testing a UDF fs containing a directory with a huge number of files
> (~10^4), but the problem can easily be seen in the code too:
> 
> static int
> udf_readatoffset(struct udf_node *node, int *size, off_t offset,
>     struct buf **bp, uint8_t **data)
> {
> ...
> *size = min(*size, MAXBSIZE);
> 
> if ((error = udf_readlblks(udfmp, sector, *size + (offset &
> udfmp->bmask), bp))) {

Andriy Gapon | 1 Feb 11:27 2008

Re: kern/84983: [udf] [patch] udf filesystem: stat-ting files could randomly fail

on 01/02/2008 12:00 Andriy Gapon said the following:
> ---- a different, under-debugged problem -----
> BTW, on some smaller directories (but still large ones) I get some very
> strange problems with reading a directory too. It seems like some bad
> interaction between udf and the buffer cache system. I added a lot of
> debugging prints, and the problem looks like the following:
> 
> read starting at physical sector (2048-byte one) N, size is ~20K, N%4=0
> 	bread(4 * N, some_big_size)
> 		correct data is read
> repeat the above couple dozen times
> read starting at physical sector (2048-byte one) N+1, size is ~20K
> 	bread(4 * (N+1), some_big_size)
> 		data is read from physical sector N+4 (instead of N+1)
> 
> I remember that Bruce Evans warned me that something like this could
> happen, but I couldn't understand him because I don't understand the
> VM/buffer subsystem. I'll try to dig up the email.
> 

Sorry for the flood - additional info:
if I limit the maximum read size in udf_readatoffset() to 2048 (instead
of MAXBSIZE), then large directories can be read OK. Seems like
something with overlapping buffers, maybe?

BTW, here's how I created the test environment for the described issues
(in tcsh):

mkdir /tmp/bigdir


Eric Anderson | 1 Feb 15:31 2008

Re: Looking for help to reconstruct a corrupted UFS2 filesystem

Matt Emmerton wrote:
>> On Sun, 2008-01-20 at 13:47 -0500, Matt Emmerton wrote:
>>> What are my options at this point?  Since all the superblocks are 
>>> identical,
>>> fsck always behaves the same.  I suspect that one of the key blocks 
>>> that the
>>> superblock points to is corrupted.  Is any of this data replicated on 
>>> disk?
>>> Can I troll the disk looking for intermediate blocks and easily chain
>>> together portions of directory trees?
>>
>> This kind of thing is why I put ports/sysutils/ffs2recov together.  You
>> won't be able to recover everything but you should be able to get a lot
>> of it back.
> 
> Thanks Frank.  I'm playing around with this tool now.  Something must be 
> really hosed since I'm getting a lot of segfaults.
> 
> For example, ffs2recov -s /dev/ad1s1 segfaults after finding 3 
> superblocks, and these superblocks aren't close to anything that newfs 
> -N dumps out (except the one at offset 160).  It also attempts to read 
> blk 18445195961337644512, which is clearly wrong.  (I'm 99% sure that I 
> used the newfs defaults when I created this filesystem, so why would 
> ffs2recov be looking for superblocks in different locations?)
> 
> ffs2recov -p also segfaults after dumping part of cg 3, and ffs2recov 
> -d segfaults after hitting inode 8331.
> 
> ffs2recov -a and ffs2recov -r <name> do a lot of complaining about 
> failing to allocate large amounts of (or negative) memory.

Martin Cracauer | 1 Feb 18:22 2008

fsck and mount disagree on whether superblocks are usable

This is not an emergency, but I find it odd.  Mount and fsck disagree
on whether the superblocks are usable: mount can mount the filesystem
readonly, but fsck can use neither the primary superblock nor the
alternates.

Here the long story:

Just for kicks I replicated a hard drive, dd'ing off /dev/ad0 with
live (read/write mounted) filesystems, including all partition tables
and disklabels.

Restored to a different drive, I have everything: all partitions are
there and I can mount readonly.  The filesystems are of course dirty,
hence the readonly mount.

However, fsck refuses to run.

BAD SUPER BLOCK: VALUES IN SUPER BLOCK DISAGREE WITH THOSE IN FIRST ALTERNATE
[who did that all-uppercase, anyway?]

Allowing it to look for alternate superblocks I get:

BAD SUPER BLOCK: VALUES IN SUPER BLOCK DISAGREE WITH THOSE IN FIRST ALTERNATE

LOOK FOR ALTERNATE SUPERBLOCKS? [yn] y

32 is not a file system superblock
28756320 is not a file system superblock
57512608 is not a file system superblock
86268896 is not a file system superblock
115025184 is not a file system superblock

Niki Denev | 2 Feb 10:59 2008

Re: ZFS panics

On Feb 2, 2008 11:52 AM, Niki Denev <nike_d <at> cytexbg.com> wrote:
[snip]
> I've tried to use the "list" command in kdb as shown in the developers handbook
> but it keeps saying "No source file for address XXX"
[snip]

Sorry, I forgot to remove those lines after I managed to get
"add-debug-symbols" to load zfs.ko.symbols.
_______________________________________________
freebsd-fs <at> freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-fs
To unsubscribe, send any mail to "freebsd-fs-unsubscribe <at> freebsd.org"

Niki Denev | 2 Feb 10:52 2008

ZFS panics

Hi,

I'm doing some stress testing on a server using ZFS, and I have
experienced two kernel panics in the last few days.

The machine runs AMD64 7.0-PRERELEASE on dual quad-core (8 cores
total) Intel Xeon 2.0GHz CPUs, with 8GB of RAM.
The disk subsystem consists of eight Hitachi SATA drives on an Areca
1231ML with 1GB of cache memory and a battery backup.
I'm using GPT (GUID) partitions only: one 10GB partition for the system
on UFS2 with geom_journal, a 10GB swap/dump partition, and the remaining
2.7TB is a ZFS pool.
I also have this in loader.conf :

vm.kmem_size="1G"
vm.kmem_size_max="1G"

I was running multiple bonnie++ instances in parallel, writing to and
reading from the ZFS pool.
The first time I ran 80 bonnie++ instances and the machine rebooted
after about 3 hours.
The second time I ran 16 bonnie++ instances and the machine survived
a good 11 hours.

I've tried to use the "list" command in kdb as shown in the developers handbook
but it keeps saying "No source file for address XXX"

Here is the first panic that I experienced; the second one looks identical:

(I'm not entirely sure that I loaded the ZFS symbols properly.)

Julian H. Stacey | 2 Feb 20:16 2008

Re: fsck and mount disagree on whether superblocks are usable

Martin Cracauer wrote:
> This is not an emergency but I find it odd.  Mount and fsck agree on
> whether superblocks are usable.  Mount can mount readonly, but fsck
> can use neither the primary superblock nor the alternatives.
> 
> 32 is not a file system superblock

Just in case: you know the secondary superblock on newer filesystems
moved from block 32?
Ref. man fsck_ufs:
   -b      Use the block specified immediately after the flag as the super
           block for the file system.  An alternate super block is usually
           located at block 32 for UFS1, and block 160 for UFS2.

-- 
Julian Stacey.  BSD Unix Linux Net Consultant, Munich.  http://berklix.com

Artis Caune | 3 Feb 21:01 2008

Re: ZFS panics

Niki Denev wrote:
> I also have this in loader.conf :
> 
> vm.kmem_size="1G"
> vm.kmem_size_max="1G"

Try with:
vm.kmem_size="1500M"
vfs.zfs.arc_max="512M"
vfs.zfs.prefetch_disable="1"


Alexander Leidinger | 4 Feb 09:21 2008

ZFS: invalid label -- what is expected?

Hi,

I'm in the uncomfortable situation that I have a ZFS pool which emits
the message "label missing or invalid" for one of its 3 disks. As the
pool is neither a mirror nor a raidz, the whole pool is not accessible.
This pool just served as a backup for some data which had to be
transferred from a graid3 to a raidz. The graid3 does not exist anymore,
and the raidz is not populated yet...

As I want to get the data back from the interim backup pool (several
jails, the FreeBSD CVS, ... so more or less nothing important, but
time-consuming to get back to the previous state by
reinstalling/downloading/...), I would like to know where this label
ZFS is complaining about is supposed to reside, and what it is supposed
to look like. I would like to give a repair a try (giving ZFS the label
it expects and hoping it is able to recover at least most of the data;
on this pool I had the property copies=2).

So, anybody out there who can give me the information I need to do
this, or who at least knows some places in the ZFS source I should
look at? Please CC me, as I'm not subscribed to freebsd-fs <at>.

Bye,
Alexander.

-- 
Automobile, n.:
	A four-wheeled vehicle that runs up hills and down
	pedestrians.
