Christoph Hellwig | 1 Apr 2012 01:28

[PATCH] 030: fix for new xfs_repair versions

Given that we now drop invalid unlinked inode lists there is no message
to capture.  Also add a sed expression to avoid failures on old repair
versions.

Signed-off-by: Christoph Hellwig <hch@lst.de>

Index: xfstests-dev/030
===================================================================
--- xfstests-dev.orig/030	2012-03-31 23:15:11.000000000 +0000
+++ xfstests-dev/030	2012-03-31 23:18:31.000000000 +0000
@@ -55,7 +55,8 @@ _check_ag()
 	for structure in 'sb 0' 'agf 0' 'agi 0' 'agfl 0'
 	do
 		echo "Corrupting $structure - setting bits to $1"
-		_check_repair $1 "$structure"
+		_check_repair $1 "$structure" |
+			sed -e 's/error following ag 0 unlinked list//'
 	done
 }

@@ -97,8 +98,7 @@ src/devzero -v -1 -n "$clear" $SCRATCH_D

 # now kick off the real repair test...
 #
-_scratch_mkfs_xfs $DSIZE | _filter_mkfs | \
-    sed -e 's/error following ag 0 unlinked list//' 2>$tmp.mkfs
+_scratch_mkfs_xfs $DSIZE | _filter_mkfs 2>$tmp.mkfs
 . $tmp.mkfs
 _check_ag 0
 _check_ag -1

Jeff Liu | 1 Apr 2012 06:55

[PATCH] xfs: don't fill statvfs with project quota for a directory if it was not enabled.

Hello,

I can trigger a BUG() at fs/xfs/xfs_dquot.c on vanilla kernel 3.3.0 by the following steps:

1. Mount an XFS partition without the 'pquota' option.
   /dev/sda7 on /xfs type xfs (rw)

2. Set up project1 on it.
   $ cat /etc/projects 
   1:/xfs
   $ cat /etc/projid
   project1:1
   $ sudo xfs_quota -x -c 'project -s project1' /xfs

3. du -sh /xfs

[  170.024496] XFS: Assertion failed: XFS_IS_QUOTA_RUNNING(mp), file: fs/xfs/xfs_dquot.c, line: 680
[  170.024534] ------------[ cut here ]------------
[  170.024630] kernel BUG at fs/xfs/xfs_message.c:101!
[  170.024718] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
[  170.024836] Modules linked in: xfs cryptd aes_i586 aes_generic ....
[  170.026787] Pid: 2082, comm: du Not tainted 3.3.0-dirty #48 LENOVO 7661D43/7661D43
[  170.026950] EIP: 0060:[<f94892c4>] EFLAGS: 00010246 CPU: 1
[  170.027126] EIP is at assfail+0x47/0x57 [xfs]
[  170.027207] EAX: 0000006a EBX: e2031f58 ECX: 00000000 EDX: 00000007
[  170.027319] ESI: 00000000 EDI: e364f000 EBP: e2031e88 ESP: e2031e74
[  170.027432]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[  170.027530] Process du (pid: 2082, ti=e2030000 task=e3096a80 task.ti=e2030000)
[  170.027659] Stack:
[  170.027702]  00000000 f9543fbb f95523d2 f955210c 000002a8 e2031ec0 f9521d2c 3432302e

Dave B | 2 Apr 2012 09:45

Corrupt xfs on USB HDD : sub-optimal xfs_repair


Hi,

The two files in the root directory of a 500GB external USB HDD  became corrupt,
probably due to a power failure.

dave@K-Matrix $ ls -l /media/Galaxy/
ls: cannot access /media/Galaxy/ChnSchld_pre_4-14.tgz: No such file or directory
ls: cannot access /media/Galaxy/dhr820xu.ext: No such file or directory
total 24
??????????   ? ?    ?        ?                ? ChnSchld_pre_4-14.tgz
??????????   ? ?    ?        ?                ? dhr820xu.ext
drwxr-xr-x   7 dave dave  4096 2012-02-13 12:45 DHR recordings
drwxr-xr-x 212 dave dave 12288 2008-11-30 00:21 Miles Davis
drwxr-xr-x   5 dave dave  4096 2012-02-16 06:06 PartImage


xfs_repair didn't help much; it just removed the two filenames.
At minimum, I expected two entries in L+F but the L+F directory was not created.

dave@K-Matrix $ sudo xfs_repair /dev/sdc1
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
entry "ChnSchld_pre_4-14.tgz" in shortform directory 128 references free inode 48145461
junking entry "ChnSchld_pre_4-14.tgz" in directory inode 128
entry "dhr820xu.ext" in shortform directory 128 references free inode 48145451
junking entry "dhr820xu.ext" in directory inode 128
        - agno = 1
        - agno = 2
        - agno = 3
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
done


See longer console session and before/after metadumps in ~7MB d/l at:
http://daxxi.net/xfs/Galaxy_500GB_xfs.tar.gz
user: xfs  ,  p/w: xfs
(please only d/l if 2x240MB metadumps will be meaningful to you)


Environment:
Linux K-Matrix 3.0.0-16-generic #29-Ubuntu SMP Tue Feb 14 12:49:42 UTC 2012 i686 athlon i386 GNU/Linux


Dave

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
CHEN Baozi | 2 Apr 2012 16:27

[RFC] Add XFS support in SYSLINUX

Hi all,

I'm a student applying for this year's GSoC, and I plan to add XFS
support to the syslinux bootloader. I did some research on the syslinux
implementation and have a rough idea.

XFS starts its superblock at the first bytes of the partition, which is
where syslinux would install its boot code (in the first 512 bytes).
Currently we have two approaches to handle that. One is to follow GRUB,
which puts its code in the first few bytes immediately following the MBR
and before the first partition's first sector. However, this approach
breaks some design rules of syslinux, so it won't be accepted. The
other, which we are now considering, is to use the remaining 2048 bytes
left in XFS's first block. But I'm not sure whether this area is
reserved for new features in the future.

Any comment?

Regards,
Chen Baozi

Steven Whitehouse | 2 Apr 2012 16:29

Re: [PATCH 3/7] gfs2: Use generic handlers of O_SYNC AIO DIO

Hi,

On Thu, 2012-03-29 at 18:05 -0400, Jeff Moyer wrote:
> From: Jan Kara <jack@suse.cz>
> 
> Use generic handlers to queue fsync() when AIO DIO is completed for O_SYNC
> file.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>
> Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>

Steve.

> ---
>  fs/gfs2/aops.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
> index 38b7a74..2589781 100644
> --- a/fs/gfs2/aops.c
> +++ b/fs/gfs2/aops.c
> @@ -1034,7 +1034,7 @@ static ssize_t gfs2_direct_IO(int rw, struct kiocb *iocb,
>  
>  	rv = __blockdev_direct_IO(rw, iocb, inode, inode->i_sb->s_bdev, iov,
>  				  offset, nr_segs, gfs2_get_block_direct,
> -				  NULL, NULL, 0);
> +				  NULL, NULL, DIO_SYNC_WRITES);
>  out:
>  	gfs2_glock_dq_m(1, &gh);
>  	gfs2_holder_uninit(&gh);


Mark Tinguely | 2 Apr 2012 16:35

Re: [PATCH] 030: fix for new xfs_repair versions

On 03/31/12 18:28, Christoph Hellwig wrote:
> Given that we now drop invalid unlinked inode lists there is no message
> to capture.  Also add a sed expression to avoid failures on old repair
> versions.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
>
> Index: xfstests-dev/030
> ===================================================================
> --- xfstests-dev.orig/030	2012-03-31 23:15:11.000000000 +0000
> +++ xfstests-dev/030	2012-03-31 23:18:31.000000000 +0000
> @@ -55,7 +55,8 @@ _check_ag()
>   	for structure in 'sb 0' 'agf 0' 'agi 0' 'agfl 0'
>   	do
>   		echo "Corrupting $structure - setting bits to $1"
> -		_check_repair $1 "$structure"
> +		_check_repair $1 "$structure" |
> +			sed -e 's/error following ag 0 unlinked list//'
>   	done
>   }
>
> @@ -97,8 +98,7 @@ src/devzero -v -1 -n "$clear" $SCRATCH_D
>
>   # now kick off the real repair test...
>   #
> -_scratch_mkfs_xfs $DSIZE | _filter_mkfs | \
> -    sed -e 's/error following ag 0 unlinked list//' 2>$tmp.mkfs
> +_scratch_mkfs_xfs $DSIZE | _filter_mkfs 2>$tmp.mkfs
>   . $tmp.mkfs
>   _check_ag 0
>   _check_ag -1
> Index: xfstests-dev/030.out.linux
> ===================================================================
> --- xfstests-dev.orig/030.out.linux	2012-03-31 23:19:09.000000000 +0000
> +++ xfstests-dev/030.out.linux	2012-03-31 23:19:17.000000000 +0000
> @@ -85,7 +85,6 @@ bad agbno AGBNO for inobt root, agno 0
>   root inode chunk not found
>   Phase 3 - for each AG...
>           - scan and clear agi unlinked lists...
> -error following ag 0 unlinked list
>           - process known inodes and perform inode discovery...
>           - process newly discovered inodes...
>   Phase 4 - check for duplicate blocks...
>

The sed left a blank line where the "error following ag 0 unlinked
list" would have been in a 3.0.x kernel run of the test.

Maybe add another "sed -e '/^$/d'" to the filter?
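
The blank line can be reproduced without running xfs_repair at all. The
following is only a sketch: the sample text is copied from 030.out.linux
above, the helper names (sample, filter_subst, filter_delete) are made
up for illustration, and deleting the whole line is an alternative to
the extra '/^$/d' filter rather than something taken from the patch.

```shell
# Emit three lines as an old xfs_repair might print them (sample text only).
sample() {
    printf '%s\n' \
        '        - scan and clear agi unlinked lists...' \
        'error following ag 0 unlinked list' \
        '        - process known inodes and perform inode discovery...'
}

# The substitution from the patch: the message goes away, the newline stays.
filter_subst() {
    sed -e 's/error following ag 0 unlinked list//'
}

# Alternative (an assumption, not in the patch): delete the matching line.
filter_delete() {
    sed -e '/^error following ag 0 unlinked list$/d'
}

echo "--- substitution (note the empty middle line)"
sample | filter_subst
echo "--- substitution followed by '/^\$/d'"
sample | filter_subst | sed -e '/^$/d'
echo "--- line delete"
sample | filter_delete
```

Note that '/^$/d' drops every empty line in the output, while the line
delete only touches the one message; which is preferable depends on
whether 030.out.linux contains blank lines that need to survive.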

--Mark Tinguely <tinguely@sgi.com>


Mark Rechler | 2 Apr 2012 17:09

Re: XFS Kernel Panics in CentOS

Hi Eric,

Thank you for the reply. We are running CentOS 5.8, with the 2.6.18-164.10.1.el5.centos.plus kernel as it was mentioned in a bug report that has similar behavior, but ultimately a different kernel panic (http://bugs.centos.org/view.php?id=4089). We have tried running xfs_repair in the past and it has not proved useful. The odd part is that these are fresh systems (just installed). If it helps, we are also running glusterfs on these boxes though load does not always correlate to a kernel panic.

Thanks,
Mark

On Fri, Mar 30, 2012 at 6:44 PM, Eric Sandeen <sandeen@sandeen.net> wrote:
On 3/30/12 5:02 PM, Mark Rechler wrote:
> Hi Everyone,
>
> We've been getting a lot of errors (across several kernels) and eventually a kernel panic. Any insight into these errors would be much appreciated.
>
> Errors:
> Filesystem "dm-3": XFS internal error xfs_da_do_buf(2) at line 2112 of file fs/xfs/xfs_da_btree.c.  Caller 0xffffffff883c1826

Saying which CentOS it is would help ;)  And, standard disclaimers about how CentOS doesn't come with upstream _or_ distro support, etc etc...

But xfs_da_do_buf(2) indicates on-disk corruption, having encountered a bad magic number when reading from the disk.  Have you tried xfs_repair?

-Eric

> Call Trace:
>  [<ffffffff883c1725>] :xfs:xfs_da_do_buf+0x503/0x5b1
>  [<ffffffff883c1826>] :xfs:xfs_da_read_buf+0x16/0x1b
>  [<ffffffff883c1826>] :xfs:xfs_da_read_buf+0x16/0x1b
>  [<ffffffff883aeb71>] :xfs:xfs_attr_leaf_get+0x2e/0x99
>  [<ffffffff883aeb71>] :xfs:xfs_attr_leaf_get+0x2e/0x99
>  [<ffffffff883aec7f>] :xfs:xfs_attr_fetch+0xa3/0xd5
>  [<ffffffff883a7aa8>] :xfs:xfs_acl_iaccess+0x64/0xd4
>  [<ffffffff883f264a>] :xfs:xfs_check_acl+0x1b/0x2b
>  [<ffffffff8000f550>] generic_permission+0x40/0xca
>  [<ffffffff8000d902>] permission+0x81/0xc8
>  [<ffffffff8000999d>] __link_path_walk+0x173/0xf42
>  [<ffffffff8000e9cc>] link_path_walk+0x42/0xb2
>  [<ffffffff8000cc9c>] do_path_lookup+0x275/0x2f1
>  [<ffffffff8001278e>] getname+0x15b/0x1c2
>  [<ffffffff800236f6>] __user_walk_fd+0x37/0x4c
>  [<ffffffff8003f1f6>] vfs_lstat_fd+0x18/0x47
>  [<ffffffff8008c46e>] default_wake_function+0x0/0xe
>  [<ffffffff800efddf>] sys_lgetxattr+0x4e/0x5f
>  [<ffffffff8002a996>] sys_newlstat+0x19/0x31
>  [<ffffffff8005d229>] tracesys+0x71/0xe0
>  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
>
> Code: 0f b6 40 02 89 44 24 04 e9 95 00 00 00 44 0f b6 Z3 44 3b 65
> RIP [<ffffffffff8841bfaf>] :xfs:xfs_attr_shortform_getvalue+0x24/0xe2
>   RSP <ffff81020752dbc8>
> CR2: 00000000000002
>   <0>Kernel panic - not syncing: Fatal exception
>
> Thanks,
> Mark
>
>


Christoph Hellwig | 2 Apr 2012 18:39

Re: [PATCH] xfs: don't fill statvfs with project quota for a directory if it was not enabled.

On Sun, Apr 01, 2012 at 12:55:55PM +0800, Jeff Liu wrote:
> Hello,
> 
> I can trigger a BUG() at fs/xfs/xfs_dquot.c on vanilla kernel 3.3.0 by the following steps:
> 
> 1. mount a XFS partition without 'pquota' option.
>    /dev/sda7 on /xfs type xfs (rw)
> 
> 2. setup project1 on it.
>    $ cat /etc/projects 
>    1:/xfs
>    $ cat /etc/projid
>    project1:1
>    $ sudo xfs_quota -x -c 'project -s project1' /xfs
> 
> 3. du -sh /xfs

Can you wire this up as a test case for xfstests?

> +		    ((mp->m_qflags & (XFS_PQUOTA_ACCT|XFS_OQUOTA_ENFD))) ==
> +		     (XFS_PQUOTA_ACCT|XFS_OQUOTA_ENFD)) {

This check is supposed to return false.  I guess Chandra's separate
project quota inode preparations somehow broke it.

Chandra, can you look into this issue?


Eric Sandeen | 2 Apr 2012 20:03

Re: XFS Kernel Panics in CentOS

On 4/2/12 8:09 AM, Mark Rechler wrote:
> Hi Eric,
> 
> Thank you for the reply. We are running CentOS 5.8, with the
> 2.6.18-164.10.1.el5.centos.plus kernel as it was mentioned in a bug
> report that has similar behavior, but ultimately a different kernel
> panic (http://bugs.centos.org/view.php?id=4089). We have tried
> running xfs_repair in the past and it has not proved useful. The odd
> part is that these are fresh systems (just installed). If it helps,
> we are also running glusterfs on these boxes though load does not
> always correlate to a kernel panic.

I can't say for sure what's in that respun "extra" centos kernel,
but I can say this:  the error you hit indicates that xfs read a
buffer, and wound up with a metadata buffer which had unrecognized 
magic - i.e. it did not look like metadata as expected.  Seeing what
looks like corruption, it shut down.

This reminds me a little of 
https://bugzilla.redhat.com/show_bug.cgi?id=512552
which I fixed for RHEL customers a while back, where cancelled
readahead in MD was resulting in xfs thinking a buffer was
uptodate, but in fact it was uninitialized, hence it found
garbage and shut down in this way.

Something similar seems to be happening in your case, if xfs_repair
comes up clean; somehow xfs is getting hold of a buffer which 
apparently doesn't match what xfs_repair found to be a consistent
filesystem.

So I might suspect something in the storage stack?

Also please be sure you don't have kmod-xfs or xfs-kmod installed
on your centos box, which is a truly ancient and completely unsupported
backport of xfs from long, long ago.
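
A quick way to check for those two packages, sketched as a small shell
function. It assumes an RPM-based system where "rpm -q" is available;
the function name check_legacy_xfs_kmod is made up for illustration.

```shell
# Report whether the old out-of-tree XFS module packages are installed.
# Assumes an RPM-based system (e.g. CentOS); the function name is hypothetical.
check_legacy_xfs_kmod() {
    for pkg in kmod-xfs xfs-kmod; do
        # "rpm -q" exits non-zero when the package is not installed.
        if rpm -q "$pkg" >/dev/null 2>&1; then
            echo "$pkg: installed"
        else
            echo "$pkg: not installed"
        fi
    done
}

check_legacy_xfs_kmod
```

If either package shows up as installed, the xfs module in use may not
be the one shipped with the kernel.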

-Eric

> Thanks,
> Mark
> 
> On Fri, Mar 30, 2012 at 6:44 PM, Eric Sandeen <sandeen@sandeen.net> wrote:
> 
>     On 3/30/12 5:02 PM, Mark Rechler wrote:
>     > Hi Everyone,
>     >
>     > We've been getting a lot of errors (across several kernels) and eventually a kernel panic. Any insight into these errors would be much appreciated.
>     >
>     > Errors:
>     > Filesystem "dm-3": XFS internal error xfs_da_do_buf(2) at line 2112 of file fs/xfs/xfs_da_btree.c.  Caller 0xffffffff883c1826
> 
>     Saying which CentOS it is would help ;)  And, standard disclaimers about how CentOS doesn't come with upstream _or_ distro support, etc etc...
> 
>     But xfs_da_do_buf(2) indicates on-disk corruption, having encountered a bad magic number when reading from the disk.  Have you tried xfs_repair?
> 
>     -Eric
> 
>     > Call Trace:
>     >  [<ffffffff883c1725>] :xfs:xfs_da_do_buf+0x503/0x5b1
>     >  [<ffffffff883c1826>] :xfs:xfs_da_read_buf+0x16/0x1b
>     >  [<ffffffff883c1826>] :xfs:xfs_da_read_buf+0x16/0x1b
>     >  [<ffffffff883aeb71>] :xfs:xfs_attr_leaf_get+0x2e/0x99
>     >  [<ffffffff883aeb71>] :xfs:xfs_attr_leaf_get+0x2e/0x99
>     >  [<ffffffff883aec7f>] :xfs:xfs_attr_fetch+0xa3/0xd5
>     >  [<ffffffff883a7aa8>] :xfs:xfs_acl_iaccess+0x64/0xd4
>     >  [<ffffffff883f264a>] :xfs:xfs_check_acl+0x1b/0x2b
>     >  [<ffffffff8000f550>] generic_permission+0x40/0xca
>     >  [<ffffffff8000d902>] permission+0x81/0xc8
>     >  [<ffffffff8000999d>] __link_path_walk+0x173/0xf42
>     >  [<ffffffff8000e9cc>] link_path_walk+0x42/0xb2
>     >  [<ffffffff8000cc9c>] do_path_lookup+0x275/0x2f1
>     >  [<ffffffff8001278e>] getname+0x15b/0x1c2
>     >  [<ffffffff800236f6>] __user_walk_fd+0x37/0x4c
>     >  [<ffffffff8003f1f6>] vfs_lstat_fd+0x18/0x47
>     >  [<ffffffff8008c46e>] default_wake_function+0x0/0xe
>     >  [<ffffffff800efddf>] sys_lgetxattr+0x4e/0x5f
>     >  [<ffffffff8002a996>] sys_newlstat+0x19/0x31
>     >  [<ffffffff8005d229>] tracesys+0x71/0xe0
>     >  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
>     >
>     > Code: 0f b6 40 02 89 44 24 04 e9 95 00 00 00 44 0f b6 Z3 44 3b 65
>     > RIP [<ffffffffff8841bfaf>] :xfs:xfs_attr_shortform_getvalue+0x24/0xe2
>     >   RSP <ffff81020752dbc8>
>     > CR2: 00000000000002
>     >   <0>Kernel panic - not syncing: Fatal exception
>     >
>     > Thanks,
>     > Mark
>     >
>     >
> 
> 


Christoph Hellwig | 2 Apr 2012 20:48

Re: [RFC] Add XFS support in SYSLINUX

Hi Chen,

thanks a lot for your interest in tackling this project.

Writing to the end of the first block would work for filesystems with
4k or larger block sizes, but it's not a very clean solution.

I had a quick brainstorm with Dave and we came up with the following
idea:

 - create an extended attribute on the root filesystem that is larger
   than the filesystem block size (typically 4k), and store the
   syslinux payload in it.
 - for the first prototype get its block number using the GETBMAPX
   ioctl, and use it.

Once that prototype works we can talk about a good interface for you.
We could precreate the attribute at mkfs time at a fixed block number
so that syslinux can hardcode it, or we could discuss any other kind
of interface that helps you.


