Barry Naujok | 1 Dec 2006 01:26

TAKE 958517 - mkfs.xfs can create a corrupt filesystem

Fix up mkfs.xfs which can create a corrupt filesystem with large block sizes.

Date:  Fri Dec  1 11:26:25 AEDT 2006
Workarea:  snort.melbourne.sgi.com:/home/bnaujok/isms/repair
Inspected by:  dgc@sgi.com

The following file(s) were checked into:
  longdrop.melbourne.sgi.com:/isms/xfs-cmds/master-melb

Modid:  master-melb:xfs-cmds:27594a
xfsprogs/doc/CHANGES - 1.227 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfsprogs/doc/CHANGES.diff?r1=text&tr1=1.227&r2=text&tr2=1.226&f=h
xfsprogs/mkfs/xfs_mkfs.c - 1.79 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfsprogs/mkfs/xfs_mkfs.c.diff?r1=text&tr1=1.79&r2=text&tr2=1.78&f=h
	- Fix up determination of realtime extent size so mkfs can't create
	  a corrupt filesystem

xfsprogs/include/xfs_rtalloc.h - 1.13 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfsprogs/include/xfs_rtalloc.h.diff?r1=text&tr1=1.13&r2=text&tr2=1.12&f=h
	- Remove default realtime extent size definition.

David Chinner | 1 Dec 2006 01:41

Re: Review: Reduce in-core superblock lock contention near ENOSPC

On Fri, Dec 01, 2006 at 09:38:11AM +1100, David Chinner wrote:
> On Thu, Nov 30, 2006 at 06:03:40PM +0000, Lachlan McIlroy wrote:
> 
> > These changes wouldn't apply cleanly to tot (3 hunks failed in
> > xfs_mount.c) but I couldn't see why.
> 
> Whitespace issue? Try setting:
> 
> $ export QUILT_PATCH_OPTS="--ignore-whitespace"
> 
> I'll apply the patch to a separate tree and see if I hit the same
> problem....

I see the problem - it is caused by the next patch I am going to send
out for review, which is earlier in my series....

The growfs fix changes the delta parameter to xfs_icsb_modify_counters()
from int to int64_t, and that is why the hunks don't apply.
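
A minimal sketch of why the width of that parameter matters. The helper
names below are hypothetical and are not the XFS functions, and the idea
that a growfs delta can exceed the range of an int is an assumption, not
something stated in the patch. Where int is 32 bits, INT_MAX blocks of
4 KiB is roughly 8 TiB, so a larger grow would not fit:

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; these are not XFS functions. */

/* Nonzero when a counter delta can be passed through an int unchanged. */
static inline int delta_fits_in_int(int64_t delta)
{
	return delta >= INT_MIN && delta <= INT_MAX;
}

/* Narrowing is only safe after the range check above. */
static inline int narrow_delta(int64_t delta)
{
	assert(delta_fits_in_int(delta));
	return (int)delta;
}

/* With a 64-bit delta no narrowing is needed at all. */
static inline int64_t apply_delta(int64_t counter, int64_t delta)
{
	return counter + delta;
}
```

Widening the parameter removes the need for the range check, at the cost
of changing every caller's signature, which is consistent with hunks
failing to apply against a tree that lacks the earlier patch.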

The attached patch should apply (with a 6 line offset to most hunks).

Cheers,

Dave.
--

-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

---

Tim Shimmin | 1 Dec 2006 04:33

TAKE 958736 - cleanup old 5.3/6.1 log items - from Eric

Get rid of old 5.3/6.1 v1 log items.
Cleanup patch sent in by Eric Sandeen.
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

Date:  Fri Dec  1 14:31:05 AEDT 2006
Workarea:  chook.melbourne.sgi.com:/build/tes/2.6.x-xfs
Inspected by:  tes@sgi.com,sandeen@sandeen.net

The following file(s) were checked into:
  longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb

Modid:  xfs-linux-melb:xfs-kern:27596a
fs/xfs/xfs_buf_item.h - 1.44 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_buf_item.h.diff?r1=text&tr1=1.44&r2=text&tr2=1.43&f=h
	- Get rid of old 5.3/6.1 v1 log items.
	  Cleanup patch sent in by Eric Sandeen.
	  Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

fs/xfs/xfs_log_recover.c - 1.314 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_log_recover.c.diff?r1=text&tr1=1.314&r2=text&tr2=1.313&f=h
	- Get rid of old 5.3/6.1 v1 log items.
	  Cleanup patch sent in by Eric Sandeen.
	  Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

fs/xfs/xfs_trans.h - 1.142 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_trans.h.diff?r1=text&tr1=1.142&r2=text&tr2=1.141&f=h
	- Get rid of old 5.3/6.1 v1 log items.
	  Cleanup patch sent in by Eric Sandeen.
	  Signed-off-by: Eric Sandeen <sandeen@sandeen.net>


Christian Kujau | 1 Dec 2006 05:23

Re: mkfs.xfs questions

On Wed, 29 Nov 2006, Jasmin Buchert wrote:
> Is there any real advantage of making the log size 32-64 MB and

From 'man mkfs.xfs':

    If the log is contained within the data section and size isn't
    specified, mkfs.xfs will try to select a suitable log
    size depending on the size of the filesystem.  The actual
    logsize depends on the filesystem block size and the directory
    block size.

    Otherwise, the size suboption is only needed if the log
    section of the filesystem should occupy less space than the size
    of the special file.

So, if you're not limited by very special space restrictions, you won't 
need the "size" option.

> what is the difference between log version 1 and 2 with regard to
> efficiency/performance?

The "version" option should have no effect on performance by itself. From
'man mkfs.xfs' again:

      Using the version suboption to specify a version 2 log enables the
      sunit suboption, and allows the logbsize to be increased
      beyond 32K.

The "sunit" option can be tweaked to provide better performance in RAID5 
environments, same for the "agcount" option: for special needs only but 

Tim Shimmin | 1 Dec 2006 06:16

TAKE 958736 - fix up xfsidbg.c for removal of old items

Oops, I lost my kdb in my .config and so didn't build xfsidbg.c.
So now I need to fix up the corresponding log item changes for xfsidbg.c.

Date:  Fri Dec  1 16:15:37 AEDT 2006
Workarea:  chook.melbourne.sgi.com:/build/tes/2.6.x-xfs
Inspected by:  vapo@sgi.com

The following file(s) were checked into:
  longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb

Modid:  xfs-linux-melb:xfs-kern:27602a
fs/xfs/xfsidbg.c - 1.309 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfsidbg.c.diff?r1=text&tr1=1.309&r2=text&tr2=1.308&f=h
	- Fix up old uses of log items for pv#958736.
	  Also abstract out the item string lookup table.
	  And add in a missing item string (quotaoff) while we are there.

alejanhd | 1 Dec 2006 15:37

A problem with mounting my XFS partition


Hello, my name is Alejandro. I live in Tenerife, Canary Islands, Spain.

Sorry, my English is very bad.

My question is:

On my Supermicro server, with debian sarge2 installed, I have some problems mounting my XFS disk. Could you send me information about mounting an XFS partition?


Thanks


Roger Heflin | 1 Dec 2006 16:08

Issues with XFS on Sles9 sp2.

Hello,

I have a customer with machines whose XFS filesystem stops
responding when certain applications are running.  The only filesystem
that uses XFS is /tmp; all other filesystems still respond, but anything
going to /tmp hangs forever.  Machines with a couple of different
types of motherboard have this issue, and converting the machines
to ext3 eliminates it.  Under load they were seeing 1-2 events per
24 hours on 100 machines.  After the ext3 conversion they have had
0 events on 400 machines in 2 weeks, so it is fairly conclusive that
XFS has something to do with it.  It does not look like a hardware
problem: of the 2 different motherboards with the issue, one uses
Opteron+AMDchipset+IDE and the other uses Opteron+Nvidia+SATA, and
the problems do not repeat on any one node.  They appear to hit 1 or
2 nodes of the test set at random, and the next day it will be a
different one.

They are using Sles9SP2, currently we cannot go to SP3 as there
are some other bad driver issues unrelated to XFS (the issue
preventing us from upgrading also appears to be in 2.6.16.x
kernel.org kernels so that is more than just a SLES issue).

I have already had long discussions with Suse with less
than useful results.

Are there any patches that are likely to either produce
more debugging or to get rid of this issue?

There are no messages in the messages file when the event
happens.

Below is a sysrq-generated stack trace from one of the
machines.  The issue does not seem to require heavy IO
load (we have verified that the application is not IO
intensive).  It may be related to running short
on memory, but we don't have any OOM-type messages
anywhere.  The first type of machine to have the issue,
where it is a lot more common, has only 4GB
of RAM; the second type of machine, which has recently
started having the error as well, has 32GB of RAM.

                           Roger

<Oct/27 07:40 am>xfssyncd D 00000000000493e0 0 2760 1 3876 2755 (L-TLB)

<Oct/27 07:40 am>Call Trace:<ffffffffa0141832>{:xfs:kmem_zone_zalloc+50} 
<ffffffffa012a9c4>{:xfs:_xfs_trans_alloc+36}

<Oct/27 07:40 am> <ffffffff80231b35>{__down_write+117} 
<ffffffffa0116ead>{:xfs:xfs_ilock+93}

<Oct/27 07:40 am> <ffffffffa012eda3>{:xfs:xfs_syncsub+2787} 
<ffffffff80146970>{del_timer_sync+80}

<Oct/27 07:40 am> <ffffffff80146a55>{del_singleshot_timer_sync+21} 
<ffffffff80146d2e>{schedule_timeout+254}

<Oct/27 07:40 am> <ffffffffa013e468>{:xfs:vfs_sync+40} 
<ffffffffa013da79>{:xfs:vfs_sync_worker+25}

<Oct/27 07:40 am> <ffffffffa013dc1a>{:xfs:xfssyncd+378} 
<ffffffffa013d780>{:xfs:linvfs_fill_super+0}

<Oct/27 07:40 am> <ffffffff801112b7>{child_rip+8} 
<ffffffffa013d780>{:xfs:linvfs_fill_super+0}

<Oct/27 07:40 am> <ffffffffa013daa0>{:xfs:xfssyncd+0} 
<ffffffff801112af>{child_rip+0}

<Oct/27 07:40 am>

<Oct/27 07:40 am>res D 000000000000000a 0 16149 1 26319 16151 5825 (NOTLB)

<Oct/27 07:40 am>Call Trace:<ffffffff80231bcd>{__down_read+125} 
<ffffffffa01333dc>{:xfs:xfs_access+44}

<Oct/27 07:40 am> <ffffffffa013af44>{:xfs:linvfs_permission+20} 
<ffffffff8019c767>{permission+55}

<Oct/27 07:40 am> <ffffffff8019df1c>{link_path_walk+348} 
<ffffffff801a0706>{__user_walk_it+70}

<Oct/27 07:40 am> <ffffffff801974b0>{vfs_lstat+128} 
<ffffffff80122868>{do_page_fault+536}

<Oct/27 07:40 am> <ffffffff801975bf>{sys_newlstat+31} 
<ffffffff80111101>{error_exit+0}

<Oct/27 07:40 am> <ffffffff80110794>{system_call+124}

<Oct/27 07:40 am>sbatchd D 00000000000493e0 0 16151 1 12686 16149 (NOTLB)

<Oct/27 07:40 am>Call Trace:<ffffffff801a7b51>{dput+33} 
<ffffffff8019c2cd>{follow_mount+93}

<Oct/27 07:40 am> <ffffffff801a7b51>{dput+33} 
<ffffffff80231bcd>{__down_read+125}

<Oct/27 07:40 am> <ffffffffa01333dc>{:xfs:xfs_access+44} 
<ffffffffa013af44>{:xfs:linvfs_permission+20}

<Oct/27 07:40 am> <ffffffff8019c767>{permission+55} 
<ffffffff8018aeca>{sys_chdir+138}

<Oct/27 07:40 am> <ffffffff801a394c>{sys_select+1244} 
<ffffffff80110794>{system_call+124}

<Oct/27 07:40 am>

<Oct/27 07:40 am>gm_mapper D 000000000000000a 0 12686 1 16834 16151 (L-TLB)

<Oct/27 07:40 am>Call 
Trace:<ffffffffa012b37b>{:xfs:xfs_trans_log_buf+107} 
<ffffffff8010f9c8>{__down+152}

<Oct/27 07:40 am> <ffffffff80135c50>{default_wake_function+0} 
<ffffffff80234447>{__down_failed+53}

<Oct/27 07:40 am> <ffffffffa0141642>{:xfs:.text.lock.xfs_buf+15} 
<ffffffffa0126618>{:xfs:xfs_getsb+40}

<Oct/27 07:40 am> <ffffffffa012b8aa>{:xfs:xfs_trans_getsb+106} 
<ffffffffa012a10c>{:xfs:xfs_trans_commit+332}

<Oct/27 07:40 am> <ffffffffa00e7a9c>{:xfs:xfs_free_extent+204} 
<ffffffffa0111634>{:xfs:xfs_efd_init+68}

<Oct/27 07:40 am> <ffffffffa014179b>{:xfs:kmem_zone_alloc+75} 
<ffffffffa0141832>{:xfs:kmem_zone_zalloc+50}

<Oct/27 07:40 am> <ffffffffa011a9cd>{:xfs:xfs_itruncate_finish+557} 
<ffffffffa012aae9>{:xfs:xfs_trans_alloc+217}

<Oct/27 07:40 am> <ffffffff8011081d>{sysret_signal+28} 
<ffffffffa01300af>{:xfs:xfs_inactive+591}

<Oct/27 07:40 am> <ffffffff8011081d>{sysret_signal+28} 
<ffffffff80169f50>{__pagevec_free+32}

<Oct/27 07:40 am> <ffffffff8011081d>{sysret_signal+28} 
<ffffffffa013ebc8>{:xfs:vn_rele+72}

<Oct/27 07:40 am> <ffffffffa013d392>{:xfs:linvfs_clear_inode+18} 
<ffffffff801a9d3b>{clear_inode+155}

<Oct/27 07:40 am> <ffffffff801aa3f5>{generic_delete_inode+245} 
<ffffffff801a95ee>{iput+158}

<Oct/27 07:40 am> <ffffffff801a7cb5>{dput+389} 
<ffffffff8018d9de>{__fput+270}

<Oct/27 07:40 am> <ffffffff8018965e>{filp_close+126} 
<ffffffff8013f073>{put_files_struct+115}

<Oct/27 07:40 am> <ffffffff80140522>{do_exit+1010} 
<ffffffff801484b5>{__dequeue_signal+501}

<Oct/27 07:40 am> <ffffffff8011081d>{sysret_signal+28} 
<ffffffff80140fa8>{do_group_exit+232}

<Oct/27 07:40 am> <ffffffff8014ab37>{get_signal_to_deliver+1175} 
<ffffffff8011004b>{do_signal+1179}

<Oct/27 07:40 am> <ffffffff8010fc45>{do_signal+149} 
<ffffffffa02dbea0>{:gm:gm_linux_ioctl+0}

<Oct/27 07:40 am> <ffffffffa02dbf0a>{:gm:gm_linux_ioctl+106} 
<ffffffff801a2094>{sys_ioctl+1092}

<Oct/27 07:40 am> <ffffffff8011052d>{sys_rt_sigreturn+653} 
<ffffffff8011081d>{sysret_signal+28}

<Oct/27 07:40 am> <ffffffff80110adf>{ptregscall_common+103}

<Oct/27 07:40 am>lim D 000000000000000a 0 16834 1 16835 17594 12686 (NOTLB)

<Oct/27 07:40 am>Call Trace:<ffffffff80231bcd>{__down_read+125} 
<ffffffffa01333dc>{:xfs:xfs_access+44}

<Oct/27 07:40 am> <ffffffffa013af44>{:xfs:linvfs_permission+20} 
<ffffffff8019c767>{permission+55}

<Oct/27 07:40 am> <ffffffff8019df1c>{link_path_walk+348} 
<ffffffff801a0706>{__user_walk_it+70}

<Oct/27 07:40 am> <ffffffff801974b0>{vfs_lstat+128} 
<ffffffff80117ec4>{save_i387+148}

<Oct/27 07:40 am> <ffffffff8011018d>{do_signal+1501} 
<ffffffff801975bf>{sys_newlstat+31}

<Oct/27 07:40 am> <ffffffff80147d04>{sys_rt_sigaction+148} 
<ffffffff80110794>{system_call+124}

<Oct/27 07:40 am>

<Oct/27 07:40 am>pim D 00000000000493e0 0 16835 16834 16870 (NOTLB)

<Oct/27 07:40 am>Call 
Trace:<ffffffffa01412ad>{:xfs:xfs_buf_get_flags+877} 
<ffffffffa014179b>{:xfs:kmem_zone_alloc+75}

<Oct/27 07:40 am> <ffffffff8010f9c8>{__down+152} 
<ffffffff80135c50>{default_wake_function+0}

<Oct/27 07:40 am> <ffffffffa012b37b>{:xfs:xfs_trans_log_buf+107} 
<ffffffff80234447>{__down_failed+53}

<Oct/27 07:40 am> <ffffffffa0141642>{:xfs:.text.lock.xfs_buf+15} 
<ffffffffa0126618>{:xfs:xfs_getsb+40}

<Oct/27 07:40 am> <ffffffffa012b8aa>{:xfs:xfs_trans_getsb+106} 
<ffffffffa012a10c>{:xfs:xfs_trans_commit+332}

<Oct/27 07:40 am> <ffffffffa0104d26>{:xfs:xfs_dir2_createname+278} 
<ffffffffa0117d3d>{:xfs:xfs_ichgtime+301}

<Oct/27 07:40 am> <ffffffffa013194f>{:xfs:xfs_create+1359} 
<ffffffffa013b429>{:xfs:linvfs_mknod+521}

<Oct/27 07:40 am> <ffffffffa0116d16>{:xfs:xfs_iunlock+102} 
<ffffffffa0133387>{:xfs:xfs_lookup+119}

<Oct/27 07:40 am> <ffffffffa013b704>{:xfs:linvfs_lookup+84} 
<ffffffff8019c49b>{real_lookup+123}

<Oct/27 07:40 am> <ffffffff8019cedb>{vfs_create+251} 
<ffffffff8019f3a0>{open_namei+464}

<Oct/27 07:40 am> <ffffffff80189cc7>{filp_open+87} 
<ffffffff80189d8f>{sys_open+159}

<Oct/27 07:40 am> <ffffffff80189765>{sys_close+229} 
<ffffffff80110794>{system_call+124}

<Oct/27 07:40 am>

<Oct/27 07:40 am>elim.uptime D 00000000000493e0 0 16873 1 14418 18756 
(NOTLB)

<Oct/27 07:40 am>Call Trace:<ffffffff80231bcd>{__down_read+125} 
<ffffffffa01333dc>{:xfs:xfs_access+44}

<Oct/27 07:40 am> <ffffffffa013af44>{:xfs:linvfs_permission+20} 
<ffffffff8019c767>{permission+55}

<Oct/27 07:40 am> <ffffffff8019df1c>{link_path_walk+348} 
<ffffffff8019f2a1>{open_namei+209}

<Oct/27 07:40 am> <ffffffff80189cc7>{filp_open+87} 
<ffffffff80189d8f>{sys_open+159}

<Oct/27 07:40 am> <ffffffff80111101>{error_exit+0} 
<ffffffff80110794>{system_call+124}

<Oct/27 07:40 am>

<Oct/27 07:40 am>res D 00000000000493e0 0 14323 16149 26319 (NOTLB)

<Oct/27 07:40 am>Call Trace:<ffffffff80301c83>{inet_recvmsg+51} 
<ffffffff802b520a>{sock_aio_read+346}

<Oct/27 07:40 am> <ffffffff80231bcd>{__down_read+125} 
<ffffffffa01333dc>{:xfs:xfs_access+44}

<Oct/27 07:40 am> <ffffffffa013af44>{:xfs:linvfs_permission+20} 
<ffffffff8019c767>{permission+55}

<Oct/27 07:40 am> <ffffffff8019df1c>{link_path_walk+348} 
<ffffffff8019f27f>{open_namei+175}

<Oct/27 07:40 am> <ffffffff80189cc7>{filp_open+87} 
<ffffffff80189d8f>{sys_open+159}

<Oct/27 07:40 am> <ffffffff802b58a8>{sys_socket+104} 
<ffffffff80110794>{system_call+124}

<Oct/27 07:40 am>

<Oct/27 07:40 am>acuSolve-gmpi D 00000000000493e0 0 14418 1 14419 16873 
(NOTLB)

<Oct/27 07:40 am>Call 
Trace:<ffffffff80165ad4>{wait_on_page_writeback_range_wq+324}

<Oct/27 07:40 am> <ffffffff8010f9c8>{__down+152} 
<ffffffff80135c50>{default_wake_function+0}

<Oct/27 07:40 am> <ffffffff80234447>{__down_failed+53} 
<ffffffff801949dc>{.text.lock.super+169}

<Oct/27 07:40 am> <ffffffff8018fcea>{do_sync+42} 
<ffffffff8018fd5e>{sys_sync+62}

<Oct/27 07:40 am> <ffffffff80110794>{system_call+124}

<Oct/27 07:40 am>acuSolve-gmpi D 00000000000493e0 0 14419 1 18864 14418 
(NOTLB)

<Oct/27 07:40 am>Call Trace:<ffffffff8010f9c8>{__down+152} 
<ffffffff80135c50>{default_wake_function+0}

<Oct/27 07:40 am> <ffffffff80234447>{__down_failed+53} 
<ffffffffa0141642>{:xfs:.text.lock.xfs_buf+15}

<Oct/27 07:40 am> <ffffffffa0126618>{:xfs:xfs_getsb+40} 
<ffffffffa012ecea>{:xfs:xfs_syncsub+2602}

<Oct/27 07:40 am> <ffffffffa013e468>{:xfs:vfs_sync+40} 
<ffffffffa013d434>{:xfs:linvfs_sync_super+68}

<Oct/27 07:40 am> <ffffffff80193cff>{sync_filesystems+223} 
<ffffffff8018fcf1>{do_sync+49}

<Oct/27 07:40 am> <ffffffff8018fd5e>{sys_sync+62} 
<ffffffff80110794>{system_call+124}

<Oct/27 07:40 am>

<Oct/27 07:40 am>mktemp D 00000000000493e0 0 17594 1 17656 16834 (NOTLB)

<Oct/27 07:40 am>Call Trace:<ffffffff8025bbe9>{SHATransform+25} 
<ffffffff8019c2cd>{follow_mount+93}

<Oct/27 07:40 am> <ffffffff801a7b51>{dput+33} 
<ffffffff80231bcd>{__down_read+125}

<Oct/27 07:40 am> <ffffffffa01333dc>{:xfs:xfs_access+44} 
<ffffffffa013af44>{:xfs:linvfs_permission+20}

<Oct/27 07:40 am> <ffffffff8019c767>{permission+55} 
<ffffffff8019df1c>{link_path_walk+348}

<Oct/27 07:40 am> <ffffffff801a02ac>{sys_mkdir+220} 
<ffffffff80110794>{system_call+124}

<Oct/27 07:40 am>

<Oct/27 07:40 am>check_EWNstag D 00000000000493e0 0 17620 1 17751 17656 
(NOTLB)

<Oct/27 07:40 am>Call Trace:<ffffffff8018cbbd>{do_sync_write+173} 
<ffffffff80231bcd>{__down_read+125}

<Oct/27 07:40 am> <ffffffffa01333dc>{:xfs:xfs_access+44} 
<ffffffffa013af44>{:xfs:linvfs_permission+20}

<Oct/27 07:40 am> <ffffffff8019c767>{permission+55} 
<ffffffff8019df1c>{link_path_walk+348}

<Oct/27 07:40 am> <ffffffff8019f2a1>{open_namei+209} 
<ffffffff80189cc7>{filp_open+87}

<Oct/27 07:40 am> <ffffffff80189d8f>{sys_open+159} 
<ffffffff80111101>{error_exit+0}

<Oct/27 07:40 am> <ffffffff80110794>{system_call+124} <Oct/27 07:40 am>

<Oct/27 07:40 am>

<Oct/27 07:40 am>sh D 00000000000493e0 0 17959 1 17793 17858 (NOTLB)

<Oct/27 07:40 am>Call Trace:<ffffffff80231bcd>{__down_read+125} 
<ffffffffa01333dc>{:xfs:xfs_access+44}

<Oct/27 07:40 am> <ffffffffa013af44>{:xfs:linvfs_permission+20} 
<ffffffff8019c767>{permission+55}

<Oct/27 07:40 am> <ffffffff8019df1c>{link_path_walk+348} 
<ffffffff8019f2a1>{open_namei+209}

<Oct/27 07:40 am> <ffffffff80189cc7>{filp_open+87} 
<ffffffff80189d8f>{sys_open+159}

<Oct/27 07:40 am> <ffffffff80111101>{error_exit+0} 
<ffffffff80110794>{system_call+124}

<Oct/27 07:40 am>

Iustin Pop | 1 Dec 2006 19:30

Re: mkfs.xfs questions

On Fri, Dec 01, 2006 at 04:23:41AM +0000, Christian Kujau wrote:
> On Wed, 29 Nov 2006, Jasmin Buchert wrote:
> >Is there any real advantage of making the log size 32-64 MB and
> 
> From 'man mkfs.xfs':
> 
>    If the log is contained within the data section and size isn't
>    specified, mkfs.xfs will try to select a suitable log
>    size depending on the size of the filesystem.  The actual
>    logsize depends on the filesystem block size and the directory
>    block size.
> 
>    Otherwise, the size suboption is only needed if the log
>    section of the filesystem should occupy less space than the size
>    of the special file.
> 
> So, if you're not limited by very special space restrictions, you won't 
> need the "size" option.

I don't understand how you reached that conclusion. The explanations refer
to the default log size. I believe the original poster asked about the
performance advantage of *raising* the log size above the default values
for internal logs, and my impression is that metadata-intensive
workloads benefit from increasing the log size (however, no hard numbers
are available).

A while back, when mkfs.xfs had more conservative default values, bigger log
sizes did help for big filesystems.

Regards,
Iustin

Lachlan McIlroy | 1 Dec 2006 20:22

Re: Review: Reduce in-core superblock lock contention near ENOSPC

David Chinner wrote:
> On Thu, Nov 30, 2006 at 06:03:40PM +0000, Lachlan McIlroy wrote:
> 
>>Dave,
>>
>>Could you have changed the SB_LOCK from a spinlock to a blocking
>>mutex and have achieved a similar effect?
> 
> 
> Sort of - it would still be inefficient and wouldn't help solve the
> underlying causes of contention.  Also, everything else that uses
> the SB_LOCK would now have a sleep point where there wasn't one
> previously. If we are nesting the SB_LOCK somewhere else inside
> another spinlock (not sure if we are) then we can't sleep. I'd
> prefer not to change the semantics of such a lock if I can avoid it.

That's fair enough and I can't disagree with you.  I think the SB_LOCK
was/is being abused anyway and was used too genericly (if there's such
a word).  Using separate locks for specific purposes like you've done
with the new mutex is a great start to cleaning this code up.

> 
> I think the slow path code is somewhat clearer with a separate
> mutex - it clearly documents the serialisation barrier that
> the slow path uses and allows us to do slow path checks on the
> per-cpu counters without needing the SB_LOCK.

It's certainly an improvement over the original code.
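
The fast path/slow path split being discussed can be modelled in a few
lines. This is a simplified, single-threaded sketch and not the XFS
implementation: NCPUS, struct counters and the modify_* names are all
hypothetical, and the locking is only indicated in comments. The fast
path touches one per-cpu counter, and only when that would go negative
does the slow path sum every counter, apply the delta and rebalance:

```c
#include <assert.h>
#include <stdint.h>

#define NCPUS 4	/* illustrative value only */

/* Hypothetical model of a counter split across CPUs. */
struct counters {
	int64_t percpu[NCPUS];
};

/* Fast path: touch only the local counter. Returns -1 if it would go negative. */
static inline int modify_fast(struct counters *c, int cpu, int64_t delta)
{
	int64_t v = c->percpu[cpu] + delta;

	if (v < 0)
		return -1;
	c->percpu[cpu] = v;
	return 0;
}

/*
 * Slow path: in a kernel this section would be serialised (for example by
 * a mutex). Sum all counters, apply the delta, then spread the result.
 */
static inline int modify_slow(struct counters *c, int64_t delta)
{
	int64_t total = 0;
	int i;

	for (i = 0; i < NCPUS; i++)
		total += c->percpu[i];
	total += delta;
	if (total < 0)
		return -1;	/* genuinely out of space; counters untouched */
	for (i = 0; i < NCPUS; i++)
		c->percpu[i] = total / NCPUS;
	c->percpu[0] += total % NCPUS;	/* remainder goes to the first counter */
	return 0;
}

/* Try the fast path first and fall back to the slow path on failure. */
static inline int modify(struct counters *c, int cpu, int64_t delta)
{
	assert(cpu >= 0 && cpu < NCPUS);
	if (modify_fast(c, cpu, delta) == 0)
		return 0;
	return modify_slow(c, delta);
}
```

In this model only the slow path needs serialising, which is the property
a dedicated mutex makes explicit.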

> 
> It also means that in future, we can slowly remove the need for
> holding the SB_LOCK across the entire rebalance operation and only
> use it when referencing the global superblock fields during
> the rebalance.

Sounds good.

> 
> If the need arises, it also means we can move to a mutex per counter
> so we can independently rebalance different types of counters at the
> same time (which we can't do right now).

That seems so obvious - I'm surprised we can't do it now.

> 
> 
>>Has this change had much testing on a large machine?
> 
> 
> 8p is the largest I've run it on (junkbond) and it's been ENOSPC
> tested on a 2.7GB/s filesystem (junkbond once again) as well
> as on single, slow disks.
> 
> I've tried and tried to get the ppl that reported the problem to
> test this fix but no luck so far (this bug has been open for months
> and most of that time has been me waiting for someone to run a
> test). I've basically got sick of waiting and I just want to
> move this on. It's already too late for sles10sp1 because of
> the lack of response.

If it's important to them they'll test it.  If the change doesn't fix
their problem then I'm sure we'll hear from them again.

> 
> 
>>These changes wouldn't apply cleanly to tot (3 hunks failed in
>>xfs_mount.c) but I couldn't see why.
> 
> 
> Whitespace issue? Try setting:
> 
> $ export QUILT_PATCH_OPTS="--ignore-whitespace"
> 
> I'll apply the patch to a separate tree and see if I hit the same
> problem....
> 
> 
>>The changes look fine to me, couple of comments below.
>>
>>Lachlan
>>
>>
>>@@ -1479,9 +1479,11 @@ xfs_mod_incore_sb_batch(xfs_mount_t *mp,
>> 		case XFS_SBS_IFREE:
>> 		case XFS_SBS_FDBLOCKS:
>> 			if (!(mp->m_flags & XFS_MOUNT_NO_PERCPU_SB)) {
>>-				status = xfs_icsb_modify_counters_locked(mp,
>>+				XFS_SB_UNLOCK(mp, s);
>>+				status = xfs_icsb_modify_counters(mp,
>> 							msbp->msb_field,
>> 							msbp->msb_delta, 
>> 							rsvd);
>>+				s = XFS_SB_LOCK(mp);
>> 				break;
>> 			}
>> 			/* FALLTHROUGH */
>>
>>Is it safe to be releasing the SB_LOCK?
> 
> 
> Yes.
> 
> 
>>Is it assumed that the
>>superblock won't change while we process the list of xfs_mod_sb
>>structures?
> 
> 
> No. We are applying deltas - it doesn't matter if other deltas are
> applied at the same time by other callers because in the end all
> the deltas get applied and it adds up to the same thing.

Okay.
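
The point about deltas can be checked with a small sketch. The helper
below is hypothetical and is not an XFS function. It applies a list of
deltas to a counter in the order given, and since addition is
commutative, any interleaving of the same deltas ends at the same value:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not an XFS function: apply deltas in the order given. */
static inline int64_t apply_deltas(int64_t start, const int64_t *deltas, size_t n)
{
	int64_t value = start;
	size_t i;

	assert(deltas != NULL || n == 0);
	for (i = 0; i < n; i++)
		value += deltas[i];
	return value;
}
```

Only the final sum is independent of ordering. The intermediate values
differ between orderings, so a check made part-way through (such as a
counter going negative) can still depend on the interleaving.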

> 
> 
>>@@ -1515,11 +1517,12 @@ xfs_mod_incore_sb_batch(xfs_mount_t *mp,
>> 			case XFS_SBS_IFREE:
>> 			case XFS_SBS_FDBLOCKS:
>> 				if (!(mp->m_flags & XFS_MOUNT_NO_PERCPU_SB)) 
>> 				{
>>-					status =
>>-					    xfs_icsb_modify_counters_locked(mp,
>>+					XFS_SB_UNLOCK(mp, s);
>>+					status = xfs_icsb_modify_counters(mp,
>> 							msbp->msb_field,
>> 							-(msbp->msb_delta),
>> 							rsvd);
>>+					s = XFS_SB_LOCK(mp);
>> 					break;
>> 				}
>> 				/* FALLTHROUGH */
>>
>>Same as above.
> 
> 
> Ditto ;)
> 
> Thanks for looking at this, Lachlan.
> 
> Cheers,
> 
> Dave.

Lachlan McIlroy | 1 Dec 2006 21:12

Re: Review: Reduce in-core superblock lock contention near ENOSPC

David Chinner wrote:
> On Fri, Dec 01, 2006 at 09:38:11AM +1100, David Chinner wrote:
> 
>>On Thu, Nov 30, 2006 at 06:03:40PM +0000, Lachlan McIlroy wrote:
>>
>>
>>>These changes wouldn't apply cleanly to tot (3 hunks failed in
>>>xfs_mount.c) but I couldn't see why.
>>
>>Whitespace issue? Try setting:
>>
>>$ export QUILT_PATCH_OPTS="--ignore-whitespace"
>>
>>I'll apply the patch to a separate tree and see if I hit the same
>>problem....
> 
> 
> I see the problem - it is caused by the next patch I am going to send
> out for review, which is earlier in my series....
> 
> The growfs fix changes the delta parameter to xfs_icsb_modify_counters()
> from int to int64_t, and that is why the hunks don't apply.
> 
> The attached patch should apply (with a 6 line offset to most hunks).
> 

That's even worse - now it loses track of which file it's patching.

[lachlan@linux (2.6.x-xfs)2.6.x-xfs]$ patch -p1 -l -i ENOSPC.patch.eml
patching file fs/xfs/xfs_mount.c
Hunk #2 succeeded at 543 (offset 5 lines).
Hunk #3 succeeded at 1485 (offset 6 lines).
Hunk #4 succeeded at 1523 (offset 6 lines).
Hunk #5 succeeded at 1736 (offset 6 lines).
Hunk #6 succeeded at 1758 (offset 6 lines).
Hunk #7 succeeded at 1794 (offset 6 lines).
Hunk #8 succeeded at 1901 (offset 6 lines).
Hunk #9 succeeded at 2021 (offset 6 lines).
Hunk #10 succeeded at 2060 (offset 6 lines).
Hunk #11 FAILED at 2087.
1 out of 11 hunks FAILED -- saving rejects to file fs/xfs/xfs_mount.c.rej
missing header for unified diff at line 256 of patch
can't find file to patch at input line 256
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
|               break;
|
--------------------------
File to patch:
Skip this patch? [y]
Skipping patch.
3 out of 3 hunks ignored
patching file fs/xfs/xfs_mount.h

It seems to have a problem with line 256 of the patch:

 @@ -2081,7 +2121,7 @@ again:          <---- line 256
  		lcounter = icsbp->icsb_ifree;
  		lcounter += delta;
  		if (unlikely(lcounter < 0))
-			goto slow_path;
+			goto balance_counter;
  		icsbp->icsb_ifree = lcounter;
  		break;

I can't see what's wrong.  Don't sweat over it - it's not that important
now that the review is done.  I'll leave it to you to merge it with tot.

Lachlan

