Christoph Hellwig | 1 May 2010 14:58

Re: [PATCH 4/5] [PATCH] xfs: simplify buffer to transaction matching

On Tue, Apr 20, 2010 at 04:41:55PM +1000, Dave Chinner wrote:
> Good start, but I think that it should use xfs_trans_first_item()
> and xfs_trans_next_item() rather than walking the descriptor
> table directly.

I tried implementing it, but it doesn't work.  We can call the buffer
matching routines on transactions that don't have any item linked to
it, which will cause xfs_trans_first_item to panic.  Compare this code
in xfs_trans_buf_item_match:

	for (licp = &tp->t_items; licp != NULL; licp = licp->lic_next) {
		if (xfs_lic_are_all_free(licp)) {
			ASSERT(licp == &tp->t_items);
			ASSERT(licp->lic_next == NULL);
			return NULL;
		}

		...
	}

to this in xfs_trans_first_item:

	licp = &tp->t_items;
	/*
	 * If it's not in the first chunk, skip to the second.
	 */
	if (xfs_lic_are_all_free(licp)) {
		licp = licp->lic_next;
	}
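
To make the difference concrete, here is a minimal user-space sketch of a
chunk walk that tolerates an empty transaction.  It is not the XFS code:
struct item_chunk, CHUNK_SLOTS, ALL_FREE, chunk_all_free() and first_item()
are hypothetical stand-ins for the descriptor chunk, its free mask and the
helpers quoted above.  The only point is that every chunk pointer is checked
for NULL before it is dereferenced, so an all-free first chunk with no
successor yields NULL.

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK_SLOTS 15u                      /* hypothetical slots per chunk */
#define ALL_FREE ((1u << CHUNK_SLOTS) - 1u)  /* every slot bit set = empty */

/* Hypothetical stand-in for a log item descriptor chunk. */
struct item_chunk {
	struct item_chunk *next;      /* next chunk, or NULL */
	unsigned int free_mask;       /* bit i set => slot i is free */
	void *slots[CHUNK_SLOTS];     /* the items themselves */
};

/* Return non-zero if no slot in this chunk is in use. */
int chunk_all_free(const struct item_chunk *c)
{
	return (c->free_mask & ALL_FREE) == ALL_FREE;
}

/*
 * Return the first in-use item, or NULL if the list holds no items.
 * Empty chunks are skipped, and the loop condition guards against
 * stepping past the last chunk.
 */
void *first_item(struct item_chunk *head)
{
	struct item_chunk *c;
	unsigned int i;

	assert(head != NULL);
	for (c = head; c != NULL; c = c->next) {
		if (chunk_all_free(c))
			continue;
		for (i = 0; i < CHUNK_SLOTS; i++) {
			if (!(c->free_mask & (1u << i)))
				return c->slots[i];
		}
	}
	return NULL;
}
```

Called on a list whose only chunk is all free, first_item() returns NULL
rather than following a NULL next pointer.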


Linda Walsh | 1 May 2010 15:38

Re: Building XFSDump but missing uuid development package


Peter Shuere wrote:
> Hi,
> 
> I am trying to build the XFS utility, but during the build-process, I ran into a problem saying that I need
> the UUID development package to make the build complete. I am using OpenSUSE 11.2, where should I get the
> copy of the uuid-devel source (or binary, whichever one that works).
(On SuSE 11.2:)
> zypper se uuid 
Loading repository data...
Reading installed packages...

S | Name                | Summary                                   | Type      
--+---------------------+-------------------------------------------+-----------
i | libuuid-devel       | Development files for libuuid1            | package   
  | libuuid-devel-32bit | Development files for libuuid1            | package   
i | libuuid1            | Library to generate UUIDs                 | package   
i | libuuid1-32bit      | Library to generate UUIDs                 | package   
  | perl-Data-UUID      | Perl extension for generating Globally/-> | package   
  | perl-Data-UUID      | Perl extension for generating Globally/-> | srcpackage
  | uuidd               | Utilities for the Second Extended File -> | package   
---
Does libuuid-devel not work for you?

_______________________________________________
xfs mailing list
xfs <at> oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


tytso | 1 May 2010 21:47

Re: [PATCH 3/4] writeback: pay attention to wbc->nr_to_write in write_cache_pages

On Fri, Apr 30, 2010 at 12:43:29PM -0700, Andrew Morton wrote:
> 
> Maybe that fs shouldn't be calling write_cache_pages() at all.  After
> all, write_cache_pages() is a wrapper which emits a sequence of calls
> to ->writepage(), and ->writepage() writes a page.

On my todo list is to fix ext4 to not call write_cache_pages() at all.
We are seriously abusing that function ATM, since we're not actually
writing the pages when we call write_cache_pages().  I won't go into
what we're doing, because it's too embarrassing, but suffice it to say
that we end up calling pagevec_lookup() or pagevec_lookup_tag()
*four*, count them *four* times while trying to do writeback.

I have a simple patch that gives ext4 our own copy of
write_cache_pages(), and then simplifies it a lot, and fixes a bunch
of problems, but then I discarded it in favor of fundamentally redoing
how we do writeback at all, but it's going to take a while to get
things completely right.  But I am working to try to fix this.

If it would help, I can resurrect the "fork write_cache_pages() and
simplify" patch, so ext4 isn't dependent on the mm/page-writeback.c's
write_cache_pages(), if there is an immediate, short-term need to fix
that function.
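
As a rough illustration of the description quoted above, write_cache_pages()
can be pictured as a loop that hands each dirty page to a writepage callback
and stops once the nr_to_write budget is used up.  The sketch below is a
simplified user-space model, not the code in mm/page-writeback.c: struct
page, struct wb_control, write_pages_sketch() and mark_clean() are invented
for the example, and locking, page tagging and sync modes are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct page { int dirty; };
struct wb_control { long nr_to_write; };

typedef int (*writepage_fn)(struct page *page, struct wb_control *wbc);

/*
 * Call writepage() on each dirty page in order.  Stop on the first
 * error, or as soon as the nr_to_write budget has been consumed.
 */
int write_pages_sketch(struct page *pages, size_t npages,
		       struct wb_control *wbc, writepage_fn writepage)
{
	size_t i;
	int err;

	assert(wbc != NULL);
	for (i = 0; i < npages; i++) {
		if (!pages[i].dirty)
			continue;
		err = writepage(&pages[i], wbc);
		if (err)
			return err;
		if (--wbc->nr_to_write <= 0)
			break;          /* budget exhausted */
	}
	return 0;
}

/* Example callback: pretend the page was written and is now clean. */
int mark_clean(struct page *page, struct wb_control *wbc)
{
	(void)wbc;
	page->dirty = 0;
	return 0;
}
```

A caller that does not actually write the page in its callback, which is the
ext4 situation described above, still pays for the page lookup and the
per-page call.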

						- Ted


Dave Chinner | 3 May 2010 06:23

Re: [PATCH 4/5] [PATCH] xfs: simplify buffer to transaction matching

On Sat, May 01, 2010 at 08:58:39AM -0400, Christoph Hellwig wrote:
> On Tue, Apr 20, 2010 at 04:41:55PM +1000, Dave Chinner wrote:
> > Good start, but I think that it should use xfs_trans_first_item()
> > and xfs_trans_next_item() rather than walking the descriptor
> > table directly.
> 
> I tried implementing it, but it doesn't work.  We can call the buffer
> matching routines on transactions that don't have any item linked to
> it, which will cause xfs_trans_first_item to panic.  Compare this code
> in xfs_trans_buf_item_match:
> 
> 	for (licp = &tp->t_items; licp != NULL; licp = licp->lic_next) {
> 		if (xfs_lic_are_all_free(licp)) {
> 			ASSERT(licp == &tp->t_items);
> 			ASSERT(licp->lic_next == NULL);
> 			return NULL;
> 		}
> 
> 		...
> 	}
> 
> to this in xfs_trans_first_item:
> 
> 	licp = &tp->t_items;
> 	/*
> 	 * If it's not in the first chunk, skip to the second.
> 	 */
> 	if (xfs_lic_are_all_free(licp)) {
> 		licp = licp->lic_next;
> 	}

Michael Monnerie | 3 May 2010 08:49

Re: xfs_fsr question for improvement

On Saturday, 17 April 2010 Dave Chinner wrote:
> They have thousands of extents in them and they are all between
> 8-10GB in size, and IO from my VMs is still capable of saturating
> the disks backing these files. While I'd normally consider these
> files fragmented and candidates for running fsr on them, the number
> of extents is not actually a performance limiting factor and so
> there's no point in defragmenting them. Especially as that requires
> shutting down the VMs...

I personally care less about file fragmentation than about 
metadata/inode/directory fragmentation. This server gets accesses from 
numerous people,

# time find /mountpoint/ -inum 107901420
/mountpoint/some/dir/ectory/path/x.iso

real    7m50.732s
user    0m0.152s
sys     0m2.376s

It took nearly 8 minutes to search through that mount point, which is 
6TB big on a RAID-5 striped over 7 2TB disks, so search speed should be 
high. Especially as there are only 765.000 files on that disk:
Filesystem       Inodes   IUsed       IFree IUse%
/mountpoint  1258291200  765659  1257525541    1%

Wouldn't you say an 8 minutes search over just 765.000 files is slow, 
even when only using 7x 2TB 7200rpm disks in RAID-5?
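
For a sense of scale, the timing above can be turned into a rate.  The helper
below is only back-of-the-envelope arithmetic: inodes_per_second() is a
made-up name, and it assumes find visited every used inode exactly once
during the 7m50.732s run.

```c
#include <assert.h>

/* Average number of inodes visited per second of wall-clock time. */
double inodes_per_second(unsigned long inodes, unsigned int minutes,
			 double seconds)
{
	double elapsed = minutes * 60.0 + seconds;

	assert(elapsed > 0.0);
	return (double)inodes / elapsed;
}
```

With 765659 used inodes and 7 minutes 50.732 seconds, this comes to roughly
1,600 inodes per second.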

> > Would it be possible xfs_fsr defrags the meta data in a way that

Michael Monnerie | 3 May 2010 09:41

Re: xfs_fsr question for improvement

On Monday, 3 May 2010 Michael Monnerie wrote:
> When I created that XFS, I took two 2TB partitions, did pvcreate,
> vgcreate and lvcreate. Could it be that lvcreate automatically
>  thought it should do a RAID-0? Because all reads are equally split
>  between the two volumes. After a while, I added the 3rd 2TB volume,
>  and I can't see that behaviour there. So maybe this is the source of
>  all evil.

I found that lvcreate really is too smart:
       -i, --stripes Stripes
              Gives the number of stripes.  This is equal to the number
              of physical volumes to scatter the logical volume.

So it seems lvcreate did know that the VG was split among 2 "disks", and 
therefore used -i2 while I wanted -i1.
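
If the logical volume really was created with two stripes, an even split of
reads across both physical volumes is what one would expect.  The following
sketch shows the usual round-robin mapping of a striped volume.  It is not
LVM code: struct stripe_loc, map_striped() and the stripe size of 16 blocks
used in the note below are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Where a logical block lands: which physical volume, and which block on it. */
struct stripe_loc {
	unsigned int pv;
	uint64_t block;
};

/*
 * Map a logical block to a physical volume for a volume striped over
 * `stripes` devices in units of `stripe_blocks` blocks.  Consecutive
 * stripe units rotate across the devices.
 */
struct stripe_loc map_striped(uint64_t lblock, unsigned int stripes,
			      uint64_t stripe_blocks)
{
	struct stripe_loc loc;
	uint64_t unit;

	assert(stripes > 0 && stripe_blocks > 0);
	unit = lblock / stripe_blocks;          /* index of the stripe unit */
	loc.pv = (unsigned int)(unit % stripes);
	loc.block = (unit / stripes) * stripe_blocks + lblock % stripe_blocks;
	return loc;
}
```

With two stripes of 16 blocks each, logical blocks 0-15 land on the first
volume, 16-31 on the second, 32-47 on the first again, and so on, so a
sequential read alternates between the two.  With a single stripe every
block maps to the same volume.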

> reldiratime

Should be nodiratime, of course.

-- 
With kind regards,
Michael Monnerie, Ing. BSc

it-management Internet Services
http://proteger.at [pronounced: Prot-e-schee]
Tel: 0660 / 415 65 31

// We currently have two houses for sale:
// http://zmi.at/langegg/

Michael Monnerie | 3 May 2010 10:54

read slower than write on "mv"?

This is not XFS specific, I see it on every filesys. I do a 
"mv . /newlocation" and see this with iostat:

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
xvdb              608,60    2,60 25967,20   102,40    85,31     5,37    8,77   1,58  96,72
xvdg              0,00  608,60     0,00 26532,10    87,19     7,55   12,41   0,12   7,44

Reading takes 97% I/O time, writing 7%. This is on the same raidset, but 
I see the same when copying between two different single disks also. Or 
is it just an effect of write caching that writes look faster than 
reads?

-- 
With kind regards,
Michael Monnerie, Ing. BSc

it-management Internet Services
http://proteger.at [pronounced: Prot-e-schee]
Tel: 0660 / 415 65 31

// We currently have two houses for sale:
// http://zmi.at/langegg/
// http://zmi.at/haus2009/

Amit K. Arora | 3 May 2010 10:31

[PATCH] New testcase to check if fallocate respects RLIMIT_FSIZE or not

On Sat, May 01, 2010 at 06:18:46AM -0400, Christoph Hellwig wrote:
> On Sat, May 01, 2010 at 12:34:26PM +0530, Amit K. Arora wrote:
> > Agreed. How about doing this check in the filesystem specific fallocate
> > inode routines instead ? For example, in ext4 we could do :
> 
> That looks okay - in fact XFS should already have this check because
> it re-uses the setattr implementation to set the size.
> 
> Can you submit an xfstests testcase to verify this behaviour on all
> filesystems?

Here is the new testcase.

I have run this test on an x86_64 box on XFS and ext4 on 2.6.34-rc6. It
passes on XFS, but fails on ext4. Below is the snapshot of results
followed by the testcase itself.
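
The check being discussed could look roughly like the following user-space
model.  It is a sketch under assumptions, not the ext4 or XFS code and not
the xfstests testcase: falloc_check_limit() and NO_LIMIT are hypothetical
names, the limit argument stands in for RLIMIT_FSIZE, and the SIGXFSZ signal
that the kernel normally raises alongside EFBIG is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define NO_LIMIT INT64_MAX   /* hypothetical stand-in for RLIM_INFINITY */

/*
 * Decide whether a preallocation of `len` bytes at `offset` may proceed
 * on a file that is currently `cur_size` bytes long.  Returns 0 if it
 * may, or an errno value (EINVAL, EFBIG) if it must be refused.
 */
int falloc_check_limit(int64_t offset, int64_t len,
		       int64_t cur_size, int64_t limit)
{
	int64_t new_size;

	assert(cur_size >= 0 && limit >= 0);    /* invariants of the model */

	if (offset < 0 || len <= 0)
		return EINVAL;
	if (len > INT64_MAX - offset)
		return EFBIG;                   /* offset + len would overflow */

	new_size = offset + len;
	if (new_size <= cur_size)
		return 0;                       /* file does not grow */
	if (limit != NO_LIMIT && new_size > limit)
		return EFBIG;                   /* would exceed the size limit */
	return 0;
}
```

In this model a request that keeps the file within its current size is
always allowed, and only a request that would grow the file past the limit
is refused.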

--
Regards,
Amit Arora

Test results:
------------
# ./check 228
FSTYP         -- xfs (non-debug)
PLATFORM      -- Linux/x86_64 elm9m93 2.6.34-rc6

228 0s ...
Ran: 228
Passed all 1 tests

Peter Palfrader | 3 May 2010 13:54

[regression,bisected] 2.6.32.12: find(1) on xfs causes OOM

Hi,

I have an xfs filesystem in a KVM domain with 512megs of memory and 2 gigs of
swap.

The filesystem is 750g in size, of which some 500g are in use in about 6
million files.  (This XFS filesystem is exported via nfs4.  I haven't tested if
this makes any difference.)

Starting in 2.6.32.12 running something like "find | wc -l" on this
filesystem's mountpoint causes the OOM killer to kill off most of the
system.  (See kern.log[1])

With 2.6.32.11 the system does not behave like this.

Bisecting turned up the following commit.  Reverting it in 2.6.32.12
also results in a system that works.

| 9e1e9675fb29c0e94a7c87146138aa2135feba2f is first bad commit
| commit 9e1e9675fb29c0e94a7c87146138aa2135feba2f
| Author: Dave Chinner <david <at> fromorbit.com>
| Date:   Fri Mar 12 09:42:10 2010 +1100
| 
|     xfs: reclaim all inodes by background tree walks
|     
|     commit 57817c68229984818fea9e614d6f95249c3fb098 upstream
|     
|     We cannot do direct inode reclaim without taking the flush lock to
|     ensure that we do not reclaim an inode under IO. We check the inode
|     is clean before doing direct reclaim, but this is not good enough

Dave Chinner | 3 May 2010 13:52

Re: read slower than write on "mv"?

On Mon, May 03, 2010 at 10:54:28AM +0200, Michael Monnerie wrote:
> This is not XFS specific, I see it on every filesys. I do a 
> "mv . /newlocation" and see this with iostat:
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
> xvdb              608,60    2,60 25967,20   102,40    85,31     5,37    8,77   1,58  96,72
> xvdg              0,00  608,60     0,00 26532,10    87,19     7,55   12,41   0,12   7,44
> 
> Reading takes 97% I/O time, writing 7%. This is on the same raidset, but 
> I see the same when copying between two different single disks also. Or 
> is it just an effect of write caching that writes look faster than 
> reads?

Yes, just an effect of write caching hiding IO latency.

Cheers,

Dave.
-- 
Dave Chinner
david <at> fromorbit.com
