Sinha_Himanshu | 6 Jul 2006 20:02

RE: Limited write bandwidth from ext3


We tried the extents+mballoc+delalloc patches suggested by Andreas and found
that they significantly improved our benchmark: write bandwidth
increased from 144 MBps to 214 MBps. We are now at about 85% of the bandwidth
of writing to an ext2 file, which in turn is about 82% of the
bandwidth of writing to the block device. We are analyzing our
traces to determine the cause of these differences. So far, we see that
during writes to the ext3 file, LUN writes periodically wait for 5 reads,
while during writes to the ext2 file, LUN writes periodically wait for
only one read.

Workload: Single-threaded 512 KB writes to a new file (all figures in MBps).

                                             RedHat 4 U1             2.6.16.8
                                             (2.6.9-based kernel)    kernel
Block device                                 308                     306
Ext2 file                                    267                     255
Ext3                                         138                     144
Ext3 with patches                            N/A                     216
Ext3 with patches, journal on separate LUN                           215
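
As a rough cross-check of the percentages quoted above, the ratios can be
recomputed from the 2.6.16.8 column of the table. This is only an arithmetic
sketch; the variable and function names are illustrative, and the results may
differ slightly from the rounded percentages given in the message.

```python
# Figures (MBps) copied from the 2.6.16.8 column of the table above.
block_device = 306
ext2_file = 255
ext3_stock = 144
ext3_patched = 216


def percent_of(value, baseline):
    """Return value as a percentage of baseline."""
    return 100.0 * value / baseline


patched_vs_ext2 = percent_of(ext3_patched, ext2_file)
ext2_vs_block = percent_of(ext2_file, block_device)
stock_vs_ext2 = percent_of(ext3_stock, ext2_file)

print(f"patched ext3 vs ext2 file : {patched_vs_ext2:.1f}%")
print(f"ext2 file vs block device : {ext2_vs_block:.1f}%")
print(f"stock ext3 vs ext2 file   : {stock_vs_ext2:.1f}%")
```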

Himanshu

-----Original Message-----
From: Andreas Dilger [mailto:adilger <at> clusterfs.com] 
Sent: Wednesday, June 21, 2006 4:54 PM
To: Sinha, Himanshu
Cc: ext3-users <at> redhat.com
Subject: Re: Limited write bandwidth from ext3


Herta Van den Eynde | 10 Jul 2006 12:29

chattr +T not implemented?

We run a third-party application that creates an inordinate number of 
subdirectories in a single directory.  To speed up I/O, I wanted to set 
the T attribute on the directory that will hold the subdirectories.  The 
"chattr +T /usr/local/lepus-bb/a-0607" command returns status 0, but 
when I verify the setting, the attribute isn't there:

   # lsattr -d /usr/local/lepus-bb/a-0607
   ------------- /usr/local/lepus-bb/a-0607

Is this attribute implemented?  The manual page entry for chattr 
suggests it is, but when I check the chattr usage, "T" isn't listed:

   #chattr -v
   usage: chattr [-RV] [-+=AacDdijsSu] [-v version] files...

FWIW
   # cat /proc/version
   Linux version 2.4.21-40.ELsmp (bhcompile <at> hs20-bc1-7.build.redhat.com)
   (gcc version 3.2.3 20030502 (Red Hat Linux 3.2.3-54)) #1 SMP Thu Feb 2
   22:22:39 EST 2006

Kind regards,

Herta

Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
Theodore Tso | 10 Jul 2006 20:08

Re: chattr +T not implemented?

On Mon, Jul 10, 2006 at 12:29:50PM +0200, Herta Van den Eynde wrote:
> Is this attribute implemented?  The manual pages entry for chattr 
> suggests it is, but when I check the chattr usage, "T" isn't listed:
> 
>   #chattr -v
>   usage: chattr [-RV] [-+=AacDdijsSu] [-v version] files...
> 
> FWIW
>   # cat /proc/version
>   Linux version 2.4.21-40.ELsmp (bhcompile <at> hs20-bc1-7.build.redhat.com)
>   (gcc version 3.2.3 20030502 (Red Hat Linux 3.2.3-54)) #1 SMP Thu Feb 2
>   22:22:39 EST 2006

To quote from the man page:

       A  directory  with  the  'T' attribute will be deemed to be the top of
       directory hierarchies for the purposes of  the  Orlov  block  allocator
       (which is used in on systems with Linux 2.5.46 or later).

You're using Linux version 2.4.21....

							- Ted
Andreas Dilger | 10 Jul 2006 20:37

Re: chattr +T not implemented?

On Jul 10, 2006  12:29 +0200, Herta Van den Eynde wrote:
> We run a third party application that creates an inordinate amount of 
> subdirectories in a single directory.  To speed up I/O, I wanted to set 
> the T attribute on the directory that will hold the subdirectories.  The 
> "chattr +T /usr/local/lepus-bb/a-0607" command returns status 0, but 
> when I verify the setting, the attribute isn't there:
> 
>   # lsattr -d /usr/local/lepus-bb/a-0607
>   ------------- /usr/local/lepus-bb/a-0607
> 
> Is this attribute implemented?  The manual pages entry for chattr 
> suggests it is, but when I check the chattr usage, "T" isn't listed:
> 
>   #chattr -v
>   usage: chattr [-RV] [-+=AacDdijsSu] [-v version] files...

man chattr(1) reports:
	A  directory  with  the  ’T’ attribute will be deemed to be the top of
	directory hierarchies for the purposes of the  Orlov  block allocator
	(which is used in on systems with Linux 2.5.46 or later).

You can also check with "debugfs -c -R 'stat lepus-bb/a-0607' /dev/XXXX"
(assuming /usr/local/ is the mountpoint).  It may be that the kernel is
not allowing the T attribute in the EXT3_FL_USER_VISIBLE mask, though it
does show correctly in my kernel.

#define EXT3_TOPDIR_FL                  0x00020000 /* Top of directory hierarchies*/
#define EXT3_FL_USER_VISIBLE            0x0003DFFF /* User visible flags */
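
A quick way to see that the two constants quoted above are consistent is to
check that the T bit falls inside the user-visible mask. This is only an
arithmetic sketch on the values from this message; the helper name is
hypothetical, and it cannot tell you whether a particular kernel (such as the
2.4.21 one in question) defines the same mask.

```python
# Values copied from the #define lines quoted above.
EXT3_TOPDIR_FL = 0x00020000
EXT3_FL_USER_VISIBLE = 0x0003DFFF


def is_user_visible(flag, mask=EXT3_FL_USER_VISIBLE):
    """True if every bit of flag is also set in mask (hypothetical helper)."""
    return (flag & mask) == flag


print("T flag inside user-visible mask:", is_user_visible(EXT3_TOPDIR_FL))
```

If the mask in a given kernel lacks bit 0x00020000, the same check would
report False, which would match the behaviour Herta describes.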

Cheers, Andreas

Herta Van den Eynde | 11 Jul 2006 00:11

Re: chattr +T not implemented?

Theodore Tso wrote:
> On Mon, Jul 10, 2006 at 12:29:50PM +0200, Herta Van den Eynde wrote:
> 
>>Is this attribute implemented?  The manual pages entry for chattr 
>>suggests it is, but when I check the chattr usage, "T" isn't listed:
>>
>>  #chattr -v
>>  usage: chattr [-RV] [-+=AacDdijsSu] [-v version] files...
>>
>>FWIW
>>  # cat /proc/version
>>  Linux version 2.4.21-40.ELsmp (bhcompile <at> hs20-bc1-7.build.redhat.com)
>>  (gcc version 3.2.3 20030502 (Red Hat Linux 3.2.3-54)) #1 SMP Thu Feb 2
>>  22:22:39 EST 2006
> 
> 
> To quote from the man page:
> 
>        A  directory  with  the  'T' attribute will be deemed to be the top of
>        directory hierarchies for the purposes of  the  Orlov  block  allocator
>        (which is used in on systems with Linux 2.5.46 or later).
> 
> You're using Linux version 2.4.21....
> 
Ouch.  Missed that.  Thanks for pointing it out, Theodore.

Kind regards,

Herta


Zeremski Boris | 13 Jul 2006 09:15

detail explain of file creation process

Hi,

Could someone point me to documentation, or explain in detail, the process
of creating a file (space reservation, inode, ...)?

What happens at the low level?

Thanks

christian | 13 Jul 2006 19:24

Re: detail explain of file creation process

On Thu, 13 Jul 2006, Zeremski Boris wrote:
> Could someone point me to documentation or explain
> in detail, process of creating file.(space reservation, inode....)
> What is happen at low lavel?

does this: http://e2fsprogs.sourceforge.net/ext2intro.html suffice?

-- 
BOFH excuse #332:

suboptimal routing experience
Zeremski Boris | 14 Jul 2006 07:09

RE: detail explain of file creation process


Hi, this link is great; it explains the basic concepts of the ext2/3 file
system (inode, directory, soft/hard links...).

What I am interested in is a more detailed description of the file creation
process: what goes on from the moment I run, for example, 'touch test.file'
until that file really exists on the file system.

Where can I find this kind of information?

> 
> > Could someone point me to documentation or explain
> > in detail, process of creating file.(space reservation, inode....)
> > What is happen at low lavel?
> 
> does this: http://e2fsprogs.sourceforge.net/ext2intro.html suffice?
> 
Andreas Dilger | 14 Jul 2006 07:20

Re: detail explain of file creation process

On Jul 14, 2006  07:09 +0200, Zeremski Boris wrote:
> Hi, this link is great, explain basic concept of ext2/3 file system (inode,
> directory, soft/hard links...).
> 
> What I am interested in, is more detail process of creating file. What is
> going on when, for example, make 'touch test.file' till that file really
> start existing on file system. 
> 
> Where can I find his kind of information?

If you run UML with GDB, you can set a breakpoint at "sys_open" and follow
it around from there.  Also of interest are ext3_lookup and ext3_create.

If you don't find any documentation, you might consider writing a wiki
page for this as you figure it out.
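
To see the user-space side of this, the sketch below creates an empty file
with an open(2) call carrying O_CREAT, which is the call that enters the
kernel at sys_open, and then prints the inode the filesystem assigned. It
assumes that 'touch test.file' on a missing file reduces to roughly this
(touch also updates timestamps); the helper name and the scratch directory
are illustrative only.

```python
import os
import tempfile


def create_empty_file(path):
    """Create path with open(2) + O_CREAT and return its stat result."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o666)
    try:
        return os.fstat(fd)
    finally:
        os.close(fd)


# Use a scratch directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as scratch_dir:
    st = create_empty_file(os.path.join(scratch_dir, "test.file"))
    print("inode:", st.st_ino, "size:", st.st_size, "blocks:", st.st_blocks)
```

Running it under strace shows the single open call to break on; what happens
inside the kernel after that point still has to be followed with GDB as
described above.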

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
Ling C. Ho | 13 Jul 2006 22:13

Ext3 overhead vs Raw

Hi,

I am trying to find a way to speed up read access on an ext3 filesystem.
I did some tests using dd with different block sizes, with and without
direct I/O, etc. The test file is about 1 GB in size and is spread across 25
fragments (found using filefrag). The block size is 4k. I have also tried
setting the readahead buffer using blockdev, from 256 to 32767.

time /root/dd conv=nocreat ibs=4096 obs=4096 if=/sam/cache/test/test3 of=/dev/null

The best real elapsed time I get is about 23.5s.

If I dd the same amount of data from the disk device itself, I get about
18.5s, which matches what hdparm -tT gives me.

Comparing strace outputs, I can see that the read system calls reading from
ext3 take 30-35% longer to complete compared to the raw device. Is this
expected, or can I expect better performance?

I am running kernel.org kernel 2.6.12.
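
For comparing block sizes without dd, a minimal timing loop along the
following lines could be used. It is a sketch only: the function name is
hypothetical, the demo reads a small scratch file rather than the 1 GB test
file, and the reads go through the page cache, so the numbers are not
comparable to a cold-cache or direct I/O run.

```python
import os
import tempfile
import time


def timed_read(path, block_size=4096):
    """Read path sequentially in block_size chunks; return (bytes, seconds)."""
    total = 0
    start = time.perf_counter()
    # buffering=0 so each read() maps to one read system call.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start


# Demo on a 1 MiB scratch file (a stand-in for the real test file).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * (1 << 20))
    scratch = tmp.name
try:
    for bs in (4096, 65536, 1 << 20):
        nbytes, secs = timed_read(scratch, bs)
        print(f"block size {bs:>8}: {nbytes} bytes in {secs:.6f}s")
finally:
    os.unlink(scratch)
```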

Thanks,

...
ling
