Tao Ma | 1 Dec 2011 08:39

Re: [PATCH] Set the initial TRIM information as TRIMMED

Hi Kyungmin,
On 12/01/2011 03:00 PM, Kyungmin Park wrote:
> From: Kyungmin Park <kyungmin.park <at> samsung.com>
> 
> Now trim information doesn't stored at disk so every boot time. it's cleared.
> and do the trim all disk groups.
> But assume that it's already trimmed at previous time so don't need to trim it again. So set the intial state
as trimmed.
sorry, I don't get your meaning here.
Why can we assume that the group is already trimmed since it isn't
stored in the disk?

Thanks
Tao
> 
> Signed-off-by: Kyungmin Park <kyungmin.park <at> samsung.com>
> ---
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index e2d8be8..97ef342 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
>  <at>  <at>  -1098,6 +1098,12  <at>  <at>  int ext4_mb_init_group(struct super_block *sb, ext4_group_t group)
>  		goto err;
>  	}
>  	mark_page_accessed(page);
> +
> +	/*
> +	 * TRIM information is not stored at disk so set the initial
> +	 * state as trimmed. Since previous time it's already trimmed all
> +	 */
(Continue reading)

Kyungmin Park | 1 Dec 2011 09:19
Favicon

Re: [PATCH] Set the initial TRIM information as TRIMMED

On 12/1/11, Tao Ma <tm <at> tao.ma> wrote:
> Hi Kyungmin,
> On 12/01/2011 03:00 PM, Kyungmin Park wrote:
>> From: Kyungmin Park <kyungmin.park <at> samsung.com>
>>
>> Now trim information doesn't stored at disk so every boot time. it's
>> cleared.
>> and do the trim all disk groups.
>> But assume that it's already trimmed at previous time so don't need to
>> trim it again. So set the intial state as trimmed.
> sorry, I don't get your meaning here.
> Why can we assume that the group is already trimmed since it isn't
> stored in the disk?
To avoid the first time trim operation.
Every boot time. run the fitrim then it trims all block groups again.
but it's already done at previous time. so don't need to trim it
again.
Doesn't make sense? I think it's not designed behavior.

In your patch at http://patchwork.ozlabs.org/patch/102918/

with the patch:
[root <at> boyu-tm linux-2.6]# time ./ftrim /mnt/ext4/a
real	0m5.625s
user	0m0.000s
sys	0m1.269s
[root <at> boyu-tm linux-2.6]# time ./ftrim /mnt/ext4/a
real	0m0.002s
user	0m0.000s
sys	0m0.001s
(Continue reading)

Tao Ma | 1 Dec 2011 09:30

Re: [PATCH] Set the initial TRIM information as TRIMMED

On 12/01/2011 04:19 PM, Kyungmin Park wrote:
> On 12/1/11, Tao Ma <tm <at> tao.ma> wrote:
>> Hi Kyungmin,
>> On 12/01/2011 03:00 PM, Kyungmin Park wrote:
>>> From: Kyungmin Park <kyungmin.park <at> samsung.com>
>>>
>>> Now trim information doesn't stored at disk so every boot time. it's
>>> cleared.
>>> and do the trim all disk groups.
>>> But assume that it's already trimmed at previous time so don't need to
>>> trim it again. So set the intial state as trimmed.
>> sorry, I don't get your meaning here.
>> Why can we assume that the group is already trimmed since it isn't
>> stored in the disk?
> To avoid the first time trim operation.
> Every boot time. run the fitrim then it trims all block groups again.
> but it's already done at previous time. so don't need to trim it
> again.
> Doesn't make sense? I think it's not designed behavior.
You make the assumption that we run the fitrim every time at boot time.
But what if the user don't run it at all? I guess "run fitrim during
boot" is your firmware's behaviour and it isn't an assumption for all
the other users.

Having said that, this flag will be cleared whenever some
blocks/clusters are freed. So maybe it doesn't matter for setting this
flag during the group initialization.

Thanks
Tao
(Continue reading)

Allison Henderson | 1 Dec 2011 21:13
Picon

Re: [PATCH 1/2] ext4: let mpage_submit_io works well when blocksize < pagesize

On 11/23/2011 02:15 AM, Yongqiang Yang wrote:
> If there is a unwritten but clean buffer in a page and there is a dirty buffer
> after the buffer, then mpage_submit_io does not write the dirty buffer out.
> As a result, da_writepages loops forever.
>
> This patch fixes the problem by checking dirty flag.
>
> Signed-off-by: Yongqiang Yang<xiaoqiangnk <at> gmail.com>
> ---
>   fs/ext4/inode.c |    7 +++++--
>   1 files changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 755f6c7..20a1d17 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
>  <at>  <at>  -1339,8 +1339,11  <at>  <at>  static int mpage_da_submit_io(struct mpage_da_data *mpd,
>   					clear_buffer_unwritten(bh);
>   				}
>
> -				/* skip page if block allocation undone */
> -				if (buffer_delay(bh) || buffer_unwritten(bh))
> +				/*
> +				 * skip page if block allocation undone and
> +				 * block is dirty
> +				 */
> +				if (ext4_bh_delay_or_unwritten(NULL, bh))
>   					skip_page = 1;
>   				bh = bh->b_this_page;
>   				block_start += bh->b_size;
(Continue reading)

Darrick J. Wong | 1 Dec 2011 21:14
Picon
Favicon

[PATCH 11/14] crc32: Bolt on crc32c

Reuse the existing crc32 code to stamp out a crc32c implementation.

Signed-off-by: Darrick J. Wong <djwong <at> us.ibm.com>
---
 include/linux/crc32.h |    2 ++
 lib/Kconfig           |    8 +++---
 lib/crc32.c           |   62 +++++++++++++++++++++++++++++++------------------
 lib/crc32defs.h       |    7 ++++++
 lib/gen_crc32table.c  |   35 ++++++++++++++++++++++------
 5 files changed, 80 insertions(+), 34 deletions(-)

diff --git a/include/linux/crc32.h b/include/linux/crc32.h
index 391a259..68267b6 100644
--- a/include/linux/crc32.h
+++ b/include/linux/crc32.h
 <at>  <at>  -11,6 +11,8  <at>  <at> 
 extern u32  crc32_le(u32 crc, unsigned char const *p, size_t len);
 extern u32  crc32_be(u32 crc, unsigned char const *p, size_t len);

+extern u32  __crc32c_le(u32 crc, unsigned char const *p, size_t len);
+
 #define crc32(seed, data, length)  crc32_le(seed, (unsigned char const *)(data), length)

 /*
diff --git a/lib/Kconfig b/lib/Kconfig
index 2bc5834..cfddafc 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
 <at>  <at>  -51,14 +51,14  <at>  <at>  config CRC_ITU_T
 	  functions require M here.
(Continue reading)

Darrick J. Wong | 1 Dec 2011 21:14
Picon
Favicon

[PATCH 05/14] crc32: Miscellaneous cleanups

Misc cleanup of lib/crc32.c and related files
	- removed unnecessary header files.
	- straightened out some convoluted ifdef's
	- rewrote some references to 2 dimensional arrays as 1 dimensional
	  arrays to make them correct. I.e. replaced tab[i] with tab[0][i].
	- a few trivial whitespace changes
	- fixed a warning in gen_crc32tables.c caused by a mismatch in the
	  type of the pointer passed to output table. Since the table is
	  only used at kernel compile time, it is simpler to make the table
	  big enough to hold the largest column size used. One cannot make the
	  column size smaller in output_table because it has to be used by
	  both the le and be tables and they can have different column sizes.

From: Bob Pearson <rpearson <at> systemfabricworks.com>
Signed-off-by: Bob Pearson <rpearson <at> systemfabricworks.com>
[djwong <at> us.ibm.com: Minor changelog tweaks]
Signed-off-by: Darrick J. Wong <djwong <at> us.ibm.com>
---
 lib/crc32.c          |  104 +++++++++++++++++---------------------------------
 lib/gen_crc32table.c |    6 +--
 2 files changed, 39 insertions(+), 71 deletions(-)

diff --git a/lib/crc32.c b/lib/crc32.c
index c93c9ae..2a87ea2 100644
--- a/lib/crc32.c
+++ b/lib/crc32.c
 <at>  <at>  -23,13 +23,10  <at>  <at> 
 /* see: Documentation/crc32.txt for a description of algorithms */

 #include <linux/crc32.h>
(Continue reading)

Darrick J. Wong | 1 Dec 2011 21:13
Picon
Favicon

[PATCH v5.2 00/14] crc32c: Add faster algorithm and self-test code

Hi all,

This patchset (re)uses Bob Pearson's crc32 slice-by-8 code to stamp out a
software crc32c implementation.  It removes the crc32c implementation in
crypto/ in favor of using the stamped-out one in lib/.  There is also a change
to Kconfig so that the kernel builder can pick an implementation best suited
for the hardware.

The motivation for this patchset is that I am working on adding full metadata
checksumming to ext4.  As far as performance impact of adding checksumming
goes, I see nearly no change with a standard mail server ffsb simulation.  On a
test that involves only file creation and deletion and extent tree writes, I
see a drop of about 50 pcercent with the current kernel crc32c implementation;
this improves to a drop of about 20 percent with the enclosed crc32c code.

When metadata is usually a small fraction of total IO, this new implementation
doesn't help much because metadata is usually a small fraction of total IO.
However, when we are doing IO that is almost all metadata (such as rm -rf'ing a
tree), then this patch speeds up the operation substantially.

Incidentally, given that iscsi, sctp, and btrfs also use crc32c, this patchset
should improve their speed as well.  I have not yet quantified that, however.
This latest submission combines Bob's patches from late August 2011 with mine
so that they can be one coherent patch set.  Please excuse my inability to
combine some of the patches; I've been advised to leave Bob's patches alone and
build atop them instead. :/

Since the last posting, I've also collected some crc32c test results on a bunch
of different x86/powerpc/sparc platforms.  The results can be viewed here:
http://goo.gl/sgt3i ; the "crc32-kern-le" and "crc32c" columns describe the
(Continue reading)

Allison Henderson | 1 Dec 2011 21:55
Picon

Re: [PATCH 4/6] ext4: remove code related to punching hole from ext4_ext_insert_extent

On 11/16/2011 07:03 PM, Yongqiang Yang wrote:
> Punch hole should never call ext4_ext_insert_extent, so this patch
> removes code related to it from ext4_ext_insert_extent.
>
> Signed-off-by: Yongqiang Yang<xiaoqiangnk <at> gmail.com>
> ---
>   fs/ext4/extents.c |    2 --
>   1 files changed, 0 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index 6888d1a..720070d 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
>  <at>  <at>  -1737,8 +1737,6  <at>  <at>  int ext4_ext_insert_extent(handle_t *handle, struct inode *inode,
>   	 * There is no free space in the found leaf.
>   	 * We're gonna add a new leaf in the tree.
>   	 */
> -	if (flag&  EXT4_GET_BLOCKS_PUNCH_OUT_EXT)
> -		flags = EXT4_MB_USE_ROOT_BLOCKS;
>   	err = ext4_ext_create_new_leaf(handle, inode, flags, path, newext);
>   	if (err)
>   		goto cleanup;

Hi Yongqiang,

Actually I believe it does end up inserting an extent if an extent gets 
split. For example, we punch a hole in the middle of an extent, so we 
first split the extent into three pieces, and remove the middle piece.

Because inserting the extra extents can require extra blocks, the 
(Continue reading)

Eric Sandeen | 1 Dec 2011 23:06
Picon
Favicon
Gravatar

Re: [PATCH] Set the initial TRIM information as TRIMMED

On 12/1/11 1:00 AM, Kyungmin Park wrote:
> From: Kyungmin Park <kyungmin.park <at> samsung.com>
> 
> Now trim information doesn't stored at disk so every boot time. it's cleared.
> and do the trim all disk groups.
> But assume that it's already trimmed at previous time so don't need to trim it again. So set the intial state
as trimmed.
> 
> Signed-off-by: Kyungmin Park <kyungmin.park <at> samsung.com>
> ---
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index e2d8be8..97ef342 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
>  <at>  <at>  -1098,6 +1098,12  <at>  <at>  int ext4_mb_init_group(struct super_block *sb, ext4_group_t group)
>  		goto err;
>  	}
>  	mark_page_accessed(page);
> +
> +	/*
> +	 * TRIM information is not stored at disk so set the initial
> +	 * state as trimmed. Since previous time it's already trimmed all
> +	 */
> +	EXT4_MB_GRP_SET_TRIMMED(this_grp);

Hm, so if there were freed but un-trimmed blocks at this point, we will
never trim them until we free _another_ block in the group, right?  That
might be a reasonable tradeoff, but it is somewhat surprising behavior.

i.e. say we do:
(Continue reading)

Eric Sandeen | 1 Dec 2011 23:33
Picon
Favicon
Gravatar

Re: WARNING: at fs/inode.c:884 unlock_new_inode+0x34/0x59()

On 11/28/11 1:08 PM, Alex wrote:
> On 11/28/2011 09:28 PM, Theodore Tso wrote:
>> On Nov 28, 2011, at 1:12 PM, Alex wrote:
>>
>>> Hi Ted,
>>>
>>> This is full dmesg with suspend to disk. This time no fs errors happened. And unfortunately I don't
remember what kind of errors were there last time.
>>>
>> OK, so you have a very large number of warnings, spanning over several seconds.   That sounds to me like a
memory corruption problem, as opposed to some kind of race that only happens for a symlink creation
happening right during the suspend/resume.
>>
>> I can fix the warning and arrange to print more debugging information so that if this happens again, we get
more information, but if you can't reproduce it, I'm not sure there's much more I can do for now.
>>
>> Regards,
>>
>> -- Ted
>>
> Yes I can reproduce this, it happens after every s2disk.

And FYI a few other users are hitting it, see https://bugzilla.redhat.com/show_bug.cgi?id=744275

Several of the bugs dup'd to that bug are from distinct users.

-Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo <at> vger.kernel.org
(Continue reading)


Gmane