Theodore Ts'o | 1 May 2009 06:37
Picon
Picon
Favicon
Gravatar

[PATCH 2/3] ext4: Fix and simplify s_dirt handling

The s_dirt flag wasn't completely handled correctly, but it didn't
really matter when journalling was enabled.  It turns out that when
ext4 runs without a journal, we don't clear s_dirt in places where we
should have, with the result that the high-level write_super()
function was writing the superblock when it wasn't necessary.

So we fix this by making ext4_commit_super() clear the s_dirt flag,
and removing many of the other places where s_dirt is manipulated.
When journalling is enabled, the s_dirt flag might be left set more
often, but s_dirt really doesn't matter when journalling is enabled.

Signed-off-by: "Theodore Ts'o" <tytso <at> mit.edu>
---
 fs/ext4/super.c |   14 +++-----------
 1 files changed, 3 insertions(+), 11 deletions(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index ad4c9be..7c7a08a 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
 <at>  <at>  -3128,7 +3128,6  <at>  <at>  static int ext4_load_journal(struct super_block *sb,
 	if (journal_devnum &&
 	    journal_devnum != le32_to_cpu(es->s_journal_dev)) {
 		es->s_journal_dev = cpu_to_le32(journal_devnum);
-		sb->s_dirt = 1;

 		/* Make sure we flush the recovery flag to disk. */
 		ext4_commit_super(sb, 1);
 <at>  <at>  -3168,7 +3167,7  <at>  <at>  static int ext4_commit_super(struct super_block *sb, int sync)
 					&EXT4_SB(sb)->s_freeblocks_counter));
(Continue reading)

Theodore Ts'o | 1 May 2009 06:37
Picon
Picon
Favicon
Gravatar

[PATCH 3/3] ext4: Use separate super_operations structure for no_journal filesystems

By using a separate super_operations structure for filesystems that
have and don't have journals, we can simply ext4_write_super() ---
which is only needed when no journal is present --- and ext4_freeze(),
ext4_unfreeze(), and ext4_sync_fs(), which are only needed when the
journal is present.

Signed-off-by: "Theodore Ts'o" <tytso <at> mit.edu>
---
 fs/ext4/super.c |   98 ++++++++++++++++++++++++++-----------------------------
 1 files changed, 46 insertions(+), 52 deletions(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 7c7a08a..241a81e 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
 <at>  <at>  -65,7 +65,6  <at>  <at>  static const char *ext4_decode_error(struct super_block *sb, int errno,
 static int ext4_remount(struct super_block *sb, int *flags, char *data);
 static int ext4_statfs(struct dentry *dentry, struct kstatfs *buf);
 static int ext4_unfreeze(struct super_block *sb);
-static void ext4_write_super(struct super_block *sb);
 static int ext4_freeze(struct super_block *sb);

 
 <at>  <at>  -995,7 +994,6  <at>  <at>  static const struct super_operations ext4_sops = {
 	.dirty_inode	= ext4_dirty_inode,
 	.delete_inode	= ext4_delete_inode,
 	.put_super	= ext4_put_super,
-	.write_super	= ext4_write_super,
 	.sync_fs	= ext4_sync_fs,
 	.freeze_fs	= ext4_freeze,
(Continue reading)

Theodore Ts'o | 1 May 2009 06:37
Picon
Picon
Favicon
Gravatar

[PATCH 1/3] ext4: Simplify ext4_commit_super()'s function signature

The ext4_commit_super() function took both a struct super_block * and
a struct ext4_super_block *, but the struct ext4_super_block can be
derived from the struct super_block.

Signed-off-by: "Theodore Ts'o" <tytso <at> mit.edu>
---
 fs/ext4/super.c |   33 ++++++++++++++++-----------------
 1 files changed, 16 insertions(+), 17 deletions(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 3e509bc..ad4c9be 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
 <at>  <at>  -54,8 +54,7  <at>  <at>  static struct kset *ext4_kset;

 static int ext4_load_journal(struct super_block *, struct ext4_super_block *,
 			     unsigned long journal_devnum);
-static int ext4_commit_super(struct super_block *sb,
-			      struct ext4_super_block *es, int sync);
+static int ext4_commit_super(struct super_block *sb, int sync);
 static void ext4_mark_recovery_complete(struct super_block *sb,
 					struct ext4_super_block *es);
 static void ext4_clear_journal_err(struct super_block *sb,
 <at>  <at>  -306,7 +305,7  <at>  <at>  static void ext4_handle_error(struct super_block *sb)
 		printk(KERN_CRIT "Remounting filesystem read-only\n");
 		sb->s_flags |= MS_RDONLY;
 	}
-	ext4_commit_super(sb, es, 1);
+	ext4_commit_super(sb, 1);
 	if (test_opt(sb, ERRORS_PANIC))
(Continue reading)

Theodore Tso | 1 May 2009 06:39
Picon
Picon
Favicon
Gravatar

Re: Question on block group allocation

On Wed, Apr 29, 2009 at 03:29:55PM -0700, Curt Wohlgemuth wrote:
> Yes, we have this patch.  I'm not sure if we have the "1 or 2" bug
> fixes you refer to above; do you have commits for these?

b5451f7b  ext4: Fix potential inode allocation soft lockup in Orlov allocator
6b82f3cb  ext4: really print the find_group_flex fallback warning only once
7d39db14  ext4: Use struct flex_groups to calculate get_orlov_stats()
9f24e420  ext4: Use atomic_t's in struct flex_groups

						- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo <at> vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Christoph Hellwig | 1 May 2009 09:02
Favicon

Re: [PATCH 2/3] ext4: Fix and simplify s_dirt handling

On Fri, May 01, 2009 at 12:37:16AM -0400, Theodore Ts'o wrote:
> The s_dirt flag wasn't completely handled correctly, but it didn't
> really matter when journalling was enabled.  It turns out that when
> ext4 runs without a journal, we don't clear s_dirt in places where we
> should have, with the result that the high-level write_super()
> function was writing the superblock when it wasn't necessary.
> 
> So we fix this by making ext4_commit_super() clear the s_dirt flag,
> and removing many of the other places where s_dirt is manipulated.
> When journalling is enabled, the s_dirt flag might be left set more
> often, but s_dirt really doesn't matter when journalling is enabled.

Btw, you might want to move all s_dirt setting into a wrapper, so that
it never gets set for journal mode.  That way we can avoid superflous
calls into the filesystem in that default mode.

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo <at> vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Christoph Hellwig | 1 May 2009 09:04
Favicon

Re: [PATCH 3/3] ext4: Use separate super_operations structure for no_journal filesystems

On Fri, May 01, 2009 at 12:37:17AM -0400, Theodore Ts'o wrote:
> By using a separate super_operations structure for filesystems that
> have and don't have journals, we can simply ext4_write_super() ---
> which is only needed when no journal is present --- and ext4_freeze(),
> ext4_unfreeze(), and ext4_sync_fs(), which are only needed when the
> journal is present.

FYI: I will make ->sync_fs mandatory pretty soon.  At that point
->write_super will only be left for d_dirt-induced periodic writeback.
(Still have to figure out what to do about file_fsync, but that won't
affect ext4)

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo <at> vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Nick Dokos | 1 May 2009 10:46
Picon
Favicon

[PATCH 0/6][64-bit] Overview

With this set of patches, I can go through a mkfs/fsck cycle with a
32TiB filesystem in four different configurations:

   o flex_bg off, no raid parameters
   o flex_bg off, raid parameters
   o flex_bg on, no raid parameters
   o flex_bg on, raid parameters

There are no errors and the layouts seem reasonable - in the first two
cases, I've checked the block and inode bitmaps of the four groups that
are not marked BG_BLOCK_UNINIT and they look correct.  I'm spot checking
some bitmaps in the last two cases but that's a longer process.

The fs is built on an LVM volume that consists of 16 physical volumes,
with a stripe size of 128 KiB. Each physical volume is a striped LUN
(also with a 128KiB stripe size) exported by an MSA1000 RAID
controller. There are 4 controllers, each with 28 300GiB, 15Krpm SCSI
disks. Each controller exports 4 LUNs. Each LUN is 2TiB (that's
a limitation of the hardware). So each controller exports 8TiB and
four of them provide the 32TiB for the filesystem.

The machine is a DL585g5: 4 slots, each with a quad core AMD cpu
(/proc/cpuinfo says:

vendor_id	: AuthenticAMD
cpu family	: 16
model		: 2
model name	: Quad-Core AMD Opteron(tm) Processor 8356
stepping	: 3
cpu MHz		: 2310.961
(Continue reading)

Nick Dokos | 1 May 2009 10:46
Picon
Favicon

[PATCH 1/6][64-bit] Eliminate erroneous blk_t casts in ext2fs_get_free_blocks2().

Running e2fsck -n -f <device> on a brand-new, never-mounted 32Tib fs
was producing checksum errors on half the groups (the lower half)
before it got to pass1 and block or inode bitmap conflicts on some
groups in the upper half. The checksum errors were actually caused by
the conflicts: e2fsck does a sanity check on block allocations, finds
a conflict, and uses the backup superblock at 32768. But when it does
that, it assumes the worst: it clears the UNINIT FLAGS and zeroes the
free inode count for each group, assuming that it will fix them up
during the run. That triggers the checksum errors (although it's not
clear to me why only the lower half of the groups get checksum
errors.)

dumpe2fs showed that the conflicts were not artifacts of e2fsck
processing: they existed on disk. In fact, the block bitmap of a group
conflicted with the inode bitmap of the group fifteen steps further
along. That smelled like flex_bg, so I remade the fs with flex_bg
off and then *every* group in the upper half (more precisely, with
blocks above the 32-bit boundary) had a conflict with itself: the
block bitmap and inode bitmap were allocated on top of each other.
That led to ext2fs_get_free_blocks2() which was truncating 64-bit block
numbers by casting them to blk_t.

This patch eliminates the casts. The resulting file system is free of
the conflicts (with or without flex_bg.) However, it does not fsck
cleanly yet: there are block bitmap differences  in the very last group.

Signed-off-by: Nick Dokos <nicholas.dokos <at> hp.com>
---
 lib/ext2fs/alloc.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)
(Continue reading)

Nick Dokos | 1 May 2009 10:46
Picon
Favicon

[PATCH 2/6] [64-bit] write_bitmaps(): blk_t -> blk64_t

write_bitmaps() was truncating blocks numbers to 32 bits and therefore
writing at the wrong place on the disk.

Signed-off-by: Nick Dokos <nicholas.dokos <at> hp.com>
---
 lib/ext2fs/rw_bitmaps.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/ext2fs/rw_bitmaps.c b/lib/ext2fs/rw_bitmaps.c
index 5b504bf..da472a8 100644
--- a/lib/ext2fs/rw_bitmaps.c
+++ b/lib/ext2fs/rw_bitmaps.c
 <at>  <at>  -36,8 +36,8  <at>  <at>  static errcode_t write_bitmaps(ext2_filsys fs, int do_inode, int do_block)
 	errcode_t	retval;
 	char 		*block_buf, *inode_buf;
 	int		csum_flag = 0;
-	blk_t		blk;
-	blk_t		blk_itr = fs->super->s_first_data_block;
+	blk64_t		blk;
+	blk64_t		blk_itr = fs->super->s_first_data_block;
 	ext2_ino_t	ino_itr = 1;

 	EXT2_CHECK_MAGIC(fs, EXT2_ET_MAGIC_EXT2FS_FILSYS);
--

-- 
1.6.0.6

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo <at> vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
(Continue reading)

Nick Dokos | 1 May 2009 10:46
Picon
Favicon

[PATCH 3/6] [64-bit] mke2fs 64-bit miscellaneous fixes.

Fix the code that was supposed to zero the last 16 blocks of the
volume: in the case of volumes larger than 2^32 blocks, various
conversions were conspiring to produce a range other than the intended
one.

Fix show_stats() for block numbers >= 2^32: blk_t -> blk64_t and printf
format.

Change int_log{2,10}() to take 64-bit arguments. Also change the int_log10()
implementation in progress.c in the same way.

Signed-off-by: Nick Dokos <nicholas.dokos <at> hp.com>
---
 lib/ext2fs/progress.c |    2 +-
 misc/mke2fs.c         |   14 +++++++-------
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/lib/ext2fs/progress.c b/lib/ext2fs/progress.c
index ff298d6..c8ee70c 100644
--- a/lib/ext2fs/progress.c
+++ b/lib/ext2fs/progress.c
 <at>  <at>  -13,7 +13,7  <at>  <at> 
 #include "ext2fs.h"
 #include "ext2fsP.h"

-static int int_log10(unsigned int arg)
+static int int_log10(unsigned long long int arg)
 {
 	int	l;

(Continue reading)


Gmane