Mitch Harder | 9 Feb 22:38

[PATCH/RFC] Btrfs: Add conditional ENOSPC debugging.

This patch isn't intended for inclusion in the kernel. It is provided
to facilitate ENOSPC debugging in a framework that has no impact on
Btrfs unless explicitly enabled at build time.

Debugging printk statements are wrapped in #ifdef macros
to allow btrfs to be built in its original form unless debugging
is explicitly requested when the kernel is built.

The debugging can be enabled as follows:
KCFLAGS="-DBTRFS_DEBUG_ENOSPC" \
KCPPFLAGS="-DBTRFS_DEBUG_ENOSPC" \
make
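
For readers unfamiliar with the pattern, here is a minimal user-space sketch of conditionally compiled debugging. Only the macro name BTRFS_DEBUG_ENOSPC comes from the description above; the enospc_debug() helper and the reserve_bytes() function are hypothetical, and fprintf() stands in for printk().

```c
#include <errno.h>
#include <stdio.h>

/*
 * Compiled in only when built with -DBTRFS_DEBUG_ENOSPC; otherwise the
 * macro expands to an empty statement and the generated code is unchanged.
 * enospc_debug() is a hypothetical name; fprintf() stands in for printk().
 */
#ifdef BTRFS_DEBUG_ENOSPC
#define enospc_debug(fmt, ...) \
	fprintf(stderr, "btrfs: ENOSPC at %s:%d: " fmt "\n", \
		__func__, __LINE__, ##__VA_ARGS__)
#else
#define enospc_debug(fmt, ...) do { } while (0)
#endif

/* Hypothetical ENOSPC generation point: refuse a reservation that does not fit. */
static int reserve_bytes(unsigned long long free_bytes,
			 unsigned long long wanted)
{
	if (wanted > free_bytes) {
		enospc_debug("wanted %llu, free %llu", wanted, free_bytes);
		return -ENOSPC;
	}
	return 0;
}
```

Because the message is printed where the error is generated, a caller that later recovers from the ENOSPC would still leave a line in the log, which is the kind of false positive the description mentions.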

The patch was constructed by searching the Btrfs code for "ENOSPC",
and inserting printk statements as appropriate if the occurrence
was a generation point for an ENOSPC.

I initially developed this patch to track down where ENOSPC errors were being
generated when using zlib compression.

This patch will occasionally generate some false positives, since ENOSPC
conditions are sometimes corrected before returning an error to the caller.

It is also interesting for highlighting all the places in Btrfs where an
ENOSPC can be generated.

Signed-off-by: Mitch Harder <mitch.harder <at> sabayonlinux.org>
---
 fs/btrfs/delayed-inode.c    |   10 ++++
 fs/btrfs/extent-tree.c      |   84 +++++++++++++++++++++++++++++++++

Norbert Scheibner | 9 Feb 18:42

Freeing space over reboot question

Glück Auf!
I am now using kernel 3.2. The filesystem was originally created under 2.6.39 on one whole hdd and is mounted with
"noatime,nodiratime,inode_cache". I use it for backups: once a day I rsync the whole system to a subvolume,
snapshot it and then delete some tempfiles in the snapshot, which are 90% of the full-backup.
In figures: this 1 TB hdd holds the full-backup with around 600 GiB and 10 to 20 snapshots with around 30
GiB each, all together using around 700 GiB on disk.

What I did:
- I deleted (by accident) the big subvolume with the full-backup with "btrfs subvolume delete" and
recreated it under the same name from a snapshot of the latest snapshot.
- While this big subvolume was being deleted in the background, I changed the kernel from 3.1 to 3.2 and rebooted.
- After that, the fs was operational, but the space was still in use and the next system-backup onto this fs
failed with "no space left" errors. "btrfs filesystem df" showed me that the fs used the whole hdd and that
there were only a few kB free, which matches the errors from rsync during the backup.

So the space used by the subvolume I deleted was not freed.

How do I get back the space that should have been freed?
A balance did not help. What worked was deleting the half-filled subvolume that I use for the full
backup. After that the space was freed, and the next balance run shrank the fs again, so that it uses only
part of the hdd.

What I wonder is: Couldn't this be a little bit more user-friendly?

If there is a background process running like this here, freeing some space, should the umount take as long
as the background process or should the background process stop immediately and restart after the next
mount (if possible, especially with a kernel change in between or the possibility that the fs gets mounted read-only)?

... Or is this all nonsense, and it only happened here because I rebooted and then used another kernel?


Arne Jansen | 9 Feb 15:09

[PATCH] btrfs: don't check DUP chunks twice

Because scrub enumerates the dev extent tree to find the chunks to scrub,
it currently finds each DUP chunk twice and also scrubs it twice. This
patch makes sure that scrub_chunk only checks that part of the chunk the
dev extent has been found for. This only changes the behaviour for DUP
chunks.

Reported-and-tested-by: Stefan Behrens <sbehrens <at> giantdisaster.de>
Signed-off-by: Arne Jansen <sensille <at> gmx.net>
---
 fs/btrfs/scrub.c |    8 +++++---
 1 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 9770cc5..abc0fbf 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -1367,7 +1367,8 @@ out:
 }

 static noinline_for_stack int scrub_chunk(struct scrub_dev *sdev,
-	u64 chunk_tree, u64 chunk_objectid, u64 chunk_offset, u64 length)
+	u64 chunk_tree, u64 chunk_objectid, u64 chunk_offset, u64 length,
+	u64 dev_offset)
 {
 	struct btrfs_mapping_tree *map_tree =
 		&sdev->dev->dev_root->fs_info->mapping_tree;
@@ -1391,7 +1392,8 @@ static noinline_for_stack int scrub_chunk(struct scrub_dev *sdev,
 		goto out;

 	for (i = 0; i < map->num_stripes; ++i) {
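
The idea in the description, checking only the part of the chunk that the dev extent was found for, can be sketched in user space. The struct and function names below are simplified stand-ins, not the kernel definitions; only the dev_offset parameter is taken from the patch. The matching rule (device and physical offset both equal) is an assumption, since the rest of the diff is not shown here.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for one stripe of a chunk mapping (not the kernel struct). */
struct stripe {
	int dev_id;          /* device holding this stripe */
	uint64_t physical;   /* byte offset of the stripe on that device */
};

/*
 * Return the index of the stripe that corresponds to the dev extent at
 * dev_offset on dev_id, or -1 if none matches. A DUP chunk has two stripes
 * on the same device, so comparing the device alone would match both;
 * comparing the physical offset as well selects exactly one.
 */
static int stripe_for_dev_extent(const struct stripe *stripes, size_t n,
				 int dev_id, uint64_t dev_offset)
{
	for (size_t i = 0; i < n; i++) {
		if (stripes[i].dev_id == dev_id &&
		    stripes[i].physical == dev_offset)
			return (int)i;
	}
	return -1;
}
```

With this rule, enumerating the two dev extents of a DUP chunk visits each stripe once instead of visiting both stripes twice.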

Liu Bo | 9 Feb 11:17

[PATCH] Btrfs: fix trim 0 bytes after a device delete

A user reported a bug in btrfs's trim: we trim 0 bytes after a
device delete.

The reproducer:

$ mkfs.btrfs disk1
$ mkfs.btrfs disk2
$ mount disk1 /mnt
$ fstrim -v /mnt
$ btrfs device add disk2 /mnt
$ btrfs device del disk1 /mnt
$ fstrim -v /mnt

This is because after we delete the device, the first block group may start
at a non-zero offset, which confuses trim into discarding nothing.

Reported-by: Lutz Euler <lutz.euler <at> freenet.de>
Signed-off-by: Liu Bo <liubo2009 <at> cn.fujitsu.com>
---
 fs/btrfs/extent-tree.c |    9 ++++++++-
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 77ea23c..b6e2c92 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -7653,9 +7653,16 @@ int btrfs_trim_fs(struct btrfs_root *root, struct fstrim_range *range)
 	u64 start;
 	u64 end;
 	u64 trimmed = 0;
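
The failure mode described above can be illustrated with a small user-space model. This is not the patch's fix (the diff is cut off here); struct block_group and both lookup functions are hypothetical simplifications that only show why a lookup at offset 0 finds nothing once the first block group starts later.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a block group: a byte range [start, start + len). */
struct block_group {
	uint64_t start;
	uint64_t len;
};

/* Strict lookup: index of the block group containing bytenr, or -1. */
static int lookup_containing(const struct block_group *bgs, size_t n,
			     uint64_t bytenr)
{
	for (size_t i = 0; i < n; i++) {
		if (bytenr >= bgs[i].start &&
		    bytenr - bgs[i].start < bgs[i].len)
			return (int)i;
	}
	return -1;
}

/*
 * Tolerant lookup: index of the first block group that ends after bytenr,
 * or -1. Assumes bgs is sorted by start.
 */
static int lookup_first_from(const struct block_group *bgs, size_t n,
			     uint64_t bytenr)
{
	for (size_t i = 0; i < n; i++) {
		if (bgs[i].start + bgs[i].len > bytenr)
			return (int)i;
	}
	return -1;
}
```

If the only block groups start at 1 GiB, a strict lookup at offset 0 returns nothing and a trim loop built on it would discard 0 bytes, while the tolerant lookup still finds the first block group.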

Jeff Liu | 9 Feb 07:25

[PATCH v2] Btrfs: return the internal error unchanged if btrfs_get_extent_fiemap() call failed for SEEK_DATA/SEEK_HOLE inquiry

ENXIO only means "offset beyond EOF" for a SEEK_DATA or SEEK_HOLE inquiry
in a given file range, so if the btrfs_get_extent_fiemap() call fails we should
return the internal error unchanged rather than ENXIO.

Cc: Dave Chinner <david <at> fromorbit.com>
Signed-off-by: Jie Liu <jeff.liu <at> oracle.com>

---
 fs/btrfs/file.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 97fbe93..6d9e796 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1761,7 +1761,7 @@ static int find_desired_extent(struct inode *inode, loff_t *offset, int origin)
 						     start - root->sectorsize,
 						     root->sectorsize, 0);
 		if (IS_ERR(em)) {
-			ret = -ENXIO;
+			ret = PTR_ERR(em);
 			goto out;
 		}
 		last_end = em->start + em->len;
@@ -1773,7 +1773,7 @@ static int find_desired_extent(struct inode *inode, loff_t *offset, int origin)
 	while (1) {
 		em = btrfs_get_extent_fiemap(inode, NULL, 0, start, len, 0);
 		if (IS_ERR(em)) {
-			ret = -ENXIO;
+			ret = PTR_ERR(em);

Liu Bo | 9 Feb 06:40

[PATCH 3/3 v2] xfstests: add btrfs online defragments QA test

As the title says, this ports the btrfs online defragmentation QA tests into xfstests.

v1->v2:
- place the real tests inside testcases.

Signed-off-by: Liu Bo <liubo2009 <at> cn.fujitsu.com>
---
 278      |  247 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 278.args |   18 +++++
 278.out  |   75 +++++++++++++++++++
 group    |    1 +
 4 files changed, 341 insertions(+), 0 deletions(-)
 create mode 100755 278
 create mode 100644 278.args
 create mode 100644 278.out

diff --git a/278 b/278
new file mode 100755
index 0000000..71f12e0
--- /dev/null
+++ b/278
@@ -0,0 +1,247 @@
+#! /bin/bash
+# FS QA Test No. 278
+#
+# Btrfs Online defragmentation tests
+#
+#-----------------------------------------------------------------------
+# Copyright (c) 2012 Fujitsu Liu Bo.  All Rights Reserved.
+#

Jeff Liu | 9 Feb 04:46

[PATCH] Btrfs: return EUCLEAN rather than ENXIO once internal error has occurred for SEEK_DATA/SEEK_HOLE inquiry

According to http://linux.die.net/man/2/lseek, ENXIO should be returned only
for "offset beyond EOF" in a SEEK_DATA or SEEK_HOLE inquiry.
But we also return it for internal problems, when btrfs_get_extent_fiemap() fails
for other reasons.  This confuses user applications that rely on ENXIO when
searching for a data or hole location, because they cannot tell an internal failure from reaching EOF.

Thanks to Dave for pointing that out in the XFS thread.

This patch fixes it by returning EUCLEAN. Or would another errno be more reasonable in Btrfs to
indicate this fatal error?

Thanks,
-Jeff

Cc: david <at> fromorbit.com
Signed-off-by: Jie Liu <jeff.liu <at> oracle.com>

---
 fs/btrfs/file.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 97fbe93..6693040 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1761,7 +1761,7 @@ static int find_desired_extent(struct inode *inode, loff_t *offset, int origin)
 						     start - root->sectorsize,
 						     root->sectorsize, 0);
 		if (IS_ERR(em)) {
-			ret = -ENXIO;
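
To show why the distinction matters to applications, here is a minimal user-space sketch of a caller that treats ENXIO as "no more data" and everything else as a real failure. lseek() with SEEK_DATA is the real interface; the enum and the find_data() wrapper are hypothetical names used only for illustration, and the fallback value for SEEK_DATA is the Linux one.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* exposes SEEK_DATA with glibc */
#endif
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SEEK_DATA
#define SEEK_DATA 3   /* Linux value, used only if the headers did not define it */
#endif

enum data_status { DATA_FOUND, DATA_PAST_EOF, DATA_ERROR };

/*
 * Find the next data region at or after start.
 * On DATA_FOUND, *pos holds its offset. ENXIO is reported as
 * DATA_PAST_EOF; any other errno is a genuine failure.
 */
static enum data_status find_data(int fd, off_t start, off_t *pos)
{
	off_t r = lseek(fd, start, SEEK_DATA);

	if (r != (off_t)-1) {
		*pos = r;
		return DATA_FOUND;
	}
	return errno == ENXIO ? DATA_PAST_EOF : DATA_ERROR;
}
```

A sparse-file copy loop written this way stops cleanly on DATA_PAST_EOF. If the filesystem reported an internal failure as ENXIO, the loop would stop early and silently produce a truncated copy.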

Shoichi Ogawa | 8 Feb 06:18

BUSINESS PROPOSAL

Hello Friend 

I am sorry to encroach on your privacy. I have a business proposal for you. Please contact me if you are
interested to know more.

Best Regards 

Shoichi Ogawa
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo <at> vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Daniel Kuhn | 8 Feb 23:19

BTRFS crash during mount

After a forced power-off, the filesystem of my primary boot
partition can no longer be mounted: btrfs crashes during the mount
process. I'm using OpenSuse 12.1, but I've also tried mounting with a
newer kernel 3.2.2 (systemrescue cd) and with a usb-converter connected
to another PC, without success.

The kernel log seems pretty specific about the crash location, see below.

Best regards,
Daniel Kuhn

[   66.476674] ------------[ cut here ]------------
[   66.476684] kernel BUG at fs/btrfs/free-space-cache.c:1515!
[   66.476691] invalid opcode: 0000 [#1] SMP
[   66.476699] Modules linked in: tpm_tis tpm tpm_bios i2c_nforce2 serio_raw pcspkr floppy k10temp asus_atk0110 raid10 raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx raid1 raid0 multipath linear ata_generic nouveau ttm drm_kms_helper drm i2c_algo_bit firewire_ohci i2c_core pata_acpi mxm_wmi forcedeth pata_marvell firewire_core pata_amd video wmi
[   66.476752]
[   66.476759] Pid: 1844, comm: mount Not tainted 3.2.2-alt250-i586 #2 System manufacturer System Product Name/M3N-HT DELUXE
[   66.476772] EIP: 0060:[<c06f7b6f>] EFLAGS: 00010206 CPU: 2
[   66.476785] EIP is at remove_from_bitmap+0xa8/0x285
[   66.476792] EAX: 6a92c000 EBX: 00000000 ECX: 0005c000 EDX: 00000002
[   66.476799] ESI: f2f5baa8 EDI: f2f5ba8c EBP: f2f5ba48 ESP: f2f5b9ec
[   66.476805]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[   66.476813] Process mount (pid: 1844, ti=f2f5a000 task=f2ff7080 task.ti=f2f5a000)

Martin | 8 Feb 20:24

btrfs support for efficient SSD operation (data blocks alignment)

My understanding is that for x86 architecture systems, btrfs only allows
a sector size of 4kB for a HDD/SSD. That is fine for the present HDDs
assuming the partitions are aligned to a 4kB boundary for that device.

However for SSDs...

I'm using for example a 60GByte SSD that has:

    8kB page size;
    16kB logical to physical mapping chunk size;
    2MB erase block size;
    64MB cache.

And the sector size reported to Linux 3.0 is the default 512 bytes!

My first thought is to try formatting with a sector size of 16kB to
align with the SSD logical mapping chunk size. This is to avoid SSD
write amplification. Also, the data transfer performance for that device
is near maximum for writes with a blocksize of 16kB and above. Yet,
btrfs supports a 4kByte page/sector size only at present...

Is there any control possible over the btrfs filesystem structure to map
metadata and data structures to the underlying device boundaries?

For example to maximise performance, can the data chunks and the data
chunk size be aligned to be sympathetic to the SSD logical mapping chunk
size and the erase block size?

What features other than the trim function does btrfs employ to optimise
for SSD operation?
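
The alignment arithmetic behind these questions can be sketched as follows. This is plain arithmetic, not a btrfs interface; is_aligned() and align_up() are hypothetical helper names, and the sizes quoted above are read as binary units (16 KiB mapping chunk, 2 MiB erase block).

```c
#include <stdbool.h>
#include <stdint.h>

/* True if offset is a multiple of boundary (both in bytes). */
static bool is_aligned(uint64_t offset, uint64_t boundary)
{
	return boundary != 0 && offset % boundary == 0;
}

/* Round offset up to the next multiple of boundary; boundary must be non-zero. */
static uint64_t align_up(uint64_t offset, uint64_t boundary)
{
	uint64_t rem = offset % boundary;

	return rem ? offset + (boundary - rem) : offset;
}
```

As an illustrative case that is not from the post above: a partition starting at sector 63 with 512-byte sectors begins at byte 32256, which is not aligned to 4 KiB. Rounding up to the 16 KiB mapping chunk gives 32768, and rounding up to the 2 MiB erase block gives 2097152, which is also a multiple of 16 KiB.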

Dan Carpenter | 8 Feb 10:18

passing positive numbers to ERR_PTR()

Hi Jan,

Smatch complains when you pass positive numbers to ERR_PTR().  There
is a warning triggered in iref_to_path().

fs/btrfs/backref.c +920 iref_to_path()
   918  
   919          if (ret)
   920                  return ERR_PTR(ret);
                                       ^^^
"ret" can be either a negative error code, zero, or one here.

   921  

I looked at the code, but couldn't tell if it was intentional or not.
It really is pretty unusual to do that, so maybe there should be a
comment or something.

regards,
dan carpenter
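
Why a positive value is a problem can be seen in a user-space model of the error-pointer helpers. The functions below mimic the kernel's ERR_PTR(), PTR_ERR() and IS_ERR() from include/linux/err.h, where the top 4095 addresses are reserved for error codes; they are simplified re-implementations for illustration, not the kernel source.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *ERR_PTR(intptr_t error)
{
	return (void *)error;
}

/* Decode the errno value stored in an error pointer. */
static inline intptr_t PTR_ERR(const void *ptr)
{
	return (intptr_t)ptr;
}

/* Only the top MAX_ERRNO addresses (errno -1 to -4095) count as errors. */
static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}
```

In this model, ERR_PTR(-28) is recognised by IS_ERR() and decodes back to -28, but ERR_PTR(1) is not recognised as an error, so a caller that only checks IS_ERR() would go on to treat the address 0x1 as a valid pointer.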

