Geert Uytterhoeven | 19 Oct 21:34 2014

btrfs extent_state.private compiler warning (Re: Btrfs: cleanup the read failure record after write or when the inode is freeing)

On Sat, Oct 11, 2014 at 2:08 PM, Linux Kernel Mailing List
<linux-kernel <at> vger.kernel.org> wrote:
> Gitweb:     http://git.kernel.org/linus/;a=commit;h=f612496bca664bff6a09a99a9a7506410b6e876e
> Commit:     f612496bca664bff6a09a99a9a7506410b6e876e

>     Btrfs: cleanup the read failure record after write or when the inode is freeing
>
>     After the data is written successfully, we should cleanup the read failure record
>     in that range because
>     - If we set data COW for the file, the range that the failure record pointed to is
>       mapped to a new place, so it is invalid.
>     - If we set no data COW for the file, and if there is no error during writting,
>       the corrupted data is corrected, so the failure record can be removed. And if
>       some errors happen on the mirrors, we also needn't worry about it because the
>       failure record will be recreated if we read the same place again.
>
>     Sometimes, we may fail to correct the data, so the failure records will be left
>     in the tree, we need free them when we free the inode or the memory leak happens.

> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c

> +void btrfs_free_io_failure_record(struct inode *inode, u64 start, u64 end)
> +{
> +       struct extent_io_tree *failure_tree = &BTRFS_I(inode)->io_failure_tree;
> +       struct io_failure_record *failrec;
> +       struct extent_state *state, *next;

> +               failrec = (struct io_failure_record *)state->private;


Zygo Blaxell | 19 Oct 21:25 2014

3.16.3..3.17.1 hang in renameat2()

I've seen a hang in renameat2() from time to time on the last few
stable kernels.  I can reproduce it easily but only on one specific
multi-terabyte filesystem with millions of files.  I've tried to make
a simpler repro setup but so far without success.

Here is what I know so far.  First, the stack trace:

Oct 19 13:59:44 tester7 kernel: [ 4411.832218] INFO: task faster-dupemerg:22368 blocked for more than 240 seconds.
Oct 19 13:59:44 tester7 kernel: [ 4411.832227]       Not tainted 3.17.1-zb64+ #1
Oct 19 13:59:44 tester7 kernel: [ 4411.832229] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 19 13:59:44 tester7 kernel: [ 4411.832231] faster-dupemerg D ffff8803fcc5db20     0 22368  22367 0x00000000
Oct 19 13:59:44 tester7 kernel: [ 4411.832235]  ffff8802570cbb68 0000000000000086 ffff8802fb08e000 0000000000020cc0
Oct 19 13:59:44 tester7 kernel: [ 4411.832238]  ffff8802570cbfd8 0000000000020cc0 ffff8802ff328000 ffff8802fb08e000
Oct 19 13:59:44 tester7 kernel: [ 4411.832242]  ffff8802570cbab8 ffffffffc0343844 ffff8802570cbab8 00000000ffffffef
Oct 19 13:59:44 tester7 kernel: [ 4411.832245] Call Trace:
Oct 19 13:59:44 tester7 kernel: [ 4411.832283]  [<ffffffffc0343844>] ? free_extent_state.part.29+0x34/0xb0 [btrfs]
Oct 19 13:59:44 tester7 kernel: [ 4411.832299]  [<ffffffffc0343d45>] ? free_extent_state+0x25/0x30 [btrfs]
Oct 19 13:59:44 tester7 kernel: [ 4411.832314]  [<ffffffffc034449a>] ? __set_extent_bit+0x3aa/0x4f0 [btrfs]
Oct 19 13:59:44 tester7 kernel: [ 4411.832319]  [<ffffffff817a78d2>] ? _raw_spin_unlock_irqrestore+0x32/0x70
Oct 19 13:59:44 tester7 kernel: [ 4411.832323]  [<ffffffff8109ead1>] ? get_parent_ip+0x11/0x50
Oct 19 13:59:44 tester7 kernel: [ 4411.832326]  [<ffffffff817a3da9>] schedule+0x29/0x70
Oct 19 13:59:44 tester7 kernel: [ 4411.832343]  [<ffffffffc03453f0>] lock_extent_bits+0x1b0/0x200 [btrfs]
Oct 19 13:59:44 tester7 kernel: [ 4411.832346]  [<ffffffff810b4c50>] ? add_wait_queue+0x60/0x60
Oct 19 13:59:44 tester7 kernel: [ 4411.832361]  [<ffffffffc03334b9>] btrfs_evict_inode+0x139/0x550 [btrfs]
Oct 19 13:59:44 tester7 kernel: [ 4411.832368]  [<ffffffff8120d9a8>] evict+0xb8/0x190
Oct 19 13:59:44 tester7 kernel: [ 4411.832370]  [<ffffffff8120e165>] iput+0x105/0x1a0
Oct 19 13:59:44 tester7 kernel: [ 4411.832373]  [<ffffffff81209d48>] __dentry_kill+0x1b8/0x210

john terragon | 19 Oct 17:00 2014

btrfs send on top level subvolumes that contain other subvolumes

Hi.

Let's say I have a top-level subvolume /sub and that inside /sub I
have another subvolume say /sub/X/Y/subsub.

If I make a snapshot (both ro and rw give the same results) of /sub,
say /sub-snap, right now what I get is this

1) the /sub-snap/X/Y/subsub is present (and empty, and that's OK as
snapshots are not recursive) but it seems to be neither
   a) an empty subvolume (because btrfs sub list doesn't list it), nor
   b) a directory, because, for example, lsattr -d subsub gives me this
result: "lsattr: Inappropriate ioctl for device While reading flags on
subsub"

2) if /sub-snap is ro and I send it somewhere, then in the destination
sub-snap subsub is not present at all (which wouldn't be illogical,
given the non-recursive nature of snapshots).

So I'm wondering if all of this is the intended outcome when
snapshotting and sending a subvolume that has internally defined
subvolumes or if perhaps it's a bug.

I'm using kernel 3.17.1 patched for the recent ro snapshot corruption
bug and btrfs-progs from the 3.17.x branch in git.

Thanks
John

Liu Bo | 19 Oct 14:43 2014

[PATCH] fstests: fix memory corruption in aio-last-ref-held-by-io

From: Liu Bo <liub.liubo <at> gmail.com>

This was detected by testing generic/323 on btrfs, which keeps producing a flood
of checksum errors.

It is because aio-last-ref-held-by-io uses a static buffer that is reused
for every io_submit() call, but we issue NUM_IOS (=16) io_submit() calls
in a 'for' loop at a time. When the data read by one aio has not finished its
endio(), its memory is likely to be reused by the next io_submit(), which ends up
as data corruption and numerous checksum errors.

This allocates memory for each io_submit(), and generic/323 runs fine after this.

Signed-off-by: Liu Bo <bo.li.liu <at> oracle.com>
---
 src/aio-dio-regress/aio-last-ref-held-by-io.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/aio-dio-regress/aio-last-ref-held-by-io.c b/src/aio-dio-regress/aio-last-ref-held-by-io.c
index a73dc3b..7633831 100644
--- a/src/aio-dio-regress/aio-last-ref-held-by-io.c
+++ b/src/aio-dio-regress/aio-last-ref-held-by-io.c
@@ -109,7 +109,7 @@ aio_test_thread(void *data)
 	ioctx_initted = 0;
 	ios_submitted = 0;

-	ret = posix_memalign((void **)&buffer, getpagesize(), IOSIZE);
+	ret = posix_memalign((void **)&buffer, getpagesize(), IOSIZE * NUM_IOS);
 	if (ret != 0) {
 		printf("%lu: Failed to allocate buffer for IO: %d\n",
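The buffer reuse described in the patch can be illustrated with a small sketch. It assumes that each in-flight request is meant to use its own slice of the enlarged IOSIZE * NUM_IOS allocation; the rest of the patch is not shown here, so that indexing is an assumption. alloc_io_buffers() and io_slice() are hypothetical helper names, not functions from aio-last-ref-held-by-io.c, and sysconf(_SC_PAGESIZE) stands in for the getpagesize() call used by the test.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the test program): one page-aligned
 * allocation large enough for nr_ios requests of io_size bytes each.
 * Returns NULL on failure; the caller releases the memory with free().
 */
static char *alloc_io_buffers(size_t io_size, size_t nr_ios)
{
	void *base = NULL;
	long page = sysconf(_SC_PAGESIZE);

	if (page <= 0 || io_size == 0 || nr_ios == 0)
		return NULL;
	if (io_size > (size_t)-1 / nr_ios)	/* io_size * nr_ios would overflow */
		return NULL;
	if (posix_memalign(&base, (size_t)page, io_size * nr_ios) != 0)
		return NULL;
	return base;
}

/*
 * Hypothetical helper: start of the slice owned by request number idx.
 * Giving every in-flight request a distinct slice means a later submission
 * cannot overwrite memory that an earlier read has not finished with.
 */
static char *io_slice(char *base, size_t io_size, size_t idx)
{
	assert(base != NULL);
	return base + idx * io_size;
}
```

With helpers like these, request i of a batch would read into io_slice(base, IOSIZE, i) instead of the start of the shared buffer. The slices stay page-aligned only when io_size is a multiple of the page size.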

Russell Coker | 19 Oct 01:41 2014

Re: strange 3.16.3 problem

On Sun, 19 Oct 2014, Robert White <rwhite <at> pobox.com> wrote:
> On 10/17/2014 08:54 PM, Russell Coker wrote:
> > # find . -name "*546"
> > ./1412233213.M638209P10546
> > # ls -l ./1412233213.M638209P10546
> > ls: cannot access ./1412233213.M638209P10546: No such file or directory
> > 
> > Any suggestions?
> 
> Does "ls -l *546" show the file to exist? e.g. what happens if you use
> the exact same wildcard in the ls command as you used in the find?

# ls -l *546 
ls: cannot access 1412233213.M638209P10546: No such file or directory

That gives the same result as find: the shell matches the file name, but then
ls can't view it.

lstat64("1412233213.M638209P10546", 0x9fab0c8) = -1 ENOENT (No such file or directory)

From strace, the lstat64 system call fails.

> It is possible (and back in the day it was quite common) for files to be
> created with non-renderable nonsense in the name. for instance if the
> first four characters of the name were "13^H4" (where ^H is the single
> backspace character) the file would look like it was named 14* but it
> would be listed by ls using "13*". If the file name is "damaged", which
> is usually a failing in the program that created the file, then it can
> be "hidden in plain sight".
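One way to check for the kind of "hidden" name described in the quoted paragraph is to print each directory entry with its non-printable bytes escaped. The sketch below is only an illustration: escape_name() is a hypothetical helper, and it assumes the C locale, so every byte outside printable ASCII (and the backslash itself) is shown as \xNN.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: copy name into out, replacing every non-printable
 * byte (and the backslash itself, to keep the result unambiguous) with a
 * \xNN escape. The output is truncated if it does not fit.
 * Returns the number of characters written, not counting the NUL.
 */
static size_t escape_name(const char *name, char *out, size_t out_size)
{
	size_t used = 0;

	assert(name != NULL && out != NULL);
	if (out_size == 0)
		return 0;

	for (const unsigned char *p = (const unsigned char *)name; *p; p++) {
		char piece[5];	/* "\xNN" plus the terminating NUL */
		int n;

		if (isprint(*p) && *p != '\\')
			n = snprintf(piece, sizeof(piece), "%c", *p);
		else
			n = snprintf(piece, sizeof(piece), "\\x%02x", *p);

		if (used + (size_t)n >= out_size)	/* keep room for the NUL */
			break;
		memcpy(out + used, piece, (size_t)n);
		used += (size_t)n;
	}
	out[used] = '\0';
	return used;
}
```

Feeding it the d_name of each entry returned by readdir() would make a name such as "13^H4" show up with a visible \x08 in the middle, while an ordinary Maildir name passes through unchanged.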

Jérôme Carretero | 19 Oct 00:59 2014

3.17.1 kernel BUG at /usr/src/linux/fs/btrfs/relocation.c:2736!

Hi Yan,

I was attempting to migrate a JBOD drive set to RAID1,
so I did a balance filter, which failed because free space was missing,
then I attempted to remove a device from the set (because I also wanted
to do that more urgently than the RAID1 migration), and got:

[ 6973.725608] kernel BUG at /usr/src/linux/fs/btrfs/relocation.c:2736!
[ 6973.725645] invalid opcode: 0000 [#1] PREEMPT SMP 
[ 6973.725685] Modules linked in: af_packet cfg80211 rfkill coretemp hwmon kvm_intel kvm mousedev hid_generic ipmi_devintf iTCO_wdt iTCO_vendor_support snd_pcm snd_timer usbhid dcdbas microcode 8250 psmouse snd soundcore pcspkr hid ipmi_si tpm_tis tpm ipmi_msghandler serial_core rtc_cmos hed evdev processor lpc_ich mfd_core ipv6 autofs4 unix
[ 6973.725955] CPU: 24 PID: 62454 Comm: btrfs Not tainted 3.17.1-ForYou #6
[ 6973.725992] Hardware name: Dell Inc. PowerEdge R910/0P658H, BIOS 1.2.0 06/22/2010
[ 6973.726050] task: ffff8803d4bdc5c0 ti: ffff8805270b0000 task.ti: ffff8805270b0000
[ 6973.726107] RIP: 0010:[<ffffffff8121d1a2>]  [<ffffffff8121d1a2>] do_relocation+0x3b7/0x3cb
[ 6973.726178] RSP: 0000:ffff8805270b3a40  EFLAGS: 00010246
[ 6973.726214] RAX: 0000000000000000 RBX: ffff8801c8165f00 RCX: 0000007d40000000
[ 6973.726254] RDX: 0000000040000000 RSI: 0000000000000000 RDI: 00000000ffffffff
[ 6973.726293] RBP: ffff8801c8165100 R08: 0000000000004000 R09: 0000000000200000
[ 6973.726332] R10: 0000000000001000 R11: 0000160000000000 R12: ffff8802673dd830
[ 6973.726370] R13: ffff8801066a4000 R14: ffff8808754b3800 R15: 00000000ffffffe4
[ 6973.726409] FS:  00007fa38c4bb880(0000) GS:ffff880277cc0000(0000) knlGS:0000000000000000
[ 6973.726472] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 6973.726508] CR2: 0000000003228bd0 CR3: 000000049d5c8000 CR4: 00000000000007e0
[ 6973.726546] Stack:
[ 6973.726572]  ffff8801a6dc9c20 ffff8806763671c0 ffff880676367ca0 0000000176367ca0
[ 6973.726653]  ffff8801c8165140 0000007d6e998000 000000bf00000000 ffff880019ccbc20
[ 6973.726724]  ffff880676367940 ffff8801c8666580 ffff880827fc4c38 ffffffff814cad3e

Wang Shilong | 18 Oct 17:52 2014

Re: Poll: time to switch skinny-metadata on by default?

Hello Josef,

With skinny metadata enabled and running your btrfs-next repo's for-suse branch
(which has the extent ref patch), I hit the following problem:

[  250.679705] BTRFS info (device sdb): relocating block group 35597058048 flags 36
[  250.728815] BTRFS info (device sdb): relocating block group 35462840320 flags 36
[  253.562133] Dropping a ref for a root that doesn't have a ref on the block
[  253.562475] Dumping block entry [34793177088 8192], num_refs 3, metadata 0
[  253.562795]   Ref root 0, parent 35532013568, owner 23988, offset 0, num_refs 18446744073709551615
[  253.563126]   Ref root 0, parent 35560964096, owner 23988, offset 0, num_refs 1
[  253.563505]   Ref root 0, parent 35654615040, owner 23988, offset 0, num_refs 1
[  253.563837]   Ref root 0, parent 35678650368, owner 23988, offset 0, num_refs 1
[  253.564162]   Root entry 5, num_refs 1
[  253.564520]   Root entry 18446744073709551608, num_refs 18446744073709551615
[  253.564860]   Ref action 4, root 5, ref_root 5, parent 0, owner 23988, offset 0, num_refs 1
[  253.565205]    [<ffffffffa049d2f1>] process_leaf.isra.6+0x281/0x3e0 [btrfs]
[  253.565225]    [<ffffffffa049de83>] build_ref_tree_for_root+0x433/0x460 [btrfs]
[  253.565234]    [<ffffffffa049e1af>] btrfs_build_ref_tree+0x18f/0x1c0 [btrfs]
[  253.565241]    [<ffffffffa0419ce8>] open_ctree+0x18b8/0x21a0 [btrfs]
[  253.565247]    [<ffffffffa03ecb0e>] btrfs_mount+0x62e/0x8b0 [btrfs]
[  253.565251]    [<ffffffff812324e9>] mount_fs+0x39/0x1b0
[  253.565255]    [<ffffffff8125285b>] vfs_kern_mount+0x6b/0x150
[  253.565257]    [<ffffffff8125565b>] do_mount+0x27b/0xc30
[  253.565259]    [<ffffffff81256356>] SyS_mount+0x96/0xf0
[  253.565260]    [<ffffffff81795429>] system_call_fastpath+0x16/0x1b
[  253.565263]    [<ffffffffffffffff>] 0xffffffffffffffff
[  253.565272]   Ref action 1, root 18446744073709551608, ref_root 0, parent 35654615040, owner 23988, offset 0, num_refs 1
[  253.565681]    [<ffffffffa049d564>] btrfs_ref_tree_mod+0x114/0x570 [btrfs]

Paul Jones | 18 Oct 14:00 2014

3.17.1 blocked task

Hi All,

Just found this stack trace in dmesg while running a scrub on one of my file systems. I haven’t seen this
reported yet so I thought I should report it ☺
All filesystems are raid1.

vm-server ~ # btrfs fi sh
Label: 'Root'  uuid: 58d27dbd-7c1e-4ef7-8d43-e93df1537b08
        Total devices 2 FS bytes used 21.29GiB
        devid    3 size 40.00GiB used 40.00GiB path /dev/sde3
        devid    4 size 40.00GiB used 40.00GiB path /dev/sdd3

Label: 'storage'  uuid: df3d4a9c-ed6c-4867-8991-a018276f6f3c
        Total devices 5 FS bytes used 2.24TiB
        devid    6 size 2.69TiB used 2.10TiB path /dev/sda1
        devid    7 size 901.92GiB used 422.00GiB path /dev/sdf1
        devid    8 size 892.25GiB used 410.00GiB path /dev/sdg1
        devid    9 size 892.25GiB used 412.00GiB path /dev/sdh1
        devid   10 size 2.73TiB used 2.08TiB path /dev/sdb2

Label: 'Fast'  uuid: 9baf63f7-a9d6-456c-8fdd-1a8fdb21958f
        Total devices 2 FS bytes used 352.54GiB
        devid    2 size 407.12GiB used 407.12GiB path /dev/sde1
        devid    3 size 407.12GiB used 407.12GiB path /dev/sdd1

Linux vm-server 3.17.1-gentoo-r1 #1 SMP PREEMPT Sat Oct 18 16:53:06 AEDT 2014 x86_64 Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz GenuineIntel GNU/Linux

[ 4372.838272] BTRFS: checksum error at logical 14375409094656 on dev /dev/sdb2, sector 1565551496, root 5, inode 3082523, offset 6035542016, length 4096, links 1 (path: shared/backup/Normal/Paul-PC_2014_07_06_13_00_27_323D24.TIB)

Russell Coker | 18 Oct 12:29 2014

Re: strange 3.16.3 problem

On Sat, 18 Oct 2014, "Michael Johnson - MJ" <mj <at> revmj.com> wrote:
> The NFS client is part of the kernel iirc, so it should be 64 bit.  This
> would allow the creation of files larger than 4gb and create possible
> issues with a 32 bit user space utility.

A correctly written 32bit application will handle files >4G in size.

While some applications may have problems, I'm fairly sure that ls will be ok.

# dd if=/dev/zero of=/tmp/test bs=1024k count=1 seek=5000
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00383089 s, 274 MB/s
# /bin/ls -lh /tmp/test
-rw-r--r--. 1 root root 4.9G Oct 18 20:47 /tmp/test
# file /bin/ls
/bin/ls: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.26, BuildID[sha1]=0xd3280633faaabf56a14a26693d2f810a32222e51, stripped

A quick test shows that a 32bit ls can handle this.
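For reference, this is roughly what "correctly written" means for a 32-bit program on glibc: building with _FILE_OFFSET_BITS set to 64 makes off_t (and st_size) 64 bits wide, so stat() can report sizes above 4G. Without it, stat() on such a file fails with EOVERFLOW. The sketch below is an illustration under that assumption, and file_size() is a hypothetical helper name.

```c
/* Must come before any system header so that off_t becomes 64-bit
 * even in a 32-bit build. */
#define _FILE_OFFSET_BITS 64
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: store the size of path in *size_out.
 * Returns 0 on success and -1 if stat() fails (errno is left as set
 * by stat(), e.g. ENOENT for a missing file).
 */
static int file_size(const char *path, long long *size_out)
{
	struct stat st;

	assert(path != NULL && size_out != NULL);
	if (stat(path, &st) != 0)
		return -1;
	*size_out = (long long)st.st_size;
	return 0;
}
```

Called on the sparse /tmp/test file created above, a helper like this would be expected to report the full apparent size whether the binary is 32-bit or 64-bit.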

> I would mount from a client with 64 bit user space and see if the problem
> occurs there.  If so, it is probably not a btrfs issue (if I am
> understanding your environment correctly).

I'll try that later.

--

-- 
My Main Blog         http://etbe.coker.com.au/

Russell Coker | 18 Oct 05:54 2014

strange 3.16.3 problem

I have a system running the Debian 3.16.3-2 AMD64 kernel for the Xen Dom0 and 
the DomUs.

The Dom0 has a pair of 500G SATA disks in a BTRFS RAID-1 array.  The RAID-1 
array has some subvols exported by NFS as well as a subvol for the disk images 
for the DomUs - I am not using NoCOW as performance is fine without it and I 
like having checksums on everything.

I have started having some problems with a mail server that is running in a 
DomU.  The mail server has 32bit user-space because it was copied from a 32bit 
system and I had no reason to upgrade it to 64bit, but it's running a 64bit 
kernel so I don't think that 32bit user-space is related to my problem.

# find . -name "*546"
./1412233213.M638209P10546
# ls -l ./1412233213.M638209P10546
ls: cannot access ./1412233213.M638209P10546: No such file or directory

Above is the problem: find says that the file in question exists but ls 
doesn't think so.  The file in question is part of a Maildir spool that's NFS 
mounted.  This problem persisted across a reboot of the DomU, so it's a 
problem with the Dom0 (the NFS server).

The dmesg output on the Dom0 doesn't appear to have anything relevant, and a 
find command on the Dom0 doesn't find the file.  I don't know if this is an NFS problem or 
a BTRFS problem.  I haven't rebooted the Dom0 yet because a remote reboot of a 
server running a kernel from Debian/Unstable is something I try to avoid.

Any suggestions?


Vincent. | 18 Oct 03:47 2014

raid10 drive replacement

Hi !

I have a faulty drive in my raid10 and want to replace it.
The working drives are xvd[bef] and the replacement drive is xvdc.

When I mount my drive in RW:
#mount -odegraded /dev/xvdb /tank
#dmesg -c
[ 6207.294513] btrfs: device fsid 728ef4d8-928c-435c-b707-f71c459e1520 devid 1 transid 551398 /dev/xvdb
[ 6207.327357] btrfs: allowing degraded mounts
[ 6207.477041] btrfs: bdev (null) errs: wr 15211054, rd 3038899, flush 0, corrupt 0, gen 0
[ 6219.703606] Btrfs: too many missing devices, writeable mount is not allowed
[ 6219.785929] btrfs: open_ctree failed

When I mount my drive in RO:
#mount -odegraded,ro /dev/xvdb /tank
#btrfs filesystem show
Label: none  uuid: 728ef4d8-928c-435c-b707-f71c459e1520
Total devices 4 FS bytes used 4.70TiB
devid    1 size 2.73TiB used 2.73TiB path /dev/xvdb
devid    2 size 2.73TiB used 2.73TiB path
devid    3 size 2.73TiB used 2.73TiB path /dev/xvde
devid    4 size 2.73TiB used 2.73TiB path /dev/xvdf

Btrfs v3.12

Of course, because my mount is RO, I can't add a device and do a balance:
#btrfs device add /dev/xvdc /tank