Theodore Ts'o | 19 Apr 17:48 2014

Re: Many orphaned inodes after resize2fs

On Sat, Apr 19, 2014 at 05:42:12PM +0200, Patrik Horník wrote:
> 
> Please confirm that this is a fully correct solution (for my purpose, not an
> elegant, clean way for an official fix) and that it has no negative
> consequences. It seems that way, but I did not analyze all the code paths the
> fixed code is in.

Yes, that's a fine solution.  What I'll probably do is disable the
check if s_inodes_count is greater than s_mkfs_time minus some fudge
value, or if the broken system clock boolean is set.

> BTW, were there any other negative consequences of this bug in e2fsck, except
> changing the i_dtime of inodes to the current time?

Nope, that would be the only consequence --- if you don't count the system
administrator's anxiety that was induced by the false positive!

Thanks for pointing out this problem.  I'll make sure it gets fixed in
the next maintenance release of e2fsprogs.

					- Ted
Patrik Horník | 18 Apr 18:56 2014

Many orphaned inodes after resize2fs

Hello,

yesterday I experienced the following problem with my ext3 filesystem:

- I had an ext3 filesystem of a few TB in size, with a journal. I correctly unmounted it and it was marked clean.

- I then ran fsck.ext3 -f on it and it did not find any problems.

- After increasing the size of its LVM volume by 1.5 TB, I resized the filesystem with resize2fs lvm_volume, and it finished without problems.

- But fsck.ext3 -f immediately after that showed "Inodes that were part of a corrupted orphan linked list found." and many thousands of "Inode XXX was part of the orphaned inode list." I did not accept the fix. According to debugfs, all the inodes I checked from the reported orphaned inodes (I only checked some from the beginning of the list of errors) have size 0 (see the example debugfs command further below).

- When I mount the fs read-only, the data I was able to check seems OK. (But I am unable to check everything.)

- I created an LVM snapshot and repaired the fs on it with fsck.ext3. After that there were no files in lost+found. Does that mean that all of those orphaned inodes have size 0? Or in which cases does fsck not create files in lost+found?

- I am checking the data against various backups, but I will not be able to check everything, and some of the less important data has no backup. So I would like to know what state the fs is in and what the best next steps are.

- Right now I am planning to use the current LVM snapshot as a test run and discard it after the data check (roughly sketched further below). The original fs is in the state just after resize2fs; fsck was run on it after that, but I did not accept any fix and cancelled the check. I then plan to create a backup snapshot, fsck the original fs / LVM volume, check once again against backups and go with it. But this will not tell me the status of all my data and of the fs, or whether it is safe to keep using it. Another problem is that all these operations take long hours.

- I also have some specific technical questions. An orphan inode is a valid inode not found in any directory, right? What exactly is a CORRUPTED orphan linked list? What can cause such a problem? Is it a known problem? How can orphaned inodes and a corrupted orphan linked list be created by resize2fs, and why were they not detected by fsck.ext3 before that? Can it be serious, and can it be a symptom of some data loss? Can fixing it with fsck.ext3 corrupt other data which is OK now, when I mount the fs read-only?

- The platform used was the latest stable Debian with kernel linux-image-3.2.0-4-amd64 version 3.2.46-1+deb7u1 and e2fsprogs 1.42.5-1.1. After the incident I started using linux-image-3.13-1-amd64 version 3.13.7-1 (from the point of the snapshot's creation and of running fsck for real on the snapshot), and I am thinking about moving to e2fsprogs 1.42.9 built from source.
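
For concreteness, the inode check and the planned snapshot trial run look roughly like this (volume group, logical volume and inode numbers are only placeholders):

debugfs -R 'stat <1234567>' /dev/vg0/lvm_volume    # prints the inode's size, link count and dtime

lvcreate --snapshot --size 200G --name fscheck_snap /dev/vg0/lvm_volume
fsck.ext3 -f /dev/vg0/fscheck_snap                 # trial repair on the snapshot only
# ... verify the repaired snapshot against backups, then ...
lvremove /dev/vg0/fscheck_snap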

Thank you very much.

Patrik
_______________________________________________
Ext3-users mailing list
Ext3-users <at> redhat.com
https://www.redhat.com/mailman/listinfo/ext3-users
Martin T | 6 Mar 21:46 2014

questions regarding file-system optimization for a software-RAID array

Hi,

I created a RAID1 array of two physical HDDs with a chunk size of 64 KiB under Debian "wheezy" using mdadm. As a next step, I would like to create an ext3 (or ext4) file system on this RAID1 array using the mke2fs utility. According to RAID-related tutorials, I should create the file system like this:

# mkfs.ext3 -v -L myarray -m 0.5 -b 4096 -E stride=16,stripe-width=32 /dev/md0


Questions:

1) According to the mke2fs manual, the value of "stride" has to be the RAID chunk size in clusters. As I use a chunk size of 64 KiB, I have to use a "stride" value of 16 (16*4096=65536). Why is it important for the file system to know the chunk size used in the RAID array? I know it improves I/O performance, but why is that so?

2) If the "stride" size in my case is 16, then "stripe_width=" is 32, because there are two drives in the array which contain the actual data. The mke2fs manual page explains this option as "This allows the block allocator to prevent read-modify-write of the parity in a RAID stripe if possible when the data is written." How should I understand this? What is this "read-modify-write" behavior? Could somebody explain it with an example?
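
For what it's worth, here is the arithmetic behind those two values, following the reasoning above (the RAID5 figures are only there to illustrate the read-modify-write case, since plain RAID1 has no parity):

  stride       = chunk size / block size      = 64 KiB / 4 KiB = 16
  stripe-width = stride * data-bearing disks  = 16 * 2         = 32  (two-disk RAID1)

  On a 4-disk RAID5 (3 data + 1 parity) it would be 16 * 3 = 48. There, a write
  smaller than the 3 * 64 KiB data portion of a stripe forces the RAID layer to
  read the old data and parity chunks, recompute the parity and write both back
  ("read-modify-write"); a full, stripe-aligned write can skip the read step.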


regards,
Martin
_______________________________________________
Ext3-users mailing list
Ext3-users <at> redhat.com
https://www.redhat.com/mailman/listinfo/ext3-users
VYSHAKH KRISHNAN CH | 29 Jan 05:41 2014

Invitation to connect on LinkedIn

From VYSHAKH KRISHNAN CH
Software Engineer at Ericsson
Bengaluru Area, India

I'd like to add you to my professional network on LinkedIn.

- VYSHAKH

_______________________________________________
Ext3-users mailing list
Ext3-users <at> redhat.com
https://www.redhat.com/mailman/listinfo/ext3-users
Lakshmipathi.G | 18 Jan 13:13 2014

File System corruption tool

Hi - 

I'm searching for a file system corruption tool, i.e. one that injects disk errors such as
multiply-owned blocks. Later, an integrity scan process (like e2fsck) will
verify the on-disk layout and fix these errors.

I'd like to read/understand such tools before writing one for a proprietary
on-disk file system.

Do we have such tools for ext{2,3,4}fs? Thanks for any help or pointers!
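
To make the idea concrete, roughly this kind of damage can already be injected by hand on a
scratch image with debugfs in write mode (block and inode numbers below are arbitrary); a real
tool would just automate and randomize it:

dd if=/dev/zero of=/tmp/test.img bs=1M count=128
mke2fs -F -q -b 4096 /tmp/test.img
debugfs -w -R 'freeb 100 16' /tmp/test.img   # mark some in-use metadata blocks as free in the bitmap
debugfs -w -R 'clri <11>' /tmp/test.img      # wipe the lost+found inode, leaving a dangling dir entry
e2fsck -fn /tmp/test.img                     # report-only run to see what gets flagged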

-- 
----
Cheers,
Lakshmipathi.G
FOSS Programmer.
www.giis.co.in
_______________________________________________
Ext3-users mailing list
Ext3-users <at> redhat.com
https://www.redhat.com/mailman/listinfo/ext3-users
Ken Bass | 17 Jan 17:32 2014

Very long delay for first write to big filesystem

I asked about this a while back. It seems that this problem is getting much worse.

The problem/issue: there is a very long delay when my system does a write to the filesystem. The delay is now over 5 minutes (yes: minutes). This only happens on the first write after booting up the system, and only for large files - 1GB or more. This can be a serious problem, since all access to any hard disk is blocked and will hang until that first write finally begins.

The prevailing thought at the time was that this was associated with loading into memory the directory/allocation information used when looking for free space, which I am now inclined to believe.

The filesystem in question is 7.5TB, with about 4TB used. There are over 250,000 files. I also have another system with 1TB total and 400GB used, with 65,000 files. This system, the smaller one, is beginning to show delays as well, although only a few seconds.

This problem seems to involve several factors: the total size of the system; the current "fragmentation" of that system; and finally the amount of physical memory available.

As to the last factor, the 7.5TB system has only 2GB of memory (I didn't think it would need a lot, since it is mostly being used as a file server). The "fragmentation" factor (I am only guessing here) comes from having many files written and deleted over time.

So my questions are: is there a solution or workaround for this? And is this a bug, or perhaps an undesirable feature? If the latter, should it be reported (somewhere)?
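
In case it helps, this is roughly how the delay can be reproduced and watched (mount point and sizes are only placeholders):

iostat -x 5 > /tmp/first-write-io.log &                       # watch the burst of reads during the stall
dd if=/dev/zero of=/bigfs/first-write-test bs=1M count=2048 conv=fsync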

Any suggestions, tips, etc. greatly appreciated.

TIA

ken

_______________________________________________
Ext3-users mailing list
Ext3-users <at> redhat.com
https://www.redhat.com/mailman/listinfo/ext3-users
Lars Noschinski | 16 Dec 09:18 2013

Recover files from a broken ext3 partition


[Please CC me on answers, I'm not subscribed]

Hi everyone,

I have got a hard disk which was damaged by a fall, and I would like to
recover a few files from it. (There is a backup for most of the
data, but a handful of recent files are missing. These are important
enough to spend some time on, but not enough to pay a professional
data recovery service.)

Using GNU ddrescue I was able to read 99.8% of the partition, so
there's hope the data is still there. Unfortunately, some key parts of
the file system seem to be damaged, so e2fsck fails:

------------------------------
% ddrescuelog -l- -b4096 sdd5.ddrescue.log > badblocks.sdd5.4096
% e2fsck -b 20480000 -v -f -L badblocks.sdd5.4096 sdd5
[...]
Pass 1: Checking inodes, blocks, and sizes
Block 1 in the primary group descriptors is on the bad block list

If the block is really bad, the filesystem can not be fixed.
You can remove this block from the bad block list and hope
that the block is really OK.  But there are no guarantees.
------------------------------
[at 20480000 there seems to be an intact superblock; got the number
(and the block size) from 'mke2fs -n']

The files I am interested in are located under /home/$USER/Desktop and
/home/$USER/Dokumente; so I tried accessing them with debugfs.
Unfortunately, /home/$USER seems to be corrupted:

------------------------------
% LESS=FSRX debugfs -s 20480000 -b 4096 sdd5
debugfs 1.42.8 (20-Jun-2013)
debugfs:  cd /home/$USER
debugfs:  ls

EXT2 directory corrupted
------------------------------

So, any hints on how to proceed? Is there a way to access the
Desktop and Dokumente subdirectories (provided they are themselves
undamaged)?

Also, I interrupted ddrescue for my access attempt (because it takes a
really long time to get all the still-good sectors from a 200GB partition).
Can debugfs show me where on the disk /home/$USER is located? That
would allow me to instruct ddrescue to concentrate on those parts.
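
From the debugfs man page, 'blocks' (block numbers used by the directory itself), 'stat' (inode
details of a file, including its block list) and 'rdump' (copy a whole subtree out) look like
they might do this, provided path lookup gets past the corrupted directory; somefile and
/tmp/out are placeholders:

% debugfs -s 20480000 -b 4096 sdd5
debugfs:  blocks /home/$USER/Desktop
debugfs:  stat /home/$USER/Desktop/somefile
debugfs:  rdump /home/$USER/Dokumente /tmp/out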

  Best regards, Lars
Lakshmipathi.G | 27 Dec 20:02 2013

expirer - a tool to delete files when they expire

Hi - 
Though I'm not sure whether anyone will be interested in such a tool (other than a possible SO user [1]), here is a small tool to delete files at a specific time in the future.




--
----
Cheers,
Lakshmipathi.G
FOSS Programmer.
www.giis.co.in
_______________________________________________
Ext3-users mailing list
Ext3-users <at> redhat.com
https://www.redhat.com/mailman/listinfo/ext3-users
fsluck | 25 Dec 02:32 2013

how to know ext cache hit rate?

How can I find out the ext cache hit rate?
 
thanks


_______________________________________________
Ext3-users mailing list
Ext3-users <at> redhat.com
https://www.redhat.com/mailman/listinfo/ext3-users
Nicolas Michel | 16 Sep 12:16 2013

Numbers behind "df" and "tune2fs"

Hello guys,

I have some difficulty understanding what the numbers behind "df" and
tune2fs really are. You'll find the output of tune2fs and df
below, on which my maths are based.

Here are my maths:

tune2fs on the ext3 FS tells me the FS size is 3284992 blocks. It
also tells me that the size of one block is 4096 (bytes, if I'm not
wrong?). So my maths tell me that the disk is 3284992 * 4096 =
13455327232 bytes, or 13455327232 / 1024 / 1024 / 1024 = 12.53 GB.

df --block-size=1 on the same FS tells me the disk is 13243846656 bytes,
which is 211480576 bytes smaller than what tune2fs reports.

In gigabytes, it means:
* for df, the disk is 12.33 GB
* for tune2fs, the disk is 12.53 GB

I thought that maybe df only takes into account the blocks really
available to users. So I tried to remove the reserved blocks and the
GDT blocks:
(3284992 - 164249 - 801) * 4096 = 12779282432
or in GB: 12779282432 / 1024 / 1024 / 1024 = 11.90 GB ...

My last thought was that the "Reserved block count" in tune2fs was not only
the blocks reserved for root (which is 5% by default on my system) but also
took into account all other blocks reserved for the fs's internal usage.
So:
(3284992 - 164249) * 4096 = 12782563328
In GB: 11.90 GB (the difference is not significant at a precision of two decimals).

So I'm lost ...

Does someone have an explanation? I would really, really be grateful.
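
The closest I can get to the 211480576-byte figure is the rough accounting below (assuming df
subtracts only the static metadata -- block and inode bitmaps, inode tables, superblock and
descriptor copies -- and counts the journal and the reserved GDT blocks as used space), but I
cannot tell whether the reasoning is right:

groups=$(( (3284992 + 32767) / 32768 ))        # 101 block groups
itable=$(( groups * 509 ))                     # inode table blocks        = 51409
bitmaps=$(( groups * 2 ))                      # block + inode bitmaps     =   202
copies=10                                      # groups 0,1,3,5,7,9,25,27,49,81 (sparse_super)
overhead=$(( itable + bitmaps + copies * 2 ))  # + 1 descriptor block each = 51631
echo $(( overhead * 4096 ))                    # 211480576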
Nicolas

---------------------------------------

Here is the output of tune2fs and df:

$ tune2fs -l /dev/mapper/datavg-datalogslv
tune2fs 1.41.9 (22-Aug-2009)
Filesystem volume name:   <none>
Last mounted on:          <not available>
Filesystem UUID:          4e5bea3e-3e61-4fc8-9676-e5177522911c
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index
filetype needs_recovery sparse_super large_file
Filesystem flags:         unsigned_directory_hash
Default mount options:    (none)
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              822544
Block count:              3284992
Reserved block count:     164249
Free blocks:              3109325
Free inodes:              822348
First block:              0
Block size:               4096
Fragment size:            4096
Reserved GDT blocks:      801
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8144
Inode blocks per group:   509
Filesystem created:       Wed Aug 28 08:30:10 2013
Last mount time:          Wed Sep 11 17:16:56 2013
Last write time:          Thu Sep 12 09:38:02 2013
Mount count:              18
Maximum mount count:      27
Last checked:             Wed Aug 28 08:30:10 2013
Check interval:           15552000 (6 months)
Next check after:         Mon Feb 24 07:30:10 2014
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:              256
Required extra isize:     28
Desired extra isize:      28
Journal inode:            8
Default directory hash:   half_md4
Directory Hash Seed:      ad2251a9-ac33-4e5e-b933-af49cb4f2bb3
Journal backup:           inode blocks

$ df --block-size=1 /dev/mapper/datavg-datalogslv
Filesystem                      1B-blocks      Used   Available Use% Mounted on
/dev/mapper/datavg-datalogslv 13243846656 563843072 12007239680   5% /logs

--

-- 
Nicolas MICHEL
Richards, Paul Franklin | 30 Aug 03:48 2013

Strange fsck.ext3 behavior - infinite loop

Greetings! Need your help fellow penguins!

Strange behavior with fsck.ext3: how to remove a long orphaned inode list?

After copying data over from one old RAID to another, new RAID with rsync, the dump command would not complete because of filesystem errors on the new RAID. So I ran fsck.ext3 with the -y option, and it would just run in an infinite loop, restarting itself and then trying to correct the same inodes over and over again. Some of the errors were lots of orphaned inodes.

So I ran a tar tape backup and reformatted the new RAID with mkfs.ext3. After restoring the tar backup, I got the same errors when I ran fsck.ext3 -f, and again fsck.ext3 -y would run in an infinite loop trying to correct the problem. So I formatted it again and ran fsck immediately afterwards, and it is still detecting corrupted orphaned inode lists. This is on what should be a pristine filesystem with basically no files. Is this a hardware problem with the RAID, a bug somewhere, or just normal behavior?
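
One thing I can check is whether the superblock itself still records an orphan list on the
freshly formatted filesystem; dumpe2fs prints a "First orphan inode:" line when one is set:

dumpe2fs -h /dev/sda1 | grep -i orphan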
 
[root <at> myhost /]# mkfs.ext3 /dev/sda1
mke2fs 1.35 (28-Feb-2004)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
268435456 inodes, 536868352 blocks
26843417 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
16384 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
        102400000, 214990848, 512000000

Writing inode tables: done                           
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 24 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
[root <at> myhost /]# fsck
fsck         fsck.cramfs  fsck.ext2    fsck.ext3    fsck.msdos   fsck.vfat
[root <at> myhost /]# man fsck.ext3
[root <at> myhost /]# fsck.ext3 -C /root/completion /dev/sda1
e2fsck 1.35 (28-Feb-2004)
/dev/sda1: clean, 11/268435456 files, 8450084/536868352 blocks
[root <at> myhost /]# fsck.ext3 -f -C /root/completion /dev/sda1
e2fsck 1.35 (28-Feb-2004)
Pass 1: Checking inodes, blocks, and sizes
Inodes that were part of a corrupted orphan linked list found.  Fix<y>? yes   

Inode 26732609 was part of the orphaned inode list.  FIXED.
Inode 26732609 has imagic flag set.  Clear<y>? yes

Inode 26732610 is in use, but has dtime set.  Fix<y>? yes

Inode 26732611 is in use, but has dtime set.  Fix<y>? yes

Inode 26732611 has imagic flag set.  Clear<y>? yes

Inode 26732612 is in use, but has dtime set.  Fix<y>? yes

Inode 26732613 is in use, but has dtime set.  Fix<y>? yes

Inode 26732613 has imagic flag set.  Clear<y>?

/dev/sda1: e2fsck canceled.

/dev/sda1: ***** FILE SYSTEM WAS MODIFIED *****
[root <at> myhost /]# fsck.ext3 -f /dev/sda1
e2fsck 1.35 (28-Feb-2004)
Pass 1: Checking inodes, blocks, and sizes
Inode 26732609 is in use, but has dtime set.  Fix<y>?

_______________________________________________
Ext3-users mailing list
Ext3-users <at> redhat.com
https://www.redhat.com/mailman/listinfo/ext3-users
