Re: Calculating overall data file fragmentation...
2014-04-17 09:37:01 GMT
Data fragmentation has almost always been a concern for 4D developers. I remember well the
demonstrations I showed during trainings, but that was with 4D v6.5 and 2004. I wrote DataAnalyzer in 1996
just to measure the impact of data fragmentation on response time. Things have changed a lot since then.
Now, there are many kinds of blocks inside a data file:
- Objects that you use and load: Records, LSO (Large Size Objects: Blobs, pictures, large texts, JSON)
- Objects that 4D uses (Address tables, block-allocation tables, etc.)
There are also many kinds of fragmentation. Of all these kinds, the only important ones are those that
force the disk head to move frequently. In fact, a disk reads a buffer whose size depends on the
manufacturer. When 4D loads one record, the disk loads a buffer, starting at the beginning of the record and
as large as the buffer size. If records follow each other in the proper order (i.e. the order in the address
table, for instance 1 2 3 4 5...), many subsequent records will be loaded in a single read. But if the records
are in reverse order (5 4 3 2 1), the system must ask the disk to reload the full buffer every time. This is why
negative fragmentation is very important, while a small positive fragmentation is not (for instance a few
empty blocks between consecutive records).
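The effect of record order on buffer reloads can be sketched with a toy model. This is my own
illustration, not 4D's internals: the record size, buffer size, and contiguous on-disk layout are all
assumptions chosen to make the effect visible.

```python
# Toy model of a disk that serves fixed-size read buffers.
# RECORD_SIZE and BUFFER_SIZE are hypothetical values, not 4D constants.
RECORD_SIZE = 512        # assumed size of one record, in bytes
BUFFER_SIZE = 4096       # assumed size of one disk read buffer, in bytes

def buffer_loads(record_order):
    """Count buffer reads needed to load records in the given order,
    assuming records are stored contiguously by record number and that
    each buffer read starts at the requested record's offset."""
    loads = 0
    window = None  # (start, end) of the byte range currently buffered
    for rec in record_order:
        start = rec * RECORD_SIZE
        end = start + RECORD_SIZE
        if window is None or start < window[0] or end > window[1]:
            # Record is not in the current buffer: reload from its start.
            window = (start, start + BUFFER_SIZE)
            loads += 1
    return loads

forward = buffer_loads(range(8))           # records 0..7 in address-table order
backward = buffer_loads(range(7, -1, -1))  # the same records in reverse order
print(forward, backward)  # → 1 8
```

In this sketch, eight 512-byte records fit exactly in one 4096-byte buffer, so the forward pass costs one
read while the reverse pass reloads the buffer for every record: the "negative fragmentation" case.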
Also, fragmentation matters only in some cases, all involving sequential access (a sequential query,
Selection to array, etc.), and only if this access is done in the default order (i.e. the order of records
inside the address table). Otherwise, the access is always random access, where the effect of
fragmentation is much less important.
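The same kind of toy model (again my own illustration, with assumed record and buffer sizes, not 4D
internals) shows why random access makes fragmentation nearly irrelevant: almost every randomly chosen
record falls outside the current buffer anyway, so on-disk ordering barely changes the read count.

```python
import random

# Hypothetical sizes, chosen only to make the effect visible.
RECORD_SIZE = 512
BUFFER_SIZE = 4096

def buffer_loads(offsets):
    """Count buffer reads needed to serve records at the given byte offsets,
    where each read buffers BUFFER_SIZE bytes from the requested offset."""
    loads, window = 0, None
    for off in offsets:
        if window is None or off < window or off + RECORD_SIZE > window + BUFFER_SIZE:
            window = off
            loads += 1
    return loads

recs = list(range(1000))
seq_offsets = [r * RECORD_SIZE for r in recs]   # default (address-table) order

random.seed(1)
shuffled = recs[:]
random.shuffle(shuffled)                        # random access pattern
rnd_offsets = [r * RECORD_SIZE for r in shuffled]

# Sequential order amortizes each buffer over several records;
# random order misses the buffer on almost every access.
print(buffer_loads(seq_offsets), buffer_loads(rnd_offsets))
```

With random access the read count is already close to one read per record, so whether the records are
well packed or scattered on disk changes little; sequential access in the default order is where packing
pays off.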
I think these are the questions you must ask yourself before being concerned about fragmentation:
- Do I access data sequentially? And if so, is it in the default order? If not, you don't need to care about fragmentation.