Armistead, Jason | 1 Aug 2006 02:11

Ensuring data is written to disk

I've been following the thread about disk data consistency with some
interest.  Given that many IDE disk drives may choose to hold data in their
write buffers before actually writing it to disk, and given that the
ordering of the writes may not be the same as the OS or application expects,
the only obvious way I can see to overcome this, and ensure the data is
truly written to the physical platters without disabling write caching is to
overwhelm the disk drive with more data than can fit in its internal write
buffer.

So, if you have an IDE disk with an 8MB cache, guess what, send it an 8MB
chunk of random data to write out when you do an fsync().  Better still,
locate this 8MB as close to the middle of the travel of its heads, so that
performance is not affected any more than necessary.  If the drive firmware
uses a LILO or LRU policy to determine when to do its disk writes,
overwhelming its buffers should ensure that the actual data you sent to it
gets written out.

Of course, guessing the disk drive write buffer size and trying not to kill
system I/O performance with all these writes is another question entirely
... sigh !!!

Jason
Anthony Liguori | 1 Aug 2006 04:58

Re: AltGr keystrokes

On Mon, 31 Jul 2006 12:13:14 +0200, Eric Hameleers wrote:

> Are other people with international keyboards having these issues as well?
> What is the difference with the old RFB patch that the currently built-in
> VNC server handles differently?

The old VNC patch uses LibVNCServer which is a derivative of the old Xvnc
codebase.  Details of why libvncserver didn't meet my goals have been
addressed in previous threads but suffice to say, I felt that starting
from scratch was necessary.

The VNC protocol does a rather poor job in defining what keycodes
correspond to what key events.  It roughly says to follow the X keysyms
but warns that many clients do weird things and does not actually document
all of those things.

Additionally, there's also translation to QEMU's keycodes.  Fortunately,
these things are easy to fix once we know what codes are being sent by the
client.  Unfortunately, I don't have an nl keyboard lying around so I need
some help figuring that out :-)

Regards,

Anthony Liguori

> Eric
Gaetano Sferra | 1 Aug 2006 08:14

RE: Re: AltGr keystrokes

Wait!
This doesn't answer my question; I never mentioned VNC servers or
clients.
If you want replies about a similar but quite different matter, start a new
topic; don't take over mine.

Thank you,
--
Gaetano Sferra

Jamie Lokier | 1 Aug 2006 12:17

Re: Ensuring data is written to disk

Armistead, Jason wrote:
> I've been following the thread about disk data consistency with some
> interest.  Given that many IDE disk drives may choose to hold data in their
> write buffers before actually writing it to disk, and given that the
> ordering of the writes may not be the same as the OS or application expects,
> the only obvious way I can see to overcome this, and ensure the data is
> truly written to the physical platters without disabling write caching is to
> overwhelm the disk drive with more data than can fit in its internal write
> buffer.
> 
> So, if you have an IDE disk with an 8MB cache, guess what, send it an 8MB
> chunk of random data to write out when you do an fsync().  Better still,
> locate this 8MB as close to the middle of the travel of its heads, so that
> performance is not affected any more than necessary.  If the drive firmware
> uses a LILO or LRU policy to determine when to do its disk writes,
> overwhelming its buffers should ensure that the actual data you sent to it
> gets written out.

It doesn't work.

I thought that too, for a while, as a way to avoid sending CACHEFLUSH
commands for fs journal ordering when there is a lot of data being
written anyway.

But there is no guarantee that the drive uses a LILO or LRU policy,
and if the firmware is optimised for cache performance over a range of
benchmarks, it won't use those - there are better strategies.

You could write 8MB to the drive, but it could easily pass through the
cache without evicting some of the other data you want written.
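
As a rough illustration of the point above, here is a toy model in C. It is not
based on any real firmware: the 8-slot cache, the "flush whichever dirty block
is nearest the head" policy, and all names (toy_drive, toy_write,
flood_leaves_block_cached) are assumptions made up for this sketch. It only
shows that with a policy other than LILO/LRU, flooding the cache need not push
out the block you care about.

```c
#include <stdlib.h>

#define CACHE_SLOTS 8            /* hypothetical cache size, in blocks */

struct toy_drive {
    long dirty[CACHE_SLOTS];     /* LBAs waiting to be written to the platter */
    int  used;                   /* number of occupied slots */
    long head;                   /* current head position, as an LBA */
};

/* Accept a write.  If the cache is full, first flush the dirty block
 * closest to the head (shortest seek first) to make room. */
static void toy_write(struct toy_drive *d, long lba)
{
    if (d->used == CACHE_SLOTS) {
        int best = 0;
        for (int i = 1; i < d->used; i++)
            if (labs(d->dirty[i] - d->head) < labs(d->dirty[best] - d->head))
                best = i;
        d->head = d->dirty[best];              /* head moves to the flushed block */
        d->dirty[best] = d->dirty[--d->used];  /* drop it from the cache */
    }
    d->dirty[d->used++] = lba;
}

/* Return 1 if the given LBA is still only in the cache, 0 otherwise. */
static int toy_is_dirty(const struct toy_drive *d, long lba)
{
    for (int i = 0; i < d->used; i++)
        if (d->dirty[i] == lba)
            return 1;
    return 0;
}

/* Write one "important" block far from the head, then flood the drive with
 * flood_blocks writes near the head.  Returns 1 if the important block is
 * still sitting in the cache afterwards. */
int flood_leaves_block_cached(int flood_blocks)
{
    struct toy_drive d = { .used = 0, .head = 1000 };

    toy_write(&d, 900000);                 /* the data we want on the platter */
    for (int i = 0; i < flood_blocks; i++)
        toy_write(&d, 1000 + i);           /* the "overwhelm the cache" writes */
    return toy_is_dirty(&d, 900000);
}
```

In this model, flooding with a cache-sized 8 blocks, or with many more, never
evicts block 900000, because a flood block is always the cheaper one to flush.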
Jens Axboe | 1 Aug 2006 12:45

Re: Ensuring data is written to disk

On Tue, Aug 01 2006, Jamie Lokier wrote:
> > Of course, guessing the disk drive write buffer size and trying not to kill
> > system I/O performance with all these writes is another question entirely
> > ... sigh !!!
> 
> If you just want to evict all data from the drive's cache, and don't
> actually have other data to write, there is a CACHEFLUSH command you
> can send to the drive which will be more dependable than writing as
> much data as the cache size.

Exactly, and this is what the OS fsync() should do once the drive has
acknowledged that the data has been written (to cache). At least
reiserfs w/barriers on Linux does this.

Random write tricks are worthless, as you cannot make any assumptions
about what the drive firmware will do.

-- 
Jens Axboe
Brad Campbell | 1 Aug 2006 13:17

Re: Run Real Time Guest OS?

Steve Ellenoff wrote:
> Is it possible to run a real time OS under qemu? What changes would need 
> to be made?
> Can it even be done?
> 
> The guest OS I'm trying to run sets the RTC System Timer 0 to a 0.25ms 
> interval (~4000Hz)!! The program I'm trying to run on it, expects this 
> time to be accurate, and as such, visually the program seems to be 4-5x 
> too slow in qemu, which makes sense given that it's delivering only a 
> 1024Hz timer irq.
> 
> I've noticed in the source code that qemu sets this max value of 1024Hz 
> (1ms) for the timer, which from what I understand is a limit of the 
> Linux kernel itself, ie, that's the most the kernel can support.
> 
Not at all. For a single qemu instance on Linux it tries to use the periodic timer in the RTC, and
I've seen this run up to 8192Hz. Why not crank it up in the qemu source to 4096 and see what happens?
It's not going to hurt anything in any case.
You would most certainly want a kernel HZ value of 1000 to try this.
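
For readers who have not used the RTC periodic interrupt, here is a minimal
sketch in C of requesting one through the Linux /dev/rtc interface. This is not
qemu's code; the helper names rtc_rate_is_valid and rtc_open_periodic are
hypothetical. It assumes the usual Linux constraint that the rate is a power of
two between 2 and 8192 Hz, which is why 4096 is the nearest match for a guest
asking for ~4000Hz. On many systems rates above 64Hz also need root or a raised
/proc/sys/dev/rtc/max-user-freq.

```c
#include <fcntl.h>
#include <unistd.h>
#ifdef __linux__
#include <linux/rtc.h>
#include <sys/ioctl.h>
#endif

/* The RTC periodic interrupt only accepts power-of-two rates from 2 to 8192 Hz. */
int rtc_rate_is_valid(unsigned long hz)
{
    return hz >= 2 && hz <= 8192 && (hz & (hz - 1)) == 0;
}

/* Open the RTC device (e.g. "/dev/rtc") and enable periodic interrupts at
 * the requested rate.  Returns a file descriptor, or -1 on failure.  Each
 * subsequent read() of an unsigned long on the descriptor blocks until the
 * next tick. */
int rtc_open_periodic(const char *dev, unsigned long hz)
{
#ifdef __linux__
    int fd;

    if (!rtc_rate_is_valid(hz))
        return -1;
    fd = open(dev, O_RDONLY);
    if (fd < 0)
        return -1;
    if (ioctl(fd, RTC_IRQP_SET, hz) < 0 ||   /* set the interrupt rate */
        ioctl(fd, RTC_PIE_ON, 0) < 0) {      /* turn periodic interrupts on */
        close(fd);
        return -1;
    }
    return fd;
#else
    (void)dev;
    (void)hz;
    return -1;                               /* interface is Linux-specific */
#endif
}
```

A caller would try rtc_open_periodic("/dev/rtc", 4096) and fall back to a
slower timer source if it returns -1.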

Brad
-- 
"Human beings, who are almost unique in having the ability
to learn from the experience of others, are also remarkable
for their apparent disinclination to do so." -- Douglas Adams
Alessandro Corradi | 1 Aug 2006 13:46

More about i386 mmu

Hi all,
Can I have some additional info regarding MMU emulation on i386? In particular,
the tech doc says that qemu uses the mmap system call to emulate the CPU MMU.
Can you help me understand this point? For instance, how does it translate a
virtual address into an I/O address and access the emulated devices?

Thanks

Ale

_______________________________________________
Qemu-devel mailing list
Qemu-devel <at> nongnu.org
http://lists.nongnu.org/mailman/listinfo/qemu-devel
Jamie Lokier | 1 Aug 2006 16:17

Re: Ensuring data is written to disk

Jens Axboe wrote:
> On Tue, Aug 01 2006, Jamie Lokier wrote:
> > > Of course, guessing the disk drive write buffer size and trying not to kill
> > > system I/O performance with all these writes is another question entirely
> > > ... sigh !!!
> > 
> > If you just want to evict all data from the drive's cache, and don't
> > actually have other data to write, there is a CACHEFLUSH command you
> > can send to the drive which will be more dependable than writing as
> > much data as the cache size.
> 
> Exactly, and this is what the OS fsync() should do once the drive has
> acknowledged that the data has been written (to cache). At least
> reiserfs w/barriers on Linux does this.

1. Are you sure this happens, w/ reiserfs on Linux, even if the disk
   is an SATA or SCSI type that supports ordered tagged commands?  My
   understanding is that barriers force an ordering between write
   commands, and that CACHEFLUSH is used only with disks that don't have
   more sophisticated write ordering commands.  Is the data still
   committed to the disk platter before fsync() returns on those?

2. Do you know if ext3 (in ordered mode) w/barriers on Linux does it too,
   for in-place writes which don't modify the inode and therefore don't
   have a journal entry?

On Darwin, fsync() does not issue CACHEFLUSH to the drive.  Instead,
it has an fcntl, F_FULLFSYNC, which does that and which is documented in
Darwin's fsync() page as working with all Darwin's filesystems,
provided the hardware honours CACHEFLUSH or the equivalent.
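
A minimal sketch in C of what an application might do with this, assuming only
POSIX write()/fsync() plus Darwin's F_FULLFSYNC fcntl where the headers define
it. The helper names (write_all, flush_to_media, durable_write) are
hypothetical. On platforms without F_FULLFSYNC it simply falls back to fsync(),
so whether the data reaches the platter still depends on the filesystem and
drive settings discussed in this thread.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Write the whole buffer, retrying on short writes and EINTR. */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Push the file's data as far towards stable storage as the OS allows. */
static int flush_to_media(int fd)
{
#ifdef F_FULLFSYNC
    /* Darwin: also ask the drive to flush its write cache. */
    if (fcntl(fd, F_FULLFSYNC) == 0)
        return 0;
    /* Not every filesystem supports it; fall back to plain fsync(). */
#endif
    return fsync(fd);
}

/* Create or truncate path, write len bytes from buf, and flush.
 * Returns 0 on success, -1 on error with errno set. */
int durable_write(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd < 0)
        return -1;
    if (write_all(fd, buf, len) != 0 || flush_to_media(fd) != 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return close(fd);
}
```

Trying F_FULLFSYNC first and falling back keeps the call site identical on
both systems; it does not make Linux's fsync() any more predictable.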

From what little documentation I've found, on Linux it appears to be
much less predictable.  It seems that some filesystems, with some
kernel versions, and some mount options, on some types of disk, with
some drive settings, will commit data to a platter before fsync()
returns, and others won't.  And an application calling fsync() has no
easy way to find out.  Have I got this wrong?

ps. (An aside question): do you happen to know of a good patch which
implements IDE barriers w/ ext3 on 2.4 kernels?  I found a patch by
googling, but it seemed that the ext3 parts might not be finished, so
I don't trust it.  I've found turning off the IDE write cache makes
writes safe, but with a huge performance cost.

Thanks,
-- Jamie
Fabrice Bellard | 1 Aug 2006 17:50

qemu osdep.c

CVSROOT:	/sources/qemu
Module name:	qemu
Changes by:	Fabrice Bellard <bellard>	06/08/01 15:50:07

Modified files:
	.              : osdep.c 

Log message:
	removed unused code

CVSWeb URLs:
http://cvs.savannah.gnu.org/viewcvs/qemu/osdep.c?cvsroot=qemu&r1=1.11&r2=1.12
Fabrice Bellard | 1 Aug 2006 17:50

qemu osdep.h

CVSROOT:	/sources/qemu
Module name:	qemu
Changes by:	Fabrice Bellard <bellard>	06/08/01 15:50:14

Modified files:
	.              : osdep.h 

Log message:
	removed unused code

CVSWeb URLs:
http://cvs.savannah.gnu.org/viewcvs/qemu/osdep.h?cvsroot=qemu&r1=1.6&r2=1.7
