Hi Jeff,
thanks for the background on the O_DIRECT problem. As I wrote, there
is some active discussion on the xfs-list about how to deal with
O_DIRECT/ext3/xfs/rpm, and your explanation will be very valuable to
the XFS community, especially since it comes from the rpm master
himself :)
For your background: The hanging of rpm with recent XFS patches due to
enabled O_DIRECT support generated some confusion about where this bug
came from.
Current XFS policy is to enable O_DIRECT only for XFS partitions.
Thanks!
--
Axel.Thimm@xxxxxxxxxxxxxxxxxxx
--- Begin Message ---
James Olin Oden wrote:
On Fri, 29 Aug 2003, Jeff Johnson wrote:
Well, O_DIRECT is being ripped out by force in the specfiles (BTW when
you decided to rip it out, you forgot to change the release tag, so
there are some versions of rpm-4.2-1 floating around with enabled
O_DIRECT support). BTW2 there is some discussion on the xfs lists on
the background of db4/O_DIRECT/sparse files bug. If you like I can add
you to the discussion.
Um, I'm less than happy about O_DIRECT: stupid interface, arcane problem.
The killer is/was that O_DIRECT was unleashed in the kernel with hardly
any warning, certainly not enough to do something rational in
distributing rpm.
Sure, there are versions around. I have the (ahem) joy of developing
without a release plan or schedule, with whatever I just built released
through Raw Hide and betas.
Ready or not, here's rpm!
So is O_DIRECT something you have to use, or is it something that is good
to use if you can for rpm? I just googled for this, because I was not
sure what it was all about, and in effect it causes a file opened with
this flag to avoid the kernel's cache and go directly to the disk? So,
I guess I am wondering, at a high level, why does rpm use this (i.e. I
am looking for clue)?
O_DIRECT is similar to a "character device" in Unix, i.e. DMA to/from
user space.
Berkeley DB uses (used) O_DIRECT to "harden" the behavior of an
intrinsic file creation race opening __db.001 while establishing a
dbenv. The implementation is sane, and O_DIRECT is reasonable hardening.
O_DIRECT -- as instantiated in linux -- is goofy and under development.
The behavior before was just to ignore O_DIRECT; it did not matter
whether it was used or not. Life was good.
O_DIRECT imposes size and alignment constraints on the I/O request, just
like character devices. The painful constraint is a page-aligned buffer
address, necessitating valloc; not just any buffer address will do.
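For illustration, here is a minimal sketch of meeting that alignment
constraint with posix_memalign, the POSIX successor to the obsolete
valloc. This is not code from rpm or Berkeley DB. The helper names
(is_aligned, round_up, alloc_page_aligned) are hypothetical, and using
the page size for both the address and the length is an assumption; the
real requirement depends on the kernel and the file system.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Nonzero if ptr is a multiple of align (align must be a power of two). */
static int is_aligned(const void *ptr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)ptr & (align - 1)) == 0;
}

/* Round len up to the next multiple of align (a power of two). */
static size_t round_up(size_t len, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (len + align - 1) & ~(align - 1);
}

/*
 * Allocate a page-aligned buffer of at least len bytes, padded to a
 * whole number of pages. Stores the padded size in *padded_len if it
 * is non-NULL. Returns NULL on failure; release with free().
 */
static void *alloc_page_aligned(size_t len, size_t *padded_len)
{
    long page = sysconf(_SC_PAGESIZE);
    void *buf = NULL;
    size_t padded;

    if (page <= 0)
        return NULL;
    padded = round_up(len, (size_t)page);
    if (posix_memalign(&buf, (size_t)page, padded) != 0)
        return NULL;
    if (padded_len != NULL)
        *padded_len = padded;
    return buf;
}
```

A buffer obtained this way can be handed to read/write on an O_DIRECT
descriptor, provided the file offset is suitably aligned as well.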
The other pain with O_DIRECT is that the EINVAL returned with alignment
failure comes from read/write, not from the open. This is different from
most syscalls, which either fail immediately or fail for a reason that
is obvious from context.
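One way an application could cope with the late EINVAL is sketched
below: prefer O_DIRECT, and if either the open or the first write
reports EINVAL, drop the flag and carry on with buffered I/O. This is
only a sketch under stated assumptions, not what rpm or Berkeley DB
actually do. The function name write_prefer_direct is hypothetical, and
the code assumes Linux, where fcntl(F_SETFL) is allowed to clear
O_DIRECT on an open descriptor.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* glibc exposes O_DIRECT only with this */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0             /* no O_DIRECT here: plain buffered I/O */
#endif

/*
 * Hypothetical helper: write len bytes from buf to path, preferring
 * O_DIRECT and falling back to buffered I/O on EINVAL. Returns the
 * write() result, or -1 with errno set.
 */
static ssize_t write_prefer_direct(const char *path, const void *buf,
                                   size_t len)
{
    ssize_t n;
    int fd, saved;

    assert(path != NULL && buf != NULL);

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0600);
    if (fd < 0 && errno == EINVAL) {
        /* Some file systems reject O_DIRECT already at open time. */
        fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    }
    if (fd < 0)
        return -1;

    n = write(fd, buf, len);
    if (n < 0 && errno == EINVAL) {
        /* The open succeeded; the alignment failure only shows up
         * here. Clear O_DIRECT on the descriptor and retry once. */
        int flags = fcntl(fd, F_GETFL);
        if (flags >= 0 && fcntl(fd, F_SETFL, flags & ~O_DIRECT) == 0)
            n = write(fd, buf, len);
    }

    saved = errno;             /* keep the write's errno across close */
    close(fd);
    errno = saved;
    return n;
}
```

The retry is deliberately limited to EINVAL, so that real errors such as
ENOSPC or EIO are still reported to the caller.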
What is/was really, really broken is that O_DIRECT came out of nowhere,
was released without adequate warning, and even kernel folks disagree on
the implementation, and the implementation is not complete. Very not
good. Works on some file systems, not on others, etc etc.
Sure I can avoid O_DIRECT use in rpm, that's the only sane use of
O_DIRECT afaict.
But I cannot dictate what kernel users choose to run.
73 de Jeff
_______________________________________________
Rpm-list mailing list
Rpm-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/rpm-list
--- End Message ---