
O_DIRECT & rpm (for xfs)



Hi Jeff,

thanks for the background on the O_DIRECT problem. As I wrote, there
is some active discussion on the xfs-list about how to deal with
O_DIRECT/ext3/xfs/rpm, and your explanation will be very valuable to
the XFS community, especially since it comes from the rpm master
himself :)

For your background: rpm hanging under recent XFS patches, which enable
O_DIRECT support, generated some confusion about where this bug came
from.

Current XFS policy is to enable O_DIRECT only for XFS partitions.

Thanks!
-- 
Axel.Thimm@physik.fu-berlin.de
--- Begin Message ---
James Olin Oden wrote:

>On Fri, 29 Aug 2003, Jeff Johnson wrote:
>
>>>Well, O_DIRECT is being ripped out by force in the specfiles (BTW when
>>>you decided to rip it out, you forgot to change the release tag, so
>>>there are some versions of rpm-4.2-1 floating around with enabled
>>>O_DIRECT support). BTW2 there is some discussion on the xfs lists on
>>>the background of db4/O_DIRECT/sparse files bug. If you like I can add
>>>you to the discussion.
>>
>>Um, I'm less than happy about O_DIRECT: stupid interface, arcane problem.
>>The killer is/was that O_DIRECT was unleashed in the kernel with hardly
>>any warning, certainly not enough to do something rational in
>>distributing rpm.
>>
>>Sure there are versions around. I have the (ahem) joy of developing
>>without a release plan or schedule, with whatever I just built released
>>through Raw Hide and betas.
>>
>>Ready or not, here's rpm!
>>
>
>So is O_DIRECT something you have to use, or is it something that is good
>to use if you can for rpm?  I just googled for this, because I was not 
>sure what it was all about, and in effect it causes a file opened with 
>this flag to avoid the kernel's cache and go directly to the disk?  So, 
>I guess I am wondering, at a high level, why does rpm use this (i.e. I
>am looking for clue)?
>

O_DIRECT is similar to a "character device" in Unix, i.e. DMA to/from
user space.

Berkeley DB uses (used) O_DIRECT to "harden" the behavior of an
intrinsic file creation race opening __db.001 while establishing a
dbenv. The implementation is sane, and O_DIRECT is reasonable hardening.

O_DIRECT -- as instantiated in Linux -- is goofy and under development.
The behavior before was just to ignore O_DIRECT; it did not matter
whether it was used or not. Life was good.

O_DIRECT imposes size and alignment constraints on the I/O request, just
like character devices. The painful constraint is a page-aligned buffer
address, which necessitates valloc rather than just any buffer address.
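
As a rough illustration of that constraint, here is a minimal sketch of
allocating a buffer whose address is page aligned and whose length is a
multiple of the page size. It uses posix_memalign, the standardized
alternative to valloc. The helper names round_up and alloc_io_buffer are
hypothetical, invented for this example; they are not part of rpm or
Berkeley DB.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: round n up to the next multiple of align.
 * align must be a nonzero power of two. */
size_t round_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Hypothetical helper: allocate at least len bytes at a page-aligned
 * address, padded to a whole number of pages. The padded size is stored
 * in *actual_len when actual_len is non-NULL. Returns NULL on failure;
 * release the buffer with free(). */
void *alloc_io_buffer(size_t len, size_t *actual_len)
{
    long page = sysconf(_SC_PAGESIZE);
    void *buf = NULL;
    size_t padded;

    if (page <= 0)
        return NULL;

    padded = round_up(len, (size_t)page);
    if (padded == 0)
        padded = (size_t)page;   /* never request a zero-byte buffer */

    if (posix_memalign(&buf, (size_t)page, padded) != 0)
        return NULL;

    if (actual_len != NULL)
        *actual_len = padded;
    return buf;
}
```

A caller would then pass the returned pointer and the padded length to
read/write on the O_DIRECT descriptor, instead of an ordinary malloc'd
buffer and an arbitrary byte count.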

The other pain with O_DIRECT is that the EINVAL returned on alignment
failure comes from read/write, not from the open. This is different from
most syscalls, which either fail immediately, or the cause of failure is
obvious for other reasons.
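
To make the late failure concrete, here is a small probe, offered only
as a sketch: it opens a scratch file with O_DIRECT, then issues a
deliberately misaligned write and reports where (if anywhere) the error
shows up. The function name probe_misaligned_write and the enum are
hypothetical, and the outcome depends on the kernel and file system;
some reject O_DIRECT at open, some accept the misaligned request.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes O_DIRECT on Linux */
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0             /* assumption: no-op where the flag is absent */
#endif

enum probe_result {
    PROBE_OPEN_FAILED,         /* open() itself rejected the request */
    PROBE_WRITE_EINVAL,        /* open() succeeded, write() gave EINVAL */
    PROBE_WRITE_OTHER_ERROR,   /* some other failure along the way */
    PROBE_WRITE_OK             /* the misaligned write was accepted */
};

enum probe_result probe_misaligned_write(const char *path)
{
    long page = sysconf(_SC_PAGESIZE);
    void *buf = NULL;
    ssize_t n;
    int saved_errno;
    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC | O_DIRECT, 0600);

    if (fd < 0) {
        /* Remove the file in case it was created before the flag
         * was rejected. */
        unlink(path);
        return PROBE_OPEN_FAILED;
    }

    if (page <= 0 ||
        posix_memalign(&buf, (size_t)page, 2 * (size_t)page) != 0) {
        close(fd);
        unlink(path);
        return PROBE_WRITE_OTHER_ERROR;
    }
    memset(buf, 0, 2 * (size_t)page);

    /* Misalign on purpose: address one byte past a page boundary,
     * and a length that is not a multiple of any block size. */
    n = write(fd, (char *)buf + 1, 511);
    saved_errno = errno;

    free(buf);
    close(fd);
    unlink(path);

    if (n >= 0)
        return PROBE_WRITE_OK;
    return saved_errno == EINVAL ? PROBE_WRITE_EINVAL
                                 : PROBE_WRITE_OTHER_ERROR;
}
```

On a file system that enforces the constraints, the expected result is
PROBE_WRITE_EINVAL: the open succeeds and the error only appears at the
write, which is exactly the surprise described above.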

What is/was really, really broken is that O_DIRECT came out of nowhere,
was released without adequate warning, even kernel folks disagree on the
implementation, and the implementation is not complete. Very not good.
Works on some file systems, not on others, etc.

Sure, I can avoid O_DIRECT use in rpm; that's the only sane use of
O_DIRECT afaict.

But I cannot dictate what kernel users choose to run.

73 de Jeff


_______________________________________________
Rpm-list mailing list
Rpm-list@redhat.com
https://www.redhat.com/mailman/listinfo/rpm-list
--- End Message ---
