On 11/26/11 7:06 PM, Dave Chinner wrote:
> On Thu, Nov 24, 2011 at 05:50:42PM -0200, Carlos Maiolino wrote:
>> On Thu, Nov 24, 2011 at 05:20:51PM -0200, Carlos Maiolino wrote:
>>> xfsprogs (mainly mkfs) uses the logical sector size of a volume to
>>> initialize the filesystem. Even on devices using Advanced Format, this
>>> yields a 512-byte sector size if that is what the device reports as its
>>> logical sector size.
>>> This patch changes the ioctl to get the physical sector size, independent
>>> of the logical size.
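For readers following along, here is a minimal sketch of how the two sizes can be queried on Linux. It assumes the `BLKSSZGET` (logical) and `BLKPBSZGET` (physical) ioctls from `<linux/fs.h>`; the helper names `get_sector_sizes` and `get_sector_sizes_path` are hypothetical and are not taken from the patch, and the fallback to the logical size when `BLKPBSZGET` fails is an assumption of this sketch, not something the patch is known to do.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/fs.h>   /* BLKSSZGET, BLKPBSZGET */
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper: query the logical and physical sector sizes of an
 * open block device. Returns 0 on success, -1 if fd is not a block device
 * or the logical size cannot be read.
 */
static int get_sector_sizes(int fd, unsigned int *logical, unsigned int *physical)
{
    int lss = 0;            /* BLKSSZGET takes an int * */
    unsigned int pbs = 0;   /* BLKPBSZGET takes an unsigned int * */

    assert(logical && physical);

    if (ioctl(fd, BLKSSZGET, &lss) < 0 || lss <= 0)
        return -1;

    /* Assumption: if the kernel lacks BLKPBSZGET, treat physical == logical. */
    if (ioctl(fd, BLKPBSZGET, &pbs) < 0 || pbs == 0)
        pbs = (unsigned int)lss;

    *logical = (unsigned int)lss;
    *physical = pbs;
    return 0;
}

/* Hypothetical convenience wrapper: open the device node, query, close. */
static int get_sector_sizes_path(const char *path, unsigned int *logical,
                                 unsigned int *physical)
{
    int fd, ret;

    assert(path);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    ret = get_sector_sizes(fd, logical, physical);
    close(fd);
    return ret;
}
```

On a 512e Advanced Format drive one would expect the logical size to come back as 512 and the physical size as 4096, which is exactly the mismatch under discussion.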
>> For information, this patch proposal does not change the behaviour of
>> mkfs when the user is using libblkid, in which case mkfs relies on
>> libblkid to retrieve the disk topology and information.
>> I'm not sure libblkid is the best way to retrieve the device's sector
>> size here, since it does not provide a way to retrieve the physical
>> sector size, only the logical size, but I could be very wrong.
> If libblkid exports the PBS (physical block size) as exposed in
> /sys/block/<dev>/queue/physical_block_size, then we should be able
> to get it.
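As an illustration of the sysfs route mentioned above, the attribute is a plain text file holding one decimal number, so reading it is straightforward. This is only a sketch: the helper name `read_queue_attr` is hypothetical, and the caller is assumed to build the full `/sys/block/<dev>/queue/physical_block_size` path itself.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: read a single unsigned decimal value from a
 * sysfs-style attribute file. Returns 0 on success, -1 if the file
 * cannot be opened or does not start with a number.
 */
static int read_queue_attr(const char *path, unsigned long *value)
{
    FILE *f;
    unsigned long v = 0;
    int ok;

    assert(path && value);

    f = fopen(path, "r");
    if (!f)
        return -1;

    ok = (fscanf(f, "%lu", &v) == 1);
    fclose(f);

    if (!ok)
        return -1;
    *value = v;
    return 0;
}
```

The sibling attribute `logical_block_size` in the same directory can be read the same way, which makes it easy to compare the two values for a given device.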
> However, the issue in my mind is not whether it is supported, but
> what is the effect of making this change? The filesystem relies on
> the fact that the minimum guaranteed unit of atomic IO is a sector,
> not the PBS of the underlying disk. What guarantees do we have that,
> when we do a PBS sized IO, it doesn't get torn into sector sized IOs by
> the drive and hence only partially completed on failure? Indeed, if
> the filesystem is sector unaligned, it is -guaranteed- to have PBS
> sized IOs torn apart by the hardware....
> i.e. do we have any guarantee at all that a PBS sized IO will either
> wholly complete or wholly fail when PBS != sector size? And if not,
> why is this a change we should make given it appears to me to
> violate a fundamental assumption of the filesystem design?
I had the expectation that physical block size WAS the fundamental/atomic
IO size for the disk, and anything smaller required read/modify/write.
So I made this suggestion (and I think hch concurred) so that we weren't
doing log IOs which required RMW & translation.
i.e. for a 4k physical / 512 logical disk - wouldn't we want to choose
Ok, if we have mismanaged the alignment and aligned to logical, not
physical, then I guess there would be an issue... but at that point
we've already messed up (though not catastrophically I guess)...
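The alignment concern raised in this thread comes down to simple arithmetic, sketched below. The function names are hypothetical, offsets are in bytes from the start of the device, and the example of a partition starting at sector 63 is an illustrative assumption rather than a case from the thread. The sketch says nothing about whether the drive completes a physical block atomically; it only shows when an IO cannot map onto whole physical blocks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if a byte offset on the device falls on a physical block boundary. */
static bool is_pbs_aligned(uint64_t off, uint32_t pbs)
{
    assert(pbs != 0);
    return off % pbs == 0;
}

/* Number of physical blocks touched by an IO of len bytes at offset off. */
static uint64_t phys_blocks_touched(uint64_t off, uint64_t len, uint32_t pbs)
{
    assert(pbs != 0 && len != 0);
    return (off + len - 1) / pbs - off / pbs + 1;
}

/*
 * True if the IO covers only part of at least one physical block, so the
 * drive has to read/modify/write to service it.
 */
static bool needs_rmw(uint64_t off, uint64_t len, uint32_t pbs)
{
    assert(pbs != 0 && len != 0);
    return off % pbs != 0 || len % pbs != 0;
}
```

With a 4096-byte physical block, a 4096-byte IO at the start of a partition beginning at sector 63 (byte 32256) spans two physical blocks and needs RMW on both, while the same IO on a partition beginning at sector 2048 (byte 1048576) maps onto exactly one physical block.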