Hi Mark and Dave,
Thanks to both of you for your comments.
On 06/26/2012 10:38 AM, Dave Chinner wrote:
> On Mon, Jun 25, 2012 at 08:41:31PM +0800, Jeff Liu wrote:
>> Hello,
>>
>> Using the start offset rather than map->br_startoff to calculate the
>> starting page index could
>> get more accurate data offset in page cache probe routine.
>> With this refinement, the old max_t() could be able to remove too.
> ....
>> + }
>> + /*
>> + * xfs_bmapi_read() can handle repeated hole regions,
>> + * hence it should not return two extents both are
>> + * holes. If the 2nd extent is unwritten, there must
>> + * have data buffer resides in page cache.
>> + */
>> + BUG();
>
> That's wrong. A hole can be up to 32bits in length. When the hole is
> longer than that, you'll get two extents that are holes. Try working
> with sparse files that have holes in the order of a 100TB in them...
As I recall, we verified that xfs_bmapi_read() can handle repeated
hole extents, since the in-memory extent length is 64 bits wide, as
defined in:
struct xfs_bmbt_irec {
	....
	xfs_filblks_t	br_blockcount;
};
However, I can reproduce the issue with Mark's test case, simply by
creating a file with:
xfs_io -F -f -c "truncate 200M" -c "falloc $((50 << 20)) 50m" \
	-c "falloc $((100 << 20)) 50m" -c "pwrite $((150 << 20)) 50m"
So the file mapping is:
 0-50m   50m-100m                  100m-150m                 150m-200m
[hole  | unwritten_without_data  | unwritten_without_data  | data]
The current code hits BUG() here because the first unwritten extent has
no data buffer in the page cache.
So I have to call xfs_bmapi_read() in a loop, as before.
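To illustrate why a fixed two-extent lookup is not enough for the
mapping above, here is a toy user-space model of the loop. It is only a
sketch and not the kernel code: struct toy_extent, toy_seek_data() and
the cached_off field are hypothetical names invented for this example,
and the page cache probe is reduced to a single precomputed offset per
extent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: none of these names exist in XFS. */
enum ext_state { EXT_HOLE, EXT_UNWRITTEN, EXT_DATA };

struct toy_extent {
	uint64_t	start;		/* byte offset where the extent begins */
	uint64_t	len;		/* extent length in bytes */
	enum ext_state	state;
	int64_t		cached_off;	/* absolute offset of cached data in an
					 * unwritten extent, or -1 if none */
};

/*
 * Return the first offset >= pos that holds data, or -1 if there is
 * none. Every extent is visited in turn, so any number of consecutive
 * holes or empty unwritten extents is simply skipped.
 */
static int64_t
toy_seek_data(const struct toy_extent *ext, size_t n, uint64_t pos)
{
	for (size_t i = 0; i < n; i++) {
		uint64_t end = ext[i].start + ext[i].len;
		uint64_t from;

		if (pos >= end)
			continue;	/* extent lies entirely before pos */

		from = pos > ext[i].start ? pos : ext[i].start;

		switch (ext[i].state) {
		case EXT_DATA:
			return (int64_t)from;
		case EXT_UNWRITTEN:
			/* Data only if the (modelled) page cache has some. */
			if (ext[i].cached_off >= 0 &&
			    (uint64_t)ext[i].cached_off >= from)
				return ext[i].cached_off;
			break;
		case EXT_HOLE:
			break;
		}
	}
	return -1;
}
```

With the four extents above and a start offset of 0, the loop skips the
hole and both empty unwritten extents and returns the 150m offset,
instead of giving up after the second extent.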
>
> Also, as I've said before - BUG() does not belong in filesystem code
> that can return an error. Shut the filesystem down with an in-memory
> corruption error and maybe put an ASSERT(0) there so debug kernels
> trip over it. However, no filesystem "can not happen" logic error is
> a reason to panic a production machine.
Thanks for the reminder.
Regards,
-Jeff