I'm experimenting with using XFS on top of a network block device (DST),
and have run into a problem. When writing data to the network, DST uses
kernel_sendpage to hand the page presented at the BIO layer to the
network stack. It then completes the block IO request.

The problem arises when XFS then reuses that page before the NIC has
actually sent it. Particularly if TX checksumming or TCP segmentation is
offloaded to the NIC, it seems that the NIC will try to access the page
after the BIO request has completed, and so operate on data that no
longer matches what was submitted. I assume the same problem might
happen in the case of TCP retransmits or similar. The motivation for
using sendpage rather than sendmsg (or using sendpage on a copy of the
original page) is speed: it gives a zero-copy path through the
subsystem.
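
To make the hazard concrete, here is a rough user-space analogy in
Python. It is only a sketch: it does not call kernel_sendpage or touch
DST, and FakeNic and its methods are hypothetical names invented for
illustration. It shows a buffer queued by reference being overwritten
before the "hardware" reads it, next to a buffer that was copied at
submit time.

```python
# User-space analogy only. FakeNic and its methods are hypothetical;
# nothing here is a real kernel or DST interface.

class FakeNic:
    """Stands in for a NIC that reads queued buffers after submit returns."""

    def __init__(self):
        self.queue = []

    def submit_by_reference(self, page):
        # Zero-copy path: queue a view of the caller's buffer
        # (the role sendpage plays in the description above).
        self.queue.append(memoryview(page))

    def submit_by_copy(self, page):
        # Copying path: snapshot the bytes at submit time
        # (the role of sendmsg, or of sendpage on a copied page).
        self.queue.append(bytes(page))

    def transmit(self):
        # The "hardware" only reads the data now, some time after submit.
        sent = [bytes(buf) for buf in self.queue]
        self.queue.clear()
        return sent


nic = FakeNic()
page = bytearray(b"old data")

nic.submit_by_reference(page)
nic.submit_by_copy(page)

# The block request has "completed", so the owner reuses the page.
# The new contents have the same length, so the buffer is not resized.
page[:] = b"new data"

by_ref, by_copy = nic.transmit()
print("by reference:", by_ref)
print("by copy:     ", by_copy)
```

The by-reference entry goes out with the overwritten contents, while
the copied entry still carries what was originally submitted. The copy
is exactly the cost I am hoping to avoid.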

Is there any way, even in theory, for XFS to expose an API that lets an
underlying block device retain ownership of pages until it is done with
them, so as to avoid a potentially needless copy? Or is there another
way of achieving this?

Thanks in advance,

Matthew Hodgson <matthew@xxxxxxxxxxxxx>
Media & Systems Project Manager
Tel: +44 (0) 845 666 7778