On Fri, Jan 21, 2011 at 08:41:52AM -0600, Geoffrey Wehrman wrote:
> On Fri, Jan 21, 2011 at 01:51:40PM +1100, Dave Chinner wrote:
> | Realistically, for every disadvantage or advantage we can enumerate
> | for specific workloads, I think one of us will be able to come up
> | with a counter example that shows the opposite of the original
> | point. I don't think this sort of argument is particularly
> | productive. :/
>
> Sorry, I wasn't trying to be argumentative. Rather I was just documenting
> what I saw as potential issues. I'm not arguing against your proposed
> change. If you don't find my sharing of observations productive, I'm
> happy to keep my thoughts to myself in the future.
Ah, that's not what I meant, Geoffrey. Reading it back, I probably
should have said "direction of discussion" rather than "sort of
argument" to make it more obvious that I was trying not to get stuck
with us going round and round trying to demonstrate the pros and cons
of different approaches on a workload-by-workload basis.
Basically, all I was trying to do was move the discussion past a
potential sticking point - I definitely value the input and insight
you provide, and I'll try to write more clearly to hopefully avoid
such misunderstandings in future discussions.
> | Instead, I look at it from the point of view that a 64k IO is only a
> | little slower than a 4k IO, so such a change would not make much difference
> | to performance. And given that terabytes of storage capacity is
> | _cheap_ these days (and getting cheaper all the time), the extra
> | space of using 64k instead of 4k for sparse blocks isn't a big deal.
> |
> | When I combine that with my experience from SGI where we always
> | recommended using filesystem block size == page size for best IO
> | performance on HPC setups, there's a fair argument that using page
> | size extents for small sparse writes isn't a problem we really need
> | to care about.
> |
> | I'd prefer to design for where we expect storage to be in the next
> | few years e.g. 10TB spindles. Minimising space usage is not a big
> | priority when we consider that in 2-3 years 100TB of storage will
> | cost less than $5000 (it's about $15-20k right now). Even on
> | desktops we're going to have more capacity than we know what to do
> | with, so trading off storage space for lower memory overhead, lower
> | metadata IO overhead and lower potential fragmentation seems like
> | the right way to move forward to me.
> |
> | Does that seem like a reasonable position to take, or are there
> | other factors that you think I should be considering?
>
> Keep in mind that storage of the future may not be on spindles, and
> fragmentation may not be an issue. Even so, with SSDs 64K I/O is very
> reasonable as most flash memory implements at minimum a 64K page. I'm
> fully in favor of your proposal to require page-sized I/O.
With flash memory there is the potential that we don't even need to
care. The trend is towards on-device compression (e.g. Sandforce
controllers already do this) to reduce write amplification to values
lower than one. Hence a 4k write surrounded by 60k of zeros is
unlikely to be a major issue as it will compress really well.... :)
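
As a rough, hand-waving illustration of that point (not anything a real
controller actually runs - zlib here just stands in for whatever
compression the device firmware implements): take a 64k block holding 4k
of incompressible payload followed by 60k of zeros, squeeze it through a
compressor, and see what comes out the other end.

/*
 * Toy demonstration only: 4k of pseudo-random data padded out to a
 * 64k block with zeros, compressed with zlib.  Build with -lz.
 */
#include <stdio.h>
#include <stdlib.h>
#include <zlib.h>

int main(void)
{
	unsigned char in[64 * 1024] = { 0 };	/* the last 60k stays zero */
	uLongf outlen = compressBound(sizeof(in));
	unsigned char *out = malloc(outlen);
	size_t i;

	if (!out)
		return 1;

	/* 4k of pseudo-random payload at the front of the block */
	for (i = 0; i < 4 * 1024; i++)
		in[i] = rand() & 0xff;

	if (compress(out, &outlen, in, sizeof(in)) != Z_OK)
		return 1;

	printf("64k block with a 4k payload compressed to %lu bytes\n",
	       (unsigned long)outlen);
	free(out);
	return 0;
}

The long zero run compresses down to next to nothing, so the output
should come in at not much more than the 4k of real payload - which is
all the device would actually end up writing.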
Cheers,
Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx