Re: posix_fallocate

To: Eric Sandeen <sandeen@xxxxxxxxxxx>
Subject: Re: posix_fallocate
From: Krzysztof Błaszkowski <kb@xxxxxxxxxxxxxxx>
Date: Tue, 11 May 2010 16:20:20 +0200
Cc: xfs@xxxxxxxxxxx
In-reply-to: <4BE85436.5040402@xxxxxxxxxxx>
Organization: Systemy mikroprocesorowe
References: <201005071022.37863.kb@xxxxxxxxxxxxxxx> <201005102017.11706.kb@xxxxxxxxxxxxxxx> <4BE85436.5040402@xxxxxxxxxxx>
User-agent: KMail/1.9.5
On Monday 10 May 2010 20:45, Eric Sandeen wrote:
> Krzysztof Błaszkowski wrote:
> > On Monday 10 May 2010 16:39, Eric Sandeen wrote:
> >> Krzysztof Błaszkowski wrote:
>
> ...
>
> >>> We stick with 2.6.31.5 which seems to be good for us. We do not change
> >>> kernels easily, as soon as higher revision arrives because it doesn't
> >>> make sense from stability point of view. We have seen too many times
> >>> regression bugs so if we are confident with some revision then there is
> >>> no point to change this.
> >>
> >> It was just a testing suggestion, but I already tested upstream and the
> >> problem persists, now just need to find the time to dig into it.
> >
> > I see and I am glad you confirmed this. Do you think that fallocate
> > called many times with fixed size and increasing offset will work better
> > than one time call with huge size @ 0 offset ?
>
> I'd expect that to work; it's certainly worth a test

agreed.

> , and please send your 
> results back to the list ;)

okay, will do this tomorrow.
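
For reference, the chunked variant (many calls with a fixed size and an
increasing offset) could be sketched roughly as below. This is only an
illustration: the helper names preallocate_in_chunks and file_size and the
failed_at out-parameter are my assumptions; only posix_fallocate() itself is
a real interface.

```c
/* Ask for 64-bit off_t so multi-terabyte sizes also fit on 32-bit builds. */
#ifndef _FILE_OFFSET_BITS
#define _FILE_OFFSET_BITS 64
#endif
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: preallocate the range [0, total) of fd in pieces of
 * at most `chunk` bytes, instead of one huge request at offset 0.
 * Returns 0 on success, or the error number returned by posix_fallocate()
 * (it returns the error directly and does not set errno).
 * On failure, *failed_at (if not NULL) receives the offset of the chunk
 * that failed.
 */
int preallocate_in_chunks(int fd, off_t total, off_t chunk, off_t *failed_at)
{
    if (total < 0 || chunk <= 0)
        return EINVAL;

    off_t off = 0;
    while (off < total) {
        /* The last piece may be shorter than `chunk`. */
        off_t len = (total - off < chunk) ? (total - off) : chunk;
        assert(len > 0);

        int err = posix_fallocate(fd, off, len);
        if (err != 0) {
            if (failed_at != NULL)
                *failed_at = off;
            return err;
        }
        off += len;
    }
    return 0;
}

/* Hypothetical helper: file size in bytes, or -1 if fstat() fails. */
off_t file_size(int fd)
{
    struct stat st;
    return (fstat(fd, &st) == 0) ? st.st_size : (off_t)-1;
}
```

Called as e.g. preallocate_in_chunks(fd, total, (off_t)1 << 30, &bad), it
would issue 1 GiB requests until total is covered and report the offset of
the first chunk that failed; file_size(fd) can then be compared with total.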


BUT let's think about the possible results:
- the test fails. Nothing to comment on.

- the test passes. This is the interesting case.

Does a passed test prove anything?
It may, but this is not obvious.
If the fault is caused by some algorithmic mistake (e.g. a table or buffer
size derived from the requested size), then the test result could be a proof.

But if it fails due to a missing spinlock/mutex somewhere, then we are talking
about a probability of failure which depends on the requested size.

The bad news is that this failure happens at different sizes depending on the
hw configuration. On some boxes the threshold is about 7T, while other
boxes fail only after e.g. 15T.
I am not sure about any relationship between the threshold and a box's
installed memory, number of logical cores and their frequency (and the
current workload).

In this latter case the test will prove nothing. If I run it 5 times and it
passes 5 times, it would only mean that I was lucky.
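
To put a rough number on "lucky": assuming (my simplification) that each run
fails independently with the same unknown probability p, five clean runs are
still quite likely even for a fairly large p. The value p = 0.1 below is made
up for illustration, not a measurement.

```latex
P(\text{5 passes}) = (1 - p)^5, \qquad
p = 0.1 \;\Rightarrow\; (0.9)^5 \approx 0.59

% Largest p still consistent (at the 95% level) with 5 clean runs:
(1 - p)^5 \ge 0.05 \;\Rightarrow\; p \le 1 - 0.05^{1/5} \approx 0.45
```

So under that assumption five passes could not rule out a per-run failure
probability as high as roughly 45%.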

As long as we don't know the exact nature of this fault, we can't consider
such a test proof of a reliable fix.


Krzysztof

PS Of course I will try this tomorrow afternoon (PL time), just to satisfy
curiosity, as all the high-capacity storage has already been shipped to
customers.

>
> thanks,
> -Eric
>
> > Krzysztof
> >
> >> -Eric
