Hi,
>>>> Seems to me that something is still dirtying an inode regularly.
>>>>
>>>> Perhaps you need to look at the XFS and writeback event traces to
>>>> find out what process is dirtying the inode. trace-cmd is your
>>>> friend...
>>> Something like this?
>>>
>>> -----
>>> echo 1 > /sys/kernel/debug/tracing/events/xfs/enable
>>>
>>> echo 0 > /sys/kernel/debug/tracing/events/xfs/enable
>>>
>>> more /sys/kernel/debug/tracing/trace
>>> -----
>>>
>>>
>>> I tried recreating the situation of the last 2 days (clean boot, stopped
>>> services) and it's currently quiescing nicely. :-(
>>>
>>> I'll keep an eye on it and try to catch it in the act but every time I
>>> turn the tracing on the HDD light stays firmly off. :-(
>> There is more interesting news already.
>>
>> I had used 'hdparm -S 120' to set the spindown_timeout to 10 minutes. It
>> appears that that was sticking through a cold boot. Setting that back to
>> its previous value of 1 (5 seconds) makes the disk constantly spin up
>> and down when I suspect it is idle.
>
> Well, that's kind of important to know.
>
> It takes XFS a minimum of 90s to idle a filesystem properly after
> any modification. Setting a spindown time shorter than this will
> cause the disk to spin up and down all the time until the filesystem
> idles itself.
>
> What else have you tuned on your system?
This is a new laptop: 5 seconds was the factory default. I increased it
to 10 minutes between my first and second posts in an attempt to
investigate the problem.

Further investigation reveals that I also need to switch off APM
('hdparm -B 255') on the disk; otherwise it still racks up spin-up/down
cycles long after boot, at a rate of 2 or 3 a minute, even if the
spindown_timeout is set to 10 minutes.
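
For reference, the '-S' values above are encoded rather than given in plain
seconds. A minimal sketch of the mapping as I read it in the hdparm man
page; the function name spindown_seconds is my own, purely for
illustration:

```shell
# Convert an 'hdparm -S' value into seconds, following the encoding
# described in the hdparm man page. spindown_seconds is a hypothetical
# helper name, not part of hdparm.
spindown_seconds() {
    v=$1
    if [ "$v" -lt 0 ] || [ "$v" -gt 255 ]; then
        echo invalid
        return 1
    elif [ "$v" -eq 0 ]; then
        echo disabled                # 0: standby timeout disabled
    elif [ "$v" -le 240 ]; then
        echo $((v * 5))              # 1..240: multiples of 5 seconds
    elif [ "$v" -le 251 ]; then
        echo $(((v - 240) * 1800))   # 241..251: 1..11 units of 30 minutes
    elif [ "$v" -eq 252 ]; then
        echo 1260                    # 252: 21 minutes
    elif [ "$v" -eq 253 ]; then
        echo vendor-defined          # 253: vendor-defined period
    elif [ "$v" -eq 254 ]; then
        echo reserved                # 254: reserved
    else
        echo 1275                    # 255: 21 minutes 15 seconds
    fi
}
```

So 'spindown_seconds 120' gives 600 (the 10 minutes I set) and
'spindown_seconds 1' gives 5 (the factory default), which is well under
the 90s mentioned above.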
>> I've caught a trace over the course of a few spinup/downs and attached
>> it (gzipped as it's 208K unpacked).
>
> Which you've taken about 90s after boot, so there are probably
> still dirty inodes due to the boot process. Indeed:
>
> flush-8:0-1225 [002] 91.103273: xfs_ilock: dev 8:6 ino 0x80a124 flags ILOCK_EXCL caller xfs_iomap_write_allocate
> flush-8:0-1225 [002] 91.103287: xfs_perag_get: dev 8:6 agno 2 refcount 28 caller xfs_bmap_btalloc_nullfb
> flush-8:0-1225 [002] 91.103290: xfs_perag_put: dev 8:6 agno 2 refcount 27 caller xfs_bmap_btalloc_nullfb
> flush-8:0-1225 [002] 91.103292: xfs_perag_get: dev 8:6 agno 3 refcount 32 caller xfs_bmap_btalloc_nullfb
> flush-8:0-1225 [002] 91.103293: xfs_perag_put: dev 8:6 agno 3 refcount 31 caller xfs_bmap_btalloc_nullfb
> flush-8:0-1225 [002] 91.103295: xfs_perag_get: dev 8:6 agno 2 refcount 28 caller xfs_alloc_vextent
>
> That's data writeback happening, so filesystem idling is still at
> least 90s away from this. So, it's no surprise your disk is
> spinning up and down here because there is IO being done every 5-10
> seconds which is in the same order of frequency as the IO the system
> is issuing....
OK. Thanks for pointing out my errors. I'll keep an eye on the situation.

Provided the spindown_timeout is >90s, would you expect the disk to idle
properly? Is there something else (other than the spindown_timeout) that
could be encouraging the disk to go to sleep that would be mitigated by
switching off APM?
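
For the next trace I intend to wait until well past the 90s idle window
before capturing. A rough sketch that wraps the same tracing commands I
used earlier; the function name capture_xfs_trace and its arguments are
my own invention, and it assumes root with debugfs mounted at the usual
place:

```shell
# Enable the XFS trace events for a fixed window, then save the trace.
# capture_xfs_trace is a hypothetical helper, not an existing tool.
#   $1: seconds to trace
#   $2: output file
#   $3: tracing directory (assumed default: /sys/kernel/debug/tracing)
capture_xfs_trace() {
    secs=$1
    out=$2
    tdir=${3:-/sys/kernel/debug/tracing}

    # Switch the XFS events on; give up if the tracing files are missing.
    echo 1 > "$tdir/events/xfs/enable" || return 1
    sleep "$secs"
    # Switch them off again before reading the buffer.
    echo 0 > "$tdir/events/xfs/enable"

    cat "$tdir/trace" > "$out"
}
```

Something like 'capture_xfs_trace 120 /tmp/xfs-trace.txt', run once the
machine has been left alone for a few minutes, should then show whether
anything is still dirtying inodes.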
Many thanks for your time; especially your efforts analysing the logs.
Regards,
@ndy
--
andyjpb@xxxxxxxxxxxxxx
http://www.ashurst.eu.org/
0x7EBA75FF