> I have absolutely no experience or knowledge enough to help you out with
> your problem, but I was wondering: how does one monitor the temperature of
> the hard drives? I have sensors on my motherboard that I presume get the
> ambient temperature readings inside the case, and my hard drives are not
> too far away, but I was wondering if maybe hard drives have embedded
> thermal sensors? Or should the paranoid install some thermal sensor
> plastered on each drive?
You can probably use S.M.A.R.T. for that purpose (via smartd or smartctl):
# /usr/sbin/smartctl -a /dev/hdb
Device: IBM-DTLA-307045 Supports ATA Version 5
Drive supports S.M.A.R.T. and is enabled
Check S.M.A.R.T. Passed
General Smart Values:
Off-line data collection status: (0x00) Offline data collection activity was never started
Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run
Total time to complete off-line
data collection: (2294) Seconds
Offline data collection
Capabilities: (0x1b) SMART EXECUTE OFF-LINE IMMEDIATE
Automatic timer ON/OFF support
Suspend Offline Collection upon new command
Offline surface scan supported
Self-test supported
Smart Capablilities: (0x0003) Saves SMART data before entering
power-saving mode
Supports SMART auto save timer
Error logging capability: (0x01) Error logging supported
Short self-test routine
recommended polling time: ( 2) Minutes
Extended self-test routine
recommended polling time: ( 28) Minutes
Vendor Specific SMART Attributes with Thresholds:
Revision Number: 16
Attribute                     Flag    Value  Worst  Threshold  Raw Value
(  1)Raw Read Error Rate      0x000b  100    100    060        000000000000
(  2)Throughput Performance   0x0005  132    132    050        000000000157
(  3)Spin Up Time             0x0007  090    090    024        00060143013e
(  4)Start Stop Count         0x0012  100    100    000        00000000000b
(  5)Reallocated Sector Ct    0x0033  100    100    005        000000000000
(  7)Seek Error Rate          0x000b  100    100    067        000000000000
(  8)Seek Time Preformance    0x0005  130    130    020        000000000022
(  9)Power On Hours           0x0012  100    100    000        000000000450
( 10)Spin Retry Count         0x0013  100    100    060        000000000000
( 12)Power Cycle Count        0x0032  100    100    000        00000000000b
(192)Unknown Attribute        0x0032  100    100    050        00000000000b
(193)Unknown Attribute        0x0012  100    100    050        00000000000b
(194)Unknown Attribute        0x0002  105    105    000        000000000034
(196)Reallocated Event Count  0x0032  100    100    000        000000000000
(197)Current Pending Sector   0x0022  100    100    000        000000000000
(198)Offline Uncorrectable    0x0008  100    100    000        000000000000
(199)UDMA CRC Error Count     0x000a  200    200    000        000000000000
SMART Error Log:
SMART Error Logging Version: 1
No Errors Logged
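
On many drives attribute 194 is the drive temperature in degrees Celsius, even though this smartctl version only labels it "Unknown Attribute". The raw values are printed in hex, so 0x34 would be 52. Here is a rough Python sketch that pulls that number out of output in the layout shown above. The names parse_attributes and temperature_celsius are made up for illustration, and reading the low byte of the raw value as degrees Celsius is an assumption that differs between vendors, so check it against your drive's documentation.

```python
import re

# Matches attribute rows in the smartctl layout pasted above, e.g.
# "(194)Unknown Attribute 0x0002 105 105 000 000000000034"
ROW = re.compile(
    r"^\(\s*(\d+)\)(.+?)\s+0x[0-9a-fA-F]{4}"
    r"\s+(\d+)\s+(\d+)\s+(\d+)\s+([0-9a-fA-F]{12})\s*$"
)


def parse_attributes(text):
    """Return {attribute_id: raw_value_as_int} for every attribute row found."""
    attrs = {}
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m:
            # The raw value column is hexadecimal.
            attrs[int(m.group(1))] = int(m.group(6), 16)
    return attrs


def temperature_celsius(attrs):
    """Best-guess temperature from attribute 194, or None if it is absent.

    Assumption: the low byte of the raw value is the current temperature
    in degrees Celsius. This is common but vendor specific.
    """
    raw = attrs.get(194)
    if raw is None:
        return None
    return raw & 0xFF


# A few rows copied from the output above.
sample = """\
( 9)Power On Hours 0x0012 100 100 000 000000000450
(193)Unknown Attribute 0x0012 100 100 050 00000000000b
(194)Unknown Attribute 0x0002 105 105 000 000000000034
"""

attrs = parse_attributes(sample)
print(temperature_celsius(attrs))  # prints 52
```

To watch the temperature over time you could feed the script the output of the smartctl command above from a cron job, or let smartd do the periodic polling for you.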