[ale] DriveStatusError BadCRC

Brian Pitts bpitts at learnlink.emory.edu
Thu Sep 7 22:06:05 EDT 2006


This afternoon my desktop's logs started filling with messages like these:

Sep  7 19:30:17 localhost kernel: [17183013.668000] hdb: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Sep  7 19:30:17 localhost kernel: [17183013.668000] hdb: dma_intr: error=0x84 { DriveStatusError BadCRC }
Sep  7 19:30:17 localhost kernel: [17183013.668000] ide: failed opcode was: unknown
Sep  7 19:30:17 localhost kernel: [17183013.728000] ide0: reset: success
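
For reference, the two hex values in those messages can be decoded bit by bit. The sketch below is only an illustration: the bit positions are the standard ATA status and error register bits, the names for the bits that appear in the log are taken from the log itself, and the remaining names are my assumption about how the kernel labels them. The function name decode is hypothetical.

```python
# Bit-to-name tables for the ATA status and error registers.
# Names for bits seen in the log (DriveReady, SeekComplete, Error,
# DriveStatusError, BadCRC) come from the log; the other names are assumed.
STATUS_BITS = {
    0x80: "Busy",
    0x40: "DriveReady",
    0x10: "SeekComplete",
    0x08: "DataRequest",
    0x01: "Error",
}

ERROR_BITS = {
    0x80: "BadCRC",              # interface CRC error during a DMA transfer
    0x40: "UncorrectableError",
    0x10: "SectorIdNotFound",
    0x04: "DriveStatusError",    # command aborted
}


def decode(value, names):
    """Return the names of all bits set in value, highest bit first."""
    flags = [name for bit, name in names.items() if value & bit]
    # Report any set bits that the table does not cover instead of hiding them.
    unknown = value & ~sum(names)
    if unknown:
        flags.append(f"unknown(0x{unknown:02x})")
    return flags


print("status=0x51:", decode(0x51, STATUS_BITS))
print("error=0x84: ", decode(0x84, ERROR_BITS))
```

So 0x51 is 0x40 + 0x10 + 0x01 and 0x84 is 0x80 + 0x04, which matches the flag names the kernel printed inside the braces.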

I didn't notice this until the system slowed to a crawl, then froze when 
I tried to shut down. I restarted using SystemRescueCD and ran e2fsck, 
which found a lot of errors on /home: blocks shared between multiple 
inodes, invalid counts, etc.

Here's the result of a short test with smartctl:

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0007   125   113   021    Pre-fail  Always       -       4258
  4 Start_Stop_Count        0x0032   098   098   040    Old_age   Always       -       2895
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   200   200   051    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   090   090   000    Old_age   Always       -       7333
 10 Spin_Retry_Count        0x0013   100   100   051    Pre-fail  Always       -       0
 11 Calibration_Retry_Count 0x0013   100   100   051    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       691
194 Temperature_Celsius     0x0022   120   253   000    Old_age   Always       -       30
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0012   200   200   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x000a   200   253   000    Old_age   Always       -       388
200 Multi_Zone_Error_Rate   0x0009   200   155   051    Pre-fail  Offline      -       0
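
To make the table easier to scan, here is a rough sketch that pulls out the raw values of a few error counters and reports the ones that are nonzero. The choice of attribute IDs to watch (5, 197, 198, 199) is my own assumption for illustration, not an official list, and the function name nonzero_watched is hypothetical. The sample rows are copied from the output above.

```python
# Attribute IDs to watch. This set is an assumption made for the sketch.
WATCH = {5, 197, 198, 199}

# Four rows copied from the smartctl attribute table in this message.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0012   200   200   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x000a   200   253   000    Old_age   Always       -       388
"""


def nonzero_watched(table_text, watch=WATCH):
    """Return {attribute_name: raw_value} for watched attributes with raw > 0."""
    hits = {}
    for line in table_text.splitlines():
        fields = line.split()
        # A data row has ten whitespace-separated columns and starts with the ID.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attr_id, name, raw = int(fields[0]), fields[1], fields[9]
        if attr_id in watch and raw.isdigit() and int(raw) > 0:
            hits[name] = int(raw)
    return hits


print(nonzero_watched(SAMPLE))
```

On this data the only watched counter with a nonzero raw value is UDMA_CRC_Error_Count at 388, which lines up with the BadCRC flag in the kernel messages; the reallocated, pending, and offline-uncorrectable sector counts are all 0.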

Should I
a) run badblocks overnight and keep an eye on things in the future, or
b) expect drive failure in the near term and buy a replacement immediately?

I do have current backups on hda (which holds a rarely-booted copy of 
Windows) and older ones on another system.

Thanks,
Brian


