[PC-BSD Support] A failed drive causes system to hang

Radio młodych bandytów radiomlodychbandytow at o2.pl
Sun Apr 14 03:10:26 PDT 2013


Cross-post from freebsd-fs:
http://docs.freebsd.org/cgi/getmsg.cgi?fetch=333977+0+archive/2013/freebsd-fs/20130414.freebsd-fs

I have a failing drive in my array. I need to RMA it, but I don't have
the time, and it fails rarely enough to be just another annoyance.
The failure is simple: the drive stops responding.
When that happens, the only thing I have found I can still do is switch
consoles. Every command hangs, logging in on another console hangs, and
applications hang.
I run PC-BSD 9.1.

On the first console I see a series of messages like:

(ada0:ahcich0:0:0:0): CAM status: Command timeout
(ada0:ahcich0:0:0:0): Error 5, Periph was invalidated
(ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED

I have seen it happen even while running an installer from a different
drive, while it was preparing the installation (I don't remember which
step).

I have partial dmesg screenshots from an older failure (21st of December
2012), transcribed below:

Screen 1:
(ada0:ahcich0:0:0:0): FLUSHCACHE40. ACB: (ea?) 00 00 00 00 (cut?)
(ada0:ahcich0:0:0:0): CAM status: Unconditionally Re-qu (cut)
(ada0:ahcich0:0:0:0): Error 5, Periph was invalidated
(ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 05 d3(cut)
00
(ada0:ahcich0:0:0:0): CAM status: Command timeout
(ada0:ahcich0:0:0:0): Error 5, Periph was invalidated
(ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 03 7b(cut)
00
(ada0:ahcich0:0:0:0): CAM status: Unconditionally Re-qu (cut)
(ada0:ahcich0:0:0:0): Error 5, Periph was invalidated
(ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 03 d0(cut)
00
(ada0:ahcich0:0:0:0): CAM status: Command timeout
(ada0:ahcich0:0:0:0): Error 5, Periph was invalidated

Screen 2:
ahcich0: Timeout on slot 29 port 0
ahcich0: (unreadable, lots of numbers, some text)
(aprobe0:ahcich0:0:0:0): ATA_IDENTIFY. ACB: (cc?) 00 (cut)
(aprobe0:ahcich0:0:0:0): CAM status: Command timeout
(aprobe0:ahcich0:0:0:0): Error (5?), Retry was blocked
ahcich0: Timeout on slot 29 port 0
ahcich0: (unreadable, lots of numbers, some text)
(aprobe0:ahcich0:0:0:0): ATA_IDENTIFY. ACB: (cc?) 00 (cut)
(aprobe0:ahcich0:0:0:0): CAM status: Command timeout
(aprobe0:ahcich0:0:0:0): Error (5?), Retry was blocked
ahcich0: Timeout on slot 30 port 0
ahcich0: (unreadable, lots of numbers, some text)
(ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 01 (cut)
(ada0:ahcich0:0:0:0): CAM status: Command timeout
(ada0:ahcich0:0:0:0): Error 5, Periph was invalidated
(ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 01 (cut)

Both screens are from the same event. In general, the messages

(ada0:ahcich0:0:0:0): CAM status: Command timeout
(ada0:ahcich0:0:0:0): Error 5, Periph was invalidated
(ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED.

are the most common.

And here is a recent one, though from a different drive (part of the
same array):
fuse4bsd: version 0.3.9-pre1, FUSE ABI 7.19
(ada1:ata0:0:0:0): READ_DMA48. ACB: 25 00 82 46 b8 40 25 00 00 00 01 00
(ada1:ata0:0:0:0): CAM status: Command timeout
(ada1:ata0:0:0:0): Retrying command
vboxdrv: fAsync=0 offMin=0x53d offMax=0x52b9
linux: pid 17170 (npviewer.bin): syscall pipe2 not implemented
(ada1:ata0:0:0:0): READ_DMA48. ACB: 25 00 87 1a c7 40 1a 00 00 00 01 00
(ada1:ata0:0:0:0): CAM status: Command timeout
(ada1:ata0:0:0:0): Retrying command

One thing pointed out on freebsd-fs is that the driver changed from
ahcich0 to ata0. I haven't changed any configuration here myself. Have
you changed some defaults?
-- 
Twoje radio

