Introduction

We recently experienced a drive failure in a RAID array attached to a Compaq ProLiant 1600R running Solaris. The RAID controller in question is a Compaq Smart Array 5304 (128MB cache). The log excerpts below, reported by the cpqary3 driver, show what to expect:

Drive Failure

The failure is detected:

cpqary3: [ID 702911 kern.warning] WARNING:
cpqary3: [ID 103154 kern.warning] WARNING: Bus = 1 : Device = 11 : Function = 0
cpqary3: [ID 404339 kern.notice]  Event Occured on ......... 03/26/2002
cpqary3: [ID 678209 kern.notice]  Event Time................ 13:00:27
cpqary3: [ID 269178 kern.notice]  Description............... Physical drive failure: SCSI port 3 ID 2
cpqary3: [ID 715728 kern.notice]  Physical Drive Num........ 16
cpqary3: [ID 647361 kern.notice]  Failure Reason............ UNKNOWN
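
Each event is a run of "label ..... value" lines, so it can be turned into a dictionary for alerting or reporting. The following Python is only a rough sketch, not part of the driver or of any Compaq tool: the names LINE_RE, FIELD_RE and parse_event are made up for illustration, and it assumes the lines look exactly like the excerpt above (a real syslog file would normally put a timestamp and hostname in front, which the search tolerates).

```python
import re

# Driver tag, message ID and facility.level, followed by the message text.
# search() is used later so a leading syslog timestamp/hostname is tolerated.
LINE_RE = re.compile(r"cpqary3: \[ID (\d+) ([\w.]+)\]\s*(.*)$")
# Field lines are a label, a run of at least two dots, then an optional value.
FIELD_RE = re.compile(r"^(.*?)\s*\.{2,}\s*(.*)$")


def parse_event(lines):
    """Collect the dotted 'label .... value' fields of one event into a dict."""
    fields = {}
    for line in lines:
        m = LINE_RE.search(line.rstrip())
        if not m:
            continue  # not a cpqary3 message
        f = FIELD_RE.match(m.group(3))
        if f:  # the WARNING header lines carry no dotted field and are skipped
            fields[f.group(1)] = f.group(2)
    return fields


sample = [
    "cpqary3: [ID 702911 kern.warning] WARNING:",
    "cpqary3: [ID 103154 kern.warning] WARNING: Bus = 1 : Device = 11 : Function = 0",
    "cpqary3: [ID 404339 kern.notice]  Event Occured on ......... 03/26/2002",
    "cpqary3: [ID 678209 kern.notice]  Event Time................ 13:00:27",
    "cpqary3: [ID 269178 kern.notice]  Description............... Physical drive failure: SCSI port 3 ID 2",
    "cpqary3: [ID 715728 kern.notice]  Physical Drive Num........ 16",
    "cpqary3: [ID 647361 kern.notice]  Failure Reason............ UNKNOWN",
]

event = parse_event(sample)
print(event["Description"])
print(event["Physical Drive Num"])
```

Note that the label "Event Occured on" is spelled the way the driver prints it, so any lookup has to use the same spelling.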

A hot spare is located:

cpqary3: [ID 702911 kern.warning] WARNING:
cpqary3: [ID 103154 kern.warning] WARNING: Bus = 1 : Device = 11 : Function = 0
cpqary3: [ID 404339 kern.notice]  Event Occured on ......... 03/26/2002
cpqary3: [ID 678209 kern.notice]  Event Time................ 13:00:27
cpqary3: [ID 269178 kern.notice]  Description............... State change, logical drive 0
cpqary3: [ID 677830 kern.notice]  Logical Drive Num......... 0
cpqary3: [ID 407483 kern.notice]  Prev Logical Drive State.. OK
cpqary3: [ID 732945 kern.notice]  New Logical Drive State... Regenerating
cpqary3: [ID 553769 kern.notice]  Current Spare Status......
cpqary3: [ID 166510 kern.notice]   Defined
cpqary3: [ID 509785 kern.notice]   Available

The hot spare is activated:

cpqary3: [ID 702911 kern.warning] WARNING:
cpqary3: [ID 103154 kern.warning] WARNING: Bus = 1 : Device = 11 : Function = 0
cpqary3: [ID 404339 kern.notice]  Event Occured on ......... 03/26/2002
cpqary3: [ID 678209 kern.notice]  Event Time................ 13:00:27
cpqary3: [ID 269178 kern.notice]  Description............... State change, logical drive 0
cpqary3: [ID 677830 kern.notice]  Logical Drive Num......... 0
cpqary3: [ID 407483 kern.notice]  Prev Logical Drive State.. Regenerating
cpqary3: [ID 732945 kern.notice]  New Logical Drive State... Needs Rebuild Permission
cpqary3: [ID 553769 kern.notice]  Current Spare Status......
cpqary3: [ID 166510 kern.notice]   Defined
cpqary3: [ID 974324 kern.notice]   Active

... and rebuild begins:

cpqary3: [ID 702911 kern.warning] WARNING:
cpqary3: [ID 103154 kern.warning] WARNING: Bus = 1 : Device = 11 : Function = 0
cpqary3: [ID 404339 kern.notice]  Event Occured on ......... 03/26/2002
cpqary3: [ID 678209 kern.notice]  Event Time................ 13:00:28
cpqary3: [ID 269178 kern.notice]  Description............... State change, logical drive 0
cpqary3: [ID 677830 kern.notice]  Logical Drive Num......... 0
cpqary3: [ID 407483 kern.notice]  Prev Logical Drive State.. Needs Rebuild Permission
cpqary3: [ID 732945 kern.notice]  New Logical Drive State... Rebuilding
cpqary3: [ID 553769 kern.notice]  Current Spare Status......
cpqary3: [ID 166510 kern.notice]   Defined
cpqary3: [ID 974324 kern.notice]   Active
cpqary3: [ID 388622 kern.notice]   Building
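
Taken together, the three state-change events above move logical drive 0 from OK through Regenerating and Needs Rebuild Permission to Rebuilding. A small script can pull those transitions out of the log and confirm that none is missing. This is a sketch in Python under the assumption that the "Prev" line always comes directly before its "New" line, as it does in the excerpts; state_transitions and is_continuous are illustrative names, not part of any vendor tool.

```python
import re

PREV_RE = re.compile(r"Prev Logical Drive State\.+\s*(.+?)\s*$")
NEW_RE = re.compile(r"New Logical Drive State\.+\s*(.+?)\s*$")


def state_transitions(lines):
    """Return (previous, new) state pairs in the order they were logged."""
    transitions = []
    prev = None
    for line in lines:
        m = PREV_RE.search(line)
        if m:
            prev = m.group(1)
            continue
        m = NEW_RE.search(line)
        if m and prev is not None:
            transitions.append((prev, m.group(1)))
            prev = None
    return transitions


def is_continuous(transitions):
    """True if every event starts in the state the previous event ended in."""
    return all(a[1] == b[0] for a, b in zip(transitions, transitions[1:]))


# Only the state lines from the three events above are repeated here.
log = """\
cpqary3: [ID 407483 kern.notice]  Prev Logical Drive State.. OK
cpqary3: [ID 732945 kern.notice]  New Logical Drive State... Regenerating
cpqary3: [ID 407483 kern.notice]  Prev Logical Drive State.. Regenerating
cpqary3: [ID 732945 kern.notice]  New Logical Drive State... Needs Rebuild Permission
cpqary3: [ID 407483 kern.notice]  Prev Logical Drive State.. Needs Rebuild Permission
cpqary3: [ID 732945 kern.notice]  New Logical Drive State... Rebuilding
"""

transitions = state_transitions(log.splitlines())
for prev, new in transitions:
    print(prev, "->", new)
print("continuous:", is_continuous(transitions))
```

A gap in the chain would suggest that a message was lost or that the log was rotated partway through the sequence.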

Drive Replacement

We powered the machine down and booted the RAID tools from the Compaq Diagnostic partition. Once the tools had loaded, we were able to swap the defective drive and restart Solaris. On boot, the following messages were recorded:

cpqary3: [ID 702911 kern.warning] WARNING:
cpqary3: [ID 103154 kern.warning] WARNING: Bus = 1 : Device = 11 : Function = 0
cpqary3: [ID 404339 kern.notice]  Event Occured on ......... 03/26/2002
cpqary3: [ID 678209 kern.notice]  Event Time................ 13:00:27
cpqary3: [ID 269178 kern.notice]  Description............... Hot-plug drive removed: SCSI port 3 ID 2
cpqary3: [ID 486352 kern.notice]  Physical Drive Num ....... 16
cpqary3: [ID 479030 kern.notice]  Configured Drive ? ....... YES
cpqary3: [ID 702911 kern.warning] WARNING:
cpqary3: [ID 103154 kern.warning] WARNING: Bus = 1 : Device = 11 : Function = 0
cpqary3: [ID 404339 kern.notice]  Event Occured on ......... 03/26/2002
cpqary3: [ID 678209 kern.notice]  Event Time................ 13:00:58
cpqary3: [ID 269178 kern.notice]  Description............... Hot-plug drive inserted: SCSI port 3 ID 2
cpqary3: [ID 486352 kern.notice]  Physical Drive Num ....... 16
cpqary3: [ID 479030 kern.notice]  Configured Drive ? ....... YES
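
The removal and insertion events carry their own date and time, so the interval between pulling the old drive and seating the new one can be read straight from the log. The Python below is a minimal sketch of that calculation; hotplug_events is a hypothetical helper, and it assumes the date is printed as MM/DD/YYYY and that the date and time lines precede the Description line, as they do in the excerpt.

```python
import re
from datetime import datetime

# One dotted field line: driver tag, then "label .... value".
FIELD_RE = re.compile(
    r"cpqary3: \[ID \d+ [\w.]+\]\s+(.*?)\s*\.{2,}\s*(.*?)\s*$"
)
PREFIX = "Hot-plug drive "


def hotplug_events(lines):
    """Return (timestamp, action, location) for each hot-plug event."""
    events = []
    date = time = None
    for line in lines:
        m = FIELD_RE.search(line)
        if not m:
            continue
        label, value = m.groups()
        if label == "Event Occured on":  # spelling as printed by the driver
            date = value
        elif label == "Event Time":
            time = value
        elif label == "Description" and value.startswith(PREFIX):
            action, _, location = value[len(PREFIX):].partition(": ")
            when = datetime.strptime(f"{date} {time}", "%m/%d/%Y %H:%M:%S")
            events.append((when, action, location))
    return events


sample = [
    "cpqary3: [ID 404339 kern.notice]  Event Occured on ......... 03/26/2002",
    "cpqary3: [ID 678209 kern.notice]  Event Time................ 13:00:27",
    "cpqary3: [ID 269178 kern.notice]  Description............... Hot-plug drive removed: SCSI port 3 ID 2",
    "cpqary3: [ID 404339 kern.notice]  Event Occured on ......... 03/26/2002",
    "cpqary3: [ID 678209 kern.notice]  Event Time................ 13:00:58",
    "cpqary3: [ID 269178 kern.notice]  Description............... Hot-plug drive inserted: SCSI port 3 ID 2",
]

events = hotplug_events(sample)
removed, inserted = events
elapsed = (inserted[0] - removed[0]).total_seconds()
print(f"{removed[2]}: replaced in {elapsed:.0f} seconds")
```

For the messages above this reports 31 seconds between removal and insertion on SCSI port 3 ID 2.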

John Cartwright <johnc@grok.org.uk>