Dedicated Mirror Channel Failed
What Caused the Problem?
A dedicated cache mirroring channel has failed. The Recovery Guru Details area provides specific information you will need as you follow the Recovery Steps.
Caution: Electrostatic discharge can damage sensitive components. Always use proper antistatic protection when handling components. Touching components without a proper ground may damage the equipment.
Caution: Possible loss of data accessibility. Do not remove a component when either (1) the Service action (removal) allowed (SAA) field in the Details area of this recovery procedure is NO, or (2) the SAA LED on the affected component is OFF (note that some products do not have SAA LEDs). Removing a component while its SAA LED is OFF may result in temporary loss of access to your data. Refer to the following Important Notes and the Recovery Steps for more detail.
Important Notes
- The purpose of the dedicated cache mirror channel is to increase
the performance of mirroring cached data between the two
controllers.
- This problem can result from a fault in a dedicated mirror
channel inside one of the following hardware components:
- Controller (A or B)
- Interconnect-battery canister
(if applicable for your model of controller enclosure)
- If the failure is determined to be in a controller, then that
controller will have a Degraded status in the Physical View of the
Subsystem Management Window and in the Storage Subsystem Profile.
- There can be one or more dedicated mirror channels per
controller. If all of the dedicated mirror channels fail, then
write caching for all logical drives will be automatically suspended. Write
caching will be reinstated after the failed component has been
replaced (see the Recovery Steps below).
- Service Action Allowed Important Information:
- The Service action (removal) allowed field in the Details area indicates whether or not you can safely remove the component. If the SAA field is NO, then the affected component must remain in place until you service another component first.
- The Service action LED on Component field in the
Details area indicates whether or not a physical SAA LED is present on the hardware component. This field does NOT indicate whether the LED is ON or OFF (that indication is provided by the Service action (removal) allowed field).
- If a component does not have an SAA LED, then it is OK to remove the component when its fault LED is lit and the Service action (removal) allowed field = YES in the Details area.
- The Service action (removal) allowed field shown in the
Details area and the physical SAA LED on the hardware component (if supported) MUST match before you remove the affected component. In rare cases (such as multiple problems), the status of the LED and the SAA field may not match. If there is a mismatch, then you should NOT remove the component until these indications match. The Recovery Steps will help you to make this determination.
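The Service Action Allowed rules above can be summarized as a small decision function. This is only an illustrative sketch: the function and parameter names are hypothetical and do not correspond to any storage management software interface. It encodes what the notes state: a NO field always blocks removal, a present SAA LED must match the field, and a component without an SAA LED relies on its fault LED plus the field.

```python
from typing import Optional


def safe_to_remove(saa_field_yes: bool,
                   has_saa_led: bool,
                   saa_led_on: Optional[bool],
                   fault_led_on: bool) -> bool:
    """Return True only when the notes above permit removing the component.

    All parameter names are hypothetical stand-ins for what an operator
    reads in the Details area and on the hardware.
    """
    if not saa_field_yes:
        # SAA field is NO: another component must be serviced first.
        return False
    if has_saa_led:
        # The field is YES here, so the physical LED must also be ON.
        # A mismatch means: do not remove until the indications match.
        return saa_led_on is True
    # No SAA LED on this component: the fault LED must be lit.
    return fault_led_on


if __name__ == "__main__":
    # Field says YES but the LED is OFF: mismatch, leave the component in place.
    print(safe_to_remove(True, True, False, True))
```

A function like this is only a checklist aid. The Recovery Steps remain the authority when the LED and the field disagree.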
Recovery Steps
Refer to the 'Component requiring service' field in the Details area
to determine which procedure you need to complete.
If... | Then...
The 'Component requiring service' field displays a controller | Go to "Procedure for Replacing a Controller."
The 'Component requiring service' field displays "Unknown" | The problem cannot be traced to a particular component. It can be with Controller A, Controller B, or the interconnect-battery canister. Contact your technical support representative for help resolving this problem, and do not continue with the remaining Recovery Steps until instructed to do so by your technical support representative. Advanced users only: If you have a spare controller or interconnect-battery canister, you can swap in the spares to determine whether the problem is resolved. Use the steps in "Procedure for Replacing a Controller" to replace Controller A first, then Controller B, and then use the steps in "Procedure for Replacing an Interconnect-Battery Canister" to replace the interconnect-battery canister (if applicable for your model of controller enclosure). After replacing each component, click the Recheck button in the Recovery Guru to determine if the problem has been fixed.
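The advanced-user sequence described for the "Unknown" case (replace Controller A, then Controller B, then the canister, rechecking after each swap) can be sketched as a short loop. This is a hedged illustration only: the function names are hypothetical, and the callback stands in for the manual work of replacing a component and clicking Recheck. It is not an automation interface.

```python
from typing import Callable, List, Optional


def swap_order(has_interconnect_battery_canister: bool) -> List[str]:
    """Replacement order from the table: A, then B, then the canister."""
    order = ["Controller A", "Controller B"]
    if has_interconnect_battery_canister:
        # Only applicable for some controller enclosure models.
        order.append("Interconnect-battery canister")
    return order


def isolate_fault(order: List[str],
                  problem_persists_after_replacing: Callable[[str], bool]
                  ) -> Optional[str]:
    """Return the first component whose replacement clears the failure.

    The callback is a hypothetical stand-in for: replace the component,
    click Recheck, and see whether the failure is still reported.
    """
    for component in order:
        if not problem_persists_after_replacing(component):
            return component
    # Nothing cleared the failure: contact technical support.
    return None


if __name__ == "__main__":
    # Simulated case where the fault was in Controller B.
    found = isolate_fault(swap_order(True), lambda c: c != "Controller B")
    print(found)
```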
Procedure for Replacing a Controller
1 | If any hosts connected to this storage subsystem are NOT running a host-based, multi-path failover driver, stop I/O to the storage subsystem from each of these hosts.
2 | Place the affected controller offline. The affected controller is listed in the 'Component requiring service' field in the Details area.
    a | Select the controller in the Physical View of the Subsystem Management Window.
    b | Select Advanced >> Recovery >> Set Controller >> Offline.
    c | Complete the instructions in the dialog, then select Yes.
3 | Read all of the following steps before taking any action. The remaining Recovery Steps will no longer be accessible from the Recovery Guru dialog after you complete step a.
    a | Click the Recheck button to rerun the Recovery Guru.
    b | Select the "Offline Controller" problem that is being reported in the Summary area.
    c | Complete the Recovery Steps in the "Offline Controller" recovery procedure to replace the affected controller.
Procedure for Replacing an Interconnect-Battery Canister
1 | Obtain a replacement interconnect-battery canister with the same part number as the failed one. Note: The part number of the interconnect-battery canister can be found in the Storage Subsystem Profile or in the View Enclosure Components dialog.
2 | Prepare the interconnect-battery canister for removal by completing the following steps:
    a | Select the Advanced >> Troubleshooting >> Prepare for Removal menu option.
    b | Select the appropriate Enclosure (as listed in the Details area) and then the interconnect-battery canister from the respective drop-down menus.
    c | Click the Prepare for Removal button.
    d | If... | Then...
        The resulting dialog indicates that the component is safe to remove | Go to step 3.
        The resulting dialog indicates that the component is NOT safe to remove | Answer the following question: Are there other problems being reported in the Summary area?
        - Yes - Fix these problems first, click the Recheck button, and then return to this procedure.
        - No - Stop this procedure and contact your technical support representative.
3 | Remove the interconnect-battery canister.
    If... | Then...
    This storage subsystem is configured to operate without batteries | Make sure that the new canister does not contain batteries. If it is inserted with batteries, the storage subsystem will have a Needs Attention status. Go to step 4.
    This storage subsystem is configured to operate with batteries |
        If... | Then...
        The new canister contains batteries | Go to step 4.
        The new canister does NOT contain batteries | Remove the batteries from the failed canister and insert them into the new canister. Go to step 4.
4 | Insert the replacement interconnect-battery canister and make sure that it is securely seated.
5 | Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area. If the failure appears again, contact your technical support representative.
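The battery handling in step 3 of the canister procedure above depends on two yes/no facts, so it can be sketched as a four-way lookup. This is an illustrative sketch with hypothetical names. The returned strings paraphrase the table and are not messages produced by the storage management software.

```python
def battery_action(subsystem_uses_batteries: bool,
                   new_canister_has_batteries: bool) -> str:
    """Map the two conditions from step 3 to the action the table gives.

    Function and parameter names are hypothetical, for illustration only.
    """
    if not subsystem_uses_batteries:
        if new_canister_has_batteries:
            # Inserting it as-is would cause a Needs Attention status.
            return "Do not insert as-is: the new canister must not contain batteries"
        return "Go to step 4"
    if new_canister_has_batteries:
        return "Go to step 4"
    return ("Move the batteries from the failed canister into the new "
            "canister, then go to step 4")


if __name__ == "__main__":
    # Subsystem runs with batteries, spare canister arrived without them.
    print(battery_action(True, False))
```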