Offline Controller
What Caused the Problem?
The controller was placed Offline. This could be caused by the following:
- The controller failed a diagnostic test and was automatically placed Offline. The diagnostics are initiated internally by the controller or by the Advanced >> Troubleshooting >> Run Diagnostics >> Controller menu option.
- The controller was manually placed Offline using the Advanced >> Recovery >> Place Controller >> Offline menu option.
The Recovery Guru Details area provides specific information you will need as you follow the recovery steps.
Caution: Possible loss of data accessibility.
If the Summary area is reporting any type of "miswire" problem, then it
is very important that you resolve any miswire problem first. Once
all miswire problems have been resolved, you can continue with this
procedure. Resolving non-miswire problems with miswire problems still
present can lead to a loss of data accessibility.
Caution: Possible loss of data accessibility. Do not remove a component when either (1) the Service action (removal) allowed (SAA) field in the Details area of this recovery procedure is NO, or (2) the SAA LED on the affected component is OFF (note that some products do not have SAA LEDs). Removing a component while its SAA LED is OFF may result in temporary loss of access to your data. Refer to the following Important Notes and the Recovery Steps for more detail.
Caution: Electrostatic discharge can damage sensitive components. Always use proper antistatic protection when handling components. Touching components without using a proper ground may damage the equipment.
Important Notes
- Service Action Allowed Important Information:
- The Service action (removal) allowed field in the Details area indicates whether or not you can safely remove the component. If the SAA field is NO, then the affected component must remain in place until you service another component first.
- The Service action LED on Component field in the Details area indicates whether or not a physical SAA LED is present on the hardware component. This field does NOT indicate whether the LED is ON or OFF (that indication is provided by the Service action (removal) allowed field).
- If a component does not have an SAA LED, then it is OK to remove the component when its fault LED is lit and the Service action (removal) allowed field = YES in the Details area.
- The Service action (removal) allowed field shown in the Details area and the physical SAA LED on the hardware component (if supported) MUST match before you remove the affected component. In rare cases (such as multiple problems), the status of the LED and the SAA field may not match. If there is a mismatch, then you should NOT remove the component until these indications match. The Recovery Steps will help you to make this determination.
- Not all models of storage subsystems contain batteries. If your storage subsystem does not contain batteries, you can ignore any references to them in the following steps. To determine if your storage subsystem contains batteries, use the Storage Subsystem >> View >> Profile option and select the Enclosures tab. Storage subsystems without batteries will not have those components listed.
- If write caching on any logical drive is enabled, it might be automatically suspended until this issue is corrected and all batteries are fully charged (if batteries are present). To see if caching is suspended, use the Logical Drive >> Change >> Cache Settings menu option. Any option with the red dot icon is currently suspended.
- Logical Drives assigned to the Offline controller have been moved to the storage subsystem's Online controller.
- If the controllers for this storage subsystem are located in an enclosure containing both controllers and drives, you may have to insert the battery from the old controller canister into the new replacement controller canister. Consult the following steps and your hardware documentation for details.
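The Service Action Allowed rules in these notes can be summarized as a small check. The Python sketch below is illustrative only and is not part of the storage management software: SaaStatus, safe_to_remove, and every field name are hypothetical stand-ins for values you read from the Details area and observe on the hardware.

```python
from dataclasses import dataclass


@dataclass
class SaaStatus:
    """Hypothetical record of what the operator sees (names are assumptions)."""
    removal_allowed: bool       # "Service action (removal) allowed" field: YES -> True
    led_present: bool           # "Service action LED on Component" field: YES -> True
    led_on: bool = False        # observed state of the physical SAA LED, if present
    fault_led_on: bool = False  # observed state of the component's fault LED


def safe_to_remove(status: SaaStatus) -> bool:
    """Return True only when the notes above say removal is safe."""
    # The SAA field must be YES in every case.
    if not status.removal_allowed:
        return False
    # With a physical SAA LED, the LED must match the field (that is, be ON).
    if status.led_present:
        return status.led_on
    # Without an SAA LED, the fault LED must be lit.
    return status.fault_led_on
```

A False result corresponds to the cases where the steps below tell you to click Recheck, fix other reported problems first, or contact your technical support representative.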
Recovery Steps
1 |
If... |
Then... |
You were instructed to manually place the controller Offline in another procedure |
Go to step 2. |
The controller was automatically placed Offline |
Go to step 2. |
You know the controller was manually placed Offline but there is nothing wrong with the controller |
Place the controller back Online by highlighting the controller in the Physical View of the Subsystem Management Window and selecting Advanced >> Recovery >> Place Controller >> Online.
Go to step 4.
|
|
2 |
Check the Service Action Allowed status in the Details area.
If... |
Then... |
Service action (removal) allowed = YES |
Check the Service action LED on Component field in the Details area and then answer the following question:
Does the field indicate that this component has a physical SAA LED?
- Yes - Check the physical component to ensure the LED is ON.
- If the LED is ON, then go to step 3.
- If the LED is OFF, then click the Recheck button to see if another problem appears in the Summary area.
- If there is another problem, fix it first and then return to this procedure.
- If there is no other problem, then stop this procedure and contact your technical support representative.
- No - Go to step 3.
|
Service action (removal) allowed = NO |
Answer the following question:
Are there other problems being reported in the Summary area?
- Yes - Fix these problems first and then return to this procedure after clicking the Recheck button.
- No - Stop this procedure and contact your technical support representative.
|
|
3 |
a |
Remove the defective controller. The defective controller (A or B) is listed in the Recovery Guru Details area.
Note: Before you insert a new controller canister into an Out-of-Band Managed Storage Subsystem (refer to the Network Management Type column in the Enterprise Management Window), you must update the DHCP/BOOTP server so that it will associate the new controller's hardware Ethernet address with the DNS/network name and IP address previously assigned to the removed controller.
To update the DHCP/BOOTP server, find the entry associated with the removed controller and replace its Ethernet address with the new controller's Ethernet address. The controller's Ethernet address is located on an Ethernet ID label on the controller canister in the form xx.xx.xx.xx.xx.xx.
|
b |
If... |
Then... |
The controllers for this storage subsystem are located in an enclosure containing both controllers and drives |
Check to see if the new controller canister contains a battery.
- If your model of storage subsystem does not contain batteries, go to step c.
- If your model of storage subsystem is supposed to contain batteries and...
- There is not a battery installed in the new controller canister, then install the battery from the old canister and go to step c.
- There is a battery installed in the new controller canister, then go to step c.
|
The controllers for this storage subsystem are located in an enclosure containing only controllers |
Go to step c. |
|
|
c |
Make sure at least 1 minute has elapsed since you removed the defective controller. Then, insert the new controller canister firmly into place.
Note the controller slot (A or B) of the affected controller listed in the Recovery Guru Details area. Highlight this controller slot in the Physical View of the Subsystem Management Window.
If... |
Then... |
The controller indicates that it is Online |
Go to step d. |
The controller indicates that it is Offline |
Select Advanced >> Recovery >> Place Controller >> Online and then go to step d. |
|
d |
If... |
Then... |
The controllers for this storage subsystem are located in an enclosure containing both controllers and drives |
Determine whether you need to reset the battery age.
- If your model of storage subsystem does not contain batteries, go to step 4.
- If your model of storage subsystem is supposed to contain batteries and...
- You installed the battery from the old controller canister, then you don't need to reset the battery age. Go to step 4.
- There was already a battery in the new replacement controller canister, then you must reset the battery age using the following procedure:
Select the Components button on the enclosure containing the controllers in the Physical View of the Subsystem Management Window. Highlight the Batteries option and select the Reset button associated with the new controller canister (A or B). Then, go to step 4.
|
The controllers for this storage subsystem are located in an enclosure containing only controllers |
Go to step 4. |
|
|
4 |
If you have logical drives mapped to hosts that have Automatic Logical Drive Transfer (ADT) disabled, it may be necessary to redistribute the logical drives to their preferred controller. Use the following steps to determine the ADT status of the hosts connected to your storage subsystem:
a |
Open the Storage Subsystem Profile by selecting the Storage Subsystem >> View >> Profile menu option from the Subsystem Management Window. Then, select the profile's Mappings tab. |
b |
Scroll to the NVSRAM Host Type Internal Definitions section. |
c |
If... |
Then... |
There are hosts mapped to logical drives on this storage subsystem that have an ADT status of disabled
OR
There are hosts mapped to logical drives on this storage subsystem that are not running a host-based, multi-path failover driver
|
It may be necessary to redistribute the logical drives to their preferred controller. If the Subsystem Management Window's Advanced >> Recovery >> Redistribute Logical Drives menu option is available, select the option.
Note: If you have a mix of hosts with ADT enabled and ADT disabled, all logical drives will be immediately assigned back to their preferred path. However, until the host-based, multi-path failover driver detects the valid preferred path (which may take several minutes), the logical drives mapped to the ADT-enabled hosts may be temporarily returned to the non-preferred path.
If the menu option is not available (grayed out), the logical drives are already associated with their preferred controllers and no action is needed.
Go to step 5.
|
ALL hosts mapped to logical drives on this storage subsystem have an ADT status of enabled AND
All hosts mapped to logical drives on this storage subsystem are running a host-based, multi-path failover driver
|
No action is required.
If logical drives need to be redistributed to their preferred controller, the host-based, multi-path failover driver will automatically initiate the transfer.
Note that detection of a restored preferred path by the multi-path failover driver can take several minutes.
Go to step 5.
|
|
|
|
5 |
Click the Recheck button to rerun the Recovery Guru.
The failure should no longer appear in the Summary area. If
the failure appears again, contact your technical support
representative. |
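The decision in step 4c can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a tool supplied with the storage management software: the Host record and the needs_manual_redistribution function are hypothetical, and the ADT status and failover driver values are ones you would collect by hand from the Mappings tab of the profile and from each host.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Host:
    """Hypothetical summary of one host mapped to logical drives."""
    name: str
    adt_enabled: bool          # ADT status from the NVSRAM Host Type Internal Definitions
    has_failover_driver: bool  # host runs a host-based, multi-path failover driver


def needs_manual_redistribution(hosts: Iterable[Host]) -> bool:
    """True if any host has ADT disabled OR lacks a failover driver (first row of step 4c)."""
    return any(not h.adt_enabled or not h.has_failover_driver for h in hosts)


if __name__ == "__main__":
    hosts = [
        Host("host-a", adt_enabled=True, has_failover_driver=True),
        Host("host-b", adt_enabled=False, has_failover_driver=True),
    ]
    # host-b has ADT disabled, so redistribution may be necessary.
    print(needs_manual_redistribution(hosts))  # True
```

When the function returns True, use the Advanced >> Recovery >> Redistribute Logical Drives menu option if it is available. When it returns False, no action is required because the failover driver initiates the transfer.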