Recovering from an Unresponsive Storage Subsystem Condition

A storage subsystem can have an Unresponsive status for several reasons. Use the procedure in this topic to determine a possible cause and solution.


Important:

The storage management software can take up to five minutes to detect that a storage subsystem has become unresponsive or has become responsive again. Before you complete this procedure, wait at least five minutes before you decide that the storage subsystem is still unresponsive.
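
The waiting period in the note above can be sketched as a small polling helper. This is an illustration only, not part of the storage management software: the name wait_for_responsive and the check callable are hypothetical, and the 30-second polling interval is an assumption. Only the five-minute window comes from this topic.

```python
import time

# "Up to five minutes" comes from the note above; everything else is assumed.
DETECTION_WINDOW_SECONDS = 5 * 60


def wait_for_responsive(check, window=DETECTION_WINDOW_SECONDS, interval=30,
                        clock=time.monotonic, sleep=time.sleep):
    """Poll check() until it returns True or `window` seconds have elapsed.

    `check` is a hypothetical callable supplied by the caller that returns
    True when the storage subsystem is seen as responsive.
    Returns True if that happened within the window, otherwise False.
    """
    deadline = clock() + window
    while True:
        if check():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            # The full detection window has passed; treat as still unresponsive.
            return False
        # Never sleep past the deadline.
        sleep(min(interval, remaining))
```

The clock and sleep parameters exist only so that the helper can be exercised without actually waiting five minutes.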

  1. Check the Tree View to see if all storage subsystems are unresponsive.
  2. Are any storage subsystems unresponsive?
  3. Make sure that the controllers are installed and that there is power to the storage subsystem.
  4. Is there a problem with the storage subsystem?
  5. Perform one of these actions, depending on how your storage subsystem is managed:
  6. For an out-of-band managed storage subsystem, use the ping command to verify that each controller is network accessible. Type one of these commands, and press Enter.
  7. Is the verification successful?
  8. Remove the storage subsystem with the Unresponsive status from the Enterprise Management Window (EMW), and select Add Storage Subsystem to add the storage subsystem again.
  9. Does the storage subsystem return to Optimal status?
  10. Check the Ethernet cables to make sure that there is no visible damage and that they are securely connected.
  11. Make sure the appropriate network configuration tasks have been performed. For example, make sure that IP addresses have been assigned to each controller.
  12. Is there a cable or network accessibility problem?
  13. For an in-band managed storage subsystem, use the ping command to verify that the host is network accessible. Type one of these commands, and press Enter.
  14. Is the verification successful?
  15. Remove the host with the Unresponsive status from the EMW, and select Add Storage Subsystem to add the host again.
  16. Does the host return to Optimal status?
  17. Make sure that the host is turned on and operational and that the host bus adapters have been installed.
  18. Check all external cables and switches or hubs to make sure that no visible damage exists and that they are securely connected.
  19. Make sure that the host-agent software is installed and running. If you started the host system before it was connected to the controllers in the storage subsystem, the host-agent software cannot detect the controllers. If this is the case, make sure that the connections are secure, and restart the host-agent software.
  20. If you have recently replaced or added a controller, restart the host-agent software so that the new controller is recognized.
  21. Does a problem still exist?
  22. Check with other administrators to see if a firmware upgrade was performed on the controller from another storage management station. If a firmware upgrade was performed, the EMW on your management station might not be able to locate the new Subsystem Management Window (SMW) software needed to manage the storage subsystem with the new version of the firmware.
  23. Does a problem still exist?
  24. Determine if there is an excessive amount of network traffic to one or more controllers. This problem is self-correcting because the EMW software periodically tries to reestablish communication with the controllers in the storage subsystem. If the storage subsystem was unresponsive and a subsequent connection attempt succeeds, the storage subsystem becomes responsive.

    For an out-of-band managed storage subsystem, determine if management operations are taking place on the storage subsystem from other storage management stations. Each controller limits the number of Transmission Control Protocol/Internet Protocol (TCP/IP) connections that it accepts; after that limit is reached, the controller stops responding to further connection attempts. The type of management operations being performed and the number of management sessions taking place together determine the number of TCP/IP connections made to a controller. This problem is self-correcting because, after some TCP/IP connections terminate, the controller becomes responsive to other connection attempts.

  25. Is the storage subsystem still unresponsive?
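
The ping checks in the steps above (controllers for out-of-band management, the host for in-band management) can also be scripted when several addresses need to be tested. The following is a minimal sketch, not a tool provided by the storage management software. The function names are hypothetical, and it assumes that a ping executable is on the path and accepts -c (UNIX-like systems) or -n (Windows) for the echo count.

```python
import platform
import subprocess


def build_ping_command(address, count=4, system=None):
    """Return the argument list for pinging `address` `count` times."""
    system = system or platform.system()
    # Windows ping uses -n for the echo count; UNIX-like systems use -c.
    count_flag = "-n" if system == "Windows" else "-c"
    return ["ping", count_flag, str(count), address]


def is_reachable(address, count=4, timeout_seconds=30):
    """Return True if ping exits with status 0 for `address`."""
    try:
        result = subprocess.run(
            build_ping_command(address, count),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,
        )
    except (subprocess.TimeoutExpired, OSError):
        # A hung or missing ping executable is treated as "not reachable".
        return False
    return result.returncode == 0


# Example use (the addresses are placeholders from the documentation range
# 192.0.2.0/24; substitute the controller IP addresses or the host name):
#
#     for target in ["192.0.2.10", "192.0.2.11"]:
#         print(target, "reachable" if is_reachable(target) else "NOT reachable")
```

An exit status of 0 is only a rough indicator. On some platforms ping can report success even when the reply is an unreachable notice from an intermediate device, so check the command output directly if the result is in doubt.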

Related Links:

  • Learn About Unresponsive Device Conditions