____________________________________________________

DECevent Release Notes for OpenVMS

Order Number: AA-Q73LA-TE

The DECevent Release Notes for OpenVMS provide general release information and restrictions for DECevent.

Revision/Update Information:  October 27, 1995
DECevent Version:             V2.1
OpenVMS Alpha Version:        7.0

Digital Equipment Corporation
Maynard, Massachusetts

________________________________________________________________

First Edition: October 1993
Revised: October 1995

The information in this document is subject to change without notice and should not be construed as a commitment by Digital Equipment Corporation. Digital Equipment Corporation assumes no responsibility for any errors that may appear in this document.

The software described in this document is furnished under a license and may be used or copied only in accordance with the terms of such license.

No responsibility is assumed for the use or reliability of software on equipment that is not supplied by Digital Equipment Corporation or its affiliated companies.

Restricted Rights: Use, duplication, or disclosure by the U.S. Government is subject to restrictions as set forth in subparagraph (c) (1) (ii) of the Rights in Technical Data and Computer Software clause at DFARS 252.227-7013.

Copyright © Digital Equipment Corporation 1995

Printed in U.S.A. All Rights Reserved.

Alpha, CI, DEC, DEC 10000, DECevent, DECsystem, HSC, KDA, KDM, KRQ50, MicroVAX, RA, RQDX, RRD42, RV20, RX, RZ, TA, TK, TQK, TU, UDA, VAXserver, and the DIGITAL logo are trademarks of Digital Equipment Corporation.

This document was prepared using VAX DOCUMENT, Version 2.1.

_________________________________________________________________

Contents

Preface

1   General Release Information
    1.1    IPMT Entries for DECevent
    1.2    Starting Prerequisite Processes and Images
    1.3    Starting the DECevent Graphical User Interface
    1.4    Re-Installing DECevent
    1.5    Supported Devices

2   Comparing DECevent and Previous Digital Fault Management Tools
    2.1    Selection Criteria
    2.2    Thresholding
    2.2.1    Event Classes
    2.2.2    Event Severity
    2.2.3    Event Counting
    2.2.4    Comparison of Event Counts with Thresholds
    2.3    Autocopy
    2.4    Cluster-Wide Support
    2.5    Mixed Architecture Cluster Support
    2.6    Volume Labels
    2.7    DECevent Translation of Multiple Input Files

3   Known Restrictions
    3.1    File Specification Restriction
    3.2    Wildcard Restriction
    3.3    DIRECTORY CANONICAL Command Restriction
    3.4    SHOW THRESHOLD Command Restriction
    3.5    Manual Analysis
    3.6    Local Settings File
    3.7    Page File Quota Restriction
    3.8    JTquota Restriction
    3.9    /LOG Qualifier Restriction
    3.10   /FSTERR Report Type Restriction
    3.11   SET THRESHOLD/NAME Restriction
    3.12   Evidence Support
    3.13   /BEFORE and /SINCE Parameter Restrictions
    3.14   SHOW THRESHOLD Restriction
    3.15   RF74 Algorithm Incorrect
    3.16   Distribution List Name Restrictions
    3.17   TA9x Translation Problem
    3.18   RF31T Support
    3.19   HSD10 Support
    3.20   Automatic Analysis Restriction
    3.21   Command Restrictions with Automatic Analysis
    3.22   Set SICL Restriction
    3.23   Missing Information from the Output of the SHOW SUMMARY Command

4   Known Restrictions for the Graphical User Interface
    4.1    Graphical User Interface User Guide for OpenVMS
    4.2    GUI Startup Command
    4.3    Canceling GUI Displays
    4.4    Text Entries
    4.5    Accessing KNL Directories
    4.6    Brief Report Type Restriction
    4.7    Multiple Input File Restriction
    4.8    sys info Text Length Restriction

A   Supported Devices for Bit-To-Text Translation
    A.1    Fully Supported Devices for Bit-To-Text Translation
    A.2    Partial Device Support

B   Supported Devices for Analysis
    B.1    Fully Supported Devices for Analysis
    B.2    Devices with Partial Analysis Support by DECevent

_________________________________________________________________

Preface

The DECevent Release Notes for OpenVMS provide general information about DECevent as well as differences between DECevent and past fault management tools. They also describe restrictions and known problems in the current release of DECevent.

Intended Audience

The DECevent Release Notes for OpenVMS are intended for use by system managers and service personnel who use DECevent software.

Documentation Conventions

The following conventions are used in this manual:

boldface type    Boldface type indicates the first instance of terms being defined in text.

italic type      Italic type indicates emphasis and complete manual titles.

1
_________________________________________________________________

General Release Information

This section provides information pertaining to the general use of the DECevent event management tool.

1.1 IPMT Entries for DECevent

MCS Service Engineers who need to submit IPMT problems and solutions against DECevent should use the corporate IPMT server and enter the product name DECEVENT.

1.2 Starting Prerequisite Processes and Images

You must start all prerequisite processes and images, such as Motif, before SYS$STARTUP:SYSTARTUP_VMS.COM executes the DECevent startup command file, DECEVENT$STARTUP.COM. Otherwise, the decw$transport_common logical name is not defined when DECevent starts.

1.3 Starting the DECevent Graphical User Interface

DECevent now allows you to translate and analyze event logs with a graphical user interface (GUI). You invoke the GUI with the following command:

    $ DIAGNOSE/INTERFACE=DECW

Refer to Chapter 4 for known restrictions for the GUI. Refer to The DECevent Graphical User Interface User's Guide, AA-QE26A-TE, for information about how to use the GUI.

1.4 Re-Installing DECevent

When a new version of DECevent is installed on a system where a previous version existed, the file FMG_LOCAL_PARAM_LIBRARY.KNL must be deleted from the login directory (SYS$LOGIN:) of each user who ran the previous version of DECevent.

1.5 Supported Devices

Appendix A of these notes contains a list of all devices supported and recognized by DECevent's bit-to-text translation. Appendix B contains a list of all devices supported and recognized by the analysis and notification options of DECevent.

2
_________________________________________________________________

Comparing DECevent and Previous Digital Fault Management Tools

This section describes known differences between past fault management tools and DECevent.

2.1 Selection Criteria

Selection criteria for the /INCLUDE and /EXCLUDE qualifiers differ between former fault management tools and DECevent. The following list shows the valid selection keywords for DECevent.

    /INCLUDE=(keyword=[val] [,...])
    /EXCLUDE=(keyword=[val] [,...])

keywords:

    ATTENTIONS
    BUGCHECKS
    BUSES
    CACHE
    CONFIGURATIONS
    CONTROL_ENTRIES
    CPUS
    DEVICE_ERRORS
    DEVICE_NAME
    DEVICE_NODE
    DEVICE_NUMBER
    DISKS
    ENVIRONMENTAL_ENTRIES
    HOSTS
    INFORMATIONALS
    IOS or IO_SUBSYSTEMS
    MCHKS or MACHINE_CHECKS
    MEMORY
    NODES
    OS or OPERATING_SYSTEMS
    PWR or POWER
    SEQUENCE_NUMBERS
    SWI or SOFTWARE_INFORMATIONALS
    SYNC_COMMUNICATIONS
    TAPES
    TIMEOUTS
    UNKNOWN_ENTRIES
    UNSOLICITED_MSCP
    VMS_ENTRIES
    VOLUME_CHANGES

EXAMPLE:

    /INCLUDE=(DISK=RZ56,DISK=RA)
    /EXCLUDE=(TAPE=TA,CPU)

You can also use keywords without values, as shown in the following example:

    /INCLUDE=(disk, tape)

2.2 Thresholding

Thresholding is the process by which DECevent automatic analysis determines whether a device has logged a sufficient number of events to warrant analysis, notification, or both. DECevent performs thresholding by classifying events based on time of occurrence and relative severity, then comparing the resulting event counts to stored numbers called thresholds.

DECevent implements thresholding in a different manner from previous fault management tools. This implementation is discussed in the following sections, which explain event classes, event severity, event counting, and the comparison process.

2.2.1 Event Classes

DECevent allows the specification of thresholds within two categories called classes. These classes are DSE and CUSTOMER.
The DSE class contains thresholds that are preset by Digital. When a DSE threshold is crossed, analysis is initiated for the device that logged the events. If analysis generates a theory code, notification is also performed. DSE thresholds may only be modified by Digital Services personnel.

The CUSTOMER class contains thresholds that are not preset by Digital. When a CUSTOMER threshold is crossed, notification for the device that logged the events is initiated through the monitor mailing list and the customer external command procedures. Analysis is not initiated. CUSTOMER thresholds may be set to any value by the system manager.

2.2.2 Event Severity

Within each class, thresholds are further specified according to the relative severity of the event. Severity is classified as shown in the following table:

___________________________________________________________
Severity Level   Description
___________________________________________________________
HARD             Non-recoverable errors.
SOFT             Errors from which the device successfully recovered.
MEDIA            Events related to media such as disk media.
INFO             Events logged to record information such as timestamps and volume mounts/dismounts.
___________________________________________________________

2.2.3 Event Counting

The thresholding process requires that counts be maintained for events that are logged on each device. Event counts are classified by severity. When an event is logged for a device, DECevent stores the event in the analysis state database, then recalculates the event counts for the device.

If a threshold is crossed, events less than seven days old are used in automatic analysis. DECevent removes events older than seven days to prevent their use in automatic analysis. Events less than 24 hours old are counted for thresholding.
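
The counting and class rules described so far can be illustrated with a short Python model. This is only a sketch and is not DECevent code: the Event record, the function names, the device name, and the threshold values are all hypothetical, and treating a threshold as "crossed" when the count is strictly greater than the stored number is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

SEVERITIES = ("HARD", "SOFT", "MEDIA", "INFO")
ANALYSIS_WINDOW = timedelta(days=7)     # events kept for automatic analysis
THRESHOLD_WINDOW = timedelta(hours=24)  # events counted for thresholding


@dataclass(frozen=True)
class Event:
    """Hypothetical record for one logged event."""
    device: str
    severity: str
    logged_at: datetime


def prune(events, now):
    """Keep only events less than seven days old."""
    return [e for e in events if now - e.logged_at < ANALYSIS_WINDOW]


def threshold_counts(events, device, now):
    """Count a device's events from the last 24 hours, by severity."""
    counts = {severity: 0 for severity in SEVERITIES}
    for e in events:
        if e.device == device and now - e.logged_at < THRESHOLD_WINDOW:
            counts[e.severity] += 1
    return counts


def actions(counts, dse, customer):
    """Return the actions triggered by the current counts.

    Assumption: a threshold is crossed when count > threshold.
    """
    triggered = set()
    for severity, count in counts.items():
        if severity in dse and count > dse[severity]:
            triggered.add("analysis")      # DSE class starts analysis
        if severity in customer and count > customer[severity]:
            triggered.add("notification")  # CUSTOMER class only notifies
    return triggered


now = datetime(1995, 10, 27, 12, 0)
events = [
    Event("DUA0", "HARD", now - timedelta(hours=2)),
    Event("DUA0", "HARD", now - timedelta(hours=5)),
    Event("DUA0", "HARD", now - timedelta(days=3)),
    Event("DUA0", "SOFT", now - timedelta(days=8)),
]
events = prune(events, now)
counts = threshold_counts(events, "DUA0", now)
print(len(events))       # 3
print(counts["HARD"])    # 2
print(sorted(actions(counts, dse={"HARD": 1}, customer={"HARD": 5})))  # ['analysis']
```

In this model the three-day-old event stays available for analysis but does not contribute to the 24-hour count, and the eight-day-old event is removed entirely. Notification after a DSE crossing depends on whether analysis produces a theory code, which the sketch does not model.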

2.2.4 Comparison of Event Counts with Thresholds

Once the event counts have been recalculated, DECevent compares them with the thresholds in the following manner:

o  If the event count for any severity level exceeds the DSE threshold for that severity level, DECevent initiates analysis. If analysis generates a theory code, DECevent initiates notification.

o  If the event count for any severity level exceeds the CUSTOMER threshold for that severity level, DECevent initiates notification.

2.3 Autocopy

Previous fault management tools provided an autocopy feature that detected failing disk drives and initiated shadow copying to preserve data. This feature is not implemented in the current version of DECevent.

2.4 Cluster-Wide Support

Previous fault management tools supported a cluster environment. This feature is not implemented in the current version of DECevent. However, DECevent may be run separately on any or all nodes of a cluster.

2.5 Mixed Architecture Cluster Support

DECevent is not supported in mixed architecture cluster environments.

2.6 Volume Labels

The Error Report Formatter (ERF) maintained a list of devices and their volume labels as it encountered volume mount and dismount entries. When an entry for a tape or disk device was encountered, the matching volume labels were output in the report. This feature is not implemented in the current version of DECevent.

2.7 DECevent Translation of Multiple Input Files

When requested to translate entries from multiple input files, DECevent does not indicate which entries come from which input file. Entries are numbered as if they were all translated from the same input file.
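
The effect of this numbering can be seen in a small Python model. It is an illustration only, not DECevent code: the function name and the entry strings are hypothetical, and starting the numbering at 1 is an assumption.

```python
from itertools import chain


def number_entries(files, start=1):
    """Number entries from several input files as one continuous sequence.

    `files` is a list of lists, one inner list of entries per input file.
    """
    return list(enumerate(chain.from_iterable(files), start=start))


first_file = ["disk error", "tape error"]
second_file = ["machine check"]
for number, entry in number_entries([first_file, second_file]):
    print(number, entry)
# 1 disk error
# 2 tape error
# 3 machine check
```

Nothing in the numbered output shows that entry 3 came from the second file, which is the behavior this section describes.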

3
_________________________________________________________________

Known Restrictions

This section describes known restrictions with this release of DECevent.

3.1 File Specification Restriction

The current version of DECevent does not support file specifications that start with a node address. For example:

    GARCIA::disk1:[]lesh.sys

3.2 Wildcard Restriction

When a user attempts a translation using a wildcard in the file specification and the number or size of the matching file names exceeds the internal file buffer array, DECevent displays an error message indicating that the internal files array buffer has been exceeded.

3.3 DIRECTORY CANONICAL Command Restriction

The DIRECTORY CANONICAL command does not function in the current release of DECevent.

3.4 SHOW THRESHOLD Command Restriction

The SHOW THRESHOLD command truncates the second theory number when multiple theories are called out.

3.5 Manual Analysis

Manual analysis can only be run from one account at a time on a given system. Running manual analysis from multiple accounts, or running it simultaneously from the same account, may cause incorrect results to be reported.

3.6 Local Settings File

When a new version of DECevent is installed on a system where a previous version existed, the file SYS$LOGIN:FMG_LOCAL_PARAM_LIBRARY.KNL must be deleted from the login directory of each user who ran the previous version of DECevent.

3.7 Page File Quota Restriction

DECevent may fail with an access violation if the page file quota is exceeded. If this happens, the process terminates and you are returned to the system prompt. You may then reissue the command that failed.

3.8 JTquota Restriction

The JTquota shown with the SHOW FIELD command when using AUTHORIZE must be increased to 8192 for DECevent software to function correctly.
At the UAF> prompt following the SHOW FIELD command, change the JTquota by entering:

    UAF> MODIFY FIELD/JTQUOTA=8192

3.9 /LOG Qualifier Restriction

The /LOG qualifier controls the display of informational messages telling the user how many event entries were selected and rejected. When the /LOG qualifier is used and no output is specified, the informational messages may be embedded within the report.

3.10 /FSTERR Report Type Restriction

In the current DECevent release, the /FSTERR report type produces output for RA70, RA72, RA73, and RA9x series devices only.

3.11 SET THRESHOLD/NAME Restriction

The SET THRESHOLD/NAME command only works for devices that have previously logged an event while the DECevent startup command was active.

3.12 Evidence Support

This release of DECevent does not provide evidence information in the analysis/notification reports.

3.13 /BEFORE and /SINCE Parameter Restrictions

A selection in which the starting date (/SINCE) is later than the ending date (/BEFORE) is ignored. No error message is generated.

3.14 SHOW THRESHOLD Restriction

The SHOW THRESHOLD command displays an event count of 1 for every device that has logged errors. This occurs even when the devices have exceeded thresholds as high as 12 and notification has been performed. DECevent is handling the events properly and performing notification at the appropriate times, but is not displaying the correct counts.

3.15 RF74 Algorithm Incorrect

The conversion of an RF74 logical block number (LBN) to physical cylinder, head, and sector is incorrect. The RF74 is a "banded drive" and contains a variable number of sectors per track, depending on the band. The algorithm treats the drive as if it were a fixed sector disk. RF74 analysis may be affected by this.
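
The difference between the two conversions can be sketched in Python. This is a simplified illustration rather than the DECevent algorithm or the real RF74 geometry: the head count, the band table, and the function names are hypothetical, and the model assumes LBNs map linearly onto the surface with no spare or replacement blocks.

```python
from typing import NamedTuple, Sequence


class Band(NamedTuple):
    """One band of a banded drive (hypothetical geometry)."""
    cylinders: int
    sectors_per_track: int


def lbn_to_chs_fixed(lbn, heads, sectors_per_track):
    """Convert an LBN assuming every track holds the same number of sectors."""
    cylinder, rest = divmod(lbn, heads * sectors_per_track)
    head, sector = divmod(rest, sectors_per_track)
    return cylinder, head, sector


def lbn_to_chs_banded(lbn, heads, bands: Sequence[Band]):
    """Convert an LBN on a drive whose sectors per track vary by band."""
    if lbn < 0:
        raise ValueError("LBN must not be negative")
    first_cylinder = 0
    for band in bands:
        blocks_per_cylinder = heads * band.sectors_per_track
        band_blocks = band.cylinders * blocks_per_cylinder
        if lbn < band_blocks:
            cylinder, rest = divmod(lbn, blocks_per_cylinder)
            head, sector = divmod(rest, band.sectors_per_track)
            return first_cylinder + cylinder, head, sector
        # Not in this band: skip its blocks and cylinders.
        lbn -= band_blocks
        first_cylinder += band.cylinders
    raise ValueError("LBN is beyond the last band")


# Two heads; two cylinders with 4 sectors per track, then two with 3.
bands = [Band(cylinders=2, sectors_per_track=4),
         Band(cylinders=2, sectors_per_track=3)]
print(lbn_to_chs_fixed(23, heads=2, sectors_per_track=4))  # (2, 1, 3)
print(lbn_to_chs_banded(23, heads=2, bands=bands))         # (3, 0, 1)
```

With this made-up geometry, the fixed-sector calculation places LBN 23 on the wrong cylinder and head once the block falls in the second band, which is the kind of error this restriction describes.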

3.16 Distribution List Name Restrictions

When using the ADD USER command, the following words cannot be user names:

    CUSTOMER
    FSE
    MONITOR
    SICL

3.17 TA9x Translation Problem

Extended Sense Data packets from TA9x class devices cause DECevent to halt without producing a complete report. The specific cause is not yet known.

3.18 RF31T Support

RF31T events are translated as RF31 events even though RF31 events have a different format. RF31T is unsupported for analysis.

3.19 HSD10 Support

Bit-to-text support for the HSD10 was implemented using only the written HSD10 error specification. Actual HSD10 event logs were not available at the time. Please contact your Digital Multivendor Customer Services engineer if any inconsistencies are observed during the translation of HSD10 event logs.

3.20 Automatic Analysis Restriction

When performing automatic analysis, DECevent reads the system event mailbox, but does not check the system event log for entries that may have been made since the last system crash.

3.21 Command Restrictions with Automatic Analysis

Commands that modify analysis and notification knowledge files require prior shutdown of automatic analysis. Before using any of the commands listed in this section, DECevent automatic analysis must be stopped by entering:

    $ DIAGNOSE SHUTDOWN node-name

After the DIAGNOSE SHUTDOWN command has been entered, DECevent automatic analysis may be restarted by executing the command procedure SYS$STARTUP:DECEVENT$STARTUP.COM.

The following commands require prior shutdown of automatic analysis:

    ADD EXTERNAL          REMOVE USER
    ADD USER              REPAIR
    FLUSH                 SET PHONE_NUMBER
    IGNORE                SET SICL
    RECOGNIZE             SET SYSTEM_INFO
    REMOVE REPAIRED       SET THRESHOLD
    REMOVE SYSTEM_INFO

3.22 Set SICL Restriction

Use the command procedure DIA$MGR:DECEVENT$SICL_ENABLE.COM to enable SICL.

3.23 Missing Information from the Output of the SHOW SUMMARY Command

The output of the DECevent SHOW SUMMARY command does not specify whether a field service or a customer threshold was crossed.

4
_________________________________________________________________

Known Restrictions for the Graphical User Interface

This section describes known restrictions with this release of the DECevent graphical user interface (GUI).

4.1 Graphical User Interface User Guide for OpenVMS

The figures and examples shown in The DECevent Graphical User Interface User's Guide show Digital UNIX results. You must substitute OpenVMS commands and syntax where necessary when using the GUI on OpenVMS systems.

4.2 GUI Startup Command

Use the following command to start the Graphical User Interface for DECevent on OpenVMS systems:

    $ DIAGNOSE/INTERFACE=DECWINDOWS

4.3 Canceling GUI Displays

Currently, there is no cancel option implemented within the GUI to stop displays of large translations. If you wish to halt the display during long translations, issue the following command from another window or terminal, supplying your process identification (PID) number after the equal sign:

    $ STOP/ID=

4.4 Text Entries

All text field entries must be terminated with a .

4.5 Accessing KNL Directories

The GUI must be able to find and access the knowledge library (KNL) directories. If you are using the C Shell, use the following command so that the GUI can find and access the KNL directories:

    setenv DIA_LIBRARY /usr/sbin/DIA

If you are using the Bourne Shell, use the following commands:

    >DIA_LIBRARY=/usr/sbin/DIA
    >export DIA_LIBRARY

4.6 Brief Report Type Restriction

When the brief report type is selected within the GUI, the resulting translation shows a full report.

4.7 Multiple Input File Restriction

You cannot perform translation or analysis on multiple input files.
Only one input file can be translated or analyzed at a time.

4.8 sys info Text Length Restriction

The text fields in the sys info icon box are limited to 40 characters.

A
_________________________________________________________________

Supported Devices for Bit-To-Text Translation

This section contains lists of products and devices that DECevent bit-to-text translation supports.

________________________ Note ________________________

For any device not in this list, as much translation will be performed as possible. All remaining information in the event will be dumped in hex.

______________________________________________________

A.1 Fully Supported Devices for Bit-To-Text Translation

The following devices have full bit-to-text translation support.

CPUs

    AlphaServer 1000 Family
    AlphaServer 2000 Family
    AlphaServer 2100 Family
    AlphaServer 8200
    AlphaServer 8400
    AlphaStation 200 Family
    AlphaStation 400 Family
    DEC 2000 Model 300 - Partial support
    DEC 3000 Model 300 - Partial support
    DEC 3000 Model 400
    DEC 3000 Model 500
    DEC 3000 Model 700
    DEC 3000 Model 900
    DEC 4000 Family
    DEC 7000 Family
    DEC 10000 Family

SCSI Solid State Disks

    The entire EZnn family is supported.

SCSI RAID Controllers

    HSZ10
    HSZ15

SCSI Port Drivers

    N53C710
    N53C94

SCSI CDROMs

    The entire RRDnn family is supported.

SCSI Magneto-Optical Disks

    The RWnn and RWZnn families are supported.

SCSI Floppy Disks

    RX26

SCSI Hard Disks

    The RZnn family is supported.

SCSI Tapes

    The TK50Z, TKZnn family, TSZnn family, TZnnn family, and TZKnn family are supported.

HSC Devices

The following HSC devices are fully supported by DECevent. Their names will be displayed as controlling devices for appropriate disk and tape errors, but their own out-of-band errors are not yet supported by DECevent.
The out-of-band events will be dumped in hex.

    HSC40    HSC70
    HSC50    HSC90
    HSC60    HSC95
    HSC65

Other Adapters

    CIMNA
    DSYT1
    KFMSA
    KFMSB
    SHAC
    PAxxx
    PNxxx

HS Array Controllers

    HSJ30
    HSJ40
    HSD05
    HSD10
    HSD30
    HSD40
    HSZ40

Internal Device Controllers

    CIMNA
    CIXCD-AC
    DEFTA
    DEFZA

Other Support

    DDR
    DSR
    Host Based RAID 0 and 5

I/O Adapters

    DEMFA
    DEMNA
    KDM70
    KZMSA

MSCP DSA Disks

    RA60    RA80
    RA70    RA81
    RA71    RA82
    RA72    RA90
    RA73    RA92

MSCP DSSI Disks

    RF30    RF70
    RF31    RF71
    RF31F   RF72
    RF35    RF73
    RF74

MSCP DSA Tapes

    TA78
    TA79
    TA81
    TA90
    TA90E
    TA91

MSCP DSSI Tapes

    TF30
    TF70
    TF85
    TF857

A.2 Partial Device Support

The following devices will be recognized if an error log entry is found in the event file. The device name will be displayed in the report with as much generic information as can be translated from bit to text. Any remaining information will be displayed in hex.

Generic Devices

o  Generic DU (for example, MSCP disks not yet known by OpenVMS)
o  Generic TU (for example, MSCP tapes not yet known by OpenVMS)
o  NI SCA events
o  OpenVMS Volume Shadow events (Phase II)

CPUs

    DEC 2000 Model 200
    DEC 2000 Model 300
    DEC 3000 Model 300

MSCP Controllers

    KCM44              RQDX1
    KDA50-Q            RQDX3
    KDB50              RQDX4
    KFBTA              UDA50
    KZMSA for DSSI     UDA50-A
    KRQ50

MSCP DSA Solid State Disks

    ESE20
    ESE52
    ESE56
    ESE58

MSCP DSSI Disks

    RF36
    RFH31
    RFH35
    RFH36
    RFH72
    RFH73
    RFH74

MSCP DSSI Solid State Disks

    EF51
    EF52
    EF53
    EF54
    EF58

MSCP DSA Optical Disks

    RV20
    RV60

MSCP DSA Tapes

    TA85/TA857
    TA86/TA867
    TAD34
    TAD44
    TAD85
    TAD86
    TAD87

MSCP DSSI Tapes

    TF86
    TF867

Other TMSCP Tapes/Controllers

    TBK70
    TK50
    TK50-DEBNT
    TK70
    TQK50
    TUK50

Other Adapters

    DE422-SA
    DEFAA
    DEFTA
    DEFZA
    DEFEA
    DWTVA

B
_________________________________________________________________

Supported Devices for Analysis

This section contains a list of products and devices that the analysis and notification options of DECevent support.

B.1 Fully Supported Devices for Analysis

The following devices have full analysis and notification support.

CPUs

o  DEC 4000
o  DEC 7000
o  AlphaServer 2000 5/250 and 5/300
o  AlphaServer 2100 5/250 and 5/300
o  AlphaServer 2100-RM 5/250 and 5/300

MSCP DSA Disks

o  RA72
o  RA73
o  RA90
o  RA92

There are 15 different areas of analysis that can be done on RAxx devices. Analysis support for RAxx devices is implemented for the following areas:

 1. RA90/92 Special Analysis
 2. SDI Communication Errors Analysis
 3. Non-media Drive Detected Errors Analysis
 4. Media Drive Detected Errors Analysis
 5. Head Matrix Analysis
 6. Bad Surface Analysis
 7. Servo Failures Analysis
 8. Head Slap Test Analysis
 9. Random Read Path Analysis
10. Bad Head Analysis
11. Radial Scratch Analysis
12. Circumferential Scratch Analysis
13. Bad Spot Analysis
14. Possible Media Errors Analysis
15. Forced Error Analysis

I/O Adapters

o  KDM70

B.2 Devices with Partial Analysis Support by DECevent

The following devices have partial analysis support. For these devices, if a problem involves LED codes, DECevent will report "No Theory Found".

MSCP DSA Disks

o  RA60
o  RA70
o  RA71
o  RA80
o  RA81
o  RA82