检测check Exadata Image & OS versions , GI & DB patches
sundiag exacheck cellserv ==> imageinfo dbhost ==> /usr/local/bin/imagehistory Also check the version of the switch. Login to Switch and execute the following command [root@myswitch-1 sbin]# version [root@dmorlsw-ib2 sbin]# cd /usr/local/bin [root@dmorlsw-ib2 bin]# ls -lrt version -rwxr-xr-x 1 root root 20356 Apr 4 2011 version Output will look as below. [root@dmorlsw-ib2 ~]# version SUN DCS 36p version: 1.3.3-2 Build time: Apr 4 2011 11:15:19 SP board info: Manufacturing Date: 2009.05.05 Serial Number: "NCD3X0178" Hardware Revision: 0x0006 Firmware Revision: 0x0102 BIOS version: NOW1R112 BIOS date: 04/24/2009 ib8# cat /sys/class/infiniband/is4_0/fw_ver 7.2.300 ib8 # cat /sys/class/dmi/id/bios_version NOW1R112 ib8 # nm2version NM2-36p version: 1.0.1-1 Build time: Sep 14 2009 12:52:51 ComExpress info: Manufacturing Date: 2009.08.19 Serial Number: Hardware Revision: 0x0006 Firmware Revision: 0x0102 { case `uname` in Linux ) ILOM="/usr/bin/ipmitool sunoem cli" ;; SunOS ) ILOM="/opt/ipmitool/bin/ipmitool sunoem cli" ;; esac ; ImageInfo="/opt/oracle.cellos/imageinfo" ; uname -srm ; head -1 /etc/*release ; uptime | cut -d, -f1 ; $ILOM "show /SP system_description system_identifier" | grep = ; $ImageInfo -activated -node -status -ver | grep -v ^$ ; } | tee /tmp/ExaInfo.log $GRID_HOME/OPatch/opatch lsinv -all -oh $GRID_HOME | tee /tmp/OPatchInv.log $ORACLE_HOME/OPatch/opatch lsinv -all | tee -a /tmp/OPatchInv.log cat /tmp/ExaInfo.log Linux 2.6.18-128.1.16.0.1.el5 x86_64 ==> /etc/enterprise-release <== Enterprise Linux Enterprise Linux Server release 5.3 (Carthage) ==> /etc/redhat-release <== Enterprise Linux Enterprise Linux Server release 5.3 (Carthage) 20:37:56 up 458 days system_description = SUN FIRE X4170 SERVER, ILOM v3.0.6.10.b, r52264 system_identifier = Sun Oracle Database Machine Active image version: 11.2.1.2.3 Active image activated: XXXX-XX-XX 12:27:12 +0800 Active image status: success Active node type: COMPUTE Inactive image version: undefined FileName: OPatchInv.log ---------------- ... Oracle Home : /u01/app/11.2.0/grid Central Inventory : /u01/app/oraInventory from : /etc/oraInst.loc OPatch version : 11.2.0.1.2 OUI version : 11.2.0.1.0 OUI location : /u01/app/11.2.0/grid/oui ... -------------------------------------------------------------------------------- List of Oracle Homes: Name Location Ora11g_gridinfrahome1 /u01/app/11.2.0/grid OraDb11g_home1 /u01/app/oracle/product/11.2.0/dbhome_1 -------------------------------------------------------------------------------- Installed Top-level Products (1): Oracle Grid Infrastructure 11.2.0.1.0 ... Interim patches (2) : Patch 9524394 : applied on Thu Jun 03 20:46:05 CST 2010 ... {TRACKING BUG FOR 11.2.0.1 DB MACHINE BUNDLE PATCH 3} Patch 9455587 : applied on Fri Apr 02 18:27:47 CST 2010 ... {MERGE REQUEST ON TOP OF 11.2.0.1.0 FOR BUGS 8483425 8667622 8702731 8730804} Rac system comprising of multiple nodes Local node = dbserv01 Remote node = dbserv02 Remote node = dbserv03 Remote node = dbserv04 -------------------------------------------------------------------------------- OPatch succeeded. ... Oracle Home : /u01/app/oracle/product/11.2.0/dbhome_1 ... Oracle Database 11g 11.2.0.1.0 ... Interim patches (5) : Patch 8888434 : applied on Sat Jan 08 00:27:33 CST 2011 ... {AIX-ASM-CF: LMHB TERMINATE INSTANCE WHEN OFFLINE ONE FAILGROUP IN ASM DG} Patch 8730312 : applied on Thu Jun 03 21:30:03 CST 2010 ... {FWD MERGE FOR BASE BUG 8715387 FOR 12G} Patch 9502717 : applied on Thu Jun 03 21:25:54 CST 2010 ... {LMS HIT ORA-600 [KJBLDRMNEXTPKEY:SEEN] AND CRASHED THE INSTANCE} { + same 2 as GI above}
检测 cell server Cache Policy
cell08# MegaCli64 -LDInfo -Lall -aALL | grep 'Current Cache Policy' Current Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU cell09# MegaCli64 -LDInfo -Lall -aALL | grep 'Current Cache Policy' Current Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU Default Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU Current Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU Cache policy is in WB Would recommend proactive battery repalcement. Example : a. /opt/MegaRAID/MegaCli/MegaCli64 -LDGetProp -Cache -LALL -aALL ####( Will list the cache policy) b. /opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp -WB -LALL -aALL ####( Will try to change teh policy from xx to WB) So policy Change to WB will not come into effect immediately Set Write Policy to WriteBack on Adapter 0, VD 0 (target id: 0) success Battery capacity is below the threshold value
检测cell BBU备用电池状态:
cell08# /opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -GetBbuStatus -a0 BBU status for Adapter: 0 BatteryType: iBBU Voltage: 4061 mV Current: 0 mA Temperature: 36 C BBU Firmware Status: Charging Status : None Voltage : OK Temperature : OK Learn Cycle Requested : No Learn Cycle Active : No Learn Cycle Status : OK Learn Cycle Timeout : No I2c Errors Detected : No Battery Pack Missing : No Battery Replacement required : No Remaining Capacity Low : Yes Periodic Learn Required : No Battery state: GasGuageStatus: Fully Discharged : No Fully Charged : Yes Discharging : Yes Initialized : Yes Remaining Time Alarm : No Remaining Capacity Alarm: No Discharge Terminated : No Over Temperature : No Charging Terminated : No Over Charged : No Relative State of Charge: 99 % Charger System State: 49168 Charger System Ctrl: 0 Charging current: 0 mA Absolute state of charge: 21 % Max Error: 2 % Exit Code: 0x00
批量检测BBU 信息:
dcli -g ~/cell_group -l root -t '{ uname -srm ; head -1 /etc/*release ; uptime | cut -d, -f1 ; imagehistory ; ipmitool sunoem cli "show /SP system_description system_identifier" | grep = ; ipmitool sunoem cli "show /SP/policy FLASH_ACCELERATOR_CARD_INSTALLED /opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -GetBbuStatus -a0 | egrep -i 'BBU|Battery|Charge:|Fully|Low|Learn' ; }' | tee /tmp/ExaInfo.log Target cells: ['cellserv01', 'cellserv02', 'cellserv03', 'cellserv04', 'cellserv05', 'cellserv06', 'cellserv07'] cellserv01: Linux 2.6.18-128.1.16.0.1.el5 x86_64 cellserv01: ==> /etc/enterprise-release <== cellserv01: Enterprise Linux Enterprise Linux Server release 5.3 (Carthage) cellserv01: cellserv01: ==> /etc/redhat-release <== cellserv01: Enterprise Linux Enterprise Linux Server release 5.3 (Carthage) cellserv01: 01:17:39 up 635 days cellserv01: Version : 11.2.1.2.1 cellserv01: Image activation date : 2011-03-25 11:59:34 -0800 cellserv01: Imaging mode : fresh cellserv01: Imaging status : success cellserv01: cellserv01: Version : 11.2.1.2.3 cellserv01: Image activation date : 2011-04-13 12:15:46 +0800 cellserv01: Imaging mode : patch cellserv01: Imaging status : success cellserv01: cellserv01: Version : 11.2.1.2.6 cellserv01: Image activation date : 2011-05-27 23:08:22 +0800 cellserv01: Imaging mode : patch cellserv01: Imaging status : success cellserv01: cellserv01: system_description = SUN FIRE X4275 SERVER, ILOM v3.0.6.10.b, r52264 cellserv01: system_identifier = Sun Oracle Database Machine cellserv01: Connected. Use ^D to exit. cellserv01: -> show /SP/policy FLASH_ACCELERATOR_CARD_INSTALLED cellserv01: show: No matching properties found. cellserv01: cellserv01: -> Session closed cellserv01: Disconnected cellserv01: BBU status for Adapter: 0 cellserv01: BatteryType: iBBU cellserv01: BBU Firmware Status: cellserv01: Learn Cycle Requested : No cellserv01: Learn Cycle Active : No cellserv01: Learn Cycle Status : OK cellserv01: Learn Cycle Timeout : No cellserv01: Battery Pack Missing : No cellserv01: Battery Replacement required : No cellserv01: Remaining Capacity Low : Yes cellserv01: Periodic Learn Required : No cellserv01: Battery state: cellserv01: Fully Discharged : No cellserv01: Fully Charged : Yes cellserv01: Relative State of Charge: 99 % cellserv01: Absolute state of charge: 21 % dcli -l root -g /root/all_group '/opt/MegaRAID/MegAaCli/MegaCli64 -AdpBbuCmd -a0' > BBU.out
check ipmi:
dcli -g ~/cell_group -l root -t '{ > ipmitool sunoem cli "show /SP/policy FLASH_ACCELERATOR_CARD_INSTALLED" | grep = ; MegaCli64 -LDInfo -Lall -aALL | grep 'Current Cache Policy' ; }' | tee /tmp/ExaCells.log
Comment