Exadata Upgrade Procedure


It is advised to use the "Quarterly Full Stack Download Patch" (QFSDP) listed in the MOS article below.
It is better if both your running version and your target version appear in the QFSDP list.

 Exadata Database Machine and Exadata Storage Server Supported Versions ( Doc ID 888828.1 )


* Download the "compute nodes image", the "cell nodes image" and the "latest dbnodeupdate.sh script"
 ( The cell nodes image also includes the IB switch firmware. )

-----------------------------------

* RUN PRECHECKS BELOW

 -- download and run the latest EXACHK, then check the exachk report
 
 -- DB NODE PRECHECK (run on each db node)
	 --- ./dbnodeupdate.sh -u -v -l DB_NODE_IMAGE.zip
	 --- check all alert logs
		dcli -g dbs_group -l root "dbmcli -e list alerthistory"

		there must not be any hardware errors or critical logical errors
		if no critical issue is found, clear all alerts

		dcli -g dbs_group -l root "dbmcli -e drop alerthistory all"
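
		The alert review above can be partly scripted. The sketch below is only a rough aid and not part of the official procedure. It assumes that each alert is printed on a single line and that the line contains its severity as a word (for example "critical"). The helper name critical_alerts is hypothetical.

```shell
# Hypothetical helper: keep only the alert lines that mention "critical".
# Assumption: one alert per line on stdin, with the severity printed as a word.
critical_alerts() {
    # -i: ignore case, -w: match "critical" as a whole word.
    # "|| true" keeps the exit status at 0 when nothing matches.
    grep -iw 'critical' || true
}

# Usage on a live system (not executed here):
#   dcli -g dbs_group -l root "dbmcli -e list alerthistory" | critical_alerts
```

		Empty output suggests that no critical alert is listed. It does not replace reading the full alert history before dropping it.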

 -- CELL NODES PRECHECK
         --- check ssh equivalence

	dcli -g cell_group -l root 'hostname -i'
	dcli -l root -g cell_group "/opt/oracle.cellos/ipconf -verify"

        --- reset cell and db node ilom

	dcli -g cell_group -l root "/usr/bin/ipmitool sunoem cli 'reset -script /SP'"
	./patchmgr -cells cell_group -reset_force
	./patchmgr -cells cell_group -cleanup

        --- run upgrade precheck

	./patchmgr -cells cell_group -patch_check_prereq
	./patchmgr -cells cell_group -cleanup

	 --- check all alert logs
		dcli -g cell_group -l root "cellcli -e list alerthistory"

		there must not be any hardware errors or critical logical errors
		if no critical issue is found, clear all alerts

		dcli -g cell_group -l root "cellcli -e drop alerthistory all"

 -- IB SWITCH PRECHECK

	./patchmgr -ibswitches ibs_group -upgrade -ibswitch_precheck

-----------------------------------

* BACKUP IB SWITCHES

  Follow the document below.
	
   	How To Back Up and Restore Switch Settings for Sun Datacenter InfiniBand Switch 36 & Gateway Switch (Doc ID 1341944.1)
 
  Also back up the files below.

	scp switch:/etc/sysconfig/network-scripts/ifcfg-eth0 .
	scp switch:/etc/resolv.conf .
	scp switch:/etc/ntp.conf .
	scp switch:/etc/localtime .
	scp switch:/etc/opensm/opensm.conf .
	scp switch:/etc/sysconfig/network .
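
  With more than one switch, copying every file into the current directory would overwrite the copies taken from the previous switch. The sketch below is one possible way to avoid that, assuming a separate backup directory per switch. The helper name switch_backup_cmds, the variable SWITCH_FILES and the example names ibsw01 and /root/ib_backup are hypothetical. It only prints the commands, so they can be reviewed before anything is run.

```shell
# File list taken from the scp commands above.
SWITCH_FILES="/etc/sysconfig/network-scripts/ifcfg-eth0 /etc/resolv.conf /etc/ntp.conf /etc/localtime /etc/opensm/opensm.conf /etc/sysconfig/network"

# Hypothetical helper: print (dry run) the backup commands for one switch.
#   $1 = switch host name, $2 = local base directory for backups (assumption)
switch_backup_cmds() {
    switch=$1
    dest=$2/$1
    echo "mkdir -p $dest"
    for f in $SWITCH_FILES; do
        echo "scp $switch:$f $dest/"
    done
}

# Example (review the output first, then pipe it to sh):
#   switch_backup_cmds ibsw01 /root/ib_backup
```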

-----------------------------------

NON-ROLLING UPGRADE

 We always prefer non-rolling upgrades (we do not use online rolling upgrades).
 A non-rolling upgrade is disruptive, but you generally have a Data Guard standby system: you can switch over in about 10 minutes and then have a generous maintenance window.

  

 STEP-1 STOP ALL DB SERVICES

	dcli -g dbs_group -l root "/u01/app/1*/grid/bin/crsctl stop crs"
	dcli -g dbs_group -l root "/u01/app/1*/grid/bin/crsctl disable crs"

 STEP-2 STOP ALL CELL SERVICES

	dcli -g cell_group -l root "service celld stop"

 STEP-3 REBOOT ALL DB & CELL SYSTEMS TO MAKE SURE THAT THEY REBOOT SUCCESSFULLY AND THAT NO NEW HW ERROR APPEARS.

 STEP-4 CELL SERVERS UPGRADE
	
	nohup ./patchmgr -cells cell_group -patch &
	tail -f nohup.out

	VERIFY WITH THE COMMAND BELOW. YOU MUST SEE THE CORRECT VERSION WITH STATUS=SUCCESS
		dcli -g cell_group -l root "imageinfo;imagehistory"
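
	Reading the imageinfo output of many cells by eye is error-prone, so the check can be sketched as a small filter. This is an illustration under assumptions, not an Oracle tool: it assumes dcli prefixes every line with "<host>: " and that imageinfo on the cells prints lines labelled "Active image version:" and "Active image status:". Compute nodes may label these lines differently, in which case the patterns need adjusting. The helper name check_images is hypothetical.

```shell
# Hypothetical helper: read dcli/imageinfo output on stdin and report every
# host whose active image is not at the wanted version with status "success".
#   $1 = expected image version
# Returns 0 only if at least one host was seen and all hosts match.
check_images() {
    awk -v want="$1" '
        # Assumed line format: "<host>: Active image version: <version>"
        /: Active image version: / { h = $1; sub(/:$/, "", h); ver[h] = $NF; seen[h] = 1 }
        /: Active image status: /  { h = $1; sub(/:$/, "", h); st[h]  = $NF; seen[h] = 1 }
        END {
            bad = 0; n = 0
            for (h in seen) {
                n++
                if (ver[h] != want || st[h] != "success") {
                    printf "%s: version=%s status=%s\n", h, ver[h], st[h]
                    bad = 1
                }
            }
            if (n == 0) { print "no image information found"; bad = 1 }
            exit bad
        }'
}

# Usage on a live system (not executed here), with the target version as argument:
#   dcli -g cell_group -l root "imageinfo" | check_images <expected_version>
```

	No output and exit status 0 means every listed cell reports the expected active version with status success.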

 STEP-5 IB SWITCH UPGRADE
	nohup ./patchmgr -ibswitches ib_list -upgrade &
	tail -f nohup.out

	VERIFY BY LOGGING IN TO THE SWITCHES AND RUNNING THE COMMAND "version". THE CORRECT VERSION WITH STATUS=SUCCESS MUST BE SHOWN.

 STEP-6 SINGLE DB NODE UPDATE

	First, run the precheck again:

	./dbnodeupdate.sh -u -v -l DB_NODE_IMAGE.zip -M

	-M removes only the custom rpms that block the precheck.
	-R removes all custom rpms. It may be needed if the OS version changes with the update.

	After a successful precheck,

	./dbnodeupdate.sh -u -l DB_NODE_IMAGE.zip

	CHECK THE LOGS UNDER /var/log/cellos CAREFULLY.
	A SUCCESSFUL UPGRADE WILL REBOOT THE OS.
	AFTER THE REBOOT YOU MUST RUN THE POST-INSTALLATION STEP.

	CHECK THAT THIS SUCCEEDS => /opt/oracle.cellos/CheckHWnFWProfile

	./dbnodeupdate.sh -c

	AFTER THE POST-INSTALLATION STEP HAS FINISHED, THE "imageinfo" COMMAND MUST SHOW THE CORRECT VERSION WITH STATUS=SUCCESS

 STEP-7 IF THE SINGLE DB NODE UPDATE SUCCEEDED, YOU CAN RUN THE DB NODE UPDATES IN PARALLEL ON ALL REMAINING COMPUTE NODES.