Wednesday 10 October 2012

CRS is not coming up: hitting a bug

RAC on Windows Server 2003: the CRS service is not coming up. On a 2-node RAC, one node is down.

When we try to start the CRS service on node 2 (the node that is down), both nodes go down with a blue screen error (OS error).

Version: 10.2.0.1

When I checked the OCR logs, this is what I found:

2012-10-09 21:54:46.058: [ CRSMAIN][632]32Initializing OCR
2012-10-09 21:54:46.073: [ OCROSD][632]utgdv:11:could not read reg value ocrmirrorconfig_loc os error= The system could not find the environment option that was entered.

2012-10-09 21:54:46.073: [ OCROSD][632]utgdv:11:could not read reg value ocrmirrorconfig_loc os error= The system could not find the environment option that was entered.
2012-10-09 21:54:46.073: [ OCRRAW][632]proprioo: for disk 0 (E:\cdata\crs\data.ocr), id match (1), my id set (1830547516,1028247821) total id sets (1), 1st set (1830547516,1028247821), 2nd set (0,0) my votes (2), total votes (2)
2012-10-09 21:54:46.089: [ OCRMAS][2872]th_master:12: I AM THE NEW OCR MASTER at incar 1. Node Number = 1
2012-10-09 21:54:46.089: [ OCROSD][2872]utgdv:11:could not read reg value ocrmirrorconfig_loc os error= The system could not find the environment option that was entered.
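
When a log like this gets long, it helps to group the repeated messages by component. Below is a minimal sketch in Python, assuming every line has the shape "timestamp: [ COMPONENT][thread id]message" as in the excerpt above. The names parse_line and count_messages are my own helpers for illustration, not part of any Oracle tool.

```python
import re
from collections import Counter

# Assumed line shape, inferred only from the excerpt above:
#   "<date> <time>: [ <COMPONENT>][<thread id>]<message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}): "
    r"\[\s*(?P<component>\w+)\]\[(?P<tid>\d+)\](?P<msg>.*)$"
)


def parse_line(line):
    """Split one log line into ts, component, tid and msg; None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def count_messages(lines):
    """Count how often each (component, message) pair appears."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record:
            counts[(record["component"], record["msg"].strip())] += 1
    return counts


# Message text copied from the log excerpt above.
MIRROR_MSG = (
    "utgdv:11:could not read reg value ocrmirrorconfig_loc os error= "
    "The system could not find the environment option that was entered."
)

SAMPLE = [
    "2012-10-09 21:54:46.058: [ CRSMAIN][632]32Initializing OCR",
    "2012-10-09 21:54:46.073: [ OCROSD][632]" + MIRROR_MSG,
    "2012-10-09 21:54:46.089: [ OCROSD][2872]" + MIRROR_MSG,
]

if __name__ == "__main__":
    for (component, message), n in count_messages(SAMPLE).most_common():
        print(n, component, message)
```

Run against the real log file (read it line by line and pass the lines to count_messages) to see which messages dominate before digging into any single one.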

I suspected that CRS was not able to access the OCR location (ocr.loc), so I went to the OCR file on both nodes and checked whether it was accessible.

To find the location you can use the ocrcheck command,

or on Windows ---> Run ---> regedit ---> HKEY_LOCAL_MACHINE\SOFTWARE\Oracle\OCR

In my environment the OCR location was E:\cdata\crs\data.ocr, and it was accessible.
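
The same accessibility check can be scripted so it is easy to repeat on each node. This is only a rough sketch: check_readable is a hypothetical helper of mine, and it only proves that the operating system lets the current user open the file and read a few bytes, nothing about whether the OCR contents are healthy (ocrcheck is the tool for that).

```python
import os


def check_readable(path, nbytes=4096):
    """Try to open path and read its first bytes; return (ok, detail)."""
    if not os.path.exists(path):
        return False, "path does not exist"
    try:
        with open(path, "rb") as handle:
            handle.read(nbytes)
    except OSError as exc:
        return False, "cannot read: %s" % exc
    return True, "readable"


if __name__ == "__main__":
    # OCR location from my environment; change it to match yours.
    ocr_path = r"E:\cdata\crs\data.ocr"
    print(ocr_path, check_readable(ocr_path))
```

Run it on both nodes with the path reported by ocrcheck or shown in the registry.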

To check the voting disk location you can use:     crsctl query css votedisk

Checked: all IPs were pinging, so there was no network-related issue.

When checking the CSS logs, we found that we were hitting a 10.2.0.1 bug.

Log for reference
StartCMMon(): clssnmNMDetach failed - 2 10.2.0.1

Bug 4714940: CLSSGMSTARTNMMON: TIMED OUT WAITING ON NESTED NM RECONFIG. SELF-SACRIFICING
-> duplicate of
Bug 4682525: STARTCMMON(): CLSSNMNMDETACH FAILED, IN CSS
-> duplicate of
Bug 4682514: CLSSGMSLAVECMSYNC: RECONFIG TIMEOUT, IN CSS
-> fixed in 10.2.0.2
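
The duplicate chain above can be followed mechanically, which is handy if you keep such notes for several bugs. Here is a small sketch; the bug numbers and the patchset come from the list above, while the SIGNATURES table (lower-cased fragments of the bug titles) and the function names are my own assumptions for illustration, not anything Oracle ships.

```python
# Duplicate relationships and fix version, copied from the bug list above.
DUPLICATE_OF = {"4714940": "4682525", "4682525": "4682514"}
FIXED_IN = {"4682514": "10.2.0.2"}

# Hypothetical signature table: lower-cased fragments of the bug titles.
SIGNATURES = {
    "clssgmstartnmmon: timed out waiting on nested nm reconfig": "4714940",
    "startcmmon(): clssnmnmdetach failed": "4682525",
    "clssgmslavecmsync: reconfig timeout": "4682514",
}


def base_bug(bug):
    """Follow 'duplicate of' links until a bug that is not a duplicate is reached."""
    seen = set()
    while bug in DUPLICATE_OF and bug not in seen:
        seen.add(bug)  # guard against an accidental loop in the table
        bug = DUPLICATE_OF[bug]
    return bug


def match_log_line(line):
    """Return (matched bug, base bug, fixed-in version) or None if nothing matches."""
    text = line.lower()
    for fragment, bug in SIGNATURES.items():
        if fragment in text:
            root = base_bug(bug)
            return bug, root, FIXED_IN.get(root)
    return None


if __name__ == "__main__":
    print(match_log_line("StartCMMon(): clssnmNMDetach failed - 2"))
```

A substring match like this is only a first hint. The bug note itself still has to be read to confirm that the symptoms really match.
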
To fix this bug the minimum required patchset is 10.2.0.2; we decided to go for 10.2.0.5, which is the latest.
Patch # 4547817 10.2.0.2 PATCH SET FOR ORACLE DATABASE SERVER
Patch # 7213942 ORACLE 10.2.0.2 PATCH 18 BUG FOR WINDOWS-64 AMD64 AND INTEL EM64T XP AND 2003 (latest minipatch over 10202)

Patch # 8202632 10.2.0.5.0 PATCH SET FOR ORACLE DATABASE SERVER
Patch # 14408636 ORACLE 10G 10.2.0.5 PATCH 18 BUG FOR WINDOWS (64-BIT AMD64 AND INTEL EM64) (latest minipatch over 10205)
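
If you look after several homes, a quick check of the installed version against the minimum patchset for this bug (10.2.0.2, as noted above) can be sketched like this. The helper names are mine, and the version string is assumed to be plain dotted numbers that you supply yourself.

```python
def parse_version(text):
    """Turn a dotted version such as '10.2.0.1' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# Minimum patchset that contains the fix, per the bug list above.
MIN_PATCHSET = parse_version("10.2.0.2")


def needs_patchset(installed, minimum=MIN_PATCHSET):
    """True when the installed version is older than the minimum patchset."""
    return parse_version(installed) < minimum


if __name__ == "__main__":
    for version in ("10.2.0.1", "10.2.0.2", "10.2.0.5"):
        print(version, "needs patchset:", needs_patchset(version))
```

Comparing tuples of integers avoids the trap of comparing version strings character by character.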

Thanks.....