Wednesday 15 January 2014

Grid and NFS : [FATAL] [INS-41321] Invalid Oracle Cluster Registry (OCR) location.

 Grid is very picky and somewhat uninformative about its NFS support

You need to trace the installer to find out what exactly it doesn’t like about your configuration.

Running the installer normally, the error message is:
[FATAL] [INS-41321] Invalid Oracle Cluster Registry (OCR) location.
CAUSE: The installer detects that the storage type of the location (/cmsstgdb/crs/ocr/ocr1) is not supported for Oracle Cluster Registry.
ACTION: Provide a supported storage location for the Oracle Cluster Registry.

OK, so Oracle says the storage is not supported, but I know that NetApp NFS is supported just fine. This means I used the wrong parameters for the NFS mounts. But when I check my fstab and /etc/mtab, everything looks A-OK. Can Oracle tell me what exactly bothers it?

It can. Run the silent install with the following flags added to the command line:
-J-DTRACING.ENABLED=true -J-DTRACING.LEVEL=2

Then you will see the following lines that explain why Oracle does not like your storage:

[main] [ 2011-01-04 23:43:55.184 GMT+00:00 ] [TaskSharedStorageAccess.reportStorageExceptions:754] Adding exception for node [node01]:

[main] [ 2011-01-04 23:43:55.184 GMT+00:00 ] [TaskSharedStorageAccess.reportStorageExceptions:755] Exception message: Mount options did not meet the requirements [Expected = “rw,hard,rsize>=32768,wsize>=32768,proto=tcp|tcp,vers=3|nfsvers=3|nfsv3|v3,timeo>=600,acregmin=0&acregmax=0&acdirmin=0&acdirmax=0|actimeo=0” ; Found = “rw,vers=3,rsize=32768,wsize=32768,acregmin=0,acregmax=0,acdirmin=0,acdirmax=0,hard,proto=tcp,timeo=300,retrans=2,sec=sys,addr=fas01b”]

This way it was much easier to see that I had timeo=300 while Oracle wanted timeo>=600.
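Spotting the one failing option by eye in a line that long is error-prone, so here is a minimal Python sketch that compares the two strings from the trace. The grammar it assumes is my own reading of the message, not documented Oracle behaviour: commas separate requirements, "|" separates acceptable alternatives, "&" joins terms that must all hold, and ">=" is a numeric minimum. The function names are hypothetical.

```python
# Strings copied from the trace above (whitespace cleaned up).
EXPECTED = ("rw,hard,rsize>=32768,wsize>=32768,proto=tcp|tcp,"
            "vers=3|nfsvers=3|nfsv3|v3,timeo>=600,"
            "acregmin=0&acregmax=0&acdirmin=0&acdirmax=0|actimeo=0")
FOUND = ("rw,vers=3,rsize=32768,wsize=32768,acregmin=0,acregmax=0,"
         "acdirmin=0,acdirmax=0,hard,proto=tcp,timeo=300,retrans=2,"
         "sec=sys,addr=fas01b")


def parse_found(found):
    """Turn 'rw,timeo=300' into {'rw': None, 'timeo': '300'}."""
    opts = {}
    for item in found.split(","):
        item = item.strip()
        if not item:
            continue
        key, sep, value = item.partition("=")
        opts[key] = value if sep else None
    return opts


def term_ok(term, opts):
    """Check one term: 'key>=N', 'key=value', or a bare flag."""
    term = term.strip()
    if ">=" in term:  # test '>=' first, since it also contains '='
        key, _, minimum = term.partition(">=")
        value = opts.get(key)
        return value is not None and value.isdigit() and int(value) >= int(minimum)
    if "=" in term:
        key, _, wanted = term.partition("=")
        return opts.get(key) == wanted
    return term in opts


def unmet_requirements(expected, found):
    """Return the requirements for which no alternative is satisfied."""
    opts = parse_found(found)
    unmet = []
    for requirement in expected.split(","):
        alternatives = requirement.split("|")
        if not any(all(term_ok(t, opts) for t in alt.split("&"))
                   for alt in alternatives):
            unmet.append(requirement.strip())
    return unmet


print(unmet_requirements(EXPECTED, FOUND))  # prints ['timeo>=600']
```

With the strings from this install, the only requirement left unmet is the timeo one, which matches what I found by reading the line.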

 Your NFS configuration is not what you think it is.

If /etc/fstab says “timeo=600” and running “mount” shows that the volume is mounted with “timeo=600”, why does Oracle think that the volume is mounted with “timeo=300”?

Turns out that the right place to look for your real NFS configuration is “/proc/mounts”. The man page for “mount” says:

It is possible that files /etc/mtab and /proc/mounts don’t match. The first file is based only on the mount command options, but the content of the second file also depends on the kernel and others settings (e.g. remote NFS server. In particular case the mount command may reports unreliable information about a NFS mount point and the /proc/mounts file usually contains more reliable information.)

Aha! So /proc/mounts shows that timeo=300, which causes installation to fail, and the man page says that this could be caused by remote NFS server settings. Perfect. The problem was packaged and sent to the customer’s sysadmin, and was solved by the next morning.
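To run the same check before starting the installer, something like the following Python sketch could read the options the kernel actually reports. It assumes the usual whitespace-separated /proc/mounts layout (device, mount point, filesystem type, options, then two numeric fields). The function names, the sample export path and the sample mount point are hypothetical; the 600 threshold is the timeo>=600 from the trace.

```python
def nfs_mount_options(mounts_text):
    """Map each NFS mount point to a dict of its mount options."""
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mount_point, fs_type, options = fields[:4]
        if fs_type not in ("nfs", "nfs4"):
            continue
        opts = {}
        for item in options.split(","):
            key, sep, value = item.partition("=")
            opts[key] = value if sep else None
        result[mount_point] = opts
    return result


def low_timeo(mounts_text, minimum=600):
    """Return {mount_point: timeo} for NFS mounts whose timeo is below minimum."""
    low = {}
    for mount_point, opts in nfs_mount_options(mounts_text).items():
        timeo = opts.get("timeo") or ""
        if timeo.isdigit() and int(timeo) < minimum:
            low[mount_point] = int(timeo)
    return low


# Hypothetical sample in /proc/mounts format; the paths are made up.
SAMPLE = (
    "/dev/sda1 / ext3 rw,relatime 0 0\n"
    "fas01b:/vol/example /mnt/example nfs "
    "rw,vers=3,rsize=32768,wsize=32768,hard,proto=tcp,timeo=300,"
    "retrans=2,sec=sys,addr=fas01b 0 0\n"
)

print(low_timeo(SAMPLE))  # prints {'/mnt/example': 300}

# On the real host, feed it the kernel's view instead of the sample:
#   with open("/proc/mounts") as f:
#       print(low_timeo(f.read()))
```

Reading /proc/mounts rather than the output of “mount” is the whole point here, since the two can disagree exactly as the man page warns.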