ORACLE 11gR2 Redundant Interconnect and ora.cluster_interconnect.haip: Oracle uses the 169.254 range for the private network

11gR2 Grid Infrastructure Redundant Interconnect and ora.cluster_interconnect.haip

======================================================

$ $GRID_HOME/bin/oifcfg getif
eth1 10.1.0.128   global  cluster_interconnect
eth3 10.1.0.0  global  public
eth6 10.11.0.128  global  cluster_interconnect
eth7 10.12.0.128  global  cluster_interconnect

$ $GRID_HOME/bin/oifcfg iflist -p -n
eth1  10.1.0.128  PRIVATE  255.255.255.128
eth1  169.254.0.0  UNKNOWN  255.255.192.0
eth1  169.254.192.0  UNKNOWN  255.255.192.0
eth3  10.1.0.0  PRIVATE  255.255.255.128
eth6  10.11.0.128  PRIVATE  255.255.255.128
eth6  169.254.64.0  UNKNOWN  255.255.192.0
eth7  10.12.0.128  PRIVATE  255.255.255.128
eth7  169.254.128.0  UNKNOWN  255.255.192.0
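The 255.255.192.0 masks in the listing above are not arbitrary: with four HAIPs, the 169.254.0.0/16 link-local range is divided into four /18 quarters, one per HAIP. A minimal sketch of that split, assuming the four-HAIP case:

```shell
# With four HAIPs, 169.254.0.0/16 is split into four /18 subnets
# (netmask 255.255.192.0); each HAIP address comes from one quarter.
for i in 0 1 2 3; do
  echo "169.254.$((i * 64)).0/18"
done
```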

======================================================

Applies to:
Oracle Server – Enterprise Edition – Version: 11.2.0.2 and later

The information in this document applies to any platform.

Purpose:
This document explains what ora.cluster_interconnect.haip is in 11gR2 Grid Infrastructure.

Scope and Application:
This document is intended for RAC database administrators and Oracle support engineers.

11gR2 Grid Infrastructure Redundant Interconnect and ora.cluster_interconnect.haip

Starting with release 11.2.0.2, Grid Infrastructure supports a redundant interconnect without relying on any third-party IP failover technology (bonding, IPMP, or similar). Multiple private network adapters can be defined either during installation or afterwards with oifcfg. In 11.2.0.2, the Oracle Database, CSS, OCR, CRS, CTSS, and EVM components use this feature automatically.

If more network adapters are specified, Grid Infrastructure can activate a maximum of four private network adapters at a time. ora.cluster_interconnect.haip will start one to four link-local HAIP addresses on the private network adapters, which Oracle RAC, Oracle ASM, Oracle ACFS, and so on use for interconnect communication. Note that the HAIP feature is disabled in 11.2.0.2 when Sun Cluster is present.

Grid automatically picks link-local addresses from the reserved 169.254.*.* subnet for HAIP, and it will not attempt to use any 169.254.*.* address that is already in use for another purpose. With HAIP, by default, interconnect traffic is load-balanced across all active interconnect interfaces, and if one interface fails or becomes unreachable, the corresponding HAIP address is transparently failed over to the other adapters.

When Grid starts on the first node of the cluster, the number of HAIP addresses is determined by how many private network adapters are active. If only one active private network is found, Grid creates one HAIP; if two, Grid creates two; if more than two, Grid creates four. The number of HAIPs does not change even if more private network adapters are activated later; a restart of clusterware on all nodes is required for the new adapters to take effect.
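The rule above boils down to: one adapter gives one HAIP, two give two, three or more give four. A small sketch of that mapping (`haip_count` is an illustrative helper, not a Grid utility):

```shell
# Illustrative only: the HAIP-count rule from the text, as a shell function.
# Argument: number of active private network adapters when the first node starts.
haip_count() {
  if [ "$1" -le 1 ]; then
    echo 1
  elif [ "$1" -eq 2 ]; then
    echo 2
  else
    echo 4    # more than two adapters: Grid caps the HAIP count at four
  fi
}
```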

Once Oracle Clusterware is fully up, the haip resource should show ONLINE status:

$ $GRID_HOME/bin/crsctl stat res -t -init
..
ora.cluster_interconnect.haip
1 ONLINE ONLINE racnode1
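To check this state from a script, the same crsctl output can be filtered; `haip_online` below is a hypothetical helper that looks for an ONLINE status on the line following the resource name:

```shell
# Hypothetical helper: succeeds if the crsctl output passed as $1 shows
# the haip resource as ONLINE on the line that follows the resource name.
haip_online() {
  printf '%s\n' "$1" | grep -A1 "ora.cluster_interconnect.haip" | grep -q "ONLINE"
}
```

For example: `haip_online "$($GRID_HOME/bin/crsctl stat res -t -init)" && echo "haip is up"`.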

Case 1: single private network adapter

If multiple physical network adapters are bonded together at the OS level and presented as a single device name, for example bond0, it is still considered a single network adapter environment. If only one private network adapter is specified, eth1 in the following example, one virtual IP will be created by HAIP. This is what is expected while Grid is up and running:

$ $GRID_HOME/bin/oifcfg getif
eth1 10.1.0.128 global cluster_interconnect
eth3 10.1.0.0 global public

$ $GRID_HOME/bin/oifcfg iflist -p -n
eth1 10.1.0.128 PRIVATE 255.255.255.128
eth1 169.254.0.0 UNKNOWN 255.255.0.0
eth3 10.1.0.0 PRIVATE 255.255.255.128

Note: the 169.254.0.0 subnet on eth1 is started by the haip resource.
ifconfig
..
eth1 Link encap:Ethernet HWaddr 00:16:3E:11:11:22
inet addr:10.1.0.168 Bcast:10.1.0.255 Mask:255.255.255.128
inet6 addr: fe80::216:3eff:fe11:1122/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:6369306 errors:0 dropped:0 overruns:0 frame:0
TX packets:4270790 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:3037449975 (2.8 GiB) TX bytes:2705797005 (2.5 GiB)
eth1:1 Link encap:Ethernet HWaddr 00:16:3E:11:22:22
inet addr:169.254.167.163 Bcast:169.254.255.255 Mask:255.255.0.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

Instance alert.log (ASM and database):
Private Interface ‘eth1:1′ configured from GPnP for use as a private interconnect.
[name=’eth1:1’, type=1, ip=169.254.167.163, mac=00-16-3e-11-11-22, net=169.254.0.0/16, mask=255.255.0.0, use=haip:cluster_interconnect/62]
Public Interface ‘eth3′ configured from GPnP for use as a public interface.
[name=’eth3’, type=1, ip=10.1.0.68, mac=00-16-3e-11-11-44, net=10.1.0.0/25, mask=255.255.255.128, use=public/1]
..
Shared memory segment for instance monitoring created
Picked latch-free SCN scheme 3
..
Cluster communication is configured to use the following interface(s) for this instance
169.254.167.163
Note: interconnect communication uses the virtual private IP 169.254.167.163 rather than the real private IP. Pre-11.2.0.2 instances still use the real private IP by default; to take advantage of the new feature for them, the init.ora parameter cluster_interconnects needs to be updated each time Grid is restarted.

For release 11.2.0.2 and above, v$cluster_interconnects shows the HAIP information:

SQL> select name,ip_address from v$cluster_interconnects;

NAME            IP_ADDRESS
————— —————-
eth1:1          169.254.167.163

Case 2: multiple private network adapters
2.1. Default status:
The following example has three private networks, eth1, eth6 and eth7, with Grid up and running:
$ $GRID_HOME/bin/oifcfg getif
eth1 10.1.0.128 global cluster_interconnect
eth3 10.1.0.0 global public
eth6 10.1.0.128 global cluster_interconnect
eth7 10.1.0.128 global cluster_interconnect

$ $GRID_HOME/bin/oifcfg iflist -p -n
eth1 10.1.0.128 PRIVATE 255.255.255.128
eth1 169.254.0.0 UNKNOWN 255.255.192.0
eth1 169.254.192.0 UNKNOWN 255.255.192.0
eth3 10.1.0.0 PRIVATE 255.255.255.128
eth6 10.1.0.128 PRIVATE 255.255.255.128
eth6 169.254.64.0 UNKNOWN 255.255.192.0
eth7 10.1.0.128 PRIVATE 255.255.255.128
eth7 169.254.128.0 UNKNOWN 255.255.192.0

Note: the haip resource started four virtual private IPs: two on eth1, and one each on eth6 and eth7.

ifconfig
..
eth1 Link encap:Ethernet HWaddr 00:16:3E:11:11:22
inet addr:10.1.0.168 Bcast:10.1.0.255 Mask:255.255.255.128
inet6 addr: fe80::216:3eff:fe11:1122/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:15176906 errors:0 dropped:0 overruns:0 frame:0
TX packets:10239298 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:7929246238 (7.3 GiB) TX bytes:5768511630 (5.3 GiB)

eth1:1 Link encap:Ethernet HWaddr 00:16:3E:11:11:22
inet addr:169.254.30.98 Bcast:169.254.63.255 Mask:255.255.192.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
eth1:2 Link encap:Ethernet HWaddr 00:16:3E:11:11:22
inet addr:169.254.244.103 Bcast:169.254.255.255 Mask:255.255.192.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

eth6 Link encap:Ethernet HWaddr 00:16:3E:11:11:77
inet addr:10.1.0.188 Bcast:10.1.0.255 Mask:255.255.255.128
inet6 addr: fe80::216:3eff:fe11:1177/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:7068185 errors:0 dropped:0 overruns:0 frame:0
TX packets:595746 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:2692567483 (2.5 GiB) TX bytes:382357191 (364.6 MiB)

eth6:1 Link encap:Ethernet HWaddr 00:16:3E:11:11:77
inet addr:169.254.112.250 Bcast:169.254.127.255 Mask:255.255.192.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

eth7 Link encap:Ethernet HWaddr 00:16:3E:11:11:88
inet addr:10.1.0.208 Bcast:10.1.0.255 Mask:255.255.255.128
inet6 addr: fe80::216:3eff:fe11:1188/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:6435829 errors:0 dropped:0 overruns:0 frame:0
TX packets:314780 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:2024577502 (1.8 GiB) TX bytes:172461585 (164.4 MiB)

eth7:1 Link encap:Ethernet HWaddr 00:16:3E:11:11:88
inet addr:169.254.178.237 Bcast:169.254.191.255 Mask:255.255.192.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

Instance alert.log (ASM and database):

Private Interface ‘eth1:1′ configured from GPnP for use as a private interconnect.
[name=’eth1:1’, type=1, ip=169.254.30.98, mac=00-16-3e-11-11-22, net=169.254.0.0/18, mask=255.255.192.0, use=haip:cluster_interconnect/62]
Private Interface ‘eth6:1′ configured from GPnP for use as a private interconnect.
[name=’eth6:1’, type=1, ip=169.254.112.250, mac=00-16-3e-11-11-77, net=169.254.64.0/18, mask=255.255.192.0, use=haip:cluster_interconnect/62]
Private Interface ‘eth7:1′ configured from GPnP for use as a private interconnect.
[name=’eth7:1’, type=1, ip=169.254.178.237, mac=00-16-3e-11-11-88, net=169.254.128.0/18, mask=255.255.192.0, use=haip:cluster_interconnect/62]
Private Interface ‘eth1:2′ configured from GPnP for use as a private interconnect.
[name=’eth1:2’, type=1, ip=169.254.244.103, mac=00-16-3e-11-11-22, net=169.254.192.0/18, mask=255.255.192.0, use=haip:cluster_interconnect/62]
Public Interface ‘eth3′ configured from GPnP for use as a public interface.
[name=’eth3’, type=1, ip=10.1.0.68, mac=00-16-3e-11-11-44, net=10.1.0.0/25, mask=255.255.255.128, use=public/1]
Picked latch-free SCN scheme 3
..

Cluster communication is configured to use the following interface(s) for this instance
169.254.30.98
169.254.112.250
169.254.178.237
169.254.244.103

Note: interconnect communication uses all four virtual private IPs. In case of network failure, as long as one private network adapter is still functioning, all four IP addresses remain active.

2.2. When a private network adapter fails
If a private network adapter fails, eth6 in this example, the virtual private IP on eth6 automatically fails over to a healthy adapter, transparently to the instances (ASM or database):

$ $GRID_HOME/bin/oifcfg iflist -p -n
eth1 10.1.0.128 PRIVATE 255.255.255.128
eth1 169.254.0.0 UNKNOWN 255.255.192.0
eth1 169.254.128.0 UNKNOWN 255.255.192.0
eth7 10.1.0.128 PRIVATE 255.255.255.128
eth7 169.254.64.0 UNKNOWN 255.255.192.0
eth7 169.254.192.0 UNKNOWN 255.255.192.0

Note: the eth6 virtual private IP on subnet 169.254.64.0 moved to eth7.

ifconfig
..
eth1 Link encap:Ethernet HWaddr 00:16:3E:11:11:22
inet addr:10.1.0.168 Bcast:10.1.0.255 Mask:255.255.255.128
inet6 addr: fe80::216:3eff:fe11:1122/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:15183840 errors:0 dropped:0 overruns:0 frame:0
TX packets:10245071 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:7934311823 (7.3 GiB) TX bytes:5771878414 (5.3 GiB)

eth1:1 Link encap:Ethernet HWaddr 00:16:3E:11:11:22
inet addr:169.254.30.98 Bcast:169.254.63.255 Mask:255.255.192.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

eth1:3 Link encap:Ethernet HWaddr 00:16:3E:11:11:22
inet addr:169.254.178.237 Bcast:169.254.191.255 Mask:255.255.192.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

eth7 Link encap:Ethernet HWaddr 00:16:3E:11:11:88
inet addr:10.1.0.208 Bcast:10.1.0.255 Mask:255.255.255.128
inet6 addr: fe80::216:3eff:fe11:1188/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:6438985 errors:0 dropped:0 overruns:0 frame:0
TX packets:315877 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:2026266447 (1.8 GiB) TX bytes:173101641 (165.0 MiB)

eth7:2 Link encap:Ethernet HWaddr 00:16:3E:11:11:88
inet addr:169.254.112.250 Bcast:169.254.127.255 Mask:255.255.192.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

eth7:3 Link encap:Ethernet HWaddr 00:16:3E:11:11:88
inet addr:169.254.244.103 Bcast:169.254.255.255 Mask:255.255.192.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

2.3. When another private network adapter fails

If another private network adapter goes down, eth1 in this example, its virtual private IPs automatically fail over to the remaining healthy adapter, with no effect on the instances (ASM or database):

$ $GRID_HOME/bin/oifcfg iflist -p -n
eth7 10.1.0.128 PRIVATE 255.255.255.128
eth7 169.254.64.0 UNKNOWN 255.255.192.0
eth7 169.254.192.0 UNKNOWN 255.255.192.0
eth7 169.254.0.0 UNKNOWN 255.255.192.0
eth7 169.254.128.0 UNKNOWN 255.255.192.0
ifconfig
..
eth7 Link encap:Ethernet HWaddr 00:16:3E:11:11:88
inet addr:10.1.0.208 Bcast:10.1.0.255 Mask:255.255.255.128
inet6 addr: fe80::216:3eff:fe11:1188/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:6441559 errors:0 dropped:0 overruns:0 frame:0
TX packets:317271 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:2027824788 (1.8 GiB) TX bytes:173810658 (165.7 MiB)

eth7:1 Link encap:Ethernet HWaddr 00:16:3E:11:11:88
inet addr:169.254.30.98 Bcast:169.254.63.255 Mask:255.255.192.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

eth7:2 Link encap:Ethernet HWaddr 00:16:3E:11:11:88
inet addr:169.254.112.250 Bcast:169.254.127.255 Mask:255.255.192.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

eth7:3 Link encap:Ethernet HWaddr 00:16:3E:11:11:88
inet addr:169.254.244.103 Bcast:169.254.255.255 Mask:255.255.192.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

eth7:4 Link encap:Ethernet HWaddr 00:16:3E:11:11:88
inet addr:169.254.178.237 Bcast:169.254.191.255 Mask:255.255.192.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

2.4. When a private network adapter recovers

Once private network adapter eth6 is working again, it is activated automatically and virtual private IPs are assigned to it:

$ $GRID_HOME/bin/oifcfg iflist -p -n
..
eth6 10.1.0.128 PRIVATE 255.255.255.128
eth6 169.254.128.0 UNKNOWN 255.255.192.0
eth6 169.254.0.0 UNKNOWN 255.255.192.0
eth7 10.1.0.128 PRIVATE 255.255.255.128
eth7 169.254.64.0 UNKNOWN 255.255.192.0
eth7 169.254.192.0 UNKNOWN 255.255.192.0

ifconfig
..
eth6 Link encap:Ethernet HWaddr 00:16:3E:11:11:77
inet addr:10.1.0.188 Bcast:10.1.0.255 Mask:255.255.255.128
inet6 addr: fe80::216:3eff:fe11:1177/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:398 errors:0 dropped:0 overruns:0 frame:0
TX packets:121 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:185138 (180.7 KiB) TX bytes:56439 (55.1 KiB)

eth6:1 Link encap:Ethernet HWaddr 00:16:3E:11:11:77
inet addr:169.254.178.237 Bcast:169.254.191.255 Mask:255.255.192.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

eth6:2 Link encap:Ethernet HWaddr 00:16:3E:11:11:77
inet addr:169.254.30.98 Bcast:169.254.63.255 Mask:255.255.192.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

eth7 Link encap:Ethernet HWaddr 00:16:3E:11:11:88
inet addr:10.1.0.208 Bcast:10.1.0.255 Mask:255.255.255.128
inet6 addr: fe80::216:3eff:fe11:1188/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:6442552 errors:0 dropped:0 overruns:0 frame:0
TX packets:317983 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:2028404133 (1.8 GiB) TX bytes:174103017 (166.0 MiB)

eth7:2 Link encap:Ethernet HWaddr 00:16:3E:11:11:88
inet addr:169.254.112.250 Bcast:169.254.127.255 Mask:255.255.192.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

eth7:3 Link encap:Ethernet HWaddr 00:16:3E:11:11:88
inet addr:169.254.244.103 Bcast:169.254.255.255 Mask:255.255.192.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

HAIP log files

The haip resource is managed by ohasd.bin; the relevant logs are $GRID_HOME/log//ohasd/ohasd.log and $GRID_HOME/log//agent/ohasd/orarootagent_root/orarootagent_root.log.
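When diagnosing a failover, the relocation events can be pulled out of the agent log with a simple filter. `count_haip_moves` is an illustrative helper, and the "HAIP: Moving ip" marker is taken from the orarootagent_root.log excerpt shown in this note:

```shell
# Illustrative helper: count HAIP relocation events in an
# orarootagent_root.log file by filtering on the message shown in this note.
count_haip_moves() {
  grep -c "HAIP: Moving ip" "$1"
}
```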

1. Sample log when a private network adapter fails

In a multiple private network adapter environment, when one adapter fails:

* ohasd.log

2010-09-24 09:10:00.891: [GIPCHGEN][1083025728]gipchaInterfaceFail: marking interface failing 0x2aaab0269a10 { host ”, haName ‘CLSFRAME_a2b2’, local (nil), ip ‘10.1.0.188’, subnet ‘10.1.0.128’, mask ‘255.255.255.128’, numRef 0, numFail 0, flags 0x4d }
2010-09-24 09:10:00.902: [GIPCHGEN][1138145600]gipchaInterfaceDisable: disabling interface 0x2aaab0269a10 { host ”, haName ‘CLSFRAME_a2b2’, local (nil), ip ‘10.1.0.188’, subnet ‘10.1.0.128’, mask ‘255.255.255.128’, numRef 0, numFail 0, flags 0x1cd }
2010-09-24 09:10:00.902: [GIPCHDEM][1138145600]gipchaWorkerCleanInterface: performing cleanup of disabled interface 0x2aaab0269a10 { host ”, haName ‘CLSFRAME_a2b2’, local (nil), ip ‘10.1.0.188’, subnet ‘10.1.0.128’, mask ‘255.255.255.128’, numRef 0, numFail 0, flags 0x1ed }

* orarootagent_root.log

2010-09-24 09:09:57.708: [ USRTHRD][1129138496] {0:0:2} failed to receive ARP request
2010-09-24 09:09:57.708: [ USRTHRD][1129138496] {0:0:2} Assigned IP 169.254.112.250 no longer valid on inf eth6
2010-09-24 09:09:57.708: [ USRTHRD][1129138496] {0:0:2} VipActions::startIp {
2010-09-24 09:09:57.708: [ USRTHRD][1129138496] {0:0:2} Adding 169.254.112.250 on eth6:1
2010-09-24 09:09:57.719: [ USRTHRD][1129138496] {0:0:2} VipActions::startIp }
2010-09-24 09:09:57.719: [ USRTHRD][1129138496] {0:0:2} Reassigned IP: 169.254.112.250 on interface eth6
2010-09-24 09:09:58.013: [ USRTHRD][1082325312] {0:0:2} HAIP: Updating member info HAIP1;10.1.0.128#0;10.1.0.128#1
2010-09-24 09:09:58.015: [ USRTHRD][1082325312] {0:0:2} HAIP: Moving ip ‘169.254.112.250’ from inf ‘eth6’ to inf ‘eth7’
2010-09-24 09:09:58.015: [ USRTHRD][1082325312] {0:0:2} pausing thread
2010-09-24 09:09:58.015: [ USRTHRD][1082325312] {0:0:2} posting thread
2010-09-24 09:09:58.016: [ USRTHRD][1082325312] {0:0:2} Thread:[NetHAWork]start {
2010-09-24 09:09:58.016: [ USRTHRD][1082325312] {0:0:2} Thread:[NetHAWork]start }
2010-09-24 09:09:58.016: [ USRTHRD][1082325312] {0:0:2} HAIP: Moving ip ‘169.254.244.103’ from inf ‘eth1’ to inf ‘eth7’
2010-09-24 09:09:58.016: [ USRTHRD][1082325312] {0:0:2} pausing thread
2010-09-24 09:09:58.016: [ USRTHRD][1082325312] {0:0:2} posting thread
2010-09-24 09:09:58.016: [ USRTHRD][1082325312] {0:0:2} Thread:[NetHAWork]start {
2010-09-24 09:09:58.016: [ USRTHRD][1082325312] {0:0:2} Thread:[NetHAWork]start }
2010-09-24 09:09:58.016: [ USRTHRD][1082325312] {0:0:2} HAIP: Moving ip ‘169.254.178.237’ from inf ‘eth7’ to inf ‘eth1’
2010-09-24 09:09:58.016: [ USRTHRD][1082325312] {0:0:2} pausing thread
2010-09-24 09:09:58.016: [ USRTHRD][1082325312] {0:0:2} posting thread
2010-09-24 09:09:58.017: [ USRTHRD][1082325312] {0:0:2} Thread:[NetHAWork]start {
2010-09-24 09:09:58.017: [ USRTHRD][1116531008] {0:0:2} [NetHAWork] thread started
2010-09-24 09:09:58.017: [ USRTHRD][1116531008] {0:0:2} Arp::sCreateSocket {
2010-09-24 09:09:58.017: [ USRTHRD][1093232960] {0:0:2} [NetHAWork] thread started
2010-09-24 09:09:58.017: [ USRTHRD][1093232960] {0:0:2} Arp::sCreateSocket {
2010-09-24 09:09:58.017: [ USRTHRD][1082325312] {0:0:2} Thread:[NetHAWork]start }
2010-09-24 09:09:58.018: [ USRTHRD][1143847232] {0:0:2} [NetHAWork] thread started
2010-09-24 09:09:58.018: [ USRTHRD][1143847232] {0:0:2} Arp::sCreateSocket {
2010-09-24 09:09:58.034: [ USRTHRD][1116531008] {0:0:2} Arp::sCreateSocket }
2010-09-24 09:09:58.034: [ USRTHRD][1116531008] {0:0:2} Starting Probe for ip 169.254.112.250
2010-09-24 09:09:58.034: [ USRTHRD][1116531008] {0:0:2} Transitioning to Probe State
2010-09-24 09:09:58.034: [ USRTHRD][1093232960] {0:0:2} Arp::sCreateSocket }
2010-09-24 09:09:58.035: [ USRTHRD][1093232960] {0:0:2} Starting Probe for ip 169.254.244.103
2010-09-24 09:09:58.035: [ USRTHRD][1093232960] {0:0:2} Transitioning to Probe State
2010-09-24 09:09:58.050: [ USRTHRD][1143847232] {0:0:2} Arp::sCreateSocket }
2010-09-24 09:09:58.050: [ USRTHRD][1143847232] {0:0:2} Starting Probe for ip 169.254.178.237
2010-09-24 09:09:58.050: [ USRTHRD][1143847232] {0:0:2} Transitioning to Probe State
2010-09-24 09:09:58.231: [ USRTHRD][1093232960] {0:0:2} Arp::sProbe {
2010-09-24 09:09:58.231: [ USRTHRD][1093232960] {0:0:2} Arp::sSend: sending type 1
2010-09-24 09:09:58.231: [ USRTHRD][1093232960] {0:0:2} Arp::sProbe }

2010-09-24 09:10:04.879: [ USRTHRD][1116531008] {0:0:2} Arp::sAnnounce {
2010-09-24 09:10:04.879: [ USRTHRD][1116531008] {0:0:2} Arp::sSend: sending type 1
2010-09-24 09:10:04.879: [ USRTHRD][1116531008] {0:0:2} Arp::sAnnounce }
2010-09-24 09:10:04.879: [ USRTHRD][1116531008] {0:0:2} Transitioning to Defend State
2010-09-24 09:10:04.879: [ USRTHRD][1116531008] {0:0:2} VipActions::startIp {
2010-09-24 09:10:04.879: [ USRTHRD][1116531008] {0:0:2} Adding 169.254.112.250 on eth7:2
2010-09-24 09:10:04.880: [ USRTHRD][1116531008] {0:0:2} VipActions::startIp }
2010-09-24 09:10:04.880: [ USRTHRD][1116531008] {0:0:2} Assigned IP: 169.254.112.250 on interface eth7

2010-09-24 09:10:05.150: [ USRTHRD][1143847232] {0:0:2} Arp::sAnnounce {
2010-09-24 09:10:05.150: [ USRTHRD][1143847232] {0:0:2} Arp::sSend: sending type 1
2010-09-24 09:10:05.150: [ USRTHRD][1143847232] {0:0:2} Arp::sAnnounce }
2010-09-24 09:10:05.150: [ USRTHRD][1143847232] {0:0:2} Transitioning to Defend State
2010-09-24 09:10:05.150: [ USRTHRD][1143847232] {0:0:2} VipActions::startIp {
2010-09-24 09:10:05.151: [ USRTHRD][1143847232] {0:0:2} Adding 169.254.178.237 on eth1:3
2010-09-24 09:10:05.151: [ USRTHRD][1143847232] {0:0:2} VipActions::startIp }
2010-09-24 09:10:05.151: [ USRTHRD][1143847232] {0:0:2} Assigned IP: 169.254.178.237 on interface eth1
2010-09-24 09:10:05.470: [ USRTHRD][1093232960] {0:0:2} Arp::sAnnounce {
2010-09-24 09:10:05.470: [ USRTHRD][1093232960] {0:0:2} Arp::sSend: sending type 1
2010-09-24 09:10:05.470: [ USRTHRD][1093232960] {0:0:2} Arp::sAnnounce }
2010-09-24 09:10:05.470: [ USRTHRD][1093232960] {0:0:2} Transitioning to Defend State
2010-09-24 09:10:05.470: [ USRTHRD][1093232960] {0:0:2} VipActions::startIp {
2010-09-24 09:10:05.471: [ USRTHRD][1093232960] {0:0:2} Adding 169.254.244.103 on eth7:3
2010-09-24 09:10:05.471: [ USRTHRD][1093232960] {0:0:2} VipActions::startIp }
2010-09-24 09:10:05.471: [ USRTHRD][1093232960] {0:0:2} Assigned IP: 169.254.244.103 on interface eth7
2010-09-24 09:10:06.047: [ USRTHRD][1082325312] {0:0:2} Thread:[NetHAWork]stop {
2010-09-24 09:10:06.282: [ USRTHRD][1129138496] {0:0:2} [NetHAWork] thread stopping
2010-09-24 09:10:06.282: [ USRTHRD][1129138496] {0:0:2} Thread:[NetHAWork]isRunning is reset to false here
2010-09-24 09:10:06.282: [ USRTHRD][1082325312] {0:0:2} Thread:[NetHAWork]stop }
2010-09-24 09:10:06.282: [ USRTHRD][1082325312] {0:0:2} VipActions::stopIp {
2010-09-24 09:10:06.282: [ USRTHRD][1082325312] {0:0:2} NetInterface::sStopIp {
2010-09-24 09:10:06.282: [ USRTHRD][1082325312] {0:0:2} Stopping ip ‘169.254.112.250’, inf ‘eth6’, mask ‘10.1.0.128’
2010-09-24 09:10:06.288: [ USRTHRD][1082325312] {0:0:2} NetInterface::sStopIp }
2010-09-24 09:10:06.288: [ USRTHRD][1082325312] {0:0:2} VipActions::stopIp }
2010-09-24 09:10:06.288: [ USRTHRD][1082325312] {0:0:2} Thread:[NetHAWork]stop {
2010-09-24 09:10:06.298: [ USRTHRD][1131239744] {0:0:2} [NetHAWork] thread stopping
2010-09-24 09:10:06.298: [ USRTHRD][1131239744] {0:0:2} Thread:[NetHAWork]isRunning is reset to false here
2010-09-24 09:10:06.298: [ USRTHRD][1082325312] {0:0:2} Thread:[NetHAWork]stop }
2010-09-24 09:10:06.298: [ USRTHRD][1082325312] {0:0:2} VipActions::stopIp {

2010-09-24 09:10:06.298: [ USRTHRD][1082325312] {0:0:2} NetInterface::sStopIp {
2010-09-24 09:10:06.298: [ USRTHRD][1082325312] {0:0:2} Stopping ip ‘169.254.178.237’, inf ‘eth7’, mask ‘10.1.0.128’
2010-09-24 09:10:06.299: [ USRTHRD][1082325312] {0:0:2} NetInterface::sStopIp }
2010-09-24 09:10:06.299: [ USRTHRD][1082325312] {0:0:2} VipActions::stopIp }
2010-09-24 09:10:06.299: [ USRTHRD][1082325312] {0:0:2} Thread:[NetHAWork]stop {
2010-09-24 09:10:06.802: [ USRTHRD][1133340992] {0:0:2} [NetHAWork] thread stopping
2010-09-24 09:10:06.802: [ USRTHRD][1133340992] {0:0:2} Thread:[NetHAWork]isRunning is reset to false here
2010-09-24 09:10:06.802: [ USRTHRD][1082325312] {0:0:2} Thread:[NetHAWork]stop }
2010-09-24 09:10:06.802: [ USRTHRD][1082325312] {0:0:2} VipActions::stopIp {
2010-09-24 09:10:06.802: [ USRTHRD][1082325312] {0:0:2} NetInterface::sStopIp {
2010-09-24 09:10:06.802: [ USRTHRD][1082325312] {0:0:2} Stopping ip ‘169.254.244.103’, inf ‘eth1’, mask ‘10.1.0.128’
2010-09-24 09:10:06.802: [ USRTHRD][1082325312] {0:0:2} NetInterface::sStopIp }
2010-09-24 09:10:06.802: [ USRTHRD][1082325312] {0:0:2} VipActions::stopIp }
2010-09-24 09:10:06.803: [ USRTHRD][1082325312] {0:0:2} USING HAIP[ 0 ]: eth7 – 169.254.112.250
2010-09-24 09:10:06.803: [ USRTHRD][1082325312] {0:0:2} USING HAIP[ 1 ]: eth1 – 169.254.178.237
2010-09-24 09:10:06.803: [ USRTHRD][1082325312] {0:0:2} USING HAIP[ 2 ]: eth7 – 169.254.244.103
2010-09-24 09:10:06.803: [ USRTHRD][1082325312] {0:0:2} USING HAIP[ 3 ]: eth1 – 169.254.30.98

Note: as seen above, even though only NIC eth6 failed, multiple virtual private IPs may still be relocated among the surviving NICs.

* ocssd.log

2010-09-24 09:09:58.314: [ GIPCNET][1089964352] gipcmodNetworkProcessSend: [network] failed send attempt endp 0xe1b9150 [0000000000000399] { gipcEndpoint : localAddr ‘udp://10.1.0.188:60169’, remoteAddr ”, numPend 5, numReady 1, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, flags 0x2, usrFlags 0x4000 }, req 0x2aaab00117f0 [00000000004b0cae] { gipcSendRequest : addr ‘udp://10.1.0.189:41486’, data 0x2aaab0050be8, len 80, olen 0, parentEndp 0xe1b9150, ret gipcretEndpointNotAvailable (40), objFlags 0x0, reqFlags 0x2 }
2010-09-24 09:09:58.314: [ GIPCNET][1089964352] gipcmodNetworkProcessSend: slos op : sgipcnValidateSocket
2010-09-24 09:09:58.314: [ GIPCNET][1089964352] gipcmodNetworkProcessSend: slos dep : Invalid argument (22)
2010-09-24 09:09:58.314: [ GIPCNET][1089964352] gipcmodNetworkProcessSend: slos loc : address not
2010-09-24 09:09:58.314: [ GIPCNET][1089964352] gipcmodNetworkProcessSend: slos info: addr ‘10.1.0.188:60169’, len 80, buf 0x2aaab0050be8, cookie 0x2aaab00117f0
2010-09-24 09:09:58.314: [GIPCXCPT][1089964352] gipcInternalSendSync: failed sync request, ret gipcretEndpointNotAvailable (40)
2010-09-24 09:09:58.314: [GIPCXCPT][1089964352] gipcSendSyncF [gipchaLowerInternalSend : gipchaLower.c : 755]: EXCEPTION[ ret gipcretEndpointNotAvailable (40) ] failed to send on endp 0xe1b9150 [0000000000000399] { gipcEndpoint : localAddr ‘udp://10.1.0.188:60169’, remoteAddr ”, numPend 5, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, flags 0x2, usrFlags 0x4000 }, addr 0xe4e6d10 [00000000000007ed] { gipcAddress : name ‘udp://10.1.0.189:41486’, objFlags 0x0, addrFlags 0x1 }, buf 0x2aaab0050be8, len 80, flags 0x0
2010-09-24 09:09:58.314: [GIPCHGEN][1089964352] gipchaInterfaceFail: marking interface failing 0xe2bd5f0 { host ‘racnode2’, haName ‘CSS_a2b2’, local 0x2aaaac2098e0, ip ‘10.1.0.189:41486’, subnet ‘10.1.0.128’, mask ‘255.255.255.128’, numRef 0, numFail 0, flags 0x6 }
2010-09-24 09:09:58.314: [GIPCHALO][1089964352] gipchaLowerInternalSend: failed to initiate send on interface 0xe2bd5f0 { host ‘racnode2’, haName ‘CSS_a2b2’, local 0x2aaaac2098e0, ip ‘10.1.0.189:41486’, subnet ‘10.1.0.128’, mask ‘255.255.255.128’, numRef 0, numFail 0, flags 0x86 }, hctx 0xde81d10 [0000000000000010] { gipchaContext : host ‘racnode1’, name ‘CSS_a2b2’, luid ‘4f06f2aa-00000000’, numNode 1, numInf 3, usrFlags 0x0, flags 0x7 }
2010-09-24 09:09:58.326: [GIPCHGEN][1089964352] gipchaInterfaceDisable: disabling interface 0x2aaaac2098e0 { host ”, haName ‘CSS_a2b2’, local (nil), ip ‘10.1.0.188’, subnet ‘10.1.0.128’, mask ‘255.255.255.128’, numRef 0, numFail 1, flags 0x14d }
2010-09-24 09:09:58.326: [GIPCHGEN][1089964352] gipchaInterfaceDisable: disabling interface 0xe2bd5f0 { host ‘racnode2’, haName ‘CSS_a2b2’, local 0x2aaaac2098e0, ip ‘10.1.0.189:41486’, subnet ‘10.1.0.128’, mask ‘255.255.255.128’, numRef 0, numFail 0, flags 0x86 }
2010-09-24 09:09:58.327: [GIPCHALO][1089964352] gipchaLowerCleanInterfaces: performing cleanup of disabled interface 0xe2bd5f0 { host ‘racnode2’, haName ‘CSS_a2b2’, local 0x2aaaac2098e0, ip ‘10.1.0.189:41486’, subnet ‘10.1.0.128’, mask ‘255.255.255.128’, numRef 0, numFail 0, flags 0xa6 }
2010-09-24 09:09:58.327: [GIPCHGEN][1089964352] gipchaInterfaceReset: resetting interface 0xe2bd5f0 { host ‘racnode2’, haName ‘CSS_a2b2’, local 0x2aaaac2098e0, ip ‘10.1.0.189:41486’, subnet ‘10.1.0.128’, mask ‘255.255.255.128’, numRef 0, numFail 0, flags 0xa6 }
2010-09-24 09:09:58.338: [GIPCHDEM][1089964352] gipchaWorkerCleanInterface: performing cleanup of disabled interface 0x2aaaac2098e0 { host ”, haName ‘CSS_a2b2’, local (nil), ip ‘10.1.0.188’, subnet ‘10.1.0.128’, mask ‘255.255.255.128’, numRef 0, numFail 0, flags 0x16d }
2010-09-24 09:09:58.338: [GIPCHTHR][1089964352] gipchaWorkerUpdateInterface: created remote interface for node ‘racnode2’, haName ‘CSS_a2b2’, inf ‘udp://10.1.0.189:41486’
2010-09-24 09:09:58.338: [GIPCHGEN][1089964352] gipchaWorkerAttachInterface: Interface attached inf 0xe2bd5f0 { host ‘racnode2’, haName ‘CSS_a2b2’, local 0x2aaaac2014f0, ip ‘10.1.0.189:41486’, subnet ‘10.1.0.128’, mask ‘255.255.255.128’, numRef 0, numFail 0, flags 0x6 }
2010-09-24 09:10:00.454: [ CSSD][1108904256]clssnmSendingThread: sending status msg to all nodes

2. Sample log when a private network adapter recovers

In a multiple private network adapter environment, when the failed adapter comes back:

* ohasd.log

2010-09-24 09:14:30.962: [GIPCHGEN][1083025728]gipchaNodeAddInterface: adding interface information for inf 0x2aaaac1a53d0 { host ”, haName ‘CLSFRAME_a2b2’, local (nil), ip ‘10.1.0.188’, subnet ‘10.1.0.128’, mask ‘255.255.255.128’, numRef 0, numFail 0, flags 0x41 }
2010-09-24 09:14:30.972: [GIPCHTHR][1138145600]gipchaWorkerUpdateInterface: created local bootstrap interface for node ‘eyrac1f’, haName ‘CLSFRAME_a2b2’, inf ‘mcast://230.0.1.0:42424/10.1.0.188’
2010-09-24 09:14:30.972: [GIPCHTHR][1138145600]gipchaWorkerUpdateInterface: created local interface for node ‘eyrac1f’, haName ‘CLSFRAME_a2b2’, inf ‘10.1.0.188:13235’

* ocssd.log

2010-09-24 09:14:30.961: [GIPCHGEN][1091541312] gipchaNodeAddInterface: adding interface information for inf 0x2aaab005af00 { host ”, haName ‘CSS_a2b2’, local (nil), ip ‘10.1.0.188’, subnet ‘10.1.0.128’, mask ‘255.255.255.128’, numRef 0, numFail 0, flags 0x41 }
2010-09-24 09:14:30.972: [GIPCHTHR][1089964352] gipchaWorkerUpdateInterface: created local bootstrap interface for node ‘racnode1’, haName ‘CSS_a2b2’, inf ‘mcast://230.0.1.0:42424/10.1.0.188’
2010-09-24 09:14:30.972: [GIPCHTHR][1089964352] gipchaWorkerUpdateInterface: created local interface for node ‘racnode1’, haName ‘CSS_a2b2’, inf ‘10.1.0.188:10884’
2010-09-24 09:14:30.972: [GIPCHGEN][1089964352] gipchaNodeAddInterface: adding interface information for inf 0x2aaab0035490 { host ‘racnode2’, haName ‘CSS_a2b2’, local (nil), ip ‘10.1.0.208’, subnet ‘10.1.0.128’, mask ‘255.255.255.128’, numRef 0, numFail 0, flags 0x42 }
2010-09-24 09:14:30.972: [GIPCHGEN][1089964352] gipchaNodeAddInterface: adding interface information for inf 0x2aaab00355c0 { host ‘racnode2’, haName ‘CSS_a2b2’, local (nil), ip ‘10.1.0.188’, subnet ‘10.1.0.128’, mask ‘255.255.255.128’, numRef 0, numFail 0, flags 0x42 }
2010-09-24 09:14:30.972: [GIPCHTHR][1089964352] gipchaWorkerUpdateInterface: created remote interface for node ‘racnode2’, haName ‘CSS_a2b2’, inf ‘mcast://230.0.1.0:42424/10.1.0.208’
2010-09-24 09:14:30.972: [GIPCHGEN][1089964352] gipchaWorkerAttachInterface: Interface attached inf 0x2aaab0035490 { host ‘racnode2’, haName ‘CSS_a2b2’, local 0x2aaab005af00, ip ‘10.1.0.208’, subnet ‘10.1.0.128’, mask ‘255.255.255.128’, numRef 0, numFail 0, flags 0x46 }
2010-09-24 09:14:30.972: [GIPCHTHR][1089964352] gipchaWorkerUpdateInterface: created remote interface for node ‘racnode2’, haName ‘CSS_a2b2’, inf ‘mcast://230.0.1.0:42424/10.1.0.188’
2010-09-24 09:14:30.972: [GIPCHGEN][1089964352] gipchaWorkerAttachInterface: Interface attached inf 0x2aaab00355c0 { host ‘racnode2’, haName ‘CSS_a2b2’, local 0x2aaab005af00, ip ‘10.1.0.188’, subnet ‘10.1.0.128’, mask ‘255.255.255.128’, numRef 0, numFail 0, flags 0x46 }
2010-09-24 09:14:31.437: [GIPCHGEN][1089964352] gipchaInterfaceDisable: disabling interface 0x2aaab00355c0 { host ‘racnode2’, haName ‘CSS_a2b2’, local 0x2aaab005af00, ip ‘10.1.0.188’, subnet ‘10.1.0.128’, mask ‘255.255.255.128’, numRef 0, numFail 0, flags 0x46 }
2010-09-24 09:14:31.437: [GIPCHALO][1089964352] gipchaLowerCleanInterfaces: performing cleanup of disabled interface 0x2aaab00355c0 { host ‘racnode2’, haName ‘CSS_a2b2’, local 0x2aaab005af00, ip ‘10.1.0.188’, subnet ‘10.1.0.128’, mask ‘255.255.255.128’, numRef 0, numFail 0, flags 0x66 }
2010-09-24 09:14:31.446: [GIPCHGEN][1089964352] gipchaInterfaceDisable: disabling interface 0x2aaab0035490 { host ‘racnode2’, haName ‘CSS_a2b2’, local 0x2aaab005af00, ip ‘10.1.0.208’, subnet ‘10.1.0.128’, mask ‘255.255.255.128’, numRef 0, numFail 0, flags 0x46 }
2010-09-24 09:14:31.446: [GIPCHALO][1089964352] gipchaLowerCleanInterfaces: performing cleanup of disabled interface 0x2aaab0035490 { host ‘racnode2’, haName ‘CSS_a2b2’, local 0x2aaab005af00, ip ‘10.1.0.208’, subnet ‘10.1.0.128’, mask ‘255.255.255.128’, numRef 0, numFail 0, flags 0x66 }

Known issues

Bug 10370797
Issue: HAIP fails to start on AIX

Fixed in: 11.2.0.3, affects AIX only

Symptom:

* Output of root script:

CRS-2672: Attempting to start ‘ora.cluster_interconnect.haip’ on ‘racnode1’
CRS-5017: The resource action “ora.cluster_interconnect.haip start” encountered the following error:
Start action for HAIP aborted
CRS-2674: Start of ‘ora.cluster_interconnect.haip’ on ‘racnode1’ failed

* $GRID_HOME/log//agent/ohasd/orarootagent_root/orarootagent_root.log

2010-12-04 17:19:54.893: [ USRTHRD][2084] {0:3:37} failed to create arp
2010-12-04 17:19:54.893: [ USRTHRD][2084] {0:3:37} (null) category: -2, operation: ioctl, loc: bpfopen:2,os, OS error: 14, other:
2010-12-04 17:19:54.992: [ USRTHRD][2084] {0:3:37} Arp::sCreateSocket {
2010-12-04 17:19:54.992: [ USRTHRD][2084] {0:3:37} failed to create arp
Bug 10332426
Issue: HAIP fails to start while running rootupgrade.sh

Fixed in: Configuration issue

Symptom:

* Output of root script:

CRS-2672: Attempting to start ‘ora.cluster_interconnect.haip’ on ‘racnode1’
CRS-5017: The resource action “ora.cluster_interconnect.haip start”
encountered the following error:
Start action for HAIP aborted
CRS-2674: Start of ‘ora.cluster_interconnect.haip’ on ‘racnode1’ failed

* $GRID_HOME/log//gipcd/gipcd.log

2010-12-12 09:41:35.201: [ CLSINET][1088543040] Returning NETDATA: 0 interfaces
2010-12-12 09:41:40.201: [ CLSINET][1088543040] Returning NETDATA: 0 interfaces

Resolution:

The cause of the problem is a mismatch between the OCR and the OS-level information for the private network. The following outputs should be consistent with each other (network adapter name, subnet, and netmask):

oifcfg iflist -p -n
oifcfg getif
ifconfig
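The consistency check can be scripted: distill each source down to "interface subnet netmask" lines and flag OCR entries that the OS does not report. `check_iface_match` is a hypothetical sketch, not an Oracle tool:

```shell
# Hypothetical sketch: flag private-network entries recorded in OCR
# (first argument) that are absent from the OS-level view (second argument).
# Both arguments are newline-separated "interface subnet netmask" lines.
check_iface_match() {
  printf '%s\n' "$1" | while read -r entry; do
    case "$2" in
      *"$entry"*) ;;                    # entry also visible at OS level
      *) echo "MISMATCH: $entry" ;;
    esac
  done
}
```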
Bug 10363902
Issue: HAIP fails to start if the cluster interconnect is on InfiniBand, or on any other network hardware type whose hardware address (MAC) is longer than 6 bytes

Fixed in: 11.2.0.3 for Linux and Solaris

Symptom:

* Output of root script:

CRS-2672: Attempting to start ‘ora.cluster_interconnect.haip’ on ‘racnode1’
CRS-5017: The resource action “ora.cluster_interconnect.haip start”
encountered the following error:
Start action for HAIP aborted
CRS-2674: Start of ‘ora.cluster_interconnect.haip’ on ‘racnode1’ failed

* $GRID_HOME/log//gipcd/gipcd.log

2010-12-07 13:23:08.560: [ USRTHRD][3858] {0:0:62} Arp::sCreateSocket {
2010-12-07 13:23:08.560: [ USRTHRD][3858] {0:0:62} failed to create arp
2010-12-07 13:23:08.561: [ USRTHRD][3858] {0:0:62} (null) category: -2,
operation: ssclsi_aix_get_phys_addr, loc: aixgetpa:4,n, OS error: 2, other:

@ 10380816 – implementation of 10363902 on AIX

References
Note 1050908.1 – How to Troubleshoot Grid Infrastructure Startup Issues
Note 1054902.1 – How to Validate Network and Name Resolution Setup for the Clusterware and RAC
Note 1212703.1 – 11.2.0.2 Grid Infrastructure Install or Upgrade may fail due to Multicasting Requirement
http://download.oracle.com/docs/cd/E11882_01/install.112/e17212/prelinux…

11gR2 Grid Infrastructure Redundant Interconnect and ora.cluster_interconnect.haip (Doc ID 1210883.1)
