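A few months ago, during a drill at a client site, a project manager from the contractor's team claimed that if you stop a listener with the lsnrctl command in a RAC environment, Oracle CRS will automatically restart it. The client took the claim very seriously and asked me about it. CRS does check all resources periodically and may restart a resource that has terminated unexpectedly, but stopping the listener manually with lsnrctl definitely does not count as an unexpected termination. The contractor's project manager was middle-aged, had a great deal of project experience, and spoke about the issue with complete certainty (I strongly dislike that kind of confidence), so it was hard not to believe him. At the time I explained to the client how CRS restarts resources and laid out my view that the listener would not be restarted. Because I could not rule out surprises (the Oracle I know is always full of them), I did not sound as certain as the contractor's manager, so the client's lead remained only half convinced and suggested that we test it.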
The records of that test were not kept, so let's look at the behavior of Oracle RAC 10.2.0.5 on RHEL 5.5 (it matches what we saw with 10.2.0.4 on AIX):
[maclean@rh2 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.....S13.cs application    ONLINE    ONLINE    rh2
ora....db1.srv application    ONLINE    ONLINE    rh2
ora.racdb.db   application    ONLINE    ONLINE    rh1
ora....b1.inst application    ONLINE    ONLINE    rh2
ora....b2.inst application    ONLINE    ONLINE    rh1
ora....SM2.asm application    ONLINE    ONLINE    rh1
ora....H1.lsnr application    ONLINE    ONLINE    rh1
ora.rh1.gsd    application    ONLINE    ONLINE    rh1
ora.rh1.ons    application    ONLINE    ONLINE    rh1
ora.rh1.vip    application    ONLINE    ONLINE    rh1
ora....SM1.asm application    ONLINE    ONLINE    rh2
ora....H2.lsnr application    ONLINE    ONLINE    rh2
ora.rh2.gsd    application    ONLINE    ONLINE    rh2
ora.rh2.ons    application    ONLINE    ONLINE    rh2
ora.rh2.vip    application    ONLINE    ONLINE    rh2

[maclean@rh2 ~]$ ps -ef|grep tns
maclean   4098 17071  0 19:35 pts/0    00:00:00 grep tns
maclean  11062     1  0 11:34 ?        00:00:00 /s01/rac10g/bin/tnslsnr LISTENER_RH2 -inherit

[maclean@rh2 ~]$ lsnrctl stop LISTENER_RH2
LSNRCTL for Linux: Version 10.2.0.5.0 - Production on 27-JUN-2010 19:35:46
Copyright (c) 1991, 2010, Oracle. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=rh2-vip)(PORT=1521)(IP=FIRST)))
The command completed successfully

[maclean@rh2 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.....S13.cs application    ONLINE    ONLINE    rh2
ora....db1.srv application    ONLINE    ONLINE    rh2
ora.racdb.db   application    ONLINE    ONLINE    rh1
ora....b1.inst application    ONLINE    ONLINE    rh2
ora....b2.inst application    ONLINE    ONLINE    rh1
ora....SM2.asm application    ONLINE    ONLINE    rh1
ora....H1.lsnr application    ONLINE    ONLINE    rh1
ora.rh1.gsd    application    ONLINE    ONLINE    rh1
ora.rh1.ons    application    ONLINE    ONLINE    rh1
ora.rh1.vip    application    ONLINE    ONLINE    rh1
ora....SM1.asm application    ONLINE    ONLINE    rh2
ora....H2.lsnr application    OFFLINE   OFFLINE          // TARGET has been set to OFFLINE, so the resource will not be restarted
ora.rh2.gsd    application    ONLINE    ONLINE    rh2
ora.rh2.ons    application    ONLINE    ONLINE    rh2
ora.rh2.vip    application    ONLINE    ONLINE    rh2

[maclean@rh2 ~]$ crs_start ora.rh2.LISTENER_RH2.lsnr
Attempting to start `ora.rh2.LISTENER_RH2.lsnr` on member `rh2`
Start of `ora.rh2.LISTENER_RH2.lsnr` on member `rh2` succeeded.

[maclean@rh2 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.....S13.cs application    ONLINE    ONLINE    rh2
ora....db1.srv application    ONLINE    ONLINE    rh2
ora.racdb.db   application    ONLINE    ONLINE    rh1
ora....b1.inst application    ONLINE    ONLINE    rh2
ora....b2.inst application    ONLINE    ONLINE    rh1
ora....SM2.asm application    ONLINE    ONLINE    rh1
ora....H1.lsnr application    ONLINE    ONLINE    rh1
ora.rh1.gsd    application    ONLINE    ONLINE    rh1
ora.rh1.ons    application    ONLINE    ONLINE    rh1
ora.rh1.vip    application    ONLINE    ONLINE    rh1
ora....SM1.asm application    ONLINE    ONLINE    rh2
ora....H2.lsnr application    ONLINE    ONLINE    rh2
ora.rh2.gsd    application    ONLINE    ONLINE    rh2
ora.rh2.ons    application    ONLINE    ONLINE    rh2
ora.rh2.vip    application    ONLINE    ONLINE    rh2

[maclean@rh2 ~]$ ps -ef|grep tns
maclean   4629     1  0 19:37 ?        00:00:00 /s01/rac10g/bin/tnslsnr LISTENER_RH2 -inherit
maclean   5212 17071  0 19:38 pts/0    00:00:00 grep tns

[maclean@rh2 ~]$ kill -9 4629
[maclean@rh2 ~]$ ps -ef|grep tns
maclean   5333 17071  0 19:38 pts/0    00:00:00 grep tns

[maclean@rh2 ~]$ date
Sun Jun 27 19:38:59 EDT 2010

// Check again after 10 minutes

[maclean@rh2 ~]$ ps -ef|grep tns
maclean   8655     1  0 19:47 ?        00:00:00 /s01/rac10g/bin/tnslsnr LISTENER_RH2 -inherit
maclean   9252 17071  0 19:48 pts/0    00:00:00 grep tns

[maclean@rh2 ~]$ lsnrctl status LISTENER_RH2
LSNRCTL for Linux: Version 10.2.0.5.0 - Production on 27-JUN-2010 19:48:43
Copyright (c) 1991, 2010, Oracle. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=rh2-vip)(PORT=1521)(IP=FIRST)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER_RH2
Version                   TNSLSNR for Linux: Version 10.2.0.5.0 - Production
Start Date                27-JUN-2010 19:47:07
Uptime                    0 days 0 hr. 1 min. 35 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /s01/rac10g/network/admin/listener.ora
Listener Log File         /s01/rac10g/network/log/listener_rh2.log
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.1.104)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.1.103)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC)))
Services Summary...
Service "+ASM" has 1 instance(s).
  Instance "+ASM1", status BLOCKED, has 1 handler(s) for this service...
Service "+ASM_XPT" has 1 instance(s).
  Instance "+ASM1", status BLOCKED, has 1 handler(s) for this service...
Service "S13" has 1 instance(s).
  Instance "racdb1", status READY, has 2 handler(s) for this service...
Service "racdb" has 2 instance(s).
  Instance "racdb1", status READY, has 2 handler(s) for this service...
  Instance "racdb2", status READY, has 1 handler(s) for this service...
Service "racdbXDB" has 2 instance(s).
  Instance "racdb1", status READY, has 1 handler(s) for this service...
  Instance "racdb2", status READY, has 1 handler(s) for this service...
Service "racdb_XPT" has 2 instance(s).
  Instance "racdb1", status READY, has 2 handler(s) for this service...
  Instance "racdb2", status READY, has 1 handler(s) for this service...
The command completed successfully
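The default CHECK_INTERVAL of the LISTENER resource is 600 seconds, which means CRS detects an unexpected termination of the listener within 10 minutes and tries to restart the resource. After this test on the production host, the contractor's project manager found the result somewhat hard to believe, and the client came around to my view. In fact, assertions as categorical as his are not rare elsewhere either. The people who make them have mostly learned Oracle entirely through practice. There is nothing wrong with that, since practice can teach us most things, but an understanding that comes only from practice is often incomplete, and that is exactly the mistake this project manager made. He had probably run into a similar but misleading situation in some earlier case, had not read the official documentation carefully, and had not gone back afterwards to understand the cause and effect of the whole incident. He then relied on his years of experience to pronounce a very firm conclusion on the question.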
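The two cases in the session can be condensed into a small model. This is only a sketch of the behavior observed above, not Oracle's implementation: the class and function names are hypothetical, and it assumes nothing beyond what the transcript shows, namely that `lsnrctl stop` leaves both Target and State OFFLINE, while `kill -9` leaves Target at ONLINE.

```python
from dataclasses import dataclass

# Default for the LISTENER resource according to the text above (seconds).
CHECK_INTERVAL = 600


@dataclass
class Resource:
    """Hypothetical stand-in for one row of `crs_stat -t`."""
    name: str
    target: str = "ONLINE"  # the state CRS is asked to maintain
    state: str = "ONLINE"   # the state the resource is actually in


def lsnrctl_stop(res: Resource) -> None:
    # Observed effect of a manual stop: both columns show OFFLINE.
    res.target = "OFFLINE"
    res.state = "OFFLINE"


def kill_9(res: Resource) -> None:
    # Unexpected death: the process is gone, Target is left untouched.
    res.state = "OFFLINE"


def periodic_check(res: Resource) -> bool:
    """Simplified check, assumed to run once per CHECK_INTERVAL.

    Restarts the resource only when it should be up but is not.
    Returns True if a restart happened.
    """
    if res.target == "ONLINE" and res.state == "OFFLINE":
        res.state = "ONLINE"
        return True
    return False


stopped = Resource("ora.rh2.LISTENER_RH2.lsnr")
lsnrctl_stop(stopped)
print(periodic_check(stopped), stopped.state)  # False OFFLINE

killed = Resource("ora.rh2.LISTENER_RH2.lsnr")
kill_9(killed)
print(periodic_check(killed), killed.state)    # True ONLINE
```

The point of the model is that the restart decision depends on the Target column, not on whether the process happens to be running, which is why the manually stopped listener stayed down until `crs_start` was issued.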
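Most IT infrastructure in Chinese companies today is built by system integrators, which play a very important role across the whole IT chain. As I have gained experience, I have gradually noticed that integrators employ a great many people like this manager: older or younger, with more or less experience, but mostly unable to stop once they start talking about technology, afraid that others will not know how much they can do, and voicing their opinions in a tone that allows no doubt whatsoever!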
First! Thanks! Keep it up!
That is exactly right!
And that is why you are an OCM and he is not!
Your profession shapes your character!
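I also know some so-called experts with big reputations who sometimes have not even understood the other person's question, or have never touched the topic at all, and still answer offhand in a tone that brooks no doubt. IT is so broad and changes so quickly; if you don't know, just say you don't know, since nobody is going to laugh at you. I just don't know how these people's characters were formed.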
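I think this is the normal way to handle it. I used to be a developer (I have since switched to DBA work). When we monitored background processes, a process that terminated unexpectedly was restarted automatically, while one that terminated normally was generally assumed to have been stopped deliberately by a person and was not restarted. Otherwise maintenance staff would be unable to stop a process to carry out necessary maintenance.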