RAC: insufficient swap, instance down -- resolved

I came across this case on a forum and thought it was worth sharing.

AIX RAC: insufficient swap, instance down -- resolved
os : aix 5.3 + hacmp 5.4.1
db : 10.2.0.3 rac


alert.log
Mon Sep  1 03:49:17 2008
Process startup failed, error stack:
Mon Sep  1 03:49:17 2008
Errors in file /app/oracle/admin/racdb/bdump/racdb1_psp0_479298.trc:
ORA-27300: OS system dependent operation:fork failed with status: 12
ORA-27301: OS failure message: Not enough space
ORA-27302: failure occurred at: skgpspawn3
Mon Sep  1 03:49:18 2008
Process PZ96 died, see its trace file


trace.log
Redo thread mounted by this instance: 1
Oracle process number: 4
Unix process pid: 479298, image: oracle@racdb1 (PSP0)

*** SERVICE NAME:(SYS$BACKGROUND) 2008-09-01 03:46:42.179
*** SESSION ID:(553.1) 2008-09-01 03:46:42.179
*** 2008-09-01 03:46:42.179
Process startup failed, error stack:
ORA-27300: OS system dependent operation:fork failed with status: 12
ORA-27301: OS failure message: Not enough space
ORA-27302: failure occurred at: skgpspawn3
*** 2008-09-01 03:47:41.144
Process startup failed, error stack:
ORA-27300: OS system dependent operation:fork failed with status: 12
ORA-27301: OS failure message: Not enough space
ORA-27302: failure occurred at: skgpspawn3
*** 2008-09-01 03:49:15.684

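Analysis:
   1. The alert.log contained many "ORA-27301: OS failure message: Not enough space" errors. I first assumed a disk was full, but a check of the disks showed that none of them was.
   2. The trace file shows that the failing process is PSP0. The process spawner (PSP0) spawns Oracle processes and is one of Oracle's main background processes; here its attempts to start processes were failing.
   3. crs_stat -t showed that database instance 1 was already down and that all rac1 resources were offline, so PSP0 could no longer do its job. This suggests that "ORA-27301: OS failure message: Not enough space" was probably caused by a shortage of system memory.
   4. Searched Metalink: Doc ID:  Note:560309.1
      It confirms that the cause is insufficient RAM/swap and recommends the following sizing:
      RAM                    SWAP
      1GB to 2GB             1.5 times RAM
      > 2GB and < 8GB        equal to RAM
      > 8GB                  .75 times RAM
      The AIX system currently has RAM: 8G, swap: 4G, so its swap is well below the recommended size.
   5. With the cause known, the fix is simple: use smitty chps to increase the system's swap (paging space), then check it with lsps -a or topas.
   6. The system had been running normally for three to four months before this; after the swap change it still needs to be watched.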
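
The sizing rule from step 4 can be written down as a small check. This is only a sketch: the function names are made up for illustration, the middle tier is assumed to be "equal to RAM", exactly 8 GB is assumed to fall into that middle tier, and machines under 1 GB (not covered by the rule) are simply treated like the first tier.

```python
def recommended_swap_gb(ram_gb: float) -> float:
    """Rule-of-thumb swap size (GB) for a given amount of RAM (GB)."""
    if ram_gb <= 2:
        # 1GB to 2GB tier (assumption: also used below 1 GB)
        return 1.5 * ram_gb
    if ram_gb <= 8:
        # > 2GB up to 8GB tier (assumption: exactly 8 GB lands here)
        return ram_gb
    # > 8GB tier
    return 0.75 * ram_gb


def swap_shortfall_gb(ram_gb: float, swap_gb: float) -> float:
    """How many GB of swap are missing relative to the rule (0 if enough)."""
    return max(0.0, recommended_swap_gb(ram_gb) - swap_gb)


if __name__ == "__main__":
    # The AIX node from this case: 8 GB RAM, 4 GB swap.
    print("AIX node shortfall (GB):", swap_shortfall_gb(8, 4))
    # The server quoted in Note 560309.1: 38912 MB RAM, 17664 MB swap.
    print("Note server shortfall (GB):",
          swap_shortfall_gb(38912 / 1024, 17664 / 1024))
```

Under these assumptions the AIX node is 4 GB short, which matches the conclusion that 4G of swap is too small for 8G of RAM.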

Doc ID:  Note:560309.1   
Applies to:
Oracle Server - Enterprise Edition - Version: 10.2.0.1 to 10.2.0.3
This problem can occur on any platform.

Symptoms
The database can not start up due to the following errors:

*** SERVICE NAME:(SYS$BACKGROUND) 2008-03-24 17:02:34.855
*** SESSION ID:(1104.1) 2008-03-24 17:02:34.855
*** 2008-03-24 17:02:34.855
Process startup failed, error stack:
ORA-27300: OS system dependent operation:fork failed with status: 12
ORA-27301: OS failure message: Not enough space
ORA-27302: failure occurred at: skgpspawn3
*** 2008-03-24 17:02:38.158
Process startup failed, error stack:
ORA-27300: OS system dependent operation:fork failed with status: 12
ORA-27301: OS failure message: Not enough space
ORA-27302: failure occurred at: skgpspawn3

Cause
This issue is mainly caused by lack of memory / swap. Checking the memory configuration on the server, we have found the following:

Total Physical Memory 38912 MB
Swap: Max Size 17664 MiB
So, RAM is 38 GB, SWAP space is only 17 GB
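
The link between the error stack and memory can also be checked from the status code itself: the "status: 12" in ORA-27300 is an operating system errno value. A minimal sketch using only the Python standard library (the variable name is illustrative):

```python
import errno
import os

# Status reported by "ORA-27300: ... fork failed with status: 12".
fork_status = 12

# Symbolic errno name for this number.
print(errno.errorcode[fork_status])

# Message text for this errno; the wording is platform dependent.
print(os.strerror(fork_status))
```

Errno 12 is ENOMEM. The message text differs between platforms: AIX words it as "Not enough space", which is the string shown in ORA-27301, while Linux prints "Cannot allocate memory". So the error points at memory rather than disk space.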

Solution
-We should increase the server swap space (paging space) . The general rule of thumb is that swap space should be:
RAM                    SWAP
1GB to 2GB             1.5 times RAM
> 2GB and < 8GB        equal to RAM
> 8GB                  .75 times RAM

So in our case, the recommended swap space is about 28 GB (.75 times 38 GB).

We can also try to increase physical memory, if possible.

We should also check the ulimits for the Oracle user:
memory - unlimited
data   - unlimited
cpu    - unlimited
stack  - at least 32768
nofile - OS dependent
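
As a rough way to see these limits from inside a process, the sketch below reads the soft limits with Python's standard `resource` module (Unix only). Run it as the Oracle user. The mapping from the names in the note to `RLIMIT_*` constants is an assumption (in particular "memory" is taken to mean the resident set size limit), and the values are printed as the OS reports them (bytes or seconds), not in the units that `ulimit` displays.

```python
import resource

# Assumed mapping: name used in the note -> resource module constant.
CHECKS = {
    "memory": "RLIMIT_RSS",
    "data": "RLIMIT_DATA",
    "cpu": "RLIMIT_CPU",
    "stack": "RLIMIT_STACK",
    "nofile": "RLIMIT_NOFILE",
}


def describe(value: int) -> str:
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def current_soft_limits() -> dict:
    """Return the soft limits of the current process, keyed by note name."""
    limits = {}
    for name, const in CHECKS.items():
        rlimit = getattr(resource, const, None)
        if rlimit is None:
            # This platform does not expose the constant; skip it.
            continue
        soft, _hard = resource.getrlimit(rlimit)
        limits[name] = soft
    return limits


if __name__ == "__main__":
    for name, soft in current_soft_limits().items():
        print(f"{name:7s} {describe(soft)}")
```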

We should also check memory parameters in the pfile/spfile that add more load to the memory consumption on the server.  In our issue, we found these settings which added more pressure to the memory:
-lock_sga=true
-a large db_keep_cache_size (14000m)
