1. 问题背景
事件背景:早上突然有同事反馈OA登录不了,面对这种情况出于DBA的职业本能一般都需要先确定下数据库运行是否正常,也就本能的查看了下数据库的状态,显示属于OPEN状态。第二步查看下数据库的alert日志,查看最近的时间点是否有报错的信息,经过不懈的努力终于发现了最近的时间点有个ORA-03137的错误;
ORACLE版本:11.0.2
平台:linux 32位
2. ORA-03137错误日志
2.1 alert日志内容
Fri Sep 27 08:11:31 2013
Errors in file /data/oracle/diag/rdbms/ekpj/ekpj/trace/ekpj_ora_31911.trc (incident=40971):
ORA-03137: TTC 协议内部错误: [12333] [32] [49] [51] [] [] [] []
Incident details in: /data/oracle/diag/rdbms/ekpj/ekpj/incident/incdir_40971/ekpj_ora_31911_i40971.trc
Fri Sep 27 08:11:34 2013
Trace dumping is performing id=[cdmp_20130927081134]
Fri Sep 27 08:11:35 2013
2.2 ora_31911.trc 的dump日志
*** 2013-09-27 08:11:32.047
dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
----- Current SQL Statement for this session (sql_id=35rzrx7bb4d3t) -----
select * from ( select kmreviewma0_.fd_id as fd1_1604_, kmreviewma0_.fd_last_modified_time as fd2_1604_, kmreviewma0_.doc_subject as doc3_1604_, kmreviewma0_.fd_current_number as fd4_1604_, kmreviewma0_.fd_feedback_modify as fd5_1604_, kmreviewma0_.fd_feedback_executed as fd6_1604_, kmreviewma0_.fd_number as fd7_1604_, kmreviewma0_.doc_creator_id as doc8_1604_, kmreviewma0_.doc_create_time as doc9_1604_, kmreviewma0_.fd_department_id as fd10_1604_, kmreviewma0_.doc_publish_time as doc11_1604_, kmreviewma0_.doc_read_count as doc12_1604_, kmreviewma0_.extend_file_path as extend13_1604_, kmreviewma0_.fd_use_form. as fd16_1604_, kmreviewma0_.doc_status as doc17_1604_, kmreviewma0_.auth_att_nodownload as auth18_1604_, kmreviewma0_.auth_att_nocopy as auth19_1604_, kmreviewma0_.auth_att_noprint as auth20_1604_, kmreviewma0_.auth_reader_flag as auth21_1604_, kmreviewma0_.fd_change_reader_flag as fd22_1604_, kmreviewma0_.fd_rbp_flag as fd23_1604_, kmreviewma0_.fd_change_att as fd24_1604_, kmreviewma0_.fd_model_name as fd25_1604_, kmreviewma0_.fd_model_id as fd26_1604_, kmreviewma0_.fd_work_id as fd27_1604_, kmreviewma0_.fd_phase_id as fd28_1604_, kmreviewma0_.fd_template_id as fd29_1604_, kmreviewma0_.auth_area_id as auth30_1604_ from ekpj.km_review_main kmreviewma0_ where kmreviewma0_.fd_id in (select kmreviewma1_.fd_id from ekpj.km_review_main kmreviewma1_, ekpj.km_review_main_areader authallrea2_, ekpj.sys_org_element sysorgelem3_ where kmreviewma1_.fd_id=authallrea2_.fd_doc_id and authallrea2_.auth_all_reader_id=sysorgelem3_.fd_id and 1=1 and kmreviewma1_.doc_creator_id=:1 and (sysorgelem3_.fd_id in (:2 , :3 , :4 , :5 , :6 , :7 , :8 , :9 , :10))) order by kmreviewma0_.doc_create_time desc, kmreviewma0_.fd_id desc ) where rownum
3. 错误原因分析
这个问题从来没有遇到过,那可怎么办?
方法很简单,先上网查询。
功夫不负有心人,很快就从网上找到了原因,具体说明如下:
该错误和网络上说是属于ORACLE的一个bug;
目前数据库最新版本是11.2.0.1.3
在11.2下同样也会出现ORA-03137,解决办法是:
在11.2.0.1.0下打patch9243912加设置_optim_peek_user_binds=false(设置这个参数可能影响CBO执行计划)
4. 解决方法
根据网上兄弟提供的解决方法,摘录如下:
4.1可以通过更换不同版本的JDBC驱动来避免该错误,也说明为什么同事在另外一套环境下,通过Weblogic的JDBC来访问数据库时,则不会遇到该错误;
4.2 给数据库打patch,初步认为可以通过打Patch:9703463 来解决;
4.3通过修改数据库参数来规避该错误:
修改之后,同事用之前的Tomcat那个版本的驱动来重新访问数据库时,则不再报错;
4.3直接升级数据库版本至11.2.0.3.0,通过导出导入的方式将刚项目组下的schema数据复制一份到一套11.2.0.3.0的库上,重新使用Tomcat那个版本的驱动来重新访问数据库时,亦不再报错。
5. 最终解决方案
虽然网上提供提供了多个解决方法,但是最终的解决方法还是需要根据各自的系统情况来进行选择,出现了问题得自己负责。
经过抉择先把这个参数_optim_peek_user_binds=false,主要考虑如下:第一该方法简单,简单就是王道;第二,该操作可逆,如果不是这个问题那么可以再把参数修改回来;
6. 附加操作
_optim_peek_user_binds=false属于隐藏参数,可以通过以下语句进行查询
SQL> select a.ksppinm name,b.ksppstvl value,a.ksppdesc description
2 from x$ksppi a,x$ksppcv b
3 where a.inst_id = USERENV ('Instance')
4 and b.inst_id = USERENV ('Instance')
5 and a.indx = b.indx
6 and upper(a.ksppinm) LIKE upper('%¶m%')
7 order by name
8 /
Enter value for param: _optim_peek_user_binds
old 6: and upper(a.ksppinm) LIKE upper('%¶m%')
new 6: and upper(a.ksppinm) LIKE upper('%_optim_peek_user_binds%')
NAME VALUE DESCRIPTION
-------------------------------- ------------------------ ----------------------------------
_optim_peek_user_binds TRUE enable peeking of user binds