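In an earlier article I warned that you must never update the Oracle database's global_name to an empty value. Sure enough, it happened. A colleague of mine suggested a fix, but that fix is essentially an incomplete recovery, so it does not work without a backup. Without a backup, you can use BBED to modify the block directly, but using BBED is still fairly cumbersome.
What follows is a test I ran on a whim. Test environment: 10.2.0.4 on Linux AS 5.5. Note: do not imitate this on a production database.
First, update GLOBAL_NAME to an empty value, commit, and shut the database down with abort; the abort is only there to make things a bit harder. Then start the database again.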
SQL> update global_name set global_name='';
1 row updated.
SQL> commit;
Commit complete.
SQL> shutdown abort
ORACLE instance shut down.
SQL> startup
ORACLE instance started.
Total System Global Area 209715200 bytes
Fixed Size 1266632 bytes
Variable Size 75500600 bytes
Database Buffers 130023424 bytes
Redo Buffers 2924544 bytes
Database mounted.
ORA-01092: ORACLE instance terminated. Disconnection forced
Startup failed and, as expected, ORA-600 [18062] errors appear:
Mon Sep 6 15:43:31 2010
Errors in file /oracle/app/oracle/admin/xty/udump/xty_ora_3149.trc:
ORA-00600: internal error code, arguments: [18062], [], [], [], [], [], [], []
QMNC started with pid=16, OS id=3151
Mon Sep 6 15:43:33 2010
Errors in file /oracle/app/oracle/admin/xty/udump/xty_ora_3149.trc:
ORA-00600: internal error code, arguments: [18062], [], [], [], [], [], [], []
Mon Sep 6 15:43:33 2010
Errors in file /oracle/app/oracle/admin/xty/udump/xty_ora_3149.trc:
ORA-00600: internal error code, arguments: [18062], [], [], [], [], [], [], []
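To confirm that the failure really is the [18062] signature and nothing else, the log can be summarized with a small filter. This is only a minimal sketch: the function name ora600_summary is made up for illustration, and it assumes the log lines have exactly the shape shown in the excerpt above.

```shell
# Read alert-log text on stdin and print "<first argument> <count>"
# for every distinct ORA-00600 first argument that appears.
# ora600_summary is a hypothetical helper name, not an Oracle tool.
ora600_summary() {
    # Keep only the ORA-00600 lines and reduce each to its first argument.
    sed -n 's/^ORA-00600: internal error code, arguments: \[\([^]]*\)].*/\1/p' |
        sort |
        uniq -c |
        awk '{ print $2 " " $1 }'
}

# Usage (the alert log path is whatever your environment uses):
#   ora600_summary < "$ALERT_LOG"
```

If the summary lists any argument other than 18062, the startup failure has another cause and the procedure in this article may not apply.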
Now let's fix the problem.
Step 1: restart the database to the MOUNT state:
[oracle@xty ~]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Mon Sep 6 15:43:47 2010
Copyright (c) 1982, 2007, Oracle. All Rights Reserved.
Connected to an idle instance.
SQL> startup mount
ORACLE instance started.
Total System Global Area 209715200 bytes
Fixed Size 1266632 bytes
Variable Size 75500600 bytes
Database Buffers 130023424 bytes
Redo Buffers 2924544 bytes
Database mounted.
Step 2: in another window, use gdb:
[oracle@xty ~]$ ps -ef | grep LOCAL
oracle 3186 3156 0 15:43 ? 00:00:00 oraclexty (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle 3188 2978 0 15:44 pts/3 00:00:00 grep LOCAL
[oracle@xty ~]$ gdb $ORACLE_HOME/bin/oracle 3186
GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-23.el5)
...(省略)...
Reading symbols from /oracle/app/oracle/product/10.2.0/bin/oracle...(no debugging symbols found)...done.
Attaching to program: /oracle/app/oracle/product/10.2.0/bin/oracle, process 3186
Reading symbols from /etc/libcwait.so...(no debugging symbols found)...done.
Loaded symbols for /etc/libcwait.so
Reading symbols from /oracle/app/oracle/product/10.2.0/lib/libskgxp10.so...(no debugging symbols found)...done.
Loaded symbols for /oracle/app/oracle/product/10.2.0/lib/libskgxp10.so
Reading symbols from /oracle/app/oracle/product/10.2.0/lib/libhasgen10.so...(no debugging symbols found)...done.
...(省略)...
Reading symbols from /oracle/app/oracle/product/10.2.0/lib/libnnz10.so...(no debugging symbols found)...done.
Loaded symbols for /oracle/app/oracle/product/10.2.0/lib/libnnz10.so
Reading symbols from /usr/lib/libaio.so.1...(no debugging symbols found)...done.
Loaded symbols for /usr/lib/libaio.so.1
Reading symbols from /lib/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /lib/libm.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/libm.so.6
Reading symbols from /lib/libpthread.so.0...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Loaded symbols for /lib/libpthread.so.0
Reading symbols from /lib/libnsl.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libnsl.so.1
Reading symbols from /lib/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/libnss_files.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/libnss_files.so.2
0x0025d402 in __kernel_vsyscall ()
(gdb) break kokiasg
Breakpoint 1 at 0xa3d404d
(gdb) continue
Continuing.
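In step 2, the process ID was picked out of the ps listing by eye. That lookup can be sketched as a small filter. The function name local_server_pids is hypothetical, and the sketch assumes the `ps -ef` column layout shown above, with the command name in the eighth column.

```shell
# Read `ps -ef` output on stdin and print the PID (2nd column) of every
# server process named "oracle<SID>" whose command line contains
# "(LOCAL=YES)", i.e. one spawned by a local sqlplus connection.
# local_server_pids is a hypothetical helper name.
local_server_pids() {
    sid=$1
    awk -v name="oracle$sid" '$8 == name && /\(LOCAL=YES\)/ { print $2 }'
}

# Usage, mirroring the manual session above (SID xty is from this test):
#   pid=$(ps -ef | local_server_pids xty)
#   gdb "$ORACLE_HOME/bin/oracle" "$pid"
```

If several local sessions are connected, the function prints several PIDs. Make sure you attach to the one that belongs to the sqlplus session that will run the OPEN.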
Step 3: OPEN the database:
SQL> alter database open;
After a short wait, the alert log shows:
Mon Sep 6 15:44:41 2010
SMON: enabling cache recovery
SMON: enabling tx recovery
Mon Sep 6 15:44:41 2010
Database Characterset is ZHS16GBK
And the gdb output shows:
Breakpoint 1, 0x0a3d404d in kokiasg ()
Step 4: abort the OPEN from gdb:
(gdb) kill
Kill the program being debugged? (y or n) y
(gdb) quit
sqlplus reports:
alter database open
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
This time the instance was not terminated; only the server process was killed.
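One quick way to check from the operating system that the instance survived is to look for its PMON background process, which on Unix platforms is named ora_pmon_<SID>. Below is a minimal sketch. The function name instance_alive is hypothetical, and it again assumes the `ps -ef` layout with the command name in the eighth column.

```shell
# Read `ps -ef` output on stdin; succeed (exit status 0) if a PMON
# background process for the given SID is listed, fail (1) otherwise.
# instance_alive is a hypothetical helper name.
instance_alive() {
    sid=$1
    awk -v name="ora_pmon_$sid" '$8 == name { found = 1 } END { exit !found }'
}

# Usage (SID xty is from this test):
#   if ps -ef | instance_alive xty; then echo "instance still running"; fi
```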
Step 5: restore GLOBAL_NAME:
[oracle@xty ~]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Mon Sep 6 15:45:09 2010
Copyright (c) 1982, 2007, Oracle. All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> select status from v$instance;
STATUS
------------
OPEN
SQL> update global_name set global_name='XTY';
1 row updated.
SQL> commit
2 ;
commit
*
ERROR at line 1:
ORA-01109: database not open
The UPDATE runs, but the COMMIT fails. Let's see whether DDL works:
SQL> create table t1 ( a int);
Table created.
It worked.
Check GLOBAL_NAME from another session:
SQL> select * from global_name;
GLOBAL_NAME
--------------------------------------------------------------------------------
XTY
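GLOBAL_NAME is back.
Here it is the implicit commit performed by DDL that commits the UPDATE of GLOBAL_NAME. There is actually an even simpler way: the automatic commit that happens when an OCI client disconnects cleanly. If you simply exit sqlplus after the UPDATE, the transaction that updated GLOBAL_NAME is committed as well. It seems that at this point Oracle only refuses to execute an explicit COMMIT statement.
Step 6: restart the database: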
SQL> shutdown immediate
ORA-00604: error occurred at recursive SQL level 1
ORA-01109: database not open
SQL> shutdown abort
ORACLE instance shut down.
SQL> startup
ORACLE instance started.
Total System Global Area 209715200 bytes
Fixed Size 1266632 bytes
Variable Size 75500600 bytes
Database Buffers 130023424 bytes
Redo Buffers 2924544 bytes
Database mounted.
Database opened.
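As the output above shows, it looks like the database can only be shut down with abort.
But it worked: the database is up. No data was lost and no backup was used. No point-in-time recovery, no BBED, no resetlogs.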