Friday 8 July 2016

ORA-08102 due to data inconsistency between JOB$ and index I_JOB_NEXT

These errors flooded the database alert log. Thankfully, Oracle includes the object number (obj# or object id) in the message, which made it easy to locate the affected object.

This simple SQL statement led me to the object.
(290 is the obj# shown in my alert log.)
SQL> select owner, object_name, object_type, status from dba_objects where object_id=290;
OWNER      OBJECT_NAME     OBJECT_TYPE     STATUS
---------- --------------- --------------- --------
SYS        I_JOB_NEXT      INDEX           VALID


I checked further to determine the table on which the index is built.

SQL> select table_name, table_owner from dba_indexes where index_name='I_JOB_NEXT';
TABLE_NAME      TABLE_OWNER
--------------- ---------------
JOB$            SYS

The next step is to analyze the table and its indexes to determine whether any corruption exists.

SQL> analyze table  JOB$ validate structure cascade;
analyze table JOB$ validate structure cascade
*
ERROR at line 1:
ORA-01499: table/index cross reference failure - see trace file
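
ORA-01499 refers to a trace file that holds the row/index mismatch details. A quick way to locate that file (a sketch, assuming 11g or later with ADR, run from the same session that issued the ANALYZE) is:

SQL> select value from v$diag_info where name = 'Default Trace File';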

Solution:
Rebuild the data in job$: create a backup copy of the table, truncate job$, and reload the rows from the backup.

1. SQL> create table job1 as select * from job$;
2. SQL> truncate table job$;
3. SQL> insert into job$ select * from job1;
4. SQL> commit;
5. SQL> analyze table job$ validate structure cascade;
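
Before dropping the backup table job1, a quick row-count comparison (a simple sanity check) confirms the reload brought everything back:

SQL> select (select count(*) from job1) backup_rows,
            (select count(*) from job$) reloaded_rows
       from dual;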

The errors stopped after a successful validation.

In other cases, the errors may persist even after following the above steps; in that case, the indexes should be rebuilt.

Rebuild the two indexes on table job$:

SQL> alter index I_job_job rebuild online;
SQL> alter index I_job_next rebuild online;
SQL> analyze table job$ validate structure cascade;
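
After the rebuild, the index status can be confirmed from DBA_INDEXES (a quick check):

SQL> select index_name, status from dba_indexes where table_name = 'JOB$';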


ORA-00600: internal error code, arguments: [4194]

These errors appeared constantly in the database alert log at the point of opening the database.

After reviewing the alert log and the corresponding trace files, I found that the errors were due to undo corruption.

The database is a copy created using storage-level snapshot technology, which probably explains the discrepancy between the redo and undo records: Oracle validates the undo block sequence number in the undo block against the redo block sequence number of the change being applied, and here it found a mismatch.
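
If the database stays open long enough to query, a look at the undo segment states can support the diagnosis (a sketch; segments stuck in NEEDS RECOVERY point at damaged undo):

SQL> select segment_name, status from dba_rollback_segs where status = 'NEEDS RECOVERY';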

Solution:

The solution I applied was to recreate the undo tablespace.

1. Create a pfile from the spfile.
2. In the pfile, change UNDO_MANAGEMENT to MANUAL.
undo_management = manual
3. Start the database in restricted mode using the modified pfile.
SQL> startup restrict pfile=<new_pfile>;
4. Create a new UNDO tablespace.
SQL> create undo tablespace <new undo tablespace> datafile <datafile> size XXXM;
5. Drop the original undo tablespace.
SQL> drop tablespace <original undo tablespace> including contents and datafiles;
6. Shut down the database.
SQL> shut immediate;
7. Startup mount the database using the original spfile.
SQL> startup mount;
8. Point the spfile at the new undo tablespace (the MANUAL setting lived only in the pfile, so the spfile still has automatic undo management).
SQL> alter system set undo_tablespace = '<new tablespace created in step 4>' scope=spfile;
9. Shut down the database.
SQL> shut immediate;
10. Start up the database with the modified spfile.
SQL> startup;
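
Once the database is back up, a quick check (a sketch) confirms the instance is using the new undo tablespace:

SQL> show parameter undo_tablespace
SQL> select tablespace_name, status, contents from dba_tablespaces where contents = 'UNDO';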



Thursday 23 July 2015

ORA-00392: log 1 of thread 1 is being cleared, operation not allowed, ORA-00312: online log 1 thread 1: '+DATA'

The above error occurred during a test restore of a RAC database (ASM-based) to a single-instance, filesystem-based database.

This was encountered at the point of opening the database after a successful restore of data files and necessary archive logs.

SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-00392: log 1 of thread 1 is being cleared, operation not allowed
ORA-00312: online log 1 thread 1: '+DATA'

Solution:

Check the redo log creation destination parameters.

SQL> show parameter create

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
create_bitmap_area_size              integer     8388608
create_stored_outlines               string
db_create_file_dest                  string      +DATA
db_create_online_log_dest_1          string
db_create_online_log_dest_2          string
db_create_online_log_dest_3          string
db_create_online_log_dest_4          string
db_create_online_log_dest_5          string

Set these parameters to filesystem locations so that new redo logs are created in the appropriate destination.

SQL> alter system set db_create_file_dest='/filesystem_location/oradata';

System altered.

SQL> alter system set db_create_online_log_dest_1='/filesystem_location/onlinelog';

System altered.

SQL> alter system set db_create_online_log_dest_2='/filesystem_location/onlinelog';

System altered.

Check status of the affected redo group #1.


SQL> select status from v$log where GROUP#=1;
STATUS 
----------------
CLEARING_CURRENT

SQL> alter database clear unarchived logfile group 1;
Database altered.
 
 
SQL> select group#, status from v$log where GROUP#=1;

    GROUP# STATUS
---------- ----------------
         1 CURRENT
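
Group 1 was the only affected group here; if several groups are stuck in a CLEARING state, a small statement generator (a sketch) saves some typing:

SQL> select 'alter database clear unarchived logfile group ' || group# || ';' as cmd
       from v$log
      where status like 'CLEARING%';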

Open the database

SQL> alter database open resetlogs;

Database altered.

SQL> select open_mode from v$database;

OPEN_MODE
--------------------
READ WRITE
 
 

Tuesday 21 July 2015

'WARNING: Subscription for node down event still pending' in Listener Log

This is a single-instance 11.2.0.3 database running on a RHEL box.

This is related to the Oracle TNS Listener's default subscription to the Oracle Notification Service (ONS). It is recommended to disable this subscription in a non-RAC environment.

The workaround is to modify the listener.ora file to include the following:

SUBSCRIBE_FOR_NODE_DOWN_EVENT_<listener_name>=OFF
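
For example, for a listener named LISTENER (a placeholder; substitute your own listener name), the listener.ora entry would read:

SUBSCRIBE_FOR_NODE_DOWN_EVENT_LISTENER = OFF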


Then reload or restart the listener.


$ lsnrctl reload  <listener_name>


or


$ lsnrctl stop <listener_name>

$ lsnrctl start <listener_name>

ORA-00060: deadlock detected while waiting for resource. Global Enqueue Services Deadlock detected. More info in file

Environment:

2-node RAC 11.2.0.3.

The following messages are constantly showing up in the alert log.

ORA-00060: deadlock detected while waiting for resource


Tue Jul 21 11:29:29 2015
Global Enqueue Services Deadlock detected. More info in file
 /apps/oracle/diag/rdbms/testdb/testdb2/trace/testdb2_lmd0_21823642.trc.
Tue Jul 21 11:29:40 2015
Global Enqueue Services Deadlock detected. More info in file
 /apps/oracle/diag/rdbms/testdb/testdb2/trace/testdb2_lmd0_21823642.trc.
Tue Jul 21 11:32:12 2015
Global Enqueue Services Deadlock detected. More info in file
 /apps/oracle/diag/rdbms/testdb/testdb2/trace/testdb2_lmd0_21823642.trc.
Tue Jul 21 11:33:03 2015
Global Enqueue Services Deadlock detected. More info in file
 /apps/oracle/diag/rdbms/testdb/testdb2/trace/testdb2_lmd0_21823642.trc.
Tue Jul 21 11:33:29 2015
Global Enqueue Services Deadlock detected. More info in file
 /apps/oracle/diag/rdbms/testdb/testdb2/trace/testdb2_lmd0_21823642.trc.
Tue Jul 21 11:34:22 2015
Global Enqueue Services Deadlock detected. More info in file
 /apps/oracle/diag/rdbms/testdb/testdb2/trace/testdb2_lmd0_21823642.trc.
Tue Jul 21 11:35:04 2015
Global Enqueue Services Deadlock detected. More info in file
 /apps/oracle/diag/rdbms/testdb/testdb2/trace/testdb2_lmd0_21823642.trc.


Checking the specified LMD trace file revealed the locks, the SQL text, and other information.

Note: LMDx - The Global Enqueue Service Daemon (LMD) is the lock agent process that manages enqueue manager service requests for Global Cache Service enqueues to control access to global enqueues and resources. The LMD process also handles deadlock detection and remote enqueue requests. Remote resource requests are the requests originating from another instance.

Tracefile "/apps/oracle/diag/rdbms/testdb/testdb2/trace/testdb2_lmd0_21823642.trc" shows the following:

Submitting asynchronized dump request [28]. summary=[ges process stack dump (kjdglblkrdm1)].
Global blockers dump end:-----------------------------------
Global Wait-For-Graph(WFG) at ddTS[0.a5c4] :
BLOCKED 0x7000001cbfccba8 5 wq 2 cvtops x1 TX 0x1470008.0x72e(ext 0x5,0x0)[2009-0094-000001E3] inst 2
BLOCKER 0x7000001c60cd8b8 5 wq 1 cvtops x28 TX 0x1470008.0x72e(ext 0x5,0x0)[2010-010B-00000008] inst 2
BLOCKED 0x7000001c7267ed8 5 wq 2 cvtops x1 TX 0x88000e.0x3578(ext 0x5,0x0)[2010-010B-00000008] inst 2
BLOCKER 0x7000001c7267d08 5 wq 1 cvtops x28 TX 0x88000e.0x3578(ext 0x5,0x0)[2009-0094-000001E3] inst 2

*** 2015-07-21 12:13:52.792



From the above, it is clear that the two sessions involved in TX 0x1470008.0x72e and TX 0x88000e.0x3578 form a deadlock; both sessions are on instance 2.
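
If the pattern is still recurring, the TX enqueues involved can be watched live from GV$LOCK (a sketch; it only returns rows while the blocking is actually in progress):

SQL> select inst_id, sid, type, id1, id2, lmode, request, block
       from gv$lock
      where type = 'TX'
        and (block = 1 or request > 0)
      order by inst_id, id1, id2;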

The trace file also contains the following:

 user session for deadlock lock 0x7000001b1ff2b68
  .........
  current SQL:
  delete from DATA where CLEANUP <= :1 and rownum <= :2 



Oracle states that "this is a typical application transaction TX enqueue lock, usually caused by SQL commit sequence and high concurrency. To avoid such deadlock, application code and logic need to be modified."

This information was communicated to the application team for further investigation.

Thursday 16 July 2015

RMAN-03002: failure of restore command at 07/16/2015 20:08:00 RMAN-06004: ORACLE error from recovery catalog database: RMAN-20005: target database name is ambiguous

RMAN-03002: failure of restore command at 07/16/2015 20:08:00
RMAN-06004: ORACLE error from recovery catalog database: RMAN-20005: target database name is ambiguous

The above error was encountered while trying to restore a control file to a different host.

Solution:

Set the DBID to that of the source database and attempt the restore again.
syntax: RMAN> set DBID=2378162372
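
On the source database, the DBID can be read from V$DATABASE (a quick check):

SQL> select dbid, name from v$database;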

$ rman target / catalog rman/XXXXX@CATDB msglog=/tmp/controlfile_restore16jul2015.log

RMAN> set DBID=2378162372
RMAN> run
{
  allocate channel CH01 type 'sbt_tape';
  allocate channel CH02 type 'sbt_tape';
  allocate channel CH03 type 'sbt_tape';
  send 'NB_ORA_POLICY=hostname-XXX-HOT-WEEKLY,NB_ORA_SERV=nbumaster,NB_ORA_CLIENT=hostname,NB_ORA_SCHED=Default-Application-Backup';
  restore controlfile from tag TAG20150716T191835;
  release channel CH01;
  release channel CH02;
  release channel CH03;
}



Wednesday 15 July 2015

Upgrading the RMAN catalog schema to support 12c databases

Existing infrastructure: the recovery catalog database is 11.2.0.3. We have just deployed a 12.1.0.1.0 database, and it needs to be included in the backup infrastructure immediately.

An attempt was made to register the 12c database. This was met with errors relating to the version of the catalog schema.

Connecting from the 12c database:

[oracle@mytestbox]$ rman target / catalog rman/xxxxxxx@CATDB

Recovery Manager: Release 12.1.0.1.0 - Production on Mon Jun 22 09:34:59 2015

Copyright (c) 1982, 2013, Oracle and/or its affiliates. All rights reserved.

connected to target database: CDB12CR1 (DBID=1111111111)
connected to recovery catalog database
PL/SQL package RMAN.DBMS_RCVCAT version 11.02.00.02 in RCVCAT database is too old

RMAN> register database;

PL/SQL package RMAN.DBMS_RCVCAT version 11.02.00.02 in RCVCAT database is too old
PL/SQL package RMAN.DBMS_RCVCAT version 11.02.00.02 in RCVCAT database is too old
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of register command at 06/22/2015 09:35:19
RMAN-06429: RCVCAT database is not compatible with this version of RMAN


Solution:

We need to upgrade the RMAN owner schema on the catalog database.

1. Check the current version of the catalog schema in the catalog database.

$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Wed Jul 15 15:46:47 2015

Copyright (c) 1982, 2011, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> select * from rman.rcver;

VERSION
------------
11.02.00.03

2. Connect to the 12c database and the catalog database, then upgrade the catalog (the UPGRADE CATALOG command must be entered twice, the second time to confirm).

$ rman target / catalog rman/xxxxxx@CATDB

Recovery Manager: Release 12.1.0.1.0 - Production on Wed Jul 15 14:23:52 2015

Copyright (c) 1982, 2013, Oracle and/or its affiliates.  All rights reserved.

connected to target database: SVWPRD1B (DBID=1111111111)
connected to recovery catalog database
PL/SQL package RMAN.DBMS_RCVCAT version 11.02.00.03 in RCVCAT database is too old

RMAN>

RMAN>

RMAN>

RMAN> register database;

PL/SQL package RMAN.DBMS_RCVCAT version 11.02.00.03 in RCVCAT database is too old
PL/SQL package RMAN.DBMS_RCVCAT version 11.02.00.03 in RCVCAT database is too old
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of register command at 07/15/2015 14:27:59
RMAN-06429: RCVCAT database is not compatible with this version of RMAN

RMAN> upgrade catalog;

recovery catalog owner is RMAN
enter UPGRADE CATALOG command again to confirm catalog upgrade

RMAN> upgrade catalog;

recovery catalog upgraded to version 12.01.00.01
DBMS_RCVMAN package upgraded to version 12.01.00.01
DBMS_RCVCAT package upgraded to version 12.01.00.01

RMAN>

RMAN>

RMAN> register database;

database registered in recovery catalog
starting full resync of recovery catalog
full resync complete

RMAN>

3. Check the version of the catalog schema again.

$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Wed Jul 15 15:46:47 2015

Copyright (c) 1982, 2011, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> select * from rman.rcver;

VERSION
------------
12.01.00.01
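
As a final check, the newly registered 12c database should also appear in the catalog's RC_DATABASE view (a quick query, run in the catalog database):

SQL> select db_key, dbid, name from rman.rc_database;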