Friday 8 July 2016

ORA-8102 due to data inconsistent between job$ and index I_JOB_NEXT

These errors flooded the database alert log. Thankfully, Oracle was kind enough to specify the object number (obj# or object id) which permitted ease of locating the "challenged" object.

This simple SQL statement lead me to the object.
(290 is the one shown in my alert log.)
SQL> select object_name, object_type, status from dba_objects where object_id='290';
 OWNER
-------------------------------------
OBJECT_NAME
------------------------------------------------------------------------------------
OBJECT_TYPE                                      STATUS
-----------------------------------------------  ------------
SYS
I_JOB_NEXT
INDEX                                                    VALID


I checked further to determine the table which the index is built on.

SQL> select table_name, table_owner from dba_indexes where index_name='I_JOB_NEXT';
TABLE_NAME                                      TABLE_OWNER
------------------------------------  -------------------------------------
JOB$                                                     SYS

Next is to analyze the table/index to determine if there is a sort of corruption

SQL> analyze table  JOB$ validate structure cascade;
analyze table JOB$ validate structure cascade
*
ERROR at line 1:
ORA-01499: table/index cross reference failure - see trace file

Solution:
Recreate the table job$ by create a backup table and reload the data into it.
1.SQL> create table job1 as select * from job$;
2.SQL> truncate table job$;
3.SQL> insert into job$ select * from job1;
4. SQL> commit;
5. SQL> analyze table job$ validate structure cascade;

The errors stoped after a successful validation.

In other cases, one may still get the error after following the above steps. In that case the indexes should be recreated.

Rebuild the two indexes on table job$

SQL> alter index I_job_job rebuild online;
SQL> alter index I_job_next rebuild online;
SQL> analyze table job$ validate structure cascade;


ORA-00600: internal error code, arguments: [4194]

These errors constantly appeared in the database alert log at the point of opening a database.

After reviewing the alert log and corresponding trace files. I found that the errors are due to UNDO corruption.

The database is a copy created using storage level snapshot technology. This probably explains the discrepancy between the redo and undo records. Oracle was validating the Undo block sequence number in the undo block against the Redo block sequence number relating to the change being applied and it found a mismatch.

Solution:

The solution I applied was to recreate the undo tablespace.

1. Create pfile from spfile.
2. Change UNDO_MANAGEMENT to MANUAL.
undo_management = manual
3. Start the database using the modified PFILE.
SQL> startup restrict pfile=<new_pfile>;
4. Create a new UNDO tablespace.
SQL> create undo tablespace <new undo tablespace> datafile <datafile> size XXXM;
4. Drop the original tablespace.
SQL> drop tablespace <originial undo tablespace> including contents and datafiles;
5. Shut down the database.
SQL> shut immediate;
6. Startup mount the database using the original spfile.
SQL> startup mount;
7. modify the spfile with the new undo tablespace name
SQL> alter system set undo_tablespace = '<new tablespace created in step 5>' scope=spfile;
8. shut down the database.
SQL> shut immediate;
9. Startup the database with modified spfile.
SQL> startup;