Beware the Orphan Streams Capture Effect on Archivelogs

Recently, I ran into an issue where Oracle Recovery Manager (RMAN) would not delete old database archivelogs.  I have had this happen before when there was a lag condition with either a Data Guard standby or a GoldenGate extract process.  Upon further investigation, I found that RMAN was issuing the following error, and it looked just like the other times I have encountered this issue:

RMAN-08137: WARNING: archived log not deleted, needed for standby or upstream capture process
archived log file name=+RECOC1/……
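
For context, this warning shows up during routine archivelog maintenance.  A delete command along the following lines is what triggers it (a generic example on my part, not the exact command from this environment, and the three-day window is arbitrary):

RMAN> DELETE NOPROMPT ARCHIVELOG ALL COMPLETED BEFORE 'SYSDATE-3';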

This error repeated many times, representing a ton of space being consumed in the DB_RECOVERY_FILE_DEST.  To begin diagnosing the problem (this also happens to be a Data Guard environment), I went through the normal motions of ensuring proper application of logs to the physical standby environments:

/* On Primary */
SELECT * FROM v$dataguard_stats;
/* To determine which logs are not shipped and applied yet */
SELECT *
FROM v$archived_log
WHERE DEST_ID <> 1
AND APPLIED = 'NO'
ORDER BY completion_time;
/* Run on both source and standby to compare the last logs that were applied */
SELECT THREAD#, MAX(SEQUENCE#) AS "LAST_APPLIED_LOG"
FROM V$LOG_HISTORY
GROUP BY THREAD#;
/* On Primary */
SELECT *
FROM
(SELECT *
FROM DBA_HIST_SYSMETRIC_HISTORY
WHERE metric_name='Redo Generated Per Sec'
ORDER BY snap_id DESC
)
WHERE rownum<=10;
/* On Primary */
SELECT current_scn FROM v$database;
/* On Standby */
SELECT current_scn FROM v$database;
/* On Primary - To determine actual time difference */
SELECT SCN_TO_TIMESTAMP(<SCN of PRIMARY>), SCN_TO_TIMESTAMP(<SCN of STANDBY>) FROM dual;
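
While working through the checks above, it is also worth a quick look at how RMAN itself is configured to treat logs needed by a standby.  This is my own sanity check rather than part of the original diagnosis, run from an RMAN session on the primary:

RMAN> SHOW ARCHIVELOG DELETION POLICY;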

After reviewing Data Guard performance, it was clear that Data Guard itself was not preventing RMAN from removing the archivelogs.  In addition to Data Guard being present, this system is in the process of being migrated from an older system via GoldenGate.  I have seen GoldenGate cause this issue before when an extract was registered with “log retention”, but this is the target system, so that isn’t possible, RIGHT?  Just to be sure, I ran a query to see whether anything was causing the database to think a GoldenGate object was registered with it:

/* On Primary - Where archivelogs are being held */
SELECT CAPTURE_NAME, START_TIME, STATUS FROM dba_capture;


CAPTURE_NAME
---------------------
OGG2$_EECCP2_12C415F04

START_TIME
-------------------------------------------
02-NOV-15 06.47.41.000000000 AM

STATUS
----------
DISABLED

Hey, wait a minute, we do not have a GoldenGate extract or any other registered GoldenGate objects running on this database, so why is this entry here?  I then ran the same query on my source server, and there it was: the same object on the source.
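
If you want to see just how far back an orphaned capture like this pins the redo stream, DBA_CAPTURE also exposes the required checkpoint SCN.  This extra check is my own addition and was not part of the original troubleshooting:

/* On the database holding the archivelogs */
SELECT CAPTURE_NAME,
       STATUS,
       REQUIRED_CHECKPOINT_SCN
FROM dba_capture;

Any archivelog containing redo at or after that SCN is protected from deletion, which is exactly why the warning above keeps repeating.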

So how did this happen?

It all points back to the initial Data Pump export that we used to instantiate the database before turning on GoldenGate.  We used a ‘FULL’ Data Pump export which was taken AFTER the extract was started on the source.  Because of this, the export also contained the capture objects used to register the extract with ‘LOGRETENTION’.

Solution….

In order to remedy this situation, we need to completely remove this ‘orphaned’ capture object from the database.  To do this, we use the DBMS_STREAMS_ADM package.  Stick with the package defaults so that an error is raised if you try to delete the wrong queue.
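
One note before running it: DBMS_STREAMS_ADM.REMOVE_QUEUE takes the name of the queue, not the capture process, so first confirm which queue backs the orphaned capture.  This lookup is my own addition to the original steps:

/* On Primary - identify the queue behind the orphaned capture */
SELECT CAPTURE_NAME, QUEUE_OWNER, QUEUE_NAME FROM dba_capture;

With the queue identified, remove it: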


exec DBMS_STREAMS_ADM.REMOVE_QUEUE('<QUEUE_NAME>');

You should now re-execute the query against dba_capture, and the queue you just deleted should no longer be there:


/* On Primary - Where archivelogs are being held */
SELECT CAPTURE_NAME, START_TIME, STATUS FROM dba_capture;

no rows selected

From this point on, your archivelogs should not be required for any “standby or upstream capture process,” and RMAN should now delete your backed-up archivelogs, freeing space in your DB_RECOVERY_FILE_DEST!
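
As a final sanity check (my own habit, not something from the original incident), re-run an archivelog deletion in RMAN and confirm that the RMAN-08137 warnings are gone.  The example below assumes archivelogs are backed up to tape via SBT; swap in DISK, or whatever clause matches your retention policy:

RMAN> DELETE NOPROMPT ARCHIVELOG ALL BACKED UP 1 TIMES TO DEVICE TYPE SBT;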