Series: Be a Problem Solver Not a DBA #1

From time to time, I’m going to publish scenarios from my career where being a DBA didn’t mean being a DBA; it meant being a problem solver.

In my opinion being a good problem solver requires following several basic tenets, one of which is:

“When presented with competing hypotheses to solve a problem, one should select the solution with the fewest assumptions.” – William of Ockham

Having done many migrations throughout my career, I have learned that performing database migrations is much like a “Reality TV” script. Everything starts out with a plan, the plan is executed and usually, with days to go, a big risk appears that jeopardizes the project, only for it all to be figured out in the end with a successful migration. A recent migration was no different; this time, however, it was a perfect example of how to be a Problem Solver, not a DBA.

The purpose of this post is not to fully explain problem-solving methods; it is to discuss going outside your comfort zone as a DBA and looking at items that you may not normally look at. In this case, I know enough about Java and Java coding to be dangerous, but I also knew that the other resources looking at this weren’t going to solve the problem (per the vendor, there was only one person on the planet who could solve this and they were on a bus in Italy), so I had to take things into my own hands.

A little background:

This particular migration was an upgrade from 11.2.0.4 to 12.2.0.1 on a SuperCluster. About a week or so out, I saw a very high spike in memory utilization to the point where the system was out of memory. Upon investigation, we found out that their scheduling agent was utilizing a newer version of Java and in turn, using 4x more heap space than the previous version of Java.

Digging further, I found that the process was not being invoked with either the -Xms or -Xmx flag. So what changed between Java versions to cause the increased utilization?

Since we did not own that portion of the application, the issue was transferred to the responsible party to troubleshoot. After several days of no movement, I decided to put my “Problem Solving” hat on.

Using the tenets of problem solving, follow a logical path:
After lots of searching, I checked what each Java version uses for minimum and maximum heap by default. There was a big change between the old and new versions. For example, the old version used:

java -XX:+PrintFlagsFinal -version | grep HeapSize
    uintx ErgoHeapSizeLimit               = 0               {product}
    uintx InitialHeapSize                := 126556928       {product}
    uintx LargePageHeapSizeThreshold      = 134217728       {product}
    uintx MaxHeapSize                    := 2025848832      {product}
java version "1.6.0_45"
Java(TM) SE Runtime Environment (build 1.6.0_45-b06)
Java HotSpot(TM) 64-Bit Server VM (build 20.45-b01, mixed mode)

While the new version used:

java -XX:+PrintFlagsFinal -version | grep HeapSize
    uintx ErgoHeapSizeLimit               = 0              {product}
    uintx HeapSizePerGCThread             = 87241520       {product}
    uintx InitialHeapSize                := 2147483648     {product}
    uintx LargePageHeapSizeThreshold      = 134217728      {product}
    uintx MaxHeapSize                    := 32210157568    {product}
java version "1.8.0_172"
Java(TM) SE Runtime Environment (build 1.8.0_172-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.172-b11, mixed mode)
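To make the difference easier to see, the byte counts can be converted to MiB. The following is a minimal shell sketch; the function name heap_flags_mib is my own invention, and it only reformats -XX:+PrintFlagsFinal output shaped like the listings above.

```shell
# Hypothetical helper: read `java -XX:+PrintFlagsFinal -version` output on
# stdin and print the initial and maximum heap sizes in MiB (rounded down).
heap_flags_mib() {
  awk '$2 == "InitialHeapSize" || $2 == "MaxHeapSize" {
         # Field 4 is the value in bytes; 1 MiB = 1048576 bytes.
         printf "%s %d MiB\n", $2, int($4 / 1048576)
       }'
}

# Usage (assumes the java binary of interest is first in PATH):
#   java -XX:+PrintFlagsFinal -version 2>/dev/null | heap_flags_mib
```

Fed the two listings above, this reports 120 MiB initial and 1932 MiB maximum for 1.6.0_45, against 2048 MiB initial and 30718 MiB maximum for 1.8.0_172.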

 

Ultimately, the solution was to add the -Xms and -Xmx flags to the program invoking the Java process so that it does not rely on the environment defaults. In addition, assigning only the memory that you need avoids wasting infrastructure resources and reduces the time to start and stop the Java process.
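As an illustration only, a launcher that pins the heap might look like the sketch below. The variable names, the 256m and 1g values, and the jar path are all placeholders of mine, not the vendor’s actual settings; the right sizes have to come from observing the process.

```shell
# Sketch of a launcher that pins the heap instead of inheriting JVM defaults.
# HEAP_MIN, HEAP_MAX and the jar path are placeholders, not the vendor's values.
HEAP_MIN="${HEAP_MIN:-256m}"
HEAP_MAX="${HEAP_MAX:-1g}"

# Print the command line that would be run, so it can be reviewed first.
agent_java_cmd() {
  printf 'java -Xms%s -Xmx%s -jar %s\n' "$HEAP_MIN" "$HEAP_MAX" "$1"
}

agent_java_cmd /path/to/agent.jar
```

Printing the command rather than executing it keeps the sketch safe to try; once the values look right, the same line can be placed in whatever script starts the agent.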

And as part of any problem-solving exercise, focus from the bottom of the stack up, especially when multiple changes are in play. In this case, the path with the fewest assumptions centered on the changed Java version, so that’s where I focused my effort.


Oracle 12 Non-CDB to PDB Migration Methods

Continuing my effort to finish blog posts that have been on my list for a while, here is a review of the methods for migrating a Non-Container Database (Non-CDB) to a Container Database (CDB) with Pluggable Databases (PDBs).

I wish this was easier and Oracle gave you a complete “in-place” method to do this, but the only way to get a Non-CDB (pre-12c database) to a CDB is to create a new database and migrate the Non-CDB into the new CDB as a PDB.

I will assume that the creation of the new CDB is complete. So let’s look at some methods of creating / migrating the PDB and their pros and cons:

In either case, you need to start with creating the XML file which will describe the new PDB:

On the Non-CDB:

	SHUTDOWN IMMEDIATE;
	startup mount exclusive;
	alter database open read only;

Create the PDB Describe XML File:

     BEGIN
       DBMS_PDB.DESCRIBE(
          pdb_descr_file => '/export/home/oracle/nonCDBToPDB.xml');
     END;
     /
     shutdown immediate;

Verify the XML File on the new CDB:

     SET SERVEROUTPUT ON
     DECLARE
         hold_var boolean;
     begin
          hold_var := DBMS_PDB.CHECK_PLUG_COMPATIBILITY(pdb_descr_file=>'/export/home/oracle/nonCDBToPDB.xml');
     if hold_var then
          dbms_output.put_line('YES');
     else
          dbms_output.put_line('NO');
     end if;
     end;
     /

Check for errors:

   set lines 300
   col cause for a25
   col message for a150
   set pagesize 9999
   select name,cause,action,type,message,status from PDB_PLUG_IN_VIOLATIONS;

Before you can proceed, all errors contained in PDB_PLUG_IN_VIOLATIONS must be resolved. Next are 2 of the 3 methods to migrate a Non-CDB to a PDB. I will leave the “COPY” method out of this post as it is not feasible to move any of the databases I deal with on a day-to-day basis using “COPY”.

Create the PDB using “NOCOPY”. While “NOCOPY” is the fastest, it could be the most problematic long term: it leaves all datafiles where they came from, and since the new CDB is likely to be on the same host, the naming differences could be confusing at some point. Nonetheless, for demonstration, the command is quite easy:

     CREATE PLUGGABLE DATABASE devpdb USING '/export/home/oracle/nonCDBToPDB.xml' NOCOPY TEMPFILE REUSE;

Based on my testing, the method I liked the most was the ‘MOVE’ option. To some, this may seem invasive, but for my environments this was the best option because the new file names are also corrected to contain the correct OMF path names based on the new CDB. While this method wasn’t as fast as NOCOPY, in my 18TB environment with 1200 datafiles, this command finished in just over 30 minutes. Pretty acceptable in my book:

     CREATE PLUGGABLE DATABASE devpdb USING '/export/home/oracle/nonCDBToPDB.xml' MOVE TEMPFILE REUSE;
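Since the two statements differ only in the file-handling keyword, a small generator can help avoid typos when scripting the plug-in for several databases. This is a sketch under my own assumptions: gen_create_pdb is a hypothetical helper that only prints the statement, and feeding its output to SQL*Plus is left to you.

```shell
# Hypothetical helper: print the plug-in statement for a given PDB name,
# manifest path and file method. It only generates text; nothing is executed.
gen_create_pdb() {
  pdb_name=$1
  xml_file=$2
  method=$3
  case "$method" in
    COPY|NOCOPY|MOVE) ;;
    *) echo "unsupported method: $method" >&2; return 1 ;;
  esac
  printf "CREATE PLUGGABLE DATABASE %s USING '%s' %s TEMPFILE REUSE;\n" \
    "$pdb_name" "$xml_file" "$method"
}

gen_create_pdb devpdb /export/home/oracle/nonCDBToPDB.xml MOVE
```

Rejecting anything other than the three known keywords up front means a mistyped method fails in the shell rather than partway through a plug-in attempt.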

Finishing off the migration to the new PDB is the same regardless of the migration method:

Verify the PDB:

	select name,guid, open_mode from v$pdbs;
	col pdb_name for a15
	select pdb_name, status from dba_pdbs;

Clean up the PDB. This by far was one of the longest operations of the whole conversion:

	alter session set container=devpdb;
	@$ORACLE_HOME/rdbms/admin/noncdb_to_pdb.sql

Check for errors:

	set lines 300
	col cause for a25
	col message for a50
	col action for a100
	set pagesize 9999
	select name,cause,action,type,message,status from PDB_PLUG_IN_VIOLATIONS;

In my case, the only remaining violations that I had were some orphan database services that were stuck in the metadata. To clean this up you can execute:

	alter session set container=;
	
	select SERVICE_ID,NAME,NETWORK_NAME,PDB FROM CDB_SERVICES ORDER BY PDB,SERVICE_ID;
	select SERVICE_ID,NAME,NETWORK_NAME FROM DBA_SERVICES ORDER BY SERVICE_ID;
	
	exec dbms_service.delete_service('')

Ensure that all pluggables are open and open on restart:

	alter pluggable database all open;
	alter pluggable database all save state;

As with anything in our business, adequate testing goes a long way and these were my observations in my environment. If your experience varies, I sure would like to hear about it.

Updating Solaris DNS via SVCCFG

It’s been a while since I have made a blog post. Time has certainly been very precious. Believe me, I have no shortage of topics to cover either.

Recently, while doing some of my routine checks, I discovered that one of the SuperClusters that I support had the DNS of all Global and Local Domains pointing to a DNS server over 2000 miles away. How that happened isn’t entirely important, and I know we can all agree this isn’t a great idea, especially when there is a DNS server in the local data center.

So let’s look at the best way to fix this. Solaris 11 makes it pretty easy using the “svccfg” command, right?

svccfg -s network/dns/client listprop config
svccfg -s network/dns/client setprop config/nameserver = net_address: "([primary dns] [backup dns])"
svcadm refresh dns/client
svccfg -s network/dns/client listprop config

Well there you have it. Is it all set now and forever? Actually, no. Upon further investigation, I found that the “/etc/resolv.conf” file was not updated with the new changes. Why was that? Further research revealed that there was actually one more command to execute, and it wasn’t located anywhere near the first set of commands in the documentation:

nscfg import svc:/network/dns/client:default

After I had added that simple command to the set that had been executed first, the “/etc/resolv.conf” file now reflected the new values.
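Because the missing step is easy to overlook, it may be worth checking the file after every change. A minimal sketch follows; the helper names are hypothetical and the two addresses in the usage line are documentation placeholders, not real DNS servers.

```shell
# Hypothetical check: list the nameserver entries found in a resolv.conf-style
# file (defaults to /etc/resolv.conf), one address per line.
resolv_nameservers() {
  awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Succeed only if the file lists exactly the expected addresses, in order.
# $1 is a space-separated list; $2 is an optional alternate file.
dns_matches() {
  expected=$1
  file=${2:-/etc/resolv.conf}
  actual=$(resolv_nameservers "$file" | tr '\n' ' ' | sed 's/ $//')
  [ "$actual" = "$expected" ]
}

# Usage: dns_matches "192.0.2.10 192.0.2.11" && echo "resolv.conf is current"
```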

Hope this helps.

Adjusting Available CPU Threads in SuperCluster Local Zones Online

Lately, I have been working on the Oracle SuperCluster platform. After having worked with Linux for the past many years, it was quite refreshing to get back to an OS that so many of us have worked on. As part of our local zone layout, we have a requirement to allocate different numbers of M7 CPU threads per zone. Upon researching the best way to do this, I found varying information, so I thought that I would go ahead and blog about the way that worked best for this situation.

In this case, CPU Thread control was set-up using resource pools. Solaris Resource Pools are described here:

Oracle Solaris Resource Pools

By default, the resource pool does not restrict access or control scheduling. By modifying the resource pool and allocating specific threads to specific zones, you thereby allocate threads to the local zones.

Here’s how:

First, let’s display the pool layout. Since we only need to look at allocating threads (the command actually outputs a ton of data), I will limit the output to only what is relevant.

Find the pool configurations you want to affect. Pset pertains directly to CPU threads, so that is what we will look for:

#  poolcfg -dc info | egrep 'pset |pset.size|pset.min|pset.max'

        pset pset_[host name]_id_25289
                uint    pset.min 32
                uint    pset.max 32
                uint    pset.size 32
        pset pset_[host name]_id_25223
                uint    pset.min 64
                uint    pset.max 64
                uint    pset.size 64
        pset pset_[host name]_id_25287
                uint    pset.min 64
                uint    pset.max 64
                uint    pset.size 64
        pset pset_[host name]_id_25224
                uint    pset.min 32
                uint    pset.max 32
                uint    pset.size 32
        pset pset_default
                uint    pset.min 1
                uint    pset.max 65536
                uint    pset.size 64

In this case we can see that out of the 256 CPU threads available to this Global Domain, 32 have been allocated to the first local domain, 64 each to the next 2 and then 32 to the last, leaving 64 in the default pool or available to the global domain.
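That arithmetic (32 + 64 + 64 + 32 + 64 = 256) can be checked mechanically before and after any reallocation. The helper below is a sketch of my own, not part of the Solaris tooling; it simply totals the pset.size lines.

```shell
# Hypothetical helper: total the pset.size values in `poolcfg -dc info` output
# read from stdin, to confirm they add up to the threads in the global domain.
pset_total() {
  awk '$2 == "pset.size" { total += $3 } END { print total + 0 }'
}

# Usage (on the global domain): poolcfg -dc info | pset_total
```

If the total ever differs from the thread count of the global domain, a pset has been resized without the matching change elsewhere.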

If you would like to see the file which also details the complete rules of the resource pool, you can look here:

/etc/pooladm.conf

To start with any modifications, it is best to ensure that the latest configuration is saved. To do so you can run this command from the global domain:

# pooladm -s

Once this has been done, you can proceed with the reallocation. In this example, I will modify one pool by taking CPU Threads from the default pool.
Using “-d” operates directly on the kernel state, so use this with caution. On a running system, I would reallocate in small chunks. That will give the operating system time to adapt to the different CPU configuration. In this example we will add 8 threads to a local zone which already had 32 Threads:

# poolcfg -dc 'modify pset pset_[host name]_id_25289 ( uint pset.min = 40 ; uint pset.max = 40)'

At this point the change has been made to the configuration file only (/etc/pooladm.conf), not actually to the system. To make the change to the system, save the configuration and commit to the system:

# pooladm -s

# pooladm -c

Once this change is done, we can inspect the configuration by running the same command shown above. Notice the changes below:

#  poolcfg -dc info | egrep 'pset |pset.size|pset.min|pset.max'

        pset pset_[host name]_id_25289
                uint    pset.min 40
                uint    pset.max 40
                uint    pset.size 40
        pset pset_[host name]_id_25223
                uint    pset.min 64
                uint    pset.max 64
                uint    pset.size 64
        pset pset_[host name]_id_25287
                uint    pset.min 64
                uint    pset.max 64
                uint    pset.size 64
        pset pset_[host name]_id_25224
                uint    pset.min 32
                uint    pset.max 32
                uint    pset.size 32
        pset pset_default
                uint    pset.min 1
                uint    pset.max 65536
                uint    pset.size 56

If you need to transfer CPU threads from one pset to another (here, from the default pool to a local zone), you can do so by executing the following command:

poolcfg -dc 'transfer 8 from pset pset_default to pset_[host name]_id_25289'

Or if you want to assign a specific CPU Thread:

poolcfg -dc 'transfer to pset pset_[host name]_id_25289 ( cpu 5)'

The rest of the steps remain the same. In the next post I will show you how to verify the additional CPU in each local zone.

Improper Use of the Oracle ‘Rownum’ Pseudocolumn

The other day I found myself needing to explain to some developers why their use-case of the Oracle ‘rownum’ pseudocolumn was yielding a result in one database instance, but a completely different result in another.

In this situation, the correct result is the ‘maximum’ value of the column; however, this query was also occasionally returning the exact ‘minimum’ value of this column. How could this happen? The answer lies in using the ‘rownum’ pseudocolumn correctly. Of course there are other (probably better) ways to write this query without the use of ‘rownum’, but I’m not here to debate that right now….

** Note the tables in the query have been changed to protect the innocent.

select column_a from (select column_a,rownum rowid0 from schema.table order by column_a desc ) aa where aa.rowid0 =1;

Oracle documentation states that which result you get depends on how Oracle accessed the rows in the query. For example, your results can vary depending on many factors (e.g., the order in which you inserted the data in the table, or whether there is an index on the table and how that index is used). For further information you can see the documentation here:

https://docs.oracle.com/cd/B28359_01/server.111/b28286/pseudocolumns009.htm#SQLRF00255

For further explanation, let’s explore the explain plans used in each system:

Correct Result:


--------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                          | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                               |       |       |     1 (100)|          |
|*  1 |  VIEW                        |                               |  1257 | 32682 |     1   (0)| 00:00:01 |
|   2 |   COUNT                      |                               |       |       |            |          |
|   3 |    INDEX FULL SCAN DESCENDING| SCHEMA_TABLE_PK               |  1257 |  6285 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("AA"."ROWID0"=1)


22 rows selected.

Incorrect Result:


---------------------------------------------------------------------------------------------------------
| Id  | Operation               | Name                          | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |                               |       |       |     4 (100)|          |
|*  1 |  VIEW                   |                               |  1257 | 32682 |     4  (25)| 00:00:01 |
|   2 |   SORT ORDER BY         |                               |  1257 |  6285 |     4  (25)| 00:00:01 |
|   3 |    COUNT                |                               |       |       |            |          |
|   4 |     INDEX FAST FULL SCAN| SCHEMA_TABLE_PK               |  1257 |  6285 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("AA"."ROWID0"=1)


24 rows selected.

As you can see, the major difference here is that the two systems have not chosen the same access path in which to return the data. In one system a plan utilized an ‘INDEX FULL SCAN DESCENDING’ access path, while the other utilized an ‘INDEX FAST FULL SCAN’ access path.

Is this really that different? Turns out it is.

Ask TOM explained the reason why very concisely:
(Ask TOM “Difference between Full Index Scans and Fast Full Index Scans”)

They state that:

“An index fast full scan reads the ENTIRE index, unsorted, as it exists on disk. It is basically using the index as a “skinny” version of the table. The query in question would only be accessing attributes in the index (we are not using the index as a way to get to the table, we are using the index INSTEAD of the table) We use multiblock IO and read all of the leaf, branch and the root block. We ignore the branch and root blocks and just process the (unordered) data on the leaf blocks.

An index full scan is when we read the index a block at a time – from start to finish. We’ll read the root block, navigate down the left hand side of the index (or right if we are doing a descending full scan) and then when we hit the leaf block – we’ll read across the entire bottom of the index – a block at a time – in sorted order. We use single block IO, not multiblock IO for this operation.”

Well there you have it, and this is why the result is different. How can we keep this from occurring in the future? The answer is to utilize the ‘rownum’ pseudocolumn correctly. Remember, rownum is not a real column, so in order to get the right results, it needs to be added after the data is in the sorted order that you want. To do that, make sure you write the query so that ‘rownum’ is applied after the sort. Using the same query above, let’s rewrite it in such a way that it will achieve the desired results:

select column_a from (select column_a,rownum from (select column_a from schema.table order by column_a desc)) where rownum = 1;

See the steps now?

  1. Retrieve data in sorted order
  2. Apply the ‘rownum’ pseudocolumn
  3. Filter for the desired value in the list

If you must use the ‘rownum’ pseudocolumn, writing your query in this manner will ensure that you always get the same result.
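The same ordering effect can be reproduced outside the database. The sketch below is only an analogy using plain text tools, not Oracle: the awk record number NR stands in for rownum, and the three sample values are made up to arrive in an unsorted order.

```shell
# Analogy only: plain text tools standing in for Oracle. NR plays the role of
# rownum, and the three input values arrive in an arbitrary (unsorted) order.
rows() { printf '10\n30\n20\n'; }

# Number first, sort second: row number 1 is whatever arrived first.
number_then_sort() {
  rows | awk '{ print $1, NR }' | sort -k1,1nr | awk '$2 == 1 { print $1 }'
}

# Sort first, number second: row number 1 is the maximum.
sort_then_number() {
  rows | sort -nr | awk '{ print $1, NR }' | awk '$2 == 1 { print $1 }'
}

number_then_sort   # prints 10
sort_then_number   # prints 30
```

In the first pipeline the filter on row number 1 returns the first value that arrived, which here happens to be the minimum; in the second it returns the maximum, no matter how the rows arrived.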

Enjoy!

Oracle Native Network Encryption

With all of the security concerns out there and data being more important than ever, it might be also time to consider encrypting your data connections, even within your own data center. If you are utilizing cloud, there should be no question that some sort of encryption should be used. In terms of what Oracle provides, you have two options, Native Encryption and SSL/TLS encryption. As of the time of this writing, both of these options are free to use and are no longer part of the Advanced Security Option. In this post, I will discuss the set-up and use of Native Encryption, with SSL/TLS to come later.

Native network encryption provided by the Oracle client is, by far, the easiest to set up; by the same token, it would also be the easiest to bypass. That said, there are ways to set it up in such a way that those risks can be mitigated. Due to those same risks, Native encryption would be a great solution to use within a private data center, but not in a public or hybrid cloud scenario. SSL/TLS would be an option to pursue in a public or hybrid cloud scenario and I plan to discuss that in a future post.

Set Up:

Setup of Native encryption is pretty straightforward and easy, especially for OCI “Thick” connections and any other method that utilizes the sqlnet.ora file. In cases where that file is not utilized, there is some additional setup and I will discuss that as well.

First, it is important to understand the parameters which Native encryption uses. Luckily there are only two per side (one for encryption and one for checksumming); however, each can take several values, and the combinations and their results are better detailed here:

Version 12.x (OCI Thick):
https://docs.oracle.com/database/121/DBSEG/asoconfg.htm#DBSEG020

Version 12.x (JDBC Thin):
https://docs.oracle.com/database/121/DBSEG/asojbdc.htm#DBSEG9609

By default, both sides of any client connection are configured to ‘ACCEPTED’, meaning they will accept an encrypted connection if the other side asks for one. Because of this, you only have to configure one side or the other, but for safety reasons, I would recommend configuring both sides.
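As I read the negotiation matrix in the documentation linked above, the outcome for any pair of settings can be summarized in a few rules. The function below is my own sketch of those rules, not Oracle code, and it assumes both arguments are one of REJECTED, ACCEPTED, REQUESTED or REQUIRED.

```shell
# Sketch of the negotiation outcome for one client/server pair of settings.
# Prints ON, OFF, or FAIL (the connection is refused).
negotiate() {
  case "$1:$2" in
    REJECTED:REQUIRED|REQUIRED:REJECTED) echo FAIL ;;
    REJECTED:*|*:REJECTED)               echo OFF ;;
    ACCEPTED:ACCEPTED)                   echo OFF ;;
    *)                                   echo ON ;;
  esac
}

negotiate ACCEPTED ACCEPTED   # OFF: the defaults on both sides
negotiate REQUIRED ACCEPTED   # ON: configuring one side is enough
negotiate REQUIRED REJECTED   # FAIL
```

Setting REQUIRED on both sides, as in the files below, removes any dependence on the other side’s default.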

In 11.2, there are a few fewer options in terms of encryption and checksum algorithms, so for simplicity, I will just illustrate a 12.x ‘THICK’ client connection to an 11.2.0.4 database.

To enable this option within the ‘THICK’ client:

# sqlnet.ora Network Configuration File: /u01/app/oracle/product/12.2.0/client_1/network/admin/sqlnet.ora
# Generated by Oracle configuration tools.

NAMES.DIRECTORY_PATH= (TNSNAMES, EZCONNECT)

SQLNET.ENCRYPTION_CLIENT = REQUIRED
SQLNET.ENCRYPTION_TYPES_CLIENT = (AES256)

SQLNET.CRYPTO_CHECKSUM_CLIENT = REQUIRED
SQLNET.CRYPTO_CHECKSUM_TYPES_CLIENT = (SHA1)

If you are utilizing JDBC ‘thin’ connections, then you can also set the properties within the java code itself:

prop.setProperty(OracleConnection.CONNECTION_PROPERTY_THIN_NET_ENCRYPTION_LEVEL,level);
prop.setProperty(OracleConnection.CONNECTION_PROPERTY_THIN_NET_ENCRYPTION_TYPES,algorithm);
prop.setProperty(OracleConnection.CONNECTION_PROPERTY_THIN_NET_CHECKSUM_LEVEL,level);
prop.setProperty(OracleConnection.CONNECTION_PROPERTY_THIN_NET_CHECKSUM_TYPES, algorithm);

And edit the sqlnet.ora on the server:

# sqlnet.ora Network Configuration File: /u01/app/11.2.0.4/grid/network/admin/sqlnet.ora
# Generated by Oracle configuration tools.

NAMES.DIRECTORY_PATH= (TNSNAMES, EZCONNECT)

ADR_BASE = /u01/app/oracle

SQLNET.ENCRYPTION_SERVER = REQUIRED
SQLNET.ENCRYPTION_TYPES_SERVER = (AES256)

SQLNET.CRYPTO_CHECKSUM_SERVER = REQUIRED
SQLNET.CRYPTO_CHECKSUM_TYPES_SERVER = (SHA1)

Validation:

There are a few ways to validate that encryption is actually taking place. The easiest is to query the network service banner for your own session upon login to the database, as shown in the two examples that follow.

If no encryption is occurring, then the banner will look like this:

SQL> select network_service_banner from v$session_connect_info
  2  where sid in (select distinct sid from v$mystat);

NETWORK_SERVICE_BANNER
-----------------------
TCP/IP NT Protocol Adapter for Linux: Version 11.2.0.4.0 - Production
Oracle Advanced Security: encryption service for Linux: Version 11.2.0.4.0 - Production
Oracle Advanced Security: crypto-checksumming service for Linux: Version 11.2.0.4.0 - Production

If encryption is happening, then the banner will return additional data:

SQL> select network_service_banner from v$session_connect_info
  2  where sid in (select distinct sid from v$mystat);

NETWORK_SERVICE_BANNER
-----------------------
TCP/IP NT Protocol Adapter for Linux: Version 11.2.0.4.0 - Production
Oracle Advanced Security: encryption service for Linux: Version 11.2.0.4.0 - Production
Oracle Advanced Security: AES256 encryption service adapter for Linux: Version 11.2.0.4.0 - Product
Oracle Advanced Security: crypto-checksumming service for Linux: Version 11.2.0.4.0 - Production
Oracle Advanced Security: SHA1 crypto-checksumming service adapter

Notice the 2 additional lines in the banner when encryption is occurring:
Oracle Advanced Security: AES256 encryption service adapter for Linux: Version 11.2.0.4.0 - Product
Oracle Advanced Security: SHA1 crypto-checksumming service adapter

The database indicates that encryption is happening, but what is actually happening on the wire? To determine that, we can either use a product like Wireshark or trace the connection to the listener. To do the latter, enable the following parameters in the sqlnet.ora on the client:

DIAG_ADR_ENABLED=OFF
TRACE_DIRECTORY_CLIENT=/home/oracle/trace
TRACE_FILE_CLIENT=nettrace
TRACE_LEVEL_CLIENT=16

And in the trace file, you will see an entry similar to the following:

(3310995200) [24-APR-2017 10:19:21:077] na_tns:         Encryption is active, using AES256
(3310995200) [24-APR-2017 10:19:21:077] na_tns:         Crypto-checksumming is active, using SHA1
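Level 16 traces are large, so it helps to pull out just these two lines. The following is a small sketch; trace_crypto_summary is a hypothetical helper, and the path in the usage line assumes the trace files land in the TRACE_DIRECTORY_CLIENT shown above with a .trc extension.

```shell
# Hypothetical helper: report the negotiated algorithms found in one or more
# SQL*Net client trace files (or on stdin when no file is given).
trace_crypto_summary() {
  sed -n \
    -e 's/.*Encryption is active, using \([A-Za-z0-9]*\).*/encryption=\1/p' \
    -e 's/.*Crypto-checksumming is active, using \([A-Za-z0-9]*\).*/checksum=\1/p' \
    "$@"
}

# Usage: trace_crypto_summary /home/oracle/trace/*.trc
```

No output at all means neither line was found in the trace, which would suggest the session was not encrypted.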

So as you can see, the setup of Native encryption is quite easy. As with any additional feature, performance could be compromised, so make sure you test all combinations thoroughly in order to determine what works best in your environment. Enjoy!

Extending GoldenGate Change Data Capture With Eventactions

In the previous post, I discussed a very simple setup of GoldenGate for the purpose of implementing Change Data Capture. Occasionally, depending on the requirements and the data volume, it may be worthwhile to suspend replication while the application processes data or performs some other work. One way to do this is with eventactions. GoldenGate eventactions are a simple way of telling GoldenGate to do something if a certain data situation is encountered. In this example, I am going to drive my event actions from data by using a control table with 2 rows:

EVENT_DESC         EVENT_TMSTP
SUSPEND REPLICAT   13-DEC-16 11.31.00.881031 AM
RESUME REPLICAT    13-DEC-16 11.48.51.065772 PM

Based on an update to the timestamp in the source system, the downstream replicat will utilize that data to either suspend or resume the replicat.

The only way that this event action can be completely data driven is to have 2 replicats: one which processes all of the change data and another which processes the ‘resume’ command once issued. The second replicat is needed because the first replicat cannot process any ‘resume’ commands or apply any data on its own while it is SUSPENDED!

In the parameter file of the replicat which will process the suspend, the parameters may look like the following:

--Standard entries in a replicat parameter file

MAP MYSCHEMA.EVENTS, TARGET MYSCHEMA.EVENTS, &
COLMAP (USEDEFAULTS), &
FILTER (@STREQ (EVENT_DESC, 'SUSPEND REPLICAT' )), &
EVENTACTIONS (IGNORE RECORD, LOG INFO, REPORT, SUSPEND, CP BOTH);

In the replicat that will process the resume, the parameters may look like:

--Standard entries in a replicat parameter file

ALLOWDUPTARGETMAP

MAP MYSCHEMA.EVENTS, TARGET MYSCHEMA.EVENTS, &
COLMAP (USEDEFAULTS), &
HANDLECOLLISIONS;

MAP MYSCHEMA.EVENTS, TARGET MYSCHEMA.EVENTS, &
FILTER (@STREQ (EVENT_DESC, 'RESUME REPLICAT' )), &
EVENTACTIONS (IGNORE RECORD, LOG, REPORT, CP AFTER,&
SHELL ('./dirshell/resume_replicat.sh $1', VAR $1 = 'RCDC1'));
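The resume_replicat.sh script referenced above is not shown in this post. A minimal sketch of what such a script could contain follows; the GG_HOME path is a placeholder of mine, and it assumes the suspended group is resumed through GGSCI’s SEND REPLICAT command with the RESUME option.

```shell
#!/bin/sh
# Sketch of what ./dirshell/resume_replicat.sh could contain. GG_HOME is an
# assumed install path; the group name arrives as $1 from EVENTACTIONS SHELL.
GG_HOME="${GG_HOME:-/u01/app/goldengate}"

# Build the GGSCI command that resumes a suspended replicat group.
resume_command() {
  printf 'SEND REPLICAT %s, RESUME\n' "$1"
}

# Pipe the command into GGSCI; a missing group name is an error.
resume_replicat() {
  [ -n "$1" ] || { echo "usage: resume_replicat <group>" >&2; return 1; }
  resume_command "$1" | "$GG_HOME/ggsci"
}

# In the real script the last line would be: resume_replicat "$1"
```

Keeping the command text in its own function makes it easy to log or dry-run what would be sent before letting the event action drive it.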

If you need a more robust mechanism whereby some other condition is checked prior to issuing the suspend, you can extend the suspend action further by executing a SQL statement within the replicat, as shown below. In this case, we need to make sure that if a suspend was issued, it is not prior to a resume record being issued. This could be very helpful to safeguard against accidental suspends being processed, since it is very possible that the CDC replicat may be processing data which is well behind that of the one that processes the ‘resume’ event.

--Standard entries in a replicat parameter file

MAP MYSCHEMA.EVENTS, TARGET MYSCHEMA.EVENTS, &
COLMAP (USEDEFAULTS), &
FILTER (@STREQ (EVENT_DESC, 'SUSPEND REPLICAT' )), &
SQLEXEC (ID LOOKUP_RESUME_DATE, &
QUERY ' SELECT EVENT_DESC,EVENT_TMSTP, COUNT(*) SUSPEND_OK FROM MYSCHEMA.EVENTS &
WHERE EVENT_DESC = :p_resume_event_desc AND EVENT_TMSTP < :p_suspend_tmstp &
AND NOT EXISTS( &
SELECT 1 FROM MYSCHEMA.EVENTS &
WHERE EVENT_DESC = :p_suspend_event_desc AND EVENT_TMSTP > :p_suspend_tmstp) &
GROUP BY EVENT_DESC, EVENT_TMSTP ', &
PARAMS (p_resume_event_desc = 'RESUME REPLICAT', p_suspend_event_desc = 'SUSPEND REPLICAT', p_suspend_tmstp = EVENT_TMSTP), &
TRACE ALL, &
BEFOREFILTER), &
FILTER (@STREQ (LOOKUP_RESUME_DATE.SUSPEND_OK, 1 )), &
EVENTACTIONS (IGNORE, LOG, REPORT, SUSPEND, CP BOTH);

As you can see, event actions are very powerful and can be extended in a variety of ways.  This is just one example.  If you choose to implement this, make sure you also account for the event action in any monitoring scripts you have because GoldenGate will show lag while the suspend action is valid.  Have fun!