Fast Connection Failover

March 10th, 2013

This topic comes up consistently whenever a database or Real Application Clusters (RAC) node fails. The question is always, “why don’t the applications gracefully fail over?” It’s a fair question, and the answer is that it’s harder than it looks. Oracle has offered Transparent Application Failover (TAF) for a very long time, but TAF only works for OCI-based applications like sqlplus and JDBC thick clients. Since the majority of applications seem to use JDBC thin connections, TAF isn’t really an option. For JDBC thin connections there is the Fast Connection Failover (FCF) feature. It works in conjunction with a connection cache to provide the ability to recover from a RAC node failure without restarting the application. And, as with all things Oracle, the connection cache feature has since been replaced by Universal Connection Pooling (UCP). The other pieces of FCF are Fast Application Notification (FAN) events and the Oracle Notification Service (ONS), which transports the FAN event notifications.

There is a lot of documentation from Oracle and on the web about how to set up FCF and UCP. There isn’t much documentation though on how to implement it once you have it set up. That’s really what I wanted to talk about, but you do have to understand how to set it up to use it. I have included most of the resources I used to try to understand FCF and UCP at the end of this post. I have also added a working sample to the appsdba.com web site so you can try this out yourself.

To set up the FCF example you will need Java 1.6 and three jar files: ucp.jar, ons.jar, ojdbc6.jar. The jar files are available on otn.oracle.com (OTN) and I used the 11.2.0.3 versions. You can probably use an older version of Java if you can find compatible versions of the jar files, but you’re on your own for that. I’ve only tested with the 11.2.0.3 versions on OTN.

The example from the appsdba.com web site is called FcfExample.java and if you open it up with a text editor there are setup instructions. The instructions are the following:


* To run the test:
*
* Set JAVA_HOME to your Java 1.6 directory
*  (set JAVA_HOME="C:\Program Files (x86)\Java\jdk1.6.0_37\bin")
* Add JAVA_HOME to the path
*  (set PATH="%JAVA_HOME%;%PATH%")
* Add the source directory to the LIBS variable
*  (set LIBS="c:\temp\jdbc_fcf_test")
* Add jar files to the CLASSPATH (ons.jar, ucp.jar, ojdbc6.jar)
*  (set CLASSPATH="c:\temp\jdbc_fcf_test;c:\temp\jdbc_fcf_test\ons.jar;c:\temp\jdbc_fcf_test\ojdbc6.jar;c:\temp\jdbc_fcf_test\ucp.jar")
*
* Compile with: javac FcfExample.java
* Run with: java FcfExample
*
* The idea is to start the script running, then kill the instance or service
* the script is running on (that's why the instance name is listed) and
* the code is written so that the query will restart once a new connection
* is re-established.
*
* An additional file is created called FcfExample.log which lists the FAN
* events coming from ONS during the run. You should see the connection fail
* and then get re-established.

A couple of comments: I created this example using three main sources (see references [10], [11] and [15]), and the example also writes the FAN messages to a separate log file.

Now the interesting part of the example is the actual code that demonstrates how to recover from a failover.


do {
  try {

    conn = ds.getConnection();
    conn.setAutoCommit(false);
    System.out.println("Obtained a new connection from pool ...");
    System.out.println();

    // Print connection stats
    JDBCConnectionPoolStatistics stats = ds.getStatistics();
    if (stats != null) {
      printStat("AbandonedConnectionsCount", stats);
      printStat("AvailableConnectionsCount", stats);
      printStat("AverageBorrowedConnectionsCount", stats);
      printStat("AverageConnectionWaitTime", stats);
      printStat("BorrowedConnectionsCount", stats);
      printStat("ConnectionsClosedCount", stats);
      printStat("ConnectionsCreatedCount", stats);
      printStat("RemainingPoolCapacityCount", stats);
      printStat("TotalConnectionsCount", stats);
      System.out.println();
      System.out.println("***");
      System.out.println();
    }

    // Start work here, it has been made restartable

    if (retry) {
      if ("getBanner".equals(step)) {
        retry = false;
        getBanner(conn);
      }
    } else {
      step = "getBanner";
      getBanner(conn);
    }

    if (retry) {
      if ("getInstance".equals(step)) {
        retry = false;
        getInstance(conn);
      }
    } else {
      step = "getInstance";
      getInstance(conn);
    }

    if (retry) {
      if ("runQuery".equals(step)) {
        retry = false;
        runQuery(conn);
      }
    } else {
      step = "runQuery";
      runQuery(conn);
    }

    if (retry) {
      if ("getInstance".equals(step)) {
        retry = false;
        getInstance(conn);
      }
    } else {
      step = "getInstance";
      getInstance(conn);
    }

  }
  catch (SQLException sqlexc) {
    System.out.println();
    System.out.println("+-------------------------------+");
    System.out.println("| SQL Exception in program      |");
    System.out.println("+-------------------------------+");
    System.out.println("SQL Error Code    : " + sqlexc.getErrorCode());
    System.out.println("SQL Error Message : " + sqlexc.getMessage());
    System.out.println();

    // The recommended way to check connection usability after a
    // RAC-down event triggers UCP FCF actions.
    if (conn == null || !((ValidConnection) conn).isValid()) {
      System.out.println("Connection retry necessary ...");
      try {
        if (conn != null) {
          conn.close();
        }
      }
      catch (Exception closeExc) {
        System.out.println("Exception detected when closing connection:");
        closeExc.printStackTrace();
      }
      retry = true;
    }
  }
} while (retry);

Granted, this is a simplistic example, but the point is that the application has to handle a failover itself. This example uses a loop and an exception handler to catch any failures. A retry boolean records whether there has been a failure, and a step variable records the current step so the code knows where to restart. The proper restart depends on the ordering of the steps, but since this is a single sequential thread that’s not a problem.
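The per-step if blocks grow quickly as steps are added; the same retry/step idea generalizes to a list of steps and a resume index. A minimal, JDK-only sketch of the pattern (the class, the step names and the simulated failure are mine, not part of the FcfExample code):

```java
import java.util.ArrayList;
import java.util.List;

public class RestartableSteps {
    // A unit of work that may fail, standing in for getBanner/getInstance/runQuery.
    interface Step {
        void run() throws Exception;
    }

    private final List<String> names = new ArrayList<>();
    private final List<Step> steps = new ArrayList<>();
    final List<String> executed = new ArrayList<>(); // completed steps, in order

    void add(String name, Step step) {
        names.add(name);
        steps.add(step);
    }

    // On failure, resume at the step that failed -- the same idea as the
    // retry/step variables in FcfExample, without one if block per step.
    void runAll() {
        int resumeAt = 0;
        boolean retry;
        do {
            retry = false;
            int i = resumeAt;
            try {
                for (; i < steps.size(); i++) {
                    steps.get(i).run();
                    executed.add(names.get(i));
                }
            } catch (Exception e) {
                // A real version would validate and close the connection here.
                resumeAt = i;
                retry = true;
            }
        } while (retry);
    }

    public static void main(String[] args) {
        RestartableSteps r = new RestartableSteps();
        boolean[] failedOnce = {false};
        r.add("getBanner", () -> { });
        r.add("runQuery", () -> {
            if (!failedOnce[0]) {      // simulate one node failure
                failedOnce[0] = true;
                throw new Exception("node down");
            }
        });
        r.add("getInstance", () -> { });
        r.runAll();
        System.out.println(r.executed); // [getBanner, runQuery, getInstance]
    }
}
```

The catch block is where the ValidConnection check and conn.close() from the real example would go before retrying.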

When this example is run we see a printout of the connection pool stats, then the database banner and instance information (these run quickly), and then a longer-running query that counts the number of objects in the database up to a limit. It is at this point that you can kill the RAC node the program is connected to (that’s why the instance information is listed) and see how the program fails over and re-runs the query.

The following is from a normal run with no failures:


C:\Temp\FCF>java FcfExample
Obtained a new connection from pool ...

AbandonedConnectionsCount          0
AvailableConnectionsCount          9
AverageBorrowedConnectionsCount    0
AverageConnectionWaitTime          0
BorrowedConnectionsCount           1
ConnectionsClosedCount             0
ConnectionsCreatedCount            10
RemainingPoolCapacityCount         15
TotalConnectionsCount              10

***

Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

Instance number: 2
Instance name: ORCL2
Host: racvm2

Object count: 1000000000

Instance number: 2
Instance name: ORCL2
Host: racvm2

Released connection ...

***

AbandonedConnectionsCount          0
AvailableConnectionsCount          10
AverageBorrowedConnectionsCount    0
AverageConnectionWaitTime          0
BorrowedConnectionsCount           0
ConnectionsClosedCount             0
ConnectionsCreatedCount            10
RemainingPoolCapacityCount         15
TotalConnectionsCount              10

C:\Temp\FCF>

We see the connection pool usage and no messages about failures in the run. The FAN event log shows the following:

Feb 5, 2013 7:46:30 PM oracle.ucp.admin.UniversalConnectionPoolManagerMBeanImpl getUniversalConnectionPoolManagerMBean
FINE: Universal Connection Pool Manager MBean created
Feb 5, 2013 7:46:30 PM oracle.ucp.jdbc.PoolDataSourceImpl createPoolWithDefaultProperties
FINE: Connection pool instance is created with default properties
Feb 5, 2013 7:46:30 PM oracle.ucp.jdbc.PoolDataSourceImpl createUniversalConnectionPoolMBean
FINE: JDBC PoolMBean created
Feb 5, 2013 7:46:30 PM oracle.ucp.jdbc.PoolDataSourceImpl startPool
FINE: Connection pool instance is created
Feb 5, 2013 7:46:30 PM oracle.ucp.jdbc.oracle.RACManagerImpl start
FINE: failover started
Feb 5, 2013 7:46:31 PM oracle.ucp.common.UniversalConnectionPoolBase start
FINE: universal connection pool started
Feb 5, 2013 7:46:31 PM oracle.ucp.jdbc.PoolDataSourceImpl startPool
FINE: connection pool is started

Now let’s look at what happens when we stop the node that we’re running from in the middle of the longer running query:


C:\Temp\FCF>java FcfExample
Obtained a new connection from pool ...

AbandonedConnectionsCount          0
AvailableConnectionsCount          9
AverageBorrowedConnectionsCount    0
AverageConnectionWaitTime          0
BorrowedConnectionsCount           1
ConnectionsClosedCount             0
ConnectionsCreatedCount            10
RemainingPoolCapacityCount         15
TotalConnectionsCount              10

***

Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

Instance number: 1
Instance name: ORCL1
Host: racvm1

+-------------------------------+
| SQL Exception in program      |
+-------------------------------+
SQL Error Code    : 1089
SQL Error Message : ORA-01089: immediate shutdown in progress - no operations are permitted

Connection retry necessary ...
Obtained a new connection from pool ...

AbandonedConnectionsCount          0
AvailableConnectionsCount          8
AverageBorrowedConnectionsCount    0
AverageConnectionWaitTime          0
BorrowedConnectionsCount           1
ConnectionsClosedCount             1
ConnectionsCreatedCount            10
RemainingPoolCapacityCount         16
TotalConnectionsCount              9

***

Object count: 1000000000

Instance number: 3
Instance name: ORCL3
Host: racvm3

Released connection ...

***

AbandonedConnectionsCount          0
AvailableConnectionsCount          6
AverageBorrowedConnectionsCount    0
AverageConnectionWaitTime          0
BorrowedConnectionsCount           0
ConnectionsClosedCount             4
ConnectionsCreatedCount            10
RemainingPoolCapacityCount         19
TotalConnectionsCount              6

C:\Temp\FCF>

The RAC node the script was connected to was stopped with the following command:


112:oracle@racvm1:/home/oracle [ORCL1]
$ srvctl stop instance -d ORCL -i ORCL1
113:oracle@racvm1:/home/oracle [ORCL1]
$

And the FAN event log shows that four connections in the UCP were closed and that the ORCL1 instance was shut down:


C:\Temp\FCF>type FcfExample.log
Feb 5, 2013 7:28:08 PM oracle.ucp.admin.UniversalConnectionPoolManagerMBeanImpl getUniversalConnectionPoolManagerMBean
FINE: Universal Connection Pool Manager MBean created
Feb 5, 2013 7:28:08 PM oracle.ucp.jdbc.PoolDataSourceImpl createPoolWithDefaultProperties
FINE: Connection pool instance is created with default properties
Feb 5, 2013 7:28:08 PM oracle.ucp.jdbc.PoolDataSourceImpl createUniversalConnectionPoolMBean
FINE: JDBC PoolMBean created
Feb 5, 2013 7:28:08 PM oracle.ucp.jdbc.PoolDataSourceImpl startPool
FINE: Connection pool instance is created
Feb 5, 2013 7:28:08 PM oracle.ucp.jdbc.oracle.RACManagerImpl start
FINE: failover started
Feb 5, 2013 7:28:10 PM oracle.ucp.common.UniversalConnectionPoolBase start
FINE: universal connection pool started
Feb 5, 2013 7:28:10 PM oracle.ucp.jdbc.PoolDataSourceImpl startPool
FINE: connection pool is started
Feb 5, 2013 7:28:22 PM oracle.ucp.common.UniversalConnectionPoolBase closePhysicalConnection
FINE: closed physical connection
Feb 5, 2013 7:28:33 PM oracle.ucp.common.UniversalConnectionPoolBase closePhysicalConnection
FINE: closed physical connection
Feb 5, 2013 7:28:33 PM oracle.ucp.common.UniversalConnectionPoolBase closePhysicalConnection
FINE: closed physical connection
Feb 5, 2013 7:28:33 PM oracle.ucp.common.UniversalConnectionPoolBase closePhysicalConnection
FINE: closed physical connection
Feb 5, 2013 7:28:33 PM oracle.ucp.jdbc.oracle.ONSDatabaseEventHandlerTask run
FINE: event triggered: Service name: orcl, Instance name: orcl1, Unique name: orcl, Host name: racvm1, Status: down, Cardinality: 0, Reason: user, Event type: database/event/service

Note that in the failure example the code restarted at the object count step; we don’t see any information about which node we’re connected to until the final getInstance step runs after the “runQuery” step.

References:

[1] Oracle WebLogic Server and Highly Available Oracle Databases: Oracle Integrated Maximum Availability Solutions – Active GridLink

[2] High-Performance Oracle JDBC Programming
Learn how to improve performance of Oracle-driven JDBC programs by leveraging connection and statement pooling features.
By Yuli Vasiliev
Published April 2009

[3] How to Use the Oracle Universal Connection Pool (UCP)? [ID 759208.1]

[4] Transitioning to Oracle Universal Connection Pool (UCP)

[5] Fast Connection Failover (FCF) Test Client Using 11g JDBC Driver and 11g RAC Cluster [ID 566573.1]

[6] How to Use the Oracle Universal Connection Pool (UCP)? [ID 759208.1]

[7] Understanding Transparent Application Failover (TAF) and Fast Connection Failover (FCF) [ID 334471.1]

[8] How to Verify Universal Connection Pool (UCP) / Fast Connection Failover (FCF) Setup [ID 1064652.1]

[9] JDBC Fast Connection Failover with Oracle RAC,  Configuring and testing with Tomcat and the Spring framework, February 08, 2010

[10] Michael Pilone, http://www.drdobbs.com/jvm/jdbc-fast-connection-failover-with-oracl/222700353

[11] http://www.idevelopment.info/data/Programming/java/jdbc/High_Availability/FastConnectionFailoverExampleThin.java

[12] Oracle® Database Administrator’s Guide, 11g Release 2 (11.2)

[13] Oracle® Universal Connection Pool for JDBC Developer’s Guide, 11g Release 2 (11.2)

[14] Oracle® Database JDBC Developer’s Guide, 11g Release 2 (11.2)

[15] Oracle Database 11g UCP Downloads, http://www.oracle.com/technetwork/database/enterprise-edition/downloads/ucp-112010-099129.html – ucpdemos.jar (FCFSample.java)

OLF: The Framework

December 6th, 2012

The Oracle Logging Framework (OLF) is finally ready. It’s available on SourceForge here. The OLF is a framework, and a framework, as defined by the Free Merriam-Webster Dictionary, is: “a basic conceptual structure (as of ideas)”. The OLF, when combined with the Instrumentation Library for Oracle (ILO), can be used to implement the framework that was defined in the paper “Oracle Instrumentation“. I wrote about this in the first post about the OLF here.

The ILO provides the ability to set module and action, among other things, and does this with support for nesting. When combined with the ability to turn on extended SQL tracing and capture task timing it simply didn’t make sense to reinvent the wheel. So the OLF leverages the capabilities of the ILO and extensions to integrate the two are included in the install scripts.

The dblog package in the OLF provides the ability to persist logging messages based on a hierarchical level similar to log4j, to capture task timing from the ILO, and an API that the ILO can use to turn on extended SQL tracing dynamically.

Together these two tools achieve all of the goals set forth in the “Oracle Instrumentation” paper. Since they are both released as Open Source software under the GNU Lesser General Public License there should be enough flexibility to use them in just about any Oracle software project.

Using OID as a tnsnames Directory

August 5th, 2012

There’s a great presentation on how to use Oracle Internet Directory (OID) as a tnsnames directory. This is basically Oracle’s replacement for the Oracle Names server. The presentation is called “Replacing tnsnames.ora with OID” by Mark Luszczynski, and can be found here at the SLOUG.org site. It’s really done very well and I just had a few things to add:

  1. You can use the Identity Management 11.1.1.6.0 from the Oracle web site and not have to patch. It is a full installation.
  2. Use “sys” for the database ODS schema and not “system” as shown on the slide for 11.1.1.6.0 or you’ll get privilege errors.
  3. Use commas for the Internet Directory realm. The example shows: dc=aci,dc=corp,dc=net
  4. Pay particular attention to the issue with anonymous binds.

OLF: Configuration Settings

April 14th, 2012

Configuration settings in the Oracle Logging Framework (OLF) consist of adding or modifying rows in the dblog_config table that correspond to the task and its logging options. The table has the following columns:


SQL> desc dblog_config
Name                                      Null?    Type
----------------------------------------- -------- ----------------------------
CONFIG_ID                                 NOT NULL NUMBER
MODULE_NAME                                        VARCHAR2(30)
ACTION_NAME                                        VARCHAR2(30)
USERNAME                                           VARCHAR2(30)
SID                                                NUMBER
INST_ID                                            NUMBER
LOG_LEVEL                                 NOT NULL VARCHAR2(10)
TRACE_ENABLED                             NOT NULL CHAR(1)

SQL>

There is a unique index on the columns (module_name, action_name, username, sid, inst_id) to prevent duplicate records. The logging level and whether to turn on tracing are set by updating the log_level and trace_enabled columns.

The query of this table has an ORDER BY in it so that the most restrictive qualifying record wins. For example, the following three records are all valid:


SQL> select * from dbamon.dblog_config;

CONFIG_ID MODULE_NAME ACTION_NAME USERNAME   SID INST_ID LOG_LEVEL TRACE_ENABLED
--------- ----------- ----------- -------- ----- ------- --------- -------------
        1 module      action                             DEBUG     Y
        3 module      action      dbamon                 INFO      N
        4 module      action      dbamon      42       1 ERROR     Y

SQL> select user, SYS_CONTEXT('USERENV','SID'), SYS_CONTEXT('USERENV','INSTANCE') from dual;

USER
------------------------------
SYS_CONTEXT('USERENV','SID')
--------------------------------------------------------------------------------
SYS_CONTEXT('USERENV','INSTANCE')
--------------------------------------------------------------------------------
DBAMON
42
1

SQL>

The most restrictive record is config_id 4. If we query dblog.get_current_level we should see a log_level of ERROR:


SQL> select dblog.get_current_level from dual;

GET_CURRENT_LEVEL
--------------------------------------------------------------------------------
INFO

SQL> exec ilo_task.begin_task('module','action');

PL/SQL procedure successfully completed.

SQL> select dblog.get_current_level from dual;

GET_CURRENT_LEVEL
--------------------------------------------------------------------------------
ERROR

SQL>

Note that our first attempt returned the default level of INFO. It wasn’t until we set our module and action (I had already verified the username, sid and inst_id above) that dblog picked up the most restrictive record and returned a log level of ERROR. The same holds true for dblog.is_trace_enabled:


SQL> select dblog.is_trace_enabled from dual;

IS_TRACE_ENABLED
--------------------------------------------------------------------------------
Y

SQL>

So now we can set the log_level, and whether to trace, on the fly for any Oracle session that matches our module and action, username, and/or sid and instance id.
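The “most restrictive record wins” rule amounts to picking, from the rows that qualify for the session, the one with the most explicitly constrained columns. A JDK-only illustration of that selection (the Config record and the specificity scoring are my simplification; the OLF does this with an ORDER BY in SQL, and I’ve left inst_id out for brevity):

```java
import java.util.Comparator;
import java.util.List;

public class MostRestrictive {
    // A simplified dblog_config row; a null column matches any session.
    record Config(int id, String module, String action,
                  String username, Integer sid, String logLevel) {
        // Specificity = how many columns are explicitly constrained.
        int specificity() {
            int n = 0;
            if (module != null) n++;
            if (action != null) n++;
            if (username != null) n++;
            if (sid != null) n++;
            return n;
        }
    }

    // Of the rows that qualify for the session, take the most specific one.
    static Config pick(List<Config> qualifying) {
        return qualifying.stream()
                         .max(Comparator.comparingInt(Config::specificity))
                         .orElse(null);
    }

    public static void main(String[] args) {
        // The three records from the example above.
        List<Config> rows = List.of(
            new Config(1, "module", "action", null,     null, "DEBUG"),
            new Config(3, "module", "action", "dbamon", null, "INFO"),
            new Config(4, "module", "action", "dbamon", 42,   "ERROR"));
        System.out.println(pick(rows).logLevel()); // ERROR
    }
}
```

Config_id 4 wins because it constrains all four columns, matching the behavior shown above where the session saw a level of ERROR.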

Next up we’ll talk about the integration with the Method-R Instrumentation Library for Oracle and how together we have an Oracle Logging Framework.

OLF: Appenders

April 12th, 2012

In keeping with the Log4j model (i.e. the levels from the last post), the Oracle Logging Framework (OLF) supplies two appenders, or output modes. One outputs to a predefined table and the other to the console (i.e. dbms_output). This is done mainly to keep things simple. Other Oracle logging tools create more code to manage their output modes than they do to manage the actual logging, and that was not something I wanted to repeat.

The table-based appender inserts all logging messages into a single database table. The table has the following format:


SQL> desc dblog_log
Name                                      Null?    Type
----------------------------------------- -------- ----------------------------
LOGID                                     NOT NULL NUMBER
USERNAME                                  NOT NULL VARCHAR2(30)
SID                                       NOT NULL NUMBER
INST_ID                                   NOT NULL NUMBER
DATETIME                                  NOT NULL TIMESTAMP(6)
MODULE_NAME                                        VARCHAR2(30)
ACTION_NAME                                        VARCHAR2(30)
LOG_LEVEL                                          VARCHAR2(10)
TASKTIME                                           NUMBER
OBJECT_NAME                                        VARCHAR2(61)
LINE_NO                                            NUMBER
TEXT                                               CLOB
SQL_ERRNO                                          NUMBER
SQL_ERRMSG                                         VARCHAR2(100)
SQL_BACKTRACE                                      VARCHAR2(1000)

SQL>

The key output fields include tasktime, object_name and line_no from the call stack, the message text, and, when an error occurred, the SQL error number, message and backtrace.

The second appender uses DBMS_OUTPUT, which requires running the code interactively. This appender is primarily meant to facilitate debugging, since most code is not run from SQL*Plus. The output is formatted into two lines of the format:


level:user:sid:inst_id:datetime:module:action
level:tasktime:owner.object_name:line_no:text

Errors include three additional lines for error number, message and the backtrace:


ERROR: Oracle error code: 100
ERROR: Oracle error message: ORA-01403: no data found
ERROR: Backtrace: ORA-06512: at line 8

The following anonymous PL/SQL block was run with a config level of INFO:


SQL> set serveroutput on;
declare
l_num number;
begin
ilo_task.begin_task(
module => 'module',
action => 'action');
--
dblog.info('Before statement');
select 1 into l_num from dual;
dblog.info('After statement');
--
ilo_task.end_task;
end;
/
INFO:DBAMON:38:1:12-APR-12 01.02.01.051535 AM:module:action
INFO:NULL:DBAMON.ANONYMOUS BLOCK:Line 8:Before statement

INFO:DBAMON:38:1:12-APR-12 01.02.01.054261 AM:module:action
INFO:NULL:DBAMON.ANONYMOUS BLOCK:Line 10:After statement

PL/SQL procedure successfully completed.

SQL>

The database log table entries were:


USER   DATETIME                     MODULE ACTION LEVEL OBJECT_NAME            LINE TEXT
------ ---------------------------- ------ ------ ----- ---------------------- ---- ----------------
DBAMON 12-APR-12 01.02.01.053545 AM module action INFO  DBAMON.ANONYMOUS BLOCK    8 Before statement
DBAMON 12-APR-12 01.02.01.056147 AM module action INFO  DBAMON.ANONYMOUS BLOCK   10 After statement

For an error condition the error columns look like:


USER   MODULE ACTION LEVEL OBJECT_NAME            LINE TEXT                     errno errmsg                   backtrace
------ ------ ------ ----- ---------------------- ---- ------------------------ ----- ------------------------ --------------------
DBAMON module action ERROR DBAMON.ANONYMOUS BLOCK   13 dblog.error test message   100 ORA-01403: no data found ORA-06512: at line 8

So those are the two planned output modes for the Oracle Logging Framework. Next up we’ll talk a little bit more about the input configuration settings.

OLF: Logging Levels

April 10th, 2012

Logging levels are available in many logging tools. Log4j, log4perl and log4plsql all have logging levels. The Oracle Logging Framework (OLF) follows the basic Log4j logging levels and values:

  • FATAL – 50000
  • ERROR – 40000
  • WARN – 30000
  • INFO – 20000
  • DEBUG – 10000

Log4j also includes an ALL and an OFF level, and the OLF includes these levels as well as a TIMED level that is used to ensure that task timing is always logged. Each level is set so that ALL < DEBUG < INFO < WARN < ERROR < FATAL < TIMED < OFF. A default level of INFO is assigned in the OLF code; it can be overridden in the dynamic configuration or set explicitly.

The log levels are hierarchical and a call to a level will only log messages that are of a level that is greater than or equal to the current level setting. In other words, if the current log level for the task has been set to INFO, then all dblog calls with a level of INFO or greater will be logged. DEBUG messages will not be logged. The level TIMED is not directly usable as it is set by the task time routine, and the ALL or OFF levels can only be set as the current configuration level.

Only the DEBUG, INFO, WARN, ERROR, FATAL levels can be used at the program level since there are no dblog.timed, dblog.all or dblog.off procedure calls.

The following table shows the output message hierarchy:

Will Output Messages
dblog level   DEBUG   INFO   WARN   ERROR   FATAL
DEBUG         Yes     Yes    Yes    Yes     Yes
INFO          No      Yes    Yes    Yes     Yes
WARN          No      No     Yes    Yes     Yes
ERROR         No      No     No     Yes     Yes
FATAL         No      No     No     No      Yes
TIMED         Yes     Yes    Yes    Yes     Yes
ALL           Yes     Yes    Yes    Yes     Yes
OFF           No      No     No     No      No
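The level check itself is just an integer comparison. A JDK-only sketch of the ordering rule (the enum, the isLoggedAt method and the numeric values for ALL, TIMED and OFF are my assumptions; the post only gives the DEBUG through FATAL values):

```java
public class LevelDemo {
    // Ordered as in the OLF: ALL < DEBUG < INFO < WARN < ERROR < FATAL < TIMED < OFF.
    // The DEBUG..FATAL values are the Log4j-style values above; the rest are assumed.
    enum Level {
        ALL(0), DEBUG(10000), INFO(20000), WARN(30000),
        ERROR(40000), FATAL(50000), TIMED(60000), OFF(70000);

        final int value;
        Level(int value) { this.value = value; }

        // A message is written when its level is >= the configured level.
        boolean isLoggedAt(Level configured) {
            return this.value >= configured.value;
        }
    }

    public static void main(String[] args) {
        Level configured = Level.INFO;                          // the OLF default
        System.out.println(Level.DEBUG.isLoggedAt(configured)); // false
        System.out.println(Level.ERROR.isLoggedAt(configured)); // true
    }
}
```

With a configured level of OFF no message level can compare greater or equal, so nothing is written, matching the OFF row of the table.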

The next topic will be about appenders.

OLF: Creating Dynamic Logging

April 9th, 2012

One of the real benefits of using a logging tool is being able to set log levels dynamically. Whether it’s for a single user, program or task, you don’t want to have to stop everything, edit a config file and then restart the application; you want to be able to change the logging on the fly. The problem with this approach is that doing it effectively means checking the current log configuration every time a logging call is made, which seems expensive since it usually means querying a database table on every call.

A new feature in Oracle 11g is the SQL Query Result Cache, which stores the results of SQL queries in an area of the shared pool. By simply supplying a hint, and having the result cache enabled, repetitive queries that return the same results are cached, causing almost no work for the database. The following is an example using code from the Oracle Logging Framework.

First a table to store our logging configuration data:


CREATE TABLE dblog_config
(
config_id       NUMBER        CONSTRAINT dblog_config_nn1 NOT NULL ENABLE,
module_name     VARCHAR2(30),
action_name     VARCHAR2(30),
username        VARCHAR2(30),
sid             NUMBER,
inst_id         NUMBER,
log_level       VARCHAR(10)   DEFAULT 'INFO' CONSTRAINT dblog_config_nn2 NOT NULL ENABLE,
trace_enabled   CHAR(1)       CONSTRAINT dblog_config_nn3 NOT NULL ENABLE,
CONSTRAINT dblog_config_ck1 CHECK(log_level IN ('ALL','DEBUG','INFO','WARN','ERROR','FATAL','TIMED','OFF')),
CONSTRAINT dblog_config_ck2 CHECK(trace_enabled IN ('Y','N')),
CONSTRAINT dblog_config_pk PRIMARY KEY (config_id) ENABLE
)
NOCOMPRESS
LOGGING;
--


I have added a single row for our testing:


SQL> select * from dbamon.dblog_config;

CONFIG_ID MODULE_NAME ACTION_NAME USERNAME   SID INST_ID LOG_LEVEL TRACE_ENABLED
--------- ----------- ----------- -------- ----- ------- --------- -------------
        1 module      action                             INFO      Y

SQL>

The following query can then be used by our logging code to determine the current configuration for “module” and “action”. This query also allows other fields in our configuration table to trigger actions, but for now we’ll focus on module and action:


SELECT /*+ RESULT_CACHE */ config_id,module_name,action_name,
username,sid,inst_id,UPPER(log_level) AS log_level,
UPPER(trace_enabled) AS trace_enabled
FROM  dbamon.dblog_config
WHERE (UPPER(module_name)=UPPER('&&module')
AND UPPER(action_name)=UPPER('&&action')
AND username IS NULL
AND sid IS NULL)
--
OR (UPPER(module_name)=UPPER('&&module')
AND UPPER(action_name)=UPPER('&&action')
AND UPPER(username)=UPPER(USER)
AND sid IS NULL)
--
OR ( UPPER(module_name)=UPPER('&&module')
AND UPPER(action_name)=UPPER('&&action')
AND username IS NULL
AND (sid=SYS_CONTEXT('USERENV','SID')
AND inst_id=SYS_CONTEXT('USERENV','INSTANCE')))
--
OR ( UPPER(module_name)=UPPER('&&module')
AND UPPER(action_name)=UPPER('&&action')
AND UPPER(username)=UPPER(USER)
AND (sid=SYS_CONTEXT('USERENV','SID')
AND inst_id=SYS_CONTEXT('USERENV','INSTANCE')))
--
--
OR ( module_name IS NULL
AND UPPER(action_name)=UPPER('&&action')
AND username IS NULL
AND sid IS NULL )
--
OR ( module_name IS NULL
AND UPPER(action_name)=UPPER('&&action')
AND UPPER(username)=UPPER(USER)
AND sid IS NULL )
--
OR ( module_name IS NULL
AND UPPER(action_name)=UPPER('&&action')
AND username IS NULL
AND (sid=SYS_CONTEXT('USERENV','SID')
AND inst_id=SYS_CONTEXT('USERENV','INSTANCE')) )
--
OR ( module_name IS NULL
AND UPPER(action_name)=UPPER('&&action')
AND UPPER(username)=UPPER(USER)
AND (sid=SYS_CONTEXT('USERENV','SID')
AND inst_id=SYS_CONTEXT('USERENV','INSTANCE')) )
--
--
OR ( module_name IS NULL
AND action_name IS NULL
AND username IS NULL
AND sid IS NULL )
--
OR ( module_name IS NULL
AND action_name IS NULL
AND UPPER(username)=UPPER(USER)
AND sid IS NULL )
--
OR ( module_name IS NULL
AND action_name IS NULL
AND username IS NULL
AND (sid=SYS_CONTEXT('USERENV','SID')
AND inst_id=SYS_CONTEXT('USERENV','INSTANCE')) )
--
OR ( module_name IS NULL
AND action_name IS NULL
AND (sid=SYS_CONTEXT('USERENV','SID')
AND inst_id=SYS_CONTEXT('USERENV','INSTANCE')) )
/

The cost to run this query the first time is shown below:


CONFIG_ID MODULE_NAME ACTION_NAME USERNAME   SID INST_ID LOG_LEVEL TRACE_ENABLED
--------- ----------- ----------- -------- ----- ------- --------- -------------
        1 module      action                             INFO      Y

Execution Plan
----------------------------------------------------------
Plan hash value: 2514110033

-------------------------------------------------------------------------------------------------
| Id  | Operation          | Name                       | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                            |     1 |    24 |     3   (0)| 00:00:01 |
|   1 |  RESULT CACHE      | fq5ph52dq320u241n2r7pxkxn5 |       |       |            |          |
|*  2 |   TABLE ACCESS FULL| DBLOG_CONFIG               |     1 |    24 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter(UPPER("MODULE_NAME")='MODULE' AND UPPER("ACTION_NAME")='ACTION' AND
"USERNAME" IS NULL AND "SID" IS NULL OR UPPER("MODULE_NAME")='MODULE' AND
UPPER("ACTION_NAME")='ACTION' AND UPPER("USERNAME")=UPPER(USER@!) AND "SID" IS NULL OR
UPPER("MODULE_NAME")='MODULE' AND UPPER("ACTION_NAME")='ACTION' AND
"SID"=TO_NUMBER(SYS_CONTEXT('USERENV','SID')) AND
"INST_ID"=TO_NUMBER(SYS_CONTEXT('USERENV','INSTANCE')) AND "USERNAME" IS NULL OR
UPPER("MODULE_NAME")='MODULE' AND UPPER("ACTION_NAME")='ACTION' AND
UPPER("USERNAME")=UPPER(USER@!) AND "SID"=TO_NUMBER(SYS_CONTEXT('USERENV','SID')) AND
"INST_ID"=TO_NUMBER(SYS_CONTEXT('USERENV','INSTANCE')) OR "MODULE_NAME" IS NULL AND
UPPER("ACTION_NAME")='ACTION' AND "USERNAME" IS NULL AND "SID" IS NULL OR "MODULE_NAME"
IS NULL AND UPPER("ACTION_NAME")='ACTION' AND UPPER("USERNAME")=UPPER(USER@!) AND "SID"
IS NULL OR "MODULE_NAME" IS NULL AND UPPER("ACTION_NAME")='ACTION' AND
"SID"=TO_NUMBER(SYS_CONTEXT('USERENV','SID')) AND
"INST_ID"=TO_NUMBER(SYS_CONTEXT('USERENV','INSTANCE')) AND "USERNAME" IS NULL OR
"MODULE_NAME" IS NULL AND UPPER("ACTION_NAME")='ACTION' AND
UPPER("USERNAME")=UPPER(USER@!) AND "SID"=TO_NUMBER(SYS_CONTEXT('USERENV','SID')) AND
"INST_ID"=TO_NUMBER(SYS_CONTEXT('USERENV','INSTANCE')) OR "MODULE_NAME" IS NULL AND
"ACTION_NAME" IS NULL AND "USERNAME" IS NULL AND "SID" IS NULL OR "MODULE_NAME" IS NULL
AND "ACTION_NAME" IS NULL AND UPPER("USERNAME")=UPPER(USER@!) AND "SID" IS NULL OR
"MODULE_NAME" IS NULL AND "ACTION_NAME" IS NULL AND
"SID"=TO_NUMBER(SYS_CONTEXT('USERENV','SID')) AND
"INST_ID"=TO_NUMBER(SYS_CONTEXT('USERENV','INSTANCE')) AND "USERNAME" IS NULL OR
"MODULE_NAME" IS NULL AND "ACTION_NAME" IS NULL AND
"SID"=TO_NUMBER(SYS_CONTEXT('USERENV','SID')) AND
"INST_ID"=TO_NUMBER(SYS_CONTEXT('USERENV','INSTANCE')))

Result Cache Information (identified by operation id):
------------------------------------------------------

1 - column-count=8; dependencies=(DBAMON.DBLOG_CONFIG); parameters=(uid, nls, sys_context); name="SELECT /*+ RESULT_CACHE */ config_id,module_name,action_name,
username,sid,inst_id,UPPER(log_level) AS log_level,
UPPER(trace_en"

Statistics
----------------------------------------------------------
127  recursive calls
0  db block gets
153  consistent gets
0  physical reads
0  redo size
887  bytes sent via SQL*Net to client
419  bytes received via SQL*Net from client
2  SQL*Net roundtrips to/from client
6  sorts (memory)
0  sorts (disk)
1  rows processed

Note that the explain plan contains a result cache operation so our hint was accepted. Subsequent runs show the advantage of the result cache. No logical I/Os are required once the results are cached:


Statistics
----------------------------------------------------------
0  recursive calls
0  db block gets
0  consistent gets
0  physical reads
0  redo size
887  bytes sent via SQL*Net to client
419  bytes received via SQL*Net from client
2  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
1  rows processed

The second query is almost “free”. What happens if we change a value in our “config” table? If we change the logging level from “INFO” to “DEBUG”, Oracle has to do a little more work:


CONFIG_ID MODULE_NAME ACTION_NAME USERNAME   SID INST_ID LOG_LEVEL TRACE_ENABLED
--------- ----------- ----------- -------- ----- ------- --------- -------------
        1 module      action                             DEBUG     Y

Statistics
----------------------------------------------------------
0  recursive calls
0  db block gets
7  consistent gets
0  physical reads
0  redo size
888  bytes sent via SQL*Net to client
419  bytes received via SQL*Net from client
2  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
1  rows processed

Although I didn’t show it, the execution plan stays the same, but the query now requires 7 logical I/Os. The update invalidated the cached result, so the database had to access the table to get the updated data. A subsequent query shows the use of the query result cache again:


CONFIG_ID MODULE_NAME ACTION_NAME USERNAME   SID INST_ID LOG_LEVEL TRACE_ENABLED
--------- ----------- ----------- -------- ----- ------- --------- -------------
        1 module      action                             DEBUG     Y

Statistics
----------------------------------------------------------
0  recursive calls
0  db block gets
0  consistent gets
0  physical reads
0  redo size
888  bytes sent via SQL*Net to client
419  bytes received via SQL*Net from client
2  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
1  rows processed
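For reference, the change that invalidated the cached result between runs is just ordinary DML against the configuration table. Assuming the column names shown in the query output above, it would look something like:

```sql
-- hypothetical sketch: any committed change to dblog_config invalidates
-- the cached result, forcing the next query to re-read the table
UPDATE dblog_config
   SET log_level = 'DEBUG'
 WHERE config_id = 1;
COMMIT;
```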

So if we use this technique to query our configuration table during each call from the framework, the cost of being able to dynamically adjust logging is quite low. This means we can easily change the logging level, or enable or disable logging or tracing on the fly, for any session that uses the framework.
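Within the framework, each logging call can resolve its configuration through a small helper built around the hinted query. The following is only a sketch under assumed names, not the OLF source; the real lookup matches module, action, username, and session, as the predicate section above shows:

```sql
-- hypothetical helper: look up the effective log level for the current
-- module/action via the result-cached configuration query
CREATE OR REPLACE FUNCTION get_log_level(
  p_module IN VARCHAR2,
  p_action IN VARCHAR2 )
  RETURN dblog_config.log_level%TYPE
IS
  l_level dblog_config.log_level%TYPE;
BEGIN
  SELECT /*+ RESULT_CACHE */ UPPER(log_level)
    INTO l_level
    FROM dblog_config
   WHERE ( UPPER(module_name) = UPPER(p_module) OR module_name IS NULL )
     AND ( UPPER(action_name) = UPPER(p_action) OR action_name IS NULL )
     AND ROWNUM = 1;
  RETURN l_level;
EXCEPTION
  WHEN NO_DATA_FOUND THEN
    RETURN NULL;  -- no matching configuration row: logging stays off
END get_log_level;
/
```

Because the SELECT is served from the result cache after the first execution, calling a helper like this on every dblog invocation stays cheap.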

Note: I changed the name of the configuration table from what I had originally posted and this is now reflected in the examples. The configuration table will be called “dblog_config” in the first version of the OLF.

 

Oracle Logging Framework

April 9th, 2012

There is finally something to publish about the Oracle Logging Framework (OLF). This is a project that I have been thinking about for a long time; I even registered it as a SourceForge project back in 2008. Since then I’ve written about the framework in a paper titled “Oracle Instrumentation” that was published in IOUG SELECT magazine in 2009, and I spoke about the framework at last year’s Oracle Open World. I’ve also written about it on this blog, once about integrating with the Method-R ILO and once summarizing the Open World presentations.

Well, I’ve finally gotten around to almost finishing it, and I thought I would start to blog about the design of the code and then use the resulting posts to create the user guide. It will also give me a chance to document why some of the design decisions were made, and generally what this logger has to offer that the other Oracle loggers out there don’t. I plan to cover the requirements laid out in the original Oracle Instrumentation paper and how each of them was implemented. Along the way there will be plenty of examples and, hopefully, a compelling case to give it a try.

The following are the basic logging requirements of the framework:

  • Debugging should be written to one or more “persistent” destinations, and preferably to a console for initial debugging.
  • It should be possible to activate debugging remotely for one or more “users”.
  • It should be possible to enable debugging for one or more “modules” in the code.
  • It should provide logging levels (e.g. informational or error).
  • It should be able to record errors and anomalies.
  • It should be able to record runtime information.
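Taken together, the requirements imply a fairly small PL/SQL API. As a rough sketch only (the procedure names follow the dblog calls used in the examples elsewhere on this blog; everything else is an assumption):

```sql
-- minimal package spec implied by the requirements above (a sketch,
-- not the actual OLF package)
CREATE OR REPLACE PACKAGE dblog
AS
  -- record runtime information at increasing levels of detail
  PROCEDURE info  ( p_text IN VARCHAR2 );
  PROCEDURE debug ( p_text IN VARCHAR2 );
  -- record errors and anomalies, capturing the Oracle error context
  PROCEDURE error ( p_text IN VARCHAR2 );
END dblog;
/
```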

The first post will be about dynamic logging, which is really a prerequisite for being able to activate logging on the fly. Then we’ll cover logging levels. And after that we’ll have to see.

Open World 2011 Presentations

October 4th, 2011

I want to thank everyone for attending my two sessions yesterday. I also want to thank the OAUG Database SIG for the opportunity to present how we’re using DBFS HS at NIF and the IOUG for letting me speak about instrumenting applications for Oracle databases. Since I ran short on time in the instrumentation talk, I wanted to post the details of the logging and debugging examples.

The first example shows setting the level to INFO and then running two different anonymous PL/SQL blocks. The first doesn’t set the module and action and so it doesn’t output anything. The second block does set module and action and so the dblog calls output to the console and the log table.


INFO Level:

SQL> !cat insert_info.sql
insert into dblog_settings values(1,'module','action',null,null,null,'INFO','N');
commit;

SQL> @insert_info

1 row created.

Commit complete.

SQL> @show_settings

USERNAME        SID    INST_ID module   action   level  trace
-------- ---------- ---------- -------- -------- ------ -----
                               module   action   INFO   N

SQL> !cat dblog_info.sql
declare
  l_num number;
begin
  dblog.info('Before statement');
  select 1 into l_num from dual;
  dblog.info('After statement');
exception
  when others then
    dblog.error('Error message');
end;
/

SQL> @dblog_info

PL/SQL procedure successfully completed.

SQL> @show_log

no rows selected

SQL> !cat dblog_info_task.sql
declare
  l_num number;
begin
  ilo_task.begin_task(
    module => 'module',
    action => 'action');
  --
  dblog.info('Before statement');
  select 1 into l_num from dual;
  dblog.info('After statement');
  --
  ilo_task.end_task;
end;
/

SQL> @dblog_info_task.sql
INFO:DBAMON:51:1:03-OCT-11 02.36.40.269133 PM:module:action
INFO:DBAMON.ANONYMOUS BLOCK:Line 8:Before statement

INFO:DBAMON:51:1:03-OCT-11 02.36.40.275672 PM:module:action
INFO:DBAMON.ANONYMOUS BLOCK:Line 10:After statement

PL/SQL procedure successfully completed.

SQL> @show_log

USERNAME        SID    INST_ID DATETIME           module   action   level    TASKTIME OBJECT_NAME               LINE_NO TEXT
-------- ---------- ---------- ------------------ -------- -------- ------ ---------- ---------------------- ---------- --------------------
DBAMON           51          1 03-OCT-11 02.36.40 module   action   INFO              DBAMON.ANONYMOUS BLOCK          8 Before statement
DBAMON           51          1 03-OCT-11 02.36.40 module   action   INFO              DBAMON.ANONYMOUS BLOCK         10 After statement

SQL>

The next example shows how task times can be recorded using the ILO to mark the begin and end times:


SQL> @clean_log

Table truncated.

SQL> !cat dblog_info_tasktime.sql
declare
  l_num number;
begin
  ilo_task.begin_task(
    module => 'module',
    action => 'action',
    comment => 'Task1',
    begin_time => systimestamp);
  --
  dblog.info('Before statement');
  select 1 into l_num from dual;
  dblog.info('After statement');
  --
  ilo_task.end_task(
    end_time => systimestamp);
end;
/

SQL> @dblog_info_tasktime.sql
INFO:DBAMON:60:1:03-OCT-11 02.41.48.059157 PM:module:action
INFO:DBAMON.ANONYMOUS BLOCK:Line 10:Before statement

INFO:DBAMON:60:1:03-OCT-11 02.41.48.107721 PM:module:action
INFO:DBAMON.ANONYMOUS BLOCK:Line 12:After statement

TASK_TIME:DBAMON:60:1:03-OCT-11 02.41.48.114385 PM:module:action
TASK_TIME:ILO.ILO_TIMER:Line 127:Task1

PL/SQL procedure successfully completed.

SQL> @show_log

USERNAME        SID    INST_ID DATETIME           module   action   level    TASKTIME OBJECT_NAME               LINE_NO TEXT
-------- ---------- ---------- ------------------ -------- -------- ------ ---------- ---------------------- ---------- --------------------
DBAMON           60          1 03-OCT-11 02.41.48 module   action   INFO              DBAMON.ANONYMOUS BLOCK         10 Before statement
DBAMON           60          1 03-OCT-11 02.41.48 module   action   INFO              DBAMON.ANONYMOUS BLOCK         12 After statement
DBAMON           60          1 03-OCT-11 02.41.48 module   action             .057362 ILO.ILO_TIMER                 127 Task1

SQL>

The third example shows how to invoke tracing when a task is matched and the trace_enabled flag has been set:


SQL> !cat delete_info.sql
delete from dblog_settings;
commit;

SQL> @delete_info.sql

1 row deleted.

Commit complete.

SQL> !cat insert_trace.sql
insert into dblog_settings values(1,'module','action',null,null,null,'INFO','Y');
commit;

SQL> @insert_trace.sql

1 row created.

Commit complete.

SQL> @show_settings

USERNAME        SID    INST_ID module   action   level  trace
-------- ---------- ---------- -------- -------- ------ -----
                               module   action   INFO   Y

SQL> !cat dblog_trace.sql
declare
  l_num number;
begin
  ilo_task.begin_task(
    module => 'module',
    action => 'action');
  --
  dblog.info('Before statement');
  select 1 into l_num from dual;
  dblog.info('After statement');
  --
  ilo_task.end_task;
end;
/

SQL> @dblog_trace.sql
GET_CONFIG: tracing has been enabled
INFO:DBAMON:60:1:03-OCT-11 02.44.49.047982 PM:module:action
INFO:DBAMON.ANONYMOUS BLOCK:Line 8:Before statement

INFO:DBAMON:60:1:03-OCT-11 02.44.49.105359 PM:module:action
INFO:DBAMON.ANONYMOUS BLOCK:Line 10:After statement

PL/SQL procedure successfully completed.

SQL>

SQL> @dispusr
Enter value for 1: dbamon

User         Inst  Oracle  Serial OS      User          Client     Client                                               Service
Name           ID     SID  Number PID       ID Status   Identifier Info       Module               Action               Name
----------- ----- ------- ------- ------ ----- -------- ---------- ---------- -------------------- -------------------- ---------------
DBAMON          1      50     131 4585     262 INACTIVE                       SQL Developer                             SYS$USERS
DBAMON          1      60     473 5466     262 INACTIVE                       SQL*Plus                                  SYS$USERS
DBAMON          1      48     193 5526     262 ACTIVE                         SQL*Plus                                  SYS$USERS
SQL>

[oracle@localhost debugging]$ cd /home/oracle/app/oracle/diag/rdbms/orcl/orcl/trace
[oracle@localhost trace]$ ls -lrt
total 3108
-rw-rw---- 1 oracle oracle     85 Sep  8 12:16 orcl_ora_2613.trm
-rw-rw---- 1 oracle oracle    828 Sep  8 12:16 orcl_ora_2613.trc
-rw-rw---- 1 oracle oracle     59 Sep  8 12:16 orcl_mman_2690.trm
-rw-rw---- 1 oracle oracle    874 Sep  8 12:16 orcl_mman_2690.trc

< lines deleted >

-rw-rw---- 1 oracle oracle   4021 Oct  3 12:52 orcl_lgwr_2660.trc
-rw-rw---- 1 oracle oracle 412070 Oct  3 12:55 alert_orcl.log
-rw-rw---- 1 oracle oracle    265 Oct  3 14:44 orcl_ora_5466.trm
-rw-rw---- 1 oracle oracle  31902 Oct  3 14:44 orcl_ora_5466.trc
-rw-rw---- 1 oracle oracle    506 Oct  3 14:46 orcl_mmon_2668.trm
-rw-rw---- 1 oracle oracle   5177 Oct  3 14:46 orcl_mmon_2668.trc
[oracle@localhost trace]$ more orcl_ora_5466.trc
Trace file /home/oracle/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_5466.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /home/oracle/app/oracle/product/11.2.0/dbhome_2
System name:    Linux
Node name:      localhost.localdomain
Release:        2.6.18-194.17.1.0.1.el5
Version:        #1 SMP Wed Sep 29 15:40:03 EDT 2010
Machine:        i686
Instance name: orcl
Redo thread mounted by this instance: 1
Oracle process number: 40
Unix process pid: 5466, image: oracle@localhost.localdomain (TNS V1-V3)

*** 2011-10-03 14:44:49.029
*** SESSION ID:(60.473) 2011-10-03 14:44:49.029
*** CLIENT ID:() 2011-10-03 14:44:49.029
*** SERVICE NAME:(SYS$USERS) 2011-10-03 14:44:49.029
*** MODULE NAME:(SQL*Plus) 2011-10-03 14:44:49.029
*** ACTION NAME:() 2011-10-03 14:44:49.029

=====================
PARSING IN CURSOR #44032292 len=70 dep=1 uid=260 oct=47 lid=260 tim=1317678288954581 hv=3588952050 ad='3ad591d8' sqlid='2qkj5qvayq3zk'
BEGIN dbms_monitor.session_trace_enable (NULL, NULL, TRUE, TRUE); END;
END OF STMT
EXEC #44032292:c=0,e=771,p=0,cr=0,cu=0,mis=1,r=1,dep=1,og=1,plh=0,tim=1317678288954565
CLOSE #44032292:c=0,e=4,dep=1,type=3,tim=1317678289029614
ILO_TASK.BEGIN_TASK[][module][action][oracle~pts/1~sqlplus@localhost.localdomain (TNS V1-V3)~SYS$USERS][]

*** 2011-10-03 14:44:49.030
*** MODULE NAME:(module) 2011-10-03 14:44:49.030
*** ACTION NAME:(action) 2011-10-03 14:44:49.030

=====================
PARSING IN CURSOR #67389968 len=51 dep=1 uid=260 oct=47 lid=260 tim=1317678289030201 hv=3782099507 ad='3ad8fad0' sqlid='0dk9kgvhqwhjm'
BEGIN DBMS_SESSION.SET_IDENTIFIER(:client_id); END;
END OF STMT
BINDS #67389968:
Bind#0
oacdty=01 mxl=2000(255) mxlc=00 mal=00 scl=00 pre=00
oacflg=13 fl2=206001 frm=01 csi=873 siz=2000 off=0
kxsbbbfp=03fdcd24  bln=2000  avl=64  flg=09
value="oracle~pts/1~sqlplus@localhost.localdomain (TNS V1-V3)~SYS$USERS"
*** CLIENT ID:(oracle~pts/1~sqlplus@localhost.localdomain (TNS V1-V3)~SYS$USERS) 2011-10-03 14:44:49.047

EXEC #67389968:c=0,e=17330,p=0,cr=0,cu=0,mis=0,r=1,dep=1,og=1,plh=0,tim=1317678289047514
CLOSE #67389968:c=0,e=5,dep=1,type=3,tim=1317678289047606

< Truncated >

[oracle@localhost trace]$

And finally, here is the error level being called from an exception handler. Notice the backtrace information with the line number where the error occurred:


SQL> @dblog_error.sql
FATAL:DBAMON:51:1:03-OCT-11 02.28.30.231192 PM:SQL*Plus:
FATAL:DBAMON.DBLOG:Line 861:Error message
FATAL Oracle error code: 100
FATAL Oracle error message: ORA-01403: no data found
FATAL Backtrace: ORA-06512: at line 4

PL/SQL procedure successfully completed.

SQL> l
1  declare
2    l_num number;
3  begin
4    select 1 into l_num from dummy;
5  exception
6    when others then
7      dblog.error('Error message');
8* end;
SQL> @show_error

USERNAME DATETIME           level  OBJECT_NAME               LINE_NO  SQL_ERRNO SQL_ERRMSG                SQL_BACKTRACE
-------- ------------------ ------ ---------------------- ---------- ---------- ------------------------- -------------------------
DBAMON   30-SEP-11 01.15.10 ERROR  DBAMON.ANONYMOUS BLOCK          7        100 ORA-01403: no data found  ORA-06512: at line 4

SQL>

I hope this helps and I’ll post links to the presentations and the instrumentation paper shortly.

The LAG Function

June 23rd, 2011

The Oracle LAG function returns the value of the previous row within a table without the need for a self join. The LAG and LEAD functions have been available since Oracle 8.1.6, when analytic functions were first introduced. Tom Kyte dedicated an entire chapter to analytic functions in his first book, “Expert One-on-One Oracle”, and they have obviously been part of the Oracle documentation as well. Their adoption has been slower, though, and I will admit to being slow myself when it comes to converting code from a self join that subtracts the n and n+1 values to using LAG. In the open source SYSMON code, many of the views used for interval summation used a function of the form:

  DECODE( SIGN(e.phyrds - b.phyrds),-1,0,
    (e.phyrds - b.phyrds) ),

This required a self join to get the begin and end values for each interval, and the DECODE was there to ensure that no negative deltas occurred after an instance restart. The full SQL looked similar to this:

WITH g_fileio AS
(
  select
    fs.snap_id,
    fs.dbid,
    fs.instance_number,
    fs.tsname,
    fs.filename,
    fs.phyrds
  from
    dba_hist_filestatxs fs
)
SELECT
  b.begin_interval_time,
  e.begin_interval_time,
  e.snap_id,
  e.dbid,
  e.instance_number,
  e.tsname,
  e.filename,
  DECODE( SIGN(e.phyrds - b.phyrds),-1,0,
    (e.phyrds - b.phyrds) ),
  DECODE( SIGN(e.phyblkrd - b.phyblkrd),-1,0,
    (e.phyblkrd - b.phyblkrd) ),
  DECODE( SIGN(e.readtim - b.readtim),-1,0,
    (e.readtim - b.readtim) ),
  DECODE( SIGN(e.singleblkrds - b.singleblkrds),-1,0,
    (e.singleblkrds - b.singleblkrds) ),
  DECODE( SIGN(e.singleblkrdtim - b.singleblkrdtim),-1,0,
    (e.singleblkrdtim - b.singleblkrdtim) ),
  DECODE( SIGN(e.phywrts - b.phywrts),-1,0,
    (e.phywrts - b.phywrts) ),
  DECODE( SIGN(e.phyblkwrt - b.phyblkwrt),-1,0,
    (e.phyblkwrt - b.phyblkwrt) ),
  DECODE( SIGN(e.writetim - b.writetim),-1,0,
    (e.writetim - b.writetim) )
FROM
  g_fileio b,
  g_fileio e
WHERE
  e.snap_id = b.snap_id + 1
  AND b.tsname = e.tsname
  AND b.filename = e.filename
  AND b.dbid = e.dbid
  AND b.instance_number = e.instance_number

The following will compare the “old” way of using a self join with the LAG function.

LAG Function Usage

The following example shows how the LAG function can be used when accessing the AWR tables in Oracle 11g. Initially just one “value” will be accessed to show how the LAG function works. The following SQL illustrates the example:

column filename format a50;
WITH g_fileio AS
(
  select
    fs.snap_id,
    fs.dbid,
    fs.instance_number,
    fs.tsname,
    fs.filename,
    fs.phyrds
  from
    dba_hist_filestatxs fs
)
SELECT
  e.snap_id,
  e.instance_number,
  e.filename,
  e.phyrds as curr_rds,
  LAG(e.phyrds,1,0) OVER(
    PARTITION BY e.dbid, e.instance_number, e.tsname, e.filename
    ORDER BY e.snap_id ) as prev_rds
FROM
  g_fileio e
WHERE
  e.snap_id between 83578 and 83590
  and e.tsname = 'DATA_TBS'
  and e.instance_number = 2
ORDER BY
  e.snap_id
/

This LAG function has been written so that if the previous row is outside the scope of our result set it returns zero; otherwise it returns the previous row’s value for phyrds. When this SQL is run we see that the “previous” value for the first interval is 0. This makes sense since there is no previous row in our set, as constrained by “e.snap_id between 83578 and 83590”, and as noted our LAG function is defined to return zero in that case. The rest of the rows return the previous row’s phyrds value. Notice that we’ve ordered by snap_id to make this easy to see in the example below:

   SNAP_ID INSTANCE_NUMBER FILENAME                                 CURRRDS    PREVRDS
---------- --------------- ------------------------------------- ---------- ----------
     83578               2 +ORADATA01/db11g/data_tbs_01.dbf           36477          0
     83579               2 +ORADATA01/db11g/data_tbs_01.dbf           36484      36477
     83580               2 +ORADATA01/db11g/data_tbs_01.dbf           36484      36484
     83581               2 +ORADATA01/db11g/data_tbs_01.dbf           36512      36484
     83582               2 +ORADATA01/db11g/data_tbs_01.dbf           36512      36512
     83583               2 +ORADATA01/db11g/data_tbs_01.dbf           36519      36512
     83584               2 +ORADATA01/db11g/data_tbs_01.dbf           36519      36519
     83585               2 +ORADATA01/db11g/data_tbs_01.dbf           36541      36519
     83586               2 +ORADATA01/db11g/data_tbs_01.dbf           36541      36541
     83587               2 +ORADATA01/db11g/data_tbs_01.dbf           36542      36541
     83588               2 +ORADATA01/db11g/data_tbs_01.dbf           36544      36542
     83589               2 +ORADATA01/db11g/data_tbs_01.dbf           36544      36544
     83590               2 +ORADATA01/db11g/data_tbs_01.dbf           36545      36544

This is an important observation that we will explore further in a subsequent section.
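The three-argument form of LAG is easy to see in isolation on a small generated row source:

```sql
-- LAG(expr, offset, default): the default is returned when the
-- offset row falls outside the partition
SELECT level AS n,
       LAG(level, 1, 0) OVER (ORDER BY level) AS prev_n
FROM dual
CONNECT BY level <= 3;

--  N  PREV_N
--  1       0   <- no prior row, so the default (0) is used
--  2       1
--  3       2
```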

Efficiency

So an interesting question is how much better, if at all, is the LAG function over the original self join?

In order to make the “lag” example produce the same output as the “self-join” example, the LAG function was moved into the “WITH” clause so that the row prior to the first row of the result set would be available. In our first example we deliberately didn’t do this, to illustrate how the LAG function handles the situation where there is no previous row.

The “self-join” SQL:

column filename format a50;
SELECT
  e.snap_id,
  e.instance_number,
  e.filename,
  DECODE( SIGN(e.phyrds - b.phyrds),-1,0,
    (e.phyrds - b.phyrds) ) AS delta_phyrds
FROM
  dba_hist_filestatxs b,
  dba_hist_filestatxs e
WHERE
  e.snap_id = b.snap_id + 1
  AND b.filename = e.filename
  AND b.dbid = e.dbid
  AND b.instance_number = e.instance_number
  and e.snap_id between 83578 and 83590
  and e.tsname = 'DATA_TBS'
  and e.instance_number = 2
ORDER BY
  e.snap_id
/

The “lag” SQL:

column filename format a50;
WITH g_fileio AS
(
  select
    fs.snap_id,
    fs.dbid,
    fs.instance_number,
    fs.tsname,
    fs.filename,
    fs.phyrds - LAG(fs.phyrds,1,0)
    OVER( PARTITION BY fs.dbid, fs.instance_number, fs.tsname, fs.filename
      ORDER BY fs.snap_id ) as delta_phyrds
  from
    dba_hist_filestatxs fs
)
SELECT
  e.snap_id,
  e.instance_number,
  e.filename,
  e.delta_phyrds
FROM
  g_fileio e
WHERE
  e.snap_id BETWEEN 83578 AND 83590 AND
  e.tsname = 'DATA_TBS' AND
  e.instance_number = 2
ORDER BY
  e.snap_id
/
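One difference worth noting: this rewrite drops the DECODE(SIGN()) guard that protected the self-join version against negative deltas after an instance restart. If that protection is still needed, the same guard can wrap the LAG delta inside the WITH clause, for example:

```sql
-- sketch: re-applying the instance-restart guard to the LAG delta,
-- so a counter reset never produces a negative value
DECODE( SIGN( fs.phyrds - LAG(fs.phyrds,1,0)
          OVER( PARTITION BY fs.dbid, fs.instance_number,
                             fs.tsname, fs.filename
                ORDER BY fs.snap_id ) ), -1, 0,
  fs.phyrds - LAG(fs.phyrds,1,0)
    OVER( PARTITION BY fs.dbid, fs.instance_number,
                       fs.tsname, fs.filename
          ORDER BY fs.snap_id ) ) AS delta_phyrds
```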

Using Tom Kyte’s “runstats” utility makes testing the difference between these two methods easy. The tests below use a slightly modified version that summarizes the output into a “workload” format, which shows just what is needed to see the difference between the two methods.

The test was run several times to try to filter out the effects of caching. In fact, the “self-join” version is particularly sensitive to caching and can be expected to run much more slowly when the data is not in the buffer cache.

Test Script

exec runStats_pkg.rs_start;
column filename format a50;
SELECT
  e.snap_id,
  e.instance_number,
  e.filename,
  DECODE( SIGN(e.phyrds - b.phyrds),-1,0,
    (e.phyrds - b.phyrds) ) AS delta_phyrds
FROM
  dba_hist_filestatxs b,
  dba_hist_filestatxs e
WHERE
  e.snap_id = b.snap_id + 1
  AND b.filename = e.filename
  AND b.dbid = e.dbid
  AND b.instance_number = e.instance_number
  and e.snap_id between 83578 and 83590
  and e.tsname = 'CMSPUB_DATA_TBS'
  and e.instance_number = 2
ORDER BY
  e.snap_id
/

exec runStats_pkg.rs_middle;

column filename format a50;
WITH g_fileio AS
(
  select
    fs.snap_id,
    fs.dbid,
    fs.instance_number,
    fs.tsname,
    fs.filename,
    fs.phyrds - LAG(fs.phyrds,1,0) OVER( PARTITION BY fs.dbid, fs.instance_number, fs.tsname, fs.filename
      ORDER BY fs.snap_id ) as delta_phyrds
  from
    dba_hist_filestatxs fs
)
SELECT
  e.snap_id,
  e.instance_number,
  e.filename,
  e.delta_phyrds
FROM
  g_fileio e
WHERE
  e.snap_id between 83578 and 83590
  and e.tsname = 'CMSPUB_DATA_TBS'
  and e.instance_number = 2
ORDER BY
  e.snap_id
/

exec runStats_pkg.rs_stop(p_output=>'WORKLOAD');

Workload Results

The following are the results from three tests run one right after another. In each case the LAG version is faster and does less work. In the third run I reversed the order, just to see whether running one first made any difference.

SQL> Run1 ran in 118 hsecs
Run2 ran in 11 hsecs
run 1 ran in 1072.73% of the time

Name                                            Run1        Run2        Diff
STAT...sorts (memory)                              2           3           1
STAT...physical reads                             29           0         -29
STAT...physical reads cache                       29           0         -29
STAT...CPU used by this session                   89          13         -76
STAT...Elapsed Time                              121          14        -107
STAT...redo size                               4,568       4,816         248
STAT...sorts (rows)                            5,860      11,707       5,847
STAT...session logical reads                 170,866      19,517    -151,349

Run1 latches total versus runs -- difference and pct
Run1        Run2        Diff       Pct
1,793,976     135,443  -1,658,533  1,324.52%

PL/SQL procedure successfully completed.

SQL>

SQL> Run1 ran in 85 hsecs
Run2 ran in 24 hsecs
run 1 ran in 354.17% of the time

Name                                            Run1        Run2        Diff
STAT...recursive cpu usage                         3           2          -1
STAT...sorts (memory)                              2           3           1
STAT...physical reads cache                        8           0          -8
STAT...physical reads                              8           0          -8
STAT...CPU used by this session                   75          25         -50
STAT...Elapsed Time                               88          28         -60
STAT...redo size                               5,952       5,108        -844
STAT...sorts (rows)                            5,861      11,709       5,848
STAT...session logical reads                 170,942      19,546    -151,396

Run1 latches total versus runs -- difference and pct
Run1        Run2        Diff       Pct
1,895,158     701,188  -1,193,970    270.28%

PL/SQL procedure successfully completed.

SQL>

SQL> Run1 ran in 24 hsecs
Run2 ran in 88 hsecs
run 1 ran in 27.27% of the time

Name                                            Run1        Run2        Diff
STAT...sorts (memory)                              3           2          -1
STAT...recursive cpu usage                         2           4           2
STAT...Elapsed Time                               28          90          62
STAT...CPU used by this session                   25          90          65
STAT...redo size                               4,288       5,908       1,620
STAT...sorts (rows)                           11,709       5,861      -5,848
STAT...session logical reads                  19,500     170,948     151,448

Run1 latches total versus runs -- difference and pct
Run1        Run2        Diff       Pct
510,514   2,151,923   1,641,409     23.72%

PL/SQL procedure successfully completed.

SQL>

Caveats

The biggest problem with using the LAG function is ensuring that the row set it operates on is complete. This means that the row prior to the first row used in the result set must be available, which usually means that an inline view or a WITH clause must be used to ensure that the query has a prior row to work with. The following example illustrates this. In the first case the LAG function is calculated as part of a single query. The WHERE clause restricts the data to a specific range of snap_ids, and the resulting lag entry is incorrect for the first row:

select
  snap_id,
  wait_class,
  total_waits_fg AS waits_fg,
  total_waits_fg - LAG(total_waits_fg,1,0)
  OVER(PARTITION BY dbid, instance_number, event_id, wait_class_id
    ORDER BY snap_id) AS waits_fg_lag
from
  dba_hist_system_event
where
  snap_id between 28452 and 28462
  and wait_class = 'Commit'
  and instance_number = 1
order by
  snap_id
/
   SNAP_ID WAIT_CLASS                                         WAITS_FG WAITS_FG_LAG
---------- ------------------------------------------------ ---------- ------------
     28452 Commit                                              1241804      1241804
     28453 Commit                                              1244080         2276
     28454 Commit                                              1246347         2267
     28455 Commit                                              1248675         2328
     28456 Commit                                              1250995         2320
     28457 Commit                                              1253477         2482
     28458 Commit                                              1255821         2344
     28459 Commit                                              1258202         2381
     28460 Commit                                              1260383         2181
     28461 Commit                                              1262600         2217
     28462 Commit                                              1264982         2382

11 rows selected.

SQL>

The next example uses an inline view to allow the LAG function to have the row prior to the first row available to operate on:

compute sum of waits_fg_lag on wait_class;
break on report;
select *
from (
  select
    snap_id,
    wait_class,
    total_waits_fg AS waits_fg,
    total_waits_fg - LAG(total_waits_fg,1,0)
      OVER(PARTITION BY dbid, instance_number, event_id, wait_class_id
      ORDER BY snap_id) AS waits_fg_lag
from
  dba_hist_system_event
where
  instance_number = 1
  and wait_class = 'Commit'
)
where
  snap_id between (28452 + 1) and 28462
order by
  snap_id
/
   SNAP_ID WAIT_CLASS                                         WAITS_FG WAITS_FG_LAG
---------- ------------------------------------------------ ---------- ------------
     28453 Commit                                              1244080         2276
     28454 Commit                                              1246347         2267
     28455 Commit                                              1248675         2328
     28456 Commit                                              1250995         2320
     28457 Commit                                              1253477         2482
     28458 Commit                                              1255821         2344
     28459 Commit                                              1258202         2381
     28460 Commit                                              1260383         2181
     28461 Commit                                              1262600         2217
     28462 Commit                                              1264982         2382

10 rows selected.

SQL>

In this case the lag result for the first row is correct. Again, this is because the row prior to the first snap_id is available to the LAG function.
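A related option is to omit LAG’s default argument entirely. The boundary row then reports NULL rather than a misleadingly large number, which at least makes the incomplete delta obvious to whoever reads the output; a sketch against the same query:

```sql
-- sketch: with no default, LAG returns NULL for the boundary row
-- instead of a delta computed against the default value of 0
select
  snap_id,
  wait_class,
  total_waits_fg AS waits_fg,
  total_waits_fg - LAG(total_waits_fg)
    OVER(PARTITION BY dbid, instance_number, event_id, wait_class_id
         ORDER BY snap_id) AS waits_fg_lag
from
  dba_hist_system_event
where
  snap_id between 28452 and 28462
  and wait_class = 'Commit'
  and instance_number = 1
order by
  snap_id
/
```

Filtering those NULL rows out still requires an inline view, so this does not remove the need for care; it just fails loudly instead of silently.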

Conclusion

The LAG function is much more efficient in these tests than the self-join. The session logical reads average about 19,500 for the lag test and around 171,000 for the self-join test. The run time is always less for the lag test, and the self-join test seems particularly sensitive to the benefits of caching when run repeatedly. When initially run, the times were significantly worse (see the full runstats test):

Run1 ran in 8296 hsecs
Run2 ran in 1913 hsecs
run 1 ran in 433.66% of the time

The only disadvantage is that more care must be taken to code the query in such a way that the row prior to the first row wanted is available to the LAG function.