Open World 2011 Presentations

October 4th, 2011

I want to thank everyone for attending my two sessions yesterday. I also want to thank the OAUG Database SIG for the opportunity to present how we’re using DBFS HS at NIF and the IOUG for letting me speak about instrumenting applications for Oracle databases. Since I ran short on time in the instrumentation talk, I wanted to post the details of the logging and debugging examples.

The first example shows setting the level to INFO and then running two different anonymous PL/SQL blocks. The first doesn’t set the module and action and so it doesn’t output anything. The second block does set module and action and so the dblog calls output to the console and the log table.


INFO Level:

SQL> !cat insert_info.sql
insert into dblog_settings values(1,'module','action',null,null,null,'INFO','N');
commit;

SQL> @insert_info

1 row created.

Commit complete.

SQL> @show_settings

USERNAME        SID    INST_ID module   action   level  trace
-------- ---------- ---------- -------- -------- ------ -----
                               module   action   INFO   N

SQL> !cat dblog_info.sql
declare
  l_num number;
begin
  dblog.info('Before statement');
  select 1 into l_num from dual;
  dblog.info('After statement');
exception
  when others then
    dblog.error('Error message');
end;
/

SQL> @dblog_info

PL/SQL procedure successfully completed.

SQL> @show_log

no rows selected

SQL> !cat dblog_info_task.sql
declare
  l_num number;
begin
  ilo_task.begin_task(
    module => 'module',
    action => 'action');
  --
  dblog.info('Before statement');
  select 1 into l_num from dual;
  dblog.info('After statement');
  --
  ilo_task.end_task;
end;
/

SQL> @dblog_info_task.sql
INFO:DBAMON:51:1:03-OCT-11 02.36.40.269133 PM:module:action
INFO:DBAMON.ANONYMOUS BLOCK:Line 8:Before statement

INFO:DBAMON:51:1:03-OCT-11 02.36.40.275672 PM:module:action
INFO:DBAMON.ANONYMOUS BLOCK:Line 10:After statement

PL/SQL procedure successfully completed.

SQL> @show_log

USERNAME        SID    INST_ID DATETIME           module   action   level    TASKTIME OBJECT_NAME               LINE_NO TEXT
-------- ---------- ---------- ------------------ -------- -------- ------ ---------- ---------------------- ---------- --------------------
DBAMON           51          1 03-OCT-11 02.36.40 module   action   INFO              DBAMON.ANONYMOUS BLOCK          8 Before statement
DBAMON           51          1 03-OCT-11 02.36.40 module   action   INFO              DBAMON.ANONYMOUS BLOCK         10 After statement

SQL>

The next example shows how task times can be recorded using the ILO to mark the begin and end times:


SQL> @clean_log

Table truncated.

SQL> !cat dblog_info_tasktime.sql
declare
  l_num number;
begin
  ilo_task.begin_task(
    module => 'module',
    action => 'action',
    comment => 'Task1',
    begin_time => systimestamp);
  --
  dblog.info('Before statement');
  select 1 into l_num from dual;
  dblog.info('After statement');
  --
  ilo_task.end_task(
    end_time => systimestamp);
end;
/

SQL> @dblog_info_tasktime.sql
INFO:DBAMON:60:1:03-OCT-11 02.41.48.059157 PM:module:action
INFO:DBAMON.ANONYMOUS BLOCK:Line 10:Before statement

INFO:DBAMON:60:1:03-OCT-11 02.41.48.107721 PM:module:action
INFO:DBAMON.ANONYMOUS BLOCK:Line 12:After statement

TASK_TIME:DBAMON:60:1:03-OCT-11 02.41.48.114385 PM:module:action
TASK_TIME:ILO.ILO_TIMER:Line 127:Task1

PL/SQL procedure successfully completed.

SQL> @show_log

USERNAME        SID    INST_ID DATETIME           module   action   level    TASKTIME OBJECT_NAME               LINE_NO TEXT
-------- ---------- ---------- ------------------ -------- -------- ------ ---------- ---------------------- ---------- --------------------
DBAMON           60          1 03-OCT-11 02.41.48 module   action   INFO              DBAMON.ANONYMOUS BLOCK         10 Before statement
DBAMON           60          1 03-OCT-11 02.41.48 module   action   INFO              DBAMON.ANONYMOUS BLOCK         12 After statement
DBAMON           60          1 03-OCT-11 02.41.48 module   action             .057362 ILO.ILO_TIMER                 127 Task1

SQL>

The third example shows how to invoke tracing when a task is matched and the trace_enabled flag has been set:


SQL> !cat delete_info.sql
delete from dblog_settings;
commit;

SQL> @delete_info.sql

1 row deleted.

Commit complete.

SQL> !cat insert_trace.sql
insert into dblog_settings values(1,'module','action',null,null,null,'INFO','Y');
commit;

SQL> @insert_trace.sql

1 row created.

Commit complete.

SQL> @show_settings

USERNAME        SID    INST_ID module   action   level  trace
-------- ---------- ---------- -------- -------- ------ -----
                               module   action   INFO   Y

SQL> !cat dblog_trace.sql
declare
  l_num number;
begin
  ilo_task.begin_task(
    module => 'module',
    action => 'action');
  --
  dblog.info('Before statement');
  select 1 into l_num from dual;
  dblog.info('After statement');
  --
  ilo_task.end_task;
end;
/

SQL> @dblog_trace.sql
GET_CONFIG: tracing has been enabled
INFO:DBAMON:60:1:03-OCT-11 02.44.49.047982 PM:module:action
INFO:DBAMON.ANONYMOUS BLOCK:Line 8:Before statement

INFO:DBAMON:60:1:03-OCT-11 02.44.49.105359 PM:module:action
INFO:DBAMON.ANONYMOUS BLOCK:Line 10:After statement

PL/SQL procedure successfully completed.

SQL>

SQL> @dispusr
Enter value for 1: dbamon

User         Inst  Oracle  Serial OS      User          Client     Client                                               Service
Name           ID     SID  Number PID       ID Status   Identifier Info       Module               Action               Name
----------- ----- ------- ------- ------ ----- -------- ---------- ---------- -------------------- -------------------- ---------------
DBAMON          1      50     131 4585     262 INACTIVE                       SQL Developer                             SYS$USERS
DBAMON          1      60     473 5466     262 INACTIVE                       SQL*Plus                                  SYS$USERS
DBAMON          1      48     193 5526     262 ACTIVE                         SQL*Plus                                  SYS$USERS
SQL>

[oracle@localhost debugging]$ cd /home/oracle/app/oracle/diag/rdbms/orcl/orcl/trace
[oracle@localhost trace]$ ls -lrt
total 3108
-rw-rw---- 1 oracle oracle     85 Sep  8 12:16 orcl_ora_2613.trm
-rw-rw---- 1 oracle oracle    828 Sep  8 12:16 orcl_ora_2613.trc
-rw-rw---- 1 oracle oracle     59 Sep  8 12:16 orcl_mman_2690.trm
-rw-rw---- 1 oracle oracle    874 Sep  8 12:16 orcl_mman_2690.trc

< lines deleted >

-rw-rw---- 1 oracle oracle   4021 Oct  3 12:52 orcl_lgwr_2660.trc
-rw-rw---- 1 oracle oracle 412070 Oct  3 12:55 alert_orcl.log
-rw-rw---- 1 oracle oracle    265 Oct  3 14:44 orcl_ora_5466.trm
-rw-rw---- 1 oracle oracle  31902 Oct  3 14:44 orcl_ora_5466.trc
-rw-rw---- 1 oracle oracle    506 Oct  3 14:46 orcl_mmon_2668.trm
-rw-rw---- 1 oracle oracle   5177 Oct  3 14:46 orcl_mmon_2668.trc
[oracle@localhost trace]$ more orcl_ora_5466.trc
Trace file /home/oracle/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_5466.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /home/oracle/app/oracle/product/11.2.0/dbhome_2
System name:    Linux
Node name:      localhost.localdomain
Release:        2.6.18-194.17.1.0.1.el5
Version:        #1 SMP Wed Sep 29 15:40:03 EDT 2010
Machine:        i686
Instance name: orcl
Redo thread mounted by this instance: 1
Oracle process number: 40
Unix process pid: 5466, image: oracle@localhost.localdomain (TNS V1-V3)

*** 2011-10-03 14:44:49.029
*** SESSION ID:(60.473) 2011-10-03 14:44:49.029
*** CLIENT ID:() 2011-10-03 14:44:49.029
*** SERVICE NAME:(SYS$USERS) 2011-10-03 14:44:49.029
*** MODULE NAME:(SQL*Plus) 2011-10-03 14:44:49.029
*** ACTION NAME:() 2011-10-03 14:44:49.029

=====================
PARSING IN CURSOR #44032292 len=70 dep=1 uid=260 oct=47 lid=260 tim=1317678288954581 hv=3588952050 ad='3ad591d8' sqlid='2qkj5qvayq3zk'
BEGIN dbms_monitor.session_trace_enable (NULL, NULL, TRUE, TRUE); END;
END OF STMT
EXEC #44032292:c=0,e=771,p=0,cr=0,cu=0,mis=1,r=1,dep=1,og=1,plh=0,tim=1317678288954565
CLOSE #44032292:c=0,e=4,dep=1,type=3,tim=1317678289029614
ILO_TASK.BEGIN_TASK[][module][action][oracle~pts/1~sqlplus@localhost.localdomain (TNS V1-V3)~SYS$USERS][]

*** 2011-10-03 14:44:49.030
*** MODULE NAME:(module) 2011-10-03 14:44:49.030
*** ACTION NAME:(action) 2011-10-03 14:44:49.030

=====================
PARSING IN CURSOR #67389968 len=51 dep=1 uid=260 oct=47 lid=260 tim=1317678289030201 hv=3782099507 ad='3ad8fad0' sqlid='0dk9kgvhqwhjm'
BEGIN DBMS_SESSION.SET_IDENTIFIER(:client_id); END;
END OF STMT
BINDS #67389968:
Bind#0
oacdty=01 mxl=2000(255) mxlc=00 mal=00 scl=00 pre=00
oacflg=13 fl2=206001 frm=01 csi=873 siz=2000 off=0
kxsbbbfp=03fdcd24  bln=2000  avl=64  flg=09
value="oracle~pts/1~sqlplus@localhost.localdomain (TNS V1-V3)~SYS$USERS"
*** CLIENT ID:(oracle~pts/1~sqlplus@localhost.localdomain (TNS V1-V3)~SYS$USERS) 2011-10-03 14:44:49.047

EXEC #67389968:c=0,e=17330,p=0,cr=0,cu=0,mis=0,r=1,dep=1,og=1,plh=0,tim=1317678289047514
CLOSE #67389968:c=0,e=5,dep=1,type=3,tim=1317678289047606

< Truncated >

[oracle@localhost trace]$

And finally calling the error level in an exception handler. Notice the backtrace information with the line number where the error occurred:


SQL> @dblog_error.sql
FATAL:DBAMON:51:1:03-OCT-11 02.28.30.231192 PM:SQL*Plus:
FATAL:DBAMON.DBLOG:Line 861:Error message
FATAL Oracle error code: 100
FATAL Oracle error message: ORA-01403: no data found
FATAL Backtrace: ORA-06512: at line 4

PL/SQL procedure successfully completed.

SQL> l
1  declare
2    l_num number;
3  begin
4    select 1 into l_num from dummy;
5  exception
6    when others then
7      dblog.error('Error message');
8* end;
SQL> @show_error

USERNAME DATETIME           level  OBJECT_NAME               LINE_NO  SQL_ERRNO SQL_ERRMSG                SQL_BACKTRACE
-------- ------------------ ------ ---------------------- ---------- ---------- ------------------------- -------------------------
DBAMON   30-SEP-11 01.15.10 ERROR  DBAMON.ANONYMOUS BLOCK          7        100 ORA-01403: no data found  ORA-06512: at line 4

SQL>

I hope this helps and I’ll post links to the presentations and the instrumentation paper shortly.

The LAG Function

June 23rd, 2011

The Oracle LAG function gives the value of the previous row within a table without the need for a self join. The LAG and LEAD functions have been available since Oracle 8.1.6 when analytic functions were first introduced. Tom Kyte dedicated an entire chapter to analytic functions in his first book, “Expert One-on-One Oracle” and they have obviously been part of the Oracle documentation as well. Their adoption has not been as quick though, and I will admit to being slow when it comes to converting code from using a self join and subtracting n and n+1 values rather than using LAG. In the open source SYSMON code many of the views used for interval summation used a function of the form:

  DECODE( SIGN(e.phyrds - b.phyrds),-1,0,
    (e.phyrds - b.phyrds) ),

This required a self join to get the begin and end values for each interval. This was done to insure that no negative values occurred when an instance restart occurred. The full SQL looked similar to this:

WITH g_fileio AS
(
  select
    fs.snap_id,
    fs.dbid,
    fs.instance_number,
    fs.tsname,
    fs.filename,
    fs.phyrds
  from
    dba_hist_filestatxs fs
)
SELECT
  b.begin_interval_time,
  e.begin_interval_time,
  e.snap_id,
  e.dbid,
  e.instance_number,
  e.tsname,
  e.filename,
  DECODE( SIGN(e.phyrds - b.phyrds),-1,0,
    (e.phyrds - b.phyrds) ),
  DECODE( SIGN(e.phyblkrd - b.phyblkrd),-1,0,
    (e.phyblkrd - b.phyblkrd) ),
  DECODE( SIGN(e.readtim - b.readtim),-1,0,
    (e.readtim - b.readtim) ),
  DECODE( SIGN(e.singleblkrds - b.singleblkrds),-1,0,
    (e.singleblkrds - b.singleblkrds) ),
  DECODE( SIGN(e.singleblkrdtim - b.singleblkrdtim),-1,0,
    (e.singleblkrdtim - b.singleblkrdtim) ),
  DECODE( SIGN(e.phywrts - b.phywrts),-1,0,
    (e.phywrts - b.phywrts) ),
  DECODE( SIGN(e.phyblkwrt - b.phyblkwrt),-1,0,
    (e.phyblkwrt - b.phyblkwrt) ),
  DECODE( SIGN(e.writetim - b.writetim),-1,0,
    (e.writetim - b.writetim) )
FROM
  g_fileio b,
  g_fileio e
WHERE
  e.snap_id = b.snap_id + 1
  AND b.tsname = e.tsname
  AND b.filename = e.filename
  AND b.dbid = e.dbid
  AND b.instance_number = e.instance_number

The following will compare the “old” way of using a self join with the LAG function.

LAG Function Usage

The following example will show how the LAG function can be used when accessing AWR tables in Oracle 11g. Initially just one “value” will be accessed to show how the LAG function works. The following SQL illustrates the example:

column filename format a50;
WITH g_fileio AS
(
  select
    fs.snap_id,
    fs.dbid,
    fs.instance_number,
    fs.tsname,
    fs.filename,
    fs.phyrds
  from
    dba_hist_filestatxs fs
)
SELECT
  e.snap_id,
  e.instance_number,
  e.filename,
  e.phyrds as curr_rds,
  LAG(e.phyrds,1,0) OVER(
    PARTITION BY e.dbid, e.instance_number, e.tsname, e.filename
    ORDER BY e.snap_id ) as prev_rds
FROM
  g_fileio e
WHERE
  e.snap_id between 83578 and 83590
  and e.tsname = 'DATA_TBS'
  and e.instance_number = 2
ORDER BY
  e.snap_id
/

This LAG function has been written in a manner that if the previous row is outside the scope of our result set it will return a zero, otherwise it will return the previous row’s value for phyrds. When this SQL is run we see that for the first interval that the “previous” value is 0. This makes sense since there is no previous row in our set, as constrained by “e.snap_id between 83578 and 83590″, and as we said previously our LAG function has been defined to return a zero in that case. The rest of the rows return the value of the previous row’s phyrds column. Notice that we’ve order by snap_id to make this easy to see in the example below:

   SNAP_ID INSTANCE_NUMBER FILENAME                                 CURRRDS    PREVRDS
---------- --------------- ------------------------------------- ---------- ----------
     83578               2 +ORADATA01/db11g/data_tbs_01.dbf           36477          0
     83579               2 +ORADATA01/db11g/data_tbs_01.dbf           36484      36477
     83580               2 +ORADATA01/db11g/data_tbs_01.dbf           36484      36484
     83581               2 +ORADATA01/db11g/data_tbs_01.dbf           36512      36484
     83582               2 +ORADATA01/db11g/data_tbs_01.dbf           36512      36512
     83583               2 +ORADATA01/db11g/data_tbs_01.dbf           36519      36512
     83584               2 +ORADATA01/db11g/data_tbs_01.dbf           36519      36519
     83585               2 +ORADATA01/db11g/data_tbs_01.dbf           36541      36519
     83586               2 +ORADATA01/db11g/data_tbs_01.dbf           36541      36541
     83587               2 +ORADATA01/db11g/data_tbs_01.dbf           36542      36541
     83588               2 +ORADATA01/db11g/data_tbs_01.dbf           36544      36542
     83589               2 +ORADATA01/db11g/data_tbs_01.dbf           36544      36544
     83590               2 +ORADATA01/db11g/data_tbs_01.dbf           36545      36544

This is an important observation that we will explore further in a subsequent section.

Efficiency

So an interesting question is how much better, if at all, is the LAG function over the original self join?

In order to make the “lag” example produce the same output as the “self-join” example the LAG function was moved into the “WITH” clause so that the row previous to the first row of the result set would be available. In our first example we didn’t do this to illustrate how the LAG function handles the situation where there is no previous row.

The “self-join” SQL:

column filename format a50;
SELECT
  e.snap_id,
  e.instance_number,
  e.filename,
  DECODE( SIGN(e.phyrds - b.phyrds),-1,0,
    (e.phyrds - b.phyrds) ) AS delta_phyrds
FROM
  dba_hist_filestatxs b,
  dba_hist_filestatxs e
WHERE
  e.snap_id = b.snap_id + 1
  AND b.filename = e.filename
  AND b.dbid = e.dbid
  AND b.instance_number = e.instance_number
  and e.snap_id between 83578 and 83590
  and e.tsname = 'DATA_TBS'
  and e.instance_number = 2
ORDER BY
  e.snap_id
/

The “lag” SQL:

column filename format a50;
WITH g_fileio AS
(
  select
    fs.snap_id,
    fs.dbid,
    fs.instance_number,
    fs.tsname,
    fs.filename,
    fs.phyrds - LAG(fs.phyrds,1,0)
    OVER( PARTITION BY fs.dbid, fs.instance_number, fs.tsname, fs.filename
      ORDER BY fs.snap_id ) as delta_phyrds
  from
    dba_hist_filestatxs fs
)
SELECT
  e.snap_id,
  e.instance_number,
  e.filename,
  e.delta_phyrds
FROM
  g_fileio e
WHERE
  e.snap_id BETWEEN 83578 AND 83590 AND
  e.tsname = 'DATA_TBS' AND
  e.instance_number = 2
ORDER BY
  e.snap_id
/

Using Tom Kyte’s “runstats” utility makes testing the difference between these two methods easy. The following will use a slightly modified version available here that summarizes the output into a “workload” format. This was used here because it shows just what is needed to see the difference between the two methods.

The test was run several times to try and filter out the affects of whether the data was cached or not. In fact, the “self-join” version is particularly sensitive to caching and can be expected to run much more slowly when the data is not if the buffer cache.

Test Script

exec runStats_pkg.rs_start;
column filename format a50;
SELECT
  e.snap_id,
  e.instance_number,
  e.filename,
  DECODE( SIGN(e.phyrds - b.phyrds),-1,0,
    (e.phyrds - b.phyrds) ) AS delta_phyrds
FROM
  dba_hist_filestatxs b,
  dba_hist_filestatxs e
WHERE
  e.snap_id = b.snap_id + 1
  AND b.filename = e.filename
  AND b.dbid = e.dbid
  AND b.instance_number = e.instance_number
  and e.snap_id between 83578 and 83590
  and e.tsname = 'CMSPUB_DATA_TBS'
  and e.instance_number = 2
ORDER BY
  e.snap_id
/

exec runStats_pkg.rs_middle;

column filename format a50;
WITH g_fileio AS
(
  select
    fs.snap_id,
    fs.dbid,
    fs.instance_number,
    fs.tsname,
    fs.filename,
    fs.phyrds - LAG(fs.phyrds,1,0) OVER( PARTITION BY fs.dbid, fs.instance_number, fs.tsname, fs.filename
      ORDER BY fs.snap_id ) as delta_phyrds
  from
    dba_hist_filestatxs fs
)
SELECT
  e.snap_id,
  e.instance_number,
  e.filename,
  e.delta_phyrds
FROM
  g_fileio e
WHERE
  e.snap_id between 83578 and 83590
  and e.tsname = 'CMSPUB_DATA_TBS'
  and e.instance_number = 2
ORDER BY
  e.snap_id
/

exec runStats_pkg.rs_stop(p_output=>'WORKLOAD');

Workload Results

The following are the results from three tests run one right after another. In each case the LAG is faster and does less work. In the third run a I reversed the order just to see if running either one first made any difference.

SQL> Run1 ran in 118 hsecs
Run2 ran in 11 hsecs
run 1 ran in 1072.73% of the time

Name                                            Run1        Run2        Diff
STAT...sorts (memory)                              2           3           1
STAT...physical reads                             29           0         -29
STAT...physical reads cache                       29           0         -29
STAT...CPU used by this session                   89          13         -76
STAT...Elapsed Time                              121          14        -107
STAT...redo size                               4,568       4,816         248
STAT...sorts (rows)                            5,860      11,707       5,847
STAT...session logical reads                 170,866      19,517    -151,349

Run1 latches total versus runs -- difference and pct
Run1        Run2        Diff       Pct
1,793,976     135,443  -1,658,533  1,324.52%

PL/SQL procedure successfully completed.

SQL>

SQL> Run1 ran in 85 hsecs
Run2 ran in 24 hsecs
run 1 ran in 354.17% of the time

Name                                            Run1        Run2        Diff
STAT...recursive cpu usage                         3           2          -1
STAT...sorts (memory)                              2           3           1
STAT...physical reads cache                        8           0          -8
STAT...physical reads                              8           0          -8
STAT...CPU used by this session                   75          25         -50
STAT...Elapsed Time                               88          28         -60
STAT...redo size                               5,952       5,108        -844
STAT...sorts (rows)                            5,861      11,709       5,848
STAT...session logical reads                 170,942      19,546    -151,396

Run1 latches total versus runs -- difference and pct
Run1        Run2        Diff       Pct
1,895,158     701,188  -1,193,970    270.28%

PL/SQL procedure successfully completed.

SQL>

SQL> Run1 ran in 24 hsecs
Run2 ran in 88 hsecs
run 1 ran in 27.27% of the time

Name                                            Run1        Run2        Diff
STAT...sorts (memory)                              3           2          -1
STAT...recursive cpu usage                         2           4           2
STAT...Elapsed Time                               28          90          62
STAT...CPU used by this session                   25          90          65
STAT...redo size                               4,288       5,908       1,620
STAT...sorts (rows)                           11,709       5,861      -5,848
STAT...session logical reads                  19,500     170,948     151,448

Run1 latches total versus runs -- difference and pct
Run1        Run2        Diff       Pct
510,514   2,151,923   1,641,409     23.72%

PL/SQL procedure successfully completed.

SQL>

Caveats

The biggest problem with using the LAG function is to insure that the row set that it operates on is complete. This means that the leading row prior to the first row used in the result set must be available. This usually means that an inline view or a with statement must be used to insure that the resulting query has a prior row to work with. The following example will try to illustrate this. In the first case the LAG function is calculated as part of a single query. The where clause restricts the data to a specific range of snap_ids and the resulting lag entry is incorrect for the first row:

select
  snap_id,
  wait_class,
  total_waits_fg AS waits_fg,
  total_waits_fg - LAG(total_waits_fg,1,0)
  OVER(PARTITION BY dbid, instance_number, event_id, wait_class_id
    ORDER BY snap_id) AS waits_fg_lag
from
  dba_hist_system_event
where
  snap_id between 28452 and 28462
  and wait_class = 'Commit'
  and instance_number = 1
order by
  snap_id
/
   SNAP_ID WAIT_CLASS                                         WAITS_FG WAITS_FG_LAG
---------- ------------------------------------------------ ---------- ------------
     28452 Commit                                              1241804      1241804
     28453 Commit                                              1244080         2276
     28454 Commit                                              1246347         2267
     28455 Commit                                              1248675         2328
     28456 Commit                                              1250995         2320
     28457 Commit                                              1253477         2482
     28458 Commit                                              1255821         2344
     28459 Commit                                              1258202         2381
     28460 Commit                                              1260383         2181
     28461 Commit                                              1262600         2217
     28462 Commit                                              1264982         2382

11 rows selected.

SQL>

The next example uses an inline view to allow the LAG function to have the row prior to the first row available to operate on:

compute sum of waits_fg_lag on wait_class;
break on report;
select *
from (
  select
    snap_id,
    wait_class,
    total_waits_fg AS waits_fg,
    total_waits_fg - LAG(total_waits_fg,1,0)
      OVER(PARTITION BY dbid, instance_number, event_id, wait_class_id
      ORDER BY snap_id) AS waits_fg_lag
from
  dba_hist_system_event
where
  instance_number = 1
  and wait_class = 'Commit'
)
where
  snap_id between (28452 + 1) and 28462
order by
  snap_id
/
   SNAP_ID WAIT_CLASS                                         WAITS_FG WAITS_FG_LAG
---------- ------------------------------------------------ ---------- ------------
     28453 Commit                                              1244080         2276
     28454 Commit                                              1246347         2267
     28455 Commit                                              1248675         2328
     28456 Commit                                              1250995         2320
     28457 Commit                                              1253477         2482
     28458 Commit                                              1255821         2344
     28459 Commit                                              1258202         2381
     28460 Commit                                              1260383         2181
     28461 Commit                                              1262600         2217
     28462 Commit                                              1264982         2382

10 rows selected.

SQL>

In this case the lag result for the first row is correct. Again, this is because the row prior to the first snap_id is available to the LAG function.

Conclusion

The LAG function is much more efficient in these tests than the self-join. The session logical reads average about 19,500 for the lag test and around 171,000 for the self-join test. The run time is always less for the lag test and the self-join test seems particularly sensitive to the benefits of caching when run continually. When initially run the time to run was significantly worse (see Full Runstats Test):

Run1 ran in 8296 hsecs

Run2 ran in 1913 hsecs

run 1 ran in 433.66% of the time

The only disadvantage is the more care must be taken to code the query in such a way that the row prior to the first row wanted is available to the LAG function.

DBFS_HS – Database Hierarchical Storage, Part 2

June 21st, 2011

In part 1 we set up a hierarchical store under the user hstest, created a table with SecureFile LOBs, and archived 30 SecureFiles. So what’s going on under the covers and how do we retrieve the data? Is the data even really out of the database? In this article I will try to answer these questions as we continue to explore our hierarchical store.

How it all works

At the simplest level the hierarchical store provides various APIs to accomplish a put, a get, and some informational queries. Intermixed in these functions are the ability to create “links” to archived data such that a query of an archived SecureFile LOB that is no longer in database storage will be retrieved without application intervention (i.e. application transparency). The mechanism of a put involves moving the SecureFile LOB to a cached staging area (i.e. the cache table) until it can be archived out to the store’s defined storage media. Once archived it may be eligible to be “flushed” from the cache table in the database and therefore no longer reside in database storage. Once SecureFile LOB’s are moved to a store cache, or really staging, table they are eventually written to the store’s storage media in batches or “tarballs”. The size of these tarballs is controlled by a property defined for the store in the DBFS_HS$_STOREPROPERTIES table. If a “get” is issued and the SecureFile LOB is in the cache table then it can be retrieved directly. If not then the SecureFile LOB must be recalled from the tarball and written back into the SecureFile LOB’s row in the cache table.

This raises a question about storage. Ideally you would expect to see the cache table steady state at some fairly constant size as the rate of archiving and retrieval evens out. But it seems pretty obvious that the cache table is a good candidate for a dedicated tablespace. Since its space usage is not necessarily predictable it would be nice to keep it separate from other objects so that it’s size could be capped and not allowed to grow uncontrollably. It’s not clear if the cache table could be partitioned, but this would probably only be practical if it was very large. The partitioning scheme would also need to be set in a way that new and old SecureFile LOBs were being read and written to from each partition in order to insure that space was reused. The cache table also seems to be a good candidate for a second tier of storage. If the source table(s) are on a primary storage tier then clearly the cache table is not going to need to meet the same performance criteria, otherwise why archive? Since the actual store will be on a media outside the database this raises the possibility of having at least three storage tiers.

Database Objects

Next let’s look at what objects have been created. The following SYS dictionary tables exist to support dbms_dbfs:

  • DBFS_HS$_BACKUPFILETABLE
  • DBFS_HS$_CONTENTFILEMAPTBL
  • DBFS_HS$_FS
  • DBFS_HS$_PROPERTY
  • DBFS_HS$_SFLOCATORTABLE
  • DBFS_HS$_STORECOMMANDS
  • DBFS_HS$_STOREID2POLICYCTX
  • DBFS_HS$_STOREIDTABLE
  • DBFS_HS$_STOREPROPERTIES
  • DBFS_SFS$_FS
  • DBFS_SFS$_FST
  • DBFS_SFS$_FSTO
  • DBFS_SFS$_FSTP
  • DBFS_SFS$_SNAP
  • DBFS_SFS$_TAB
  • DBFS_SFS$_VOL
  • DBFS$_MOUNTS
  • DBFS$_STATS
  • DBFS$_STORES

Additionally for each store created the following tables are created in the SYS schema:

  • HS$AT<dbname><seq>
  • HS$CT<dbname><seq>
  • HS$DB<dbname><seq>
  • HS$DT<dbname><seq>
  • HS$JT<dbname><seq>
  • HS$TT<dbname><seq>

When a store is created running the dbms_dbfs_hs.createStore procedure a table is created from the “tblname” input parameter. This table is used as a metadata table and stores the SecureFile LOB in the “filedata” column. A second table is also created with the name in the format of SFS$_FSTP_<seq>. This table holds more information about how the SecureFile LOB is archived and provides the gateway into the SYS metadata tables.

A view is also created with the name in the format of VSFS$_<seq> which exposes store based metadata about the archived SecureFile LOB.

In my database the following tables and view have been created:

Store View

The following view can be used to provide a mapping between the archived SecureFile LOB and it’s location in the store and cache. A separate view is required for each store due to the use of separate cache and metadata tables for each store.

create or replace force view hs_store_vas
select
  c.pathname AS path,
  c.item AS name,
  c.std_guid AS guid,
  sf2.propvalue AS file_size,
  dbms_lob.getlength(c.filedata) AS cache_size,
  sf1.propvalue AS contentid,
  l.tarballid,
  b.backupfilename,
  st.storeowner,
  st.storename
from
  hstest.hs_cache c,
  hstest.sfs$_fstp_12 sf1,
  hstest.sfs$_fstp_12 sf2,
  sys.dbfs_hs$_contentfnmaptbl m,
  sys.dbfs_hs$_sflocatortable l,
  sys.dbfs_hs$_backupfiletable b,
  sys.dbfs_hs$_storeidtable st
where
  c.std_guid = sf1.std_guid
  and c.pathtype = 1
  and sf1.propname = 'HS:CONTENTID'
  and sf1.propvalue = m.contentfilename(+)
  and sf1.std_guid = sf2.std_guid
  and sf2.propname = 'HS:FILESIZE'
  and m.archiverefid = l.archiverefid(+)
  and l.tarballid = b.tarballid(+)
  and b.storeid = st.storeid(+);

Usage Examples

Now that everything is setup let’s walk through some examples of how DBFS HS works. First let’s look at what we have. In part 1 of this series I archived 30 SecureFile LOBs and that created one “tarball” operating system file:

380:oracle@testvm:/ora_src/bkup [db11g]
$ ls -lrt
total 8212
-rw-r--r-- 1 oracle oinstall    2136 Jun  6 23:46 Oracle_Disk_SBT_Catalog
-rw-r--r-- 1 oracle oinstall 8388616 Jun  6 23:46 __DB11G_1_A5180F78E736D02AE040A8C057011B2C_1
381:oracle@testvm:/ora_src/bkup [db11g]

Now we’ll archive 15 more SecureFile LOBs, but we’ll just start at the beginning to show that if the SecureFile LOB has already been archived then re-archiving it does not invoke an error:

SQL> SET SERVEROUTPUT ON;
DECLARE
  v_mountpoint         VARCHAR2(32) := '/hs_store/';
  v_LOB                BLOB;
  v_LOBsize            NUMBER;
  v_debug              CHAR(1) := 'N';
BEGIN
  FOR arch_rec IN
  (
    SELECT hs.* FROM hs_lobtable hs
    where dbms_lob.getlength(hs.lobcol) != 0
    and rownum <= 45   )   LOOP     BEGIN       SELECT hs.lobcol       INTO v_LOB       FROM hs_lobtable hs       WHERE hs.id = arch_rec.id FOR UPDATE;       --       v_LOBsize := SYS.DBMS_LOB.GETLENGTH(v_LOB);       DBMS_OUTPUT.PUT_LINE('Archive ID: ' || arch_rec.id);       DBMS_OUTPUT.PUT_LINE('LOB size: ' || v_LOBsize);       --       -- Write the LOB to the store       --       DBMS_LOB.MOVE_TO_DBFS_LINK (         lob_loc => v_LOB,
        storage_path => v_mountpoint || arch_rec.id,
        flags => DBMS_LOB.DBFS_LINK_NOCACHE);
      --
      COMMIT;
    EXCEPTION
      WHEN OTHERS THEN
        ROLLBACK;
        DBMS_OUTPUT.PUT_LINE(DBMS_UTILITY.FORMAT_ERROR_BACKTRACE);
    END;
  END LOOP;
END;
/
SQL>   2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17   18   19   20   21   22   23   24   25

26   27   28   29   30   31   32   33   34   35   36   37   38   39

Archive ID: 1
LOB size: 16384
Archive ID: 2
LOB size: 16384
Archive ID: 3
LOB size: 16384
Archive ID: 4
LOB size: 16384
Archive ID: 5
LOB size: 16384
Archive ID: 6
LOB size: 16384
Archive ID: 7
LOB size: 16384
Archive ID: 8
LOB size: 16384
Archive ID: 9
LOB size: 16384
Archive ID: 10
LOB size: 16384
Archive ID: 11
LOB size: 16384
Archive ID: 12
LOB size: 16384
Archive ID: 13
LOB size: 16384
Archive ID: 14
LOB size: 16384
Archive ID: 15
LOB size: 16384
Archive ID: 16
LOB size: 16384
Archive ID: 17
LOB size: 16384
Archive ID: 18
LOB size: 16384
Archive ID: 19
LOB size: 16384
Archive ID: 20
LOB size: 16384
Archive ID: 21
LOB size: 16384
Archive ID: 22
LOB size: 16384
Archive ID: 23
LOB size: 16384
Archive ID: 24
LOB size: 16384
Archive ID: 25
LOB size: 16384
Archive ID: 26
LOB size: 16384
Archive ID: 27
LOB size: 16384
Archive ID: 28
LOB size: 16384
Archive ID: 29
LOB size: 16384
Archive ID: 30
LOB size: 16384
Archive ID: 31
LOB size: 16384
Archive ID: 32
LOB size: 16384
Archive ID: 33
LOB size: 16384
Archive ID: 34
LOB size: 16384
Archive ID: 35
LOB size: 16384
Archive ID: 36
LOB size: 16384
Archive ID: 37
LOB size: 16384
Archive ID: 38
LOB size: 16384
Archive ID: 39
LOB size: 16384
Archive ID: 40
LOB size: 16384
Archive ID: 41
LOB size: 16384
Archive ID: 42
LOB size: 16384
Archive ID: 43
LOB size: 16384
Archive ID: 44
LOB size: 16384
Archive ID: 45
LOB size: 16384

PL/SQL procedure successfully completed.

SQL>

Now let’s run a store push and see if we’ve create another archive file:

SQL> DECLARE
  store_name varchar2(32) := 'HS_STORE';
BEGIN
  dbms_dbfs_hs.storePush(store_name) ;
  commit ;
END;
/  2    3    4    5    6    7

PL/SQL procedure successfully completed.

SQL> !ls -lrt
total 16420
-rw-r--r-- 1 oracle oinstall 8388616 Jun  6 23:46 __DB11G_1_A5180F78E736D02AE040A8C057011B2C_1
-rw-r--r-- 1 oracle oinstall    3204 Jun 20 09:17 Oracle_Disk_SBT_Catalog
-rw-r--r-- 1 oracle oinstall 8388616 Jun 20 09:17 __DB11G_2_A6248CC964C16A45E040A8C0570111A4_2

SQL>

OK, we’ve created a new archive file in the OS. Let’s look at what our metadata view says about our store and the cache for the store:

SQL> select name, cache_size, backupfilename from hs_store_v;

NAME            CACHE_SIZE BACKUPFILENAME
--------------- ---------- ------------------------------------------------------------
LOB1                     0 __DB11G_1_A5180F78E736D02AE040A8C057011B2C_1
LOB10                    0 __DB11G_1_A5180F78E736D02AE040A8C057011B2C_1
LOB11                    0 __DB11G_1_A5180F78E736D02AE040A8C057011B2C_1
LOB12                    0 __DB11G_1_A5180F78E736D02AE040A8C057011B2C_1
LOB13                    0 __DB11G_1_A5180F78E736D02AE040A8C057011B2C_1
LOB14                    0 __DB11G_1_A5180F78E736D02AE040A8C057011B2C_1
LOB15                    0 __DB11G_1_A5180F78E736D02AE040A8C057011B2C_1
LOB16                    0 __DB11G_1_A5180F78E736D02AE040A8C057011B2C_1
LOB17                    0 __DB11G_1_A5180F78E736D02AE040A8C057011B2C_1
LOB18                    0 __DB11G_1_A5180F78E736D02AE040A8C057011B2C_1
LOB19                    0 __DB11G_1_A5180F78E736D02AE040A8C057011B2C_1
LOB2                     0 __DB11G_1_A5180F78E736D02AE040A8C057011B2C_1
LOB20                    0 __DB11G_1_A5180F78E736D02AE040A8C057011B2C_1
LOB21                    0 __DB11G_1_A5180F78E736D02AE040A8C057011B2C_1
LOB22                    0 __DB11G_1_A5180F78E736D02AE040A8C057011B2C_1
LOB23                    0 __DB11G_1_A5180F78E736D02AE040A8C057011B2C_1
LOB24                    0 __DB11G_1_A5180F78E736D02AE040A8C057011B2C_1
LOB25                    0 __DB11G_1_A5180F78E736D02AE040A8C057011B2C_1
LOB26                    0 __DB11G_1_A5180F78E736D02AE040A8C057011B2C_1
LOB27                    0 __DB11G_1_A5180F78E736D02AE040A8C057011B2C_1
LOB28                    0 __DB11G_1_A5180F78E736D02AE040A8C057011B2C_1
LOB29                    0 __DB11G_1_A5180F78E736D02AE040A8C057011B2C_1
LOB3                     0 __DB11G_1_A5180F78E736D02AE040A8C057011B2C_1
LOB30                    0 __DB11G_1_A5180F78E736D02AE040A8C057011B2C_1
LOB4                     0 __DB11G_1_A5180F78E736D02AE040A8C057011B2C_1
LOB5                     0 __DB11G_1_A5180F78E736D02AE040A8C057011B2C_1
LOB6                     0 __DB11G_1_A5180F78E736D02AE040A8C057011B2C_1
LOB7                     0 __DB11G_1_A5180F78E736D02AE040A8C057011B2C_1
LOB8                     0 __DB11G_1_A5180F78E736D02AE040A8C057011B2C_1
LOB9                     0 __DB11G_1_A5180F78E736D02AE040A8C057011B2C_1
31                       0 __DB11G_2_A6248CC964C16A45E040A8C0570111A4_2
32                       0 __DB11G_2_A6248CC964C16A45E040A8C0570111A4_2
33                       0 __DB11G_2_A6248CC964C16A45E040A8C0570111A4_2
34                       0 __DB11G_2_A6248CC964C16A45E040A8C0570111A4_2
35                       0 __DB11G_2_A6248CC964C16A45E040A8C0570111A4_2
36                       0 __DB11G_2_A6248CC964C16A45E040A8C0570111A4_2
37                       0 __DB11G_2_A6248CC964C16A45E040A8C0570111A4_2
38                       0 __DB11G_2_A6248CC964C16A45E040A8C0570111A4_2
39                       0 __DB11G_2_A6248CC964C16A45E040A8C0570111A4_2
40                       0 __DB11G_2_A6248CC964C16A45E040A8C0570111A4_2
41                       0 __DB11G_2_A6248CC964C16A45E040A8C0570111A4_2
42                       0 __DB11G_2_A6248CC964C16A45E040A8C0570111A4_2
43                       0 __DB11G_2_A6248CC964C16A45E040A8C0570111A4_2
44                       0 __DB11G_2_A6248CC964C16A45E040A8C0570111A4_2
45                       0 __DB11G_2_A6248CC964C16A45E040A8C0570111A4_2

45 rows selected.

SQL>

It looks like all 45 of our SecureFile LOBs have been archived to a file and are not currently in the cache. Now let’s see if we can access one of the SecureFile LOBs:

SQL> SET SERVEROUTPUT ON;
DECLARE
  l_LOB                BLOB;
  l_buffer             RAW(4000);
  l_ctr                NUMBER := 1;
  l_amount             NUMBER := 100;
  l_id                 NUMBER := 45;
BEGIN
  SELECT hs.lobcol
  INTO l_LOB
  FROM hs_lobtable hs
  WHERE hs.id = l_id;
  --
  dbms_lob.read(l_LOB, l_amount, l_ctr, l_buffer);
  dbms_output.put_line( RAWTOHEX(l_buffer) );
END;
/SQL>   2    3    4    5    6    7    8    9   10   11   12   13   14   15   16
45506C45676E4761545861514265575641797571534A77687A4F79634D5A6E765771484867736C456B634C6845587466566377747A7A4745646A6774
4E7456704E69596857485A6B6146585767704D567275564D5A74417A616F41746C4874784B6A5061

PL/SQL procedure successfully completed.

SQL>

We were able to read the first 100 bytes of the SecureFile LOB with name or id of ’45′. If we query our metadata we now see:

SQL> select name, cache_size, backupfilename from hs_store_v where name = '45';

NAME            CACHE_SIZE BACKUPFILENAME
--------------- ---------- ------------------------------------------------------------
45                   16384 __DB11G_2_A6248CC964C16A45E040A8C0570111A4_2

SQL>

And now the cache_size shows the size of the SecureFile LOB. For those doubters out there I’ll rename the backup file and we’ll try accessing another SecureFile LOB:

SQL> !ls -lrt
total 16420
-rw-r--r-- 1 oracle oinstall 8388616 Jun  6 23:46 __DB11G_1_A5180F78E736D02AE040A8C057011B2C_1
-rw-r--r-- 1 oracle oinstall    3204 Jun 20 09:17 Oracle_Disk_SBT_Catalog
-rw-r--r-- 1 oracle oinstall 8388616 Jun 20 09:17 __DB11G_2_A6248CC964C16A45E040A8C0570111A4_2

SQL> !mv __DB11G_2_A6248CC964C16A45E040A8C0570111A4_2 __DB11G_2_A6248CC964C16A45E040A8C0570111A4_2.BAK

SQL> DECLARE
  l_LOB                BLOB;
  l_buffer             RAW(4000);
  l_ctr                NUMBER := 1;
  l_amount             NUMBER := 100;
  l_id                 NUMBER := 44;
BEGIN
  SELECT hs.lobcol
  INTO l_LOB
  FROM hs_lobtable hs
  WHERE hs.id = l_id;
  --
  dbms_lob.read(l_LOB, l_amount, l_ctr, l_buffer);
  dbms_output.put_line( RAWTOHEX(l_buffer) );
END;
/  2    3    4    5    6    7    8    9   10   11   12   13   14   15   16
DECLARE
*
ERROR at line 1:
ORA-19507: failed to retrieve sequential file, handle="__DB11G_2_A6248CC964C16A45E040A8C0570111A4_2", parms=""
ORA-27029: skgfrtrv: sbtrestore returned error
ORA-19511: Error received from media manager layer, error text:
sbtpvt_open_input: file /ora_src/bkup/__DB11G_2_A6248CC964C16A45E040A8C0570111A4_2 does not exist or cannot be
accessed, errno = 2
ORA-06512: at "SYS.DBMS_APBACKEND", line 60
ORA-06512: at "SYS.DBMS_APBACKEND", line 1378
ORA-06512: at "SYS.DBMS_DBFS_HS", line 1042
ORA-06512: at "SYS.DBMS_DBFS_HS", line 1203
ORA-06512: at "SYS.DBMS_DBFS_HS", line 1481
ORA-06512: at "SYS.DBMS_DBFS_HS", line 1592
ORA-06512: at line 1
ORA-06512: at "SYS.DBMS_DBFS_CONTENT", line 3108
ORA-06512: at "SYS.DBMS_LOB_AM_PRIVATE", line 65
ORA-06512: at line 1
ORA-06512: at "SYS.DBMS_LOB", line 1056
ORA-06512: at line 13

The access failed because the archive file could not be found. This proves that the database is capable of accessing secondary storage to transparently retrieve the SecureFile LOB. Now we’ll rename the file back and see if we can successfully retrieve the SecureFile LOB:

SQL> !mv __DB11G_2_A6248CC964C16A45E040A8C0570111A4_2.BAK __DB11G_2_A6248CC964C16A45E040A8C0570111A4_2

SQL> DECLARE
  l_LOB                BLOB;
  l_buffer             RAW(4000);
  l_ctr                NUMBER := 1;
  l_amount             NUMBER := 100;
  l_id                 NUMBER := 44;
BEGIN
  SELECT hs.lobcol
  INTO l_LOB
  FROM hs_lobtable hs
  WHERE hs.id = l_id;
  --
  dbms_lob.read(l_LOB, l_amount, l_ctr, l_buffer);
  dbms_output.put_line( RAWTOHEX(l_buffer) );
END;
/  2    3    4    5    6    7    8    9   10   11   12   13   14   15   16
6F4A6A796978784F576F54524261634979444256535747644772655952614D4E6646735657675545566B48577762596C43707162426F7975414F5841
594B6242754D6E5948784E546E49724F78664F4968564154504C4966555156757277475077757544

PL/SQL procedure successfully completed.

SQL> select name, cache_size, backupfilename from hs_store_v where name = '44';

NAME            CACHE_SIZE BACKUPFILENAME
--------------- ---------- ------------------------------------------------------------
44                   16384 __DB11G_2_A6248CC964C16A45E040A8C0570111A4_2

SQL>

And we see that we can indeed access the SecureFile LOB and that it has been retrieved back into the cache.

To wrap this up Oracle has made available a feature that can be used to move closer to an ILM environment. One that can transparently support multiple tiers of storage for SecureFile LOBs.

DBFS_HS – Database Hierarchical Storage

June 7th, 2011

Oracle delivered the database file system in 11gR2, but it also slipped in hierarchical storage as part of the feature. In 11.2.0.2 most of the bugs seem to have been worked out. The DBFS_HS package along with the DBMS_LOB package, and SecureFile LOBs delivers application transparent hierarchical storage. A sort of ILM if you will. This has been a while in coming but it finally seems to really work. I’ll walk through an example, and it’s sort of documented (not all quite there yet) in the Oracle documentation. So why is this important? Well there are some customers that have chosen to store a lot of unstructured data in SecureFiles and don’t want to keep it all in premium storage. DBFS_HS provides a mechanism to move that data to different storage tiers. Initially Oracle is supporting RMAN based archiving and Amazon S3.

The following will show the setup on a Linux VM running Oracle 11.2.0.2 using RMAN with disk based SBT (see the post RMAN Tape Simulation). In future posts I’ll explore how it works under the covers and what you can do with it.

I found the examples in the SecureFiles and Large Object’s Developer Guide to be a little cryptic so hopefully this example will do better.

connect / as sysdba

create tablespace hs_tab
datafile
 '/ora_src/oradata/db11g/hs_tab01.dbf'
 size 10m autoextend on next 10m maxsize 2000m;

create user hstest identified by hstest
default tablespace hs_tab
temporary tablespace temp
quota unlimited on hs_tab;

grant create session to hstest;
grant alter session to hstest;
--
grant dbfs_role to hstest;
grant scheduler_admin to hstest;
--
grant create procedure to hstest;
grant create sequence to hstest;
grant create table to hstest;
grant create trigger to hstest;
grant create view to hstest;
--
grant select any dictionary to hstest;
grant select any table to hstest;
--
-- Run the following:
--
connect hstest/hstest
--
declare
 storename varchar2(32) ;
 tblname varchar2(30) ;
 tbsname varchar2(30) ;
 lob_cache_quota number := 0.8 ;
 cachesz number ;
 ots number ;
begin
 cachesz := (100 * 1048576);  -- Minimum size? 3 MB or less errors
 ots := (10 * 1048576);
 storename := 'HS_STORE' ;
 tblname := 'HS_CACHE' ;
 tbsname := 'HS_TAB' ; -- Substitute a valid tablespace name
 --
 -- Create the store.
 -- Here tbsname is the tablespace used for the store,
 -- tblname is the table holding all the store entities,
 -- cachesz is the space used by the store to cache content
 --   in the tablespace,
 -- lob_cache_quota is the fraction of cachesz allocated
 --   to level-1 cache and
 -- ots is minimum amount of content that will be accumulated
 --   in level-2 cache before being stored in AmazonS3
 --
 dbms_dbfs_hs.createStore(
 storename,
 dbms_dbfs_hs.STORETYPE_TAPE,
 tblname, tbsname, cachesz,
 lob_cache_quota, ots) ;
 --
 dbms_dbfs_hs.setstoreproperty(
 storename,
 dbms_dbfs_hs.PROPNAME_SBTLIBRARY,
 'oracle.disksbt, ENV=(BACKUP_DIR=/ora_src/bkup)' );
 --
 dbms_dbfs_hs.setstoreproperty(
 storename,
 dbms_dbfs_hs.PROPNAME_MEDIAPOOL,
 '0') ;  -- Substitute valid value
 --
 dbms_dbfs_hs.setstoreproperty(
 storename,
 dbms_dbfs_hs.PROPNAME_COMPRESSLEVEL,
 'NONE') ;
 --
 -- Insure that the store is streamable. According to Oracle
 -- this must be set to allow links to work correctly.
 --
 dbms_dbfs_hs.setstoreproperty(
 storename,
 dbms_dbfs_hs.PROPNAME_STREAMABLE,
 'TRUE') ;
 --
 -- Please refer to DBMS_DBFS_CONTENT documentation
 -- for details about this method
 --
 dbms_dbfs_content.registerstore(
 storename,
 'HS',
 'dbms_dbfs_hs') ;
 --
 -- Please refer to DBMS_DBFS_CONTENT documentation
 -- for details about this method
 --
 dbms_dbfs_content.mountstore(storename, 'hs_store') ;
 --
 commit;
end ;
/

That will get you a DBFS store that will archive to the backup directory /ora_src/bkup. You may want to change that to fit your system, but these examples use that directory. The next set of statements have to be run because the create store commands don’t properly insert the backup directory since we’re cheating and using the disk sbt for RMAN.

connect / as sysdba

INSERT INTO SYS.DBFS_HS$_STORECOMMANDS
SELECT STOREID, '''ENV=(BACKUP_DIR=/ora_src/bkup)''', 1
FROM SYS.DBFS_HS$_STOREIDTABLE WHERE STORENAME = 'HS_STORE';

INSERT INTO SYS.DBFS_HS$_STORECOMMANDS
SELECT STOREID, '''ENV=(BACKUP_DIR=/ora_src/bkup)''', 2
FROM SYS.DBFS_HS$_STOREIDTABLE WHERE STORENAME = 'HS_STORE';

COMMIT;

Now we’ll create a directory so we can use the store:

connect hstest/hstest
--
declare
 v_properties  dbms_dbfs_content_properties_t;
begin
 v_properties := NULL;
 dbms_dbfs_content.createDirectory(
 path => '/hs_store',
 properties => v_properties,
 store_name => 'HS_STORE' ) ;
 --
 commit;
end;
/

So we’ve got a hierarchical store defined for the user hstest and it will archive to the operating system directory /ora_src/bkup. Now we need a table with SecureFile LOB’s to test it out.

  CREATE TABLE HS_LOBTABLE
   (	"ID" NUMBER,
	"LASTACCESSDATE" DATE,
	"CONTENTTYPE" NUMBER,
	"LOBCOL" BLOB,
	 PRIMARY KEY ("ID")
  USING INDEX COMPUTE STATISTICS
  TABLESPACE "HS_TAB"  ENABLE
   ) SEGMENT CREATION IMMEDIATE
  NOCOMPRESS LOGGING
  TABLESPACE "HS_TAB"
 LOB ("LOBCOL") STORE AS SECUREFILE (
  TABLESPACE "HS_TAB" ENABLE STORAGE IN ROW CHUNK 32768 RETENTION AUTO
  CACHE LOGGING  NOCOMPRESS  KEEP_DUPLICATES
  ) ;
--
-- Add sequence
--
CREATE SEQUENCE hs_seq;

-
-- Load 100 16K LOBs
--
set serveroutput on;
declare
  v_buffer             RAW(16384);
  v_tmpLOB             BLOB;
  v_amount             NUMBER := 16384;
  --
  v_lobnum             NUMBER := 100;
begin
  FOR lobctr IN 1..v_lobnum LOOP
    DBMS_LOB.CREATETEMPORARY(
      lob_loc => v_tmpLOB,
      cache => TRUE,
      dur => DBMS_LOB.CALL);
    --
    DBMS_LOB.OPEN(
      lob_loc => v_tmpLOB,
      open_mode => DBMS_LOB.LOB_READWRITE);
    --
    v_buffer := NULL;
    --
    FOR i IN 1..v_amount LOOP
      v_buffer := v_buffer || utl_raw.cast_to_raw(dbms_random.string('a',1));
    END LOOP;
    --
    dbms_lob.write(v_tmpLOB, v_amount, 1, v_buffer);
    --
    dbms_output.put_line('New LOB length: ' || dbms_lob.getlength(v_tmpLOB));
    --
    INSERT INTO HS_LOBTABLE
    VALUES(hs_seq.nextval, SYSDATE, 1, v_tmpLOB);
    --
    COMMIT;
    DBMS_LOB.CLOSE(lob_loc => v_tmpLOB);
    --
  END LOOP;
END;
/

If you’ve made it this far then all that’s left is to archive some of the SecureFile LOBs. When I set this up I only archived 30 SecureFile LOBs so I had to run a “store push” to force the database to archive the “cached” LOBs to the /ora_src/bkup directory. I’ll spend some more time on what’s going on under the covers in Part 2.

SET SERVEROUTPUT ON;
DECLARE
 v_mountpoint         VARCHAR2(32) := '/hs_store/';
 --
 v_store              VARCHAR2(32) := 'HS_STORE';
 v_LOB                BLOB;
 v_LOBsize            NUMBER;
 --
 prop1                dbms_dbfs_content_properties_t ;
 newcontent           BLOB;
BEGIN
 FOR arch_rec IN
 (
 SELECT id FROM hs_lobtable
 WHERE rownum <= 30
 )
 LOOP
 BEGIN
 SELECT lobcol
 INTO v_LOB
 FROM hs_lobtable
 WHERE id = arch_rec.id FOR UPDATE;
 --
 v_LOBsize := SYS.DBMS_LOB.GETLENGTH(v_LOB);
 DBMS_OUTPUT.PUT_LINE('Archive id: ' || arch_rec.id);
 DBMS_OUTPUT.PUT_LINE('LOB size: ' || v_LOBsize);
 --
 -- Write the LOB to the store
 --
 DBMS_LOB.MOVE_TO_DBFS_LINK (
 lob_loc => v_LOB,
 storage_path => v_mountpoint || 'LOB' || arch_rec.id,
 flags => DBMS_LOB.DBFS_LINK_NOCACHE);
 --
 COMMIT;
 EXCEPTION
 WHEN OTHERS THEN
 ROLLBACK;
 DBMS_OUTPUT.PUT_LINE(DBMS_UTILITY.FORMAT_ERROR_BACKTRACE);
 RAISE;
 END;
 END LOOP;
END;
/
--
-- This will archive LOBs in the "stage table" to the store. It will leave the LOB
-- in the staging table unless a flush is run.
--
--
DECLARE
 store_name varchar2(32) := 'HS_STORE';
BEGIN
 dbms_dbfs_hs.storePush(store_name) ;
 commit ;
END;
/

ORACLE_UNQNAME

June 6th, 2011

New in 11.2 is the environment variable ORACLE_UNQNAME and it is used with the OEM dbconsole. The following are some tests on an 11.2.0.2 Linux system:

$ echo $ORACLE_SID
db11g
314:oracle@testvm:/home/oracle [db11g]
$ echo $ORACLE_UNQNAME
db11g
315:oracle@testvm:/home/oracle [db11g]
$ emctl status dbconsole
Oracle Enterprise Manager 11g Database Control Release 11.2.0.2.0
Copyright (c) 1996, 2010 Oracle Corporation.  All rights reserved.

https://testvm:1158/em/console/aboutApplication

Oracle Enterprise Manager 11g is running.
------------------------------------------------------------------
Logs are generated in directory /app/local/oracle/product/11.2.0.2/db_1/testvm_db11g/sysman/log
316:oracle@testvm:/home/oracle [db11g]
$ unset ORACLE_SID
317:oracle@testvm:/home/oracle []
$ emctl status dbconsole
Oracle Enterprise Manager 11g Database Control Release 11.2.0.2.0
Copyright (c) 1996, 2010 Oracle Corporation.  All rights reserved.

https://testvm:1158/em/console/aboutApplication

Oracle Enterprise Manager 11g is running.
------------------------------------------------------------------
Logs are generated in directory /app/local/oracle/product/11.2.0.2/db_1/testvm_db11g/sysman/log
318:oracle@testvm:/home/oracle []
$ unset ORACLE_UNQNAME
319:oracle@testvm:/home/oracle []
$ emctl status dbconsole
Environment variable ORACLE_UNQNAME not defined. Please set ORACLE_UNQNAME to database unique name.
320:oracle@testvm:/home/oracle []
$ export ORACLE_SID=db11g
321:oracle@testvm:/home/oracle [db11g]
$ emctl status dbconsole
Oracle Enterprise Manager 11g Database Control Release 11.2.0.2.0
Copyright (c) 1996, 2010 Oracle Corporation.  All rights reserved.

https://testvm:1158/em/console/aboutApplication

Oracle Enterprise Manager 11g is running.
------------------------------------------------------------------
Logs are generated in directory /app/local/oracle/product/11.2.0.2/db_1/testvm_db11g/sysman/log
322:oracle@testvm:/home/oracle [db11g]
$

However, when working on an 11.2.0.2 DataGuard database with db_unique_name set I had to have ORACLE_UNQNAME set as well. Also note that in the message from the dbconsole that the OEM directory is $ORACLE_HOME/testvm_db11g which is the hostname and SID, but when using db_unique_name this won’t necessarily be the same. Hence I suspect, and will need to run some more tests to prove, that this was added to handle the situation where the SID and the unique name are different.

Do Date Arithmetic on Dates

June 3rd, 2011

I was reminded of this when the following SQL failed on June 1st:

(EXTRACT(DAY FROM s.end_interval_time)*86400+
 EXTRACT(HOUR FROM s.end_interval_time)*3600+
 EXTRACT(MINUTE FROM s.end_interval_time)*60+
 EXTRACT(SECOND FROM s.end_interval_time)) -
(EXTRACT(DAY FROM s.begin_interval_time)*86400+
 EXTRACT(HOUR FROM s.begin_interval_time)*3600+
 EXTRACT(MINUTE FROM s.begin_interval_time)*60+
 EXTRACT(SECOND FROM s.begin_interval_time)) AS etim

which resulted in the following:

                             Available        Service                  Estimated
                Elapsed        Service           Time                   Workload
 Inst ID      Time(Sec)      Time(Sec)   Used(DB CPU) Wait Time(Sec) Utilization
-------- -------------- -------------- -------------- -------------- -----------
       1     -2,636,999    -21,095,993          7,189    -21,103,182        2.08

A little closer inspection shows that there’s a problem with the arithmetic when the ending interval day’s date is less than the beginning interval’s day like when going from 5/31 to 6/1:

   SNAP_ID BIT                 EIT                       ETIM
---------- ------------------- ------------------- ----------
     49971 05/31/2011 22:30:46 05/31/2011 23:00:51   1805.029
     49972 05/31/2011 23:00:51 05/31/2011 23:30:56   1805.273
     49973 05/31/2011 23:30:56 06/01/2011 00:00:01 -2676655.1
     49974 06/01/2011 00:00:01 06/01/2011 00:30:07    1806.01
     49975 06/01/2011 00:30:07 06/01/2011 01:00:12   1805.091
     49976 06/01/2011 01:00:12 06/01/2011 01:30:17   1805.093
SQL>

The problem can be fixed with the following:

(EXTRACT(DAY FROM (s.end_interval_time-s.begin_interval_time) )*86400+
 EXTRACT(HOUR FROM (s.end_interval_time-s.begin_interval_time) )*3600+
 EXTRACT(MINUTE FROM (s.end_interval_time-s.begin_interval_time) )*60+
 EXTRACT(SECOND FROM (s.end_interval_time-s.begin_interval_time) ))

So, in this case with the corrected code we get:

                             Available        Service                  Estimated
                Elapsed        Service           Time                   Workload
 Inst ID      Time(Sec)      Time(Sec)   Used(DB CPU) Wait Time(Sec) Utilization
-------- -------------- -------------- -------------- -------------- -----------
       1         41,401        331,207          7,189        324,018        2.17

The corrected SQL follows:

SELECT
  s.instance_number AS inst_id,
  SUM(
    (EXTRACT(DAY FROM (s.end_interval_time-s.begin_interval_time) )*86400+
     EXTRACT(HOUR FROM (s.end_interval_time-s.begin_interval_time) )*3600+
     EXTRACT(MINUTE FROM (s.end_interval_time-s.begin_interval_time) )*60+
     EXTRACT(SECOND FROM (s.end_interval_time-s.begin_interval_time) ))
  ) AS etim,
  SUM(
    (
      (EXTRACT(DAY FROM (s.end_interval_time-s.begin_interval_time) )*86400+
       EXTRACT(HOUR FROM (s.end_interval_time-s.begin_interval_time) )*3600+
       EXTRACT(MINUTE FROM (s.end_interval_time-s.begin_interval_time) )*60+
       EXTRACT(SECOND FROM (s.end_interval_time-s.begin_interval_time) ))
    )*s.osval
  ) AS cputim,
  SUM(s.dbval)/1000000 AS dbcpu,
  SUM(
    ((
      (EXTRACT(DAY FROM (s.end_interval_time-s.begin_interval_time) )*86400+
       EXTRACT(HOUR FROM (s.end_interval_time-s.begin_interval_time) )*3600+
       EXTRACT(MINUTE FROM (s.end_interval_time-s.begin_interval_time) )*60+
       EXTRACT(SECOND FROM (s.end_interval_time-s.begin_interval_time) ))
    )*s.osval) - (s.dbval/1000000)
  ) AS wtim,
  AVG(
    (
      (s.dbval/1000000) /
      (
        (
          (EXTRACT(DAY FROM (s.end_interval_time-s.begin_interval_time) )*86400+
           EXTRACT(HOUR FROM (s.end_interval_time-s.begin_interval_time) )*3600+
           EXTRACT(MINUTE FROM (s.end_interval_time-s.begin_interval_time) )*60+
           EXTRACT(SECOND FROM (s.end_interval_time-s.begin_interval_time) ))
        )*s.osval
      )
    )*100
  ) wkld
FROM (
  SELECT
    ss.snap_id,
    ss.instance_number,
    ss.begin_interval_time,
    ss.end_interval_time,
    os.value AS osval,
    DECODE(SIGN( tim.value - LAG(tim.value,1,0)
      OVER(PARTITION BY tim.dbid, tim.instance_number, tim.stat_name
      ORDER BY tim.snap_id)),
      -1,0,
      tim.value - LAG(tim.value,1,0)
        OVER(PARTITION BY tim.dbid, tim.instance_number, tim.stat_name
        ORDER BY tim.snap_id) ) AS dbval
  FROM
    dba_hist_snapshot ss,
    dba_hist_osstat os,
    dba_hist_sys_time_model tim
  WHERE
    ss.snap_id = os.snap_id
    and ss.dbid = os.dbid
    and ss.instance_number = os.instance_number
    and os.stat_name = 'NUM_CPUS'
    --
    and ss.snap_id = tim.snap_id
    and ss.dbid = tim.dbid
    and ss.instance_number = tim.instance_number
    and tim.stat_name = 'DB CPU'
) s
WHERE
  s.snap_id BETWEEN (&&begsnap + 1) AND &&endsnap
GROUP BY
  s.instance_number
ORDER BY
  s.instance_number;

ILO Customizations

March 4th, 2011

I’ve been using the Instrumentation Library for Oracle for quite a while now, but I will confess that all I use it for is to set module and action. The fact that the ILO allows task nesting has always been it’s biggest appeal for me. Recently though I’ve been enhancing my PL/SQL logging utility and needed to be able to turn tracing on at the module/action level and come up with a more elegant way to time tasks.

In the ILO documentation it says that the ILO can do both of these things and that the ILO_TIMER package is supplied to do just that. The problem is that the comments in the code are kind of cryptic, at least to me, and there really aren’t any good examples of how to implement either one. Well I finally had enough motivation to actually read the code and figure out what was going on. It turns out to be kind of slick, although it’s heavily dependent on outside code, my logging utility in this case. In the ILO_TIMER package three functions/procedures need to be modified. I modified them with conditional compilation so that I don’t have to maintain more than one version, even if I’m not using my logging package.

Let’s tackle tracing first. The GET_CONFIG procedure is where the ILO provides a “callout” to turn tracing on for a module and action pair. In particular, the ILO_TASK.begin_task procedure calls the ILO_TIMER.get_config procedure to check to see if the boolean “p_trace” is true. If it is it calls ILO_SYSUTIL.turn_trace_on to turn tracing on with the appropriate database call for the version being used. The specific section in the ILO_TIMER package looks like this:


 IF g_all_interesting AND g_ignore_schedule THEN
   p_trace := TRUE;
   P_walltime := TRUE;
   p_rtime := TRUE;
   --If the Module and Action are "interesting"  ...
   --ELSIF call your custom method here THEN
   --
   -- If the Module and Action DON'T exist in the schedule ...
 ELSE
   IF g_all_interesting THEN
     p_trace := TRUE;
     P_walltime := TRUE;
     p_rtime := TRUE;
   ELSE
     p_trace := FALSE;
     P_walltime := FALSE;
     p_rtime := FALSE;
   END IF;
 END IF;

Now since I wanted to use conditional compilation I left this code alone and added my code right after:


 --
 -- If using dblog check to see if tracing should be enabled
 --
 $IF $$dblog $THEN
   --
   -- If dblog exists then always set run time data if it exists.
   --
   p_rtime := TRUE;
   --
   IF dblog.Is_Trace_Enabled(p_module_name => p_module,
     p_action_name => p_action ) = 'Y' THEN
     --
     p_trace := TRUE;
     P_walltime := TRUE;
   END IF;
 $END

Ignore the fact that I moved the “p_rtime” parameter to always be set if using the dblog package. We’ll talk about that in the section on task timing. With this change the ILO_TASK.begin_task call will check to see if tracing is enabled from my logging procedure and if it is, the p_trace boolean will be set to true and the ILO will turn tracing on. Very slick and easy.

Now let’s tackle task timing. The ILO_TIMER package has two calls, a “begin_timed_task” function call and a “end_timed_task” procedure call. The begin_timed_task function simply returns a null and the end_timed_task procedure is just a null statement. It is left up to us to figure out what to do with them. One helpful hint is that these function/procedures are called when tracing (e.g. p_trace) or response time (e.g. p_rtime) booleans have been set (although there are other ways as well). This is why I moved the p_rtime boolean setting up above to always be set when using my logging procedure. The other helpful hint is that it only makes sense when the begin_time/end_time parameters are passed to the begin_task and end_task procedures in the ILO_TASK package. This is effectively how task timing is performed. So, the code then looks like this in the ILO_TIMER package:


FUNCTION begin_timed_task(p_begin_time timestamp) RETURN NUMBER IS
BEGIN
  --
  -- If using dblog return the start time in seconds
  --
  $IF $$dblog $THEN
    IF p_begin_time IS NOT NULL THEN
      RETURN(
        (extract(hour from p_begin_time))*3600+
        (extract(minute from p_begin_time))*60+
        extract(second from p_begin_time)
      );
    END IF;
  $END
  --
  RETURN NULL;
EXCEPTION
  WHEN OTHERS THEN
    if ilo_util.get_raise_exceptions then
      raise;
    else
      RETURN NULL;
    end if;
END begin_timed_task;

PROCEDURE end_timed_task (
  p_stack_rec    ilo_task.stack_rec_t
  ,p_error_num    PLS_INTEGER default 0
  ,p_end_time     timestamp DEFAULT NULL
  ,p_widget_count NUMBER DEFAULT NULL)
IS
BEGIN
  --
  -- If using dblog set the task time
  --
  $IF $$dblog $THEN
    IF (( p_end_time IS NOT NULL ) AND ( p_stack_rec.rtime_active )) THEN
      --
      -- The following can be used as debugging statements
      --
      --dbms_output.put_line('module: ' || p_stack_rec.module );
      --dbms_output.put_line('action: ' || p_stack_rec.action );
      --dbms_output.put_line('SEQUENCE: ' || p_stack_rec.SEQUENCE );
      --dbms_output.put_line('p_end_time: ' || ((extract(hour from p_end_time))*3600+
      --  (extract(minute from p_end_time))*60+ extract(second from p_end_time)) );
      --
      IF ( ( ((extract(hour from p_end_time))*3600+
        (extract(minute from p_end_time))*60+
        extract(second from p_end_time)) - p_stack_rec.SEQUENCE ) >= 0 ) THEN
        --
        dblog.log(
          p_module_name=> p_stack_rec.module,
          p_action_name=> p_stack_rec.action,
          p_tasktime => ( ((extract(hour from p_end_time))*3600+
            (extract(minute from p_end_time))*60+
            extract(second from p_end_time)) - p_stack_rec.SEQUENCE ),
            p_text=> 'Task time' );
      END IF;
    END IF;
  $END
  --
  NULL;
EXCEPTION
  WHEN OTHERS THEN
    if ilo_util.get_raise_exceptions then
      raise;
    else
      NULL;
    end if;
END end_timed_task;

The begin_timed_task function simply returns the timestamp in seconds. The ILO_TASK.begin_task procedure sets this value in a SEQUENCE field of a record for the current stack level (e.g. the task nesting mechanism). When an ILO_TASK.end_task call is made with the end_time parameter the task time is calculated from the current timestamp and the value saved from the corresponding begin_task call.  A call is then made to the logging procedure to save that time as the “tasktime” for the current module and action.

As it turns out is was pretty simple. Of course most of the heavy lifting is being done by the logging package and I will make that available shortly along with some examples of how to put this to real use.

What a great week!

February 26th, 2011

It was a really good week. To start the week off I was able to attend an Exadata workshop at Oracle. From a technical point of view the Exadata machine really is quite amazing. The storage subsystem and the integration with the Oracle database is very impressive. As far as database I/O is concerned it is a real game changer. We know have to think in terms of row level I/O when the Smart Scan is involved.

I was also able to attend the NoCOUG conference this week and hear Cary Millsap speak about performance. Cary is one of those people that, in the words of another expert in his field, really does inform, instruct and inspire. I always take away something when I hear Cary speak.

And if that weren’t enough I was able to arrange to visit with Cary and get him a tour of the National Ignition Facility. I’m always awed by the NIF and I think Cary enjoyed it as well. It really was a great week.

Input Validation

February 24th, 2011

Tom Kyte posted a blog entry that included a test illustrating why not using bind variables can make your code susceptible to SQL Injection. I’ve heard Tom and others talk about this subject before, but to be honest I just accepted their premise and never spent much time experimenting with it. So after reading the post and Googling for SQL Injection, I found a nice example by David Litchfield and came up with this as a way to attack the code included in the test.

create or replace procedure inj( p_date in date )
as
l_rec   all_users%rowtype;
c       sys_refcursor;
l_query long;
begin
l_query := '
select *
from all_users
where created = ''' ||p_date ||'''';

dbms_output.put_line( l_query );
open c for l_query;

for i in 1 .. 5
loop
fetch c into l_rec;
exit when c%notfound;
dbms_output.put_line( l_rec.username || '.....' );
end loop;
close c;
end;
/

SQL> ALTER SESSION SET NLS_DATE_FORMAT = '" '' or ''a'' = ''a"';

Session altered.

SQL> exec inj(''' or ''a'' = ''a');

select *
from all_users
where created = ' ' or 'a' = 'a'
ILO.....
ILMDS.....
BI.....
PM.....
SH.....

PL/SQL procedure successfully completed.

SQL>

Tom subsequently posted a followup entry and my example wasn’t too far off and did illustrate the point.

I thought question 3: “how to best protect against that attack” was also interesting. Here’s my attempt at creating a “safer” procedure:

create or replace procedure inj1( p_date in date )
as
  l_ctr pls_integer := 1;
begin
  for l_rec in
    ( select * from all_users
      where created = p_date )
  loop
    if l_ctr <= 5 then
      dbms_output.put_line( l_rec.username || '.....' );
    else
      exit;
    end if;
    --
    l_ctr := l_ctr + 1;
  end loop;
end;
/

The big difference I think is that there was no reason to use dynamic SQL in the original example. Now I understand it was for illustration, but realistically in that case dynamic SQL just added to the ease of attacking the code. I still use a bind variable but I’ve eliminated the dynamic SQL from the code. The results?

SQL> ALTER SESSION SET NLS_DATE_FORMAT = '" '' or ''a'' = ''a"';

Session altered.

SQL> set serveroutput on
SQL> exec inj1(''' or ''a'' = ''a');

PL/SQL procedure successfully completed.

SQL> select username,
to_char(created,'MM/DD/YYYY HH24:MI:SS')
from all_users where trunc(created) =
to_date('10/14/2007','MM/DD/YYYY');

USERNAME                       TO_CHAR(CREATED,'MM
------------------------------ -------------------
SYS                            10/14/2007 15:05:30
SYSTEM                         10/14/2007 15:05:31
OUTLN                          10/14/2007 15:05:32
DIP                            10/14/2007 15:07:32
XDB                            10/14/2007 15:31:47
DBSNMP                         10/14/2007 15:20:35
WMSYS                          10/14/2007 15:23:01
EXFSYS                         10/14/2007 15:30:19
CTXSYS                         10/14/2007 15:30:45
ANONYMOUS                      10/14/2007 15:31:47
ORDSYS                         10/14/2007 15:36:15
ORDPLUGINS                     10/14/2007 15:36:16
SI_INFORMTN_SCHEMA             10/14/2007 15:36:16
MDSYS                          10/14/2007 15:36:16
OLAPSYS                        10/14/2007 15:46:49
MDDATA                         10/14/2007 15:47:54
SYSMAN                         10/14/2007 15:54:03
MGMT_VIEW                      10/14/2007 15:58:29
FLOWS_FILES                    10/14/2007 15:59:24
APEX_PUBLIC_USER               10/14/2007 15:59:24
FLOWS_030000                   10/14/2007 15:59:24
OWBSYS                         10/14/2007 16:10:50
SCOTT                          10/14/2007 16:13:15

23 rows selected.

SQL> exec inj(to_date('10/14/2007','MM/DD/YYYY'));

select *
from all_users
where created = '10/14/2007'

PL/SQL procedure successfully completed.

SQL> exec inj1(to_date('10/14/2007 16:13:15','MM/DD/YYYY HH24:MI:SS'));
SCOTT.....

PL/SQL procedure successfully completed.

SQL> exec inj(to_date('10/14/2007 16:13:15','MM/DD/YYYY HH24:MI:SS'));

select *
from all_users
where created = '10/14/2007'

PL/SQL procedure successfully completed.

SQL>

The attack didn’t work, but there are other problems. Because of the date usage the query isn’t really practical as you can see in the example. Unless you format or trunc the “created” column it’s not very easy to make this procedure, or the original for that matter, work as intended.

My point though is not to take issue with a simple example, but to raise the issue of input validation. In playing around with this I was reminded of how hard it can be to really code the checks needed to insure that input is properly validated. Most of the time we don’t and we just throw an exception when bad input causes a failure at run time. This practice though, can also lead to SQL Injection if you’re not careful. Even Tom and several commenters to his thread point out, that another way to protect the code is to validate the input. Whether by using format masks as Tom mentions, which is probably the easiest way, or by actually decoding and testing the input. The problem of course is that actually validating input can be hard and leads to more coding and possibly errors. Of course getting your database hacked isn’t that great of an alternative either.

New SQL Execution Tree Utility Posted

September 16th, 2010

Well I’ve finally posted the SQL execution tree utility that I mentioned here. It is available on appsdba.com at the bottom of the Resource Profiler page. I called it the sql_tree utility for lack of a better name. I guess I could have named it the extended SQL trace parse, execute and fetch exclusive time calculator, but sql_tree just seemed easier. In any case it’s available if you want to check it out. I posted a pretty detailed explanation in my original blog post so I won’t repeat it here, but I did add a fairly good size example with the utility on appsdba.com.