The LAG Function

The Oracle LAG function gives the value of the previous row within a table without the need for a self join. The LAG and LEAD functions have been available since Oracle 8.1.6 when analytic functions were first introduced. Tom Kyte dedicated an entire chapter to analytic functions in his first book, “Expert One-on-One Oracle” and they have obviously been part of the Oracle documentation as well. Their adoption has not been as quick though, and I will admit to being slow when it comes to converting code from using a self join and subtracting n and n+1 values rather than using LAG. In the open source SYSMON code many of the views used for interval summation used a function of the form:

  DECODE( SIGN(e.phyrds - b.phyrds),-1,0,
    (e.phyrds - b.phyrds) ),

This required a self join to get the begin and end values for each interval. This was done to insure that no negative values occurred when an instance restart occurred. The full SQL looked similar to this:

WITH g_fileio AS
(
  select
    fs.snap_id,
    fs.dbid,
    fs.instance_number,
    fs.tsname,
    fs.filename,
    fs.phyrds
  from
    dba_hist_filestatxs fs
)
SELECT
  b.begin_interval_time,
  e.begin_interval_time,
  e.snap_id,
  e.dbid,
  e.instance_number,
  e.tsname,
  e.filename,
  DECODE( SIGN(e.phyrds - b.phyrds),-1,0,
    (e.phyrds - b.phyrds) ),
  DECODE( SIGN(e.phyblkrd - b.phyblkrd),-1,0,
    (e.phyblkrd - b.phyblkrd) ),
  DECODE( SIGN(e.readtim - b.readtim),-1,0,
    (e.readtim - b.readtim) ),
  DECODE( SIGN(e.singleblkrds - b.singleblkrds),-1,0,
    (e.singleblkrds - b.singleblkrds) ),
  DECODE( SIGN(e.singleblkrdtim - b.singleblkrdtim),-1,0,
    (e.singleblkrdtim - b.singleblkrdtim) ),
  DECODE( SIGN(e.phywrts - b.phywrts),-1,0,
    (e.phywrts - b.phywrts) ),
  DECODE( SIGN(e.phyblkwrt - b.phyblkwrt),-1,0,
    (e.phyblkwrt - b.phyblkwrt) ),
  DECODE( SIGN(e.writetim - b.writetim),-1,0,
    (e.writetim - b.writetim) )
FROM
  g_fileio b,
  g_fileio e
WHERE
  e.snap_id = b.snap_id + 1
  AND b.tsname = e.tsname
  AND b.filename = e.filename
  AND b.dbid = e.dbid
  AND b.instance_number = e.instance_number

The following will compare the “old” way of using a self join with the LAG function.

LAG Function Usage

The following example will show how the LAG function can be used when accessing AWR tables in Oracle 11g. Initially just one “value” will be accessed to show how the LAG function works. The following SQL illustrates the example:

column filename format a50;
WITH g_fileio AS
(
  select
    fs.snap_id,
    fs.dbid,
    fs.instance_number,
    fs.tsname,
    fs.filename,
    fs.phyrds
  from
    dba_hist_filestatxs fs
)
SELECT
  e.snap_id,
  e.instance_number,
  e.filename,
  e.phyrds as curr_rds,
  LAG(e.phyrds,1,0) OVER(
    PARTITION BY e.dbid, e.instance_number, e.tsname, e.filename
    ORDER BY e.snap_id ) as prev_rds
FROM
  g_fileio e
WHERE
  e.snap_id between 83578 and 83590
  and e.tsname = 'DATA_TBS'
  and e.instance_number = 2
ORDER BY
  e.snap_id
/

This LAG function has been written in a manner that if the previous row is outside the scope of our result set it will return a zero, otherwise it will return the previous row’s value for phyrds. When this SQL is run we see that for the first interval that the “previous” value is 0. This makes sense since there is no previous row in our set, as constrained by “e.snap_id between 83578 and 83590″, and as we said previously our LAG function has been defined to return a zero in that case. The rest of the rows return the value of the previous row’s phyrds column. Notice that we’ve order by snap_id to make this easy to see in the example below:

   SNAP_ID INSTANCE_NUMBER FILENAME                                 CURRRDS    PREVRDS
---------- --------------- ------------------------------------- ---------- ----------
     83578               2 +ORADATA01/db11g/data_tbs_01.dbf           36477          0
     83579               2 +ORADATA01/db11g/data_tbs_01.dbf           36484      36477
     83580               2 +ORADATA01/db11g/data_tbs_01.dbf           36484      36484
     83581               2 +ORADATA01/db11g/data_tbs_01.dbf           36512      36484
     83582               2 +ORADATA01/db11g/data_tbs_01.dbf           36512      36512
     83583               2 +ORADATA01/db11g/data_tbs_01.dbf           36519      36512
     83584               2 +ORADATA01/db11g/data_tbs_01.dbf           36519      36519
     83585               2 +ORADATA01/db11g/data_tbs_01.dbf           36541      36519
     83586               2 +ORADATA01/db11g/data_tbs_01.dbf           36541      36541
     83587               2 +ORADATA01/db11g/data_tbs_01.dbf           36542      36541
     83588               2 +ORADATA01/db11g/data_tbs_01.dbf           36544      36542
     83589               2 +ORADATA01/db11g/data_tbs_01.dbf           36544      36544
     83590               2 +ORADATA01/db11g/data_tbs_01.dbf           36545      36544

This is an important observation that we will explore further in a subsequent section.

Efficiency

So an interesting question is how much better, if at all, is the LAG function over the original self join?

In order to make the “lag” example produce the same output as the “self-join” example the LAG function was moved into the “WITH” clause so that the row previous to the first row of the result set would be available. In our first example we didn’t do this to illustrate how the LAG function handles the situation where there is no previous row.

The “self-join” SQL:

column filename format a50;
SELECT
  e.snap_id,
  e.instance_number,
  e.filename,
  DECODE( SIGN(e.phyrds - b.phyrds),-1,0,
    (e.phyrds - b.phyrds) ) AS delta_phyrds
FROM
  dba_hist_filestatxs b,
  dba_hist_filestatxs e
WHERE
  e.snap_id = b.snap_id + 1
  AND b.filename = e.filename
  AND b.dbid = e.dbid
  AND b.instance_number = e.instance_number
  and e.snap_id between 83578 and 83590
  and e.tsname = 'DATA_TBS'
  and e.instance_number = 2
ORDER BY
  e.snap_id
/

The “lag” SQL:

column filename format a50;
WITH g_fileio AS
(
  select
    fs.snap_id,
    fs.dbid,
    fs.instance_number,
    fs.tsname,
    fs.filename,
    fs.phyrds - LAG(fs.phyrds,1,0)
    OVER( PARTITION BY fs.dbid, fs.instance_number, fs.tsname, fs.filename
      ORDER BY fs.snap_id ) as delta_phyrds
  from
    dba_hist_filestatxs fs
)
SELECT
  e.snap_id,
  e.instance_number,
  e.filename,
  e.delta_phyrds
FROM
  g_fileio e
WHERE
  e.snap_id BETWEEN 83578 AND 83590 AND
  e.tsname = 'DATA_TBS' AND
  e.instance_number = 2
ORDER BY
  e.snap_id
/

Using Tom Kyte’s “runstats” utility makes testing the difference between these two methods easy. The following will use a slightly modified version available here that summarizes the output into a “workload” format. This was used here because it shows just what is needed to see the difference between the two methods.

The test was run several times to try and filter out the affects of whether the data was cached or not. In fact, the “self-join” version is particularly sensitive to caching and can be expected to run much more slowly when the data is not if the buffer cache.

Test Script

exec runStats_pkg.rs_start;
column filename format a50;
SELECT
  e.snap_id,
  e.instance_number,
  e.filename,
  DECODE( SIGN(e.phyrds - b.phyrds),-1,0,
    (e.phyrds - b.phyrds) ) AS delta_phyrds
FROM
  dba_hist_filestatxs b,
  dba_hist_filestatxs e
WHERE
  e.snap_id = b.snap_id + 1
  AND b.filename = e.filename
  AND b.dbid = e.dbid
  AND b.instance_number = e.instance_number
  and e.snap_id between 83578 and 83590
  and e.tsname = 'CMSPUB_DATA_TBS'
  and e.instance_number = 2
ORDER BY
  e.snap_id
/

exec runStats_pkg.rs_middle;

column filename format a50;
WITH g_fileio AS
(
  select
    fs.snap_id,
    fs.dbid,
    fs.instance_number,
    fs.tsname,
    fs.filename,
    fs.phyrds - LAG(fs.phyrds,1,0) OVER( PARTITION BY fs.dbid, fs.instance_number, fs.tsname, fs.filename
      ORDER BY fs.snap_id ) as delta_phyrds
  from
    dba_hist_filestatxs fs
)
SELECT
  e.snap_id,
  e.instance_number,
  e.filename,
  e.delta_phyrds
FROM
  g_fileio e
WHERE
  e.snap_id between 83578 and 83590
  and e.tsname = 'CMSPUB_DATA_TBS'
  and e.instance_number = 2
ORDER BY
  e.snap_id
/

exec runStats_pkg.rs_stop(p_output=>'WORKLOAD');

Workload Results

The following are the results from three tests run one right after another. In each case the LAG is faster and does less work. In the third run a I reversed the order just to see if running either one first made any difference.

SQL> Run1 ran in 118 hsecs
Run2 ran in 11 hsecs
run 1 ran in 1072.73% of the time

Name                                            Run1        Run2        Diff
STAT...sorts (memory)                              2           3           1
STAT...physical reads                             29           0         -29
STAT...physical reads cache                       29           0         -29
STAT...CPU used by this session                   89          13         -76
STAT...Elapsed Time                              121          14        -107
STAT...redo size                               4,568       4,816         248
STAT...sorts (rows)                            5,860      11,707       5,847
STAT...session logical reads                 170,866      19,517    -151,349

Run1 latches total versus runs -- difference and pct
Run1        Run2        Diff       Pct
1,793,976     135,443  -1,658,533  1,324.52%

PL/SQL procedure successfully completed.

SQL>

SQL> Run1 ran in 85 hsecs
Run2 ran in 24 hsecs
run 1 ran in 354.17% of the time

Name                                            Run1        Run2        Diff
STAT...recursive cpu usage                         3           2          -1
STAT...sorts (memory)                              2           3           1
STAT...physical reads cache                        8           0          -8
STAT...physical reads                              8           0          -8
STAT...CPU used by this session                   75          25         -50
STAT...Elapsed Time                               88          28         -60
STAT...redo size                               5,952       5,108        -844
STAT...sorts (rows)                            5,861      11,709       5,848
STAT...session logical reads                 170,942      19,546    -151,396

Run1 latches total versus runs -- difference and pct
Run1        Run2        Diff       Pct
1,895,158     701,188  -1,193,970    270.28%

PL/SQL procedure successfully completed.

SQL>

SQL> Run1 ran in 24 hsecs
Run2 ran in 88 hsecs
run 1 ran in 27.27% of the time

Name                                            Run1        Run2        Diff
STAT...sorts (memory)                              3           2          -1
STAT...recursive cpu usage                         2           4           2
STAT...Elapsed Time                               28          90          62
STAT...CPU used by this session                   25          90          65
STAT...redo size                               4,288       5,908       1,620
STAT...sorts (rows)                           11,709       5,861      -5,848
STAT...session logical reads                  19,500     170,948     151,448

Run1 latches total versus runs -- difference and pct
Run1        Run2        Diff       Pct
510,514   2,151,923   1,641,409     23.72%

PL/SQL procedure successfully completed.

SQL>

Caveats

The biggest problem with using the LAG function is to insure that the row set that it operates on is complete. This means that the leading row prior to the first row used in the result set must be available. This usually means that an inline view or a with statement must be used to insure that the resulting query has a prior row to work with. The following example will try to illustrate this. In the first case the LAG function is calculated as part of a single query. The where clause restricts the data to a specific range of snap_ids and the resulting lag entry is incorrect for the first row:

select
  snap_id,
  wait_class,
  total_waits_fg AS waits_fg,
  total_waits_fg - LAG(total_waits_fg,1,0)
  OVER(PARTITION BY dbid, instance_number, event_id, wait_class_id
    ORDER BY snap_id) AS waits_fg_lag
from
  dba_hist_system_event
where
  snap_id between 28452 and 28462
  and wait_class = 'Commit'
  and instance_number = 1
order by
  snap_id
/
   SNAP_ID WAIT_CLASS                                         WAITS_FG WAITS_FG_LAG
---------- ------------------------------------------------ ---------- ------------
     28452 Commit                                              1241804      1241804
     28453 Commit                                              1244080         2276
     28454 Commit                                              1246347         2267
     28455 Commit                                              1248675         2328
     28456 Commit                                              1250995         2320
     28457 Commit                                              1253477         2482
     28458 Commit                                              1255821         2344
     28459 Commit                                              1258202         2381
     28460 Commit                                              1260383         2181
     28461 Commit                                              1262600         2217
     28462 Commit                                              1264982         2382

11 rows selected.

SQL>

The next example uses an inline view to allow the LAG function to have the row prior to the first row available to operate on:

compute sum of waits_fg_lag on wait_class;
break on report;
select *
from (
  select
    snap_id,
    wait_class,
    total_waits_fg AS waits_fg,
    total_waits_fg - LAG(total_waits_fg,1,0)
      OVER(PARTITION BY dbid, instance_number, event_id, wait_class_id
      ORDER BY snap_id) AS waits_fg_lag
from
  dba_hist_system_event
where
  instance_number = 1
  and wait_class = 'Commit'
)
where
  snap_id between (28452 + 1) and 28462
order by
  snap_id
/
   SNAP_ID WAIT_CLASS                                         WAITS_FG WAITS_FG_LAG
---------- ------------------------------------------------ ---------- ------------
     28453 Commit                                              1244080         2276
     28454 Commit                                              1246347         2267
     28455 Commit                                              1248675         2328
     28456 Commit                                              1250995         2320
     28457 Commit                                              1253477         2482
     28458 Commit                                              1255821         2344
     28459 Commit                                              1258202         2381
     28460 Commit                                              1260383         2181
     28461 Commit                                              1262600         2217
     28462 Commit                                              1264982         2382

10 rows selected.

SQL>

In this case the lag result for the first row is correct. Again, this is because the row prior to the first snap_id is available to the LAG function.

Conclusion

The LAG function is much more efficient in these tests than the self-join. The session logical reads average about 19,500 for the lag test and around 171,000 for the self-join test. The run time is always less for the lag test and the self-join test seems particularly sensitive to the benefits of caching when run continually. When initially run the time to run was significantly worse (see Full Runstats Test):

Run1 ran in 8296 hsecs

Run2 ran in 1913 hsecs

run 1 ran in 433.66% of the time

The only disadvantage is the more care must be taken to code the query in such a way that the row prior to the first row wanted is available to the LAG function.

Leave a Reply