Tag Archives: SQL

Improper Use of the Oracle ‘Rownum’ Pseudocolumn

The other day I found myself needing to explain to some developers why their use-case of the Oracle ‘rownum’ pseudocolumn was yielding a result in one database instance, but a completely different result in another.

In this situation, the correct result is the ‘maximum’ value of the column, however this query was also occasionally returning the exact ‘minimum’ value of this column. How could this happen? The answer lies in the using the ‘rownum’ pseudocolumn correctly. Of course there are other (probably better) ways to write this query without the use of ‘rownum’, but I’m not here to debate that right now….

** Note the tables in the query have been changed to protect the innocent.

select column_a from (select column_a,rownum rowid0 from schema.table order by column_a desc ) aa where aa.rowid0 =1;

Oracle documentation states that it depends how Oracle accessed the rows in the query as to which result you will get. For example your results can vary depending on a lot of factors (ie: the order that you inserted the data in the table or if there is an index on the table and how that index is used). For further information you can see the documentation here:

https://docs.oracle.com/cd/B28359_01/server.111/b28286/pseudocolumns009.htm#SQLRF00255

For further explanation, lets explore the explain plans encountered used in each system:

Correct Result:


--------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                          | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                               |       |       |     1 (100)|          |
|*  1 |  VIEW                        |                               |  1257 | 32682 |     1   (0)| 00:00:01 |
|   2 |   COUNT                      |                               |       |       |            |          |
|   3 |    INDEX FULL SCAN DESCENDING| SCHEMA_TABLE_PK               |  1257 |  6285 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("AA"."ROWID0"=1)


22 rows selected.

Incorrect Result:


---------------------------------------------------------------------------------------------------------
| Id  | Operation               | Name                          | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |                               |       |       |     4 (100)|          |
|*  1 |  VIEW                   |                               |  1257 | 32682 |     4  (25)| 00:00:01 |
|   2 |   SORT ORDER BY         |                               |  1257 |  6285 |     4  (25)| 00:00:01 |
|   3 |    COUNT                |                               |       |       |            |          |
|   4 |     INDEX FAST FULL SCAN| SCHEMA_TABLE_PK               |  1257 |  6285 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("AA"."ROWID0"=1)


24 rows selected.

As you can see, the major difference here is that the two systems have not chosen the same access path in which to return the data. In one system a plan utilized an ‘INDEX FULL SCAN DESCENDING’ access path, while the other utilized an ‘INDEX FAST FULL SCAN’ access path.

Is this really that different? Turns out it is.

ASK Tom Explained the reason why very concisely:
(Ask TOM “Difference between Full Index Scans and Fast Full Index Scans”)

They state that:

“An index fast full scan reads the ENTIRE index, unsorted, as it exists on disk. It is basically using the index as a “skinny” version of the table. The query in question would only be accessing attributes in the index (we are not using the index as a way to get to the table, we are using the index INSTEAD of the table) We use multiblock IO and read all of the leaf, branch and the root block. We ignore the branch and root blocks and just process the (unordered) data on the leaf blocks.

An index full scan is when we read the index a block at a time – from start to finish. We’ll read the root block, navigate down the left hand side of the index (or right if we are doing a descending full scan) and then when we hit the leaf block – we’ll read across the entire bottom of the index – a block at a time – in sorted order. We use single block IO, not multiblock IO for this operation.”

Well there you have it. And this is why the result is different. How can we keep this from occurring in the future? The answer is to utilize the ‘rownum’ pseudocolumn correctly. Remember, rownum is not a real column so in order to get the right results, it needs to be added after the data is in the sorted order that you want. To do that, make sure you write the query so that ‘rownum’ is applied after the sort. Using the same query above, lets ‘rewrite’ it in such a way that it will achieve the desired results:

select column_a from (select column_a,rownum from (select column_a from schema.table order by column_a desc)) where rownum = 1;

See the steps now?

  1. Retrieve data in sorted order
  2. Apply the ‘rownum’ pseudocolumn
  3. Filter for the desired value in the list

If you must use the ‘rownum’ pseudocolumn, writing your query in this manner will ensure that you always get the same result.

Enjoy!