Inside the Oracle Optimizer - Removing the black magic

We have moved!

2010-07-31T10:57:00.000-07:00

You might have been wondering why things had gone so quiet on the Optimizer development team's blog Optimizer Magic over the last few months. Well the blog has moved to blogs.oracle.com/optimizer. All of the old articles have moved too and we plan to be a lot more active at our new home, with at least one new post every month.

Maria Colgan

Explain adaptive cursor sharing behavior with cursor_sharing = similar and force.

2010-03-12T17:00:00.000-08:00

This post has moved to
http://blogs.oracle.com/optimizer/entry/explain_adaptive_cursor_sharing_behavior_with_cursor_sharing_similar_and_force

Upgrading from 9i to 11g and the implicit migration from RBO

2009-12-21T15:26:00.000-08:00

This blog post has moved to
http://blogs.oracle.com/optimizer/entry/upgrading_from_9i_to_11g_and_the_implicit_migration_from_rbo

Maria Colgan

What should I do with old hints in my workload?

2009-11-17T15:05:00.000-08:00

This blog post has moved to
http://blogs.oracle.com/optimizer/entry/what_should_i_do_with_old_hints_in_my_workload

Maria Colgan

Oracle Open World 2009 Summary

2009-10-28T11:09:00.000-07:00

We had a great time talking to our users at Open World 2009 both at our Demogrounds booth and at our two sessions. We received a lot of interesting questions during the Optimizer Roundtable discussion, but we did not get to answer all of them due to time constraints. We plan to address the questions we received (both answered and unanswered) in future blog posts... so stay tuned. If you didn't get to attend the discussion, but have a question about the Optimizer, submit it through the email link above.

For those of you who did not get a chance to stop by our Demogrounds booth, here's a recap of the new features that we talked about. Many of the topics have already been covered in earlier blog posts.

Enhanced bind peeking with Adaptive Cursor Sharing: Read about Adaptive Cursor Sharing and a followup.

Faster and more accurate Statistics gathering: Read about our new statistics gathering algorithm in 11g and incremental maintenance of global statistics for partitioned tables.

New types of Statistics that improve cardinality estimates

Ability to gather but not publish Statistics

Guaranteed plan stability and controlled plan evolution: Read about SQL Plan Management.

New tools to help correct regressed execution plans: Read about our Plan Diff tool and Test Case Builder.

These topics are focused on well-known pain points from earlier versions of Oracle. But we also have plenty of new optimizations in Oracle 11gR1 and 11gR2. Stay tuned for details about some of our new optimizations.

Maria Colgan

Open World Recap and New White papers

2009-10-09T12:32:00.000-07:00

The Optimizer group has two session and a demo station in the Database campground at this year's Oracle Open World. We will give a technical presentation on What to Expect from the Oracle Optimizer When Upgrading to Oracle Database 11g and host an Oracle Optimizer Roundtable.

The technical session, which is on Tuesday Oct 13 at 2:30 pm, gives step by step instructions on how to use the new 11g features to ensure your upgrade goes smoothly and without any SQL plan regressions. This session is based on our latest white papers, Upgrading from Oracle Database 10g to 11g: What to expect from the Optimizer and SQL Plan Management in Oracle Database 11g.

The roundtable, which is on Thursday Oct. 15th at 10:30 am, will give you a first hand opportunity to pose you burning Optimizer and statistics questions directly to a panel of our leading Optimizer developers. In fact if you plan to attend the roundtable and already know what questions you would like to ask, then please send them to us via email and we will be sure to include them. Other wise, you can hand in your questions at our demo station at any stage during the week, or as you enter the actual session. Just be sure to write your questions in clear block capitals!

We look forward to see you all at Open world.

Maria Colgan

What's Changed between my New Query Plan and the Old One?

2009-09-01T13:21:00.000-07:00

This blog post has moved here
http://blogs.oracle.com/optimizer/entry/whats_changed_between_my_new_query_plan_and_the_old_one

Understanding DBMS_STATS.SET_*_PREFS procedures

2009-08-11T19:25:00.000-07:00

In previous Database releases you had to use the DBMS_STATS.SET_PARM procedure to change the default value for the parameters used by the DBMS_STATS.GATHER_*_STATS procedures. The scope of any changes that were made was all subsequent operations. In Oracle Database 11g, the DBMS_STATS.SET_PARM procedure has been deprecated and it has been replaced with a set of procedures that allow you to set a preference for each parameter at a table, schema, database, and Global level. These new procedures are called DBMS_STATS.SET_*_PREFS and offer a much finer granularity of control.

However there has been some confusion around which procedure you should use when and what the hierarchy is among these procedures. In this post we hope to clear up the confusion. Lets start by looking at the list of parameters you can change using the DBMS_STAT.SET_*_PREFS procedures.

AUTOSTATS_TARGET (SET_GLOBAL_PREFS only)
CASCADE
DEGREE
ESTIMATE_PERCENT
METHOD_OPT
NO_INVALIDATE
GRANULARITY
PUBLISH
INCREMENTAL
STALE_PERCENT

As mentioned above there are four DBMS_STATS.SET_*_PREFS procedures.

SET_TABLE_PREFS

SET_SCHEMA_PREFS

SET_DATABASE_PREFS

SET_GLOBAL_PREFS

The DBMS_STATS.SET_TABLE_PREFS procedure allows you to change the default values of the parameters used by the DBMS_STATS.GATHER_*_STATS procedures for the specified table only.

The DBMS_STATS.SET_SCHEMA_PREFS procedure allows you to change the default values of the parameters used by the DBMS_STATS.GATHER_*_STATS procedures for all of the existing objects in the specified schema. This procedure actually calls DBMS_STATS.SET_TABLE_PREFS for each of the tables in the specified schema. Since it uses DBMS_STATS.SET_TABLE_PREFS calling this procedure will not affect any new objects created after it has been run. New objects will pick up the GLOBAL_PREF values for all parameters.

The DBMS_STATS.SET_DATABASE_PREFS procedure allows you to change the default values of the parameters used by the DBMS_STATS.GATHER_*_STATS procedures for all of the user defined schemas in the database. This procedure actually calls DBMS_STATS.SET_TABLE_PREFS for each of the tables in each of the user defined schemas. Since it uses DBMS_STATS.SET_TABLE_PREFS this procedure will not affect any new objects created after it has been run. New objects will pick up the GLOBAL_PREF values for all parameters. It is also possible to include the Oracle owned schemas (sys, system, etc) by setting the ADD_SYS parameter to TRUE.

The DBMS_STATS.SET_GLOBAL_PREFS procedure allows you to change the default values of the parameters used by the DBMS_STATS.GATHER_*_STATS procedures for any object in the database that does not have an existing table preference. All parameters default to the global setting unless there is a table preference set or the parameter is explicitly set in the DBMS_STATS.GATHER_*_STATS command. Changes made by this procedure will affect any new objects created after it has been run as new objects will pick up the GLOBAL_PREF values for all parameters.

With GLOBAL_PREFS it is also possible to set a default value for one additional parameter, called AUTOSTAT_TARGET. This additional parameter controls what objects the automatic statistic gathering job (that runs in the nightly maintenance window) will look after. The possible values for this parameter are ALL,ORACLE, and AUTO. ALL means the automatic statistics gathering job will gather statistics on all objects in the database. ORACLE means that the automatic statistics gathering job will only gather statistics for Oracle owned schemas (sys, sytem, etc) Finally AUTO (the default) means Oracle will decide what objects to gather statistics on. Currently AUTO and ALL behave the same.

In summary, DBMS_STATS obeys the following hierarchy for parameter values, parameters values set in the DBMS_STAT.GATHER*_STATS command over rules everything. If the parameter has not been set in the command we check for a table level preference. If there is no table preference set we use the global preference.

Will the Optimizer Team be at Oracle Open World 2009?

2009-07-23T18:18:00.001-07:00

With only two and a half months to go until Oracle Open World in San Francisco, October 11-15th, we have gotten several requests asking if we plan to present any session at the conference.

We have two session and a demo station in the Database campground at this year's show. We will give a technical presentation on What to Expect from the Oracle Optimizer When Upgrading to Oracle Database 11g and the Oracle Optimizer Roundtable.

The technical session, which is on Tuesday Oct 13 at 2:30 pm, gives step by step instructions and detailed examples of how to use the new 11g features to ensure your upgrade goes smoothly and without any SQL plan regressions.

The roundtable, which is on Thursday Oct. 15th at 10:30 am, will give you a first hand opportunity to pose you burning Optimizer and statistics questions directly to a panel of our leading Optimizer developers. In fact if you plan to attend the roundtable and already know what questions you would like to ask, then please send them to us via email and we will be sure to include them. Other wise, you can hand in your questions at our demo station at any stage during the week, or as you enter the actual session. Just be sure to write your questions in clear block capitals!

We look forward to seeing you all at Oracle Open World.

Maria Colgan

Why do I have hundreds of child cursors when cursor_sharing set to similar in 10g

2009-05-26T10:04:00.000-07:00

Recently we received several questions regarding a usual situation where a SQL Statement has hundreds of child cursors. This is in fact the expected behavior when

CURSOR_SHARING is set to similar

Bind peeking is in use

And a histogram is present on the column used in the where clause predicate of query

You must now be wondering why this is the expected behavior. In order to explain, let's step back and begin by explaining what CURSOR_SHARING actually does. CURSOR_SHARING was introduced to help relieve pressure put on the shared pool, specifically the cursor cache, from applications that use literal values rather than bind variables in their SQL statements. It achieves this by replacing the literal values with system generated bind variables thus reducing the number of (parent) cursors in the cursor cache. However, there is also a caveat or additional requirement on CURSOR_SHARING, which is that the use of system generated bind should not negatively affect the performance of the application. CURSOR_SHARING has three possible values: EXACT, SIMILAR, and FORCE. The table below explains the impact of each setting with regards to the space used in the cursor cache and the query performance.

CURSOR_SHARING VALUE	SPACE USED IN SHARED POOL	QUERY PERFORMANCE
EXACT (No literal replacement)	Worst possible case - each stmt issued has its own parent cursor	Best possible case as each stmt has its own plan generated for it based on the value of the literal value present in the stmt
FORCE	Best possible case as only one parent and child cursor for each distinct stmt	Potentially the worst case as only one plan will be used for each distinct stmt and all occurrences of that stmt will use that plan
SIMILAR without histogram present	Best possible case as only one parent and child cursor for each distinct stmt	Potentially the worst case as only one plan will be used for each distinct stmt and all occurrences of that stmt will use that plan
SIMILAR with histogram present	Not quite as much space used as with EXACT but close. Instead of each stmt having its own parent cursor they will have their own child cursor (which uses less space)	Best possible case as each stmt has its own plan generated for it based on the value of the literal value present in the stmt

In this case the statement with hundreds of children falls into the last category in the above table, having CURSOR_SHARING set to SIMILAR and a histogram on the columns used in the where clause predicate of the statement. The presence of the histogram tells the optimizer that there is a data skew in that column. The data skew means that there could potentially be multiple execution plans for this statement depending on the literal value used. In order to ensure we don't impact the performance of the application, we will peek at the bind variable values and create a new child cursor for each distinct value. Thus ensuring each bind variable value will get the most optimal execution plan. It's probably easier to understand this issue by looking at an example. Let's assume there is an employees table with a histogram on the job column and CURSOR_SHARING has been set to similar. The following query is issued

select * from employees where job = 'Clerk';

The literal value 'Clerk' will be replaced by a system generated bind variable B1 and a parent cursor will be created as

select * from employees where job = :B1;

The optimizer will peek the bind variable B1 and use the literal value 'Clerk' to determine the execution plan. 'Clerk' is a popular value in the job column and so a full table scan plan is selected and child cursor C1 is created for this plan. The next time the query is executed the where clause predicate is job='VP' so B1 will be set to 'VP', this is not a very popular value in the job column so an index range scan is selected and child cursor C2 is created. The third time the query is executed the where clause predicate is job ='Engineer' so the value for B1 is set to 'Engineer'. Again this is a popular value in the job column and so a full table scan plan is selected and a new child cursor C3 is created. And so on until we have seen all of the distinct values for job column. If B1 is set to a previously seen value, say 'Clerk', then we would reuse child cursor C1.

Value for B1	Plan Used	Cursor Number
Clerk	Full Table Scan	C1
VP	Index Range Scan	C2
Engineer	Full Table Scan	C3

As each of these cursors is actually a child cursor and not a new parent cursor you will still be better off than with CURSOR_SHARING set to EXACT as a child cursor takes up less space in the cursor cache. A child cursor doesn't contain all of the information stored in a parent cursor, for example, the SQL text is only stored in the parent cursor and not in each child.

Now that you know the explanation for all of the child cursors you are seeing you need to decide if it is a problem for you and if so which aspect affects you most, space used in the SHARED_POOL or query performance. If your goal is to guarantee the application performance is not affected by setting CURSOR_SHARING to SIMILAR then keep the system settings unchanged. If your goal is to reduce the space used in the shared pool then you can use one of the following solutions with different scopes:

Individual SQL statements - drop the histograms on the columns for each of the affected SQL statements

System-wide - set CURSOR_SHARING to FORCE this will ensure only one child cursor per SQL statement

Both of these solutions require testing to ensure you get the desired effect on your system. Oracle Database 11g provides a much better solution using the Adaptive Cursor Sharing feature. In Oracle Database 11g, all you need to do is set CURSOR_SHARING to FORCE and keep the histograms. With Adaptive Cursor Sharing, the optimizer will create a cursor only when its plan is different from any of the plans used by other child cursors. So in the above example, you will get two child cursors (C1 and C2) instead of 3.

Update on Adaptive Cursor Sharing

2009-04-12T18:24:00.001-07:00

This blog post has moved to
http://blogs.oracle.com/optimizer/entry/update_on_adaptive_cursor_sharing

Maintaining statistics on large partitioned tables

2009-02-11T14:07:00.000-08:00

We have gotten a lot of questions recently regarding how to gather and maintain optimizer statistics on large partitioned tables. The majority of these questions can be summarized into two topics:

When queries access a single partition with stale or non-existent partition level statistics I get a sub optimal plan due to “Out of Range” values

Global statistics collection is extremely expensive in terms of time and system resources

This article will describe both of these issues and explain how you can address them both in Oracle Database 10gR2 and 11gR1.

Out of Range
Large tables are often decomposed into smaller pieces called partitions in order to improve query performance and ease of data management. The Oracle query optimizer relies on both the statistics of the entire table (global statistics) and the statistics of the individual partitions (partition statistics) to select a good execution plan for a SQL statement. If the query needs to access only a single partition, the optimizer uses only the statistics of the accessed partition. If the query access more than one partition, it uses a combination of global and partition statistics.

“Out of Range” means that the value supplied in a where clause predicate is outside the domain of values represented by the [minimum, maximum] column statistics. The optimizer prorates the selectivity based on the distance between the predicate value and the maximum value (assuming the value is higher than the max), that is, the farther the value is from the maximum value, the lower the selectivity will be. This situation occurs most frequently in tables that are range partitioned by a date column, a new partition is added, and then queried while rows are still being loaded in the new partition. The partition statistics will be stale very quickly due to the continuous trickle feed load even if the statistics get refreshed periodically. The maximum value known to the optimizer is not correct leading to the “Out of Range” condition. The under-estimation of selectivity often leads the query optimizer to pick a sub optimal plan. For example, the query optimizer would pick an index access path while a full scan is a better choice.

The "Out of Range" condition can be prevented by using the new copy table statistics procedure available in Oracle Database10.2.0.4 and 11g. This procedure copies the statistics of the source [sub] partition to the destination [sub] partition. It also copies the statistics of the dependent objects: columns, local (partitioned) indexes etc. It adjusts the minimum and maximum values of the partitioning column as follows; it uses the high bound partitioning value as the maximum value of the first partitioning column (it is possible to have concatenated partition columns) and high bound partitioning value of the previous partition as the minimum value of the first partitioning column for range partitioned table. It can optionally scale some of the other statistics like the number of blocks, number of rows etc. of the destination partition.

Assume we have a table called SALES that is ranged partitioned by quarter on the SALES_DATE column. At the end of every day data is loaded into latest partition. However, statistics are only gathered at the end of every quarter when the partition is fully loaded. Assuming global and partition level statistics (for all fully loaded partitions) are up to date, use the following steps in order to prevent getting a sub-optimal plan due to “out of range”.

Lock the table statistics using LOCK_TABLE_STATS procedure in DBMS_STATS. This is to avoid interference from auto statistics job.

EXEC DBMS_STATS.LOCK_TABLE_STATS('SH','SALES');

Before beginning the initial load into each new partition (say SALES_Q4_2000) copy the statistics from the previous partition (say SALES_Q3_2000) using COPY_TABLE_STATS. You need to specify FORCE=>TRUE to override the statistics lock.

EXEC DBMS_STATS.COPY_TABLE_STATS ('SH', 'SALES', 'SALES_Q3_2000', 'SALES_Q4_2000', FORCE=>TRUE);

Expensive global statistics collection

In data warehouse environment it is very common to do a bulk load directly into one or more empty partitions. This will make the partition statistics stale and may also make the global statistics stale. Re-gathering statistics for the effected partitions and for the entire table can be very time consuming. Traditionally, statistics collection is done in a two-pass approach:

In the first pass we will scan the table to gather the global statistics

In the second pass we will scan the partitions that have been changed to gather their partition level statistics.

The full scan of the table for global statistics collection can be very expensive depending on the size of the table. Note that the scan of the entire table is done even if we change a small subset of partitions.

In Oracle Database 11g, we avoid scanning the whole table when computing global statistics by deriving the global statistics from the partition statistics. Some of the statistics can be derived easily and accurately from partition statistics. For example, number of rows at global level is the sum of number of rows of partitions. Even global histogram can be derived from partition histograms. But the number of distinct values (NDV) of a column cannot be derived from partition level NDVs. So, Oracle maintains another structure called a synopsis for each column at the partition level. A synopsis can be considered as sample of distinct values. The NDV can be accurately derived from synopses. We can also merge multiple synopses into one. The global NDV is derived from the synopsis generated by merging all of the partition level synopses. To summarize

Gather statistics and create synopses for the changed partitions only
Oracle automatically merges partition level synopses into a global synopsis
The global statistics are automatically derived from the partition level statistics and global synopses

Incremental maintenance feature is disabled by default. It can be enabled by changing the INCREMENTAL table preference to true. It can also be enabled for a particular schema or at the database level. If you are interested in more details of the incremental maintenance feature, please refer to the following paper presented in SIGMOD 2008 and to our previous blog entry on new ndv gathering in 11g.

Assume we have table called SALES that is range partitioned by day on the SALES_DATE column. At the end of every day data is loaded into latest partition and partition statistics are gathered. Global statistics are only gathered at the end of every month because gathering them is very time and resource intensive. Use the following steps in order to maintain global statistics after every load.

Turn on incremental feature for the table.

EXEC DBMS_STATS.SET_TABLE_PREFS('SH','SALES','INCREMENTAL','TRUE');

At the end of every load gather table statistics using GATHER_TABLE_STATS command. You don’t need to specify the partition name. Also, do not specify the granularity parameter. The command will collect statistics for partitions with stale or missing statistics and update the global statistics based on the partition level statistics and synopsis.

EXEC DBMS_STATS.GATHER_TABLE_STATS('SH','SALES');

Note: that the incremental maintenance feature was introduced in Oracle Database 11g Release 1. However, we also provide a solution in Oracle Database10g Release 2 (10.2.0.4) that simulates the same behavior. The 10g solution is a new value, 'APPROX_GLOBAL AND PARTITION' for the GRANULARITY parameter of the GATHER_TABLE_STATS procedures. It behaves the same as the incremental maintenance feature except that we don’t update the NDV for non-partitioning columns and number of distinct keys of the index at the global level. For partitioned column we update the NDV as the sum of NDV at the partition levels. Also we set the NDV of columns of unique indexes as the number of rows of the table. In general, non-partitioning column NDV at the global level becomes stale less often. It may be possible to collect global statistics less frequently then the default (when table changes 10%) since approx_global option maintains most of the global statistics accurately.

Let's take a look at an example to see how you would effectively use the Oracle Database 10g approach.

After the data load is complete, gather statistics using DBMS_STATS.GATHER_TABLE_STATS for the last partition (say SALES_11FEB2009), specify granularity => 'APPROX_GLOBAL AND PARTITION'. It will collect statistics for the specified partition and derive global statistics from partition statistics (except for NDV as described before).

EXEC DBMS_STATS.GATHER_TABLE_STATS ('SH', 'SALES', 'SALES_11FEB2009', GRANULARITY => 'APPROX_GLOBAL AND PARTITION');

It is necessary to install the one off patch for bug 8719831 if you are using copy_table_stats procedure or APPROX_GLOBAL option in 10.2.0.4 (patch 8877245) or in 11.1.0.7 (patch 8877251).

Maria Colgan

SQL Plan Management (Part 4 of 4): User Interfaces and Other Features

2009-02-02T17:32:00.000-08:00

In the first three parts of this article, we have seen how SQL plan baselines are created, used and evolved. In this final installment, we will show some user interfaces, describe the interaction of SPM with other features and answer some of your questions.

DBMS_SPM package

A new package, DBMS_SPM, allows you to manage plan histories. We have already seen in previous examples how you can use it to create and evolve SQL plan baselines. Other management functions include changing attributes (like enabled status and plan name) of plans or dropping plans. You need the ADMINISTER SQL MANAGEMENT OBJECT privilege to execute this package.

Viewing the plan history

Regardless of how a plan history is created, you can view details about the various plans in the view DBA_SQL_PLAN_BASELINES. At the end of Part 3 of this blog, we saw that the SQL statement had two accepted plans:

SQL> select sql_text, sql_handle, plan_name, enabled, accepted
2 from dba_sql_plan_baselines;

SQL_TEXT SQL_HANDLE PLAN_NAME ENA ACC
------------------------ ------------------------ ----------------------------- --- ---
select p.prod_name, s.am SYS_SQL_4bf04d85fcc170b0 SYS_SQL_PLAN_fcc170b08cbcb825 YES YES
ount_sold, t.calendar_ye
ar
from sales s, products p
, times t
where s.prod_id = p.prod
_id
and s.time_id = t.time
_id
and p.prod_id < :pid

select p.prod_name, s.am SYS_SQL_4bf04d85fcc170b0 SYS_SQL_PLAN_fcc170b0a62d0f4d YES YES
ount_sold, t.calendar_ye
ar
from sales s, products p
, times t
where s.prod_id = p.prod
_id
and s.time_id = t.time
_id
and p.prod_id < :pid

The SQL handle is a unique identifier for each SQL statement that you can use when managing your plan history using the DBMS_SPM package.

Creating an accepted plan by modifying the SQL text

Some of you may be manually tuning SQL statements by adding hints or otherwise modifying the SQL text. If you enable automatic capture of SQL plans and then execute this statement, you will be creating a SQL plan baseline for this modified statement. What you most likely want, however, is to add this plan to the plan history of the original SQL statement. Here's how you can do this using the above SQL statement as an example.

Let's modify the SQL statement, execute it and look at the plan:

SQL> var pid number
SQL> exec :pid := 100;

PL/SQL procedure successfully completed.

SQL> select /*+ leading(t) */ p.prod_name, s.amount_sold, t.calendar_year
2    from sales s, products p, times t
3    where s.prod_id = p.prod_id
4      and s.time_id = t.time_id
5      and p.prod_id < :pid;

PROD_NAME AMOUNT_SOLD CALENDAR_YEAR
--------- ----------- -------------
...
9 rows selected.

SQL> select * from table(dbms_xplan.display_cursor('b17wnz4y8bqv1', 0, 'basic note'));

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------
EXPLAINED SQL STATEMENT:
------------------------
select /*+ leading(t) */ p.prod_name, s.amount_sold, t.calendar_year
from sales s, products p, times t where s.prod_id = p.prod_id   and
s.time_id = t.time_id   and p.prod_id < :pid

Plan hash value: 2290436051

---------------------------------------------------------------
| Id | Operation                            | Name           |
---------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                |
| 1 | HASH JOIN                           |                |
| 2 |   HASH JOIN                          |                |
| 3 |    TABLE ACCESS FULL                 | TIMES          |
| 4 |    PARTITION RANGE ALL               |                |
| 5 |     TABLE ACCESS BY LOCAL INDEX ROWID| SALES          |
| 6 |      BITMAP CONVERSION TO ROWIDS     |                |
| 7 |       BITMAP INDEX RANGE SCAN        | SALES_PROD_BIX |
| 8 |   TABLE ACCESS BY INDEX ROWID        | PRODUCTS       |
| 9 |    INDEX RANGE SCAN                  | PRODUCTS_PK    |
---------------------------------------------------------------

23 rows selected.

We can now create a new accepted plan for the original SQL statement by associating the modified statement's plan to the original statement's sql handle (obtained from DBA_SQL_PLAN_BASELINES):

SQL> var pls number
SQL> exec :pls := dbms_spm.load_plans_from_cursor_cache( -
>                    sql_id => 'b17wnz4y8bqv1', -
>                    plan_hash_value => 2290436051, -
>                    sql_handle => 'SYS_SQL_4bf04d85fcc170b0');

If the original SQL statement does not already have a plan history (and thus no SQL handle), another version of load_plans_from_cursor_cache allows you to specify the original statement's text.

To confirm that we now have three accepted plans for our SQL statement, let's check in DBA_SQL_PLAN_BASELINES:

SQL> select sql_text, sql_handle, plan_name, enabled, accepted
2 from dba_sql_plan_baselines;

SQL_TEXT SQL_HANDLE PLAN_NAME ENA ACC
------------------------ ------------------------ ----------------------------- --- ---
select p.prod_name, s.am SYS_SQL_4bf04d85fcc170b0 SYS_SQL_PLAN_fcc170b0888547d3 YES YES
ount_sold, t.calendar_ye
ar
from sales s, products p
, times t
where s.prod_id = p.prod
_id
and s.time_id = t.time
_id
and p.prod_id < :pid

select p.prod_name, s.am SYS_SQL_4bf04d85fcc170b0 SYS_SQL_PLAN_fcc170b08cbcb825 YES YES
ount_sold, t.calendar_ye
ar
from sales s, products p
, times t
where s.prod_id = p.prod
_id
and s.time_id = t.time
_id
and p.prod_id < :pid

select p.prod_name, s.am SYS_SQL_4bf04d85fcc170b0 SYS_SQL_PLAN_fcc170b0a62d0f4d YES YES
ount_sold, t.calendar_ye
ar
from sales s, products p
, times t
where s.prod_id = p.prod
_id
and s.time_id = t.time
_id
and p.prod_id < :pid

Displaying plans

When the optimizer uses an accepted plan for a SQL statement, you can see it in the plan table (for explain) or V$SQL_PLAN (for shared cursors). Let's explain the SQL statement above and display its plan:

SQL> explain plan for
2 select p.prod_name, s.amount_sold, t.calendar_year
3 from sales s, products p, times t
4 where s.prod_id = p.prod_id
5 and s.time_id = t.time_id
6 and p.prod_id < :pid;

The note at the bottom tells you that the optimizer used an accepted plan.

A plan history might have multiple plans. You can see one of the accepted plans if the optimizer selects it for execution. But what if you want to display some or all of the other plans? You can do this using the display_sql_plan_baseline function in the DBMS_XPLAN package. Using the above example, here's how you can display the plan for all plans in the plan history.

SQL> select *
2 from table(dbms_xplan.display_sql_plan_baseline(
3 sql_handle => 'SYS_SQL_4bf04d85fcc170b0', format => 'basic'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
SQL handle: SYS_SQL_4bf04d85fcc170b0
SQL text: select p.prod_name, s.amount_sold, t.calendar_year from sales s,
          products p, times t where s.prod_id = p.prod_id   and s.time_id =
          t.time_id   and p.prod_id < :pid
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Plan name: SYS_SQL_PLAN_fcc170b0888547d3
Enabled: YES     Fixed: NO      Accepted: YES     Origin: MANUAL-LOAD
--------------------------------------------------------------------------------

Plan hash value: 2290436051

---------------------------------------------------------------
| Id | Operation                            | Name           |
---------------------------------------------------------------
| 0 | SELECT STATEMENT                     |                |
| 1 | HASH JOIN                           |                |
| 2 |   HASH JOIN                          |                |
| 3 |    TABLE ACCESS FULL                 | TIMES          |
| 4 |    PARTITION RANGE ALL               |                |
| 5 |     TABLE ACCESS BY LOCAL INDEX ROWID| SALES          |
| 6 |      BITMAP CONVERSION TO ROWIDS     |                |
| 7 |       BITMAP INDEX RANGE SCAN        | SALES_PROD_BIX |
| 8 |   TABLE ACCESS BY INDEX ROWID        | PRODUCTS       |
| 9 |    INDEX RANGE SCAN                  | PRODUCTS_PK    |
---------------------------------------------------------------

--------------------------------------------------------------------------------
Plan name: SYS_SQL_PLAN_fcc170b08cbcb825
Enabled: YES     Fixed: NO      Accepted: YES     Origin: AUTO-CAPTURE
--------------------------------------------------------------------------------

Plan hash value: 2361178149

------------------------------------------
| Id | Operation             | Name     |
------------------------------------------
|   0 | SELECT STATEMENT      |          |
|   1 | HASH JOIN            |          |
|   2 |   HASH JOIN           |          |
|   3 |    PARTITION RANGE ALL|          |
|   4 |     TABLE ACCESS FULL | SALES    |
|   5 |    TABLE ACCESS FULL | TIMES    |
|   6 |   TABLE ACCESS FULL   | PRODUCTS |
------------------------------------------

--------------------------------------------------------------------------------
Plan name: SYS_SQL_PLAN_fcc170b0a62d0f4d
Enabled: YES     Fixed: NO      Accepted: YES     Origin: AUTO-CAPTURE
--------------------------------------------------------------------------------

Plan hash value: 2787970893

----------------------------------------------------------------
| Id | Operation                             | Name           |
----------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                |
|   1 | NESTED LOOPS                         |                |
|   2 |   NESTED LOOPS                        |                |
|   3 |    HASH JOIN                          |                |
|   4 |     TABLE ACCESS BY INDEX ROWID       | PRODUCTS       |
|   5 |      INDEX RANGE SCAN                 | PRODUCTS_PK    |
|   6 |     PARTITION RANGE ALL               |                |
|   7 |      TABLE ACCESS BY LOCAL INDEX ROWID| SALES          |
|   8 |       BITMAP CONVERSION TO ROWIDS     |                |
|   9 |        BITMAP INDEX RANGE SCAN        | SALES_PROD_BIX |
| 10 |    INDEX UNIQUE SCAN                  | TIME_PK        |
| 11 |   TABLE ACCESS BY INDEX ROWID         | TIMES          |
----------------------------------------------------------------

72 rows selected.

Parameters

Two parameters allow you to control SPM. The first, optimizer_capture_sql_plan_baselines, which is FALSE by default, allows you to automatically capture plans. SPM will start managing every repeatable SQL statement that is executed and will create a plan history for it. The first plan that is captured will beautomatically accepted. Subsequent plans for these statements will not be accepted until they are evolved.

The second parameter, optimizer_use_sql_plan_baselines, is TRUE by default. It allows the SPM aware optimizer to use the SQL plan baseline if available when compiling a SQL statement. If you set this parameter to FALSE, the SPM aware optimizer will be disabled and you will get the regular cost-based optimizer which will select the best plan based on estimated cost.

SPM and SQL profiles

A SQL statement can have both a SQL profile and a SQL plan baseline. Such a case was described in Part 3 where we evolved a SQL plan baseline by accepting a SQL profile. In this case, the SPM aware optimizer will use both the SQL profile and the SQL plan baseline. The SQL profile contains additional information that helps the optimizer to accurately cost each accepted plan and select the best one. The SPM aware optimizer may choose a different accepted plan when a SQL profile is present than when it is not.

SPM and stored outlines

It is possible for a SQL statement to have a stored outline as well as a SQL plan baseline. If a stored outline exists for a SQL statement and is enabled for use, then the optimizer will use it, ignoring the SQL plan baseline. In other words, the stored outline trumps a SQL plan baseline. If you are using stored outlines, you can test SPM by creating SQL plan baselines and disabling the stored outlines. If you are satisfied with SPM, you can either drop the stored outlines or leave them disabled. If SPM doesn't work for you (and we would love to know why), you can re-enable the stored outlines.

If you are using stored outlines, be aware of their limitations:

You can only have one stored outline at a time for a given SQL statement. This may be fine in some cases, but a single plan is not necessarily the best when the statement is executed under varying conditions (e.g., bind values).
The second limitation is related to the first. Stored outlines do not allow for evolution. That is, even if a better plan exists, the stored outline will continue to be used, potentially degrading your system's performance. To get the better plan, you have to manually drop the current stored outline and generate a new one.
If an access path (e.g., an index) used in a stored outline is dropped or otherwise becomes unusable, the partial stored outline will continue to be used with the potential of a much worse plan.

One question that readers have is what we plan to do with the stored outlines feature. Here is the official word in Chapter 20 of Oracle's Performance Tuning Guide:

Stored outlines will be desupported in a future release in favor of SQL plan management. In Oracle Database 11g Release 1 (11.1), stored outlines continue to function as in past releases. However, Oracle strongly recommends that you use SQL plan management for new applications. SQL plan management creates SQL plan baselines, which offer superior SQL performance and stability compared with stored outlines.

If you have existing stored outlines, consider migrating them to SQL plan baselines by using the LOAD_PLANS_FROM_CURSOR_CACHE or LOAD_PLANS_FROM_SQLSET procedure of the DBMS_SPM package. When the migration is complete, you should disable or remove the stored outlines.

SPM and adaptive cursor sharing

Adaptive cursor sharing (ACS) may generate multiple cursors for a given bind sensitive SQL statement if it is determined that a single plan is not optimal under all conditions. Each cursor is generated by forcing a hard parse of the statement. The optimizer will normally select the plan with the best cost upon each hard parse.

When you have a SQL plan baseline for a statement, the SPM aware optimizer will select the best accepted plan as the optimal plan. This also applies for the hard parse of a bind sensitive statement. There may be multiple accepted plans, each of which is optimal for different bind sets. With SPM and ACS enabled, the SPM aware optimizer will select the best plan for the current bind set.

Thus, if a hard parse occurs, the normal SPM plan selection algorithm is used regardless of whether a statement is bind sensitive.

Enterprise Manager

You can view SQL plan baselines and configure and manage most SPM tasks through the Enterprise Manager. The screenshots below show two of these tasks.

Setting init.ora parameters for SPM

Loading SQL plan baselines from cursor cache

Further Reading

More details about SPM are available in the Oracle documentation, especially Chapter 15 of the Performance Tuning Guide. There is also a whitepaper, and a paper published in the VLDB 2008 conference. The VLDB paper also has experimental results that show how SPM prevents performance regressions while simultaneously allowing better plans to be used.

SQL Plan Management (Part 3 of 4): Evolving SQL Plan Baselines

2009-01-26T10:33:00.000-08:00

In the example in Part 2, we saw that the optimizer used an accepted plan instead of a brand new plan. The statement has two plans in its plan history, but only one is accepted and thus in the SQL plan baseline:

SQL> select sql_text, plan_name, enabled, accepted from dba_sql_plan_baselines;

SQL_TEXT PLAN_NAME ENA ACC
---------------------------------------- ------------------------------ --- ---
select p.prod_name, s.amount_sold, t.cal SYS_SQL_PLAN_fcc170b08cbcb825 YES NO
endar_year
from sales s, products p, times t
where s.prod_id = p.prod_id
and s.time_id = t.time_id
and p.prod_id < :pid

select p.prod_name, s.amount_sold, t.cal SYS_SQL_PLAN_fcc170b0a62d0f4d YES YES
endar_year
from sales s, products p, times t
where s.prod_id = p.prod_id
and s.time_id = t.time_id
and p.prod_id < :pid

Non-accepted plans can be verified by executing the evolve_sql_plan_baseline function. This function will execute the non-accepted plan and compare its performance to the best accepted plan. The execution is performed using the conditions (e.g., bind values, parameters, etc.) in effect at the time the non-accepted plan was added to the plan history. If the non-accepted plan's performance is better, the function will make it accepted, thus adding it to the SQL plan baseline. Let's see what happens when we execute this function:

SQL> var report clob;

SQL> exec :report := dbms_spm.evolve_sql_plan_baseline();

PL/SQL procedure successfully completed.

SQL> print :report

REPORT
-------------------------------------------------------------------------------

-------------------------------------------------------------------------------
                        Evolve SQL Plan Baseline Report
-------------------------------------------------------------------------------

Inputs:
-------
SQL_HANDLE =
PLAN_NAME =
TIME_LIMIT = DBMS_SPM.AUTO_LIMIT
VERIFY     = YES
COMMIT     = YES

Plan: SYS_SQL_PLAN_fcc170b08cbcb825
-----------------------------------
Plan was verified: Time used .1 seconds.
Passed performance criterion: Compound improvement ratio >= 10.13
Plan was changed to an accepted plan.

                      Baseline Plan      Test Plan     Improv. Ratio
                      -------------      ---------     -------------
Execution Status:        COMPLETE       COMPLETE
Rows Processed:               960            960
Elapsed Time(ms):              19             15              1.27
CPU Time(ms):                  18             15               1.2
Buffer Gets:                 1188            116             10.24
Disk Reads:                     0              0
Direct Writes:                  0              0
Fetches:                        0              0
Executions:                     1              1

-------------------------------------------------------------------------------
                                 Report Summary
-------------------------------------------------------------------------------
Number of SQL plan baselines verified: 1.
Number of SQL plan baselines evolved: 1.

The plan verification report shows that the new plan's performance was better and so it was made accepted and became part of the SQL plan baseline. We can confirm it by looking in the dba_sql_plan_baselines view:

SQL> select sql_text, plan_name, enabled, accepted from dba_sql_plan_baselines;

SQL_TEXT PLAN_NAME ENA ACC
---------------------------------------- ------------------------------ --- ---
select p.prod_name, s.amount_sold, t.cal SYS_SQL_PLAN_fcc170b08cbcb825 YES YES
endar_year
from sales s, products p, times t
where s.prod_id = p.prod_id
and s.time_id = t.time_id
and p.prod_id < :pid

select p.prod_name, s.amount_sold, t.cal SYS_SQL_PLAN_fcc170b0a62d0f4d YES YES
endar_year
from sales s, products p, times t
where s.prod_id = p.prod_id
and s.time_id = t.time_id
and p.prod_id < :pid

The SQL plan baseline now has two accepted plans: SYS_SQL_PLAN_fcc170b08cbcb825 is now accepted.

You can either execute the evolve_sql_plan_baseline() function manually or schedule it to run automatically in a maintenance window.

Another way of evolving a SQL plan baseline is to use the SQL Tuning Advisor. Instead of executing evolve_sql_plan_baseline, suppose we start from the original state where we have one accepted and one non-accepted plan:

SQL> select sql_text, plan_name, enabled, accepted from dba_sql_plan_baselines;

You can execute the SQL Tuning Advisor on the cursor in the cursor cache:

SQL> var tname varchar2(30);

SQL> exec :tname := dbms_sqltune.create_tuning_task(sql_id => 'bfbr3zrg9d5cc');

PL/SQL procedure successfully completed.

SQL> exec dbms_sqltune.execute_tuning_task(task_name => :tname);

PL/SQL procedure successfully completed.

SQL> select dbms_sqltune.report_tuning_task(:tname, 'TEXT', 'BASIC') FROM dual;

DBMS_SQLTUNE.REPORT_TUNING_TASK(:TNAME,'TEXT','BASIC')
-------------------------------------------------------------------------------
GENERAL INFORMATION SECTION
-------------------------------------------------------------------------------
Tuning Task Name   : TASK_505
Tuning Task Owner : SH
Workload Type      : Single SQL Statement
Scope              : COMPREHENSIVE
Time Limit(seconds): 1800
Completion Status : COMPLETED
Started at         : 11/11/2008 16:43:12
Completed at       : 11/11/2008 16:43:13

-------------------------------------------------------------------------------
Schema Name: SH
SQL ID     : bfbr3zrg9d5cc
SQL Text   : select p.prod_name, s.amount_sold, t.calendar_year
             from sales s, products p, times t
             where s.prod_id = p.prod_id
               and s.time_id = t.time_id
               and p.prod_id < :pid

-------------------------------------------------------------------------------
FINDINGS SECTION (1 finding)
-------------------------------------------------------------------------------
1- A potentially better execution plan was found for this statement.

-------------------------------------------------------------------------------
EXPLAIN PLANS SECTION
-------------------------------------------------------------------------------

1- Original With Adjusted Cost
------------------------------
Plan hash value: 2787970893

----------------------------------------------------------------
| Id | Operation                             | Name           |
----------------------------------------------------------------
| 0 | SELECT STATEMENT                      |                |
| 1 | NESTED LOOPS                         |                |
| 2 |   NESTED LOOPS                        |                |
| 3 |    HASH JOIN                          |                |
| 4 |     TABLE ACCESS BY INDEX ROWID       | PRODUCTS       |
| 5 |      INDEX RANGE SCAN                 | PRODUCTS_PK    |
| 6 |     PARTITION RANGE ALL               |                |
| 7 |      TABLE ACCESS BY LOCAL INDEX ROWID| SALES          |
| 8 |       BITMAP CONVERSION TO ROWIDS     |                |
| 9 |        BITMAP INDEX RANGE SCAN        | SALES_PROD_BIX |
| 10 |    INDEX UNIQUE SCAN                  | TIME_PK        |
| 11 |   TABLE ACCESS BY INDEX ROWID         | TIMES          |
----------------------------------------------------------------

2- Original With Adjusted Cost
------------------------------
Plan hash value: 2787970893

----------------------------------------------------------------
| Id | Operation                             | Name           |
----------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                |
|   1 | NESTED LOOPS                         |                |
|   2 |   NESTED LOOPS                        |                |
|   3 |    HASH JOIN                          |                |
|   4 |     TABLE ACCESS BY INDEX ROWID       | PRODUCTS       |
|   5 |      INDEX RANGE SCAN                 | PRODUCTS_PK    |
|   6 |     PARTITION RANGE ALL               |                |
|   7 |      TABLE ACCESS BY LOCAL INDEX ROWID| SALES          |
|   8 |       BITMAP CONVERSION TO ROWIDS     |                |
|   9 |        BITMAP INDEX RANGE SCAN        | SALES_PROD_BIX |
| 10 |    INDEX UNIQUE SCAN                  | TIME_PK        |
| 11 |   TABLE ACCESS BY INDEX ROWID         | TIMES          |
----------------------------------------------------------------

3- Using SQL Profile
--------------------
Plan hash value: 2361178149

------------------------------------------
| Id | Operation             | Name     |
------------------------------------------
|   0 | SELECT STATEMENT      |          |
|   1 | HASH JOIN            |          |
|   2 |   HASH JOIN           |          |
|   3 |    PARTITION RANGE ALL|          |
|   4 |     TABLE ACCESS FULL | SALES    |
|   5 |    TABLE ACCESS FULL | TIMES    |
|   6 |   TABLE ACCESS FULL   | PRODUCTS |
------------------------------------------

-------------------------------------------------------------------------------

SQL> exec dbms_sqltune.accept_sql_profile(task_name => :tname);

PL/SQL procedure successfully completed.

SQL> select sql_text, plan_name, enabled, accepted from dba_sql_plan_baselines;

SQL_TEXT                                 PLAN_NAME                      ENA ACC
---------------------------------------- ------------------------------ --- ---
select p.prod_name, s.amount_sold, t.cal SYS_SQL_PLAN_fcc170b08cbcb825 YES YES
endar_year
from sales s, products p, times t
where s.prod_id = p.prod_id
and s.time_id = t.time_id
and p.prod_id < :pid

select p.prod_name, s.amount_sold, t.cal SYS_SQL_PLAN_fcc170b0a62d0f4d YES YES
endar_year
from sales s, products p, times t
where s.prod_id = p.prod_id
and s.time_id = t.time_id
and p.prod_id < :pid

SQL> select sql_text, type, status from dba_sql_profiles;

SQL_TEXT                                 TYPE    STATUS
---------------------------------------- ------- --------
select p.prod_name, s.amount_sold, t.cal MANUAL ENABLED
endar_year
from sales s, products p, times t
where s.prod_id = p.prod_id
and s.time_id = t.time_id
and p.prod_id < :pid

What we see here is that SQL Tuning Advisor found a tuned plan (that coincidentally happened to be the non-accepted plan in our plan history). When we accepted the recommended SQL profile, the SQL Tuning Advisor created a SQL profile and also changed the non-accepted plan to accepted status, thus evolving the SQL plan baseline to two plans.

Note that the SQL Tuning Advisor may also find a completely new tuned plan, one that is not in the plan history. If you then accept the recommended SQL profile, the SQL Tuning Advisor will create a SQL profile and also add the tuned plan to the SQL plan baseline.

Thus, you can evolve a SQL plan baseline either by executing the evolve_sql_plan_baseline function or by using the SQL Tuning Advisor. New and provably better plans will be added by either of these methods to the SQL plan baseline.

SQL Plan Management (Part 2 of 4): SPM Aware Optimizer

2009-01-20T09:59:00.000-08:00

(Keep sending your feedback and questions. We'll address them in Part 4.)

In Part 1, we saw how you can create SQL plan baselines. After you create a SQL plan baseline for a statement, subsequent executions of that statement will use the SQL plan baseline. From all the plans in the SQL plan baseline, the optimizer will select the one with the best cost in the current environment (including bind values, current statistics, parameters, etc.). The optimizer will also generate the best-cost plan that it would otherwise have used without a SQL plan baseline. However, this best-cost plan will not be used but instead added to the statement's plan history for later verification. In other words, the optimizer will use a known plan from the SQL plan baseline instead of a new and hitherto unknown plan. This guarantees no performance regression.

Let's see this plan selection process in action. First, we create a SQL plan baseline by enabling automatic plan capture and executing the query twice:

SQL> alter session set optimizer_capture_sql_plan_baselines = true;

Session altered.

SQL> var pid number
SQL> exec :pid := 100;

PL/SQL procedure successfully completed.

SQL> select p.prod_name, s.amount_sold, t.calendar_year
2    from sales s, products p, times t
3    where s.prod_id = p.prod_id
4      and s.time_id = t.time_id
5      and p.prod_id < :pid;

PROD_NAME AMOUNT_SOLD CALENDAR_YEAR
--------- ----------- -------------
...
9 rows selected.

SQL> select * from table(dbms_xplan.display_cursor('bfbr3zrg9d5cc', 0, 'basic note'));

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------
EXPLAINED SQL STATEMENT:
------------------------
select p.prod_name, s.amount_sold, t.calendar_year from sales s,
products p, times t where s.prod_id = p.prod_id and s.time_id =
t.time_id and p.prod_id < :pid

Plan hash value: 2787970893

----------------------------------------------------------------
| Id  | Operation                             | Name           |
----------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                |
|   1 | NESTED LOOPS                         |                |
|   2 |   NESTED LOOPS                        |                |
|   3 |    HASH JOIN                          |                |
|   4 |     TABLE ACCESS BY INDEX ROWID       | PRODUCTS       |
|   5 |      INDEX RANGE SCAN                 | PRODUCTS_PK    |
|   6 |     PARTITION RANGE ALL               |                |
|   7 |      TABLE ACCESS BY LOCAL INDEX ROWID| SALES          |
|   8 |       BITMAP CONVERSION TO ROWIDS     |                |
|   9 |        BITMAP INDEX RANGE SCAN        | SALES_PROD_BIX |
| 10 |    INDEX UNIQUE SCAN                  | TIME_PK        |
| 11 |   TABLE ACCESS BY INDEX ROWID         | TIMES          |
----------------------------------------------------------------

25 rows selected.

SQL> select p.prod_name, s.amount_sold, t.calendar_year
2    from sales s, products p, times t
3    where s.prod_id = p.prod_id
4      and s.time_id = t.time_id
5      and p.prod_id < :pid;

PROD_NAME AMOUNT_SOLD CALENDAR_YEAR
--------- ----------- -------------
...
9 rows selected.

SQL> alter session set optimizer_capture_sql_plan_baselines = false;

Session altered.

SQL> select sql_text, plan_name, enabled, accepted from dba_sql_plan_baselines;

SQL_TEXT                                 PLAN_NAME                      ENA ACC
---------------------------------------- ------------------------------ --- ---
select p.prod_name, s.amount_sold, t.cal SYS_SQL_PLAN_fcc170b0a62d0f4d YES YES
endar_year
from sales s, products p, times t
where s.prod_id = p.prod_id
and s.time_id = t.time_id
and p.prod_id < :pid

We can see that a SQL plan baseline was created for the statement. Suppose the statement is hard parsed again (we do it here by flushing the shared pool). Let's turn off SQL plan management and execute the query with a different bind value:

SQL> exec :pid := 100000;

PL/SQL procedure successfully completed.

SQL> alter system flush shared_pool;

System altered.

SQL> alter session set optimizer_use_sql_plan_baselines = false;

Session altered.

SQL> select p.prod_name, s.amount_sold, t.calendar_year
2    from sales s, products p, times t
3    where s.prod_id = p.prod_id
4      and s.time_id = t.time_id
5      and p.prod_id < :pid;

PROD_NAME AMOUNT_SOLD CALENDAR_YEAR
--------- ----------- -------------
...
960 rows selected.

SQL> select * from table(dbms_xplan.display_cursor('bfbr3zrg9d5cc', 0, 'basic note'));

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------
EXPLAINED SQL STATEMENT:
------------------------
select p.prod_name, s.amount_sold, t.calendar_year from sales s,
products p, times t where s.prod_id = p.prod_id and s.time_id = t.time_id and
p.prod_id < :pid

Plan hash value: 2361178149

------------------------------------------
| Id  | Operation             | Name     |
------------------------------------------
|   0 | SELECT STATEMENT      |          |
|   1 | HASH JOIN            |          |
|   2 |   HASH JOIN           |          |
|   3 |    PARTITION RANGE ALL|          |
|   4 |     TABLE ACCESS FULL | SALES    |
|   5 |    TABLE ACCESS FULL  | TIMES    |
|   6 |   TABLE ACCESS FULL   | PRODUCTS |
------------------------------------------

20 rows selected.

We can see that the optimizer selected a different plan because the new bind value makes the predicate less selective. Let's turn SQL plan management back on and re-execute the query with the same bind value:

SQL> alter session set optimizer_use_sql_plan_baselines = true;

The note at the bottom tells you that the optimizer is using the SQL plan baseline. In other words, we can see that the optimizer used an accepted plan in the SQL plan baseline in favor of a new plan. In fact, we can also check that the optimizer inserted the new plan into the statement's plan history:

SQL> select sql_text, plan_name, enabled, accepted from dba_sql_plan_baselines;

The 'NO' value for the accepted column implies that the new plan is in the plan history but is not available for use until it is verified to be a good plan. The optimizer will continue to use an accepted plan until new plans are verified and added to the SQL plan baseline. If there is more than one plan in the SQL plan baseline, the optimizer will use the one with the best cost under the then-current conditions (statistics, bind values, parameter settings and so on).

When you create a SQL plan baseline for a SQL statement, the SPM aware optimizer thus guarantees that no new plans will be used other than the ones in the SQL plan baseline. This prevents unexpected plan changes that sometimes lead to performance regressions.

Preventing new plans from being used is fine, but what if the new plans are in fact better? In Part 3, we will describe how new and improved plans are added to a SQL plan baseline.

Plan regressions got you down? SQL Plan Management to the rescue!

2009-01-08T15:58:00.000-08:00

Part 1 of 4: Creating SQL plan baselines

Do you ever experience performance regressions because an execution plan has changed for the worse? If you have, then we have an elegant solution for you in 11g called SQL Plan Management (SPM). The next four posts on our blog will cover SPM in detail. Let's begin by reviewing the primary causes for plan changes.

Execution plan changes occur due to various system changes. For example, you might have (manually or automatically) updated statistics for some objects, or changed a few optimizer-related parameters. A more dramatic change is a database upgrade (say from 10gR2 to 11g). All of these changes have the potential to cause new execution plans to be generated for many of your SQL statements. Most new plans are obviously improvements because they are tailored to the new system environment, but some might be worse leading to performance regressions. It is the latter that cause sleepless nights for many DBAs.

DBAs have several options for addressing these regressions. However, what most DBAs want is simple: plans should only change when they will result in performance gains. In other words, the optimizer should not pick bad plans, period.

This first post in our series, describes the concepts of SQL Plan Management and how to create SQL plan baselines. The second part will describe how and when these SQL plan baselines are used. The third part will discuss evolution, the process of adding new and improved plans to SQL plan baselines. Finally, the fourth part will describe user interfaces and interactions with other Oracle objects (like stored outlines).

Introduction

SQL Plan Management (SPM) allows database users to maintain stable yet optimal performance for a set of SQL statements. SPM incorporates the positive attributes of plan adaptability and plan stability, while simultaneously avoiding their shortcomings. It has two main objectives:

prevent performance regressions in the face of database system changes
offer performance improvements by gracefully adapting to database system changes

A managed SQL statement is one for which SPM has been enabled. SPM can be configured to work automatically or it can be manually controlled either wholly or partially (described later). SPM helps prevent performance regressions by enabling the detection of plan changes for managed SQL statements. For this purpose, SPM maintains, on disk, a plan history consisting of different execution plans generated for each managed SQL statement. An enhanced version of the Oracle optimizer, called SPM aware optimizer, accesses, uses, and manages this information which is stored in a repository called the SQL Management Base (SMB).

The plan history enables the SPM aware optimizer to determine whether the best-cost plan it has produced using the cost-based method is a brand new plan or not. A brand new plan represents a plan change that has potential to cause performance regression. For this reason, the SPM aware optimizer does not choose a brand new best-cost plan. Instead, it chooses from a set of accepted plans. An accepted plan is one that has been either verified to not cause performance regression or designated to have good performance. A set of accepted plans is called a SQL plan baseline, which represents a subset of the plan history.

A brand new plan is added to the plan history as a non-accepted plan. Later, an SPM utility verifies its performance, and keeps it as a non-accepted plan if it will cause a performance regression, or changes it to an accepted plan if it will provide a performance improvement. The plan performance verification process ensures both plan stability and plan adaptability.

The figure below shows the SMB containing the plan history for three SQL statements. Each plan history contains some accepted plans (the SQL plan baseline) and some non-accepted plans.

(Click on the image for a larger view.)

You can create a SQL plan baseline in several ways: using a SQL Tuning Set (STS); from the cursor cache; exporting from one database and importing into another; and automatically for every statement. Let's look at each in turn. The examples in this blog entry use the Oracle Database Sample Schemas so you can try them yourself.

Creating SQL plan baselines from STS

If you are upgrading from 10gR2 or have an 11g test system, you might already have an STS containing some or all of your SQL statements. This STS might contain plans that perform satisfactorily. Let's call this STS MY_STS. You can create a SQL plan baseline from this STS as follows:

SQL> variable pls number;
SQL> exec :pls := dbms_spm.load_plans_from_sqlset(sqlset_name => 'MY_STS', -
>                   basic_filter => 'sql_text like ''select%p.prod_name%''');

This will create SQL plan baselines for all statements that match the specified filter.

Creating SQL plan baselines from cursor cache

You can automatically create SQL plan baselines for any cursor that is currently in the cache as follows:

SQL> exec :pls := dbms_spm.load_plans_from_cursor_cache( -
>                   attribute_name => 'SQL_TEXT', -
>                   attribute_value => 'select%p.prod_name%');

This will create SQL plan baselines for all statements whose text matches the specified string. Several overloaded variations of this function allow you to filter on other cursor attributes.

Creating SQL plan baselines using a staging table

If you already have SQL plan baselines (say on an 11g test system), you can export them to another system (a production system for instance).

First, on the test system, create a staging table and pack the SQL plan baselines you want to export:

SQL> exec dbms_spm.create_stgtab_baseline(table_name => 'MY_STGTAB', -
>           table_owner => 'SH');

PL/SQL procedure successfully completed.

SQL> exec :pls := dbms_spm.pack_stgtab_baseline( -
>                   table_name => 'MY_STGTAB', -
>                   table_owner => 'SH', -
>                   sql_text => 'select%p.prod_name%');

This will pack all SQL plan baselines for statements that match the specified filter. The staging table, MY_STGTAB, is a regular table that you should export to the production system using Datapump Export.

On the production system, you can now unpack the staging table to create the SQL plan baselines:

SQL> exec :pls := dbms_spm.unpack_stgtab_baseline( -
>                   table_name => 'MY_STGTAB', -
>                   table_owner => 'SH', -
>                   sql_text => 'select%p.prod_name%');

This will unpack the staging table and create SQL plan baselines. Note that the filter for unpacking the staging table is optional and may be different than the one used during packing. This means that you can pack several SQL plan baselines into a staging table and selectively unpack only a subset of them on the target system.

Creating SQL plan baselines automatically

You can create SQL plan baselines for all repeatable statements automatically by setting the parameter optimizer_capture_sql_plan_baselines to TRUE (default is FALSE). The first plan captured for any statement is automatically accepted and becomes part of the SQL plan baseline, so enable this parameter only when you are sure that the default plans are performing well.

You can use the automatic plan capture mode when you have upgraded from a previous database version. Set optimizer_features_enable to the earlier version and execute your workload. Every repeatable statement will have its plan captured thus creating SQL plan baselines. You can reset optimizer_features_enable to its default value after you are sure that all statements in your workload have had a chance to execute.

Note that this automatic plan capture occurs only for repeatable statements, that is, statements that are executed at least twice. Statements that are only executed once will not benefit from SQL plan baselines since accepted plans are only used in subsequent hard parses.

The following example shows a plan being captured automatically when the same statement is executed twice:

SQL> alter session set optimizer_capture_sql_plan_baselines = true;

Oracle Open World follow up

2008-10-14T13:40:00.001-07:00

We were delighted to see so many people turn up for our Open World session - Inside the 11g Optimizer - so early on Tuesday morning! A lot of people have been asking where they can find more information on the topics covered especially the demos that were shown. You can find similar worked examples for most of the new 11g Optimizer features on the Oracle By Example website. You can also get more information on SQL Plan Management in the following white paper. Our blog entry from December 2007 has more information on Adaptive Cursor Sharing while the January 2008 entry gives more details on the enhancements made to statistics. We hope you enjoyed Oracle Open World as much as we did!

Will the Optimizer development team be at Oracle Open World?

2008-07-24T15:50:00.000-07:00

The largest gathering of Oracle customers, partners, developers, and technology enthusiasts will happen in September when Oracle will host its annual user conference Open World in San Francisco and the Optimizer development group will be there! You will have two opportunities to meet the team -- attend the technical presentation "Inside the 11g Optimizer - Removing the mystery" on Tuesday morning at 9am or stop by the Oracle demo grounds (in Moscone West) to see all of the demos for the 11g new features and ask the development team any burning questions you may have!

Maria Colgan

Why are some of the tables in my query missing from the plan?

2008-06-26T15:35:00.000-07:00

We apologize for our brief hiatus from blogging. We've been busy working on improvements to the optimizer.

In 10gR2, we introduced a new transformation, table elimination (alternately called "join elimination"), which removes redundant tables from a query. A table is redundant if its columns are only referenced to in join predicates, and it is guaranteed that those joins neither filter nor expand the resulting rows. There are several cases where Oracle will eliminate a redundant table. We will discuss each case in turn.

Primary Key-Foreign Key Table Elimination

Starting in 10gR2, the optimizer eliminates tables that are redundant due to primary key-foreign key constraints. Consider the following example tables:

create table jobs
(
job_id NUMBER PRIMARY KEY,
job_title VARCHAR2(35) NOT NULL,
min_salary NUMBER,
max_salary NUMBER
);
create table departments
(
department_id NUMBER PRIMARY KEY,
department_name VARCHAR2(50)
);
create table employees
(
employee_id NUMBER PRIMARY KEY,
employee_name VARCHAR2(50),
department_id NUMBER REFERENCES departments(department_id),
job_id NUMBER REFERENCES jobs(job_id)
);

and the query:

select e.employee_name
from employees e, departments d
where e.department_id = d.department_id;

In this query, the join to departments is redundant. The only column referenced in the query appears in the join predicate, and the primary key-foreign key constraint guarantees that there is at most one match in departments for each row in employees. Hence, the query is equivalent to:

select e.employee_name
from employees e
where e.department_id is not null;

The optimizer will generate this plan for the query:

-------------------------------------------
 Id   Operation             Name      
-------------------------------------------
   0  SELECT STATEMENT                
*  1   TABLE ACCESS FULL    EMPLOYEES 
-------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
 1 - filter("E"."DEPARTMENT_ID" IS NOT NULL)

Note that the IS NOT NULL predicate is not necessary if the column has a NOT NULL constraint on it.

Starting in 11gR1, the optimizer will also eliminate tables that are semi-joined or anti-joined. Consider the following query:


select e.employee_id, e.employee_name
from employees e
where not exists (select 1
                  from jobs j
                  where j.job_id = e.job_id);

Since employees.job_id is a foreign key to jobs.job_id, any non-null value in employees.job_id must have a match in jobs. So only employees with null values for employees.job_id will appear in the result. Hence, this query is equivalent to:

select e.employee_id, e.employee_name
from employees e
where job_id is null;

and the optimizer can choose this plan:


-------------------------------------------
 Id   Operation             Name      
-------------------------------------------
   0  SELECT STATEMENT                
*  1   TABLE ACCESS FULL    EMPLOYEES 
-------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("E"."JOB_ID" IS NULL)

Suppose employees.job_id has a NOT NULL constraint:

alter table employees modify job_id not null;

In this case, there could not possibly be any rows in EMPLOYEES, and the optimizer could choose this plan:

-------------------------------------------
 Id   Operation             Name      
-------------------------------------------
   0  SELECT STATEMENT                
*  1   FILTER                         
   2    TABLE ACCESS FULL   EMPLOYEES 
-------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
 1 - filter(NULL IS NOT NULL)

The "NULL IS NOT NULL" filter is a false constant predicate, that will prevent the table scan from even taking place.

Also in 11gR1, the optimization became available for ANSI compliant joins. For this query:

select employee_name
from employees e inner join jobs j
on e.job_id = j.job_id;

the optimizer can eliminate JOBS and produce this plan:

-------------------------------------------
 Id   Operation             Name      
-------------------------------------------
   0  SELECT STATEMENT                
   1   TABLE ACCESS FULL    EMPLOYEES 
-------------------------------------------

Outer Join Table Elimination

In 11gR1, a new form of table elimination was introduced for outer joins, which does not require PK-FK constraints. For the example, we require a new table and an addition to EMPLOYEES:

create table projects
(
project_id NUMBER UNIQUE,
deadline DATE,
priority NUMBER
);

alter table employees add project_id number;

Now consider a query that outer joins employees and projects:

select e.employee_name, e.project_id
from employees e, projects p
where e.project_id = p.project_id (+);

The outer join guarantees that every row in employees will appear at least once in the result. The unique constraint on projects.project_id guarantees that every row in employees will match at most one row in projects. Together, these two properties guarantee that every row in employees will appear in the result exactly once. Since no other columns from projects are referenced, projects can be eliminated, and the optimizer can choose this plan:

-------------------------------------------
 Id   Operation             Name      
-------------------------------------------
   0  SELECT STATEMENT                
   1   TABLE ACCESS FULL    EMPLOYEES 
-------------------------------------------

Why Would I Ever Write Such a Query?

All of the example queries in this post are very simple, and one would be unlikely to write a query where the join is so obviously unnecessary. There are many real world scenarios where table elimination may be helpful, including machine-generated queries and elimination of tables in views. For example, a set of tables might be exposed as a view, which contains a join. The join may be necessary to retrieve all of the columns exposed by the view. But some users of the view may only access a subset of the columns, and in this case, the joined table can be eliminated.

For example, consider the view:

create view employee_directory_v as
select e.employee_name, d.department_name, j.job_title
from employees e, departments d, jobs j
where e.department_id = d.department_id
and e.job_id = j.job_id;

This view might be exposed to a simple employee directory application. To lookup employee names by job title, the application issues a query:

select employee_name
from employee_directory_v
where department = 'ACCOUNTING';

Since the job_title column is not referenced, jobs can be eliminated from the query, and the optimizer can choose this plan:

--------------------------------------------
 Id   Operation             Name       
--------------------------------------------
   0  SELECT STATEMENT                 
*  1   HASH JOIN                       
   2    TABLE ACCESS FULL   EMPLOYEES  
*  3    TABLE ACCESS FULL   DEPARTMENTS
--------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
 1 - access("E"."DEPARTMENT_ID"="D"."DEPARTMENT_ID")
 3 - filter("D"."DEPARTMENT_NAME"='ACCOUNTING')

Known Limitations

There are currently a few limitations of table elimination:

Multi-column primary key-foreign key constraints are not supported.
Referring to the join key elsewhere in the query will prevent table elimination. For an inner join, the join keys on each side of the join are equivalent, but if the query contains other references to the join key from the table that could otherwise be eliminated, this prevents elimination. A workaround is to rewrite the query to refer to the join key from the other table (we realize this is not always possible).

Oracle keeps closing my TAR because I cannot provide a testcase, can you help?

2008-03-14T12:31:00.001-07:00

The answer to this question is yes, as Oracle Database 11g provides a new diagnostic tool called SQL Test Case Builder. In this article, we explain what SQL Test Case Builder is, and how to use it with examples.

Why SQL Test Case Builder?

For most SQL problems, the single most important factor for a speedy bug resolution is to obtain a reproducible test case. However, this is normally the longest and most painful step for customers. The goal of the SQL Test Case Builder (TCB) is to automatically gather as much information as possible related to a SQL incident (problem) and package it in a way that allows a developer or a support engineer to reproduce the problem on his or her own machine quickly.

At a very high-level, SQL Test Case Builder can be seen as a way to export a SQL. Currently, Oracle export (expdp) takes a schema or a set of tables and exports all the dependents objects. SQL Test Case Builder provides the same service but takes a SQL statement as input.

What's Inside Test Case Builder?

The main input of SQL Test Case Builder is a SQL object. A SQL object is defined as the SQL text plus all the information required to compile it on a particular database instance (this contains the parsing user name, for example).

Logically, a SQL test case appears as a script containing all the necessary commands to recreate the objects, the user, the statistics, and the environment.

Within the Oracle Diagnosability infrastructure, TCB compiles the problem SQL in a special capture mode to obtain the set of objects to export. A test case captures two types of information:

Permanent information

SQL text

PL/SQL functions, procedures, packages

Statistics

Bind variables

Compilation environment

User information (like privileges)

SQL profiles, stored outlines, or other SQL Management Objects

Meta data on all the objects involved

Optimizer statistics

The execution plan information

The table content (sample or full). This is optional.

Transient information

How do I use the SQL Test Case Builder?

The task of creating a SQL test case can be performed in two ways:

From EM (Enterprise Manager), where TCB is invoked on user-demand via IPS (Incident Packaging Service) after a SQL incident occurred. The user can also manually create an incident for a problem query for building test case purpose.

From SQLPLUS, where you can directly invoke one of the PL/SQL API functions in the SQL Diagnostic package. We will give examples of using the APIs below.

All the new PL/SQL procedures supporting SQL Test Case Builder are part of a new PL/SQL package called dbms_sqldiag (see dbmsdiag.sql for details). The two main features related to TCB in this package are export and import test cases.

Procedure dbms_sqldiag.export_sql_testcase exports a SQL test case for a given SQL statement to a given directory.

Procedure dbms_sqldiag.import_sql_testcase imports a test case from a given directory.

To build (or export) a test case, the simplest form would be something like:

     dbms_sqldiag.export_sql_testcase(
       directory  => 'TCB_DIR_EXP',
       sql_text   => 'select count(*) from sales',
       testcase   => tco)

Here directory and sql_text are inputs which specify where the test case will be stored, and the problem query statement, respectively. Testcase specifies the test case metadata as output.

For security reason, the user data are not exported by default. You have the option to set exportData to TRUE to include the data. You can also set samplingPercent if you are exporting with data. To protect users proprietary codes, TCB will not export PL/SQL package body by default.

Once the test case has been built, you can copy all the files under the export directory to your test environment. Note there is a file called xxxxxxxxmain.xml, for example, oratcb1_03C600800001main.xml, which contains the metadata of the test case.

Now importing the test case can be as simple as:

     dbms_sqldiag.import_sql_testcase(
       directory => 'TEST_DIR',
       filename => 'oratcb1_03C600800001main.xml')

To verify that the test case is successfully rebuilt, you can just issue an explain command for the problem query. However, if you want to actully run the query, then you need to have the data available.

You can refer to dbmsdiag.sql for more information about other options available for these procedures.

Example - We now show the typical steps of using TCB by a sample query with materialized view. In this exmaple, we set the exportData option to TRUE, so we can re-run the same query after the TCB task is completed.

Setup

SQL> connect / as sysdba
Connected.
SQL>
SQL> create or replace directory TCB_DIR_EXP as
  2  '/net/tiger/apps/tcb_exp';
Directory created.
SQL>
SQL> grant dba to apps;
Grant succeeded.
SQL>
SQL> connect apps/apps
Connected.
SQL>
SQL> create materialized view scp_mvu
  2  parallel 2
  3  as
  4  select          p.prod_name, c.cust_gender,
  5                  max(s.amount_sold) max_amount_sold
  6  from            sales s, products p, customers c
  7  where           s.prod_id = p.prod_id
  8  and             s.cust_id = c.cust_id
  9  group by        p.prod_name, c.cust_gender;

Materialized view created.

SQL>
SQL> desc scp_mvu;
 Name                                      Null?    Type
 ----------------------------------------- -------- ------------
 PROD_NAME                                 NOT NULL VARCHAR2(50)
 CUST_GENDER                                        CHAR(1)
 MAX_AMOUNT_SOLD                                    NUMBER

SQL>
SQL> select * from scp_mvu where max_amount_sold > 7000 order by 3;

PROD_NAME                                          C MAX_AMOUNT_SOLD
-------------------------------------------------- - ---------------
Joseph Sportcoat                                   F          7400.8
Kenny Cool Leather Skirt                           M            7708
Leather Boot-Cut Trousers                          M            8184

3 rows selected.

Export as user APPS

SQL> connect apps/apps
Connected.

SQL>
SQL> Rem define the problem SQL statement
SQL> create or replace package define_vars is
  2    sql_stmt1     varchar2(2000) := q'# select * from scp_mvu
  3                                        where max_amount_sold > 7000
  4                                        order by 3
  5                                      #';
  6  end;
  7  /

Package created.
SQL> 
SQL> set serveroutput on
SQL>
SQL> declare
  2    tco           clob;
  3  begin
  4    -- Export test case
  5    dbms_sqldiag.export_sql_testcase
  6    (
  7      directory           => 'TCB_DIR_EXP',
  8      sql_text            => define_vars.sql_stmt1,
  9      user_name           => 'APPS',
 10      exportData          => TRUE,
 11      testcase            => tco
 12    );
 13 
 14  end;
 15  /

PL/SQL procedure successfully completed.
SQL>
SQL> Rem Drop MV before importing
SQL> drop materialized view scp_mvu;

Materialized view dropped.

SQL> conn / as sysdba
Connected.
SQL> create or replace directory TCB_DIR_IMP
  2  as '/net/lion/test/tcb_imp';
Directory created.
SQL>
SQL> grant dba to test;
Grant succeeded.

Import as user TEST

SQL> connect test/test
Connected.
SQL>
SQL> set serveroutput on
SQL>
SQL> begin
  2    -- Import test case
  3    dbms_sqldiag.import_sql_testcase
  4    (
  5      directory           => 'TCB_DIR_IMP',
  6      filename            => 'oratcb3_05e803500001main.xml',
  7      importData          => TRUE
  8    );
  9 
 10  end;
 11  /

PL/SQL procedure successfully completed.

Verification. This is to check that now all relevant objects were imported successfully.

SQL> desc scp_mvu;
 Name                                      Null?    Type
 ----------------------------------------- -------- ------------
 PROD_NAME                                 NOT NULL VARCHAR2(50)
 CUST_GENDER                                        CHAR(1)
 MAX_AMOUNT_SOLD                                    NUMBER
SQL>
SQL> select * from scp_mvu where max_amount_sold > 7000 order by 3;

PROD_NAME                                          C MAX_AMOUNT_SOLD
-------------------------------------------------- - ---------------
Joseph Sportcoat                                   F          7400.8
Kenny Cool Leather Skirt                           M            7708
Leather Boot-Cut Trousers                          M            8184

3 rows selected.

Finally, we also have good news for 10g users: SQL Test Case Builder has been backported to 10.2.0.4!

Upgrading from Oracle Database 9i to 10g: What to expect from the Optimizer

2008-02-26T13:34:00.000-08:00

One of the most daunting activities a DBA can undertake is upgrading the database to a new version. Having to comprehend all of the new features and to deal with potential plan changes can be overwhelming. In order to help DBA's upgrade from Oracle Database 9i to 10g a new whitepaper called "Upgrading from Oracle Database 9i to 10g: What to expect from the Optimizer" has recently been posted on Oracle Technology Network (OTN). This paper aims to explain in detail what to expect from the CBO when you upgrade from Oracle database 9i to 10g and describes what steps you should take before and after the upgrade to minimize any potential SQL regressions. This is a must read for any DBA planning on upgrading from 9i to 10g in the near future!

Displaying and reading the execution plans for a SQL statement

2008-02-07T12:37:00.000-08:00

Generating and displaying the execution plan of a SQL statement is a common task for most DBAs, SQL developers, and preformance experts as it provides them information on the performance characteristics of a SQL statement. An execution plan shows the detailed steps necessary to execute a SQL statement. These steps are expressed as a set of database operators that consumes and produces rows. The order of the operators and their implentation is decided by the query optimizer using a combination of query transformations and physical optimization techniques.

While the display is commonly shown in a tabular format, the plan is in fact tree-shaped. For example, consider the following query based on the SH schema (Sales History):


select prod_category, avg(amount_sold)
from sales s, products p
where p.prod_id = s.prod_id
group by prod_category;

The tabular representation of this query's plan is:


------------------------------------------
 Id   Operation              Name   
------------------------------------------
   0  SELECT STATEMENT              
   1   HASH GROUP BY                
   2    HASH JOIN                   
   3     TABLE ACCESS FULL   PRODUCTS
   4     PARTITION RANGE ALL        
   5      TABLE ACCESS FULL  SALES  
------------------------------------------

While the tree-shaped representation of the plan is:


   GROUP BY
      |
     JOIN
 _____|_______
 |            |
ACCESS     ACCESS
(PRODUCTS) (SALES)

When you read a plan tree you should start from the bottom up. In the above example begin by looking at the access operators (or the leaves of the tree). In this case the access operators are implemented using full table scans. The rows produced by these tables scans will be consumed by the join operator. Here the join operator is a hash-join (other alternatives include nested-loop or sort-merge join). Finally the group-by operator implemented here using hash (alternative would be sort) consumes rows produced by the join-opertor.

The execution plan generated for a SQL statement is just one of the many alternative execution plans considered by the query optimizer. The query optimizer selects the execution plan with the lowest cost. Cost is a proxy for performance, the lower is the cost the better is the performance. The cost model used by the query optimizer accounts for the IO, CPU, and network usage in the query.

There are two different methods you can use to look at the execution plan of a SQL statement:

EXPLAIN PLAN command - This displays an execution plan for a SQL statement without actually executing the statement.

V$SQL_PLAN - A dictionary view introduced in Oracle 9i that shows the execution plan for a SQL statement that has been compiled into a cursor in the cursor cache.

Under certain conditions the plan shown when using EXPLAIN PLAN can be different from the plan shown using V$SQL_PLAN. For example, when the SQL statement contains bind variables the plan shown from using EXPLAIN PLAN ignores the bind variable values while the plan shown in V$SQL_PLAN takes the bind variable values into account in the plan generation process.

Displaying an execution plan has been made easier after the introduction of the dbms_xplan package in Oracle 9i and by the enhancements made to it in subsequent releases. This packages provides several PL/SQL procedures to display the plan from different sources:

EXPLAIN PLAN command

V$SQL_PLAN

Automatic Workload Repository (AWR)

SQL Tuning Set (STS)

SQL Plan Baseline (SPM)

The following examples illustrate how to generate and display an execution plan for our original SQL statement using the different functions provided in the dbms_xplan package.

Example 1 Uses the EXPLAIN PLAN command and the dbms_xplan.display function.


SQL> EXPLAIN PLAN FOR
 2   select prod_category, avg(amount_sold)
 3   from sales s, products p
 4   where p.prod_id = s.prod_id
 5   group by prod_category;

Explained.


SQL> select plan_table_output
 2    from table(dbms_xplan.display('plan_table',null,'basic'));

------------------------------------------
 Id   Operation              Name   
------------------------------------------
   0  SELECT STATEMENT              
   1   HASH GROUP BY                
   2    HASH JOIN                   
   3     TABLE ACCESS FULL   PRODUCTS
   4     PARTITION RANGE ALL        
   5      TABLE ACCESS FULL  SALES  
------------------------------------------

The arguments are for dbms_xplan.display are:

plan table name (default 'PLAN_TABLE'),

statement_id (default null),

format (default 'TYPICAL')

More details can be found in $ORACLE_HOME/rdbms/admin/dbmsxpln.sql.

Example 2 Generating and displaying the execution plan for the last SQL statement executed in a session:


SQL> select prod_category, avg(amount_sold)
 2   from sales s, products p
 3   where p.prod_id = s.prod_id
 4   group by prod_category;

no rows selected


SQL> select plan_table_output
 2    from table(dbms_xplan.display_cursor(null,null,'basic'));

------------------------------------------
 Id   Operation              Name   
------------------------------------------
   0  SELECT STATEMENT              
   1   HASH GROUP BY                
   2    HASH JOIN                   
   3     TABLE ACCESS FULL   PRODUCTS
   4     PARTITION RANGE ALL        
   5      TABLE ACCESS FULL  SALES  
------------------------------------------

The arguments used by dbms_xplay.dispay_cursor are:

SQL ID (default null, null means the last SQL statement executed in this session),

child number (default 0),

format (default 'TYPICAL')

The details are in $ORACLE_HOME/rdbms/admin/dbmsxpln.sql.

Example 3 Displaying the execution plan for any other statement requires the SQL ID to be provided, either directly or indirectly:

Directly:


SQL> select plan_table_output from
 2   table(dbms_xplan.display_cursor('fnrtqw9c233tt',null,'basic'));

Indirectly:


SQL> select plan_table_output
 2   from v$sql s,
 3   table(dbms_xplan.display_cursor(s.sql_id,
 4                                  s.child_number, 'basic')) t
 5   where s.sql_text like 'select PROD_CATEGORY%';

Example 4 - Displaying an execution plan corresponding to a SQL Plan Baseline. SQL Plan Baselines have been introduced in Oracle 11g to support the SQL Plan Management feature (SPM). In order to illustrate such a case we need to create a SQL Plan Baseline first.


SQL> alter session set optimizer_capture_sql_plan_baselines=true;

Session altered.

SQL> select prod_category, avg(amount_sold)
 2   from sales s, products p
 3   where p.prod_id = s.prod_id
 4   group by prod_category;

no rows selected

If the above statement has been executed more than once, a SQL Plan Baseline will be created for it and you can verified this using the follows query:


SQL> select SQL_HANDLE, PLAN_NAME, ACCEPTED
 2   from dba_sql_plan_baselines
 3   where sql_text like 'select prod_category%';

SQL_HANDLE                     PLAN_NAME                      ACC
------------------------------ ------------------------------ ---
SYS_SQL_1899bb9331ed7772       SYS_SQL_PLAN_31ed7772f2c7a4c2  YES

The execution plan for the SQL Plan Baseline created above can be displayed either directly or indirectly:

Directly

select t.* from
table(dbms_xplan.display_sql_plan_baseline('SYS_SQL_1899bb9331ed7772',
                                           format => 'basic')) t

Indirectly

select t.*
    from (select distinct sql_handle
          from dba_sql_plan_baselines
          where sql_text like 'select prod_category%') pb,
         table(dbms_xplan.display_sql_plan_baseline(pb.sql_handle,
                                                    null,'basic')) t;

The output of either of these two statements is:


----------------------------------------------------------------------------
SQL handle: SYS_SQL_1899bb9331ed7772
SQL text: select prod_category, avg(amount_sold) from sales s, products p
          where p.prod_id = s.prod_id group by prod_category
----------------------------------------------------------------------------

----------------------------------------------------------------------------
Plan name: SYS_SQL_PLAN_31ed7772f2c7a4c2
Enabled: YES     Fixed: NO      Accepted: YES     Origin: AUTO-CAPTURE
----------------------------------------------------------------------------

Plan hash value: 4073170114

---------------------------------------------------------
 Id   Operation                 Name               
---------------------------------------------------------
   0  SELECT STATEMENT                             
   1   HASH GROUP BY                               
   2    HASH JOIN                                  
   3     VIEW                   index$_join$_002   
   4      HASH JOIN                                
   5       INDEX FAST FULL SCAN PRODUCTS_PK        
   6       INDEX FAST FULL SCAN PRODUCTS_PROD_CAT_IX
   7     PARTITION RANGE ALL                       
   8      TABLE ACCESS FULL     SALES              
---------------------------------------------------------

Formatting

The format argument is highly customizable and allows you to see as little (high-level) or as much (low-level) details as you need / want in the plan output. The high-level options are:

Basic
The plan includes the operation, options, and the object name (table, index, MV, etc)
Typical
It includes the information shown in BASIC plus additional optimizer-related internal information such as cost, size, cardinality, etc. These information are shown for every operation in the plan and represents what the optimizer thinks is the operation cost, the number of rows produced, etc. It also shows the predicates evaluation by the operation. There are two types of predicates: ACCESS and FILTER. The ACCESS predicates for an index are used to fetch the relevant blocks because they apply to the search columns. The FILTER predicates are evaluated after the blocks have been fetched.
All
It includes the information shown in TYPICAL plus the lists of expressions (columns) produced by every operation, the hint alias and query block names where the operation belongs. The last two pieces of information can be used as arguments to add hints to the statement.

The low-level options allow the inclusion or exclusion of find details, such as predicates and cost.
For example,

select plan_table_output
from table(dbms_xplan.display('plan_table',null,'basic +predicate +cost'));

-------------------------------------------------------
 Id   Operation              Name      Cost (%CPU)
-------------------------------------------------------
   0  SELECT STATEMENT                    17  (18)
   1   HASH GROUP BY                      17  (18)
*  2    HASH JOIN                         15   (7)
   3     TABLE ACCESS FULL   PRODUCTS      9   (0)
   4     PARTITION RANGE ALL               5   (0)
   5      TABLE ACCESS FULL  SALES         5   (0)
-------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("P"."PROD_ID"="S"."PROD_ID")

select plan_table_output from
table(dbms_xplan.display('plan_table',null,'typical -cost -bytes'));

----------------------------------------------------------------------------
 Id   Operation              Name      Rows  Time      Pstart Pstop
----------------------------------------------------------------------------
   0  SELECT STATEMENT                    4  00:00:01             
   1   HASH GROUP BY                      4  00:00:01             
*  2    HASH JOIN                       960  00:00:01             
   3     TABLE ACCESS FULL   PRODUCTS   766  00:00:01             
   4     PARTITION RANGE ALL            960  00:00:01      1     16
   5      TABLE ACCESS FULL  SALES      960  00:00:01      1     16
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("P"."PROD_ID"="S"."PROD_ID")

Note Section

In addition to the plan, the package displays notes in the NOTE section, such as that dynamic sampling was used during query optimization or that star transformation was applied to the query.
For example, if the table SALES did not have statistics then the optimizer will use dynamic sampling and the plan display will report it as follows (see '+note' detail in the query):

select plan_table_output
from table(dbms_xplan.display('plan_table',null,'basic +note'));

------------------------------------------
 Id   Operation              Name   
------------------------------------------
   0  SELECT STATEMENT              
   1   HASH GROUP BY                
   2    HASH JOIN                   
   3     TABLE ACCESS FULL   PRODUCTS
   4     PARTITION RANGE ALL        
   5      TABLE ACCESS FULL  SALES  
------------------------------------------

Note
-----
- dynamic sampling used for this statement

Bind peeking

The query optimizer takes into account the values of bind variable values when generation an execution plan. It does what is generally called bind peeking. See the first post in this blog about the concept of bind peeking and its impact on the plans and the performance of SQL statements.
As stated earlier the plan shown in V$SQL_PLAN takes into account the values of bind variables while the one shown from using EXPLAIN PLAN does not. Starting with 10gR2, the dbms_xplan package allows the display of the bind variable values used to generate a particular cursor/plan. This is done by adding '+peeked_binds' to the format argument when using display_cursor().
This is illustrated with the following example:


variable pcat varchar2(50)
exec :pcat := 'Women'

select PROD_CATEGORY, avg(amount_sold)
from sales s, products p
where p.PROD_ID = s.PROD_ID
and prod_category != :pcat
group by PROD_CATEGORY;

select plan_table_output
from table(dbms_xplan.display_cursor(null,null,'basic +PEEKED_BINDS'));

------------------------------------------
 Id   Operation              Name   
------------------------------------------
   0  SELECT STATEMENT              
   1   HASH GROUP BY                
   2    HASH JOIN                   
   3     TABLE ACCESS FULL   PRODUCTS
   4     PARTITION RANGE ALL        
   5      TABLE ACCESS FULL  SALES  
------------------------------------------

Peeked Binds (identified by position):
--------------------------------------

1 - :PCAT (VARCHAR2(30), CSID=2): 'Women'

Improvement of AUTO sampling statistics gathering feature in Oracle 11g

2008-01-22T15:34:00.001-08:00

Optimizer statistics in Oracle are managed via a pl/sql package, dbms_stats. It provides several pl/sql procedures to gather statistics for a table, schema, or a database. For example, gather_table_statistics is used to gather statistics on a table. This procedure has an estimate_percent parameter, which specifies the sampling percentage of the statistics gathering. The users can specify any number between 0 ~ 100 for this parameter. For example, suppose you have a table BIGT, you can specify a 1% sampling percentage as follows:

exec dbms_stats.gather_table_stats(null, 'BIGT', 
                       estimate_percent => 1);

It is not always easy for users to pick the appropriate sampling percentage. If the specified sampling percentage is too high, it can take longer to gather statistics. On the contray, if the data is skewed and the specified sampling percentage is too low, the resulting statistics can be inaccurate.

For this reason, Oracle introduced the AUTO value for the estimate_percent parameter. For example, you can gather statistics on BIGT as follows:

exec dbms_stats.gather_table_stats(null, 'BIGT', 
                       estimate_percent => dbms_stats.auto_sample_size);

The advantage of using AUTO sample size over a fixed number is two-folds. First, when AUTO sample size is specified, the system automatically determines the appropriate sampling percentage. Second, AUTO sample size is more flexible than a fixed sampling percentage. A fixed sampling percentage size that was good at some point in time may not be appropriate after the data distribution in the table has changed. On the other hand when AUTO value is used Oracle will adjust the sample size when the data distribution changes.

When AUTO is used Oracle picks a sample size where the statistics quality is good enough. However, it does not work very well under cases of extreme skew in the data. In Oracle 11g, we improved the behavior when the AUTO value is used. First, AUTO sampling now generates deterministic statistics. Second, and more importantly, AUTO sampling generates statistics that are almost as accurate as 100% sampling but takes much less time than 100% sampling. To illustrate these merits, we compare the performance of using a fixed sampling percentage, AUTO sample size in Oracle 10g and AUTO sample size in Oracle 11g.

We used the standard TPC-D data generator to generate a Lineitem table. The Lineitem table is about 230G and contains 1.8 million rows with 16 columns. The schema of the lineitem table is as follows:

column name	column type
l_shipdate	date
l_orderkey	number
l_discount	number
l_extendedprice	number
l_suppkey	number
l_quantity	number
l_returnflag	varchar2
l_partkey	number
l_linestatus	varchar2
l_tax	number
l_commitdate	date
l_receiptdate	date
l_shipmode	varchar2
l_linenumber	number
l_shipinstruct	varchar2
l_comment	varchar2

Table 1 gives the elapsed time of gathering statistics on the Lineitem table by different sampling percentages.

Sampling Percentage	Elapsed Time (sec)
1% sampling	797
100% sampling (Compute)	18772
Auto sampling in Oracle 10g	2935
Auto sampling in Oracle 11g	1908

Table 1: Statistics gathering time on 230G TPC-D Lineitem Table Using Different Estimate Percentages

We also compare the quality of the statistics gathered using different estimate percentages. Among all the statistics of a column, number of distinct values (NDV) is the one whose accuracy used to be an issue. We define the accuracy rate of NDV of a column as follows:


accuracy rate = 1 - (|estimated NDV - actual NDV|)/actual NDV.

The accuracy rate ranges from 0% to 100%. The higher the accuracy rate is, the more accurate the gathered statistics are. Since 100% sampling always lead to an accuracy rate of 100%, we do not report it. We focus on the columns which has at least one statistics accuracy rate below 99.9% when using different estimate percentages. Table 2 illustrates the accurate rates of the columns.

Column Name	Actual NDV	Auto Sampling in Oracle 11g	1% Sampling
orderkey	450,000,000	98.0%	50%
comment	181,122,127	98.60%	4.60%
partkey	60,000,000	99.20%	98.20%
suppkey	3,000,000	99.60%	99.90%
extendedprice	3,791,320	99.60%	94.30%

Table 2: Accuracy Rate of Gathering NDV LineItem Using Different Estimate Percentages

In short, the elapsed time of ''AUTO sampling in Oracle 11g'' is 10 times faster than 100% sampling but yields the statistics of similar quality (accuracy rate close to 100%).

Outerjoins in Oracle

2007-12-11T13:00:00.000-08:00

Since release 6, Oracle has supported a restricted form of left outerjoin, which uses Oracle-specific syntax. In 9i, Oracle introduced support for ANSI SQL 92/99 syntax for inner joins and various types of outerjoin. The Oracle syntax for left outerjoin and that of ANSI SQL 92/99 are not equivalent, as the latter is more expressive.

There appears to be some confusion about equivalence between ANSI outer join and Oracle outer join syntaxes. The following examples explain the equivalences and inequivalences of these two syntaxes.

Oracle-Specific Syntax

Consider query A, which expresses a left outerjoin in the Oracle syntax. Here T1 is the left table whose non-joining rows will be retained and all non-joining rows of T2 will be null appended.

A.
SELECT T1.d, T2.c
FROM T1, T2
WHERE T1.x = T2.x (+);

ANSI Left Outerjoin

In the ANSI outer join syntax, query A can be expressed as query B.

B.
SELECT T1.d, T2.c
FROM T1 LEFT OUTER JOIN T2
ON (T1.x = T2.x);

Equivalence

Consider the following queries. In the Oracle semantics, the presence of (+) on the filter predicate (e.g., T2.y (+) > 5 in query C) indicates that this filter must be applied to the table T2 before the outer join takes place.

C.
SELECT T1.d, T2.c
FROM T1, T2
WHERE T1.x = T2.x (+) and T2.y (+) > 5;

The ANSI left outer join query D is equivalent to C. Applying the filter on the right table in the left outer join is the same as combining the filter with the join condition.

D.
SELECT T1.d, T2.c
FROM T1 LEFT OUTER JOIN T2
ON (T1.x = T2.x and T2.y > 5);

Similarly, the presence of (+) on the filter predicate, T2.y (+) IS NULL, in query E indicates that this filter must be applied to the table T2 before the outer join takes place.

E.
SELECT T1.d, T2.c
FROM T1, T2
WHERE T1.x = T2.x (+) and T2.y (+) IS NULL;

The ANSI left outer join query F is equivalent to E.

F.
SELECT T1.d, T2.c
FROM T1 LEFT OUTER JOIN T2
ON (T1.x = T2.x and T2.y IS NULL);

Consider query G. Oracle will apply the filter, T2.y IS NULL, in query G after the outer join has taken place. G will return only those rows of T2 that failed to join with T1 or those whose T2.y values happen to be null.

G.
SELECT T1.d, T2.c
FROM T1, T2
WHERE T1.x = T2.x (+) and T2.y IS NULL;

The ANSI left outer join query H is equivalent to G, as the WHERE clause in H is applied after the left outer join is performed based on the condition specified in the ON clause.

H.
SELECT T1.d, T2.c
FROM T1 LEFT OUTER JOIN T2
ON (T1.x = T2.x)
WHERE T2.y IS NULL;

Consider query I, where the filter on the left table is applied before or after the outer join takes place.

I.
SELECT T1.d, T2.c
FROM T1, T2
WHERE T1.x = T2.x (+) and T1.Z > 4;

The ANSI left outer join query J is equivalent to query I.

J.
SELECT T1.d, T2.c
FROM T1 LEFT OUTER JOIN T2
ON (T1.x = T2.x)
WHERE T1.Z > 4;

Lateral Views

In Oracle, ANSI left and right outerjoins are internally expressed in terms of left outerjoined lateral views. In many cases, a left outerjoined lateral view can be merged and the ANSI left (or right) outerjoin can be expressed entirely in terms of Oracle native left outerjoin operator. (A lateral view is an inline view that contains correlation referring to other tables that precede it in the FROM clause.)

Consider the ANSI left outer join query K, which is first represented internally as L.

K.
SELECT T1.d, T2.c
FROM T1 LEFT OUTER JOIN T2
ON (T1.x = T2.x and T2.k = 5);

L.
SELECT T1.d, LV.c
FROM T1,
LATERAL (SELECT T2.C
FROM T2
WHERE T1.x = T2.x and T2.k = 5)(+) LV;

The lateral view in query L is merged to yield query M.

M.
SELECT T1.d, T2.c
FROM T1, T2
WHERE T1.x = T2.x (+) and T2.k (+)= 5;

Consider query N, which expresses a left outerjoin in the ANSI join syntax. Currently query N cannot be expressed using the Oracle native left outer join operator.

N.
SELECT T1.m, T2.n
FROM T1 LEFT OUTER JOIN T2
ON (T1.h = 11 and T1.y = T2.y)
WHERE T1.q > 3;

The query N is converted into query O with a left outer-joined lateral view. The lateral view in O cannot be merged, since the filter on the left table specified in the ON clause must be part of the left outerjoin condition.

O.
SELECT T1.m, LV.n
FROM T1,
LATERAL(SELECT T2.n
FROM T2
WHERE T1.h = 11 and T1.y = T2.y)(+) LV
WHERE T1.q > 3;

Consider query P with a disjunction in the outer join condition. Currently N cannot be expressed using the Oracle native left outer join operator.

P.
SELECT T1.A, T2.B
FROM T1 LEFT OUTER JOIN T2
ON (T1.x = T2.x OR T1.Z = T2.Z);

The query P will be converted into Q with a left outer-joined lateral view containing the disjunctive join condition.

Q.
SELECT T1.A, LV.B
FROM T1,
LATERAL (SELECT T2.B
FROM T2
WHERE T1.x = T2.x OR T1.Z = T2.Z) (+) LV;

ANSI Full Outerjoin

Before Oracle 11gR1 all ANSI full outerjoins were converted into a UNION ALL query with two branches, where one branch contained a left outerjoined lateral view and the other branch contained a NOT EXISTS subquery. A native support for hash full outerjoin was introduced in 11gR1 to overcome this problem. When the native full outerjoin, cannot be used, Oracle reverts to the pre-11gR1 strategy.

Consider query R, which specifies an ANSI full outerjoin.

R.
SELECT T1.c, T2.d
FROM T1 FULL OUTER JOIN T2
ON T1.x = T2.y;

Before 11gR1, Oracle would internally convert query R into S.

S.
SELECT T1.c, T2.d
FROM T1, T2
WHERE T1.x = T2.y (+)
UNION ALL
SELECT NULL, T2.d
FROM T2
WHERE NOT EXISTS
(SELECT 1 FROM T1 WHERE T1.x = T2.y);

With the native support of hash full outerjoin, R will be simply expressed as query T, where the view, VFOJ, is considered unmergeable.

T.
SELECT VFOJ.c, VFOJ.d
FROM (SELECT T1.c, T2.d
FROM T1, T2
WHERE T1.x F=F T2.y) VFOJ;

Conversion of Outerjoin into Inner Join

Consider query U. Here the filter predicate on the outer-joined table T2 does not contain the outerjoin operator (+); thus it will be applied after the left outerjoin has taken place. This will result in the elimination of all null appended rows of T2. Hence, Oracle converts the outer join into an inner join.

U.
SELECT T1.d, T2.c
FROM T1, T2
WHERE T1.x = T2.x (+) and T2.y > 5;

The ANSI left outer join query V is equivalent to query U, as the WHERE clause in V is applied after the left outer join is performed based on the condition specified in the ON clause.

V.
SELECT T1.d, T2.c
FROM T1 LEFT OUTER JOIN T2
ON (T1.x = T2.x)
WHERE T2.y > 5;

Oracle converts the queries U and V into query W with an inner join.

W.
SELECT T1.d, T2.c
FROM T1, T2
WHERE T1.x = T2.x and T2.y > 5;

Q&A

Q1: I do not understand the queries N and O. What is the difference between
a filter appearing in the ON Clause or a filter appearing in the WHERE
clause?

A1: Consider two tables T11 and T22.

T11:
A | B
1 | 2
2 | 3
3 | 5

T22:
X | Y
7 | 2
8 | 4
9 | 4

The following ANSI left outer join query N’ involving
the tables T11 and T22 will return three rows, since
the filter, which always fails, is part of the join
condition. Although this join condition, which
comprises both the predicates in the ON clause,
always evaluates to FALSE, all the rows of the left
table T11 are retained in the result.

N’.
SELECT *
FROM T11 LEFT OUTER JOIN T22
ON (T11.A > 9 and T11.B = T22.Y);

A B X Y
------ ---------- ---------- ---------
1 2
2 3
3 5

However, if the filter, T11.A > 9, is moved to the WHERE clause,
the query will return zero rows.

Q2: Is the outer to inner join conversion a new feature?

A2: No. This feature has been avaliable since Release 7.

Q3: Has native full outer join been made available in
versions prior to 11gR1?

A3: Yes. It is available in 10.2.0.3 and 10.2.0.4, but not by
default.

Why are there more cursors in 11g for my query containing bind variables?

2007-12-03T15:05:00.000-08:00

Oracle introduced a new feature, adaptive cursor sharing, in 11g, to improve the plans that are selected for queries containing bind variables. This feature can result in more cursors for the same query containing bind variables. We'll explain why in this article. Before we get into the details, let's review a little history.

Oracle introduced the bind peeking feature in Oracle 9i. With bind peeking, the Optimizer peeks at the values of user-defined bind variables on the first invocation of a cursor. This allows the optimizer to determine the selectivity of any WHERE clause condition as if literals have been used instead of bind variables, thus improving the quality of the execution plan generated for statements using bind variables.

However, there was a problem with this approach, when the column used in the WHERE clause with the bind contained a data skew. If there is data skew in the column, it is likely that a histogram has been created on this column during statistics gathering. When the optimizer peeks at the value of the user-defined bind variable and chooses a plan, it is not guaranteed that this plan will be good for all possible values for the bind variable. In other words, the plan is optimized for the peeked value of the bind variable, but not for all possible values.

In 11g, the optimizer has been enhanced to allow multiple execution plans to be used for a single statement that uses bind variables. This ensures that the best execution plan will be used depending on the bind value. Let's look at an example to see exactly how this works.

Assume I have simple table emp which has 100,000 rows and has one index called emp_i1 on deptno column.

SQL> desc emp

Name                   Null?    Type
---------------------- -------- ----------------------------------
ENAME                           VARCHAR2(20)
EMPNO                           NUMBER
PHONE                           VARCHAR2(20)
DEPTNO                          NUMBER

There is a data skew in the deptno column, so when I gathered statistics on the emp table, Oracle automatically created a histogram on the deptno column.

SQL> select table_name, column_name, histogram from user_tab_cols;

TABLE_NAME         COLUMN_NAME        HISTOGRAM
------------------ ------------------ ---------------
EMP                DEPTNO             HEIGHT BALANCED
EMP                EMPNO              NONE
EMP                ENAME              NONE
EMP                PHONE              NONE

Now I will execute a simple select on my emp table, which has a single WHERE
clause predicate on the deptno column. The predicate contains a bind variable. We will begin by using the value 9 for this bind variable. The value 9 occurs 10 times in the table, i.e. in 0.0001% of the rows.

SQL> exec :deptno := 9

SQL> select /*ACS_1*/ count(*), max(empno)
2  from emp
3  where deptno = :deptno;


COUNT(*) MAX(EMPNO)
---------- ----------
    10         99

Given how selective the value 9 is, we should expect to get an index range scan for this query. Lets check the execution plan.

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------
SQL_ID  272gr4hapc9w1, child number 0
------------------------------------------------------------------------
select /*ACS_1*/ count(*), max(empno) from emp where deptno = :deptno

Plan hash value: 3184478295
------------------------------------------------------------------------
| Id | Operation                    | Name  | Rows | Bytes |Cost (%CPU)|
------------------------------------------------------------------------
|  0 | SELECT STATEMENT             |       |      |       |    2 (100)|
|  1 |  SORT AGGREGATE              |       |     1|    16 |           |
|  2 |   TABLE ACCESS BY INDEX ROWID| EMP   |     1|    16 |    2   (0)|
|  3 |    INDEX RANGE SCAN          | EMP_I1|     1|       |    1   (0)|
------------------------------------------------------------------------

So we got the index range scan that we expected. Now let's look at the execution statistics for this statement

SQL> select child_number, executions, buffer_gets,
2    is_bind_sensitive, is_bind_aware
3  from v$sql
4  where sql_text like 'select /*ACS_1%';

CHILD_NUMBER EXECUTIONS BUFFER_GETS IS_BIND_SENSITIVE IS_BIND_AWARE
------------ ---------- ----------- ----------------- -------------
          0          1          53 Y                 N

You can see we have one child cursor that has been executed once and has a small number of buffer gets. We also see that the cursor has been marked bind sensitive. A cursor is marked bind sensitive if the optimizer believes the optimal plan may depend on the value of the bind variable. When a cursor is marked bind sensitive, Oracle monitors the behavior of the cursor using different bind values, to determine if a different plan for different bind values is called for. This cursor was marked bind sensitive because the histogram on the deptno column was used to compute the selectivity of the predicate "where deptno = :deptno". Since the presence of the histogram indicates that the column is skewed, different values of the bind variable may call for different plans.

Now let's change the value of the bind variable to 10, which is the most popular value for the deptno column. It occurs 99900 times in the table, i.e in 99.9% of the rows.

SQL>  exec :deptno := 10

SQL> select /*ACS_1*/ count(*), max(empno)
2  from emp
3  where deptno = :deptno;

COUNT(*) MAX(EMPNO)
---------- ----------
 99900     100000

We expect to get the same plan as before for this execution because Oracle initially assumes it can be shared. Let's check:

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------
SQL_ID  272gr4hapc9w1, child number 0
------------------------------------------------------------------------
select /*ACS_1*/ count(*), max(empno) from emp where deptno = :deptno

Plan hash value: 3184478295
------------------------------------------------------------------------
| Id | Operation                    | Name  | Rows | Bytes |Cost (%CPU)|
------------------------------------------------------------------------
|  0 | SELECT STATEMENT             |       |      |       |    2 (100)|
|  1 |  SORT AGGREGATE              |       |     1|    16 |           |
|  2 |   TABLE ACCESS BY INDEX ROWID| EMP   |     1|    16 |    2   (0)|
|  3 |    INDEX RANGE SCAN          | EMP_I1|     1|       |    1   (0)|
------------------------------------------------------------------------

The plan is still an index range scan as before, but if we look at the execution statistics, we should see two executions and a big jump in the number of buffer gets from what we saw before.

SQL> select child_number, executions, buffer_gets,
 2    is_bind_sensitive, is_bind_aware
 3  from v$sql
 4  where sql_text like 'select /*ACS_1%';

CHILD_NUMBER EXECUTIONS BUFFER_GETS IS_BIND_SENSITIVE IS_BIND_AWARE
------------ ---------- ----------- ----------------- -------------
          0          2        1007 Y                 N

You should also note that the cursor is still only marked bind sensitive and not bind aware at this point. So let's re-execute the statement using the same popular value, 10.

SQL> exec :deptno := 10

SQL> select /*ACS_1*/ count(*), max(empno)
2  from emp
3  where deptno = :deptno;

COUNT(*)  MAX(EMPNO) 
---------- -----------
99900      100000

Behind the scenes during the first two executions, Oracle was monitoring the behavior of the queries, and determined that the different bind values caused the data volumes manipulated by the query to be significantly different. Based on this difference, Oracle "adapts" its behavior so that the same plan is not always shared for this query. Hence a new plan is generated based on the current bind value, 10.

Let's check what the new plan is.

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------
SQL_ID  272gr4hapc9w1, child number 1
--------------------------------------------------------------------
select /*ACS_1*/ count(*), max(empno) from emp where deptno = :deptno

Plan hash value: 2083865914
--------------------------------------------------------------------                                                                       
| Id  | Operation            | Name  | Rows   | Bytes | Cost (%CPU)|
--------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       |        |       |   240 (100)|
|   1 |  SORT AGGREGATE      |       |     1  |   16  |            |
|*  2 |   TABLE ACCESS FULL  | EMP   | 95000  | 1484K |   240   (1)|
--------------------------------------------------------------------

Given how unselective the value 10 is in the table, it's not surprising that the new plan is a full table scan. Now if we display the execution statistics we should see an additional child cursor (#1) has been created. Cursor #1 should show a number of buffers gets lower than cursor #0 and it is marked both bind sensitive and bind aware. A bind aware cursor may use different plans for different bind values, depending on how selective the predicates containing the bind variable are.

Looking at the execution statistics:

SQL> select child_number, executions, buffer_gets,
 2    is_bind_sensitive, is_bind_aware
 3  from v$sql
 4  where sql_text like 'select /*ACS_1%';

CHILD_NUMBER EXECUTIONS BUFFER_GETS IS_BIND_SENSITIVE IS_BIND_AWARE
------------ ---------- ----------- ----------------- -------------
          0          2        1007 Y                 N
          1          1         821 Y                 Y

we see that there is a new cursor, which represents the plan which uses a table scan. But if we execute the query again with a more selective bind value, we should use the index plan:

SQL> exec :deptno := 9

SQL> select /*ACS_1*/ count(*), max(empno)
2  from emp
3  where deptno = :deptno;

COUNT(*) MAX(EMPNO)
---------- ----------
    10         99

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------
SQL_ID  272gr4hapc9w1, child number 2
------------------------------------------------------------------------
select /*ACS_1*/ count(*), max(empno) from emp where deptno = :deptno

Plan hash value: 3184478295
------------------------------------------------------------------------
| Id | Operation                    | Name  | Rows | Bytes |Cost (%CPU)|
------------------------------------------------------------------------
|  0 | SELECT STATEMENT             |       |      |       |    2 (100)|
|  1 |  SORT AGGREGATE              |       |     1|    16 |           |
|  2 |   TABLE ACCESS BY INDEX ROWID| EMP   |     1|    16 |    2   (0)|
|  3 |    INDEX RANGE SCAN          | EMP_I1|     1|       |    1   (0)|
------------------------------------------------------------------------

The proper plan was chosen, based on the selectivity produced by the current bind value.

There is one last interesting thing to note about this. If we look at the execution statistics again, there are three cursors now:


SQL> select child_number, executions, buffer_gets,
 2    is_bind_sensitive, is_bind_aware, is_shareable
 3  from v$sql
 4  where sql_text like 'select /*ACS_1%';

CHILD_NUMBER EXECUTIONS BUFFER_GETS IS_B_SENS IS_B_AWAR IS_SHAR
------------ ---------- ----------- --------- --------- ----------
          0          2         957 Y         N         N
          1          1         765 Y         Y         Y
          2          2           6 Y         Y         Y

The original cursor was discarded when the cursor switched to bind aware mode. This is a one-time overhead. Note that the cursor is marked as not shareable (is_shareable is "N"), which means that this cursor will be among the first to be aged out of the cursor cache, and that it will no longer be used. In other words, it is just waiting to be garbage collected.

There is one other reason that you may notice additional cursors for such a query in 11g. When a new bind value is used, the optimizer tries to find a cursor that it thinks will be a good fit, based on similarity in the bind value's selectivity. If it cannot find such a cursor, it will create a new one (like above, when one (#1) was created for unselective "10" and one (#2) was created for highly-selective "9"). If the plan for the new cursor is the same as one of the existing cursors, the two cursors will be merged, to save space in the cursor cache. This will result in one being left behind that is in a not shareable state. This cursor will be aged out first if there is crowding in the cursor cache, and will not be used for future executions.

Q & A
Instead of answering the questions in your comments one by one, I am going to summarize the questions and provide my answers here.

Q: Is this behavior managed by 11g optimizer automatically and we don't need cursor_sharing anymore?
A: We have not changed the behavior of the cursor_sharing parameter yet, for backwards compatibility purposes. So if you set it to similar, adaptive cursor sharing will only kick in for queries where the literals are replace with binds. We hope that in the future, this feature will persuade people to set cursor_sharing to force.

Q: Would it have any impact like holding library cache latches for longer time to search for appropriate child cursor.
A: Any additional overhead in matching a cursor is always a concern, and we strive to minimize the impact. There is of course some increase in the code path to match a bind-aware cursor, since it requires more intelligent checks. This feature should not, however, impact cursors which are not yet marked bind-aware.

Q: What triggers a cursor to be marked "bind sensitive"?
A: Our goal is to consider many types of predicates where the selectivity can change when the bind value changes. In this first version of the feature, we only handle equality predicates where a histogram exists on the column and range predicates (with or without histogram). We do not currently consider LIKE predicates, but it is on the top of our list for future work.

Q: Also it sounds like the optimizer is using the number of rows returned to decided that it's time for a new plan...
A: I am not going to go into the details of the "special sauce" for how we decide to mark a cursor bind-aware. The number of rows processed is one input.

Q: Are you planning a hint to mark statements as bind-aware ?
A: Yes, we plan to add this in the future. This will allow users to bypass the startup cost of automatically determining that a query is a good candidate for bind-aware cursor sharing.