
SQL Tuning

Three steps to SQL tuning


There are three main steps to the SQL tuning process:

  1. Isolate the problem to a single SQL.
  2. Analyse the SQL to determine the nature of the problem. Most performance problems are quite common and easy to fix. This section will describe some of these and how to spot them, and then go on to describe a more general analysis method.
  3. Fix it. Almost every performance problem has a solution; it's just that some are more complex than others. In order of increasing complexity and expense, such fixes include:
    • Analyze the underlying table to give Oracle's Cost Based Optimizer the information it needs to resolve the SQL efficiently.
    • Add one or more hints to the SQL to encourage or discourage certain execution plans.
    • Minor changes to the SQL to encourage or discourage certain execution plans.
    • Restructure a poorly designed SQL that cannot be resolved efficiently.
    • Alter the underlying infrastructure. eg. Add or change (or even remove!) indexes; introduce clusters, partitions or index-organised tables; de-normalize tables; use materialized views. Note that these actions venture outside the scope of this document, and should only be performed with the prior permission of (and preferably assistance from) the DBA and/or System Architect.
    • Refer the problem to the database administrator. Possible solutions here would include tuning the Oracle instance, restructuring or moving tablespaces, or archiving old data.
    • Refer the problem to the System Administrator. Possible solutions may include reconfiguration of existing hardware, or acquisition of new hardware.

No links are provided here for fixes. Every problem is different, and the fixes are a logical progression of the analysis. Follow the link for Step 2 above to determine the problem and see the recommended fix. If this does not work, consult a more experienced SQL Tuner, the System Architect, or the DBA.

SQL*Text


Queries Oracle’s data dictionary views to show all SQLs running for a given User ID, Server PID, or Client PID.
Enter the user's operating system USER NAME, client process ID, or database process ID:
At the prompt above, enter the operating system USER NAME of the person currently running a long-running job. Alternatively, enter the PID of the Oracle shadow process, or the PID of the client process. If you don't know these, or don't know how to find them out, it is simpler just to supply the USER NAME.
SQL*Text will then list all of the SQLs running for that user (or PID), and provide some indication of what that SQL is doing. Eg:
O/SUser  CPid  SPid    DbUser  S  WaitResn
==========================================
ABC      1010  101010  ABC        ClientSQLNet

SQLText
--------------------------------------------------------------------------------
commit




The output above is simplistic, but it shows that the session for user ABC last ran a commit, and is currently in status ClientSQLNet, meaning that the database process is idle and control has been handed back to the client process or program.
Common wait Reasons include:
ClientSQLNet - Oracle session is connected but Inactive
IndexScan - Oracle is accessing a table via an index
FullTblScan - Oracle is performing a full table scan
For a detailed description of other wait reasons, see Oracle’s online documentation on the V$SESSION_WAIT view.
For those unimpressed by the green-screen technology of this tool, most of the information is also available in Oracle Enterprise Manager; however, it is slower and it is harder to establish the Wait Reason.

==================

Oracle Trace


If you are considering Oracle Trace, you must be able to get a single user to re-create the problem in a single Oracle session.
If so, you can trace just that session. This will require co-operation from both the user and the DBA, so make arrangements with them both privately, and then get them on a conference call. Get the user to open up their Oracle session, then give the DBA a moment to find the session ID. The DBA can then launch a trace directly on that session, generating a trace file. If the user can repeat the performance problem, then get them to close their session; this will close the trace. Now you have a trace file to run through TK*Prof.


TK*Prof


TK*Prof is an Oracle tool used to display the statistics generated during a trace. When an Oracle session is traced (by SQL*Trace, Oracle Trace, or Database Trace), a trace file is generated. This trace file is barely human-readable; TK*Prof collates and formats the data into a more meaningful form.

Finding the trace file


All trace files are written to the same location: a directory that is defined when the database is booted. To find out the location of this directory, run the following SQL:
SELECT value FROM sys.v_$parameter WHERE name = 'user_dump_dest' ;
If this returns a 'Table or view does not exist' error, then have the DBA grant select privileges on sys.v_$parameter to everybody. Go to the directory shown, and list the files in date order; on Unix, this is ls -ltr. If the trace files are not readable, ask the DBA to change the privileges. There is a database initialisation parameter that the DBA can set so that all future trace files are created readable.

Running TK*Prof


Running TK*Prof is simple:



tkprof trace_file output_file [ explain=userid/password@database ]



trace_file is the name of the trace file you found in the previous step, and output_file is the file to which TK*Prof will send the output. The optional explain argument will display an Explain Plan for all SQLs in the trace file. There are other optional arguments to tkprof; see the Oracle Utilities manual for more detail.
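For example (the file names and connect string are illustrative; the sort keys order the SQLs by elapsed parse, execute, and fetch time):

tkprof ora_12345.trc tkprof_out.txt explain=scott/tiger@orcl sort=(prsela,exeela,fchela)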

TK*Prof output


The output of TK*Prof is very well described in the Oracle Utilities manual, so it will not be described again here. The sort of things you should be looking for are:

  • For each SQL, check the Elapsed statistic. This shows the elapsed time for each SQL. High values obviously indicate long-running SQL.
  • Note the Disk and Query columns. These indicate data retrieval from disk and data retrieval from memory respectively. If the Disk column is relatively low compared to the Query column, then it could mean that the SQL has been run several times and the data has been cached. This might not give a true indication of the performance when the data is not cached. Either have the database bounced by the DBA, or try the trace again another day.
  • The first row of statistics for each SQL is for the Parse step. If a SQL is run many times, it usually does not need to be re-parsed unless Oracle needs the memory it is taking up, and swaps it out of the shared pool. If you have SQLs parsed more than once, get the DBA to check whether the database can be tuned to reduce this.
  • A special feature of the Explain Plan used in TK*Prof is that it shows the number of rows read for each step of the execution plan. This can be useful to track down Range Scan problems where thousands of rows are read from an index and table, but only a few are returned after the bulk are filtered out.
  • In order to run SQL statements, Oracle must perform its own SQL statements to query the data dictionary, looking at indexes, statistics etc. This is called Recursive SQL. The last two entries in the TK*Prof output are summaries of the Recursive and Non-Recursive (ie. "normal") SQL. If the recursive SQL is taking up more than a few seconds, then it is a likely sign that the Shared Pool is too small. Show the TK*Prof output to the DBA to see if the database can be tuned.


If your Explain Plan in the TK*Prof output shows 0 rows for every line, check the following:

  • Make sure you turn tracing off or exit your traced session before running TK*Prof. Some statistics are only written at the end.
  • Have you run any ALTER SESSION commands that affect the optimizer? If so, then the plan shown may differ from the real plan. Note that the real plan is not shown: TK*Prof re-evaluates the plan when you run TK*Prof. Make sure that you turn SQL_TRACE on before you ALTER SESSION. TK*Prof is clever enough to see the ALTER SESSION command in the trace file and evaluate plans accordingly. It will probably display two plans: the default plan, and the new plan taking the ALTER SESSION into account.


SQL*Trace



If you know the program / module / job running slowly, then you can trace the session with SQL*Trace.



Add a line near the beginning of your program that executes the following SQL:



Alter session set sql_trace = true
For example



In Pro*C
EXEC SQL Alter session set sql_trace = true;



In PL/SQL
execute immediate 'Alter session set sql_trace = true';



If you are using a tool that does not allow you to submit SQL directly to the RDBMS (eg. A front-end reporting tool) then you will need the DBA to start an Oracle Trace for you.



Now when you run the program, Oracle will generate a trace file that you can analyze using TK*Prof.
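If several sessions write trace files at the same time, it can be hard to spot yours. A sketch that also enables timed statistics (without which TK*Prof shows zero CPU and elapsed times) and, on 9i and later, tags the trace file name - the identifier shown is arbitrary:

Alter session set timed_statistics = true;
Alter session set tracefile_identifier = 'MY_SLOW_JOB';
Alter session set sql_trace = true;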



Some sites may have jobs that make repeated connections to the database, perhaps by submitting each step separately to SQL*Plus. If this is the case, the alter session above will not help, as it only affects a single connection to the database. You will need to either add the alter session command to each call or, as a last resort, revert to a full Database Trace.


Top SQL


Using Oracle Enterprise Manager or any number of home-grown tools a DBA might have developed, pilfered, or downloaded, you can find recent SQLs that have caused the most IO. Depending on your problem, this may be a relatively fast SQL executed thousands of times, or one dirty big SQL executed just a few times (or even just once).
Ensure that the performance problem is either currently happening, or has happened recently. The information Oracle stores on performance is only temporary, so if you wait too long it may no longer be available.
Have you found the SQL you’re looking for? If so, now you can begin to analyze the problem.
If Top SQL wasn’t any help, it’s either because there were too many SQLs to investigate, or because you didn't get to it in time. If you've already ruled out SQL*Text and SQL*Trace because you cannot isolate the problem, you will have to revert to a Database Trace.



Database Trace


If you can’t isolate the problem down to a single user, session, or program / module, then you will have to trace the entire database. This will require two short outages – one to start the trace, the other to stop it. You will need co-operation from the DBA (to conduct the outages) and the primary stakeholders of the system (whose business will be impacted by the outages).
Note that whilst tracing the entire database it will run much slower and chew up disk like crazy, so keep it as short as possible – half an hour or less is preferred.
The DBA will shut down the database and re-mount it with Trace switched on. From this time all sessions will be fully traced. At the end of the trial period, all sessions should be logged off, and the DBA will shut down the database and re-mount it with Trace switched off.
Now you will have a separate trace file for every session that was initiated in the trial period. You can convert them to human-readable form in one of three ways:

  • Run TK*Prof over each of them individually.
  • Concatenate the lot and run TK*Prof once to generate a single output file.
  • With later versions of Oracle you can use the trcsess utility (see the Oracle Performance Tuning Manual). This is a more organised approach to tracing that allows you to group traces by session, client, service, action, or module.

Using the first method, go straight to the bottom of the output file and check the session statistics. Ignore sessions with low Call, Elapsed, Disk, and Current figures. The trace files with relatively high numbers in these columns are the ones you want to look at in more detail.



Analysing a problem SQL


Before you begin to analyze a SQL, you might want to check if there are any quick fixes available. Also familiarise yourself with the many Data Storage problems where tuning will not help - this may save you some time.
If none of these appear to help, then further analysis is required. There are two main classes of problem:

    • Low volume SQLs - Low volume SQLs are usually expected to read only a few rows, and to access tables via an index. Problems occur when indexes are missing, ignored, or misused.
    • High volume SQLs - High volume SQLs - usually reports or batch jobs - can suffer when they use an index. Much has been written on this subject and it will not be repeated here. Suffice to say that when accessing a table or partition via an index, there is a point beyond which it becomes more efficient to simply read the entire table instead of using the index. The argument over exactly what proportion of the data that is (most estimates range from 4% to 50%) has become somewhat of a religious war. The truth is that there is no magic number, however if you have a SQL processing more than a few percent of a table or partition with more than 10,000 rows, you should try a Full Table Scan for comparison.
      For a detailed explanation of the causes behind this phenomenon, refer to the Performance Tuning book of the Oracle Online Documentation, or any good Oracle Performance Tuning manual.


In addition to the normal problems of high volume SQLs, there are two other special classes of High Volume problems worthy of a special note.

    • High Volume DML: Is your SQL performing INSERTs, UPDATEs, or DELETEs?
    • Procedural SQL: Are you using PL/SQL, Pro*C, or another procedural language that is looping through high volumes of data?


How did you go? If you still need better performance, think about the following:

  • Are your expectations realistic? Look at the tables and work out how many blocks you actually need to read to process your query. Check TK*Prof to find out how many blocks it is actually processing. If they are not close, look back through this guide in case there's something you missed - you need to eliminate those redundant blocks.
    If they are close, show the TK*Prof output to the DBA and ask whether the blocks/sec disk access is reasonable - there may be a problem with the table, tablespace, or disk.
  • Talk to your DBA about using the Parallel Query Server. It has been left out of this guide partly because when it is used too often it loses its effectiveness, but mostly because it promotes laziness. If parallel query returns a better result (and it usually will) people stop looking for a better solution.
  • Take a look at the Advanced Objects page. Perhaps you can remodel the data to make access more efficient.
  • Perhaps there is a greater problem at the database or system level. Review your tuning results with the DBA to see if they have any ideas.


Quick Fixes


There are a number of very common SQL problems that are easy to identify and easy to fix.



Checking the statistics on a table



Oracle's Cost Based Optimizer collects statistics on tables and indexes in order to determine the best access method when presented with many options. Usually this boils down to "to use, or not to use" an index.
There are three likely cases where statistics are the problem, and they all have the same solution:

  • There are no statistics on the table or index.
  • The statistics are old, and no longer representative of the table/index contents.
  • The statistics were estimated, and an insufficient sample size was used.


To determine whether any of these are true, you can display the statistics using Oracle Enterprise Manager. Open up your database in the Navigator window; open the Schema folder; open the schema that owns your query table; open Tables; find your table name and open it; open Indexes. When you click on the table and each of the indexes, click the Statistics tab on the right hand pane to view the statistics. If the table is partitioned, you should do the same with each partition as well.



Alternatively, download Display_stats.sql and run it from SQL*Plus as follows:
@Display_stats my_table_name
Where my_table_name is the name of the table for which you want to list statistics. Display_stats.sql will display the statistics for the table, its partitions, and its indexes.
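If you don't have the script, a minimal check straight from the data dictionary will do (repeat against user_indexes and user_tab_partitions as needed):

SELECT table_name, num_rows, sample_size, last_analyzed
FROM user_tables
WHERE table_name = 'MY_TABLE_NAME';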



The things you are looking for are:


  • The statistics fields are NULL, indicating the statistics have not been calculated at all (or have been deleted).
  • The Last Analyzed date is old, and statistics need to be recalculated. Other signs of this problem would be Total Rows showing a number that is nothing like the total number of rows in the table.
  • The Sample Size is less than 20% of the total number of rows in the table. In many cases, accurate statistics can be derived from a sample size of 5% or less of the total rows, but this is not always the case.


You should see the DBA if statistics need to be recalculated.
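If you are permitted to gather statistics yourself, here is a sketch using DBMS_STATS (the 20% sample echoes the guideline above; adjust to suit your site):

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => USER,
    tabname          => 'MY_TABLE_NAME',
    estimate_percent => 20,
    cascade          => TRUE);  -- also gathers statistics on the table's indexes
END;
/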


Bad Hints


Sometimes it is not the Oracle Cost Based Optimizer that is getting it wrong. A well-meaning programmer may have added an inappropriate hint to a SQL that actually causes it to run slower. The classic examples of this are the RULE and INDEX hints.



Check your SQL for hints. If you are selecting from a view, also check the view definition for hints.



In versions of Oracle prior to 7.3, the Cost Based Optimizer was frequently unreliable, so the RULE hint was used to make the execution plans predictable. Unfortunately this also eliminates a number of CBO-only execution plans. eg. Hash joins and bitmap indexes.



With the INDEX hint, it is a common misconception that indexes make all queries run faster.



Make sure your statistics are up-to-date, and confirm with the DBA that there are no unusual database settings that would affect the execution plan (eg. OPTIMIZER_GOAL should be CHOOSE, and HASH_JOIN_ENABLED should be TRUE). Now remove the existing hints from the SQL and try it again. If it is no better, leave the hints off and follow the other steps in this guide to tune the SQL.



Cartesian Products
A Cartesian Product occurs when Oracle joins all of the rows in one table to all of the rows in another table. If the number of rows in one of those tables is zero or one, this will not be a problem. However, consider the consequences if both tables have more than 100,000 rows: 100,000 x 100,000 = 10,000,000,000, ie. 10 billion rows. Clearly this will not run fast.
To find out whether your SQL is performing a Cartesian Product, run it through Explain Plan and search for a step that looks like this:
MERGE JOIN (CARTESIAN)
There are several situations where Oracle will use a Cartesian Product:

  • You have not specified any join conditions in the WHERE clause of your SQL. For example:
    SELECT a.column1, b.column3 FROM table_a a, tab b WHERE a.column9 = 'X' AND b.column5 >= to_date('13-JAN-03','DD-MON-YY')
    This is usually a programming error. Add join conditions and re-test (see the corrected sketch after this list). Better still, use ANSI join syntax in 9i and later; it will force you to supply a join predicate (unless you specify the CROSS JOIN method).
  • Oracle thinks that one of the tables in the join has zero or one rows. If this is actually the case, the Cartesian Product is fine. Otherwise it is likely that the statistics are missing on one or both tables: check the statistics and recalculate if necessary.
  • Oracle is performing a Star Query. However this usually only happens when you supply the STAR hint. Try it without the STAR hint and see if it improves.
  • Oracle is performing a Star Join. However this usually only happens when you supply the ORDERED hint. Try it without the ORDERED hint and see if it improves.
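As an illustration of the first point, here is the earlier example again with a join predicate added, using ANSI syntax (the join columns a.id and b.a_id are hypothetical):

SELECT a.column1, b.column3
FROM table_a a
JOIN tab b ON b.a_id = a.id
WHERE a.column9 = 'X'
AND b.column5 >= to_date('13-JAN-03','DD-MON-YY')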

If you have appropriate join predicates, up to date statistics, no hints, and tables selecting 0 or 1 row, then Oracle should not be performing a Cartesian Product. If it is still happening, look at your join predicates and consider the best order to join the tables. Place the tables in the FROM clause in that order and add the ORDERED hint to the SQL. This should stop the Cartesian product. If it is still slow, continue reading this Guide for other alternatives.


Full Table Scans


Full table scans have an image problem. Many Oracle developers - usually those working on OLTP systems - are told early in their careers that Full Table Scans are bad. Period! Many will then hold on to this prejudice and never learn the truth.
Full Table Scans can be good when:

  • Used on a small table (< 500 rows)
  • Used in a query returning more than a few percent of the rows in a table (see High Volume SQLs for a brief discussion on this point)
  • Used in a Sort-Merge or Hash join, but only when these join methods are intended.

Full Table Scans are bad when:

  • Used on a large table where indexed or hash cluster access is preferred.
  • Used on a medium-large table (> 500 rows) as the outer table of a Nested Loop join.
  • Used on a medium-large table (> 500 rows) in a Nested Sub-Query.
  • Used in a Sort-Merge or Hash join when a Nested Loop join is preferred.

To check for a full table scan, first run your SQL through Explain Plan and check for a line like:
TABLE ACCESS (FULL) MY_TABLE_NAME
To find out why Oracle is not using index/cluster access, see Low Volume SQLs. Remember, read the situations above when a Full Table Scan can be good; if this is one of those cases, hit the back button and look for a different problem.


Multiple Table Joins

NP-Hard Problems


Oracle's Cost Based Optimizer works by analyzing several of the possible execution paths for a SQL and choosing the one that it considers best. For instance, a two table join could drive off table A and look up table B for each row returned, or it could drive off table B. By adding in the possibilities of join methods and index selection, the number of possible execution paths increases.
A three table join has three times as many alternatives as a two table join; a four table join has four times the alternatives of a three table join. In general, the number of possible execution paths for a join statement is proportional to n! (ie. n x (n-1) x (n-2) x ... x 2 x 1), where n is the number of tables in the join.
The problem of choosing the absolute best execution path becomes near impossible as n increases. Mathematicians call this an NP-hard problem. The theory of NP-hard problems suggests that there are no shortcuts to finding the absolute best solution; often we must settle for a solution that just looks OK.
Oracle does not attempt to evaluate every possible execution path. As the number of tables in a join increases, the percentage of possible paths evaluated plummets. Since bad execution paths far outnumber good execution paths, the likelihood that Oracle does not stumble on even a halfway good execution path increases with the number of tables in the join.

What to look for


If you have a table join (ie. a FROM clause) with five or more tables, and you have not included a hint for join order (eg. ORDERED or LEADING), then Oracle may be joining the tables in the wrong order.
Run your SQL through Explain Plan and check the order that the tables are processed. Say we have a SQL with five tables (A-E) in the FROM clause, and the following joins:
A <-> B
A <-> C
B <-> D
D <-> E
If the Explain Plan showed the tables in the order A, B, C, D, E, then this would be OK. However if it showed A, B, C, E, D, then there would be a problem because we need the columns from D to join to E, but D is joined last.
This is a simplistic example that would never happen if all of the join predicates were equals joins. But what if the SQL had clauses like this:
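-- illustrative predicates only; the column names are hypothetical
WHERE b.a_id = a.id
AND c.a_id = a.id
AND d.b_id = b.id
AND e.d_id = d.id
AND e.eff_date >= a.start_date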



In this case, table E joins to both table D and table A, however the join to table A is a range operator (>=). It may seem obvious that table E should be joined after table D, but this is exactly the sort of case where Oracle can mess up.

How to fix it


If the tables are being joined in the wrong order, you can supply a hint to suggest a better order.

  • If Oracle is just starting with the wrong table, try a LEADING hint to suggest the best table to start with. SQLs with equi-joins will often get the rest of the joins right if only they know where to start.
    • For ultimate control, update the FROM clause to list the tables in the exact order that they should be joined, and specify the ORDERED hint (see the sketch below).
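For the five-table example above (join columns hypothetical, matching the earlier sketch):

SELECT /*+ ORDERED */ a.col1, e.col2
FROM a, b, c, d, e
WHERE b.a_id = a.id
AND c.a_id = a.id
AND d.b_id = b.id
AND e.d_id = d.id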

Is it now joining in the correct order? If not, then Oracle is ignoring the hint; check with your DBA.
If it is joining in the desired order, is it faster? If not, try elsewhere in this guide; the problem may be related to index usage or join methods.


Accessing Remote Tables


Do you have a step in your Explain Plan that reads simply:
REMOTE
You are accessing a table that is on a different Oracle Database, connected by a Database Link. There are a number of ways this could happen:

  • Selecting from a table with a database link specified.
  • Selecting from a synonym that is defined as a remote table.
  • Selecting from a view that selects from one of the above.


Remote tables have a number of performance implications:

  • If all of the tables from which you are selecting are on the same remote database, then Oracle will get the remote database to run the query and send back the results. The only problems are the cost of shipping the results back, and the fact that you cannot see how the remote database is executing the query in Explain Plan (from Oracle 10g onwards, Explain Plan will get the remote database to plan its bit of the query and pass back the results to your Explain Plan). Make sure that the whole query is being performed remotely - the Explain Plan output should just have the one step - "REMOTE" - or in 10g+, every line of the plan contains the word "REMOTE".
  • If some tables are local - or you are joining tables over two or more remote databases - then Oracle will need to ship all of the data to one database before it can perform a join.
    If you want to use an Indexed Nested Loop (low volume) join, then the table probed via the index (the second table in the join order) cannot be remote - if it is, you will get a full table scan and a hash join. You could use the DRIVING_SITE hint to make the remote database the driving site of the query. Then the rows from the driving table will be sent over there, and an index can be used on the remote table (see the sketch below). The remote database will then send back the results.
    Even better, speak to your DBA about having the remote table(s) replicated on your database so that you can avoid the remote join altogether.
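    A sketch of the hint (the link and table names are hypothetical):
    SELECT /*+ DRIVING_SITE(r) */ l.col1, r.col2
    FROM local_table l, remote_table@remote_db r
    WHERE r.key_col = l.key_col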

The AND-EQUAL Trap

What is AND-EQUAL?


There is a very old feature of Oracle that causes far more problems than it ever solved. One of the first questions asked by a novice SQL tuner is

  • "Can Oracle use two indexes on a table at the same time?"

The wise answer is "Yes. But..."
If a table contains two or more single column indexes, and two or more of those columns are used in a SQL using equals predicates, then it is possible for Oracle to use all of the indexes on the columns supplied. Oracle will scan each index separately for matching rows and then merge the rowids from each scan, producing a list of rowids that match every index lookup. Oracle calls the feature AND_EQUAL.
AND_EQUAL is used for low volume SQLs, and is not to be confused with Index Join, another multi-index trick for high volume queries.



Sounds good? Well it's usually not. What usually happens is that the indexes all return much the same rows and the AND-EQUAL merge does not really narrow the search down much, or one index is much better than the others and we would be better off just using that one.
Consider the following examples on a table with two single column indexes on columns A and B:

  • Column A has approximately 20 rows per key, and column B has approximately 1000 rows per key. An AND-EQUAL query will select the 20 rows from index A, 1000 rows from index B, and merge them to get between 0 and 20 rowids to look up. It would have been faster to ignore index B altogether and just look up every row matching index A.
  • Column A and column B both have approximately 1000 rows per key, but due to a quirk of the table the two columns are loosely aligned. Oracle will select the 1000 rows from each index, merge them and return close to 1000 rows. It would have been faster just to select one of the indexes and ignore the other one.
  • Column A and column B both have approximately 1000 rows per key, and combined they return an average of 5 rows per combined key. Whilst AND-EQUAL would return reasonable performance, a concatenated index on A and B would be much faster still, as it would take you straight to the 5 rows you want.

It is a common trap to believe that two single column indexes are (almost) as good as one concatenated index over both columns. This is never the case.
Fortunately, AND-EQUAL is never chosen automatically by the Cost Based Optimizer, so is now rare. It will only happen when:

  • You have not calculated statistics on the table (Oracle defaults back to the Rule based optimizer).
  • You specify the AND_EQUAL hint, or
  • You specify the RULE hint, or
  • You have the Optimizer Goal set to RULE (either with an ALTER SESSION command, or set database-wide by the DBA).

How do I stop it?


If you see an AND-EQUAL line in your Explain Plan output, check the information above and decide whether AND-EQUAL is really appropriate. If not, calculate the statistics, remove the AND_EQUAL or RULE hint, or if your Optimizer Goal is set to RULE, add a CHOOSE hint to the SQL.

Is AND-EQUAL ever useful?


Hardly ever. Here are some cases where you might use AND-EQUAL.

  • A table has two or more columns of similar, moderate cardinality (say 2 - 100 rows per key) that are each separately indexed. The cardinality of each index is good enough and used often enough to justify the indexes' existence. Normally, a query with equals clauses on each column would be fastest just using one of the indexes. But if the SQL is using only those indexed columns (ie. all cols in the SELECT, WHERE, GROUP BY and HAVING clauses are in those single column indexes) then Oracle can use AND-EQUAL to avoid accessing the table. The AND-EQUAL option without a table access will probably be faster than using one index and accessing the table.
  • Table T has 1 million rows. Columns A, B, and C each have approximately 100 rows per value. Any two combined will return on average 10 rows, and three together guarantee uniqueness. Users frequently query on any two or on all three. Bitmap indexes are inappropriate because there are too many unique values (approx 10,000) for each column. Concatenated indexes are inappropriate, as you would need to create at least three: ABC, AC, BC (this overhead is not really compelling; a better example would use four or more columns). AND-EQUAL would be OK here, because a two column query will read 100 rows from each index, returning 10 rows from the table. This would be quicker than using a single index and reading the whole 100 rows from the table to filter them down to 10.
  • This one is a real example - the only one I have ever seen. A Workers Compensation system records employees' injuries and illnesses. Every injury has 4 properties:
    1. Nature of Injury/Disease: eg. broken bone, burn, bruise, cancer ...
    2. Location of Injury: eg. eye, cranium, upper leg, liver, skin ...
    3. Mechanism of Injury: eg. brick, shovel, nicotine, uranium ...
    4. Agency of Injury: eg. Falling from height, struck by object, passive smoking...

To maintain a high quality of data, the regulatory authority distributes a list of invalid code combinations. eg. Broken liver, cancer caused by falling from a height, etc. The invalid combinations are always in pairs (ie. never three or four codes together), and are stored in a table called INVALID_INJURY_COMBINATIONS with 4 columns: NATURE, LOCATION, MECHANISM, and AGENCY. There are about 30,000 rows in the table. Every row has two columns null and two columns set, and the combination of the two set (not null) columns' values is unique.

  • Queries on the table are always on two of the four columns, making six possible query paths. Six concatenated indexes would be inefficient because for any two columns, there are only 5,000 rows where both columns are null, leaving 25,000 rows in the index, of which 20,000 are useless, having one null column. Single column indexes are efficient because the four of them average 15,000 rows each with no wastage.
  • For each column, there are between 20 and 200 unique values, giving a cardinality of between 75 and 750 rows per index key. Although Bitmap Indexes would provide good performance, they are inefficient because of the number of nulls in each column, and the relatively high number of unique values. With the AND-EQUAL option, any given query on two columns will read between 75 and 750 rows from each index, and merge the results to give either 1 or 0 rows. That's not bad!

Triggers


If an INSERT, UPDATE, or DELETE command is slow, check whether the table has any triggers. Either use a GUI tool like TOAD or the SQL below:



SELECT trigger_name, triggering_event, status
FROM all_triggers
WHERE table_name = 'MY_TABLE_NAME';



You may also query user_triggers if you own the table you are inspecting. Be wary of GUI tools and the all_triggers view (GUI tools may themselves be querying all_triggers): even if you have SELECT privileges on a table, you may not be able to see its triggers in all_triggers.
If a table does have a trigger, run the INSERT, UPDATE, or DELETE through SQL Trace to see why it is so slow. You will probably need to tune the trigger instead of the INSERT statement.
If you are doing thousands of INSERTs, UPDATEs, or DELETEs, then triggers might not be the best way to handle whatever it is they are doing. Perhaps the whole thing could be packaged up into a high volume PL/SQL routine.


Referential Integrity Constraints

Inserting and Updating


When a table contains foreign key constraints that reference a parent table, inserts and updates on a foreign key column result in Oracle running a SQL in the background that verifies the existence of the parent row. Even when you INSERT or UPDATE many rows in a single statement, the parent rows are still checked row-by-row and key-by-key. For example, inserting 1000 rows into a table with 6 foreign keys will execute an additional 6000 SQL statements behind the scenes.
There is no way to make Oracle perform these foreign key checks in bulk (eg. insert all rows, and then verify the foreign keys in one SELECT against the parent table).
One option is to disable the constraint before you perform the bulk-DML, and then re-enable it afterwards. The re-enable process will validate all rows together rather than individually, but it must validate the entire table, not just the rows you inserted/updated. Clearly this is a bad option if you are updating just a few hundred rows. The break-even point depends on the table, the environment, and the load; but it is commonly faster to disable/enable foreign keys if you are going to insert/update more than 1%-10% of the total rows in the table.
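A sketch of the disable/enable approach (the table and constraint names are hypothetical):

ALTER TABLE child_table DISABLE CONSTRAINT fk_child_parent;
-- ... perform the bulk INSERT / UPDATE here ...
ALTER TABLE child_table ENABLE CONSTRAINT fk_child_parent;  -- re-validates every row in the table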

Deleting


A Foreign Key constraint can be set up so that when a row is deleted from the parent table, all dependent rows are deleted from the child table, or the foreign keys in the child table are set to NULL. Even if ON DELETE CASCADE or ON DELETE SET NULL is not used, Oracle must still check that the child table contains no dependent rows. If you have a DELETE that is running slowly, check for foreign keys in a GUI tool like TOAD, or using the following SQL:



SELECT c.owner, c.table_name, c.constraint_name, c.delete_rule
FROM all_constraints c
WHERE c.constraint_type = 'R'
AND c.r_constraint_name IN (SELECT constraint_name
                            FROM all_constraints
                            WHERE table_name = 'MY_TABLE_NAME'
                            AND constraint_type IN ('P', 'U'));



If you find that your table is referenced by another, then there may be one of several problems:

  • There is no index on the child table foreign key, forcing Oracle to do a full table scan.
  • The Foreign Key is declared with ON DELETE CASCADE (or ON DELETE SET NULL), and there are a lot of rows to delete.
  • The Foreign Key is declared with ON DELETE CASCADE, and the child table has further tables referencing its primary key. One of the first two problems may apply to the child-child table.
  • A child table may have a Delete Trigger.

Foreign Keys are not indexed automatically. If a foreign key refers to a table where rows can be deleted, or where the primary key can be updated (God forbid), then the Foreign Key should be indexed. To find out what is really happening, trace the SQL and run it through TK*Prof.


Locking Problems


If you are running an INSERT/UPDATE/DELETE or a DDL statement (like TRUNCATE, or refreshing a Materialized View), the problem may not lie with your SQL. You may be locked out by another session or by the lack of a specific resource.
Whilst the SQL is running, run st.sql in another session and check the Wait Event. If the Wait Event mentions Lock, Latch, or Enq (enqueues), then your SQL is probably doing nothing but waiting. Show the Wait Event to your DBA and take it from there.
There are some locking problems that you may be able to solve yourself. If the Wait Event indicates Latch or Enqueue problems, then you almost certainly need help from the DBA. However Locks are pretty easy to diagnose.
If you run waiters.sql, it will list all of the sessions waiting on locks, the OS user id of the waiter, the table they are waiting for, and the session/name of the holder of the lock. If waiters.sql shows that your session is waiting on another, use the information provided on the holder and find out what they are doing and when they will be finished. You'd be surprised how many times you actually lock yourself out with another session.


Data Storage Problems


Some SQL problems are related not to the execution plan, but to the way Oracle is storing the data.

  • Chaining

Oracle stores table rows in blocks. After a row is inserted into a table, Oracle checks how much space is left in the block. If the amount of space remaining is less than a specified percentage (PCTFREE), then the block is removed from the free list and new rows will be inserted into the next available block. The free space left in each block is used for UPDATEs.
Updates can increase the size of a row in three ways:

  • NULL columns take up 0 bytes. Setting a NULL column to a not NULL value makes it take up space.
  • VARCHAR2 columns use only as many bytes as there are characters in the string. Eg. Updating 'Fred' to 'Barney' uses an extra 2 bytes.
  • NUMBER and DATE columns consume a variable amount of space. Dates can use 7 or 8 bytes depending on the time component, and large numbers or high precision numbers use more space. eg. The number 2 uses two bytes, whereas 2.8727862 uses six bytes.

If you don't leave enough space at the end of the block, then Oracle must move the row in order to update it - this is called chaining (strictly, row migration; Oracle counts migrated and chained rows together). Oracle migrates the row to another block, and leaves a pointer to the new block in the old block. When you want to read that row, Oracle must first read the old block, and then the new block. This effectively doubles the amount of IO to read that row.
After you analyze a table, check the CHAIN_CNT column in the USER_TABLES or USER_TAB_PARTITIONS view. This is the number of chained rows in the table. If this number is high (more than 1 or 2% of the rows in the table) then it is a concern. The only way to unchain rows is to rebuild the table or partition.
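Note that CHAIN_CNT is populated by the ANALYZE command, not by DBMS_STATS. A quick check (the table name is hypothetical):

ANALYZE TABLE my_table COMPUTE STATISTICS;
SELECT num_rows, chain_cnt,
       ROUND(100 * chain_cnt / NULLIF(num_rows, 0), 2) AS pct_chained
FROM user_tables
WHERE table_name = 'MY_TABLE';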
To prevent chaining from recurring, you need to analyze how you are populating the table.

  • If the chaining is a result of a once-only UPDATE, say adding and populating a new column, then the DBA can just rebuild the table and the problem will go away.
  • If the table contains a lot of VARCHAR2 or one or two large VARCHAR2 columns that are often updated to larger strings, then you will need to increase the table's PCTFREE value. See your DBA.
  • If you have a table where you insert a minimal row with mostly NULL columns and then populate them with UPDATE statements, then you may be able to fix this by updating the PCTFREE attribute. However this is a bad technique and should be avoided. Consider using a Global Temporary Table for a more efficient two-step insert. eg. Insert base details into a Global Temporary Table, then use that data plus data from another source to insert into the target table.

Note that if PCTFREE is set too high, then the table will contain a lot of empty space and Full Table Scans will be slower.


The High Water Mark problem


Oracle tables have an attribute called the High Water Mark. This is the highest numbered block in the table that has ever contained a row. When an INSERT statement cannot find a free block to populate, it goes straight to the High Water Mark and allocates a new block.
When rows are deleted, the High Water Mark does not come down. This is not normally a problem, except for Full Table Scans, because a Full Table Scan must search every block beneath the High Water Mark for data.
If a table contains 1,000 rows, then you would expect a full table scan to be fairly speedy. But if it once contained 100,000 rows, then you would be wrong: the full table scan of the 1,000 rows will take roughly the same amount of time as a 100,000 row full table scan.
If the table is continually growing and shrinking (through regular deletes), then you should look for a different way to manage it. Perhaps it can be partitioned, so that instead of deleting rows you truncate or drop partitions.
If the table is not going to grow back to full size in the medium term, then it should be dropped and rebuilt by the DBA to move the High Water Mark back down to an appropriate level.
A related problem is the Empty Hash Cluster, described below.


Fragmented Indexes


Since (b-tree) indexes are stored in a sorted order, updating an indexed column usually involves deleting it from one position in the index and inserting into another place. If this happens often enough, the index can become riddled with holes. Range scans on the index will read more blocks than they have to because each block is under-utilised.
This can also happen when many rows are deleted; a common occurrence in a fast-refresh Materialized View.
The only solution is to rebuild the index when it becomes fragmented.
ALTER INDEX my_index REBUILD [PARTITION part_name]
A script that compares each index's actual size with its theoretical minimum size can be used to list fragmented indexes. If you have fast-refresh materialized views, you should run such a check regularly.




Note that such a query does not take into account block headers (a small space overhead in each block) or the prefix size of compressed and prefixed partitioned indexes. Also, bitmap indexes, bitmap join indexes, function-based indexes, and domain indexes are excluded. Space usage for these index types will differ from the standard calculated here.
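A simpler rule of thumb is to validate an index and look at the proportion of deleted leaf rows; more than about 15-20% suggests a rebuild is due (the threshold is a rule of thumb, not a hard rule). Note that VALIDATE STRUCTURE locks the table against DML while it runs:

ANALYZE INDEX my_index VALIDATE STRUCTURE;
SELECT name, height, lf_rows, del_lf_rows,
       ROUND(100 * del_lf_rows / NULLIF(lf_rows, 0), 1) AS pct_deleted
FROM index_stats;  -- holds the results of the most recent VALIDATE STRUCTURE only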


Empty Hash Cluster


A hash cluster is like a wall of Post Office Boxes, or pigeon holes. In a regular table, a row can be placed in any old block, but in a hash cluster a row belongs in a specific block based on its cluster key (one or more columns of the table).
When you create a hash cluster, you must nominate how much space to reserve for each hash key (ie. the size of the pigeon hole) and how many hash keys to allow (the number of pigeon holes). The minimum size of the hash cluster is therefore the SIZE attribute multiplied by the HASHKEYS attribute. eg. A hash cluster with 2Kb per hash key and 100,000 hash keys will have an initial size of 200Mb.
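For example, the cluster just described might have been created like this (the names are hypothetical):

CREATE CLUSTER cust_clust (cust_no NUMBER)
SIZE 2048 HASHKEYS 100000;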
Oracle sets the High Water Mark of the cluster to this pre-allocated size. Like the High Water Mark problem in regular tables, this creates problems if the hash cluster is empty or under-utilised. A full table scan must read every block in the cluster, not just those blocks containing rows. In the example above, even if the cluster is empty, the SQL SELECT * FROM my_clust_table will take some time because it must read the entire 200Mb.
The solution is to avoid full table scans completely, or create your cluster at a sensible size and fill it up. This means that if the table/cluster is continually growing then the DBA will have to regularly rebuild it to maintain its efficiency. If you are performing regular full table scans, then perhaps a Hash Cluster is the wrong structure for you. You might be better off with an indexed cluster or a regular old table and index.


Low volume SQLs


There are two classes of low volume SQLs (of course, it is possible for a single SQL to span both classes):

  • SQLs on small tables. When accessing a small table (say with 500 or fewer rows) it is surprising how often a Full Table Scan will out-perform indexed access. Typically no SQL on a small table will run for very long at all; however, the cumulative effect of a small performance loss over many iterations can become enormous. For example, a loss of one twentieth of a second (0.05 sec) over 100,000 iterations will add over 80 minutes to a report or batch job.
    The way to tell which is better is to try both. Use an INDEX() or HASH() hint to encourage indexed / hash cluster access respectively, or FULL() to encourage a Full Table scan, and SQL*Trace both. Compare the TK*Prof output to determine which is faster.
  • SQLs on large tables. A low volume SQL on a large table should always be using an index or hash cluster. Use Explain Plan to obtain the execution plan. Is it using the index or hash cluster you expect? The problem may be one of the following:
    • The index is missing. Is there an index covering your columns? Has it been dropped?
    • The index exists but is not being used, or the wrong index is being used. The Cost Based Optimizer may choose not to use the index, or there may be something in the SQL that stops the index being used.
    • The index is being used with a Range Scan. There are two types of index scans - Unique Scan and Range Scan. A Unique Scan is guaranteed to read no more than 1 row, but a Range Scan can read any number of rows. A Range Scan may be processing more rows than you expect; see Index Range Scans below to find out why this might be the case.
    • The table is not really hash clustered. You think a table is hash clustered but it's really not.
    • The hash cluster exists but Oracle won't use it. You've gone to the effort of creating a hash cluster and now Oracle won't use it.
    • You have sub-query blowout. There is a particular problem with correlated sub-queries that blows out if your sub-query has to scan multiple rows.
    • The hash cluster is inefficient. When you access a hash cluster, it is possible that you may be reading more rows or more blocks than you think.

Check for a missing index


If you think Oracle should be using an index to resolve your query and it is not doing so, then make sure the index exists. Here are two ways to check:

    • Using Oracle Enterprise Manager (OEM), open up your database in the Navigator window; open the Schema folder; open the schema that owns your query table; open Tables; find your table name and open it; open Indexes. Click on each index listed, and the indexed column(s) will show in the right hand pane.
    • For those who don't have OEM (and those who find it too slow), download Display_stats.sql and run it from SQL*Plus as follows:
      @Display_stats.sql my_table_name
      where my_table_name is the table you expect to be indexed. Display_stats.sql will list the columns in the table, followed by a list of the indexes including their uniqueness, followed by the columns of each index.

If you can't find the table, perhaps it is a VIEW, or a SYNONYM to a table in another schema.

  • Do any indexes exist? If so, is there one where the leading columns of the index match the columns of your WHERE clause?
  • If the answer is NO to either of these questions, then the problem is that there is no index that covers your SQL. Perhaps the index has been accidentally dropped, or it never existed at all. You have four choices:
    1. If the index has been dropped accidentally, get the DBA to re-create it.
    2. Restructure your query. Perhaps the structure of your database permits you to retrieve the information in another way that does contain an index.
    3. How bad is the performance really? Is it unacceptably slow? If not, then doing nothing is a valid course of action. But consider that performance will deteriorate if the table is growing.
    4. Discuss with the DBA the prospect of adding a new index. Providing the index is efficient, and you have genuinely considered and rejected the options above, the DBA should have no objections. However, if the index is inefficient, then it will not help your query. You could discuss with the DBA the merits of partitioning, or a single-table hash cluster.

The index is being ignored


If you are here, you have already:

    • Run your SQL through Explain Plan and confirmed that the desired index is not being used.
    • Checked the statistics on the underlying table.
    • Confirmed that the index exists.

Oracle will ignore an index for a number of reasons, but they boil down to two possibilities:

    • The structure of the SQL is such that Oracle cannot use an index
    • Oracle's cost based optimizer is electing not to use the index

To find out which of these is true for your SQL, add an INDEX hint to your SQL.


eg.
SELECT /*+ INDEX(a, i_empno) */ a.*
FROM emp a
WHERE a.emp_no = 111
Now run the SQL through Explain Plan. Is the index being used now? If the index is being used in a FULL SCAN, then this is pointless: it is just reading every row from the index. If the index is now being used with a UNIQUE or RANGE scan, it means that the Cost Based Optimiser doesn't think the index is very useful. Have you checked the statistics? Try running the SQL again; is the performance better now? If the performance is still poor, then Oracle was probably right to ignore the index - you almost certainly have a Range Scan problem.
  • If the INDEX hint did not work (ie. Oracle is still using Full Table Scan, Full Index Scan, or a different index), then there is something about the SQL preventing the index from being used.
    • How are indexed column(s) used in the WHERE clause?
      • Are they altered by functions, arithmetic, concatenation?
      • Are they used with NOT or != operators?
    • Are you using the index within OR or NOT predicates?
    • If the index is a multi-column (concatenated) index, are you using the leading columns?
    • Are you selecting from a view, or a nested query?
    • Do you have a two table join? What order are the tables joined in?
    • Are you trying to use Bitmap Indexes?
    • Do you have a star schema query? Are you using Oracle's star query techniques?
    • Are you casting an indexed column from a string to a number?
    • Is the indexed column used with a sub-query clause?
      eg.
    • WHERE col IN (SELECT ...)
    • WHERE col = (SELECT ...)
    • WHERE col >= ANY (SELECT ...)
    • WHERE col <= ALL (SELECT ...)
    • WHERE col NOT IN (SELECT ...)

One of the bullet points above should determine why the index is not being used. If not, then work on the process of elimination. First confirm that Oracle recognises the index - run the following through Explain Plan:

SELECT /*+ INDEX(a, index_name)*/ *
FROM table_name a
WHERE index_col = :val
If this simple SQL refuses to use the index, then there is a fundamental problem. Get the DBA to drop and rebuild the index.
If the sample SQL above does use the index, then start with your original SQL and remove lines one at a time, running each successively smaller SQL through Explain Plan as you go. Somewhere between the original SQL and the one above, the index will start to work. Once you find the problem, check the bullet points above to see whether any of them explain the fault. If not, discuss the problem with the DBA.


Index Range Scans



A Range Scan is any scan on an index that is not guaranteed to return zero or one row. ie. A SQL that uses a Unique index and supplies every column in the unique index in an equals clause will result in a Unique Scan, anything else is a Range Scan.



We can tell if Oracle is using a Range Scan in Explain Plan by the presence of a line similar to the following:
[UNIQUE] INDEX RANGE SCAN INDEX_NAME
The problem is that it is impossible to tell how many rows the range scan is scanning. A range scan that scans 5 rows on average will probably be fairly efficient. However a range scan scanning 10000 rows on average will probably be causing grief.



Inefficient range scans can have a number of causes:


  • Low cardinality index key
  • Use of range predicates such as <, >, LIKE, and BETWEEN
  • Partial use of a concatenated index

Low cardinality index keys



If an index has poor cardinality (ie. more than 4% of rows with the same index key) then it will perform poorly; it will usually be faster to perform a full table scan. eg. Table SALES has an index on the column PAYMENT_METHOD, which can contain values such as COD, CREDIT, CHEQUE, CASH. The statement:
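SELECT * FROM sales WHERE payment_method = 'CHEQUE'  -- the value shown is illustrative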




will probably perform so badly that you are better off without the index.


Range Predicates



Oracle can perform index scans when you use the comparison operators =, <[=], >[=], LIKE, or BETWEEN. With all except =, Oracle can return a range of different index key values. For example, consider an index on sales.sale_date. The statement:
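SELECT * FROM sales
WHERE sale_date BETWEEN to_date('01-JAN-2003','DD-MON-YYYY')
                    AND to_date('31-DEC-2003','DD-MON-YYYY')  -- an illustrative one-year range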




can use the index (with a range of an entire year), but it is going to scan so many rows that it would be quicker to perform a full table scan.



A special case of the Range Predicates problem involves Range Keyed Tables, where the primary key of a table is a concatenated low and high value representing a range.


Partial use of a concatenated index



If your index has 2 or more columns (ie. a concatenated - or composite - index), then you don't need to scan on all of the columns in the index. Any query that does not use every column in the concatenated index with an = comparison will result in a range scan. See Concatenated Indexes for more information on problems with this type of index.

Check that a table is clustered


If you think Oracle should be using hash cluster access to resolve your query and it is not doing so, then make sure the table is really hash clustered:
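-- USER_TABLES records the cluster (if any) a table belongs to; the table name is hypothetical
SELECT cluster_name FROM user_tables WHERE table_name = 'MY_TABLE';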



One of three things will be returned:

  • The SQL returns no rows. MY_TABLE may not be a table: it may be a VIEW, or a SYNONYM over another schema's table. Check USER_VIEWS and USER_SYNONYMS to find out. If it is a view, then the SQL of the view may be too complex for your WHERE clause to be merged. If it is a synonym, try the following query instead:
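    -- resolve the synonym, then check the underlying table (a sketch using USER_SYNONYMS)
    SELECT t.cluster_name
    FROM all_tables t, user_synonyms s
    WHERE s.synonym_name = 'MY_TABLE'
    AND t.owner = s.table_owner
    AND t.table_name = s.table_name;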
  • The SQL returns a single NULL value. This means the table is not clustered. To cluster an unclustered table, you will need the DBA to rebuild it. If the table is not supposed to be clustered after all, go back and check the indexes again.
  • The SQL returns the name of a cluster - the table is clustered. Go back and check for other problems.

Oracle will not perform a Hash Cluster scan


If you are here, you have already:

  • Run your SQL through Explain Plan and confirmed that the desired cluster scan is not being used.
  • Checked the statistics on the underlying table.
  • Confirmed that the table is clustered.

Oracle will refuse to perform a hash cluster scan for a number of reasons, but they boil down to two possibilities:

  • The structure of the SQL is such that Oracle cannot use a hash cluster scan
  • Oracle's cost based optimizer is electing not to use the hash cluster scan

To find out which of these is true for your SQL, add a HASH hint to your SQL.



eg.
SELECT /*+ HASH(a) */ a.*
FROM emp a
WHERE a.emp_no = 111
Now run the SQL through Explain Plan. Is the cluster being used now? If the cluster is now being scanned, it means that the Cost Based Optimiser doesn't think the cluster is very useful for your SQL. Have you checked the statistics? Try running the SQL again with the HASH hint; is the performance better now? If the performance is still poor, then Oracle was probably right to ignore the cluster - you almost certainly have a Bad Cluster problem.
If the HASH hint did not work (ie. Oracle is still using Full Table Scan or an Index Scan), then there is something about the SQL preventing the hash cluster scan from being used.

  • How is the cluster key used in the WHERE clause?
    • Is any part of it altered by functions, arithmetic, concatenation?
    • Is any part of it used with NOT or != operators?
  • Is the cluster key used within OR or NOT predicates?
  • If the cluster key contains multiple columns (concatenated key), are you using just the leading columns?
  • Do you have a two table join? What order are the tables joined in?
  • Are you casting the cluster key from a string to a number?
  • Is the cluster key column used with a sub-query clause?
    eg.
  • WHERE col IN (SELECT ...)
  • WHERE col = (SELECT ...)
  • WHERE col >= ANY (SELECT ...)
  • WHERE col <= ALL (SELECT ...)
  • WHERE col NOT IN (SELECT ...)

One of the bullet points above should determine why the hash cluster scan is not being used. If not, then use a process of elimination. First confirm that Oracle recognises the hash cluster - run the following through Explain Plan:
SELECT /*+ HASH(a)*/ *
FROM table_name a
WHERE clust_col = :val
If this simple SQL refuses to perform a hash cluster scan, then there is a fundamental problem. Get the DBA to rebuild the cluster.
If the sample SQL above does use a hash cluster scan, then start with your original SQL and remove lines one at a time, running each successively smaller SQL through Explain Plan as you go. Somewhere between the original SQL and the one above, the hash cluster scan will start to work. Once you find the problem, check the bullet points above to see whether any of them explain the fault. If not, discuss the problem with the DBA.


Sub-query blowout


A correlated sub-query is one where the sub-query references columns in the outer query. With few exceptions, Oracle will evaluate a correlated sub-query as a Nested Sub-Query.
For example, the table CUST stores historical images of changes to the table, and differentiates rows with cust_seq. The latest details are on the row with the highest cust_seq.
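A query to fetch the latest details might look like this (a sketch; the column names are assumed):

SELECT *
FROM   cust a
WHERE  a.cust_no = :cust_no
AND    a.cust_seq = (SELECT MAX(b.cust_seq)
                     FROM   cust b
                     WHERE  b.cust_no = a.cust_no);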



If the table contains an average of 5 rows per customer, then the outer query will return 5 rows, and then execute the inner query for each of the 5 rows. The inner query also returns 5 rows, but it does this 5 times. ie. The above query will read 5 + 5 x 5 = 30 rows, even though the table only contains 5 rows for the customer. Evidence of this processing can be seen in TK*Prof, with seemingly small queries processing hundreds of blocks in the Disk and/or Query columns.
This effect tends to go unnoticed, because it is still pretty quick to query 30 rows; it's not like you have to wait 10 minutes for the query to return. Problems start to occur when:

  • The number of rows increases. eg. The same query with 50 rows per customer will process 50 + 50 x 50 = 2550 rows.
  • The SQL is performed in a loop that processes thousands of different keys. For example, an end of month invoicing program may run the SQL above for every single customer.
  • The sub-query contains a similar sub-query. Consider for example a query with correlated sub-queries nested 3 deep, each selecting 5 rows per key (I have actually seen this!). The top level will select 5 rows. The second level will select 5 rows 5 times = 25 rows. The third level will select 5 rows 25 times = 125 rows. The final level will select 5 rows 125 times = 625 rows. A total of 5 + 25 + 125 + 625 = 780 rows processed for a single "low volume" query.

How to fix it


The example above can be fixed with an index on CUST_NO, CUST_SEQ. Rather than scanning the entire range of rows with a matching CUST_NO, Oracle will just pick the first (MIN(CUST_SEQ)) or last (MAX(CUST_SEQ)) one from the index range without having to read the rest. There are restrictions on this technique though:

  • The table and index must be analysed.
  • You may not use the Rule-Based Optimizer. This means that the OPTIMIZER_MODE (or OPTIMIZER_GOAL) session parameter cannot be RULE, nor may you use the /*+ RULE*/ hint.
  • The sub-query may only SELECT the MAX or MIN of one of the indexed columns.
  • The WHERE clause must have equality (=) conditions on all of the index columns preceding the MAX/MIN column, and it may optionally have an equality or range condition (<, >, BETWEEN, LIKE) on the MAX/MIN column, but may have no other WHERE conditions.

If these conditions are not met, then the index solution will not work; the query must be re-written. The best solution is usually to process the results in PL/SQL without using a sub-query. Alternatively, if it is possible to write the SQL as a table join or with analytic functions instead of a sub-query, then it will usually run without blowing out like the sub-query.
When you re-write the SQL to eliminate the sub-query, compare the new version to the old in TK*Prof to make sure you have fixed the problem.
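As a sketch of the analytic-function rewrite for the CUST example above (assuming the same column names), each row is read once and no sub-query is executed:

SELECT *
FROM  (SELECT c.*, MAX(c.cust_seq) OVER (PARTITION BY c.cust_no) AS max_seq
       FROM   cust c
       WHERE  c.cust_no = :cust_no)
WHERE  cust_seq = max_seq;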



Bad Hash Clusters
Hash clusters are a great way to reduce IO on some tables, but they have their downside.

  • If too little space is reserved for each key, or if the cluster is created with too few hash keys, then each key will split across multiple blocks, negating the benefits of the cluster.
  • If too much space is reserved for each key, or if the cluster is created with too many hash keys, then the cluster will contain thousands of empty blocks that slow down full table scans (see Empty Hash Cluster).

Oracle provides some reasonable advice on selecting the size per cluster key and number of hash keys. See the Oracle online doco SQL Reference manual, CREATE CLUSTER statement.
How can you tell if your cluster is inefficient?
You need to know two things: the distribution of your data, and the config settings of your cluster.
Run the following SQLs:
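Sketches of the sort of queries required (all object names are placeholders; avg_row_len relies on up-to-date statistics, and the second query assumes a single-table cluster keyed on cluster_key):

SELECT c.cluster_name, c.key_size, c.hashkeys, c.avg_blocks_per_key,
       t.avg_row_len AS avg_row_length
FROM   user_clusters c, user_tables t
WHERE  c.cluster_name = 'MY_CLUSTER'
AND    t.cluster_name = c.cluster_name;

SELECT COUNT(*)                      AS keycount,
       AVG(cnt)                      AS mean,
       STDDEV(cnt)                   AS stddev,
       AVG(cnt) * MAX(r.avg_row_len) AS avg_key_size
FROM  (SELECT cluster_key, COUNT(*) AS cnt
       FROM   my_clustered_table
       GROUP  BY cluster_key),
      (SELECT avg_row_len FROM user_tables
       WHERE  table_name = 'MY_CLUSTERED_TABLE') r;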






In the first SQL, Oracle reserves key_size bytes for each hash key, and creates hashkeys hash keys. ie. The minimum table size is therefore key_size x hashkeys bytes.
In the second SQL(s), keycount is the number of different cluster keys, mean is the average number of rows per cluster key, and stddev is the standard deviation of the number of rows per cluster key.

  • Is hashkeys bigger than keycount (use the highest value of keycount if there are multiple tables in the cluster)? Much bigger; say 25% bigger?
    If so, you have a lot of empty blocks waiting to be filled. If you are expecting lots of new key values in the future, this is OK. But in the meantime your Full Table Scans will be slow.
  • Is hashkeys smaller than keycount (use the highest value of keycount if there are multiple tables in the cluster)?
    If so, then you have multiple cluster keys hashing to the same hash key and sharing the same space. This is only a problem if avg_blocks_per_key is greater than 1, because it means you are reading 2 or more blocks when you only need to read one. If that is the case, you should either make key_size bigger (but no bigger than the size of 1 block - ask your DBA what block size you are using), or make hashkeys bigger.
  • If hashkeys is similar to keycount, then is avg_blocks_per_key greater than 1?
    This means rows for one cluster key are spilling across multiple blocks. If key_size is greater than 1 block, then this is OK because you expect so many rows per cluster key that they cannot physically fit into one block. If key_size is smaller than one block, then make it bigger.
  • Is the avg_key_size less than 75% of key_size?
    If so, this may be OK if stddev is relatively large (say more than 25% of mean), otherwise it's a waste of space and will slow down full table scans. Larger values of stddev mean that the number of rows for any given cluster key vary significantly from the mean, so more space needs to be allocated to avoid splitting cluster keys. However if stddev is relatively small, then key_size should be set closer to avg_key_size to save space.

High Volume SQLs


In many respects, high volume SQLs are easier to tune than low volume SQLs because the solution is always the same:

  • Avoid reading data you don't need.
  • Avoid reading the same block twice (or more).
  • Avoid indexes.
  • Avoid Nested Loop joins.

In practice it is more complex. Although the goal is conceptually simple, it can sometimes be hard to achieve.
The first step is to run your SQL through Explain Plan. Check the execution plan for the following:

  • Index scans: unique, range, full, or fast full.
  • Nested Loop joins or Nested sub-queries.

Did the above steps help? By now, your SQL should contain only Full Scans and Merge or Hash joins (plus any Index scans and Nested Loop joins described in the links above as acceptable for High Volume SQL). If performance is still unacceptable, check the following:



Still want more? If you still have Nested Loops and Indexed table accesses, then either go back and try again or start redesigning your schema.
Otherwise, you should have a nice Full Table Scanning, Hash or Merge Joining query. If you still want to squeeze better performance out of it, then you are going to have to reduce IO. This means reading fewer blocks, which is going to be hard because if you have applied all of the principles above then you probably have little or no redundancy in your query (blocks being read twice or more). There are a couple of things you could do:

    • Use indexes to avoid reading a table. Indexes don't store useless columns, so an index can fit more rows per block than the corresponding table. Fewer blocks means less IO.
      • All columns already in an index? Try a Fast Full Scan of the index using the INDEX_FFS hint.
      • All columns already indexed, but in different indexes? Try an Index Join.
      • In desperation, you could create a new index over just the columns you are selecting. Or add columns to an existing index. Speak to your DBA first.
      • Depending on the other processes using the table, you could convert it to an Index Organised table and store unwanted columns in the overflow segment. This is a pretty big step; talk to your DBA.

  • You can reduce IO by discarding fewer rows from a Full Table Scan. If you have a Full Table Scan, and a WHERE predicate is discarding some of the rows, then you may be able to avoid reading those rows by partitioning the table so that a Partition Prune ignores the non-matching rows. Speak to your DBA.
  • If your SQL contains aggregation or joins, consider pre-calculating the summary / join in a Materialized View.
  • Two or more tables joined with equals joins would join faster if they were clustered on the join key. Speak to your DBA.
  • Are you joining rows in Table A to matching rows in Table B, using full table scans and hash joins? Are both tables big, with most of the rows from Table B going unused? If you were to Hash Cluster Table B on the join key, then the Hash Cluster Access would give you all the advantages of a Hash Join without having to read the non-matching rows. Speak to your DBA.

High Volume Index Scans


If your high volume SQL is using an index, it may be a problem. There are 6 main types of index scan; in Explain Plan you will see one of the following:
[UNIQUE] INDEX RANGE SCAN
Oracle is reading 0 or more contiguous rows from the index.
[UNIQUE] INDEX RANGE SCAN (MIN/MAX)
Oracle is identifying 0 or more contiguous rows in the index, but is reading only one (the first or the last) in order to satisfy a MIN or MAX aggregate function.
UNIQUE INDEX UNIQUE SCAN
Oracle is reading 0 or 1 rows from the index.
[UNIQUE] INDEX FULL SCAN
Oracle is reading all rows from the index, and may be accessing these rows in the underlying table.
[UNIQUE] INDEX SKIP SCAN
Oracle is reading 0 or more rows from different parts of the index, and may be accessing these rows in the underlying table.
[UNIQUE] INDEX FAST FULL SCAN
Oracle is reading all rows from the index, and is not accessing these rows in the underlying table. ie. The index contains all columns required to resolve the query without having to lookup the table.




In a high volume SQL, the last index scan type above is not a problem. The first five types are usually only a problem if Oracle is also performing a table access along with the index scan. The table access will appear in your Explain Plan immediately above the index scan. eg:



TABLE ACCESS BY INDEX ROWID table_name
  INDEX RANGE SCAN index_name
If your SQL is not performing a table access, but the index uses a RANGE SCAN (but not a RANGE SCAN MIN/MAX), you could still have a Range Scan problem, where the range scan is processing a lot more rows than you expect. Otherwise, an index scan that does not have a TABLE ACCESS is not necessarily a problem; try performing a full table scan instead and compare the performance.

  • If you have an INDEX() hint, remove it.
  • If you have a RULE hint, remove it.
  • Add a FULL hint to force a full table scan.

Does the full table scan perform better? If not, then one of the following will apply:

  • You may have a Low Volume SQL, or a combination Low and High volume SQL.
    Run the original SQL through TK*Prof with EXPLAIN=uid/pwd and check how many rows are selected from the index. If the TK*Prof row counts show zeros, make sure you exited your session before running TK*Prof, or check with the DBA in case that feature has been disabled on your database. If the rows selected from the index are less than say 4% of the total rows in the table or partition, then Oracle is probably right to use the index.
  • One of the other High Volume SQL problems apply in addition to the index problem. Leave the SQL performing a Full Table Scan and try one of the other fixes.

High Volume Nested Loops Joins


Nested Loops joins and Nested Sub-Queries are fairly intuitive. The "Nested" term means that something is performed iteratively: for each A, do B. In the case of a Nested Loops join: for each row in Table A, look up the matching rows in Table B. For a Nested Sub-Query: for each row in the outer query, execute the sub-query.
This iterative nature is fine when only a few loops are performed, but is generally not scalable to large volumes. The much faster way to handle large data volumes is to use a Hash join or a Sort-Merge join.
In a high volume SQL, you can tell if you are doing a Nested Loops join or Nested Sub-Query by checking the Explain Plan for one of the following:
NESTED LOOPS
Usually the result of a table join in the FROM clause of your SQL, but can also appear when you use a WHERE col IN (sub-query). Both of these can usually be converted to a Hash join or a Sort-Merge join.
FILTER
Usually the result of a Nested Sub-Query, but can also sometimes occur when using a View. To tell the difference, count how many component steps are attached to the FILTER step. If there is only one (this is rare), then the FILTER is not a problem; FILTER is just eliminating some rows based on a simple WHERE or HAVING condition. If there are two, then it means Oracle is using the second step to filter rows from the first. ie. The SQL in the second step (usually a sub-query) is run for every row returned from the first step.



The only occasions when a Nested Loops join is acceptable in a high volume SQL are:

  • When the driving (first) table will return 0 or 1 row. If you have a join query where one of the tables is supplied with the whole of a primary or unique key, Oracle can retrieve the row (if there is one) and then perform a full table scan on the second table. This is more efficient than either a sort-merge or a hash join.
  • When the outer (second) table is very small (ie. fewer than 100 rows) and can fit into a single block. Since a single block is the smallest amount of data Oracle can read, a table that fits into a single block can be accessed very fast with a Full Table Scan.
  • When the outer (second) table is not accessed. ie. All columns required for the SQL can be obtained from the index. This will be obvious in the Explain Plan, because there will be an INDEX SCAN step but no TABLE ACCESS step.
  • When the outer (second) table is an Index-Organized Table, and is accessed using the Primary Key.
  • When the outer (second) table is stored in a hash cluster, and is accessed via a Hash Cluster Scan.

Otherwise, if you have a Nested Loops join or a Nested Sub-Query in your high volume SQL, you must convert it to either a Hash join or a Sort-Merge join. Try the following techniques:

  • For Nested Loop joins where the two tables are both in the FROM clause, add an ORDERED hint and a USE_HASH or USE_MERGE hint (see the sketch after this list). This will only work for equals joins (use a hash join) or >, <, BETWEEN (use a sort-merge join). If you join with LIKE, != or NOT predicates, Oracle has no choice but to use Nested Loops. Avoid these types of joins at all costs.
  • For all sub-queries, first of all try an UNNEST hint. This is a very powerful hint that internally restructures your query to make it use a sort-merge or hash join. Take a look at the Explain Plan. If the FILTER step is still there, try the steps below.
  • For WHERE EXISTS single table sub-queries, try a Semi-Join. If this doesn't work, try rewriting the query as a join.

  • For WHERE NOT EXISTS sub-queries, try re-writing the SQL as an outer join with a b.ROWID IS NULL clause (see the sketch after this list).
  • For IN and NOT IN sub-queries, eliminate the correlation - correlated IN sub-queries must use a Nested Loops join.
  • For IN sub-queries that are not correlated, try an ORDERED hint and a USE_HASH or USE_MERGE hint. In this way, Oracle can run the outer query and the sub-query independently and perform a hash join.
  • For NOT IN sub-queries that are not correlated, try an Anti-Join.
  • For correlated col = (sub-query) sub-queries, Oracle must use a Filter. Try to re-write the SQL as a table join.
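As a sketch of the hint-based and NOT EXISTS rewrites above (table and column names are invented for illustration):

-- Nested Loops join forced to a hash join
SELECT /*+ ORDERED USE_HASH(b) */ a.col1, b.col2
FROM   table_a a, table_b b
WHERE  a.key_col = b.key_col

-- WHERE NOT EXISTS re-written as an outer join
SELECT a.*
FROM   table_a a, table_b b
WHERE  a.key_col = b.key_col (+)
AND    b.ROWID IS NULL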

Have you eliminated the Nested Loops join or Filter? If so, is performance any better? If not, then one of the following will apply:

  • You may have a Low Volume SQL, or a combination Low and High volume SQL.
    Run the original SQL through TK*Prof with EXPLAIN=uid/pwd and check how many rows are processed in the Nested Loops or Filter. If the TK*Prof row counts show zeros, make sure you exited your session before running TK*Prof, or check with the DBA in case that feature has been disabled on your database. If the row count is less than say 4% of the total rows in the second table of the join or filter (and is using an index on that table), then Oracle is probably right to use the nested loops / filter.
  • One of the other High Volume SQL problems apply in addition to the nested loops problem. Leave the SQL performing a Merge / Hash join and try one of the other fixes.


In summary:

  • All regular joins should be able to use Hash or Sort Merge, except LIKE, !=, and NOT ... joins.
  • WHERE EXISTS sub-queries can use Hash or Sort Merge Semi-Joins
  • WHERE NOT EXISTS sub-queries will always perform a Filter. They must be re-written.
  • WHERE col IN (sub-query) can use Hash or Sort Merge joins if they are not correlated.
  • WHERE col NOT IN (sub-query) can use Hash or Sort Merge anti-joins if they are not correlated.
  • WHERE col = (sub-query) can avoid a Filter only if the sub-query is not correlated.
  • WHERE [NOT] col { [NOT] != | {>|<}[=] [ANY|ALL] } (sub-query) must use a Filter. Re-write a different way.



Partition Pruning

What is Partition Pruning?


Some tables are so large that a Full Table Scan is unthinkable. If these tables are not partitioned, they should be.
The most common (by far) form of partitioning is Range Partitioning on a DATE column. For example, a table may be partitioned on TXN_DATE, with a separate partition for each year, month, or even day (I have seen a multi-terabyte database where daily partitions were the norm!).
If Oracle can use the WHERE predicates to eliminate some partitions from a search, then it probably will. This is called Partition Pruning. For example:
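A sketch, assuming a sales table range-partitioned by month on txn_date:

SELECT SUM(amount)
FROM   sales
WHERE  txn_date BETWEEN '01-JAN-2003' AND '31-MAR-2003'

Here Oracle needs to scan only the partitions that can contain those dates.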



Explain Plan lines showing PARTITION RANGE SINGLE or PARTITION RANGE ITERATOR indicate that Oracle is performing a Partition Prune. A line of PARTITION RANGE ALL indicates Oracle is scanning all partitions. Depending on your Explain Plan tool, there is one confusing situation: if Oracle can work out exactly which partition or partitions to scan, then the step is removed from the plan. eg.
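If the WHERE clause identifies exactly one partition, the plan may show nothing more than (table name illustrative):

TABLE ACCESS FULL SALES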



This looks like Oracle is performing a Full Table Scan of all partitions, but it is not. If that were the case, the plan would look like this:
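PARTITION RANGE ALL
  TABLE ACCESS FULL SALES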



New Explain Plan tools will show the actual partition number(s), thus eliminating this confusion.

When can I use Partition Pruning?


There are three ways to exploit partitions for performance:

    • Use your range partition key in =, <[=], >[=], BETWEEN, or LIKE predicates, comparing the key to either literals, bind variables, literal / bind variable expressions, or non-correlated sub-queries. eg.
      • col = :my_date
      • col BETWEEN :my_date AND :my_date + 3
      • col = (SELECT processing_date FROM current_processing_date)
      For List and Hash partitions, use = or IN predicates.
    • Perform a join to a partitioned table using the partition key in an equals clause, where one of the above rules can be derived transitively. eg.
      SELECT a.* FROM table_a a, big_partitioned_table b WHERE a.calendar_date > :a AND a.calendar_date = b.calendar_date

Here, Oracle can use transitive rules to learn something about the partition key. eg. If A > :x, and A = B, then B > :x.

  • Perform a partition-wise join. If you have two tables that are partitioned the same way, then even if you have to scan the entire table, you can make the Hash or Sort-Merge join faster by joining matching partitions. eg.
    In the plan sketch below, notice that the join is being done within the partition range loop. This means Oracle is joining partition to partition, not table to table.
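A sketch of such a plan (assuming two tables range-partitioned the same way):

PARTITION RANGE ALL
  HASH JOIN
    TABLE ACCESS FULL TABLE_A
    TABLE ACCESS FULL TABLE_B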

How to fix SQLs that won't Partition Prune

  • If you are not using the partition key, but you are using another low-cardinality key, then speak to the DBA about Hash (v8i and above) or List (V9i and above) sub-partitions within the existing Range partitions.
  • The syntax WHERE partition_key oper (sub-query) will only perform a partition prune for = and IN operators; this is consistent with Index Scans. For >[=] or <[=] predicates on sub-queries, try putting the sub-query into a PL/SQL function, and change your syntax to WHERE partition_key oper my_func() (NB. do not pass any columns from the query into the function as arguments; it won't partition prune). For = and IN sub-queries that won't Partition Prune, make sure that the sub-query is not correlated.
  • Use ranges instead of functions of partition keys. eg.
    Don't use WHERE to_char(calendar_date,'MON-YYYY') = 'JAN-2003'
    Instead use WHERE calendar_date BETWEEN '01-JAN-2003' and '01-FEB-2003' - 0.00001
  • Never denormalize the partition key into other columns, because queries on those other columns will not partition prune. eg. If calendar_date were the partition key, do not create another column such as calendar_month that is derived from calendar_date. Instead, create a date lookup (dimension) table that does the denormalization for you and use the STAR_TRANSFORMATION hint.

If you are joining on the partition key, but not using equals joins, then you may have a design problem. eg.
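A sketch (the non-equals join on the partition key is the point):

SELECT COUNT(*)
FROM   big_partitioned_a a, big_partitioned_b b
WHERE  b.calendar_date BETWEEN a.calendar_date - 7 AND a.calendar_date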


If these tables are really big, then you are in a lot of trouble: this cannot be tuned effectively.

If the join is only performed once per day or less, a Nested Loops indexed join to big_partitioned_b may be faster than a hash join with full table scan on big_partitioned_b.

  • If the join is more frequent, or if the query is not constrained to a single day or small subset of big_partitioned_a, then the results of the join should be built incrementally into a de-normalised table overnight. Every day, select the new rows from big_partitioned_a and use an indexed Nested Loop join to big_partitioned_b, inserting the combined results into partitioned table big_partitioned_ab.


Star Joins


What is a Star Join?


A Star Join is where you join a really big fact table (or a really big subset of an enormous table) to a number of smaller tables. They are not to be confused with a Star Query. A Star Query also joins a big (fact) table to small (dimension) tables, but it will use WHERE predicates to constrain the rows selected to a small volume of data. A Star Query should use indexes, so is not technically a High Volume SQL.
A typical use for a Star Join would be the creation of an aggregated fact table from an atomic fact table. A typical use for a star query would be an end-user query on any fact table.
Consider the following:
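(a sketch; the fact and dimension names are illustrative, and the aliases match the discussion below)

SELECT d1.attr1, d2.attr2, d3.attr3, d4.attr4, SUM(f.amount)
FROM   fact f, dim_1 d1, dim_2 d2, dim_3 d3, dim_4 d4
WHERE  f.d1_key = d1.d1_key
AND    f.d2_key = d2.d2_key
AND    f.d3_key = d3.d3_key
AND    f.d4_key = d4.d4_key
GROUP  BY d1.attr1, d2.attr2, d3.attr3, d4.attr4;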



The best way to perform this join is to hash join f to d1, hash join the result to d2, hash join the result to d3, then hash join the result to d4. Since d1, d2, d3, and d4 are all small, the hash joins are really efficient (hash joins work particularly well with one small table). The four dimension tables are all hashed into memory, and then the fact table is processed row-by-row, looking up the matching values from the four in-memory hash tables.
If Oracle is choosing an alternate plan, it is easy to enforce the above approach using the ORDERED hint (ensure the fact table is listed first in the FROM clause) and a USE_HASH hint that lists all of the table aliases. eg. /*+ ORDERED USE_HASH(f,d1,d2,d3,d4)*/
This method does not work when one of the dimensions is large (>100,000 rows). If the smaller of two tables in a hash join is still too big to hash in memory, then Hash Joins become much less efficient. Worse, the process described above where the join performs a single pass of the fact table no longer works; both the fact and large dimension tables must be pre-processed into TEMP space for the hash-join. This effectively doubles the IO on the two largest tables in the join.
The best way to deal with this problem is to treat the large dimension as a Sparse Join.




Self Joins


SQL has traditionally been a poor tool for analysing data in context. ie. Selecting information about a row that requires input from neighboring rows. For instance, to select information about the highest salary earner in each department, we must first find the employees with the highest salary, and then select their details:
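A sketch using the classic emp table:

SELECT e.*
FROM   emp e
WHERE  e.sal = (SELECT MAX(m.sal)
                FROM   emp m
                WHERE  m.deptno = e.deptno);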



Queries that are context-sensitive have one thing in common: they must refer to the same table more than once. The example above is remarkably simple; more complex queries must access the same table 3, 4 or more times.
Oracle 8i introduced a new feature called Analytic Functions, which are unique in SQL: they allow a row to access information from its neighbors without having to perform a self-join. Analytic functions are difficult to get the hang of, but once mastered are quite easy and even intuitive! See the Oracle SQL Reference for more information on Analytic Functions.
The query above can be re-written with analytic functions as follows:
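(a sketch of the analytic version, with the same assumed columns)

SELECT *
FROM  (SELECT e.*, MAX(e.sal) OVER (PARTITION BY e.deptno) AS max_sal
       FROM   emp e)
WHERE  sal = max_sal;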



Be aware that Analytic Functions are not necessarily quicker than self joins. Even though they seem to perform fewer and simpler steps, queries like the one above can be more efficient when written as a self join. Always trial both options to find the fastest.
If only a single result is required (eg. One employee with highest salary, rather than all with that salary) then it is even faster to use the KEEP modifier to the aggregate function.
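A sketch of the KEEP version (one row per department):

SELECT deptno,
       MAX(ename) KEEP (DENSE_RANK LAST ORDER BY sal) AS top_earner,
       MAX(sal)                                       AS top_sal
FROM   emp
GROUP  BY deptno;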



Another option is to process the data in a procedural language such as PL/SQL. eg.
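A sketch, again for the highest earner per department (assuming the classic emp columns):

DECLARE
  l_last_dept emp.deptno%TYPE;
BEGIN
  FOR r IN (SELECT * FROM emp ORDER BY deptno, sal DESC) LOOP
    IF l_last_dept IS NULL OR r.deptno <> l_last_dept THEN
      -- the first row fetched for each department is the highest paid
      dbms_output.put_line(r.deptno || ' ' || r.ename || ' ' || r.sal);
      l_last_dept := r.deptno;
    END IF;
  END LOOP;
END;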





Sorts


Oracle uses a Sort to perform a number of different functions:

  • MAX and MIN functions
  • ORDER BY clauses
  • GROUP BY clauses
  • DISTINCT / UNIQUE clauses
  • Sort-Merge joins
  • UNION, INTERSECT, and MINUS set operators
  • WHERE col IN uncorrelated sub-queries
  • CUBE clauses (an extra sort in addition to the GROUP BY)
  • OVER clauses (Analytic functions)

When Oracle performs a sort, all of the matching rows are written to a temporary area (the Sort Area), sorted, and then read back in to the next stage of execution. Small sorts will occur in memory, but a large sort will need to be written out to disk. Whilst the cost of sorting the rows is significant, the IO involved in the read and the write steps can really blow a query out.
If your Explain Plan contains a SORT, consider whether it really needs it.

  • Do you have a DISTINCT select when you are already selecting Unique / Primary Key columns, ROWID, or ROWNUM? The DISTINCT clause could be removed without affecting the rows returned (note that the rows would be returned in a different order).
  • Are you using a UNION? A UNION returns distinct rows from two or more SELECTS. If you can already guarantee uniqueness (or you don't care about uniqueness), change the UNION to a UNION ALL. UNION ALL does not remove duplicates, so it does not do a sort.

If your Explain Plan contains two or more sorts, can they be reduced to one or avoided altogether?

  • Using a MINUS set operator? MINUS sorts both SELECT queries before merging. If the columns you are selecting comprise a unique / primary key, or ROWID, then you may be able to rewrite it as a hash outer join and avoid the sort (see the sketch after this list).


  • Using an INTERSECT set operator? If the columns you are selecting comprise a unique / primary key, or ROWID, then you may be able to rewrite it as a hash join and avoid the sort.


  • Got a DISTINCT or GROUP BY plus an ORDER BY ... DESC? Normally not a problem; Oracle will sort only once. However if the ORDER BY contains a DESC condition for descending order sort, then the ORDER BY will be processed in a separate sort. Consider whether you can do without the DESC clause.
  • Got GROUP BY plus an ORDER BY? Make sure all of your non-aggregate columns in the SELECT clause are exactly the same in the GROUP BY clause. If not, Oracle will perform two separate sorts.


  • Selecting from a view? If the view contains a sort (eg. DISTINCT, GROUP BY) and your query over the view also contains a sort, you may be sorting the same data twice. Check your Explain Plan. Perhaps you could just select from the base table of the view and perform just one sort?
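As a sketch of the MINUS rewrite above (assuming pk is a primary key column in both tables):

Instead of:

SELECT pk FROM table_a
MINUS
SELECT pk FROM table_b

try:

SELECT /*+ USE_HASH(b) */ a.pk
FROM   table_a a, table_b b
WHERE  a.pk = b.pk (+)
AND    b.pk IS NULL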


Selecting from Views


For the purposes of this page, Views in Oracle comprise the explicit kind of view that you create with the CREATE OR REPLACE VIEW syntax, as well as the implicit kind - variously known as implicit views, nested queries, or inline views - where you specify a query in place of a table or view name within a SQL.
The following is taken from Oracle's Designing and Tuning for Performance guide, Transforming and Optimizing Statements.


To optimize a statement that accesses a view, the optimizer chooses one of the following:

    • Transform the statement into an equivalent statement that accesses the view’s base tables, then optimize the resulting statement. The optimizer can use one of the following techniques to transform the statement:
      • Merge the view’s query into the referencing query block in the accessing statement.
      • Push the predicate of the referencing query block inside the view (for a non-mergeable view).
    • Issue the view’s query, collecting all the returned rows, and then access this set of rows with the original statement as though it were a table.


The first option certainly sounds better. Some of the possibilities for better performance include:

  • A predicate in the outer query may supply a column that can be used in an index.
  • A predicate in the outer query may supply a column that can perform a Partition Prune.
  • A predicate in the outer query may limit the rows returned from one of the tables in a join view making the join more efficient.

Mergeable Views


As stated above, when Oracle transforms a query on a view it can either merge or push predicates. Of the two techniques, merging provides better optimization opportunities, but is more restrictive in the situations where it can be used.

    1. Features that will disable view merging:
      • Set operators (UNION, UNION ALL, INTERSECT, MINUS)
      • A CONNECT BY clause
      • A ROWNUM pseudocolumn
      • Aggregate functions (AVG, COUNT, MAX, MIN, SUM) in the select list

If your query uses any of these features, read on below to see whether Push Predicates can help. One exception to this rule is views using the UNION ALL set operator that also satisfy the criteria for a Partition View.

    2. Features that require Complex View Merging:
      • DISTINCT / UNIQUE
      • GROUP BY

If your view uses either DISTINCT or GROUP BY, then Complex View Merging may provide better performance. Check with the DBA to see whether either the OPTIMIZER_FEATURES_ENABLE or COMPLEX_VIEW_MERGING initialisation parameters are set. If not, you can enable complex view merging for just your query using the MERGE hint (not to be confused with the USE_MERGE hint which is used for Sort-Merge joins). Note that Complex View Merging will not help for queries that access columns in the view built from aggregate or analytic functions.

    3. Features that will merge under special conditions:
      • The view is used as the outer table in an outer join.

If your view is joined as the outer table of an outer join, then it cannot be merged if the view has two or more base tables, however it can still Push Predicates (see below).
If you think your query can use View Merging or Complex View Merging, then it can be difficult to prove whether or not Oracle is actually doing it. If the result of the merged query would allow the use of a particular index or partition key (for pruning), then this will show up in the Explain Plan. Otherwise, you could add a ROWNUM column to the view (which would disallow merging altogether) and try again. If the performance was markedly slower, then the original query was probably performing a merge or at least pushing predicates (see below).

Push Predicates


Whereas View Merging rewrites the query dispensing with the view entirely, Push Predicates leaves the view in place, but uses one or more of the predicates in the outer query to restrict the rows returned (and possibly provide a better execution path).
Predicates that can be pushed into a view include:

  • Predicates that compare a column from the view to a constant or bind variable expression. These predicates will be pushed automatically.
  • Join predicates between the view and another table or view. For each row returned from the other table, Oracle can use the join predicate to lookup the matching rows in the view. For this to happen, check with the DBA that the PUSH_JOIN_PREDICATE initialisation parameter is set to TRUE, otherwise use the PUSH_PRED hint.

There are some circumstances where Oracle cannot push predicates. It is difficult to obtain a definitive list of these situations, but here is a start:

  • Predicates that access a view column built from any of the following:
    • Aggregate functions. eg. AVG, COUNT, MAX, MIN, SUM
    • Analytic functions. eg. ROW_NUMBER, RANK
    • The ROWNUM pseudocolumn.
  • Views that use CONNECT BY

Although it is unstated in the Oracle documentation, you may also have difficulty pushing predicates into some views with UNION, INTERSECT or MINUS.

Other Options


If your view is unable to either merge or push predicates, yet you are providing a predicate in the outer query that you believe could improve performance if it were inside the view, then you could try one of the following:

  • Create the view as a Materialized View.
  • Rewrite the query to reference the base tables of the view. ie. Don't use the view at all.
  • Create a Parameterised View.

Index Join


This is a feature you can use in a high volume SQL to avoid a table lookup. Most programmers know that if an index contains all of the columns used in a SQL, then Oracle does not need to access the table. But what if the columns you are using are spread over two indexes?
For example:
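(a sketch, assuming emp has one index on emp_no and a separate index on sal)

SELECT emp_no, sal
FROM   emp
WHERE  sal > 50000;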



No single index contains emp_no and sal. Oracle can read both indexes and perform a hash join of the two. This will often be slower than a full table scan, but if the rows of the table are particularly wide (lots of columns, or big VARCHAR2 / CHAR columns), the advantage of reading the skinny indexes could be significant.
To get Oracle to use an index join, use the INDEX_JOIN hint.
There is no hint to stop Oracle from performing an index join. In the unlikely event that you want to prevent an Index Join, use the INDEX hint to specify just one index, or the FULL hint for a Full Table Scan.


Sparse Join


Consider a table join where only a small percentage of the rows in one of the tables is required to satisfy the join.
For example, consider a data warehouse for a large organisation whose customer table contains 1 million customers. Any customer-based report generated from this warehouse would need to join an even larger fact table to the customer table, but given that most such reports would be reporting on a reduced set of fact data (eg. for a single region, or date), it is unlikely that any one report would need to read a large proportion of the customer table.
With a sparse join involving two large tables, the two most common join methods provide sub-optimal performance:

  • An Indexed Nested Loops join would identify candidate rows from the fact table, and then lookup the customer table for each one. Whilst efficient for small data volumes, Indexed Nested Loops is not scalable to larger volumes and is unsuitable for reports and batch jobs.
  • A Hash join requires that the smaller table be hashed and then probed by the rows from the larger table. Whichever table is hashed, a full table scan of the sparse table (customer in the example above) is still required. This seems unwarranted since only a small percentage of its rows is required.

Sparse joins fall into two distinct categories, each with a separate solution:

    1. Where the reduced set of rows in the sparse table is determined by a column in the sparse table. eg. Customers in a particular region.

If such joins are common, it may be appropriate to partition the sparse table. For example, creating List Partitions on customer.region would reduce the full table scan to a full scan of a single partition when joining only customers of a given region.

    2. Where the reduced set of rows in the sparse table is determined by a column in the other table. eg. All customers who placed an order in September.


If appropriate, the sparse table can be Hash Clustered on the primary key. The join would use Nested Loops with the Hash Clustered table as the outer (second) table, but would escape the performance problems of the Indexed Nested Loops join by avoiding an index altogether; rows from the sparse table can be accessed directly by their Hash Key.
Both of these solutions involve radical changes to the structure of the sparse table. They are not so much a solution to a tuning problem as a design alternative that should be implemented when a system is first being built. Most importantly, no table should be either clustered or partitioned by a developer; in a live system this should only be performed by an experienced DBA after full regression testing in a test database.


High Volume DML


Are you running an INSERT, UPDATE, or DELETE command that is processing a large number of rows?
First check that the table is free of triggers. If the SQL is a DELETE, or you are updating a Primary or Unique key, make sure there are no referencing tables with Foreign Key clauses.
If these common problems don't apply, then you might be able to use one of these techniques to tune the DML.

Technique                                                  Insert  Update  Delete
Direct Load Insert                                         Yes     No      No
CREATE TABLE AS instead of INSERT on a partitioned table   Yes     No      No
Updatable views                                            No      Yes     No
Truncate and replace                                       No      Yes     Yes
Drop and replace indexes                                   Yes     Yes     Yes
Process by Rowid Range                                     Yes     Yes     Yes

Direct Load Insert


Direct Load Insert is a faster way of running an INSERT statement. It is particularly useful for inserting large numbers of rows. Direct Load Insert differs from Conventional Insert in that it bypasses the buffer cache. Of course there are side-effects; keep in mind the following:

    • Data is appended to the table. Existing free space is not re-used. So, if you direct load insert a bunch of rows, then delete them, and insert them again, the space made by the delete will not be re-used.
      However if you truncate a table or partition first, then Direct Load Insert will reuse the existing free space.
    • Direct Load Insert only works for INSERT INTO .. SELECT .... Inserts using the VALUES clause will use conventional insert.
    • Direct Load Insert uses rollback segments to maintain indexes as the data is loaded. The size of your rollback segments will limit the size of the INSERT you can perform. Drop or invalidate the indexes, or use the Rowid Range technique, to work around this.
    • Direct Load Insert can be run with a NOLOGGING option making it even faster. This means that in the event of a system crash where the database is restored from a backup taken prior to the INSERT and rolled forward, you will have to re-run your INSERT.
    • Direct Load Insert locks the table in exclusive mode. No other session can insert, update, or delete data, or maintain any indexes.
    • After a Direct Load Insert, the session that performed the Insert may perform another Direct Load Insert on the same table, but may not perform any other action (SELECT, UPDATE, DELETE, or conventional INSERT) on that table until the transaction is committed.
    • Referential Integrity (Foreign Key) constraints and triggers must be disabled before running Direct Path Insert.
    • Direct Load Insert cannot occur on:
      • Index Organised Tables
      • Tables with LOB Columns
      • Tables with Object Columns
      • Clustered Tables

Attempts to Direct Load Insert on these tables will run in conventional mode without warnings or errors.
To use Direct Load Insert, add the APPEND hint to your INSERT statement.
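eg. (a sketch; the table names are illustrative)

INSERT /*+ APPEND */ INTO sales_history
SELECT *
FROM   sales
WHERE  txn_date < :cutoff_date;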



To make sure your SQL is using Direct Load Insert, run it through Explain Plan. You should see a line reading LOAD AS SELECT.


CREATE TABLE AS SELECT (CTAS)


The absolute fastest way to get data into a table is to create the table from scratch using CREATE TABLE AS SELECT (CTAS). eg.
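(a sketch; the names are illustrative)

CREATE TABLE sales_summary AS
SELECT region, TRUNC(txn_date,'MM') AS txn_month, SUM(amount) AS total_amount
FROM   sales
GROUP  BY region, TRUNC(txn_date,'MM');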



The best thing about CTAS is that it does not use rollback segments. If your SELECT clause performs a sort (ie. in a GROUP BY, UNION, sort-merge join, etc) then your query may be limited by the database sort-area size. Similarly, if it performs a hash join then you may be limited by the database hash-area size. Otherwise you are only limited in the number of rows you create by the size of the tablespace into which the table will go.
The UNRECOVERABLE clause may be used to make the CREATE even faster, however in the event of crash and recovery from backup made prior to the CREATE, the new table will need to be recreated.
If the table you want to insert into already exists (and you want to keep the rows already in it) then Direct Load Insert may be a better option.

CTAS and Exchange Partition


CTAS is a particularly powerful tool for populating or rebuilding a single partition of a partitioned table. Say we have a table that contains monthly aggregates of a base table, and is partitioned monthly. We could run a monthly job to CREATE TABLE AS SELECT the data for just one month, and then use ALTER TABLE EXCHANGE PARTITION to swap the new table into the partitioned monthly aggregate table.
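A sketch of the monthly job (names and partition are illustrative; the new table must match the structure of the partitioned table):

CREATE TABLE monthly_agg_new AS
SELECT cust_no, TRUNC(txn_date,'MM') AS txn_month, SUM(amount) AS total_amount
FROM   sales
WHERE  txn_date >= '01-JAN-2003' AND txn_date < '01-FEB-2003'
GROUP  BY cust_no, TRUNC(txn_date,'MM');

ALTER TABLE monthly_agg
EXCHANGE PARTITION jan_2003 WITH TABLE monthly_agg_new;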


Updateable Join Views


High volume UPDATE statements with SET sub-queries can be one of the hardest SQLs to tune. Consider the following:
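(a sketch; the names are illustrative)

UPDATE my_table a
SET    a.col1 = (SELECT b.col1
                 FROM   my_temp_table b
                 WHERE  b.pk = a.pk)
WHERE  EXISTS (SELECT NULL FROM my_temp_table b WHERE b.pk = a.pk);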



This SQL will probably perform a full table scan on my_table (not such a problem) and then, for every row returned, it will perform an indexed SELECT on my_temp_table to retrieve the new value for col1. This style of execution is equivalent to a Nested Loops join. As discussed elsewhere in this guide, Nested Loops joins are appropriate for small data volumes; large data volumes are better served with Sort-Merge and Hash joins.
What is required is the ability to join the two tables with a Hash join and update at the same time. Well it is possible - with an Updateable Join View.
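A sketch of the same update as an Updateable Join View (assuming pk is the primary key of my_temp_table):

UPDATE (SELECT a.col1 AS old_col1, b.col1 AS new_col1
        FROM   my_table a, my_temp_table b
        WHERE  a.pk = b.pk)
SET    old_col1 = new_col1;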



The example above shows an update of a dynamic or nested view. It is also possible to update an explicit view that is declared on the database.
Clearly the view cannot contain DISTINCT, GROUP BY, CONNECT BY, aggregate functions, UNION, INTERSECT, or MINUS clauses: simple joins are all that is allowed. There is one other restriction: the view must be key preserved. This means that the join must meet the following restrictions:

  • The join criteria must be equals (=) joins.
  • Tables other than the table that is to be updated must be joined on a primary or unique key.

If these conditions are violated, then Oracle cannot guarantee that the view will return one row only for each row in the base table. If two or more rows in the secondary table mapped to each row in the base table, then the update would be ambiguous. An attempt to update a non-key-preserved view will result in an Oracle error.
If you are performing a high volume update, but cannot use an Updateable Join View because it would not be key-preserved or would require GROUP BY etc., then try using an intermediate temporary table. Create a temporary table with the same primary key as the base table, and use a SELECT statement to insert the keys and the updated column values. Then use an Updateable Join View to update the base table. Even though this seems like more work, it is still a lot quicker than the traditional Nested Loop style update.
Alternatively, if you are using Oracle 10g or later, use the MERGE statement without a WHEN NOT MATCHED clause.
For the very courageous only, there is a way to update a non-key-preserved view. Oracle uses an undocumented hint BYPASS_UJVC (Bypass Updateable Join View Constraints) internally to refresh materialized views. Note that if the join returns multiple rows for each row of the table to be updated, then you will get a non-deterministic result. Also note that since this hint is undocumented, Oracle could remove it without notice, leaving your code invalid and useless. I strongly suggest you use BYPASS_UJVC for once-only updates, and never for permanent production code.



Truncate and Replace


Trying to delete more than 10% of the rows in a huge table? Rather than deleting the rows you don't want, it might be faster to:

  • Copy the rows you want to keep into a temporary table using Direct Load Insert
  • Truncate the base table
  • Copy the temporary table back into the base table with Direct Load Insert

The technique also works if you are trying to update more than 10% of the rows in a table.

  • Copy all rows into a temporary table using Direct Load Insert
  • Truncate the base table
  • Insert the rows from the temporary table back into the base table using a SELECT clause to modify the value of the columns that need to be updated. This SELECT can perform table joins etc. to derive the new values.

Deletes and Updates are very slow compared to Direct Load Inserts. The actual break-even point beyond which the techniques above will out-perform Delete and Update will probably be somewhere around 10% to 50% of the table/partition, but will depend greatly on the indexes on the table which must be rebuilt after a Direct Load Insert.

Drop and Replace Indexes


Inserts, Deletes, and Updates that update indexed columns must all maintain indexes as they execute. This is particularly inefficient, as every row processed will cause a separate index scan, which could read several blocks. Direct Path SQL*Loader mitigates this effect by deferring the index rebuild until the end of the load.
Depending on the proportion of the table inserted / updated / deleted, it may be faster to drop the indexes (or disable PK and Unique constraints) and then rebuild them after the DML is complete.
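A sketch of the pattern (names are illustrative):

DROP INDEX my_table_ix1;
ALTER TABLE my_table DISABLE CONSTRAINT my_table_pk;

-- ... run the high volume INSERT / UPDATE / DELETE here ...

ALTER TABLE my_table ENABLE CONSTRAINT my_table_pk;
CREATE INDEX my_table_ix1 ON my_table (col1);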



Rowid Range


All Update, Delete, and most Insert statements must use rollback segments. If your DML processes so many rows that it cannot be performed in a single transaction, then you must run the change incrementally with regular commits.
One method of doing this is to use PL/SQL to process the table row by row. For INSERTs, this is not so bad, but for UPDATEs and DELETEs, it means indexed access on each row to be updated or deleted.
A better way is to process the table in chunks. If the table is partitioned and each partition can be processed in a single transaction, then it is fairly easy to construct a PL/SQL loop around the DML statement to process one partition at a time.
Non-partitioned tables and partitions too big to process within a single transaction are more difficult. In the same way that partitions break a table into manageable chunks, extents break a partition (or non-partitioned table) into chunks. Run the following SQL:
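(a sketch; substitute your own schema and table)

SELECT relative_fno, block_id, blocks
FROM   dba_extents
WHERE  owner = USER
AND    segment_name = 'MY_TABLE'
ORDER  BY relative_fno, block_id;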






As your table grows, Oracle allocates new extents in which to store the new rows. Looking at the columns of dba_extents, we can reconstruct the rowids of the first and last row in every extent. Then it is simply a matter of:
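A PL/SQL sketch of the loop, assuming an UPDATE (the dbms_rowid calls rebuild the first and last possible rowid in each extent; the table, column, and value are placeholders):

DECLARE
  l_obj NUMBER;
BEGIN
  SELECT data_object_id INTO l_obj
  FROM   user_objects
  WHERE  object_name = 'MY_TABLE';

  FOR ext IN (SELECT relative_fno, block_id, blocks
              FROM   dba_extents
              WHERE  owner = USER
              AND    segment_name = 'MY_TABLE'
              ORDER  BY relative_fno, block_id) LOOP
    UPDATE my_table
    SET    some_col = 'NEW VALUE'   -- the change you want to make
    WHERE  rowid BETWEEN
           dbms_rowid.rowid_create(1, l_obj, ext.relative_fno, ext.block_id, 0)
       AND dbms_rowid.rowid_create(1, l_obj, ext.relative_fno,
                                   ext.block_id + ext.blocks - 1, 32767);
    COMMIT;
  END LOOP;
END;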



Oracle can use the low and high rowid of each extent to go straight to the rows you want to SELECT for INSERT, UPDATE, or DELETE. See your DBA about using dba_extents to reconstruct rowids.
Warning: If your DBA uses Dictionary Managed Tablespaces, then it is likely that some extents will be too big to process in a single transaction. With Locally Managed Tablespace it is more likely that the extents will be small enough to process with Rowid Range.



Tuning Procedural Code


Procedural code, such as PL/SQL, is often a fallback position for running SQLs that are too complex to code in a single statement.
The typical model for such a module would be:
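A sketch of that model in PL/SQL pseudocode (table and column names are placeholders):

FOR r IN (SELECT ... FROM driving_table ...) LOOP        -- the main cursor
  SELECT ... INTO ... FROM lookup_table
  WHERE  key_col = r.key_col;                            -- lookups per row
  -- procedural logic, then writes
  INSERT INTO result_table (...) VALUES (...);
END LOOP;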



Generally speaking, a single SQL can be made to run more efficiently than the equivalent PL/SQL routine that loops through a cursor. However this should not be seen as a challenge to write huge convoluted SQLs that are impossible to maintain.
Do you have a cursor within a cursor? Or a SELECT INTO within a cursor? The single greatest cause of inefficient procedural code of the type above is SQL performed inside the cursor loop. Prior to v9i and the introduction of bulk binding and the FORALL statement, it was necessary to perform INSERTs, UPDATEs, and DELETEs inside a cursor loop - a necessary evil. A SQL statement inside a cursor loop is equivalent to performing a Nested Loops join (except less efficient). For cursors looping through a large number of rows, this can have terrible consequences.
The first thing you should do is remove as many SQLs from inside the loop as possible by joining tables into the main cursor. For simple lookups with equals joins this should be easy. Use the other techniques in this guide to re-tune the main cursor if necessary. What remains should be SQL statements inside the loop that seem impossible to merge into the main cursor.
Below are a few tricks that help avoid SQL statements within a cursor loop:

  • Analytic functions, the MERGE statement, and multi-table INSERT statements (INSERT ALL and INSERT FIRST) provide functionality not previously available in SQL. You may be able to re-write your PL/SQL as a single SQL if it is using a cursor to process rows in fixed sequence.
  • Can't join two or more large tables because the join key is non-unique in both tables? ie. Table A has 4 rows per key, and Table B contains 7 rows per key - a join would return 28 rows. Try Concurrent Cursors instead of a cursor within a cursor.
  • Tried running it as a single SQL but it was too big - you got the error:
    • "Failed to allocate an extent of size n in tablespace TEMP"

If the SQL is performing Hash joins, try Sort-Merge (use the USE_MERGE hint) joins instead. They can take less temp space. Note that if your tables are growing, then even if sort-merge works, it may also blow out one day. Also ask the DBA to consider increasing the temp space.

  • Do you have an INSERT .. SELECT, a DELETE, or an UPDATE that was so big it exceeded your rollback segments, so you decided to write it in transactional PL/SQL? Take a look at the High Volume DML hints - you may yet be able to re-code it as a single SQL.
  • Use FORALL to perform INSERT, UPDATE, and DELETE statements in bulk outside the cursor loop. The technique is detailed in the PL/SQL Users Guide and Reference; it involves looping through a cursor and saving results to a PL/SQL Nested Table, and then applying the changes to the database afterwards in a FORALL loop.

Eliminating SQL from inside cursor loops will provide the most significant time saving. For smaller but still significant improvements on high volume jobs, use BULK COLLECT to fetch cursors into Nested Tables thousands of rows at a time. This technique is detailed in the PL/SQL Users Guide and Reference. BULK COLLECT requires cursors to be explicitly declared, opened, fetched, and closed, so - for applications that use implicit cursor FOR loops - it requires a significant rewrite effort.


Non-Unique joins and Concurrent Cursors

Non-unique joins


SQLs can get complex when you want to join tables with non-unique join keys. Consider a financials system with tables account, payment, and receipt. An account can have many payments and many receipts. Any query that tries to select payments and receipts together will obtain a cross product of the payments and receipts for each account.
For example:
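(a sketch; the column names are assumed)

SELECT a.account_no,
       SUM(p.amount) AS total_payments,
       SUM(r.amount) AS total_receipts
FROM   account a, payment p, receipt r
WHERE  p.account_no = a.account_no
AND    r.account_no = a.account_no
GROUP  BY a.account_no;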



This SQL is not going to work. For a given account with 5 payments and 3 receipts, the join would produce 5x3=15 rows before the GROUP BY aggregates them to one. The payment total will be 3 times higher than it should be, and the receipt total will be 5 times higher.
One way to resolve this using Oracle prior to v7.3 is as follows:
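One well-known version of the trick divides the inflated sums by the number of rows contributed by the other table:

SELECT a.account_no,
       SUM(p.amount) / COUNT(DISTINCT r.ROWID) AS total_payments,
       SUM(r.amount) / COUNT(DISTINCT p.ROWID) AS total_receipts
FROM   account a, payment p, receipt r
WHERE  p.account_no = a.account_no
AND    r.account_no = a.account_no
GROUP  BY a.account_no;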



Although this seems very clever, it is inefficient and difficult to read - particularly if extended to three tables. A better solution is:
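A sketch, aggregating each child table in an in-line view before joining:

SELECT a.account_no, p.total_payments, r.total_receipts
FROM   account a,
       (SELECT account_no, SUM(amount) AS total_payments
        FROM   payment
        GROUP  BY account_no) p,
       (SELECT account_no, SUM(amount) AS total_receipts
        FROM   receipt
        GROUP  BY account_no) r
WHERE  p.account_no = a.account_no
AND    r.account_no = a.account_no;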



What happens if it is not this simple? Above, the non-unique joins have been made unique using aggregation. What if that is not possible?

Concurrent Cursors


Consider a system with tables job, job_status_hist, and invoice. job_status_hist records the changes in status of a job over time - eg. (O)pen, (C)losed, (R)eopened. A job may have many invoices which are sent out periodically over the duration of a job. A report is required that shows the initial cost of a job (invoices sent between the original Open status record, and the first Closed status record), and the total cost of a job (total of all invoices).
This is quite possible to produce in SQL, but would probably be large and convoluted and difficult to read. Furthermore, minor changes in the business rule (eg. special coding to handle jobs closed in error and immediately reopened) would be difficult to handle. The logical solution is to code the report procedurally. An algorithm might be:
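A sketch of such an algorithm (the column names and status codes are assumed):

FOR r_job IN (SELECT job_no FROM job) LOOP
  SELECT MIN(status_date) INTO l_open_date
  FROM   job_status_hist WHERE job_no = r_job.job_no AND status = 'O';
  SELECT MIN(status_date) INTO l_closed_date
  FROM   job_status_hist WHERE job_no = r_job.job_no AND status = 'C';
  SELECT SUM(amount) INTO l_initial_cost
  FROM   invoice WHERE job_no = r_job.job_no
  AND    inv_date BETWEEN l_open_date AND l_closed_date;
  SELECT SUM(amount) INTO l_total_cost
  FROM   invoice WHERE job_no = r_job.job_no;
  -- print the report line
END LOOP;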



Simple to code and simple to read. But very inefficient if the number of jobs is large. The SQLs or Cursors required to get the dates and costs inside the loop must be executed separately for every job.
A better way to do this is Concurrent Cursors. Instead of running independent SQLs inside the loop for every job, we open three cursors at the beginning - one from each table - and process them in job number order.
For example:
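A PL/SQL sketch (column names and status codes are assumed; each cursor is read in job_no order, and the two child cursors are kept in step with the main one):

DECLARE
  CURSOR c_job IS
    SELECT job_no FROM job ORDER BY job_no;
  CURSOR c_hist IS
    SELECT job_no, status, status_date
    FROM   job_status_hist ORDER BY job_no, status_date;
  CURSOR c_inv IS
    SELECT job_no, inv_date, amount
    FROM   invoice ORDER BY job_no, inv_date;

  r_hist         c_hist%ROWTYPE;
  r_inv          c_inv%ROWTYPE;
  l_open_date    DATE;
  l_closed_date  DATE;
  l_initial_cost NUMBER;
  l_total_cost   NUMBER;
BEGIN
  OPEN c_hist; FETCH c_hist INTO r_hist;
  OPEN c_inv;  FETCH c_inv  INTO r_inv;

  FOR r_job IN c_job LOOP
    l_open_date := NULL; l_closed_date := NULL;
    l_initial_cost := 0; l_total_cost := 0;

    -- consume the status history rows for the current job
    WHILE c_hist%FOUND AND r_hist.job_no = r_job.job_no LOOP
      IF r_hist.status = 'O' AND l_open_date IS NULL THEN
        l_open_date := r_hist.status_date;        -- original Open
      ELSIF r_hist.status = 'C' AND l_closed_date IS NULL THEN
        l_closed_date := r_hist.status_date;      -- first Closed
      END IF;
      FETCH c_hist INTO r_hist;
    END LOOP;

    -- consume the invoice rows for the current job
    WHILE c_inv%FOUND AND r_inv.job_no = r_job.job_no LOOP
      l_total_cost := l_total_cost + r_inv.amount;
      IF r_inv.inv_date BETWEEN l_open_date AND l_closed_date THEN
        l_initial_cost := l_initial_cost + r_inv.amount;
      END IF;
      FETCH c_inv INTO r_inv;
    END LOOP;

    dbms_output.put_line(r_job.job_no || ' ' ||
                         l_initial_cost || ' ' || l_total_cost);
  END LOOP;

  CLOSE c_hist;
  CLOSE c_inv;
END;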










This technique has allowed us to process high volumes of data transactionally without resorting to indexes. It is only marginally less efficient than a single SQL performing Hash or Sort-Merge joins (not that it would be possible in this example).


