Oracle virtual indexes
Article by
Rampant Author
Chris Foot
One of the areas I am strong at is using
the proper indexes to influence SQL statement access paths. We
know that an index can influence the optimizer to change an
access path from a tablescan to a table lookup using an index
ROWID. But indexes can also influence the type of join used as
well as the order of the tables being joined.
Let's take a look at the benefits virtual indexes have and we'll
finish this blog by reviewing how indexes can affect simple
access paths, join methods and join orders.
Virtual Indexes
I use virtual indexes in both 9i and 10G. I like them because I
can quickly do "what if" scenarios. I can create the index and
present it to the optimizer without waiting for the index to
become populated and stored on disk. This feature allows me to
reduce the amount of time I spend tuning, which is always a good
thing.
I also use virtual indexes when I am training myself and others.
Most production environments do have test counterparts, but
there are times when I need to teach many different indexing
concepts and waiting for indexes to build often becomes a
detriment to the learning process.
9I OEM Virtual Index Wizard
The Virtual Index Wizard feature is provided in the Tuning Pack
option of 9I OEM. Although there are a couple of ways to
activate the wizard, let's take a look at the one that I most
often use. Since we are going to be navigating through the SQL
Analyze tool, we might as well review some of the other features
it provides.
SQL Analyze
9I OEM provides the SQL Analyze tool to view access path
information. SQL Analyze is only available if you have purchased
the tuning pack option for 9I OEM. This screenshot shows me
navigating through 9I OEM's menus to activate SQL Analyze.
When you activate SQL Analyze, the first step the utility
performs is to display a listing of all of the top resource
consuming SQL statements that are contained in the library
cache. If the statement has been flushed from cache, you won't
find it in this display.
The tool lists all of the databases on the left side of the
panel that you worked with in the past. As you can see, I am a
pretty active user of SQL Analyze. I can double-click on any
statement in the top SQL listing to view its access path
information. I can also use the menu system at the top of the
screen to create a blank worksheet for new queries that I can
enter manually. I found the statement I wanted from the listing
and double-clicked on it to activate the SQL tuning panel.
The panel displays the SQL statement, and like our other
utilities, allows us to step through the access path the query
is taking. I can click on the SQL drop down menu at the top of
the screen to ask Oracle to explain the statement using the
various optimizer modes that are available.
SQL Analyze Tuning Tools
SQL Analyze provide us with tools that facilitate the SQL tuning
process. The tools menu at the top of the screen allows me to
choose from three different wizards. We'll review the
Virtual Index Wizard in just a moment.
The Hint Wizard allows us to choose hints
from a drop-down menu system, while the SQL Tuning Wizard
activates an intelligent advisor that provides us with SQL
tuning recommendations. Once you have run the SQL Tuning Wizard
a few times, it's pretty easy to see that it was the precursor
to some of the more advanced utilities that are now available in
Oracle 10G.
Index Affects on Access Paths
Let's use a nested loop join as a quick and somewhat easy
example. In a nested loop join, one of the tables in the join is
known as the outer table while the other table (the table being
probed) is known as the inner table. Oracle reads a row from the
outer table and uses the columns in the join condition to probe
the inner table.
Take a look at the SQL statement and graphical access path
display below:
select a.employee_id, a.last_name, b.department_id,
b.department_name,
c.street_address, c.postal_code, c.city, c.state_province
from hr.employees a, hr.departments b, hr.locations c
where a.department_id=b.department_id
and b.location_id=c.location_id
and a.employee_id = 174
order by a.last_name;
We are joining three tables together HR.EMPLOYEES,
HR.DEPARTMENTS and HR.LOCATIONS. Notice that we have two
different nested loop join operations. Oracle only joins two
tables together at a time. It then creates an intermediate
result set and uses that result set to join to the next table.
It can use any type of join method available for any of the join
operations.
The indexes we have available to us are:
-
EMP_EMP_ID_PK - index on the EMPLOYEE_ID column of the
HR.EMPLOYEES table
-
DEPT_ID_PK - index on the DEPARTMENT_ID column of the
HR.DEPARTMENTS table
-
LOC_ID_PK - index on the LOCATION_ID of the HR. LOCATIONS
table
The optimizer has chosen to join the HR.EMPLOYEES AND
HR.DEPARTMENTS tables first using the nested loop join method.
It is on this join condition - A.DEPARTMENT_ID = B.DEPARTMENT_ID.
HR.EMPLOYEES is known as the outer table and HR.DEPARTMENTS is
classified as the inner table of the join.
The optimizer looked at the local predicate "A.EMPLOYEE_ID=174"
and checked to see if there was an index that it could use to
improve data access performance. It found the EMP_EMP_ID_PK
index that is defined on the EMPLOYEE_ID column. It will access
the index first and retrieve the ROWID for the row that has an
EMPLOYEE_ID of 174. It will use the ROWID to probe the
HR.EMPLOYEES table.
As stated previously, the optimizer has chosen the nested loop
join operation. It will access the HR.EMPLOYEES table using the
EMP_EMPID_PK index, retrieve the value for DEPARTMENT_ID and use
it to probe the HR.DEPARTMENTS table looking for a match (A.DEPARTMENT_ID
= B.DEPARTMENT_ID). If it finds a DEPARTMENT_ID value in the
HR.DEPARTMENTS table that matches, Oracle will use that row in
future operations. If it does not match, Oracle will discard it.
Notice that the optimizer has chosen to use the DEPT_ID_PK index
that is built on the HR.DEPARTMENTS' DEPARTMENT_ID column. Since
we are probing this table using the values we found for
DEPARTMENT_ID from the HR.EMPLOYEES table, Oracle will use the
index on the HR.DEPARTMENTS' DEPARTMENT_ID column.
The B.LOCATION_ID = C.LOCATION_ID is the join condition that is
used in the second join operation. The intermediate result set
is used as the outer table and the HR.LOCATIONS table is used as
the inner table. The same type of probe occurs. Since we are
looking for matches using the LOCATION_ID contained in the
intermediate result set, the optimizer will choose the LOC_ID_PK
index.
Here's how indexes can influence the type of join used and the
order of the tables being joined:
-
If we didn't have indexes on either of the first two tables
being joined, the optimizer would most likely have used a
hash operation instead of a nested loop join to join the
tables together. The optimizer knows that the hash join
operation is best used for joins that return larger result
sets. It could also have influenced the join order of the
tables. In the example, the optimizer could have chosen to
join the HR.DEPARTMENTS table to the HR.LOCATIONS table
first. It would have then used the intermediate result set
created to probe the HR.EMPLOYEES table.
-
The more indexed columns we have on a table, the greater the
chance that the optimizer will use it as the outer table of
the nested loop join. It can filter out more rows before it
sends it to the join operation. The earlier it can filter
rows out, the better.
-
If we have an index on both join columns (i.e. indexes on
the DEPARTMENT_ID columns for both the HR.DEPARTMENTS and
HR.EMPLOYEES tables), Oracle will use the DEPARTMENT_ID
index on the inner table but not the outer. Why is that?
Oracle doesn't know if it has a DEPARTMENT_ID VALUE in the
inner table that matches a DEPARTMENT_ID value in the outer
table unless it reads each and every row of the outer table
to do the match. Think about it. It knows that since it has
to read every row from the outer table to perform the match,
the only indexes it can use are the indexes built on local
predicates (i.e. "A.EMPLOYEE_ID=174").
-
If we have an index on the join column for one table and not
the other, the optimizer will often favor the table that has
the index on the join column as the inner table of the join.
It knows it can't use an index on the join column if it is
the outer table but it can use the index if it is the inner.
|