Skip Headers

Oracle® Database Application Developer's Guide - Expression Filter
10g Release 1 (10.1)

Part Number B10821-01
Go to Documentation Home
Home
Go to Book List
Book List
Go to Table of Contents
Contents
Go to Index
Index
Go to Master Index
Master Index
Go to Feedback page
Feedback

Go to previous page
Previous
Go to next page
Next
View PDF

2 Indexing Expressions


Note:

Expression indexing is available only in Oracle Database Enterprise Edition.

An index can be defined on a column storing expressions to quickly find expressions that evaluate to true for a data item. This is most helpful when a large expression set is evaluated for a data item. The SQL EVALUATE operator determines whether or not to use the index based on its access cost. The indextype, Expression Filter, is used to create and maintain indexes.

If an Expression column is not indexed, the SQL EVALUATE operator builds a dynamic query for each expression stored in the column and executes it using the values passed in as the data item.

This chapter describes the basic approach to indexing including index representation (Section 2.3), index processing (Section 2.4), and user commands for creating and tuning indexes (Section 2.6).

2.1 Concepts of Indexing Expressions

Expressions in a large expression set tend to have certain commonalities in their predicates. An Expression Filter index, defined on an expression set, groups predicates by their commonalities to reduce processing costs. For example, in the case of two predicates with a common left-hand side, such as Year=1998 and Year=1999, in most cases, the falseness or trueness of one predicate can be determined based on the outcome of the other predicate. The left-hand side of a predicate includes arithmetic expressions containing one or more elementary attributes and user-defined functions, for example, HORSEPOWER(model, year). An operator and a constant on the right-hand side (RHS) completes the predicate, for example, HORSEPOWER(model, year)>=150.

An Expression Filter index defined on a set of expressions takes advantage of the logical relationships between multiple predicates by grouping them based on the commonality of their left-hand sides. These left-hand sides are arithmetic expressions that consist of one or more elementary attributes and user-defined functions, for example, HORSEPOWER(model,year). In the expression set, these left-hand sides appear in the predicates along with an operator and a constant on the right-hand side (RHS), for example, HORSEPOWER(model,year)>=150.

2.2 Indexable Predicates

The predicates that can be indexed with the Expression Filter indexing mechanism include any predicate with a constant on the right-hand side that uses one of the following predicate operators: =, !=, >, <, >=, <=, BETWEEN, IS NULL, IS NOT NULL, LIKE, and NVL.

The predicates that cannot be indexed are preserved in their original form and they are evaluated by value substitution in the last stage of expression evaluation. Some of the predicates that cannot be indexed include:

2.3 Index Representation

The Expression Filter index uses persistent database objects internally to maintain the index information for an expression set. The grouping information for all the predicates in an expression set is captured in a relational table called the predicate table. Typically, the predicate table contains one row for each expression in the expression set. An expression containing one or more disjunctions (two simple expressions joined by OR) is converted into a disjunctive-normal form (disjunction of conjunctions), and each disjunction in this normal form is treated as a separate expression with the same identifier as the original expression. The predicate table contains one row for each such disjunction.

The Expression Filter index can be tuned for better performance by identifying the most-common left-hand sides of the predicates (or discriminating predicate groups) in the expression set. The owner of the expression set (or the table storing expressions) can identify the predicate's left-hand sides or automate this process by collecting statistics on the expression set. For each common left-hand side, a predicate group is formed with all the corresponding predicates in the expression set. For example, if predicates with Model, Price, and HorsePower(Model, Year) attributes are common in the expression set, three predicate groups are formed for these attributes. The predicate table captures the predicate grouping information as shown in Figure 2-1.

Figure 2-1 Conceptual Predicate Table

Description of predicatetable.gif follows
Description of the illustration predicatetable.gif

For each predicate group, the predicate table has two columns: one to store the operator of the predicate and the other to store the constant on the right-hand side of the predicate. For a predicate in an expression, its operator and the right-hand side constant are stored under the corresponding columns of the predicate group. The predicates that do not fall into one of the preconfigured groups are preserved in their original form and stored in a VARCHAR2 column of the predicate table as sparse predicates. (For the example in Figure 2-1, the predicates on Mileage and Year fall in this category.) The predicates with IN lists and the predicates with a varying right-hand side (not a constant) are implicitly treated as sparse predicates. Native indexes are created on the predicate table as described in Section 2.4.

2.4 Index Processing

To evaluate a data item for a set of expressions, the left-hand side of each predicate group is computed and its value is compared with the corresponding constants stored in the predicate table using an appropriate operator. For example, using the predicate table, if HORSEPOWER('TAURUS',2001) returns 153, then the predicates satisfying this value are those interested in horsepower equal to 153 or those interested in horsepower greater than a value that is below 153, and so on. If the operators and right-hand side constants of the previous group are stored in the G3_OP and G3_RHS columns of the predicate table (in Figure 2-1), then the following query on the predicate table identifies the rows that satisfy this group of predicates:

SELECT  Rid  FROM predicate_table WHERE
     G3_OP = '=' AND G3_RHS = :rhs_val   or
     G3_OP = '>' AND G3_RHS < :rhs_val   or 
     ... 
-- where :rhs_val is the value from the computation of the left-hand side --

Expression Filter uses similar techniques for less than (<), greater than or equal to (>=), less than or equal to (<=), not equal to (!=, <>), LIKE, IS NULL, and IS NOT NULL predicates. Predicates with the BETWEEN operator are divided into two predicates with greater than or equal to and less than or equal to operators. Duplicate predicate groups can be configured for a left-hand side if it frequently appears more than once in a single expression, for example, Year >= 1996 and Year <= 2000.

The WHERE clause (shown in the previous query) is repeated for each predicate group in the predicate table, and the predicate groups are all joined by conjunctions. When the complete query (shown in the following example) is issued on the predicate table, it returns the row identifiers for the expressions that evaluate to true with all the predicates in the preconfigured groups. For these resulting expressions, the corresponding sparse predicates that are stored in the predicate table are evaluated using dynamic queries to determine if an expression is true for a particular data item.

SELECT Rid, Sparse_predicate FROM predicate_table
 WHERE                 --- predicates in group 1
  (G1_OP IS NULL OR     --- no predicate involving this LHS
   ((:g1_val IS NOT NULL AND 
     (G1_OP = '=' AND G1_RHS = :g1_val or
      G1_OP = '>' AND G1_RHS < :g1_val or 
      G1_OP = '<' AND G1_RHS > :g1_val or         
      ...) or 
     (:g1_val IS NULL AND G1_OP = 'IS NULL')))

 AND                   --- predicates in group 2
  (G2_OP IS NULL OR
   ((:g2_val IS NOT NULL AND 
     (G2_OP = '=' AND G2_RHS = :g2_val   or
      G2_OP = '>' AND G2_RHS < :g2_val   or
      G2_OP = '<' AND G2_RHS > :g2_val   or
      ...) or
     (:g2_val IS NULL AND G2_OP = 'IS NULL')))
 AND
...

For efficient execution of the predicate table query (shown previously), concatenated bitmap indexes are created on the {Operator, RHS constant} columns of selected groups. These groups are identified either by user specification or from the statistics about the frequency of the predicates (belonging to a group) in the expression set. With the indexes defined on preconfigured predicate groups, the predicates from an expression set are divided into three classes:

  1. Indexed predicates: Predicates that belong to a subset of the preconfigured predicate groups that are identified as most discriminating. Bitmap indexes are created for these predicate groups; thus, these predicates are also called indexed predicates. The previous query performs range scans on the corresponding index to evaluate all the predicates in a group and returns the expressions that evaluate to true with just that predicate. Similar scans are performed on the bitmap indexes of other indexed predicates, and the results from these index scans are combined using BITMAP AND operations to determine all the expressions that evaluate to true with all the indexed predicates. This enables multiple predicate groups to be filtered simultaneously using one or more bitmap indexes.

  2. Stored predicates: Predicates that belong to groups that are not indexed. These predicates are captured in the corresponding {Operator, RHS constant} columns of the predicate table, with no bitmap indexes defined on them. For all the expressions that evaluate to true with the indexed predicates, the previous query compares the values of the left-hand sides of these predicate groups with those stored in the predicate table. Although bitmap indexes are created for a selected number of groups, the optimizer may choose not to use one or more indexes based on their access cost. Those groups are treated as stored predicate groups. The query issued on the predicate table remains unchanged for a different choice of indexes.

  3. Sparse predicates: Predicates that do not belong to any of the preconfigured predicate groups. For expressions that evaluate to true for all the predicates in the indexed and stored groups, sparse predicates (if any) are evaluated last. If the expressions with sparse predicates evaluate to true, they are considered true for the data item.

Optionally, you can specify the common operators that appear with predicates on the left-hand side and reduce the number of range scans performed on the bitmap index. In the previous example, the Model attribute commonly appears in equality predicates, and the Expression Filter index can be configured to check only for equality predicates while processing the indexed predicate groups. Sparse predicates along with any other form of predicate on the Model attribute are processed and evaluated at the same time.

2.5 Predicate Table Query

Once the predicate groups for an expression set are determined, the structure of the predicate table and the query to be issued on the predicate table are fixed. The choice of indexed or stored predicate groups does not change the query. As part of Expression Filter index creation, the predicate table query is determined and a function is dynamically generated for this query. The same query (with bind variables) is used for any data item passed in for the expression set evaluation. This ensures that the predicate table query is compiled once and reused for evaluating any number of data items.

2.6 Index Creation and Tuning

The cost of evaluating a predicate in an expression set depends on the group to which it belongs. The index for an expression set can be tuned by identifying the appropriate predicate groups as the index parameters.

The steps involved in evaluating the predicates in an indexed predicate group are:

The steps involved in evaluating the predicates in a stored predicate group are:

The steps involved in evaluating the predicates in a sparse predicate group are:


Creating an Index from Default Parameters

In a schema, an attribute set can be used for one or more expression sets, and you can configure the predicate groups for these expression sets by associating the default index parameters with the attribute set. The (discriminating) predicate groups can be chosen with the knowledge of commonly occurring left-hand sides and their selectivity for the expected data.

The following command uses the DBMS_EXPFIL.DEFAULT_INDEX_PARAMETERS procedure to configure default index parameters with the Car4Sale attribute set:

BEGIN
  DBMS_EXPFIL.DEFAULT_INDEX_PARAMETERS('Car4Sale',
    exf$attribute_list (
       exf$attribute (attr_name => 'Model',      --- LHS for predicate group
                      attr_oper => exf$indexoper('='),
                      attr_indexed => 'TRUE'),   --- indexed predicate group 
       exf$attribute (attr_name => 'Price',
                      attr_oper => exf$indexoper('all'),
                      attr_indexed => 'TRUE'),
       exf$attribute (attr_name => 'HorsePower(Model, Year)',
                      attr_oper => exf$indexoper('=','<','>','>=','<='),
                      attr_indexed => 'FALSE')    --- stored predicate group
    )
  );
END;
/

For an expression set, create the Expression Filter index as follows:

CREATE INDEX InterestIndex ON Consumer (Interest) 
              INDEXTYPE IS EXFSYS.EXPFILTER; 

The index derives all its parameters from the defaults (Model, Price, and HorsePower(Model, Year)) associated with the corresponding attribute set. If the defaults are not specified, it implicitly uses all the scalar elementary attributes (Model, Year,Price, and Mileage) in the attribute set as its stored and indexed attributes.

You can fine-tune the default parameters derived from the attribute set for each expression set by using the PARAMETERS clause when you create the index or by associating index parameters directly with the expression set. The following CREATE INDEX statement with the PARAMETERS clause configures the index with an additional stored predicate:

CREATE INDEX InterestIndex ON Consumer (Interest) 
              INDEXTYPE IS exfsys.ExpFilter
  PARAMETERS ('ADD TO DEFAULTS STOREATTRS (CrashTestRating(Model, Year))');

For more information about creating indexes from default parameters, see "DEFAULT_INDEX_PARAMETERS Procedure" in Chapter 8 and "CREATE INDEX" in Chapter 6.


Creating an Index from Exact Parameters

If there is a need to fine-tune the index parameters for each expression set associated with the common attribute set, you can assign the exact index parameters directly to the expression set, using the DBMS_EXPFIL.INDEX_PARAMETERS procedure.

The following commands copy the index parameters from the defaults and then fine-tune them for the given expression set. An expression filter index created for the expression set uses these parameters to configure its indexed and stored predicate groups.

BEGIN
  -- Derive index parameters from defaults --
  DBMS_EXPFIL.INDEX_PARAMETERS(expr_tab  => 'Consumer',
                               expr_col  => 'Interest',
                               attr_list => null, 
                               operation => 'DEFAULT');
 
  -- Fine-tune the parameters by adding another stored attribute --
  DBMS_EXPFIL.INDEX_PARAMETERS(expr_tab  => 'Consumer',
                               expr_col  => 'Interest',
                               attr_list =>
                                exf$attribute_list (
                                 exf$attribute (
                                   attr_name => 'CrashTestRating(Model, Year)',
                                   attr_oper => exf$indexoper('all'),
                                   attr_indexed => 'FALSE')),
                               operation => 'ADD');
END;
/
 
 
CREATE INDEX InterestIndex ON Consumer (Interest) 
             INDEXTYPE IS EXFSYS.EXPFILTER;

For more information about creating indexes from exact parameters, see "INDEX_PARAMETERS Procedure" in Chapter 8 and "CREATE INDEX" in Chapter 6.

See Chapter 3 for a discussion on indexing expressions with XPath predicates.


Creating an Index from Statistics

If a representative set of expressions is already stored in the table, the owner of the table can automate the index tuning process by collecting statistics on the expression set, using the DBMS_EXPFIL.GET_EXPRSET_STATS procedure, and creating the index from these statistics, as shown in the following example:

BEGIN
  DBMS_EXPFIL.GET_EXPRSET_STATS (expr_tab => 'Consumer',
                                 expr_col => 'Interest');
END;
/

CREATE INDEX InterestIndex ON Consumer (Interest) 
             INDEXTYPE IS EXFSYS.EXPFILTER
  PARAMETERS ('STOREATTRS TOP 4 INDEXATTRS TOP 2');

For the previous index, four stored attributes are chosen based on the frequency of the corresponding predicate left-hand sides in the expression set, and out of these four attributes, the top two are chosen as indexed attributes. When a TOP n clause is used, any defaults associated with the corresponding attribute set are ignored. The attributes chosen for an index can be viewed by querying the USER_EXPFIL_PREDTAB_ATTRIBUTES view.

For more information about creating indexes from statistics, see "GET_EXPRSET_STATS Procedure" in Chapter 8 and "CREATE INDEX" in Chapter 6.

2.7 Index Usage

A query using the SQL EVALUATE operator on an Expression column can force the use of the index defined on such a column with an optimizer hint. (See the Oracle Database Performance Tuning Guide.) In other cases, the optimizer determines the cost of the Expression Filter index-based scan and compares it with the cost of alternate execution plans.

SELECT * FROM Consumer WHERE 
  EVALUATE (Consumer.Interest, 
            Car4Sale.getVarchar('Mustang',2000,18000,22000)) = 1 and
  Consumer.Zipcode BETWEEN 03060 and 03070;

For the previous query, if the Consumer table has an Expression Filter index defined on the Interest column and a native index defined on the Zipcode column, the optimizer chooses the appropriate index based on their selectivity and their access cost. In the current release, the selectivity and the cost of Expression Filter indexes are set as domain index defaults, and they are not computed for the expression set.

You can use the EXPLAIN PLAN statement to see if the optimizer picked the Expression Filter index for a query.

2.8 Index Storage and Maintenance

The Expression Filter index uses persistent database objects to maintain the index on a column storing expressions. All these secondary objects are created in the schema in which the Expression Filter index is created. There are three types of secondary objects for each Expression Filter index, and they use the following naming conventions:

To ensure the expression evaluation is valid, a table with an Expression column and the Expression Filter index on the Expression column should belong to the same schema. A user with CREATE INDEX privileges on a table cannot create an Expression Filter index unless the user is the owner of the table. By default, the predicate table is created in the user's default tablespace. You can specify an alternate storage clause for the predicate table when you create the index by using the PREDSTORAGE parameter clause. (See the section about the CREATE INDEX statement in Chapter 6.) The indexes on the predicate table are always created in the same tablespace as the predicate table.

An Expression Filter index created for an Expression column is automatically maintained to reflect any changes made to the expressions (with the SQL INSERT, UPDATE, or DELETE statements or SQL*Loader). The bitmap indexes defined on the predicate table could become fragmented when a large number of expressions are modified, added to the set, or deleted. You can rebuild these indexes online to reduce the fragmentation using the DBMS_EXPFIL.DEFRAG_INDEX procedure, as shown in the following example:

BEGIN
  DBMS_EXPFIL.DEFRAG_INDEX (idx_name => 'InterestIndex');
END;
/

See "DEFRAG_INDEX Procedure" in Chapter 8 for more information about this procedure.

You can rebuild the complete Expression Filter index offline by using the ALTER INDEX ... REBUILD statement. This is useful when the index is marked UNUSABLE following a table maintenance operation. When the default index parameters associated with an attribute set are modified, they can be incorporated into the existing indexes using the ALTER INDEX ... REBUILD statement with the DEFAULT parameter clause. See the section about ALTER INDEX REBUILD statement in Chapter 6.