Oracle® OLAP DML Reference 10g Release 1 (10.1) Part Number B10339-02 |
|
|
View PDF |
The AGGMAP command identifies an aggmap object as a specification for aggregation and adds an aggregation specification to the definition of the current aggmap object.
Tip: The current aggmap definition is the definition of the aggmap object most recently defined or considered. Issue a CONSIDER statement to explicitly make an aggmap definition the current aggmap definition. |
Note: There are two other OLAP DML statements that are also sometimes referred to as "AGGMAP statements":
|
Syntax
AGGMAP [specification]
Arguments
A multiline text expression that is the aggregation specification for the current aggmap object. Separate statements with newline delimiters (\n
), or use JOINLINES. An aggregation specification begins with an ALLOCMAP statement and ends with an END
statement. Between these statements, you code one or more the following statements depending on the calculation that you want to specify. Minimally, an aggregation specification consists of a RELATION (for aggregation) statement. You can create more complex aggregation specifications by including one or more of the following statements in the specification as described in "Aggregations":
Note: Special considerations apply when you are writing a specification to aggregate a variable dimensioned by a compressed composite. See "Aggregating Variables Dimensioned by Compressed Composites".You cannot specify a conjoint dimension in the specification for the aggmap; use composites instead. |
Notes
Keep the following points in mind when designing aggregating values in a variable dimensioned by a compressed composite:
You must use the AGGREGATE command to aggregate data in a variable dimensioned by a compressed composite; you cannot use the AGGREGATE function.
When coding the aggregation specification, follow these guidelines:
MODEL statements must precede RELATION statements.
MODEL statements must either exclude a PRECOMPUTE clause or specify PRECOMPUTE ALL. You cannot include a dynamic model in the aggregation specification.
RELATION statements for the hierarchical dimensions in the compressed composite must include a PRECOMPUTE clause.
Once the data in a variable dimensioned by a compressed composite is aggregated, it is read-only-data. Before you can recalculate values using the AGGREGATE command, you must first delete the values using the CLEAR AGGREGATES command.
There is no support for parallel aggregation. Instead, use multiple sessions to compute variables or partitions that have their own compressed composites.
There is no support for incremental aggregation. Instead, partition using a dense time dimension with local compressed composites. In this way you can aggregate only those partitions that contain new data.
A number of options can impact aggregation as outlined in Table 6-1, "Aggregation Options".
Table 6-1 Aggregation Options
Statement | Description |
---|---|
MULTIPATHHIER |
An option that specifies that a given cell that contains detail data can have more than one path into a cell that contains aggregated data. |
POUTFILEUNIT |
An option that identifies a destination for status information about an aggregation operation. |
SESSCACHE |
An option that that controls whether Oracle OLAP uses the session cache to store variable data that has been aggregated on the fly. |
VARCACHE |
An option that specifies if and where variable data that has been aggregated on the fly should be stored. (The VARCACHE option is only one factor that can determine this decision.) |
Table 6-2, "System Properties Used When Aggregating or Allocating Data" lists system properties documented in this manual that relate to aggregation or allocation.
Table 6-2 System Properties Used When Aggregating or Allocating Data
Property | Description |
---|---|
$AGGMAP | A property that specifies the default aggregation specification for a variable. |
$AGGREGATE_FROM | A property that specifies the name of an object from which to obtain detail data when aggregating data. |
$AGGREGATE_FROMVAR | A property that specifies the objects from which to obtain detail data when aggregating data. |
$ALLOCMAP
|
A property that specifies the default allocation specification for a variable. |
$COUNTVAR |
A property that specifies that Oracle OLAP counts the number of leaf nodes that contributed to an aggregate value when an AGGREGATE function executes. |
$VARCACHE | A property that specifies whether Oracle OLAP stores or caches variable data that is the result of the execution of a AGGREGATE function or $NATRIGGER expression. |
AGGREGATE automatically checks relations for circularity in and among multiple hierarchies. When you first define hierarchies, check for circularity by setting PRECOMPUTE statements to NA
and AGGINDEX to NO
. A XSHIERCK01 error during aggregation indicates that a circular hierarchy may have been detected. However, when the message includes a reference to UNDIRECTED, then multiple paths to an ancestor from a detail data cell have been detected. Some calculations require that a detail data cell use multiple paths to the same ancestor cell. When this is the case, then you need to set the MULTIPATHHIER option to YES
before you execute the AGGREGATE command. Otherwise, you need to correct the error in the hierarchy structure. For more details about this error message and how to interpret it, see the MULTIPATHHIER option.
Examples
Example 6-15 Combining Pre-calculation and Calculation on the Fly
This example describes the steps you can take to pre-calculate some of the data in your analytic workspace and specify that the rest should be calculated when users request it.
Suppose you define an analytic workspace named mydtb
that has a units
variable with the following definition.
DEFINE units INTEGER <time, SPARSE <product, geography>>
You now need to create and add a specification to the aggmap, which will specify the data that should be aggregated. This example shows you how to use an input file, which contains OLAP DML statements that define the aggmap and add a specification to it:
Identify the name of each dimension's hierarchy. When you have defined the hierarchies as self-relations, you use the names of the self-relations.
Decide which data to aggregate.
Suppose you want to calculate data for all levels of the time
and product
dimensions, but not for geography
. The geography
dimension's lowest level of data is at the city level. The second level of the hierarchy has three dimension values that represent regions: East
, Central
, and West
. The third level of the hierarchy has one dimension value: Total
.
Suppose that you want to pre-calculate the data for East
and store it in the analytic workspace. You want the data for Central
, West
, and Total
to be calculated only when users request that data — that data will not be stored in the analytic workspace. Therefore, you need to specify this information in the specification that you add to your aggmap object.
Create an ASCII text file named units.txt
. Add the following OLAP DML statements to your text file.
DEFINE units.agg AGGMAP <time, SPARSE <product, geography>> AGGMAP RELATION myti.parent RELATION mypr.parent RELATION myge.parent PRECOMPUTE ('East') END
The preceding statements define an aggmap named units.agg
, then add the three RELATION statements to the aggregation specification when you read the units.txt file into your analytic workspace.
To read the units.txt
file into your analytic workspace, execute the following statement.
INFILE 'units.txt'
The units.agg
aggmap should now exist in your analytic workspace. You can aggregate the units
variable with the following statement.
AGGREGATE units USING units.agg
Now the data for East
for all times and products has been calculated and stored in the analytic workspace.
Set up the analytic workspace so that when a user requests data for Central
, West
, or Total
, that data will be calculated and displayed. It is generally a good idea to compile the aggmap object before using it with the AGGREGATE function, as shown by the following statement.
COMPILE units.agg
This is not an issue when you are just using the AGGREGATE command, because the command compiles the aggmap object before it uses it. However, when you do not use the FUNCDATA keyword with the AGGREGATE command, the metadata that is needed to perform calculation on the fly has not been compiled yet. As long as you have performed all other necessary calculations (such as calculating models), it is a good practice to compile the aggmap when you load data. When you fail to do so, that means that every time a user opens the analytic workspace, that user will have to wait for the aggregation to be compiled automatically. In other words, when any data will be calculated on the fly, you can improve query performance for all of your users by compiling the aggmap before making the analytic workspace available to your users.
Add a property to the units
variable.
CONSIDER units PROPERTY '$NATRIGGER' 'AGGREGATE(units USING units.agg)'
This property indicates that when a data cell contains an NA
value, Oracle OLAP will call the AGGREGATE function to aggregate the data for that cell. Therefore, any units
data that is requested by a user will be displayed. However, only the data for the East
dimension value of the geography
dimension has actually been aggregated and stored in the analytic workspace. All other data (for Central
, West
, and Total
) is calculated only when users request it.
Example 6-16 Performing Non-additive Aggregation
This example shows how to use operators and arguments to combine additive and non-additive aggregation.
Suppose that you have defined four variables: sales
, debt
, interest_rate
, and inventory
. The variables have been defined with the same dimensionality where cp
is a composite that has been defined with the product
and geography
dimensions.
<time cp<product geography>>
Suppose you want to use one AGGREGATE command to aggregate all four variables. The debt
variable requires additive aggregation. The sales
variable requires a weighted sum aggregation, and interest_rate
requires a hierarchical weighted average. Therefore, sales
andinterest_rate
will each require a weight object, which you need to define and populate with weight values. inventory
requires a result that represents the total inventory, which is the last value in the hierarchy.
You will specify the aggregation operation for debt
and inventory
with the OPERATOR keyword. However, because sales
and interest_rate
have aggregation operations that require weight objects, you must use the ARGS keyword to specify their operations. You define an operator variable to use the OPERATOR keyword. Typically, the operator variable is dimensioned by a measure dimension or a line item dimension.
Here are the steps to define the aggregation you want to occur:
Because you will also be using a measure dimension to define an argument variable to use with the ARGS keyword, define that measure
dimension, as illustrated by the following statements.
DEFINE measure DIMENSION TEXT MAINTAIN measure 'sales', 'debt', 'interest_rate', 'inventory'
Note: Whenever you use ameasure dimension in a RELATION statement, you must include a MEASUREDIM statement in the same aggregation specification |
Define an operator variable named opvar
and populate it.The statements specify that the aggregation for debt
should use the SUM
operator, and the aggregation for inventory
should use the HLAST
operator.
DEFINE opvar TEXT <measure> opvar (measure 'sales') = 'WSUM' opvar (measure 'debt') = 'SUM' opvar (measure 'interest_rate') = 'HWAVERAGE' opvar (measure 'inventory') = 'HLAST'
Because sales
and interest_rate
require weight objects, define and populate those weight objects. The following statement defines a weight object named currency
(to be used by sales
).
DEFINE currency DECIMAL <time geography>
Notice that the currency
variable is dimensioned only by time
and geography
. The purpose of this variable is to provide weights that act as currency conversion information for foreign countries; therefore, it is unnecessary to include the product
dimension.
Populate currency
with the weight values that you want to use.
The interest_rate
variable's nonaddictive aggregation (hierarchical weighted average) requires the sum of the variable debt
. In other words, interest_rate
cannot be aggregated without the results of the aggregation of debt
.
You can now define an argument variable, which you will need to specify the aggregation results of debt
as a weight object for interest_rate
. You will use the same argument variable to specify currency
as the weight object for the sales
variable. The following statement defines an argument variable named argvar
.
DEFINE argvar TEXT <measure>
The next few statements populate the argument variable.
argvar (measure 'sales') = 'weightby currency' argvar (measure 'debt') = NA argvar (measure 'interest_rate') = 'weightby debt' argvar (measure 'inventory') = NA
For the aggregation of product
and geography
, the data for the sales
, debt
, and interest_rate
variables can simply be added. But the inventory
variable requires a hierarchical weighted average. This means that it is necessary to define a second operator variable and a second argument variable, both of which will be used in the RELATION statement for product
and geography
.
The following statements define the second operator variable and populate it.
DEFINE opvar2 TEXT <measure> opvar (measure 'sales') = 'Sum' opvar (measure 'debt') = 'Sum' opvar (measure 'interest_rate') = 'Sum' opvar (measure 'inventory') = 'HWAverage'
The following statements define the second argument variable and populate it.
DEFINE argvar2 TEXT <measure> argvar (measure 'sales') = NA argvar (measure 'debt') = NA argvar (measure 'interest_rate') = NA argvar (measure 'inventory') = 'weightby debt'
Now create the aggmap, by issuing the following statements.
DEFINE sales.agg AGGMAP <time, CP<product geography>> AGGMAP RELATION time.r OPERATOR opvar ARGS argvar RELATION product.r OPERATOR opvar2 ARGS argvar2 RELATION geography.r OPERATOR opvar2 ARGS argvar2 MEASUREDIM measure END
Finally, use the following statement to aggregate all four variables.
AGGREGATE sales debt interest_rate inventory USING sales.agg
Example 6-17 Programmatically Defining an Aggmap
The following program uses the EXISTS function to test whether an AGGMAP already exists, and defines the AGGMAP when it does not. It then uses the AGGMAP command to define the specification for the aggmap.
DEFINE MAKEAGGMAP PROGRAM LD Create dynamic aggmap PROGRAM IF NOT EXISTS ('test.agg') THEN DEFINE test.agg AGGMAP <geography product channel time> ELSE CONSIDER test.agg AGGMAP JOINLINES(- 'RELATION geography.parentrel PRECOMPUTE (geography.lvldim 2 4)' - 'RELATION product.parentrel' - 'RELATION channel.parentrel' - 'RELATION time.parentrel' - 'END') END
Example 6-18 Creating an Aggmap Using an Input File
Suppose that you have created a disk file called salesagg.txt
, which contains the following aggmap definition and specification.
DEFINE sales.agg AGGMAP <time, product, geography> AGGMAP RELATION time.r PRECOMPUTE (time NE 'Year99') RELATION product.r PRECOMPUTE (product NE 'ALL') RELATION geography.r CACHE STORE END
To include the sales.agg
aggmap in your analytic workspace, execute the following statement, where inf
is the alias for the directory where the file is stored.
INFILE 'inf/salesagg.txt'
The sales.agg
aggmap has now been defined and contains the three RELATION statements and the CACHE statement. In this example, you are specifying that all of the data for the hierarchy for the time
dimension, time.r
, should be aggregated, except for any data that has a time
dimension value of Year99
. All of the data for the hierarchy for the product
dimension, product.r
, should be aggregated, except for any data that has a product
dimension value of All
. All geography
dimension values are aggregated. The CACHE STORE statement specifies that any data that are rolled up on the fly should be calculated just once and stored in the cache for other access requests during the same session.
You can now use the sales.agg
aggmap with an AGGREGATE command, such as.
AGGREGATE sales USING sales.agg
In this example, any data value that dimensioned by a Year99
value of the time
dimension or an All
value of the product
dimension is calculated on the fly. All other data is aggregated and stored in the analytic workspace.
Example 6-19 Using Multiple Aggmaps
When you use a forecast, you must make sure that all of the input data that is required by that forecast has been pre-calculated. Otherwise, the forecast uses incorrect or nonexistent data. For example, suppose your forecast requires that all line items are aggregated. Using a budget
variable that is dimensioned by time
, line
, and division
, one approach would be to perform a complete aggregation of the line
dimension, forecast the dimension of type DAY, WEEK, MONTH, QUARTER, or YEAR
, and then aggregate the remaining dimension, division
.
You can support this processing by defining three aggmap objects:
Define the first aggmap, named forecast.agg1
, which aggregates the data needed by the forecast. It contains the following statement.
RELATION line.parentrel
Define the second aggmap, named forecast.agg2
, which aggregates the data generated using the first aggmap and the forecast. It contains the following statement.
RELATION division.parentrel PRECOMPUTE ('L3')
Define the third aggmap, named forecast.agg3
, which contains the RELATION statements in the specifications of the first two aggmaps.
RELATION line.parentrel RELATION division.parentrel PRECOMPUTE ('L3')
When your forecast is in a program named fore.prg
, then you would use the followinfstatements to aggregate the data.
AGGREGATE budget USING forecast.agg1 "Aggregate over LINE CALL fore.prg "Forecast over TIME AGGREGATE budget USING forecast.agg2 "Aggregate over DIVISION "Compile the limit map for LINE and DIVISION COMPILE forecast.agg3 "Use the combined aggmap for the AGGREGATE function CONSIDER budget PROPERTY 'NATRIGGER' 'AGGREGATE(budget USING forecast.agg3)'
Example 6-20 Using an AGGINDEX Statement in an Aggregation Specification
Suppose you have two variables, sales1
and sales2
, with the following definitions.
DEFINE sales1 DECIMAL <time, SPARSE<product, channel, customer>> DEFINE sales2 DECIMAL <time, SPARSE<product, channel, customer>>
You do not want to precompute and commit all of the sales
data to the database, because disk space is limited and you need to improve performance. Therefore, you need to create an aggmap, in which you specify which data should be pre-computed and which data should be calculated on the fly.
You define the aggmap, named sales.agg
, with the following statement.
DEFINE sales.agg AGGMAP <time, SPARSE<product, channel, customer>>
Next, you use the AGGMAP command to enter the following specification for sales.agg
.
RELATION time.r PRECOMPUTE (time NE 'Year99') RELATION product.r PRECOMPUTE (product NE 'All') RELATION channel.r RELATION customer.r AGGINDEX NO
This aggregation specification tells Oracle OLAP that all sales
data should be rolled up committed to the database with the exception of any data that has a time
dimension value of Year99
or a product
dimension value of All
—the data for those cells is calculated the first time a user accesses them. The AGGINDEX value of NO
tells Oracle OLAP not to create the indexes for data that should be calculated on the fly.
Now you execute the following statement.
sales2 = AGGREGATE(sales1 USING sales.agg) ACROSS SPARSE - <product, channel, customer>
sales2
now contains all of the data in sales1
, plus any data that is aggregated for Year99
—this is because time
is not included in a composite.
On the other hand, the data that is aggregated for the product
value of All
is not computed and stored in sales2
. This is because the product
dimension is included in a composite—the indexes that are required for dimensions that are included in composites were not created because the aggregation specification contains an AGGINDEX
NO
statement. Since the indexes did not exist, Oracle OLAP never called the AGGREGATE function to compute the data to be calculated on the fly.
Example 6-21 Aggregating By Dimension Attributes
Assume that when your business makes a sales it keeps records of the customer's name, sex, age, and the amount of the sale. To hold this data, your analytic workspace contains a dimension named customer
and three variables (named customer_sex
, customer_age
, and sales
) that are dimensioned by customer
.
REPORT W 14 <customer_sex customer_age sales> CUSTOMER CUSTOMER_SEX CUSTOMER_AGE SALES -------------- -------------- -------------- -------------- Clarke M 26 26,000.00 Smith M 47 15,000.00 Ilsa F 24 33,000.00 Rick M 33 22,000.00
You want to aggregate the detail sales data over sex and age to calculate the amount of sales you have made to males and females, and the amount of sales for different age ranges. To hold this data you will need an INTEGER variable that is dimensioned by hierarchical dimensions for sex and age. You will also need an aggmap object that specifies the calculations that Oracle OLAP will perform to populate this variable from the data in the sales
variable.
To create and populate the necessary objects, you take the following steps:
Create and populate dimensions and self-relations for hierarchical dimensions named sex
and age
.
DEFINE sex DIMENSION TEXT DEFINE sex.parentrel RELATION sex <sex> DEFINE age DIMENSION TEXT DEFINE age.parentrel RELATION age <age> AGE AGE.PARENTREL -------------- -------------------- 0-20 All 21-30 All 31-50 All 51-100 All No Response All All NA SEX SEX.PARENTREL -------------- -------------------- M All F All No Reponse All All NA
Create and populate relations that map the age
and sex
dimensions to the customer
dimension.
DEFINE customer.age.rel RELATION age <customer> DEFINE customer.sex.rel RELATION sex <customer> CUSTOMER CUSTOMER.AGE.REL CUSTOMER.SEX.REL -------------- -------------------- -------------------- Clarke 21-30 M Smith 31-50 M Ilsa 21-30 F Rick 31-50 M
Create a variable named sales_by_sex_age
to hold the aggregated data. Like the sales
variable this variable is of type DECIMAL, but it is dimensioned by sex
and age
rather than by customer
.
DEFINE sales_by_sex_age VARIABLE DECIMAL <sex age>
Define an AGGMAP type aggmap object named ssa_aggmap
to calculate the values of the sales_by_sex_age
variable.
DEFINE SSA_AGGMAP AGGMAP AGGMAP RELATION sex.parentrel OPERATOR SUM RELATION age.parentrel OPERATOR SUM BREAKOUT DIMENSION customer - BY customer.sex.rel, customer.age.rel OPERATOR SUM END
Notice that the specification for the ssa_aggmap
includes the following statements:
A BREAKOUT DIMENSION statement that specifies how to map the customer
dimension of the sales
variable to the lowest-level values of the sales_by_sex_age
variable. This statement specifies the name of the dimension of the variable that contains the detail values (that is, customer
) and the names of the relations (customer.sex.rel
and customer.age.rel
) that define the relations between customer
dimension and the sex
and age
dimensions.
Two RELATION statements that specify how to aggregate up the sex
and age
dimensions of the sales_by_sex_age
variable. Each of these statements includes the name of the child-parent relation (sex.parentrel
or age.parentrel
) that define the self-relation for the hierarchal dimension (sex
or age
).
Populate the sales_by_sex_age
variable by issuing and AGGREGATE command that specifies that the detail data for the aggregation comes from the sales
variable.
AGGREGATE sales_by_sex_age USING ssa_aggmap FROM sales
After performing the aggregation, a report of sales_by_sex_age
shows the calculated values.
---------------------SALES_BY_SEX_AGE---------------------- ----------------------------SEX---------------------------- AGE M F No Reponse All -------------- -------------- -------------- -------------- -------------- 0-20 NA NA NA NA 21-30 26,000.00 33,000.00 NA 59,000.00 31-50 37,000.00 NA NA 37,000.00 51-100 NA NA NA NA No Response NA NA NA NA All 63,000.00 33,000.00 NA 96,000.00
Example 6-22 Using a CACHE Statement in an Aggregation Specification
Suppose you have a sales
variable with the following definition.
DEFINE sales DECIMAL <time, SPARSE<product, channel, customer>>
You do not want to pre-compute and commit all of the sales
data, because space is limited and you need to improve performance. Therefore, you need to create an aggmap, in which you will specify which data should be pre-computed and which data should be calculated on the fly.
You define the aggmap, named sales.agg
, with the following statement.
DEFINE sales.agg AGGMAP <time, SPARSE<product, channel, - customer>>
Next, you use the AGGMAP statement to enter the following aggregation specification forsales.agg
.
AGGMAP RELATION time.r PRECOMPUTE (time NE 'YEAR99') RELATION product.r PRECOMPUTE (product NE 'ALL') RELATION channel.r RELATION customer.r CACHE SESSION END
This aggregation specification tells Oracle OLAP that all sales
data should be rolled up and committed, with the exception of any cells that have a time dimension value of Year99
or a product dimension value of ALL
; the data for those cells will be calculated the first time a user accesses them. Because the CACHE statement uses the SESSION keyword, that means that when those cells are calculated on the fly, the data is stored in the cache for the remainder of the Oracle OLAP session. That way, the next time a user accesses the same cell, the data will not have to be calculated again. Instead, the data will be retrieved from the session cache.
Example 6-23 Populating All Levels of a Hierarchy Except the Detail Level
Assume that your analytic workspace contains the relations and dimensions with the following definitions.
DEFINE geog.d TEXT DIMENSION DEFINE geog.r RELATION geog.d <geog.d> DEFINE sales_by_units INTEGER VARIABLE <geog.d> DEFINE sales_by_revenue DECIMAL VARIABLE <geog.d> DEFINE price_per_unit DECIMAL VARIABLE <geog.d>
Assume that you create two aggmap objects. One aggmap object, named units_aggmap
, is the specification to aggregate data in the sales_by_units
variable. The other aggmap object, revenue_aggmap
, is the specification to calculate all of the data except the detail data in the sales_by_revenue
variable.
DEFINE units_aggmap AGGMAP AGGMAP RELATION geog.r OPERATOR SUM END DEFINE revenue_aggmap AGGMAP AGGMAP RELATION geog.r OPERATOR WSUM ARGS WEIGHTBY price_per_unit CACHE NOLEAF END
The following steps outline the aggregation process:
Before either the sales_by_unit
or sales_by_revenue
variables are aggregated, they have the following values.
GEOG.D SALES_BY_UNIT SALES_BY_REVENUE --------- ------------- ---------------- Boston 1 NA Medford 2 NA San Diego 3 NA Sunnydale 4 NA MA NA NA CA NA NA USA NA NA
After the data for the sales_by_unit
variable is aggregated, the sales_by_unit
and sales_by_revenue
variables have the following values.
AGGREGATE sales_by_unit USING units_aggmap GEOG.D SALES_BY_UNIT SALES_BY_REVENUE --------- ------------- ---------------- Boston 1 NA Medford 2 NA San Diego 3 NA Sunnydale 4 NA MA 3 NA CA 7 NA USA 10 NA
After the data for the sales_by_revue
variable is aggregated, the sales_by_unit
and sales_by_revenue
variables have the following values.
AGGREGATE sales_by_revenue USING revenue_aggmap FROM units_aggmap GEOG.D SALES_BY_UNIT SALES_BY_REVENUE --------- ------------- ---------------- Boston 1 NA Medford 2 NA San Diego 3 NA Sunnydale 4 NA MA 3 13.5 CA 7 31.5 USA 10 45.0
Example 6-24 Aggregating into a Different Variable
Assume that there is a variable named sales
that is dimensioned by time
, a hierarchical dimension, and district
, a non-hierarchical dimension.
DEFINE time DIMENSION TEXT DEFINE time.parentrel RELATION time <time> DEFINE district DIMENSION TEXT DEFINE sales VARIABLE DECIMAL <time district> -----------------------SALES----------------------- ---------------------DISTRICT---------------------- TIME North South West East ------------ ------------ ------------ ------------ ------------ 1976Q1 168,776.81 362,367.87 219,667.47 149,815.65 1976Q2 330,062.49 293,392.29 237,128.26 167,808.03 1976Q3 304,953.04 354,240.51 170,892.80 298,737.70 1976Q4 252,757.33 206,189.01 139,954.56 175,063.51 1976 NA NA NA NA
Assume also that you want to calculate the total sales for each quarter and year for all districts except the North
district. To perform this calculation using an aggmap object, you take the following steps:
Define a valueset named not_north
that represents the values of district for which you want to aggregate data.
DEFINE not_north VALUESET district LIMIT not_north TO ALL LIMIT not_north REMOVE 'North'
Define a variable named total_sales_exclud_north
to hold the results of the calculation.
DEFINE total_sales_exclud_north VARIABLE DECIMAL <time>
Notice that, like sales
, the total_sales_exclud_north
variable is dimensioned by time. However, unlike sales
, the total_sales_exclud_north
variable is not dimensioned by district
since it will hold detail data for each district, but only the total (aggregated) values for the South
, West
, and East
districts (that is, all districts except North
).
Define an aggmap object that specifies the calculation that you want performed.
DEFINE agg_sales_exclud_north AGGMAP AGGMAP RELATION time.parentrel OPERATOR SUM DROP DIMENSION district OPERATOR SUM VALUES not_north END
Notice that the aggregation specification consists of two statements that specify how to perform the aggregation:
A RELATION statement that specifies how to aggregate up the hierarchical time
dimension
A DROP DIMENSION statement that specifies how to aggregate across the non-hierarchical district
dimension. In this case, the DROP DIMENSION also uses the not_north
valueset to specify that values for the North
district are excluded when performing the aggregation
Aggregate the data.
AGGREGATE total_sales_exclud_north USING agg_sales_exclud_north FROM sales
The report of the total_sales_exclud_north
variable shows the aggregated values.
TIME ALL_SALES_EXCEPT_NORTH ------------ ------------------------------ 1976Q1 731,850.99 1976Q2 698,328.58 1976Q3 823,871.02 1976Q4 521,207.09 1976 2,775,257.69
Example 6-25 Using a MEASUREDIM Statement in an Aggregation Specification
Suppose you have defined a measure dimension named measure
. You then define an operation variable named myopvar
, which is dimensioned by measure
. When you use myopvar
in an aggregation specification, you must also include a MEASUREDIM statement that identifies measure
as the dimension is included in the definition of myopvar
.
The MEASUREDIM statement should follow the last RELATION (for aggregation) statement in the aggregation specification, as shown in the following example.
DEFINE sales.agg AGGMAP <time, product, geography> AGGMAP RELATION time.r OPERATOR myopvar RELATION product.r RELATION geography.r MEASUREDIM measure END
Example 6-26 Solving a Model in an Aggregation
This example uses the budget
variable.
DEFINE budget VARIABLE DECIMAL <line time> LD Budgeted $ Financial
The time
dimension has two hierarchies (Standard
and YTD
) and a parent relation named time.parentrel
as follows.
-----TIME.PARENTREL------ ----TIME.HIERARCHIES----- TIME Standard YTD -------------- ------------ ------------ Last.YTD NA NA Current.YTD NA NA Jan01 Q1.01 Last.YTD ... Dec01 Q4.01 Last.YTD Jan02 Q1.02 Current.YTD Feb02 Q1.02 Current.YTD Mar02 Q1.02 Current.YTD Apr02 Q2.02 Current.YTD May02 Q2.02 Current.YTD Q1.01 2001 NA ... Q4.01 2001 NA Q1.02 2002 NA Q2.02 2002 NA 2001 NA NA 2002 NA NA
The relationships among line items are defined in the following model.
DEFINE income.budget MODEL MODEL DIMENSION line time opr.income = gross.margin - marketing gross.margin = revenue - cogs revenue = LAG(revenue, 12, time) * 1.02 cogs = LAG(cogs, 1, time) * 1.01 marketing = LAG(opr.income, 1, time) * 0.20 END
The following aggregation specification pre-aggregates all of the data. Notice that all of the data must be pre-aggregated because the model includes both LAG functions and a simultaneous equation.
DEFINE budget.aggmap1 AGGMAP AGGMAP MODEL income.budget RELATION time.parentrel END
Example 6-27 Aggregating Up a Hierarchy
Suppose you define a sales
variable with the following statement.
DEFINE sales VARIABLE <time, SPARSE <product, geography>>
The aggregation specification for sales
might include RELATION statements like the following.
AGGMAP RELATION time.r PRECOMPUTE ('Yr98', 'Yr99') RELATION product.r RELATION geography.r PRECOMPUTE (geography NE 'Atlanta') END
The AGGREGATE command will aggregate values for Yr98
and Yr99
, over all of products, and over all geographic areas except for Atlanta
. All other aggregates are calculated on the fly.
Example 6-28 Using Valuesets
Suppose you have a hierarchy dimension named time.type, whose dimension values are Fiscal
and Calendar
, in that order. These hierarchies are in conflict, and you want to precompute some time
data but calculate the rest on the fly. Because the Calendar
hierarchy is the last dimension value in the hierarchy dimension, this means that you need to define a valueset in order to get the correct results for the Fiscal
hierarchy.
First, use the following statements to define and populate a valueset.
DEFINE time.vs VALUESET time LIMIT time.vs TO 'Calendar' 'Fiscal'
You can then use the valueset in the following RELATION statement. Because the Fiscal hierarchy is the last hierarchy in the valueset, the data that is aggregated will be accurate for the Fiscal hierarchy.
RELATION time.r(time.vs) PRECOMPUTE ('Yr99', 'Yr00')
Example 6-29 Aggregating with a RELATION Statement That Uses an ARGS Keyword
You can list the arguments in a RELATIION statement directly in the statement or as the value of a text variable. For example, the following statement specifies WEIGHTBY wobj
as an argument.
RELATION time.r OPERATOR wsum ARGS WEIGHTBY wobj
Alternatively, you can define an variable for the argument whose value is the text of the WEIGHTBY clause.
DEFINE argvar TEXT argvar = 'WEIGHTBY wobj'
Then the RELATION statement can specify the text variable that contains the WEIGHTBY clause.
RELATION time.r OPERATOR WSUM ARGS argvar
Example 6-30 Aggregating Using a Measure Dimension
Suppose you want to use a single AGGREGATE command to aggregate the sales
, units
, price
, and inventory
variables. When you want to use the same operator for each variable, then you do not need to use a measure dimension. However, when you want to specify different aggregation operations, then you need to use a measure dimension.
The following statement defines a dimension named measure
.
DEFINE measure DIMENSION TEXT
You can then use a MAINTAIN statement to add dimension values to the measure
dimension.
MAINTAIN measure ADD 'sales', 'units', 'quota', 'inventory'
Use the measure
dimension to dimension a text variable named meas.opvar
that you will use as the operator variable.
DEFINE meas.opvar TEXT WIDTH 2 <measure>
The following statements add values to OPVAR
meas.opvar (measure 'sales') = 'SU' meas.opvar (measure 'units') = 'SU' meas.opvar (measure 'price') = 'HA' meas.opvar (measure 'inventory') = 'HL'
The aggregation specification might look like the following. Note that when you specify an operator variable in a RELATION statement, you must include a MEASUREDIM statement that specifies the name of the measure dimension (measure
in the following example) in the aggregation specification.
DEFINE opvar.aggmap AGGMAP AGGMAP RELATION geography.parentrel PRECOMPUTE (geography.lvldim 2 4) RELATION product.parentrel OPERATOR opvar RELATION channel.parentrel OPERATOR opvar RELATION time.parentrel OPERATOR opvar MEASUREDIM measure END
Example 6-31 Aggregating Using a Line Item Dimension
Suppose you have two variables, actual
and budget
, that have these dimensions.
<time line division>
You want to use different methods to calculate different line items. You create a text variable that you will use as the operator variable.
DEFINE line.opvar TEXT WIDTH 2 <line>
You then populate line.opvar
with the appropriate operator for each line item, for example:
line.opvar (line 'Net.Income') = 'SU' line.opvar (line 'Tax.Rate') = 'AV'
The aggregation specification might look like this.
DEFINE LINE.AGGMAP AGGMAP AGGMAP RELATION time.parentrel OPERATOR line.opvar RELATION division.parentrel END
Example 6-32 Skip-Level Aggregation
Suppose you want to aggregate sales
data. The sales
variable is dimensioned by geography
, product
, channel
, and time
.
First, consider the hierarchy for each dimension. How many levels does each hierarchy have? What levels of data do users typically query? When you are designing a new workspace, what levels of data do your users plan to query?
Suppose you learn the information described in the following table about how users tend to query sales
data for the time
hierarchy.
Time Level Names | Descriptive Level Name | Examples of Dimension Values | Do users query this level often? |
---|---|---|---|
L1 | Year | Year99 , Year00 |
yes |
L2 | Quarter | Q3.99 , Q3.99 , Q1.00 |
yes |
L3 | Month | Jan99 , Dec00 |
yes |
While the next table shows how your users tend to query sales
data for the geography
hierarchy.
Geography Level Names | Descriptive Level Name | Examples of Dimension Values | Do users query this level often? |
---|---|---|---|
L1 | World | World |
yes |
L2 | Continent | Europe, Americas |
no |
L3 | Country | Hungary, Spain |
yes |
L4 | City | Budapest, Madrid | yes |
Finally, the next table shows how your users tend to query sales
data for the product
dimension hierarchy.
Product Level Names | Descriptive Level Name | Examples of Dimension Values | Do users query this level often? |
---|---|---|---|
L1 | All Products | Totalprod |
yes |
L2 | Division | Audiodiv, Videodiv |
yes |
L3 | Category | TV, VCR |
yes |
L4 | Product | Tuner, CDplayer | yes |
Using this information about how users query data, you should use the following strategy for aggregation:
Fully aggregate time
and product
because all levels are queried frequently.
For the geography
dimension, aggregate data for L1
(World
) and L3
(Country
) because they are queried frequently. However, L2
is queried less often and so can be calculated on the fly.
The lowest level of data was loaded into the analytic workspace. The aggregate data is calculated from this source data.
Therefore, the aggregation specification might look like the following.
RELATION time.parentrel RELATION geography.parentrel PRECOMPUTE (geog.leveldim 'L3' 'L1') RELATION product.parentrel
Example 6-33 Aggregation Specification with RELATION Statements That Include PRECOMPUTE Clauses
This aggregation specification uses PRECOMPUTE clauses in the RELATION statements to limit the data that is aggregated by the AGGREGATE command.
DEFINE gpct.aggmap AGGMAP LD Aggmap for sales, units, quota, costs AGGMAP RELATION geography.parentrel PRECOMPUTE (geography.levelrel 'L3') RELATION product.parentrel PRECOMPUTE (LIMIT(product complement 'TotalProd')) RELATION channel.parentrel RELATION time.parentrel PRECOMPUTE (time NE '2001') END
Within an aggregation specification, an AGGINDEX statement in an aggregation specification tells Oracle OLAP whether the compilation of that aggmap should create indexes (meaning, composite tuples) for data cells that are calculated on the fly by the AGGREGATE function. Therefore, the AGGINDEX statement has an effect on a dimension that is included in a composite but it has no effect on a dimension that is not included in a composite.
These indexes are used in the MODEL (in an aggregation) statement and in statements that use the ACROSS phrase to help Oracle OLAP loop over variables that are dimensioned by composites. These statements expect all data to be calculated. When you specify calculating some data on the fly, that data appears to be missing. When you set AGGINDEX to YES
, then the statements try to access the missing data whether or not you are using the AGGREGATE function to perform calculation on the fly (meaning, you have added to the variable whose data is being aggregated an NA
trigger property that calls the AGGREGATE function).
When the indexes have been created and you use AGGREGATION with the AGGREGATE function, then when MODEL (or a statement that uses the ACROSS phrase) requests the missing data, that data is calculated on the fly. That means that the results of the MODEL (or other statement) are correct, because the statement has all of the data that it needs.
When these indexes have not been created, then the missing data cannot be calculated. As a result, the statements that need the indexes interpret the missing data as NA
data, even when you use the AGGREGATE function.
Syntax
AGGINDEX {YES|NO}
Arguments
Tells the AGGMAP compiler to make sure that all possible indexes are created whenever an aggmap is recompiled. In other words, indexes are created both for the data that is being pre-calculated and the data that is calculated on the fly. This happens when you use a COMPILE statement to compile the aggmap, as well as when the AGGREGATE command automatically compiles an aggmap whose specification has changed since the last time it was compiled. The creation of all possible indexes results in a longer compilation time but faster execution of the AGGREGATE function. (Default)
Does not create the indexes for data that is calculated on the fly. Omitting the creation of these index values accelerates the compilation time, but causes Oracle OLAP to treat the uncomputed data as NA
data whenever the MODEL (in an aggregation) statement or the ACROSS phrase is used.
Notes
The primary advantage to using an AGGINDEX value of YES
is that then Oracle OLAP always try to access data that you have specified to be calculated on the fly. When you have created an $NATRIGGER property for a variable that calls the AGGREGATE function, the variable appears to have been fully precomputed. That means that when any NA
value is encountered, the NA
trigger is called during a MODEL (in an aggregation) statement or a statement using the ACROSS phrase. When the NA
trigger is called, the AGGREGATE function is executed, and the data is calculated on the fly.
When AGGINDEX has a value of NO
, then the NA
trigger is called only to aggregate data for dimensions that are not included in a composite. Data for dimensions that are included in composites is interpreted as NA
values.
For example, suppose you have two variables called sales1
and sales2
, which are defined with the following definitions.
DEFINE sales1 DECIMAL <time, SPARSE <product, geography>> DEFINE sales2 DECIMAL <time, SPARSE <product, geography>>
Now suppose you have an aggmap object named sales.agg
, which has the following definition.
DEFINE sales.agg AGGMAP <time, SPARSE <product, geography>>
When you add a specification to the sales.agg
aggmap, you enter RELATION (for aggregation) statements for time
, product
and geography
with PRECOMPUTE
clauses that specify NA
. This specifies that no data is aggregated—instead, all of the data for any variable that uses this aggmap is calculated on the fly.
RELATION time.r PRECOMPUTE (NA) RELATION product.r PRECOMPUTE (NA) RELATION geography.r PRECOMPUTE (NA)
Now attach the following $NATRIGGER property to the sales1
variable.
CONSIDER sales1 PROPERTY '$NATRIGGER' 'AGGREGATE(sales1 USING sales.agg)'
Consider the effect of AGGINDEX in the following statement. Because you did not enter an AGGINDEX statement in the sales.agg
aggregation specification, the default of AGGINDEX YES
is assumed.
sales2 = sales1 ACROSS SPARSE <product, geography>
This statement loops over the data in sales1
and copies the values into sales2
. This statement causes the NA
trigger to call the AGGREGATE function for all of the data that you have specified to be calculated on the fly in sales1
. This means that after the aggregation that sales2
contains a copy of sales1
plus all the aggregate data cells (the cells that would have been calculated if the sales1
data had been completely precomputed, meaning, fully rolled up).
However, when you put an AGGINDEX NO
statement in the sales.agg
aggregation specification, then sales2
contains a copy of the data in sales1
and the aggregate data cells for the time
dimension.
Note that in both cases, $NATRIGGER is called to aggregate time
data, because the time
dimension is not included in the composite, so the value of AGGINDEX has no effect on it.
You can use an AGGINDEX value of NO
when you know that either of the following is true:
Your application does not use a MODEL (in an aggregation) statement or an ACROSS phrase.
The results of your MODEL (in an aggregation) statement or ACROSS phrase are additive, and data that needs to be aggregated can be calculated safely on the fly.
Each of the preceding cases ensures that the data that you have specified to be calculated on the fly is available at the appropriate time.
By setting AGGINDEX to NO
, the size of the indexes is reduced, and overall application performance improves.
When you run a MODEL that assumes all data that should be aggregated has been aggregated, then you may get NA
data where real data should occur. For instance, suppose you have a variable that has a composite that includes the time
dimension. You perform a calculation that subtracts the fourth quarter from the total for the year. When the value of Year
is to be calculated dynamically, and the AGGINDEX statement is set to NO
, then the result of the calculation is NA
. When the value of Year
was precomputed or when AGGINDEX is set to YES
, then the MODEL correctly calculates a result equal to the sum of the first three quarters.
Only the indexes that are needed to aggregate existing data are created when AGGINDEX has a value of YES
. For example, suppose one of the dimensions in your composite is a dimension named time
. The lowest-level data for the time
dimension is at the monthly level. Therefore, the dimension values that are associated with the lowest-level data are Jan99
, Feb99
, and so on. The monthly data aggregates to quarters and to years. Suppose you have data for the first six months of the year. When AGGINDEX has a value of YES
, indexes are created for the Q1
, Q2
, and Yr99
dimension values, but not for Q3
and Q4
.
One disadvantage of using the default of AGGINDEX YES
is that the compilation of the aggmap takes a longer time to complete. You can eliminate the cost of this extra time by using the FUNCDATA
keyword with the AGGREGATE command. When you use the FUNCDATA
keyword, all possible indexes (regardless of how you have limited your data) are created. However, do not use the FUNCDATA
keyword when you use a different aggmap to execute the AGGREGATE command and the AGGREGATE function.
Examples
For an example of using an AGGINDEX statement, see Example 6-20, "Using an AGGINDEX Statement in an Aggregation Specification".
Within an aggregation specification, a BREAKOUT DIMENSION statement specifies how a dimension of the target variable maps to one or more dimensions of the source variable. You use this statement in an aggregation specification when you will be aggregating the detail data from one variable (the source variable) into another variable (the target variable) that has a different dimension (that is, a "breakout" dimension) than the variable that contains the detail data.
Syntax
BREAKOUT DIMENSION dimname BY relationname [, relationname...] -
OPERATOR operation [ARGS argument]
where:
argument specifies the settings of various options and is one or more of the following phrases:
Arguments
The name of a dimension in the variable that contains the detail data (that is, the source variable).
The name of a relation whose values relate a dimension of the target variable to dimname.
Identifies the calculation method used to aggregate the data.
A keyword that describes the type of aggregation to perform. The keywords are listed in Table 6-3, "Aggregation Operators".
Indicates optional handling of the aggregation.
Specifies whether to allow division by zero.
YES allows division by zero; a statement involving division by zero executes without error but produces NA
results.
NO disallows division by zero; a statement involving division by zero stops executing and produces an error message.
The default value is the current value of the DIVIDEBYZERO option.
Specifies whether to allow decimal overflow, which occurs when the result of a calculation is very large and can no longer be represented by the exponent portion of the numerical representation.
YES allows overflow; a calculation that generates overflow executes without error and produces NA
results.
NO disallows overflow; a calculation involving overflow stops executing and generates an error message.
The default value is the current value of the DECIMALOVERFLOW option.
Specifies whether NA
values are input.
YES specifies that NA
values are ignored when aggregating. Only actual values are used in calculations.
NO specifies that NA
values are considered when aggregating. When any of the values being considered are NA
, the calculation returns NA
.
The default value is the current value of the NASKIP option.
The value that you specify for the NASKIP phrase does not effect calculation performed when you specify HAVERAGE, HFIRST, HLAST, HWAVERAGE, HWFIRST, HWLAST for operation.
Indicates that weighted aggregation is to be performed. You must include a WEIGHTBY clause when you specify HWAVERAGE, HWFIRST, HWLAST, SSUM, WAVERAGE, WFIRST, WLAST, or WSUM for operation. The WEIGHTBY phrase always includes a wobj argument and can optionally include the WNAFILL keyword. For more information about the use of the WEIGHTBY phrase, see RELATION (for aggregation).
Indicates handling for NA
values. The default values for WNAFILL vary depending on the value of operation.
Substitutes a number for every NA
value. That number will replace every NA
value in the weight object, weight formula, or weight relation.
0.0
is the default for HWAVERAGE and SSUM.
1.0
is the default for HWFIRST, HWLAST, WAVERAGE, WFIRST, WLAST, and WSUM.
Specifies that NA
values are to be specified as NA
. NA
is the default for OR.
For more information about using the WNAFILL phrase, see RELATION (for aggregation).
A variable, formula, or relation that provides the weighted values. It can be numeric or BOOLEAN. When wobj is BOOLEAN, then TRUE
has a weight of 1.0
and FALSE
has a weight of 0.0
. A formula is queried only when needed, depending on the dimensionality of the formula and the of the variable being aggregated. When wobj is a relation, it should be a one-dimensional self-relation. For more information about specifying values for wobj, see RELATION (for aggregation).
Examples
For an example of using the BREAKOUT DIMENSION statement, see Example 6-21, "Aggregating By Dimension Attributes".
Within an aggregation specification, a CACHE statement tells Oracle OLAP whether to cache or store the calculated data, whether to populate leaf or detail data when the variable data is aggregated using detail data from another variable, and whether to cache NA
values when a summary values calculates to NA
.
Note: The CACHE statement is only one factor that determines whether variable data that has been aggregated on-the-fly using the AGGREGATE function is stored or cached. See "How Oracle OLAP Determines Whether to Store or Cache Aggregated Data ". |
Syntax
CACHE {NOSTORE|NONE|STORE|SESSION|DEFAULT} [LEAF|NOLEAF] [NA|NONA]
Arguments
For data that is calculated using the AGGREGATE function, specifies that Oracle OLAP calculates the data each time the AGGREGATE function executes. When you specify either of these keywords, Oracle OLAP does not store or cache the data calculated by the AGGREGATE function.
For data that is calculated using the AGGREGATE function, specifies that Oracle OLAP stores data calculated by the AGGREGATE function in the variable in the database. When you specify this option, the results of the aggregation are permanently stored in the variable when the analytic workspace is updated and committed.
For data that is calculated using the AGGREGATE function, specifies that Oracle OLAP caches data calculated by the AGGREGATE function in the session cache (see "What is an Oracle OLAP Session Cache?"). When you specify this option, the results of the aggregation are ignored during updates and commits and are discarded at the end of the session.
Note: When SESSCACHE is set toNO , Oracle OLAP does not cache the data even when you specify SESSION . In this case, specifying SESSION is the same as specifying NONE . |
For data that is calculated using the AGGREGATE function, specifies that you do not want Oracle OLAP to use the CACHE statement when determining what to do with data that is calculated by the AGGREGATE function. See "How Oracle OLAP Determines Whether to Store or Cache Aggregated Data ". (Default)
When the variable data is aggregated using detail data from another variable, specifies that Oracle OLAP calculates the leaf data for the variable.
When the variable data is aggregated using detail data from another variable, specifies that Oracle OLAP does not calculate the leaf data for the variable.
For data that is calculated using the AGGREGATE function, specifies that Oracle OLAP places any NA
values that are the results of the execution of the AGGREGATE function in the Oracle OLAP session cache. In this case, when there is a variable has an $NATRIGGER property with an AGGREGATE function as its expression, Oracle OLAP does not recalculate the values for the variable. (For more information on the caching NA
values, see "How Oracle OLAP Determines Whether to Store or Cache Results of $NATRIGGER".)
For data that is calculated using the AGGREGATE function, specifies that Oracle OLAP does not cache any NA
values that are the results of the execution of the AGGREGATE function. In this case, when a variable has an $NATRIGGER property with an AGGREGATE function as its expression, Oracle OLAP recalculates the values for the variable.
Notes
You should use NOSTORE when you know that your users are likely to modify pre-computed data, and you want any data that calculated by the AGGREGATE function to consistent with any of those users' changes.
In other words, suppose a user makes a change to detail-level data, such as sales
figures for three stores, which are in a geography
dimension. The geography
dimension rolls up data from stores to cities to states to regions to countries. In other words, there are five levels in the geography
dimension's hierarchy. Now suppose that users tend to access data only at the store level (your detail data), the regions level, and the countries level. Those are the levels for which you roll up sales data and commit it to the database. Because users do not access data at the city and state level, you specify that the data cells in those two levels will be calculated on the fly. When users modify the store-level data and then access city data, the city data will be calculated every time that a user requests it. Therefore, any changes that a user makes to the store-level details will accurately rollup to the city and state level every time that user accesses a data cell in the city or state level. (However, this will not be true of the data in the region and country levels, because those cells store pre-computed data.)
The advantage to using STORE or SESSION is that it improves query performance. For example, suppose your users use a Table tool to look at a variable's data and an individual user requests the same data cells several times in the same session. When you use the default of NOSTORE, then any data that is not aggregated using the AGGREGATE command will have to be calculated every time the user requests that data even if you do not use the FORECALC keyword in the AGGREGATE function. On the other hand, when you use STORE or SESSION, then any given cell of data is calculated only once because it is available in either the variable or the cache for the entire session. Therefore, the next time a user requests that data cell, the data is returned from the variable or the cache instead of being calculated on the fly, which results in faster query time for the user.
Frequently you do not want the data that is calculated using the AGGREGATE function to be stored permanently in the database since that would defeat the purpose of calculating data on the fly.
To ensure that the aggregated values cannot be permanently committed to the database, use SESSION.
Use STORE when you know either of the following is true which also ensures that the data that is calculated on the fly using the AGGREGATE function will not be committed to the database:
The users of the analytic workspace can only open it as read-only
You know that the users of the analytic workspace will not or cannot issue UPDATE and COMMIT commands.
Note: You should use STORE with caution when it is likely that your users modify pre-computed data, and they access data that you have specified to be calculated on the fly using the AGGREGATE function. The problem is that any data that is calculated using the AGGREGATE function before the user's modification will not reflect the user's change unless the user made the change using an AGGREGATE function with the FORCECALC keyword. |
Examples
For examples of using a CACHE statement in an aggregation specification, see Example 6-22, "Using a CACHE Statement in an Aggregation Specification" and Example 6-23, "Populating All Levels of a Hierarchy Except the Detail Level".
Within an aggregation specification, a DIMENSION statement sets the status to a single value of a dimension. When an aggregation specification does not specify such single values with DIMENSION statements, Oracle OLAP uses the current status values of the dimensions when performing the aggregation.
You use a DIMENSION statement to ensure that the status of a dimension is set to the value that you want it to have for the aggregation. You must use a separate DIMENSION statement for each dimension that is not shared by the source, basis, and target objects.
Syntax
DIMENSION dimension 'dimval'
Arguments
the name of the dimension that you want to limit.
The single value of the dimension to which you want the status of the dimension set for the duration of an aggregation.
Within an aggregation specification, a DROP DIMENSION statement specifies how non-hierarchical aggregation across variables is performed. You use this statement in aggregation specification when you will be aggregating the detail data from one variable (the source variable) into another variable (the target variable) and you want to aggregate across a non-hierarchical dimension of the source variable. In this case, the target variable has one less dimension (the "dropped" dimension) than the source variable because the values of the source variable associated with this dimension are aggregated to populate the target variable.
Syntax
DROP DIMENSION dimname [VALUES {valsetname|ALL} OPERATOR operation [ARGS argument]
where
argument is one or more of the following phrases:
DIVIDEBYZERO {YES|NO}
DECIMALOVERFLOW {YES|NO}
NASKIP {YES|NO}
WEIGHTBY [WNAFILL {number|NA}] wobj
Arguments
The name of a dimension in the source variable that contains the detail data.
Sets the status of dimname during the aggregation.
The name of a valueset object that determines the status of the dimension specified by dimname.
Specifies that all of the values of dimname are in status.
Identifies the calculation method used to aggregate the data.
A keyword that describes the type of aggregation to perform. The keywords are listed in Table 6-3, "Aggregation Operators".
Indicates optional handling of the aggregation.
Specifies whether to allow division by zero.
YES allows division by zero; a statement involving division by zero executes without error but produces NA
results.
NO disallows division by zero; a statement involving division by zero stops executing and produces an error message.
The default value is the current value of the DIVIDEBYZERO option.
Specifies whether to allow decimal overflow, which occurs when the result of a calculation is very large and can no longer be represented by the exponent portion of the numerical representation.
YES allows overflow; a calculation that generates overflow executes without error and produces NA
results.
NO disallows overflow; a calculation involving overflow stops executing and generates an error message.
The default value is the current value of the DECIMALOVERFLOW option.
Specifies whether NA
values are input.
YES specifies that NA
values are ignored when aggregating. Only actual values are used in calculations.
NO specifies that NA
values are considered when aggregating. When any of the values being considered are NA
, the calculation returns NA
.
The default value is the current value of the NASKIP option.
The value that you specify for the NASKIP phrase does not effect calculation performed when you specify HAVERAGE, HFIRST, HLAST, HWAVERAGE, HWFIRST, HWLAST for operation.
Indicates that weighted aggregation is to be performed. You must include a WEIGHTBY clause when you specify HWAVERAGE, HWFIRST, HWLAST, SSUM, WAVERAGE, WFIRST, WLAST, or WSUM for operation. The WEIGHTBY phrase always includes a wobj argument and can optionally include the WNAFILL keyword. For more information about the use of the WEIGHTBY phrase, see RELATION (for aggregation).
Indicates handling for NA
values. The default values for WNAFILL vary depending on the value of operation. For more information about using the WNAFILL phrase, see RELATION (for aggregation).
Substitutes a number for every NA
value. That number will replace every NA
value in the weight object, weight formula, or weight relation.
0.0
is the default for HWAVERAGE and SSUM.
1.0
is the default for HWFIRST, HWLAST, WAVERAGE, WFIRST, WLAST, and WSUM.
Specifies that NA
values are to be specified as NA
. NA
is the default for OR.
A variable, formula, or relation that provides the weighted values. It can be numeric or BOOLEAN. When wobj is BOOLEAN, then TRUE
has a weight of 1.0
and FALSE
has a weight of 0.0
. A formula is queried only when needed, depending on the dimensionality of the formula and the of the variable being aggregated. When wobj is a relation, it should be a one-dimensional self-relation. For more information about specifying values for wobj,, see RELATION (for aggregation).
Examples
For an example of using a DROPT DIMENSION statement in an aggregation specification, see Example 6-24, "Aggregating into a Different Variable".
Within an aggregation specification, a MEASUREDIM statement identifies the name of a measure dimension that is specified in the definition of an operator variable or an argument variable.
Syntax
MEASUREDIM name
Arguments
The name of the measure dimension. A measure dimension is a dimension that you define. The dimension values are names of existing variables.
Note: You cannot specify a measure dimension when it is included in the definition of the aggmap object. |
Notes
The following statement defines a dimension named MEASURE.
DEFINE measure DIMENSION TEXT
Once you have defined a measure dimension, you can then use a MAINTAIN statement to add dimension values to the MEASURE dimension.
The following statement adds the names of the sales
, units
, price
, and inventory
variables to measure
as its dimension values.
MAINTAIN measure ADD 'sales', 'units', 'price', 'inventory'
The purpose of using measure dimensions is to take advantage of the flexibility of using non-additive aggregation operators. You can use measure dimensions in the definition of operation variables or argument variables.
The following statements show how to define an operator variable named opvar
and populate it.
DEFINE opvar TEXT <measure> opvar (measure 'sales') = 'SUM' opvar (measure 'inventory') = 'HLAST'
Examples
For an example of an aggregation specification that includes a MEASUREDIM statement, see Example 6-25, "Using a MEASUREDIM Statement in an Aggregation Specification".
Within an aggregation specification, a MODEL statement executes a predefined model.
Syntax
MODEL modelname [PRECOMPUTE ALL|NA]
Arguments
A text expression that contains the name of a predefined MODEL object.
Specifies that the AGGREGATE command will execute the model as a data maintenance step. The following conditions must be met:
Any RELATION (for aggregation) or MODEL statements that precede it in the aggregation specification must also be specified as PRECOMPUTE ALL.
Any RELATION (for aggregation) or MODEL statements that follow it in the aggregation specification can either be specified as PRECOMPUTE ALL or PRECOMPUTE NA.
Specifies that the AGGREGATE function will execute the model at runtime. The following conditions must be met for runtime execution of the model:
All RELATION (for aggregation) statements in the aggregation specification must appear before the MODEL statements specified as PRECOMPUTE NA.
Any additional MODEL statements that follow it in the aggregation specification must also be specified as PRECOMPUTE NA.
Notes
You add a model to an aggregation specification using an AGGMAP ADD statement.
Examples
For an example of using a model in an aggregation specification, see Example 6-26, "Solving a Model in an Aggregation".
Within an aggregation specification, a RELATION statement specifies how data is aggregated across a hierarchical dimension. Frequently, an aggregation specification contains one RELATION statement for each of the hierarchical dimensions of a variable.
Syntax
RELATION rel-name [(valueset...)] -
[PRECOMPUTE (dimension-values | valueset2 | ALL | {NA | NONE}] -
[OPERATOR {operation|opvar}] -
[PARENTALIAS dimension-alias-name] -
[ARGS {argument|argsvar}]
where
argument is one of the following phrases:
DIVIDEBYZERO {YES|NO}
DECIMALOVERFLOW {YES|NO}
NASKIP {YES|NO}
WEIGHTBY [WNAFILL {number|NA}] wobj
argsvar is a text variable that contains argument phrases for some or all dimension values.
Arguments
A relation that defines a hierarchy by identifying the parent of every dimension value in a hierarchy.
Sets the status of one or more dimensions for the duration of the aggregation. It overrides the current status.
Indicates dimension values for which data should be precalculated with the AGGREGATE command.
A list of one or more values of dimension.
The name of a valueset object. When you include this argument, only data that is dimensioned by the dimension values in the valueset should be precalculated with the AGGREGATE command. The rest of the values can be calculated on the fly.
Note that the current status of a dimension can also limit the data that is precalculated. See the AGGREGATE command for details.
Specifies that data should be precalculated for all dimension values.
Specifies that all values should be calculated on the fly (that is, that no data should be precalculated with the AGGREGATE command).
Identifies the calculation method used to aggregate the data.
A keyword that describes the type of aggregation to perform. The keywords are listed in Table 6-3, "Aggregation Operators". You can specify a fixed-length three-character abbreviation for the keywords by specifying only the first three characters.
Table 6-3 Aggregation Operators
Keyword | Description |
---|---|
SUM | Adds data values. (Default) |
SSUM | (Scaled Sum) Adds the value of a weight object to each data value, then adds the data values.
When you use this keyword, you must include the WEIGHTBY argument keyword with a variable, formula, or relation as the weight object. |
WSUM | (Weighted Sum) Multiplies each data value by a weight factor, then adds the data values.
When you use this keyword, you must include the WEIGHTBY argument keyword with a variable, formula, or relation as the weight object. |
AVERAGE | Adds data values, then divides the sum by the number of data values that were added together. When you use AVERAGE, there are special considerations described in "Average Operators". |
HAVERAGE | (Hierarchical Average) Adds data values, then divides the sum by the number of the children in the dimension hierarchy. Unlike AVERAGE, which counts only non-NA children, HAVERAGE counts all of the logical children of a parent, regardless of whether each child does or does not have a value.
This keyword is not affected by the setting of the NASKIP option for argument. |
WAVERAGE | (Weighted Average) Multiplies each data value by a weight factor, adds the data values, and then divides that result by the sum of the weight factors.
When you use this keyword, you must include the WEIGHTBY argument keyword with a variable, formula, or relation as the weight object. |
HWAVERAGE | (Hierarchical Weighted Average) Multiplies non-NA child data values by their corresponding weight values then divides the result by the sum of the weight values. Unlike WAVERAGE, HWAVERAGE includes weight values in the denominator sum even when the corresponding child values are NA .
When you use this keyword, you must include the WEIGHTBY argument keyword with a variable, formula, or relation as the weight object. This keyword is not affected by the setting of the NASKIP option for argument. |
FIRST | The first non-NA data value. |
HFIRST | (Hierarchical First) The first data value that is specified by the hierarchy, even when that value is NA .
This keyword is not affected by the setting of the NASKIP option for argument. |
WFIRST | (Weighted First) The first non-NA data value multiplied by its corresponding weight value.
When you use this keyword, you must include the WEIGHTBY argument keyword with a variable, formula, or relation as the weight object. |
HWFIRST | (Hierarchical Weighted First) The first data value that is specified by the hierarchy multiplied by its corresponding weight value, even when that value is NA .
When you use this keyword, you must include the WEIGHTBY argument keyword with a variable, formula, or relation as the weight object. This keyword is not affected by the setting of the NASKIP option for argument. |
LAST | The last non-NA data value. |
HLAST | (Hierarchical Last) The last data value that is specified by the hierarchy, even when that value is NA .
This keyword is not affected by the setting of the NASKIP option for argument. |
WLAST | (Weighted Last) The last non-NA data value multiplied by its corresponding weight value.
When you use this keyword, you must include the WEIGHTBY argument keyword with a variable, formula, or relation as the weight object. |
HWLAST | (Hierarchical Weighted Last) The last data value that is specified by the hierarchy multiplied by its corresponding weight value, even when that value is NA .
When you use this keyword, you must include the WEIGHTBY argument keyword with a variable, formula, or relation as the weight object. This keyword is not affected by the setting of the NASKIP option for argument. |
MAX | The largest data value among the children of any parent data value. |
MIN | The smallest data value among the children of any parent data value. |
AND | When any child data value is FALSE, then the data value of its parent is FALSE. A parent is TRUE only when all of its children are TRUE. (BOOLEAN variables only) |
OR | When any child data value is TRUE, then the data value of its parent is TRUE. A parent is FALSE only when all of its children are FALSE. (BOOLEAN variables only) |
NOAGG | Do not aggregate any data for this dimension. Use this keyword only in an operator variable. It has no effect otherwise. |
A text variable that specifies different the operation for each of its dimension values. The opvar argument is used in two ways:
Measure dimension -- Changes the aggregation method depending upon the variable being aggregated. This is useful when a single aggmap is used to aggregate several variables that need to be aggregated with different methods. Whether you pre-aggregate all of the measures in a single AGGREGATE command or in separate statements, AGGREGATE uses the operation variable to identify the calculation method. The values of the measure dimension are the names of the variables to be aggregated. It dimensions a text variable whose values identify the operation to be used to aggregate each measure. The aggregation specification must include a MEASUREDIM (for aggregation) statement that identifies the measure dimension. See Example 6-30, " Aggregating Using a Measure Dimension".
Line item dimension -- Changes the aggregation method depending upon the line item being aggregated. The line item dimension is typically non-hierarchical and identifies financial allocations. The line item dimension is used both to dimension the data variable and to dimension a text variable that identifies the operation to be used to aggregate each item. The operation variable is typically used to aggregate line items over time. You do not use the MEASUREDIM (for aggregation) statement in the aggmap. See Example 6-31, "Aggregating Using a Line Item Dimension".
The opvar argument cannot be dimensioned by the dimension it is used to aggregate. For example, when you want to specify different operations for the geography
dimension, then opvar cannot be dimensioned by geography
.
To minimize the amount of paging for the operator variable, define the operation variable as type of TEXT
with a fixed width of 8
.
Specifies that an alias dimension for the dimension being aggregated is QDRd to the parent value currently being aggregated.
The name of the alias dimension for the dimension of rel-name.
Indicates optional handling of the aggregation.
Specifies whether to allow division by zero.
YES allows division by zero; a statement involving division by zero executes without error but produces NA
results.
NO disallows division by zero; a statement involving division by zero stops executing and produces an error message.
The default value is the current value of the DIVIDEBYZERO option.
Specifies whether to allow decimal overflow, which occurs when the result of a calculation is very large and can no longer be represented by the exponent portion of the numerical representation.
YES allows overflow; a calculation that generates overflow executes without error and produces NA
results.
NO disallows overflow; a calculation involving overflow stops executing and generates an error message.
The default value is the current value of the DECIMALOVERFLOW option.
Specifies whether NA
values are input.
YES specifies that NA
values are ignored when aggregating. Only actual values are used in calculations.
NO specifies that NA
values are considered when aggregating. When any of the values being considered are NA
, the calculation returns NA
.
The default value is the current value of the NASKIP option.
The value that you specify for the NASKIP phrase does not effect calculation performed when you specify HAVERAGE, HFIRST, HLAST, HWAVERAGE, HWFIRST, HWLAST for operation.
Indicates that weighted aggregation is to be performed. You must include a WEIGHTBY clause when you specify HWAVERAGE, HWFIRST, HWLAST, SSUM, WAVERAGE, WFIRST, WLAST, or WSUM for operation. The WEIGHTBY phrase always includes a wobj argument and can optionally include the WNAFILL keyword. It can also include several other deprecated keywords (see "Deprecated WEIGHTBY Keywords" for details).
Indicates handling for NA
values. The default values for WNAFILL vary depending on the value of operation:
0.0
is the default for HWAVERAGE and SSUM.
1.0
is the default for HWFIRST, HWLAST, WAVERAGE, WFIRST, WLAST, and WSUM.
NA
is the default for OR.
Important: Be aware that WNAFILL defaults for each operator in an aggregation specification. In other words, when one RELATION statement includes a WSUM OPERATOR, then WNAFILL defaults to 1.0. When the next RELATION statement includes an SSUM OPERATOR, then WNAFILL defaults to0.0 , and so on. See "Using WNAFILL". |
number substitutes a number for every NA
value. That number will replace every NA
value in the weight object, weight formula, or weight relation.
NA specifies that NA
values are to be specified as NA
.
A variable, formula, or relation that provides the weighted values. It can be numeric or BOOLEAN. When wobj is BOOLEAN, then TRUE
has a weight of 1.0
and FALSE
has a weight of 0.0
. A formula is queried only when needed, depending on the dimensionality of the formula and the of the variable being aggregated. When wobj is a relation, it should be a one-dimensional self-relation. See "Working with Weight Objects" for more information about specifying values for wobj.
A text variable that contains the argument options for some or all dimension values.
Notes
The PRECOMPUTE clause of the RELATION statement limits the data that is aggregated by the AGGREGATE command. In its simplest form, you can think of the PRECOMPUTE clause as working like a LIMIT dimension TO statement. Notice that the default limit is on the dimension, which is not explicitly named in the RELATION statement.
For example, this LIMIT statement selects the Audiodiv
, Videodiv
, and Accdiv
values of the product
dimension:
LIMIT product TO 'Audiodiv' 'Videodiv' 'Accdiv'
The equivalent RELATION statement looks like this:
RELATION product.parentrel PRECOMPUTE ('Audiodiv' 'Videodiv' 'Accdiv')
You can use valuesets in two different ways.
You can use a valueset to limit hierarchy dimensions. You can limit which hierarchies will be used by the AGGREGATE command and AGGREGATE function, as well as the order in which these hierarchies should be used. The valueset that you use specifies the names of a dimension's hierarchies. To use a valueset in this way, use the following syntax.
RELATION rel-name (valueset)
In this case, using valuesets provides a way to manage hierarchies that are in conflict with each other, meaning, when the same dimension value stores data for different children in different hierarchies (such as, Q1
stores data for Jan
, Feb
, and Mar
in the Calendar
hierarchy, but Q1
stores data for May
, Jun
, and Jul
in the Fiscal
hierarchy).
You can use a valueset to specify which values should be calculated on the fly by the AGGREGATE function and which values should be pre-calculated by the AGGREGATE command. The valueset that you use specifies the names of dimension values. To use a valueset in this way, use the following syntax.
RELATION rel-name PRECOMPUTE (valueset)
In this case, you use the valueset that follows the PRECOMPUTE keyword.
When you use valuesets to limit hierarchy dimensions, you must also use the FORCECALC keyword in the AGGREGATE function when using more than one aggmap and the hierarchies are inconsistent.
Any time you make changes to a PRECOMPUTE or an OPERATOR clause, you should aggregate the variable data again and recompile the aggmap in order to produce accurate data.
When data is loaded into dimension values that are at different levels of a hierarchy, then you need to be careful in how you set status in the PRECOMPUTE clause in a RELATION statement in your aggregation specification.
Suppose that a time
dimension has a hierarchy with three levels: months aggregate into quarters, and quarters aggregate into years. Some data is loaded into month dimension values, while other data is loaded into quarter dimension values. For example, Q1
is the parent of January
, February
, and March
. Data for March
is loaded into the March
dimension value. But the sum of data for January
and February
is loaded directly into the Q1
dimension value. In fact, the January
and February
dimension values contain NA
values instead of data. Your goal is to add the data in March
to the data in Q1
.
When you attempt to aggregate January
, February
, and March
into Q1
, the data in March
will simply replace the data in Q1
. When this happens, Q1
will only contain the March
data instead of the sum of January
, February
, and March
.
To aggregate data that is loaded into different levels of a hierarchy, create a valueset for only those dimension values that contain data.
DEFINE all_but_q4 VALUESET time LIMIT all_but_q4 TO ALL LIMIT all_but_q4 REMOVE 'Q4'
Within the aggregation specification, use that valueset to specify that the detail-level data should be added to the data that already exists in its parent, Q1
, as shown in the following statement.
RELATION time.r PRECOMPUTE (all_but_q4)
There are a number of issues involved in using the AVERAGE, HAVERAGE, WAVERAGE, and HWAVERAGE operators:
Accuracy of when averaging—All decimal data is converted to floating point format, both for storing and for calculations, consequently, in some cases, an average aggregation computed on a DECIMAL or SHORTDECIMAL variable can differ in the least significant digits from a result you compute by hand. For this reason, you might want to use the NUMBER
data type when accuracy is more important than computational speed, such as variables that contain currency amounts. See "Numeric Expressions" for more information.
Using Average operators when aggregating using an AGGREGATE Function—When you use an average operator in an aggregation specification that used by an AGGREGATE function, you must specify that you want Oracle OLAP to count the number of leaf nodes that contributed to an aggregate value when the AGGREGATE function executes. You can specify this behavior in either of the following ways:
Create $COUNTVAR properties for the variables that use the aggregation specification with a RELATION statement with an average operator.
Include the COUNTVAR keyword in the AGGREGATE command with an integer variable as its weight object when you use the specification to aggregate data.
Using Average operators when aggregating using an AGGREGATE Command—When you use an average operator with the PRECOMPUTE keyword, the best practice is to use variables that have a decimal or NUMBER data type in order to ensure the accuracy of the results.
Using Average operators for partial aggregations—When you use an average operator in a partial aggregation, then you must always use the same integer variable with the COUNTVAR keyword in the AGGREGATE commands. Do not change the values that are stored in the integer variable between AGGREGATE commands. Finally, the number of integer variables that you use with COUNTVAR must match the number of variables that are being aggregated by the AGGREGATE command. Following these rules will ensure the accuracy of your data.
The "hierarchical" operators are intended to provide an alternative form of NA
handling.
These operators rely on the existing order of the dimension values, which are assumed to be the default logical order of that dimension. For example, in a month dimension, it is assumed that February follows January, March follows February, and so on.
When you need to change the default order, use the MAINTAIN statement to do so. For example, suppose Q1
includes January
, February
, and March
, but you need to make Februar
y the last month in the Q1
instead of March
. Use the following statement to do so.
MAINTAIN time MOVE 'Feb01' AFTER 'Mar01'
Now, the LAST operator will assume that FEB01
is the last month in Q1
.
When you change the read permission to rel-name in a RELATION statement, then you must recompile the aggmap before using it with the AGGREGATE function. This is not an issue when you use the AGGREGATE command, because the aggmap will be recompiled automatically. However, when you do not have read access to every rel-name in the aggmap, then attempting to use that aggmap will result in an error message.
Aggmap objects are reentrant and you can use a aggmap object to aggregate a weight object in the same way you would any other object. Consequently, the following keywords of the WEIGHTBY phrase are deprecated:
WAGG or WNOAGG —WAGG aggregates the weight object. WAGG is the default when the weight object is a variable or relation (except when the operator is SSUM or WSUM). WNOAGG prevents the aggregation of a weight object. Using WNOAGG means that you will not be able to use WSOTRE and WNOSTORE. You can use WNOAGG with weight variables, weight formulas, and weight relations.
WSTORE or WNOSTORE—WSTORE stores the aggregated weight object values in the weight object for later use. The wobj argument must be dimensioned exactly the same as the variable being aggregated; otherwise, the aggregated data might not be stored correctly. You cannot use WSTORE when the weight object is a formula. WNOSTORE stores the aggregated weight object values in a temporary variable so that they cannot be saved for later use. You cannot use WNOSTORE with a weight formula. WNOSTORE is the default when the weight object is a variable or relation.
WPREAGG —WPREAGG specifies that the weight object is used only for weighting the detail data when aggregating the variable's data. Use this operator when every level of aggregate data can be aggregated either directly from detail data or aggregated into each intermediate level and still yield the same results. You do not have to specify this option, since it is the option for WEIGHTBY when the operator is any weight operation,
When you use one of the weighted methods of aggregation (that is, HWAVERAGE, HWFIRST, HWLAST, SSUM, WAVERAGE, WFIRST, WLAST, or WSUM), you must define and populate an object that contains the weights. You identify the aggregation method in the OPERATOR clause and the weight object in the ARGS clause.
You must specify at least one weight object as the wobj argument when you use the WEIGHTBY keyword. The weight object can be a variable, a formula, or a relation. Special considerations apply depending on the type of object. the data type of the weight object, and whether or not you are performing a partial aggregation.
The following considerations apply depending on the type of object that you use for the weight object:
When the weight object is a variable, you can define it with a numeric or BOOLEAN data type. Use a variable as your weight object when you want to pre-calculate weight values and commit them to the database. You can use a variable weight object with any weight option.
When the weight object is a relation, you should define it as a one-dimensional self-relation. You can use the weight object to specify that the weight for a specific cell is contained in the current variable at a different location. Use a relation as your weight object when you use a line item or a measure dimension. In this case, one line item is used as the weight to calculate the aggregate value of another line item. Using a relation enables you to specify another set of cells in the variable being aggregated as the weight values for a weighted operation.
When the weight object is a formula, that formula will be queried only as often as needed, depending on the dimensionality of the formula and the dimensionality of the variable whose data is being aggregated. You can define the formula with a numeric or BOOLEAN data type. Use a formula as your weight object when you want to calculate weight values on the fly. A formula weight object is similar to a variable weight object, except that it cannot be aggregated. The value of a formula weight object is executed dynamically. Therefore, you cannot use a formula weight object with many of the weight options.
The following considerations apply when the weight object is numeric or BOOLEAN
:
When the weight object has a numeric data type, It is good practice for the weight object variable to have the same dimensionality (or a subset thereof) as the variable to which it corresponds, but it is not required. When you use Oracle numbers or decimals to define your data variable, then always use the same data type to define the corresponding weight object. Otherwise, use the same data type for the weight object and the data variable unless you use WAVERAGE or HWAVERAGE; in this case, use a decimal or NUMBER data type to define the weight object.
When the weight object variable, formula, or relation that you define has a BOOLEAN
data type, then TRUE
represents a weight of 1.0
and FALSE
represents a weight of 0.0
. Furthermore, when an NA
value is multiplied by any value, the result is NA
.
When you use any operators that require the WEIGHTBY phrase, and you are performing a partial aggregation, then do not change the values that are stored in the weight object between AGGREGATE commands.
For example, suppose you use the WSUM operator to perform currency conversion. The currency conversation rates will be applied at the detail data level. Only the detail data needs to be converted, because the variable data is aggregated after the conversion. In order to get the correct results, all of the non-detail level weight values in the weight object would have to be 1
. Although this strategy produces correct results, it is inefficient. The best practice is to use the default WNAFILL value of 1
. This specifies that all NA
values in the weight object should be treated as if they have a weight of 1
. In this case, because the operator is WSUM, you do not have to include WNAFILL in the AGGREGATE command, because the default values are correct.
For example, the following statement causes the value 0.7
to be substituted for every NA
value in the salesw
weight object.
AGGREGATE sales USING sales.agg WEIGHTBY WNAFILL 0.7 salesw
When you do not want to specify a number to replace NA
values, then you can use NA
instead of a number, as shown in the following statement.
AGGREGATE sales USING sales.agg WEIGHTBY WNAFILL NA salesw
Specifying NA
after WNAFILL has the following effect:
When the aggregation specification contains a WAVERAGE or a WSUM OPERATOR, then any child cell in the weight object that has an NA
value is treated as an NA
cell.
When the aggregation specification contains an SSUM OPERATOR, then the results depend on how the Oracle OLAP option NASKIP is set. When NASKIP is set to YES
, then any NA
value is treated as 0.0. However, when NASKIP is set to NO
, then any NA
value is treated as an NA
cell.
A RELATION statement only aggregates those source data values that are in status. The parent values are calculated regardless of whether they are in status or not. For example, when only Jan01
, Feb01
, and Mar01
are in status for the time
dimension, then Q1.01
is calculated (but no other quarters), and 2001
is calculated (but no other years) using only Q1.01
as input since the other quarters are NA.
This can be useful when you want to aggregate just the new data in your analytic workspace. However, you must exercise some care, as described in "Weight Object Considerations When Performing Partial Aggregations".
Assume that there is a variable named sales
that is dimensioned by time
, a hierarchical dimension, and district
, a non-hierarchical dimension.
DEFINE time DIMENSION TEXT DEFINE time.parentrel RELATION time <time> DEFINE district DIMENSION TEXT DEFINE sales VARIABLE DECIMAL <time district> REPORT DOWN time sales -----------------------SALES----------------------- ---------------------DISTRICT---------------------- TIME North South West East ------------ ------------ ------------ ------------ ------------ 1976Q1 168,776.81 362,367.87 219,667.47 149,815.65 1976Q2 330,062.49 293,392.29 237,128.26 167,808.03 1976Q3 304,953.04 354,240.51 170,892.80 298,737.70 1976Q4 252,757.33 206,189.01 139,954.56 175,063.51 1976 NA NA NA NA
One aggregation strategy is skip-level aggregation illustrated in Example 6-32, "Skip-Level Aggregation". With skip-level aggregation, you select one or two of the dimensions of a variable and pre-aggregate every other level in those dimension hierarchies. When you know which levels are queried most often by users, you should pre-calculate those levels of data.
Keep the following points in mind when designing skip-level aggregation:
When selecting the dimensions to aggregate using skip-level aggregation:
Use a skip-level approach for only one or two dimensions. You should use the skip-level approach for half or fewer of the dimensions in a variable definition. For example, when there are three dimensions, then you can use the skip-level approach for one dimension; when there are four or more dimensions, then you can use the skip-level approach for two dimensions.
The dimensions that are the best candidates for skip-level aggregation are the dimensions whose hierarchies have many levels.
When possible, choose a dimension that is either fastest- or intermediate-varying in the variable dimension. Performance of calculation on the fly is always best for dimensions in this position.
When selecting the levels to skip:
Consider skipping every other level in a dimension hierarchy, and avoid skipping more than two levels that are adjacent to each other. For example, when a hierarchy has seven levels, you might skip L2
, L4
, and L6
. That means you would precalculate L1
, L3
, and L5
. (The detail-level data is at L7
.) Consider how frequently a level is queried. Users experience the best performance when you pre-aggregate the data most frequently queried, and aggregate on the fly the data that is requested occasionally.
Do not skip adjacent levels. For example, when you skipped L2
, L3
, L4
, and L5
, then a query for L2
data would require AGGREGATE to calculate L5
, then aggregate that data up to L4
, then up to L3
, and finally to L2
. Alternatively, when you skip L2
, L4
, and L6, then a query for L2
data requires AGGREGATE to aggregate data only from L3
.
The one exception to this rule is when each level has very few children for each parent. When this is true for every adjacent level that you want to skip, then you can skip two or more adjacent levels.
Examples
For examples of aggregation specifications that include RELATION statements, see the examples in AGGMAP.